PDF XML Converter Developer Edition V4.0 (COM)
PDF XML Converter (P2X) extracts the text information from a PDF
file and writes it to an XML file. All functions are
encapsulated in a COM component whose exposed
methods and interfaces are the same as those of PDF Plain Text
Extractor (P2T); only the output differs, being XML rather than plain text.
Please see
PDF Plain Text Extractor (P2T) Server Edition (COM)
for detailed technical information.
You can integrate it into your own application and redistribute
it royalty-free.
The output XML format is defined in
PDFDocument.xsd
Output XML sample
<?xml version="1.0" encoding="UTF-8"?>
<PDFDocument>
<PDFInfo>
<Title><![CDATA[ PDF Reference ]]></Title>
<Subject><![CDATA[PDF Reference 1.4]]></Subject>
<Author><![CDATA[Smith.H]]></Author>
<Creator><![CDATA[PDF Writer]]></Creator>
<Producer><![CDATA[Adobe Acrobat]]></Producer>
<CreateDate><![CDATA[2002/06/15]]></CreateDate>
<KeyWords><![CDATA[PDF Reference]]></KeyWords>
</PDFInfo>
<Pages>
<Page>
<PageNumber>1</PageNumber>
<PDFElement>
<Coordinate_X>12</Coordinate_X>
<Coordinate_Y>34</Coordinate_Y>
<DataString>
<![CDATA[
Hello, this is a data chunk with
special chars "~@@^%^$(^#\''"'and
line break. CDATA will deal with
this kind of data perfectly.
]]>
</DataString>
</PDFElement>
.
.
.
</Page>
.
.
.
</Pages>
</PDFDocument>
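
The sample above can be read back with any standard XML parser. The following is a minimal sketch in Python using the standard library's xml.etree.ElementTree; it is not part of the product. The element names are taken from the sample, while the helper names (read_info, read_elements) and the choice to treat the coordinates as numbers are assumptions made for illustration. The embedded document is a shortened version of the sample, with the elided parts left out so that it parses.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample output; the "..." placeholders are omitted
# so the document is well-formed.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<PDFDocument>
  <PDFInfo>
    <Title><![CDATA[PDF Reference]]></Title>
    <Author><![CDATA[Smith.H]]></Author>
  </PDFInfo>
  <Pages>
    <Page>
      <PageNumber>1</PageNumber>
      <PDFElement>
        <Coordinate_X>12</Coordinate_X>
        <Coordinate_Y>34</Coordinate_Y>
        <DataString><![CDATA[Hello, "special" <chars> & a line
break.]]></DataString>
      </PDFElement>
    </Page>
  </Pages>
</PDFDocument>
"""


def read_info(root):
    """Return the PDFInfo children as a dict of tag name -> trimmed text."""
    info = root.find("PDFInfo")
    if info is None:
        return {}
    return {child.tag: (child.text or "").strip() for child in info}


def read_elements(root):
    """Return a list of (page_number, x, y, text) tuples, one per PDFElement."""
    elements = []
    for page in root.iter("Page"):
        page_number = int(page.findtext("PageNumber", default="0"))
        for el in page.findall("PDFElement"):
            elements.append((
                page_number,
                # Assumption: coordinates are numeric; float accepts "12" and "12.5".
                float(el.findtext("Coordinate_X", default="0")),
                float(el.findtext("Coordinate_Y", default="0")),
                # CDATA content arrives as ordinary text, markup characters included.
                el.findtext("DataString", default=""),
            ))
    return elements


root = ET.fromstring(SAMPLE)
info = read_info(root)
elements = read_elements(root)

print(info)
for page_number, x, y, text in elements:
    print(page_number, x, y, repr(text))
```

Because DataString is wrapped in CDATA, characters such as < and & inside the extracted text need no escaping, and the parser returns them unchanged along with any line breaks.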