This section describes the records that are produced when you ingest an image or document file.
The following sample XML shows a record produced when you ingest an image, a multi-page image such as a TIFF file, or a presentation file (.PPT, .PPTX, .ODP).
<record>
<pageNumber>1</pageNumber>
<trackname>Image_1</trackname>
<Page>
<image>
<imagedata format="PNG">...</imagedata>
<width>222</width>
<height>140</height>
<pixelAspectRatio>1:1</pixelAspectRatio>
</image>
<pagetext/>
</Page>
</record>
The record contains the following information:
The pageNumber element describes the page that the record is associated with. Most image files have a single page, but formats such as TIFF support multiple pages.
The image element contains the image data, and provides the width and height of the image. The pixelAspectRatio element describes the shape of the pixels that make up the image; for example, 1:1 pixels are square.
If you ingest a document such as a PDF file, the output might also include the text extracted from text elements:
<record>
<pageNumber>1</pageNumber>
<trackname>Image_1</trackname>
<Page>
<image>
<imagedata format="PNG">...</imagedata>
<width>892</width>
<height>1260</height>
<pixelAspectRatio>1:1</pixelAspectRatio>
</image>
<pagetext>
<element>
<text>Some text</text>
<region>
<left>115</left>
<top>503</top>
<width>460</width>
<height>41</height>
</region>
<angle>0</angle>
</element>
...
</pagetext>
</Page>
</record>
The pagetext element contains information about associated text elements. If the ingested media was a PDF file, each record represents a page. If the ingested media was another type of document, each record represents an embedded image and the text that follows it, up to the next embedded image.
Each element element describes a text element and contains the following data:
The text element contains the text from the text element.
The region element provides the position of the text element on the page.
Note: The region information is accurate only if the ingested document was an Adobe PDF file.
The angle element provides the orientation of the text.
Information about text elements is used by the OCR analysis engine, which automatically combines the text elements with the text extracted from images, to produce a complete transcript of the text that appears on the page.
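As an illustration of how a client might read these records, the following Python sketch parses a record shaped like the samples in this section and collects the page number, the image dimensions, and each text element with its region and angle. It is a minimal sketch, not part of the product: the function name parse_record and the dictionary layout are hypothetical, the sample record is a trimmed copy of the one shown earlier, and the code assumes that every element carries a text, region, and angle child, as in that sample.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample record from this section.
# The imagedata content is elided ("...") exactly as in the sample.
SAMPLE_RECORD = """
<record>
  <pageNumber>1</pageNumber>
  <trackname>Image_1</trackname>
  <Page>
    <image>
      <imagedata format="PNG">...</imagedata>
      <width>892</width>
      <height>1260</height>
      <pixelAspectRatio>1:1</pixelAspectRatio>
    </image>
    <pagetext>
      <element>
        <text>Some text</text>
        <region>
          <left>115</left>
          <top>503</top>
          <width>460</width>
          <height>41</height>
        </region>
        <angle>0</angle>
      </element>
    </pagetext>
  </Page>
</record>
"""


def parse_record(xml_text):
    """Summarize one record (hypothetical helper, not a product API)."""
    record = ET.fromstring(xml_text)
    image = record.find("Page/image")
    summary = {
        "page": int(record.findtext("pageNumber")),
        "image_size": (
            int(image.findtext("width")),
            int(image.findtext("height")),
        ),
        "pixel_aspect_ratio": image.findtext("pixelAspectRatio"),
        "text_elements": [],
    }
    # An empty <pagetext/> simply yields no text elements.
    for element in record.iterfind("Page/pagetext/element"):
        region = element.find("region")
        summary["text_elements"].append({
            "text": element.findtext("text"),
            "region": {
                name: int(region.findtext(name))
                for name in ("left", "top", "width", "height")
            },
            # Parsed as float on the assumption that angles may be fractional.
            "angle": float(element.findtext("angle")),
        })
    return summary


summary = parse_record(SAMPLE_RECORD)
print(summary["page"], summary["image_size"], len(summary["text_elements"]))
```

Because the region values are accurate only for Adobe PDF input, a consumer of this summary would probably want to treat the region dictionary as approximate for other document types.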