Documents and their Pages, as well as various PDFxStream interfaces and implementations thereof that simplify many PDF data extraction use cases.
Interface | Description |
---|---|
Document | Interface implemented by all objects representing a PDF document. |
DocumentLocation | Represents a unique location within a Document. |
Font | Represents a PDF font. |
Page | Provides access to the text, images, and attributes of a page extracted from a PDF document. |
Class | Description |
---|---|
Bookmark | Instances of this class form a singly-rooted tree available in some PDF documents. |
Configuration | Various configuration options for PDFxStream may be set using this class. |
Console | This class provides a command-line interface to PDFxStream and its capabilities. |
EmbeddedFile | Files in PDF documents may be associated either with the document as a whole, or with annotations that are located on a single page in a particular location. |
EncryptionInfo | Instances of this class provide information about the parameters used to encrypt a PDF document. |
OutputHandler | The base class for all PDF text event handlers. |
OutputTarget | This is a base OutputHandler implementation that directs all text extraction output to an Appendable of your choice, e.g. a Writer, StringBuilder, CharBuffer, and so on. |
PDFDateParser | This class provides methods for parsing PDF-format date/time strings into Dates. |
PDFTextStream | Deprecated |
RegionOutputTarget | This OutputHandler implementation is used to selectively extract text from certain regions of each PDF page. |
SelectionOutputTarget | An OutputTarget derivative that restricts the content added to the given StringBuffer to that within the starting and ending selection points specified in the constructor. |
VisualOutputTarget | This OutputHandler implementation aims to preserve as much of a PDF's text layout as possible so that text extracts will retain the visual arrangement of text as present in the original document. |
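
The OutputTarget entry above says extraction output can go to any Appendable. The sketch below uses only JDK types to show why that is convenient: a single method can feed a StringBuilder, a Writer, or a CharBuffer without change. It does not call PDFxStream; `AppendableTargetsSketch` and `emit` are hypothetical names standing in for whatever produces the text.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.nio.CharBuffer;

final class AppendableTargetsSketch {

    // Hypothetical stand-in for a text producer: appends each chunk to the sink.
    // Appendable.append declares IOException, so it is rethrown unchecked here.
    static void emit(Appendable sink, String... chunks) {
        try {
            for (String chunk : chunks) {
                sink.append(chunk);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // In-memory accumulation.
        StringBuilder builder = new StringBuilder();
        emit(builder, "Hello, ", "PDF");

        // Any Writer works too (a FileWriter would stream to disk instead).
        StringWriter writer = new StringWriter();
        emit(writer, "Hello, ", "PDF");

        // CharBuffer has a fixed capacity; appending past it throws BufferOverflowException.
        CharBuffer buffer = CharBuffer.allocate(16);
        emit(buffer, "Hello, ", "PDF");
        buffer.flip(); // switch from writing to reading

        System.out.println(builder);
        System.out.println(writer);
        System.out.println(buffer);
    }
}
```

A StringBuilder suits small documents held in memory, while a Writer avoids buffering the whole extract for large ones.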
Enum | Description |
---|---|
EncryptedPDFException.ErrorType | An enumeration of the set of possible error types that can be indicated by a thrown EncryptedPDFException. |
PDFVersion | An enumeration corresponding to the PDF specification version levels that can be returned by Document.getPDFVersion(). |
Exception | Description |
---|---|
EncryptedPDFException | A subclass of IOException that is thrown by PDFxStream constructors if one of the following conditions occurs: a variety of encryption is encountered that PDFxStream does not support; an error occurs while decrypting PDF data; or an incorrect password is provided to one of the PDFxStream constructors. |
FaultyPDFException | Exceptions of this type are thrown by PDFxStream when it encounters an error so serious while processing a PDF file that no extraction can take place. |
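
Because EncryptedPDFException is described above as a subclass of IOException, callers who want to treat encryption failures separately must catch it before the general IOException. The sketch below illustrates that ordering with JDK types only. `StandInEncryptedException` and `open` are hypothetical stand-ins, not PDFxStream classes or methods.

```java
import java.io.IOException;

final class CatchOrderSketch {

    // Hypothetical stand-in for a library exception that extends IOException.
    static class StandInEncryptedException extends IOException {
        StandInEncryptedException(String message) {
            super(message);
        }
    }

    // Hypothetical opener that fails in different ways depending on its input.
    static void open(String kind) throws IOException {
        if (kind.equals("encrypted")) {
            throw new StandInEncryptedException("wrong password");
        }
        if (kind.equals("missing")) {
            throw new IOException("no such file");
        }
    }

    static String describe(String kind) {
        try {
            open(kind);
            return "opened";
        } catch (StandInEncryptedException e) {
            // The subclass must come first; after catch (IOException) it would not compile.
            return "encryption problem: " + e.getMessage();
        } catch (IOException e) {
            return "I/O problem: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("encrypted"));
        System.out.println(describe("missing"));
        System.out.println(describe("ok"));
    }
}
```

Code that catches only IOException still handles both cases, but cannot distinguish a bad password from any other I/O failure.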
See PDF to get started.
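
PDFDateParser, listed among the classes above, turns PDF-format date/time strings into Dates. For readers unfamiliar with that format, the sketch below parses the full-precision form `D:YYYYMMDDHHmmSS` followed by `Z` or an offset such as `-08'00'`, using only `java.time`. It is an independent illustration, not PDFDateParser's implementation. `PdfDateSketch` is a hypothetical name, and treating a missing offset as UTC is an assumption made here for simplicity.

```java
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

final class PdfDateSketch {

    // The fourteen date/time digits: year, month, day, hour, minute, second.
    private static final DateTimeFormatter FIELDS =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmss");

    static OffsetDateTime parse(String raw) {
        // The "D:" prefix is conventional; tolerate its absence.
        String body = raw.startsWith("D:") ? raw.substring(2) : raw;
        if (body.length() < 14) {
            throw new IllegalArgumentException(
                    "This sketch handles only full-precision dates: " + raw);
        }
        LocalDateTime local = LocalDateTime.parse(body.substring(0, 14), FIELDS);

        // Remaining text is the zone: "Z", or a sign plus HH'mm' with apostrophes.
        String zone = body.substring(14).replace("'", "");
        ZoneOffset offset = (zone.isEmpty() || zone.startsWith("Z"))
                ? ZoneOffset.UTC            // assumption: no offset means UTC
                : ZoneOffset.of(zone);      // accepts forms like -0800 or +05
        return OffsetDateTime.of(local, offset);
    }

    public static void main(String[] args) {
        System.out.println(parse("D:20240131120000-08'00'"));
        System.out.println(parse("D:19991231235959Z"));
    }
}
```

If a `java.util.Date` is needed, `Date.from(PdfDateSketch.parse(text).toInstant())` converts the result.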