Package com.snowtide.pdf
This package contains a variety of core abstractions representing PDF Documents and their Pages, as well as various PDFxStream interfaces and implementations thereof that simplify many PDF data extraction use cases.
See PDF to get started.

Interface Summary

Document
    Interface implemented by all objects representing a PDF document.
DocumentLocation
    Represents a unique location within a Document.
Font
    Represents a PDF font.
OutputSource
Page
    Provides access to the text, images, and attributes of a page extracted from a PDF document.

Class Summary

Bookmark
    Instances of this class form a singly-rooted tree available in some PDF documents.
Configuration
    Various configuration options for PDFxStream may be set using this class.
Console
    This class provides a command-line interface to PDFxStream and its capabilities.
EmbeddedFile
    Files in PDF documents may be associated either with the document as a whole, or with annotations that are located on a single page in a particular location.
EncryptionInfo
    Instances of this class provide information about the parameters used to encrypt a PDF document.
OutputHandler
    The base class for all PDF text event handlers.
OutputTarget
    This is a base OutputHandler implementation that directs all text extraction output to an Appendable of your choice, e.g. a Writer, StringBuilder, CharBuffer, and so on.
PDFDateParser
    This class provides methods for parsing PDF-format date/time strings into Dates.
PDFTextStream
    Deprecated.
RegionOutputTarget
    This OutputHandler implementation is used to selectively extract text from certain regions of each PDF page.
SelectionOutputTarget
    An OutputTarget derivative that restricts the content added to the given StringBuffer to that within the starting and ending selection points specified in the constructor.
VisualOutputTarget
    This OutputHandler implementation aims to preserve as much of a PDF's text layout as possible so that text extracts will retain the visual arrangement of text as present in the original document.
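
The OutputTarget entry notes that extraction output can be directed to any Appendable, such as a Writer, StringBuilder, or CharBuffer. The sketch below uses only the Java standard library to show why that works: all three types implement java.lang.Appendable, so code written against that interface accepts any of them. The class AppendableSinks and its emit method are hypothetical stand-ins for a text-producing handler; they are not part of PDFxStream.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.nio.CharBuffer;

public class AppendableSinks {

    // Hypothetical stand-in for a text-producing handler (not PDFxStream API):
    // it only knows that it was handed some Appendable.
    static void emit(Appendable sink, String... chunks) {
        try {
            for (String chunk : chunks) {
                sink.append(chunk);
            }
        } catch (IOException e) {
            // Appendable.append declares IOException; rethrow unchecked for brevity.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        StringBuilder builder = new StringBuilder();
        StringWriter writer = new StringWriter();
        CharBuffer buffer = CharBuffer.allocate(64);

        // The same call works for every sink type.
        emit(builder, "Hello, ", "PDF");
        emit(writer, "Hello, ", "PDF");
        emit(buffer, "Hello, ", "PDF");

        // A CharBuffer must be flipped before its written content can be read.
        buffer.flip();

        System.out.println(builder);
        System.out.println(writer);
        System.out.println(buffer);
    }
}
```

Choosing the sink is then a matter of where the text should end up: a StringBuilder for in-memory use, a Writer for files or streams, a CharBuffer for fixed-size buffers.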

Enum Summary

Configuration.TelemetryMode
    PDFxStream makes very limited use of remote telemetry, strictly to ensure licensing compliance and to aid Snowtide's technical support operations.
EncryptedPDFException.ErrorType
    An enumeration of the set of possible error types that can be indicated by a thrown EncryptedPDFException.
PDFVersion
    An enumeration corresponding to the PDF specification version levels that can be returned by Document.getPDFVersion().

Exception Summary

EncryptedPDFException
    A subclass of IOException that is thrown by PDFxStream constructors if one of the following conditions occurs:
    - a variety of encryption is encountered that PDFxStream does not support
    - an error occurs while decrypting PDF data
    - an incorrect password is provided to one of the PDFxStream constructors
FaultyPDFException
    Exceptions of this type are thrown by PDFxStream when it encounters such a serious error when attempting to process a PDF file that no extraction can take place.
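
The PDFDateParser entry in the class summary refers to PDF-format date/time strings. As an illustration of what such a string looks like, the following sketch parses the common form D:YYYYMMDDHHmmSSOHH'mm' into a java.util.Date using only the Java standard library. It is not PDFDateParser's implementation: the class name PdfDateSketch and its parse method are hypothetical, missing fields are assumed to take their lowest valid value, and a missing time zone is assumed to mean UTC.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.util.Date;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PdfDateSketch {

    // Year is required; every later field is optional. The zone is either
    // "Z" or a sign followed by HH and optional mm, with optional apostrophes.
    private static final Pattern PDF_DATE = Pattern.compile(
        "(?:D:)?(\\d{4})(\\d{2})?(\\d{2})?(\\d{2})?(\\d{2})?(\\d{2})?"
        + "(?:Z|([+-])(\\d{2})'?(\\d{2})?'?)?");

    // Returns the numeric value of a capture group, or a default if absent.
    private static int field(Matcher m, int group, int defaultValue) {
        String value = m.group(group);
        return value == null ? defaultValue : Integer.parseInt(value);
    }

    // Hypothetical helper, not the PDFxStream API.
    static Date parse(String text) {
        Matcher m = PDF_DATE.matcher(text.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a PDF date string: " + text);
        }
        int year = Integer.parseInt(m.group(1));
        int month = field(m, 2, 1);
        int day = field(m, 3, 1);
        int hour = field(m, 4, 0);
        int minute = field(m, 5, 0);
        int second = field(m, 6, 0);

        // Assumption: no zone information is treated as UTC.
        ZoneOffset offset = ZoneOffset.UTC;
        if (m.group(7) != null) {
            int sign = m.group(7).equals("-") ? -1 : 1;
            offset = ZoneOffset.ofHoursMinutes(
                sign * Integer.parseInt(m.group(8)),
                sign * field(m, 9, 0));
        }
        return Date.from(
            OffsetDateTime.of(year, month, day, hour, minute, second, 0, offset)
                .toInstant());
    }

    public static void main(String[] args) {
        System.out.println(parse("D:20230115123045+05'30'").toInstant());
    }
}
```

This sketch does not handle every variant found in real documents (for example, a Z followed by a redundant 00'00' offset), so production code should prefer the library's own parser.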