Interface Summary | |
---|---|
Font | Represents a PDF font. |
Page | Instances of this class provide access to the text and attributes of a page extracted from a PDF document. |
Class Summary | |
---|---|
Bookmark | Instances of this class form a singly-rooted tree available in some PDF documents. |
EncryptionInfo | Instances of this class provide information about the parameters used to encrypt a PDF document. |
OutputHandler | The base class for all PDF text event handlers. |
OutputTarget | This is a base OutputHandler implementation that provides a common output interface for Writer and Appendable instances (such as StringBuilders and StringBuffers), allowing PDFTextStream to easily redirect output to either type of object. |
PDFDateParser | This class provides methods for parsing PDF-format date/time strings into java.util.Date objects. |
PDFTextStream | PDFTextStream gives your Java, .NET, and Python applications the ability to: extract text and metadata from PDF documents (including metadata like XMP data, bookmarks, and annotations); extract and update interactive AcroForm data; and merge PDF documents. Instances of this class can either access a PDF file directly or process equivalent data delivered via a java.io.InputStream or java.nio.ByteBuffer. |
PDFTextStreamConfig | Various configuration options for PDFTextStream may be set using this class. |
PDFVersion | A typesafe enumeration class that provides singleton objects corresponding to each possible PDFVersion instance that might be returned by calls to PDFTextStream.getPDFVersion(). |
RegionOutputTarget | This OutputHandler implementation is used to selectively extract text from certain regions of each PDF page. |
VisualOutputTarget | This OutputHandler implementation aims to preserve as much of a PDF's text layout as possible, so that the extracted text retains the visual arrangement of the original document. |
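
The OutputTarget entry above describes a single output interface that serves both Writer and Appendable instances. The following is a minimal sketch of that general idea using only the JDK; it is not PDFTextStream code, and the class `AppendableSinkSketch` and its `emit` method are hypothetical names chosen for illustration. It relies on the fact that `java.io.Writer` itself implements `java.lang.Appendable`, so one method typed against `Appendable` accepts a StringBuilder, a StringBuffer, or any Writer.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;

// Hypothetical illustration only; not part of the com.snowtide.pdf API.
class AppendableSinkSketch {

    // Appends the given words, separated by single spaces, to any Appendable.
    // Writer implements Appendable, so Writers are accepted as well.
    static void emit(Appendable out, String... words) {
        try {
            for (int i = 0; i < words.length; i++) {
                if (i > 0) {
                    out.append(' ');
                }
                out.append(words[i]);
            }
        } catch (IOException e) {
            // Appendable.append declares IOException; rethrow unchecked
            // so callers writing to in-memory targets need no try/catch.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        StringBuilder builder = new StringBuilder();
        StringWriter writer = new StringWriter();

        emit(builder, "extracted", "page", "text");
        emit(writer, "extracted", "page", "text");

        // Both targets receive identical content.
        System.out.println(builder.toString().equals(writer.toString())); // prints "true"
    }
}
```

Typing the sink against `Appendable` rather than `Writer` keeps in-memory targets such as StringBuilder usable without wrapping them in a Writer.
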
Exception Summary | |
---|---|
EncryptedPDFException | A subclass of IOException that is thrown by PDFTextStream constructors if one of the following conditions occurs: a variety of encryption is encountered that PDFTextStream does not support; an error occurs while decrypting PDF data; or an incorrect password is provided to one of the PDFTextStream constructors. |
FaultyPDFException | Exceptions of this type are thrown by PDFTextStream when it encounters an error while processing a PDF file that is serious enough to prevent any extraction. |
PDFTextStream is a library that provides high-performance, accurate text and metadata extraction,
and is easy to integrate with your applications and web services in Java, .NET, and Python environments.
This javadoc is the authoritative reference for PDFTextStream on all three platforms; its API is identical regardless of
your development environment.
The com.snowtide.pdf package is where the main PDFTextStream class resides.
In addition, PDFTextStream comes with an integration module
for use with the Jakarta Lucene indexing and search library.
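
To give a sense of the "PDF-format date/time strings" that PDFDateParser is described as handling, here is a minimal sketch that converts such a string to a java.util.Date using only the JDK. It is not the PDFDateParser implementation and does not use its API; the class `PdfDateSketch` and its `parse` method are hypothetical. It assumes the date layout from the PDF specification, `D:YYYYMMDDHHmmSSOHH'mm'`, where every field after the year is optional, and it further assumes that a string with no time zone information should be read as UTC.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.util.Date;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of the com.snowtide.pdf API.
class PdfDateSketch {

    // Groups: 1 year, 2 month, 3 day, 4 hour, 5 minute, 6 second,
    // 7 zone sign (Z, + or -), 8 offset hours, 9 offset minutes.
    private static final Pattern PDF_DATE = Pattern.compile(
        "(?:D:)?(\\d{4})(\\d{2})?(\\d{2})?(\\d{2})?(\\d{2})?(\\d{2})?"
        + "(?:([Z+-])(?:(\\d{2})'?(?:(\\d{2})'?)?)?)?");

    static Date parse(String text) {
        Matcher m = PDF_DATE.matcher(text.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a PDF date string: " + text);
        }
        // Missing month and day default to 1; missing time fields default to 0.
        LocalDateTime local = LocalDateTime.of(
            Integer.parseInt(m.group(1)),
            field(m, 2, 1), field(m, 3, 1),
            field(m, 4, 0), field(m, 5, 0), field(m, 6, 0));

        // Assumption: no zone information means UTC.
        ZoneOffset offset = ZoneOffset.UTC;
        String sign = m.group(7);
        if (sign != null && !sign.equals("Z")) {
            int seconds = field(m, 8, 0) * 3600 + field(m, 9, 0) * 60;
            offset = ZoneOffset.ofTotalSeconds(sign.equals("-") ? -seconds : seconds);
        }
        return Date.from(local.toInstant(offset));
    }

    // Returns the numeric value of an optional group, or a default if absent.
    private static int field(Matcher m, int group, int defaultValue) {
        String value = m.group(group);
        return value == null ? defaultValue : Integer.parseInt(value);
    }

    public static void main(String[] args) {
        Date date = parse("D:20240115103000+02'00'");
        System.out.println(date.toInstant()); // prints "2024-01-15T08:30:00Z"
    }
}
```

The sketch does not validate field ranges beyond what `LocalDateTime.of` and `ZoneOffset.ofTotalSeconds` enforce, so malformed values surface as `java.time.DateTimeException`.
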