public class PDFTextStream extends java.lang.Object implements Document
Please use the open methods on the PDF factory class for opening PDF documents, e.g.
PDF.open(java.io.File).
ATTR_AUTHOR, ATTR_CREATION_DATE, ATTR_CREATOR, ATTR_KEYWORDS, ATTR_MOD_DATE, ATTR_PRODUCER, ATTR_SUBJECT, ATTR_TITLE, ATTR_TRAPPED, ATTR_USES_GRAPH_FONTS

| Constructor and Description |
|---|
| PDFTextStream(java.nio.ByteBuffer pdfData, java.lang.String pdfName) Deprecated. |
| PDFTextStream(java.nio.ByteBuffer pdfData, java.lang.String pdfName, byte[] userPasswd) Deprecated. |
| PDFTextStream(java.nio.ByteBuffer pdfData, java.lang.String pdfName, byte[] userPasswd, Configuration config) Deprecated. |
| PDFTextStream(java.io.File pdfFile) Deprecated. |
| PDFTextStream(java.io.File pdfFile, byte[] userPasswd) Deprecated. |
| PDFTextStream(java.io.File pdfFile, byte[] userPasswd, Configuration config) Deprecated. |
| PDFTextStream(java.io.InputStream is, java.lang.String pdfName) Deprecated. |
| PDFTextStream(java.io.InputStream is, java.lang.String pdfName, byte[] userPasswd) Deprecated. |
| PDFTextStream(java.io.InputStream is, java.lang.String pdfName, byte[] userPasswd, Configuration config) Deprecated. |
| PDFTextStream(java.lang.String pdfFilePath) Deprecated. |
| PDFTextStream(java.lang.String pdfFilePath, byte[] userPasswd) Deprecated. |
| PDFTextStream(java.lang.String pdfFilePath, byte[] userPasswd, Configuration config) Deprecated. |

| Modifier and Type | Method and Description |
|---|---|
| void | close() Deprecated. |
| java.util.List<Annotation> | getAllAnnotations() Deprecated. Returns a list containing all of the Annotations contained in the current PDF document. |
| int | getAllAnnotations(java.util.List tgt) Deprecated. Adds to the given List all of the Annotations contained in the current PDF document. |
| java.util.List<EmbeddedFile> | getAllEmbeddedFiles() Deprecated. Returns a list of all of the embedded files available in the source PDF. |
| java.util.List<Annotation> | getAnnotations(int page) Deprecated. Returns a List of all annotations found on the page indicated by the given page number; each object will be an instance of a class that implements the Annotation interface. |
| java.lang.Object | getAttribute(java.lang.String attrName) Deprecated. Returns the value of the specified document-level metadata attribute. |
| java.util.Set | getAttributeKeys() Deprecated. Returns a Set containing the keys of all available document metadata attributes. |
| java.util.Map | getAttributeMap() Deprecated. Returns a Map containing a copy of all keys and values of all available document metadata attributes. |
| Bookmark | getBookmarks() Deprecated. If the current PDF document contains a bookmark tree, this function will return its root node. |
| Configuration | getConfig() Deprecated. Returns the Configuration instance that this Document is using to govern its operation. |
| java.util.List<EmbeddedFile> | getEmbeddedFiles() Deprecated. Returns a list of the embedded files associated with the source PDF document itself. |
| EncryptionInfo | getEncryptionInfo() Deprecated. Returns an EncryptionInfo object, which provides access to some of the parameters used for the current PDF document's encryption. |
| Form | getFormData() Deprecated. Loads the form data contained in the current document, and returns a Form object that represents that data. |
| java.util.Collection<Image> | getImages() Deprecated. |
| java.lang.String | getName() Deprecated. |
| Page | getPage(int n) Deprecated. Reads and returns a single page. |
| int | getPageCnt() Deprecated. Returns the number of pages in the PDF document. |
| java.util.List<Page> | getPages() Deprecated. |
| java.io.File | getPDFFile() Deprecated. Returns a reference to the file that this Document is processing. |
| long | getPdfFileSize() Deprecated. Returns the size of the PDF file being read, in bytes. |
| PDFVersion | getPDFVersion() Deprecated. Returns the PDFVersion instance that corresponds with the version of the PDF file specification to which the current PDF file adheres. |
| byte[] | getXmlMetadata() Deprecated. Returns the XML metadata available from this Document, or null if no XML metadata is available. |
| static boolean | isLicensed() Deprecated. Retained to maintain PDFTextStream v2.x API compatibility. Use PDF.isLicensed() instead. |
| static boolean | loadLicense(java.lang.String path) Deprecated. Retained to maintain PDFTextStream v2.x API compatibility. Use PDF.loadLicense(String) instead. |
| static boolean | loadLicense(java.net.URL licenseLocation) Deprecated. Retained to maintain PDFTextStream v2.x API compatibility. Use PDF.loadLicense(java.net.URL) instead. |
| void | pipe(OutputHandler handler) Deprecated. Extracts all available text from this Document, sending all PDF text events to the given OutputHandler. |
| void | setConfig(Configuration config) Deprecated. Sets the Configuration instance that this Document will use in various contexts to govern its operation. |

public PDFTextStream(java.io.InputStream is, java.lang.String pdfName)
Deprecated. Use the "open" static method in PDF instead; this constructor is provided to ensure backwards compatibility with codebases using the PDFTextStream v2.x API.

public PDFTextStream(java.io.File pdfFile)
Deprecated. Use the "open" static method in PDF instead; this constructor is provided to ensure backwards compatibility with codebases using the PDFTextStream v2.x API.

public PDFTextStream(java.lang.String pdfFilePath)
Deprecated. Use the "open" static method in PDF instead; this constructor is provided to ensure backwards compatibility with codebases using the PDFTextStream v2.x API.

public PDFTextStream(java.io.InputStream is, java.lang.String pdfName, byte[] userPasswd, Configuration config)
Deprecated. Use the "open" static method in PDF instead; this constructor is provided to ensure backwards compatibility with codebases using the PDFTextStream v2.x API.

public PDFTextStream(java.io.InputStream is, java.lang.String pdfName, byte[] userPasswd)
Deprecated. Use the "open" static method in PDF instead; this constructor is provided to ensure backwards compatibility with codebases using the PDFTextStream v2.x API.

public PDFTextStream(java.io.File pdfFile, byte[] userPasswd, Configuration config)
Deprecated. Use the "open" static method in PDF instead; this constructor is provided to ensure backwards compatibility with codebases using the PDFTextStream v2.x API.

public PDFTextStream(java.lang.String pdfFilePath, byte[] userPasswd, Configuration config)
Deprecated. Use the "open" static method in PDF instead; this constructor is provided to ensure backwards compatibility with codebases using the PDFTextStream v2.x API.

public PDFTextStream(java.io.File pdfFile, byte[] userPasswd)
Deprecated. Use the "open" static method in PDF instead; this constructor is provided to ensure backwards compatibility with codebases using the PDFTextStream v2.x API.

public PDFTextStream(java.lang.String pdfFilePath, byte[] userPasswd)
Deprecated. Use the "open" static method in PDF instead; this constructor is provided to ensure backwards compatibility with codebases using the PDFTextStream v2.x API.

public PDFTextStream(java.nio.ByteBuffer pdfData, java.lang.String pdfName, byte[] userPasswd, Configuration config)
Deprecated. Use the "open" static method in PDF instead; this constructor is provided to ensure backwards compatibility with codebases using the PDFTextStream v2.x API.

public PDFTextStream(java.nio.ByteBuffer pdfData, java.lang.String pdfName, byte[] userPasswd)
Deprecated. Use the "open" static method in PDF instead; this constructor is provided to ensure backwards compatibility with codebases using the PDFTextStream v2.x API.

public PDFTextStream(java.nio.ByteBuffer pdfData, java.lang.String pdfName)
Deprecated. Use the "open" static method in PDF instead; this constructor is provided to ensure backwards compatibility with codebases using the PDFTextStream v2.x API.

public static boolean loadLicense(java.lang.String path)
Deprecated. Retained to maintain PDFTextStream v2.x API compatibility. Use PDF.loadLicense(String) instead.

public static boolean loadLicense(java.net.URL licenseLocation)
Deprecated. Retained to maintain PDFTextStream v2.x API compatibility. Use PDF.loadLicense(java.net.URL) instead.

public static boolean isLicensed()
Deprecated. Retained to maintain PDFTextStream v2.x API compatibility. Use PDF.isLicensed() instead.

public void setConfig(Configuration config)
Description copied from interface: Document
Sets the Configuration instance that this Document will use in various contexts to govern its operation.
Note that certain configuration options are utilized only when a Document is being opened. For non-default settings of such options to take effect, a customized Configuration object must either be set as the default configuration or be provided to one of the com.snowtide.PDF.open() static methods that accept a Configuration object, e.g. PDF.open(java.io.File, byte[], Configuration).

public Configuration getConfig()
Description copied from interface: Document
Returns the Configuration instance that this Document is using to govern its operation.

public void pipe(OutputHandler handler)
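Description copied from interface: Document
Extracts all available text from this Document, sending all PDF text events to the given OutputHandler.
If no special PDF text event handling is needed (i.e. you just want a straight text extract), then using an OutputTarget is recommended.
Specified by: pipe in interface Document
Parameters: handler - an OutputHandler instance.
See Also: OutputHandler, OutputTarget

public java.util.Collection<Image> getImages()

public long getPdfFileSize()
Description copied from interface: Document
Returns the size of the PDF file being read, in bytes.
Specified by: getPdfFileSize in interface Document

public int getPageCnt()
Description copied from interface: Document
Returns the number of pages in the PDF document.
Specified by: getPageCnt in interface Document

public Page getPage(int n)
Description copied from interface: Document
Reads and returns a single page.

public java.lang.String getName()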
Description copied from interface: Document
Returns the name of the PDF that this Document is reading; this will be either the name of the PDF file that is being read, or the pdfName String that was provided if this Document was opened using one of the com.snowtide.PDF.open() methods that accept an InputStream or ByteBuffer, e.g. PDF.open(java.io.InputStream, String).
Nearly all of the logging messages generated by PDFxStream include the relevant Document's name, making them easier to interpret in a multithreaded production environment.

public java.io.File getPDFFile()
Description copied from interface: Document
Returns a reference to the file that this Document is processing. This reference may be null if the Document instance is not reading from a File or InputStream.
Specified by: getPDFFile in interface Document

public void close()

public Form getFormData()
Description copied from interface: Document
Loads the form data contained in the current document, and returns a Form object that represents that data. If the current PDF contains no forms, this function returns null.
The Form instance that is returned by this function is guaranteed to be an AcroForm.
This function MUST NOT be called after this Document is closed.
Specified by: getFormData in interface Document

public java.util.List<EmbeddedFile> getEmbeddedFiles()
Description copied from interface: Document
Returns a list of the embedded files associated with the source PDF document itself. Use Document.getAllEmbeddedFiles() to include all embedded files associated with annotations as well.
Specified by: getEmbeddedFiles in interface Document
See Also: Document.getAllEmbeddedFiles()

public java.util.List<EmbeddedFile> getAllEmbeddedFiles()
Description copied from interface: Document
Returns a list of all of the embedded files available in the source PDF. This method includes all files associated with annotations as well; if you only want those embedded files that are associated with the source document itself (and not annotations), use Document.getEmbeddedFiles().
Specified by: getAllEmbeddedFiles in interface Document
See Also: Document.getEmbeddedFiles()

public Bookmark getBookmarks()
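Description copied from interface: Document
If the current PDF document contains a bookmark tree, this function will return its root node.
An exception will be thrown if this function is called after this Document instance is closed.
Specified by: getBookmarks in interface Document
See Also: Bookmark

public java.util.List<Annotation> getAnnotations(int page)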
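Description copied from interface: Document
Returns a List of all annotations found on the page indicated by the given page number; each object will be an instance of a class that implements the Annotation interface.
This function will never return null; if a page contains no annotations, an empty list will be returned. The returned list is guaranteed to offer efficient random access to its elements.
Specified by: getAnnotations in interface Document
See Also: Annotation

public java.util.List<Annotation> getAllAnnotations()
Description copied from interface: Document
Returns a list containing all of the Annotations contained in the current PDF document. The returned list is guaranteed to offer efficient random access to its elements.
Specified by: getAllAnnotations in interface Document

public int getAllAnnotations(java.util.List tgt)
Description copied from interface: Document
Adds to the given List all of the Annotations contained in the current PDF document.
Specified by: getAllAnnotations in interface Document
See Also: Annotation

public PDFVersion getPDFVersion()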
Description copied from interface: Document
Returns the PDFVersion instance that corresponds with the version of the PDF file specification to which the current PDF file adheres. PDF specification version numbers correspond directly with particular versions of Adobe Acrobat.
This method may not be called after the Document is closed.
Specified by: getPDFVersion in interface Document

public EncryptionInfo getEncryptionInfo()
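Description copied from interface: Document
Returns an EncryptionInfo object, which provides access to some of the parameters used for the current PDF document's encryption. If the current PDF document is not encrypted, this method will return null.
Specified by: getEncryptionInfo in interface Document

public byte[] getXmlMetadata()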
Description copied from interface: Document
Returns the XML metadata available from this Document, or null if no XML metadata is available.
Note: This method must be called before the Document is closed, and it should not be called while text is being actively read out of it. (Supporting such concurrency would require synchronization that would negatively impact performance.) Therefore, the best times to call this method are:
- After opening the Document but before reading text out of it
- After reading text out of the Document, but before it is closed
PDFxStream does not control the content returned by this method; it just provides access to the data that is already stored in a PDF document. The schema of the returned XML data is defined by Adobe, and is called the Extensible Metadata Platform (XMP). More information about XMP can be found on Adobe's website.
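Because getXmlMetadata() returns raw bytes, callers have to parse the XMP packet themselves. The following is a minimal sketch of one way to do that with the JDK's built-in XML parser. It does not call any PDFxStream API: the class XmpTitleSketch, the method firstDcTitle, and the sample packet are hypothetical names and data used only for illustration. It assumes the packet is well-formed XML and that a title, if present, is stored in a Dublin Core dc:title element, which is common in XMP but not guaranteed.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of PDFxStream.
class XmpTitleSketch {
    // Dublin Core namespace used by XMP for dc:title.
    static final String DC_NS = "http://purl.org/dc/elements/1.1/";

    /**
     * Returns the text of the first dc:title element in the given XMP bytes,
     * or null if the array is null or no dc:title element is present.
     */
    static String firstDcTitle(byte[] xmp) {
        if (xmp == null) {
            return null; // mirrors the "no XML metadata available" case
        }
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            // Metadata comes from an untrusted file, so refuse DOCTYPE declarations.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Fully qualified to avoid confusion with the PDFxStream Document interface.
            org.w3c.dom.Document dom = builder.parse(new ByteArrayInputStream(xmp));
            NodeList titles = dom.getElementsByTagNameNS(DC_NS, "title");
            return titles.getLength() == 0 ? null : titles.item(0).getTextContent().trim();
        } catch (Exception e) {
            throw new IllegalArgumentException("XMP data could not be parsed as XML", e);
        }
    }

    public static void main(String[] args) {
        // Hand-written sample packet standing in for the bytes returned by getXmlMetadata().
        String sample =
            "<x:xmpmeta xmlns:x=\"adobe:ns:meta/\">"
            + "<rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">"
            + "<rdf:Description rdf:about=\"\" xmlns:dc=\"http://purl.org/dc/elements/1.1/\">"
            + "<dc:title><rdf:Alt><rdf:li xml:lang=\"x-default\">Sample Title</rdf:li></rdf:Alt></dc:title>"
            + "</rdf:Description></rdf:RDF></x:xmpmeta>";
        System.out.println(firstDcTitle(sample.getBytes(StandardCharsets.UTF_8)));
    }
}
```

In real use, the byte array would be obtained from getXmlMetadata() at one of the two points listed above, and the null return value would need to be handled as shown.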
Specified by: getXmlMetadata in interface Document

public java.lang.Object getAttribute(java.lang.String attrName)
Description copied from interface: Document
Returns the value of the specified document-level metadata attribute. All of the standard attribute names are defined in constants in this class, and are all prefixed with 'ATTR_'. A few notes should be kept in mind when accessing attribute values:
- Use the getAttributeKeys() method to get a Set of the names of all available attributes.
- Use parseDateString(String) to get a Date object.
Note: the attributes available through this method are retrieved from the "classic" document /Info entry. The document metadata in an XML format (which typically contains the same set of metadata attributes that are available through this method) may be obtained via the getXmlMetadata() method.
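Dates in a PDF /Info entry are conventionally written as strings of the form D:YYYYMMDDHHmmSSOHH'mm', where everything after the year is optional. The sketch below shows what converting such a string involves; it is an independent illustration, not the library's parseDateString(String), and the class PdfDateSketch and its parse method are hypothetical names. It assumes that missing month and day default to 1, missing time fields default to 0, and a missing time zone is treated as UTC.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of PDFxStream.
class PdfDateSketch {
    // Groups: 1 year, 2 month, 3 day, 4 hour, 5 minute, 6 second,
    // 7 "Z", 8 offset sign, 9 offset hours, 10 offset minutes.
    private static final Pattern PDF_DATE = Pattern.compile(
        "(?:D:)?(\\d{4})(\\d{2})?(\\d{2})?(\\d{2})?(\\d{2})?(\\d{2})?"
        + "(?:([Zz])|([+-])(\\d{2})'?(\\d{2})?'?)?");

    /** Parses a PDF-style date string; throws IllegalArgumentException if it does not match. */
    static OffsetDateTime parse(String s) {
        Matcher m = PDF_DATE.matcher(s.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a PDF date string: " + s);
        }
        ZoneOffset offset = ZoneOffset.UTC; // assumption: no zone information means UTC
        if (m.group(8) != null) {
            int sign = m.group(8).equals("-") ? -1 : 1;
            // ofHoursMinutes requires both arguments to carry the same sign.
            offset = ZoneOffset.ofHoursMinutes(sign * num(m.group(9), 0), sign * num(m.group(10), 0));
        }
        return OffsetDateTime.of(
            Integer.parseInt(m.group(1)),
            num(m.group(2), 1),   // month defaults to January
            num(m.group(3), 1),   // day defaults to the 1st
            num(m.group(4), 0),
            num(m.group(5), 0),
            num(m.group(6), 0),
            0,
            offset);
    }

    // Returns the parsed group, or the default when the optional group is absent.
    private static int num(String group, int dflt) {
        return group == null ? dflt : Integer.parseInt(group);
    }

    public static void main(String[] args) {
        System.out.println(parse("D:20040229123456-05'00'"));
    }
}
```

If a java.util.Date is needed, `java.util.Date.from(PdfDateSketch.parse(value).toInstant())` converts the result.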
Specified by: getAttribute in interface Document
Parameters: attrName - the name of the attribute to be retrieved
See Also: getXmlMetadata() for access to the XML-formatted document metadata

public java.util.Set getAttributeKeys()
Description copied from interface: Document
Returns a Set containing the keys of all available document metadata attributes.
Specified by: getAttributeKeys in interface Document

public java.util.Map getAttributeMap()
Description copied from interface: Document
Returns a Map containing a copy of all keys and values of all available document metadata attributes.
Specified by: getAttributeMap in interface Document