PDFTextStream is a library for high-performance, accurate text and metadata extraction from PDF documents. It is easy to integrate with applications and web services in Java, .NET, and Python environments.

Packages

Package
    Description
com.snowtide.pdf
    PDFTextStream is a library for high-performance, accurate text and metadata extraction from PDF documents. It is easy to integrate with applications and web services in Java, .NET, and Python environments.
com.snowtide.pdf.annot
    The com.snowtide.pdf.annot package contains the interfaces and classes that PDFTextStream uses to represent the various types of annotations found in PDF documents.
com.snowtide.pdf.forms
    The com.snowtide.pdf.forms package contains the classes that support PDFTextStream's form extraction functionality.
com.snowtide.pdf.layout
com.snowtide.pdf.lucene
    The com.snowtide.pdf.lucene package provides seamless integration between PDFTextStream and Apache Lucene, the full-text indexing and search engine for the Java environment.
com.snowtide.pdf.util
com.snowtide.util.logging
pdfts.examples