Package pdfts.examples
Class GoogleHTMLOutputHandler
- java.lang.Object
  - com.snowtide.pdf.OutputHandler
    - pdfts.examples.GoogleHTMLOutputHandler
public class GoogleHTMLOutputHandler extends OutputHandler
This class is an example OutputHandler implementation that builds an XHTML document to mimic the HTML view that Google offers for indexed PDF documents. Source for this class is included in every PDFxStream bundle.
- Version: ©2004-2024 Snowtide
Constructor Summary
Constructors:
- GoogleHTMLOutputHandler()
-
Method Summary
Methods (modifier and type, method, description):
- void endPage(Page page): Invoked when PDFxStream has finished processing a page.
- Document getHTMLDocument(): Returns the XHTML document that is built up by this OutputHandler.
- static void main(String[] args): Deprecated. Command-line usage of this class may be moved or removed in future PDFxStream releases.
- void startPage(Page page): Invoked when a page is about to be processed.
- void startPDF(String pdfName, File pdfFile): Invoked when a new PDF is about to be processed.
- void textUnit(TextUnit tu): Invoked when a run of characters is to be outputted, as represented by the given TextUnit instance.
Methods inherited from class com.snowtide.pdf.OutputHandler
endBlock, endLine, endPDF, endSpan, linebreaks, spaces, startBlock, startLine, startSpan
Constructor Detail
-
GoogleHTMLOutputHandler
public GoogleHTMLOutputHandler() throws ParserConfigurationException, FactoryConfigurationError
Method Detail
-
main
public static void main(String[] args) throws Exception
Deprecated. Command-line usage of this class may be moved or removed in future PDFxStream releases.
Main method for command-line execution. Usage: java GoogleHTMLOutputHandler [input_pdf_file] [output_html_path]
- Throws: Exception
-
getHTMLDocument
public Document getHTMLDocument()
Returns the XHTML document that is built up by this OutputHandler.
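As an illustration of what a caller might do with the returned document, the sketch below serializes a DOM tree to a string using the JDK's standard transformer API. It assumes that the Document returned by getHTMLDocument() is an org.w3c.dom.Document, which this page does not state explicitly. The class name XhtmlSerializationSketch and the hand-built sample document are hypothetical stand-ins so that the example runs without PDFxStream on the classpath.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper class; not part of PDFxStream.
class XhtmlSerializationSketch {

    // Builds a tiny stand-in document. In real use, the document would
    // instead come from GoogleHTMLOutputHandler.getHTMLDocument()
    // (assumed here to be an org.w3c.dom.Document).
    static Document buildSampleDocument() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
            Element html = doc.createElement("html");
            Element body = doc.createElement("body");
            Element p = doc.createElement("p");
            p.setTextContent("Hello from page 1");
            body.appendChild(p);
            html.appendChild(body);
            doc.appendChild(html);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Serializes a DOM document to a string of XML markup,
    // omitting the XML declaration.
    static String serialize(Document doc) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.METHOD, "xml");
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serialize(buildSampleDocument()));
    }
}
```

To write to a file rather than a string, the StringWriter can be swapped for a StreamResult constructed over a java.io.File.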
-
startPage
public void startPage(Page page)
Description copied from class: OutputHandler
Invoked when a page is about to be processed.
- Overrides: startPage in class OutputHandler
- Parameters:
  page - a reference to the Page that is about to be processed
-
endPage
public void endPage(Page page)
Description copied from class: OutputHandler
Invoked when PDFxStream has finished processing a page.
- Overrides: endPage in class OutputHandler
- Parameters:
  page - a reference to the Page that has been processed
-
startPDF
public void startPDF(String pdfName, File pdfFile)
Description copied from class: OutputHandler
Invoked when a new PDF is about to be processed.
- Overrides: startPDF in class OutputHandler
- Parameters:
  pdfName - the 'name' of the PDF document, as provided by Document.getName()
  pdfFile - the file reference PDFxStream is about to begin processing. This reference may be null if the source Document is not reading from a File or InputStream.
-
textUnit
public void textUnit(TextUnit tu)
Description copied from class: OutputHandler
Invoked when a run of characters is to be outputted, as represented by the given TextUnit instance.
- Overrides: textUnit in class OutputHandler