Package pdfts.examples
Class GoogleHTMLOutputHandler
- java.lang.Object
  - com.snowtide.pdf.OutputHandler
    - pdfts.examples.GoogleHTMLOutputHandler
public class GoogleHTMLOutputHandler extends OutputHandler
This class is an example OutputHandler implementation that builds an XHTML document to mimic the HTML view that Google offers for indexed PDF documents. Source for this class is included in every PDFxStream bundle.
- Version:
- ©2004-2025 Snowtide
Constructor Summary
Constructor and Description:
- GoogleHTMLOutputHandler()
Method Summary
Modifier and Type, Method, and Description:
- void endPage(Page page): Invoked when PDFxStream has finished processing a page.
- Document getHTMLDocument(): Returns the XHTML document that is built up by this OutputHandler.
- static void main(String[] args): Deprecated. Command-line usage of this class may be moved or removed in future PDFxStream releases.
- void startPage(Page page): Invoked when a page is about to be processed.
- void startPDF(String pdfName, File pdfFile): Invoked when a new PDF is about to be processed.
- void textUnit(TextUnit tu): Invoked when a run of characters is to be outputted, as represented by the given TextUnit instance.
Methods inherited from class com.snowtide.pdf.OutputHandler
endBlock, endLine, endPDF, endSpan, linebreaks, spaces, startBlock, startLine, startSpan
Constructor Detail
GoogleHTMLOutputHandler
public GoogleHTMLOutputHandler() throws ParserConfigurationException, FactoryConfigurationError
Method Detail
main
@Deprecated public static void main(String[] args) throws Exception
Deprecated. Command-line usage of this class may be moved or removed in future PDFxStream releases.
Main method for command-line execution. Usage:
java GoogleHTMLOutputHandler [input_pdf_file] [output_html_path]
- Throws:
Exception
getHTMLDocument
public Document getHTMLDocument()
Returns the XHTML document that is built up by this OutputHandler.
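If the returned Document is an org.w3c.dom.Document (an assumption; the constructor's ParserConfigurationException hints at JAXP, but this page does not say so), it could be serialized with the standard javax.xml.transform API. The sketch below uses only JDK classes. DomSerializer, toXml, and sampleDocument are hypothetical names, and the hand-built document merely stands in for whatever getHTMLDocument() returns.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper, not part of PDFxStream.
public class DomSerializer {

    // Serializes a DOM document to a string using the JDK's identity transformer.
    public static String toXml(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        // Force XML output; otherwise a root element named "html" selects the HTML method.
        transformer.setOutputProperty(OutputKeys.METHOD, "xml");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Stand-in for the document a handler might return (assumed shape, for illustration only).
    public static Document sampleDocument() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element html = doc.createElement("html");
        Element body = doc.createElement("body");
        Element p = doc.createElement("p");
        p.setTextContent("Hello PDF");
        body.appendChild(p);
        html.appendChild(body);
        doc.appendChild(html);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toXml(sampleDocument()));
    }
}
```

To write to a file instead of a string, a StreamResult constructed from a java.io.File can be passed to transform in the same way.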
startPage
public void startPage(Page page)
Description copied from class: OutputHandler
Invoked when a page is about to be processed.
- Overrides:
startPage in class OutputHandler
- Parameters:
page - a reference to the Page that is about to be processed
endPage
public void endPage(Page page)
Description copied from class: OutputHandler
Invoked when PDFxStream has finished processing a page.
- Overrides:
endPage in class OutputHandler
- Parameters:
page - a reference to the Page that has been processed
startPDF
public void startPDF(String pdfName, File pdfFile)
Description copied from class: OutputHandler
Invoked when a new PDF is about to be processed.
- Overrides:
startPDF in class OutputHandler
- Parameters:
pdfName - the 'name' of the PDF document, as provided by Document.getName()
pdfFile - the file reference PDFxStream is about to begin processing. This reference may be null if the source Document is not reading from a File or InputStream.
textUnit
public void textUnit(TextUnit tu)
Description copied from class: OutputHandler
Invoked when a run of characters is to be outputted, as represented by the given TextUnit instance.
- Overrides:
textUnit in class OutputHandler
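The callback descriptions above imply a rough order of events: startPDF once, then startPage, zero or more textUnit calls, and endPage for each page. The sketch below imitates that flow with plain JDK types to show how an XHTML DOM might be accumulated. It is not the source of GoogleHTMLOutputHandler: XhtmlSketchHandler is a hypothetical class, it does not extend OutputHandler, and the String and int parameters are stand-ins for the real Page and TextUnit arguments.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical stand-in that mirrors the callback names documented on this page.
public class XhtmlSketchHandler {
    private final Document doc;
    private final Element title;
    private final Element body;
    private Element currentPage;
    private StringBuilder pageText;

    public XhtmlSketchHandler() throws ParserConfigurationException {
        doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element html = doc.createElement("html");
        doc.appendChild(html);
        Element head = doc.createElement("head");
        html.appendChild(head);
        title = doc.createElement("title");
        head.appendChild(title);
        body = doc.createElement("body");
        html.appendChild(body);
    }

    // Records the PDF name as the document title.
    public void startPDF(String pdfName) {
        title.setTextContent(pdfName);
    }

    // Opens a new container element for the page (assumed layout: one div per page).
    public void startPage(int pageNumber) {
        currentPage = doc.createElement("div");
        currentPage.setAttribute("id", "page-" + pageNumber);
        pageText = new StringBuilder();
    }

    // Buffers a run of characters for the current page.
    public void textUnit(String text) {
        pageText.append(text);
    }

    // Flushes the buffered text into the page element and attaches it to the body.
    public void endPage(int pageNumber) {
        currentPage.setTextContent(pageText.toString());
        body.appendChild(currentPage);
        currentPage = null;
    }

    public Document getHTMLDocument() {
        return doc;
    }

    public static void main(String[] args) throws Exception {
        XhtmlSketchHandler handler = new XhtmlSketchHandler();
        handler.startPDF("sample.pdf");
        handler.startPage(1);
        handler.textUnit("Hel");
        handler.textUnit("lo");
        handler.endPage(1);
        System.out.println(handler.getHTMLDocument().getElementsByTagName("div").getLength());
    }
}
```

Buffering text until endPage keeps each page's element self-contained; a real handler would also need to account for the inherited line, block, and spacing callbacks, which this sketch ignores.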