Class GoogleHTMLOutputHandler


  • public class GoogleHTMLOutputHandler
    extends OutputHandler

    This class is an example OutputHandler implementation that builds an XHTML document to mimic the HTML view that Google offers for indexed PDF documents.

    Source for this class is included in every PDFxStream bundle.

    Version:
    ©2004-2024 Snowtide
    • Method Detail

      • main

        public static void main(String[] args)
                         throws Exception
        Deprecated.
        Command-line usage of this class may be moved or removed in future PDFxStream releases.
        Main method for command-line execution. Usage:

        java GoogleHTMLOutputHandler [input_pdf_file] [output_html_path]

        Throws:
        Exception
      • getHTMLDocument

        public Document getHTMLDocument()
        Returns the XHTML document that is built up by this OutputHandler.
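A caller will usually want to write the returned document out as markup. The sketch below is one hedged way to do that with the JDK's standard transformer API. It assumes the `Document` returned by getHTMLDocument() is an `org.w3c.dom.Document`, which this page does not state. The class name `XhtmlSerializationSketch` and the `sampleDocument()` helper are hypothetical stand-ins so the example runs without PDFxStream on the classpath.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper class; not part of PDFxStream.
final class XhtmlSerializationSketch {

    // Serializes any W3C DOM document to a string of XML markup.
    // Assumption: the handler's getHTMLDocument() result could be passed here.
    static String serialize(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.METHOD, "xml");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Builds a tiny stand-in document in place of a real handler result.
    static Document sampleDocument() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
        Element html = doc.createElement("html");
        Element body = doc.createElement("body");
        Element paragraph = doc.createElement("p");
        paragraph.setTextContent("Hello, PDF");
        body.appendChild(paragraph);
        html.appendChild(body);
        doc.appendChild(html);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize(sampleDocument()));
    }
}
```

In real use, `sampleDocument()` would be replaced by the document obtained from the handler once extraction has finished, and the `StringWriter` could be swapped for a file-backed `StreamResult`.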
      • startPage

        public void startPage(Page page)
        Description copied from class: OutputHandler
        Invoked when a page is about to be processed.
        Overrides:
        startPage in class OutputHandler
        Parameters:
        page - a reference to the Page that is about to be processed
      • endPage

        public void endPage(Page page)
        Description copied from class: OutputHandler
        Invoked when PDFxStream has finished processing a page.
        Overrides:
        endPage in class OutputHandler
        Parameters:
        page - a reference to the Page that has been processed
      • startPDF

        public void startPDF(String pdfName,
                             File pdfFile)
        Description copied from class: OutputHandler
        Invoked when a new PDF is about to be processed.
        Overrides:
        startPDF in class OutputHandler
        Parameters:
        pdfName - the 'name' of the PDF document, as provided by Document.getName()
        pdfFile - the file reference PDFxStream is about to begin processing. This reference may be null if the source Document is not reading from a File or InputStream.
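The startPDF, startPage, and endPage callbacks described above suggest a simple pattern: create the document skeleton when the PDF starts, open a container per page, and attach it when the page ends. The sketch below illustrates that pattern only. `SketchHandler` is a hypothetical stand-in, not the PDFxStream `OutputHandler`, and it takes an `int` page number where the real callbacks receive a `Page`. The actual GoogleHTMLOutputHandler may structure its output differently.

```java
import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical illustration; none of these types belong to PDFxStream.
final class PageCallbackSketch {

    // Stand-in for the three callbacks documented above.
    abstract static class SketchHandler {
        void startPDF(String pdfName, File pdfFile) {}
        void startPage(int pageNumber) {}
        void endPage(int pageNumber) {}
    }

    // Builds one <div> per page inside an html/body skeleton.
    static final class DomBuildingHandler extends SketchHandler {
        private final Document doc;
        private Element body;
        private Element currentPage;

        DomBuildingHandler() throws Exception {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
        }

        @Override
        void startPDF(String pdfName, File pdfFile) {
            // pdfFile may be null, so only pdfName is used for the title.
            Element html = doc.createElement("html");
            Element head = doc.createElement("head");
            Element title = doc.createElement("title");
            title.setTextContent(pdfName);
            head.appendChild(title);
            html.appendChild(head);
            body = doc.createElement("body");
            html.appendChild(body);
            doc.appendChild(html);
        }

        @Override
        void startPage(int pageNumber) {
            currentPage = doc.createElement("div");
            currentPage.setAttribute("id", "page-" + pageNumber);
        }

        @Override
        void endPage(int pageNumber) {
            // Attach the finished page container to the document body.
            body.appendChild(currentPage);
            currentPage = null;
        }

        Document getDocument() {
            return doc;
        }
    }

    public static void main(String[] args) throws Exception {
        DomBuildingHandler handler = new DomBuildingHandler();
        handler.startPDF("sample.pdf", null);
        for (int page = 1; page <= 2; page++) {
            handler.startPage(page);
            handler.endPage(page);
        }
        System.out.println(
                handler.getDocument().getElementsByTagName("div").getLength());
    }
}
```

Deferring the append until endPage keeps a partially built page out of the document if processing stops midway through that page.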