com.snowtide.pdf
Class OutputTarget

java.lang.Object
  extended by com.snowtide.pdf.OutputHandler
      extended by com.snowtide.pdf.OutputTarget
Direct Known Subclasses:
SelectionOutputTarget

public class OutputTarget
extends OutputHandler

This is a base OutputHandler implementation that provides a common output interface for Writer and Appendable instances (such as StringBuilders and StringBuffers), allowing PDFTextStream to easily redirect output to either type of object. Not only does using an OutputTarget simplify your code, but it also minimizes the internal buffering that PDFTextStream might otherwise perform when being used as a java.io.Reader implementation.
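The idea of one output interface over both kinds of target can be sketched with the standard library alone. The class below, AppendableSink, is a hypothetical stand-in and not PDFTextStream's implementation; it relies only on the fact that java.io.Writer itself implements java.lang.Appendable, so a single field can hold either kind of object.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in that illustrates the idea; not the library's OutputTarget.
final class AppendableSink {
    private final Appendable out;

    // Accepts a StringBuilder, a StringBuffer, or any java.io.Writer,
    // because Writer implements Appendable.
    AppendableSink(Appendable out) {
        this.out = out;
    }

    void write(CharSequence text) throws IOException {
        out.append(text);
    }

    void write(char c) throws IOException {
        out.append(c);
    }

    // Returns the wrapped output object.
    Object getObject() {
        return out;
    }

    // Writes the same content to whichever target is supplied and
    // returns the target's string form.
    static String demo(Appendable target) {
        try {
            AppendableSink sink = new AppendableSink(target);
            sink.write("page text");
            sink.write('!');
            return sink.getObject().toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling AppendableSink.demo with a new StringBuilder or with a new java.io.StringWriter runs the same code path; only the wrapped object differs.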

Note that since this class provides a baseline OutputHandler implementation that will direct all text content to the provided Writer or Appendable, it is the ideal place to start when building a custom OutputHandler implementation. See the XMLOutputTarget class as an example of how this can be done.

Finally, please note that OutputTarget does not make any attempt to retain the visual layout or formatting of the text extracted from PDF documents. This OutputHandler implementation is focused on:

If your application requires PDF text extracts that retain the visual appearance of the text as it is laid out on each page, then VisualOutputTarget would be more suitable.

Example usage:

 StringBuilder sb = new StringBuilder(1024);
 OutputTarget tgt = new OutputTarget(sb);
 
 // pdfFile is a placeholder for the PDF to read (for example, a java.io.File)
 PDFTextStream stream = new PDFTextStream(pdfFile);
 stream.pipe(tgt);
 stream.close();
 
 // do something with the extracted text...
 processText(sb);
 

Version:
©2004-2012 Snowtide Informatics Systems, Inc.
See Also:
PDFTextStream.pipe(OutputHandler), Page.pipe(OutputHandler), Block.pipe(OutputHandler), Line.pipe(OutputHandler)

Constructor Summary
OutputTarget(java.lang.Appendable sb)
          Creates a new OutputTarget that directs output to the given java.lang.Appendable instance.
OutputTarget(java.io.Writer w)
          Creates a new OutputTarget that directs output to the given java.io.Writer instance.
 
Method Summary
 void endPage(Page page)
          Invoked when PDFTextStream has finished processing a page.
static OutputTarget forBuffer(java.lang.Appendable sb)
          Deprecated. Use OutputTarget(Appendable) instead.
static OutputTarget forWriter(java.io.Writer w)
          Deprecated. Use OutputTarget(Writer) instead.
 PDFTextStreamConfig getConfig()
           Returns the PDFTextStreamConfig instance that this OutputTarget is currently using.
 java.lang.Object getObject()
          Returns the output object that this instance wraps; will be an instance of either java.io.Writer or java.lang.Appendable.
 void linebreaks(int linebreakCnt)
          Default implementation that writes the specified number of line breaks (using the linebreak String provided by the current configuration) to the java.io.Writer or java.lang.Appendable object that this OutputTarget wraps.
 void setConfig(PDFTextStreamConfig config)
          Sets the PDFTextStreamConfig instance this OutputTarget should use.
 void spaces(int spaceCnt)
          Default implementation that writes the specified number of spaces to the java.io.Writer or java.lang.Appendable object that this OutputTarget wraps.
 void startPage(Page page)
          Invoked when a page is about to be processed.
 void textUnit(TextUnit tu)
           Default implementation that writes the character run specified by the given TextUnit instance to the java.io.Writer or java.lang.Appendable object that this OutputTarget wraps.
 void write(char c)
          Writes the provided character to the wrapped output object.
 void write(char[] buf)
          Writes the provided character data to the wrapped output object.
 void write(char[] buf, int start, int len)
          Writes the provided character data to the wrapped output object.
 void write(java.lang.CharSequence sb)
          Writes the provided CharSequence's character data to the wrapped output object.
 void write(java.lang.String str)
          Writes the provided String's character data to the wrapped output object.
 
Methods inherited from class com.snowtide.pdf.OutputHandler
endBlock, endLine, endPDF, startBlock, startLine, startPDF
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

OutputTarget

public OutputTarget(java.io.Writer w)
Creates a new OutputTarget that directs output to the given java.io.Writer instance.


OutputTarget

public OutputTarget(java.lang.Appendable sb)
Creates a new OutputTarget that directs output to the given java.lang.Appendable instance.

Method Detail

forWriter

public static OutputTarget forWriter(java.io.Writer w)
Deprecated. Use OutputTarget(Writer) instead.

Creates a new OutputTarget that wraps a java.io.Writer instance.


forBuffer

public static OutputTarget forBuffer(java.lang.Appendable sb)
Deprecated. Use OutputTarget(Appendable) instead.

Creates a new OutputTarget that wraps a java.lang.Appendable instance.


write

public void write(java.lang.String str)
           throws java.io.IOException
Writes the provided String's character data to the wrapped output object.

Throws:
java.io.IOException - if an error occurs writing the character data; only possible when this OutputTarget wraps a java.io.Writer instance.

write

public void write(java.lang.CharSequence sb)
           throws java.io.IOException
Writes the provided CharSequence's character data to the wrapped output object.

Throws:
java.io.IOException - if an error occurs writing the character data; only possible when this OutputTarget wraps a java.io.Writer instance.

write

public void write(char[] buf,
                  int start,
                  int len)
           throws java.io.IOException
Writes the provided character data to the wrapped output object.

Throws:
java.io.IOException - if an error occurs writing the character data; only possible when this OutputTarget wraps a java.io.Writer instance.

write

public final void write(char[] buf)
                 throws java.io.IOException
Writes the provided character data to the wrapped output object.

Throws:
java.io.IOException - if an error occurs writing the character data; only possible when this OutputTarget wraps a java.io.Writer instance.

write

public void write(char c)
           throws java.io.IOException
Writes the provided character to the wrapped output object.

Throws:
java.io.IOException - if an error occurs writing the character; only possible when this OutputTarget wraps a java.io.Writer instance.
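The reason the write methods declare java.io.IOException even though in-memory buffers never raise it can be shown with plain JDK types. The sketch below is illustrative only; WriteFailureDemo and FailingWriter are hypothetical names and not part of PDFTextStream. Appending through the java.lang.Appendable interface always declares IOException, but only a Writer-backed target can actually throw it.

```java
import java.io.IOException;
import java.io.Writer;

// Hypothetical demonstration class; uses only java.io and java.lang types.
final class WriteFailureDemo {

    // A Writer that always fails, standing in for a broken or closed stream.
    static final class FailingWriter extends Writer {
        @Override
        public void write(char[] buf, int off, int len) throws IOException {
            throw new IOException("simulated write failure");
        }

        @Override
        public void flush() {
        }

        @Override
        public void close() {
        }
    }

    // Returns true if appending to the target raised an IOException.
    static boolean failsOnAppend(Appendable target) {
        try {
            target.append("text");
            return false;
        } catch (IOException e) {
            return true;
        }
    }
}
```

With a StringBuilder the catch block is never reached; with the failing Writer it is, which is why callers that wrap a Writer should be prepared to handle the exception.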

getObject

public java.lang.Object getObject()
Returns the output object that this instance wraps; will be an instance of either java.io.Writer or java.lang.Appendable.


textUnit

public void textUnit(TextUnit tu)

Default implementation that writes the character run specified by the given TextUnit instance to the java.io.Writer or java.lang.Appendable object that this OutputTarget wraps.

This implementation is very straightforward; it is provided here for illustrative purposes only:

 if (tu.getCharacterSequence() == null) {
     // no mapped sequence, append direct character code conversion
     int cc = tu.getCharCode();
     if (cc >= 32) write((char)cc);
 } else {
     write(tu.getCharacterSequence());
 }
 

Overrides:
textUnit in class OutputHandler
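The fallback logic shown above can be exercised without the library by substituting a small stand-in for TextUnit. In the sketch below, the Unit interface and the TextUnitSketch class are hypothetical, and the char[] return type of getCharacterSequence() is an assumption made for illustration. A mapped character sequence is written when present; otherwise the raw character code is written only if it is 32 or greater, which skips control characters.

```java
// Hypothetical stand-in for experimenting with the fallback rule;
// not part of PDFTextStream.
final class TextUnitSketch {

    // Mirrors the two accessors used in the snippet above.
    // The char[] return type is an assumption.
    interface Unit {
        char[] getCharacterSequence();

        int getCharCode();
    }

    // Applies the same rule as the snippet above and returns what would be written.
    static String render(Unit tu) {
        StringBuilder out = new StringBuilder();
        char[] seq = tu.getCharacterSequence();
        if (seq == null) {
            // no mapped sequence, fall back to direct character code conversion
            int cc = tu.getCharCode();
            if (cc >= 32) {
                out.append((char) cc);
            }
        } else {
            out.append(seq);
        }
        return out.toString();
    }

    // Convenience factory for building test units.
    static Unit unit(final char[] seq, final int code) {
        return new Unit() {
            @Override
            public char[] getCharacterSequence() {
                return seq;
            }

            @Override
            public int getCharCode() {
                return code;
            }
        };
    }
}
```

For example, a unit with a mapped sequence of "fi" renders as "fi" regardless of its character code, while an unmapped unit with code 9 (a tab control code) renders as nothing.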

spaces

public void spaces(int spaceCnt)
Default implementation that writes the specified number of spaces to the java.io.Writer or java.lang.Appendable object that this OutputTarget wraps.

Overrides:
spaces in class OutputHandler
Parameters:
spaceCnt - the number of spaces that PDFTextStream recommends should be output

linebreaks

public void linebreaks(int linebreakCnt)
Default implementation that writes the specified number of line breaks (using the linebreak String provided by the current configuration) to the java.io.Writer or java.lang.Appendable object that this OutputTarget wraps.

Overrides:
linebreaks in class OutputHandler
Parameters:
linebreakCnt - the number of line breaks that PDFTextStream recommends should be output
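The behavior described for spaces and linebreaks amounts to repeating a fixed string a given number of times. The following is a minimal sketch under that reading, not the library's code: WhitespaceSketch is a hypothetical class, and the linebreak argument stands in for the linebreak String that the current configuration would supply.

```java
// Hypothetical helper illustrating the described behavior; not part of PDFTextStream.
final class WhitespaceSketch {

    // Repeats the given string count times; a count of zero or less yields "".
    private static String repeat(String unit, int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            sb.append(unit);
        }
        return sb.toString();
    }

    // What a spaces(spaceCnt) call would write.
    static String spaces(int spaceCnt) {
        return repeat(" ", spaceCnt);
    }

    // What a linebreaks(linebreakCnt) call would write, given the
    // linebreak string taken from the configuration.
    static String linebreaks(String linebreak, int linebreakCnt) {
        return repeat(linebreak, linebreakCnt);
    }
}
```

A subclass that wants different whitespace handling (for instance, collapsing runs of line breaks) would override these two callbacks rather than post-processing the output.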

startPage

public void startPage(Page page)
Description copied from class: OutputHandler
Invoked when a page is about to be processed.

Overrides:
startPage in class OutputHandler
Parameters:
page - a reference to the Page that is about to be processed

endPage

public void endPage(Page page)
Description copied from class: OutputHandler
Invoked when PDFTextStream has finished processing a page.

Overrides:
endPage in class OutputHandler
Parameters:
page - a reference to the Page that has been processed

getConfig

public PDFTextStreamConfig getConfig()

Returns the PDFTextStreamConfig instance that this OutputTarget is currently using. Unless a configuration has been set explicitly via setConfig(PDFTextStreamConfig), an OutputTarget synchronizes its configuration with that of the source PDFTextStream instance each time it is passed to either PDFTextStream.pipe(OutputHandler) or Page.pipe(OutputHandler).

If an OutputTarget is used to pipe content only from Block contexts, it uses the default PDFTextStreamConfig instance until a different configuration is set via setConfig(PDFTextStreamConfig).


setConfig

public void setConfig(PDFTextStreamConfig config)
Sets the PDFTextStreamConfig instance this OutputTarget should use. Once a configuration has been set using this method, this OutputTarget stops synchronizing its configuration with the PDFTextStream and Page objects from which it pipes content.
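
The interaction between getConfig and setConfig can be modeled as a small state machine. The sketch below is a simplified model of the documented behavior, not the library's implementation: ConfigSyncSketch and its onPipe method are hypothetical, and a plain String stands in for a PDFTextStreamConfig instance.

```java
// Hypothetical model of the documented configuration rules;
// a String stands in for PDFTextStreamConfig.
final class ConfigSyncSketch {
    private String config = "default";
    private boolean explicitlySet = false;

    String getConfig() {
        return config;
    }

    // An explicit configuration wins from now on.
    void setConfig(String newConfig) {
        this.config = newConfig;
        this.explicitlySet = true;
    }

    // Models what happens when the target is handed to a stream-level
    // or page-level pipe call: adopt the source's configuration unless
    // one was set explicitly.
    void onPipe(String sourceConfig) {
        if (!explicitlySet) {
            this.config = sourceConfig;
        }
    }
}
```

In this model a fresh target reports the default configuration, adopts the source's configuration on each pipe call, and keeps its own configuration once setConfig has been called.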