public class SelectionOutputTarget extends OutputTarget
An OutputTarget derivative that restricts the content added to the given StringBuffer to the text between the starting and ending selection points specified in the constructor. This implementation is most commonly paired with an interactive UI in which a user specifies a range of content to select.
Example:

    // start point (x1, y1) and end point (x2, y2) of the selection
    float x1 = 0, y1 = 450, x2 = 390, y2 = 72;
    PDFTextStream stream = new PDFTextStream(pdfFile);
    StringBuffer sb = new StringBuffer();
    SelectionOutputTarget tgt = new SelectionOutputTarget(sb, x1, y1, x2, y2);
    // pipe the first page through the selection target
    stream.getPage(0).pipe(tgt);
    stream.close();
    String selectedText = sb.toString();

Note that this OutputHandler retains OutputTarget's read-ordering and segmentation semantics, so the selection's start and end points are interpreted within that context; they are not the corners of a bounding or crop box as with RegionOutputTarget. That is, a start point within the first column of a page and an end point in the second column will result in all of the intervening text being extracted.
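The difference between a read-order selection and a bounding-box selection can be illustrated with a small, self-contained model. This is only a conceptual sketch and does not use or reproduce PDFTextStream's internals: the `Run` class, the nearest-run lookup, and the two-column sample page are all hypothetical stand-ins chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class SelectionModelSketch {
    /** Hypothetical positioned run of text; not a PDFTextStream type. */
    static final class Run {
        final String text;
        final float x, y;
        Run(String text, float x, float y) { this.text = text; this.x = x; this.y = y; }
    }

    /** Index of the run whose origin is closest to (x, y). */
    static int nearest(List<Run> runs, float x, float y) {
        int best = -1;
        double bestDist = Double.MAX_VALUE;
        for (int i = 0; i < runs.size(); i++) {
            double dx = runs.get(i).x - x, dy = runs.get(i).y - y;
            double d = dx * dx + dy * dy;
            if (d < bestDist) { bestDist = d; best = i; }
        }
        return best;
    }

    /** Read-order selection: every run between the start and end runs, inclusive. */
    static String selectReadOrder(List<Run> runs, float x1, float y1, float x2, float y2) {
        int a = nearest(runs, x1, y1), b = nearest(runs, x2, y2);
        StringBuilder sb = new StringBuilder();
        for (int i = Math.min(a, b); i <= Math.max(a, b); i++) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(runs.get(i).text);
        }
        return sb.toString();
    }

    /** Box selection: only runs whose origin lies inside the rectangle. */
    static String selectBox(List<Run> runs, float x1, float y1, float x2, float y2) {
        float minX = Math.min(x1, x2), maxX = Math.max(x1, x2);
        float minY = Math.min(y1, y2), maxY = Math.max(y1, y2);
        StringBuilder sb = new StringBuilder();
        for (Run r : runs) {
            if (r.x >= minX && r.x <= maxX && r.y >= minY && r.y <= maxY) {
                if (sb.length() > 0) sb.append(' ');
                sb.append(r.text);
            }
        }
        return sb.toString();
    }

    /** Two columns listed in reading order: column A top to bottom, then column B. */
    static List<Run> twoColumnPage() {
        List<Run> runs = new ArrayList<>();
        runs.add(new Run("A1", 72, 700));
        runs.add(new Run("A2", 72, 680));
        runs.add(new Run("A3", 72, 660));
        runs.add(new Run("B1", 320, 700));
        runs.add(new Run("B2", 320, 680));
        runs.add(new Run("B3", 320, 660));
        return runs;
    }

    public static void main(String[] args) {
        List<Run> page = twoColumnPage();
        // Start in column A (at A2), end in column B (just below B2).
        System.out.println(selectReadOrder(page, 72, 680, 320, 675)); // A2 A3 B1 B2
        System.out.println(selectBox(page, 72, 680, 320, 675));       // A2 B2
    }
}
```

In this model the read-order selection picks up A3 and B1 because they fall between the two points in reading order, while the box selection sees only the runs inside the rectangle.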
Constructor and Description |
---|
SelectionOutputTarget(java.lang.StringBuffer sb, float x1, float y1, float x2, float y2) |
Modifier and Type | Method and Description |
---|---|
void | endPage(Page p): Invoked when PDFTextStream has finished processing a page. |
void | startPage(Page p): Invoked when a page is about to be processed. |
void | textUnit(TextUnit ch): Default implementation that writes the character run specified by the given TextUnit instance to the java.io.Writer or java.lang.Appendable object that this OutputTarget wraps. |
Methods inherited from class OutputTarget: forBuffer, forWriter, getConfig, getObject, linebreaks, setConfig, spaces, write, write, write, write, write
Methods inherited from class OutputHandler: endBlock, endLine, endPDF, startBlock, startLine, startPDF
public SelectionOutputTarget(java.lang.StringBuffer sb, float x1, float y1, float x2, float y2)
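Because this class is typically driven by an interactive UI, the selection points often begin as screen positions. The following is a minimal sketch of one way to convert them, under assumptions not stated in this documentation: that the page coordinates are in points with the origin at the bottom-left corner and y increasing upward, that the screen origin is at the top-left with y increasing downward, and that the page is rendered at a uniform scale. The `toPagePoint` helper and its parameters are hypothetical.

```java
public class SelectionPointMapper {
    /**
     * Converts a screen-space pixel position to page-space {x, y}.
     * Assumes a bottom-left page origin and a uniform scale in pixels per point.
     */
    static float[] toPagePoint(float screenX, float screenY, float pageHeightPts, float scale) {
        float x = screenX / scale;
        float y = pageHeightPts - (screenY / scale); // flip the vertical axis
        return new float[] { x, y };
    }

    public static void main(String[] args) {
        float pageHeight = 792f; // assumed page height in points (US Letter)
        float scale = 2f;        // assumed rendering scale: 2 pixels per point

        float[] start = toPagePoint(0f, 684f, pageHeight, scale);    // {0, 450}
        float[] end = toPagePoint(780f, 1440f, pageHeight, scale);   // {390, 72}

        System.out.println(start[0] + ", " + start[1]);
        System.out.println(end[0] + ", " + end[1]);
        // The results could then be passed on, for example:
        // new SelectionOutputTarget(sb, start[0], start[1], end[0], end[1]);
    }
}
```

If the viewer applies rotation, cropping, or a non-uniform zoom, the mapping would need to account for that as well.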
public void startPage(Page p)
Description copied from class: OutputHandler
Invoked when a page is about to be processed.
Overrides: startPage in class OutputTarget
Parameters: p - a reference to the Page that is about to be processed

public void endPage(Page p)
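Description copied from class: OutputHandler
Invoked when PDFTextStream has finished processing a page.
Overrides: endPage in class OutputTarget
Parameters: p - a reference to the Page that has been processed

public void textUnit(TextUnit ch)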
Description copied from class: OutputTarget
Default implementation that writes the character run specified by the given TextUnit instance to the java.io.Writer or java.lang.Appendable object that this OutputTarget wraps.
This implementation is very straightforward; it is provided here for illustrative purposes only:

    // tu is the TextUnit passed to textUnit (named ch in the signature above)
    if (tu.getCharacterSequence() == null) {
        // no mapped sequence, append direct character code conversion
        int cc = tu.getCharCode();
        if (cc >= 32) write((char)cc);
    } else {
        write(tu.getCharacterSequence());
    }

Overrides: textUnit in class OutputTarget