com.snowtide.pdf
Class PDFTextStreamConfig

java.lang.Object
  extended by com.snowtide.pdf.PDFTextStreamConfig

public class PDFTextStreamConfig
extends java.lang.Object

Various configuration options for PDFTextStream may be set using this class. A custom configuration may be registered with PDFTextStream in any of three ways:

Since:
v2.2
Version:
©2004-2012 Snowtide Informatics Systems, Inc.

Constructor Summary
PDFTextStreamConfig()
           
PDFTextStreamConfig(PDFTextStreamConfig other)
          Creates a copy of the given PDFTextStreamConfig instance.
 
Method Summary
static PDFTextStreamConfig getDefaultConfig()
          Returns the configuration that new PDFTextStream instances use by default (which is settable via setDefaultConfig(PDFTextStreamConfig)).
 java.lang.String getLinebreakString()
          Returns the string that OutputTarget (and its subclasses) output for each linebreak identified in extracted PDF content.
 int getMinTableCellCount()
          Returns the minimum number of adjacent cells that must be present in order for PDFTextStream to recognize those cells collectively as a Table.
static boolean isCJKSupportEnabled()
          Returns true if PDFTextStream is set to extract and decode Chinese, Japanese, and Korean content.
 boolean isDeriveType3Fonts()
          Returns true if this configuration will cause PDFTextStream to derive the Unicode encodings of Type3 PDF fonts.
 boolean isImplicitLineDetectionEnabled()
           
 boolean isMemoryMappingEnabled()
          Deprecated. Memory-mapping of opened PDF files is disabled by default, and will be removed as an option in future PDFTextStream releases.
 boolean isStripXFAFormDataEnabled()
           
static void setCJKSupportEnabled(boolean enableCJK)
          Changes the setting that controls whether or not PDFTextStream extracts and decodes Chinese, Japanese, and Korean content.
static void setDefaultConfig(PDFTextStreamConfig defaultConfig)
          Sets the configuration that new PDFTextStream instances use by default.
 void setDeriveType3Fonts(boolean deriveType3Fonts)
          Changes the setting that controls whether or not PDFTextStream derives the Unicode encodings of Type3 PDF fonts.
 void setImplicitLineDetectionEnabled(boolean detectImplicitLines)
           
 void setLinebreakString(java.lang.String linebreak)
          Sets the string that OutputTarget (and its subclasses) output for each linebreak identified in extracted PDF content.
 void setMemoryMappingEnabled(boolean memoryMappingEnabled)
          Deprecated. Memory-mapping of opened PDF files is disabled by default, and will be removed as an option in future PDFTextStream releases.
 void setMinTableCellCount(int minTableCellCount)
          Changes the setting that controls the minimum number of adjacent cells that must be present in order for PDFTextStream to recognize those cells collectively as a Table.
 void setStripXFAFormDataEnabled(boolean stripXFAFormData)
           
 java.lang.String toString()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

PDFTextStreamConfig

public PDFTextStreamConfig(PDFTextStreamConfig other)
Creates a copy of the given PDFTextStreamConfig instance.


PDFTextStreamConfig

public PDFTextStreamConfig()


Method Detail

getDefaultConfig

public static PDFTextStreamConfig getDefaultConfig()
Returns the configuration that new PDFTextStream instances use by default (which is settable via setDefaultConfig(PDFTextStreamConfig)).


setDefaultConfig

public static void setDefaultConfig(PDFTextStreamConfig defaultConfig)
Sets the configuration that new PDFTextStream instances use by default.
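
One plausible use of getDefaultConfig() and setDefaultConfig(PDFTextStreamConfig) together is to copy the current default, adjust the copy, and register it as the new default. The sketch below assumes that workflow; it is not taken from the PDFTextStream documentation. So that it compiles without the library on the classpath, it declares a small stand-in class, StandInConfig, that mirrors only the constructor and method signatures listed on this page. Its field defaults follow the documented defaults, but its internals, and the names DefaultConfigSketch and installCustomDefault, are hypothetical. With PDFTextStream available, com.snowtide.pdf.PDFTextStreamConfig would take the stand-in's place.

```java
public class DefaultConfigSketch {

    /** Hypothetical stand-in that mirrors signatures documented for PDFTextStreamConfig. */
    public static class StandInConfig {
        private static StandInConfig defaultConfig = new StandInConfig();

        private int minTableCellCount = 4;                                // documented default
        private String linebreak = System.getProperty("line.separator"); // documented default

        public StandInConfig() {}

        /** Copy constructor: copies each setting from the given instance. */
        public StandInConfig(StandInConfig other) {
            this.minTableCellCount = other.minTableCellCount;
            this.linebreak = other.linebreak;
        }

        public static StandInConfig getDefaultConfig() { return defaultConfig; }
        public static void setDefaultConfig(StandInConfig config) { defaultConfig = config; }

        public int getMinTableCellCount() { return minTableCellCount; }
        public void setMinTableCellCount(int count) { minTableCellCount = count; }
        public String getLinebreakString() { return linebreak; }
        public void setLinebreakString(String s) { linebreak = s; }
    }

    /** Copies the current default, customizes the copy, and registers it as the default. */
    public static StandInConfig installCustomDefault() {
        StandInConfig custom = new StandInConfig(StandInConfig.getDefaultConfig());
        custom.setMinTableCellCount(6);   // require more adjacent cells before a table is recognized
        custom.setLinebreakString("\n");  // fixed linebreak string regardless of platform
        StandInConfig.setDefaultConfig(custom);
        return custom;
    }

    public static void main(String[] args) {
        StandInConfig custom = installCustomDefault();
        System.out.println(StandInConfig.getDefaultConfig() == custom);
        System.out.println(custom.getMinTableCellCount());
    }
}
```

Working on a copy leaves the previously registered default object unmodified, which matters if other code still holds a reference to it. This assumes the real copy constructor copies values rather than sharing state, as the stand-in does.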


toString

public java.lang.String toString()
Overrides:
toString in class java.lang.Object

isStripXFAFormDataEnabled

public boolean isStripXFAFormDataEnabled()

setStripXFAFormDataEnabled

public void setStripXFAFormDataEnabled(boolean stripXFAFormData)

getMinTableCellCount

public int getMinTableCellCount()
Returns the minimum number of adjacent cells that must be present in order for PDFTextStream to recognize those cells collectively as a Table. This setting defaults to 4.


setMinTableCellCount

public void setMinTableCellCount(int minTableCellCount)
Changes the setting that controls the minimum number of adjacent cells that must be present in order for PDFTextStream to recognize those cells collectively as a Table. This setting defaults to 4.


isMemoryMappingEnabled

public boolean isMemoryMappingEnabled()
Deprecated. Memory-mapping of opened PDF files is disabled by default, and will be removed as an option in future PDFTextStream releases.

Returns true if this configuration will cause PDFTextStream to memory-map PDF files. This setting defaults to false.


setMemoryMappingEnabled

public void setMemoryMappingEnabled(boolean memoryMappingEnabled)
Deprecated. Memory-mapping of opened PDF files is disabled by default, and will be removed as an option in future PDFTextStream releases.

Changes the setting that controls whether or not PDFTextStream memory-maps PDF files. This setting defaults to false.


isImplicitLineDetectionEnabled

public boolean isImplicitLineDetectionEnabled()

setImplicitLineDetectionEnabled

public void setImplicitLineDetectionEnabled(boolean detectImplicitLines)

isCJKSupportEnabled

public static boolean isCJKSupportEnabled()
Returns true if PDFTextStream is set to extract and decode Chinese, Japanese, and Korean content. This method is static, so the setting is shared rather than held by an individual configuration instance. This setting defaults to true.


setCJKSupportEnabled

public static void setCJKSupportEnabled(boolean enableCJK)
Changes the setting that controls whether or not PDFTextStream extracts and decodes Chinese, Japanese, and Korean content. This setting defaults to true. Changing it to false will minimize PDFTextStream's memory utilization, but no CJK content will be extracted.


isDeriveType3Fonts

public boolean isDeriveType3Fonts()
Returns true if this configuration will cause PDFTextStream to derive the Unicode encodings of Type3 PDF fonts. This setting defaults to true.


setDeriveType3Fonts

public void setDeriveType3Fonts(boolean deriveType3Fonts)
Changes the setting that controls whether or not PDFTextStream derives the Unicode encodings of Type3 PDF fonts. This setting defaults to true. Changing it to false will result in a small performance improvement, but any PDF content rendered using Type3 fonts that lack a Unicode encoding will not be extracted by PDFTextStream.


getLinebreakString

public java.lang.String getLinebreakString()
Returns the string that OutputTarget (and its subclasses) output for each linebreak identified in extracted PDF content. This value defaults to the current platform's line break string, as identified by the line.separator system property.


setLinebreakString

public void setLinebreakString(java.lang.String linebreak)
Sets the string that OutputTarget (and its subclasses) output for each linebreak identified in extracted PDF content. This value defaults to the current platform's line break string, as identified by the line.separator system property.
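
The platform default mentioned above can be inspected with standard Java alone. The sketch below is not PDFTextStream code; it only reads the line.separator system property, which is the value the default linebreak string is documented to follow, and makes its control characters visible. The class name LineSeparatorDemo and its methods are illustrative.

```java
public class LineSeparatorDemo {

    /** Returns the platform line separator as read from the system property. */
    public static String platformLinebreak() {
        return System.getProperty("line.separator");
    }

    /** Makes carriage returns and newlines visible as the two-character escapes \r and \n. */
    public static String escape(String s) {
        return s.replace("\r", "\\r").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        // Output depends on the platform the JVM runs on.
        System.out.println("line.separator = " + escape(platformLinebreak()));
    }
}
```

Passing a fixed string such as "\n" to setLinebreakString(String) is one way to keep extracted output identical across platforms, at the cost of no longer following the local convention.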