Interface TextUnit

  • All Superinterfaces:
    Bounded

    public interface TextUnit
    extends Bounded
    A single character or discrete character grouping positioned within a Line.

    Note that space characters are typically not encoded in PDF documents; rather, they are implicit in the spacing between the bounding boxes of adjacent TextUnits.
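To illustrate how spaces can be recovered from geometry alone, here is a minimal sketch. It does not describe how the library itself infers spacing. The `Glyph` class is a hypothetical stand-in for a TextUnit plus its bounds, and `gapThreshold` is an assumed tuning parameter; neither is part of this API.

```java
import java.util.List;

public class ImplicitSpaces {
    /** Hypothetical stand-in for a positioned TextUnit: its text, left edge, and width. */
    static final class Glyph {
        final String text;
        final float x;
        final float width;

        Glyph(String text, float x, float width) {
            this.text = text;
            this.x = x;
            this.width = width;
        }
    }

    /**
     * Joins glyphs that are already in reading order on a single line,
     * inserting one space wherever the horizontal gap between neighbouring
     * boxes exceeds gapThreshold (an assumed, caller-chosen value).
     */
    static String joinWithSpaces(List<Glyph> glyphs, float gapThreshold) {
        StringBuilder sb = new StringBuilder();
        Glyph prev = null;
        for (Glyph g : glyphs) {
            if (prev != null) {
                // Distance from the previous box's right edge to this box's left edge.
                float gap = g.x - (prev.x + prev.width);
                if (gap > gapThreshold) {
                    sb.append(' ');
                }
            }
            sb.append(g.text);
            prev = g;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Glyph> line = List.of(
                new Glyph("H", 0f, 6f), new Glyph("i", 6f, 3f),
                new Glyph("y", 12f, 5f), new Glyph("o", 17f, 5f), new Glyph("u", 22f, 5f));
        System.out.println(joinWithSpaces(line, 1.5f)); // prints "Hi you"
    }
}
```

A fixed threshold is the simplest choice; in practice a threshold scaled by font size would likely be more robust, since gaps grow with the size of the text.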

    Since:
    v1.4
    Version:
    ©2004-2024 Snowtide
    • Nested Class Summary

      Nested Classes 
      Modifier and Type Interface Description
      static interface  TextUnit.Predicate
      Type to be satisfied when implementing a TextUnit predicate for filtering characters in a Page.
    • Method Summary

      All Methods Instance Methods Abstract Methods 
      Modifier and Type Method Description
      char[] getCharacterSequence()
      Returns the characters that should be rendered for this TextUnit instead of the 'raw' character code provided by getCharCode().
      int getCharCode()
      Returns the 'raw' character code used to encode this TextUnit in the source PDF document.
      Font getFont()
      Returns the Font that was in force when this TextUnit was output.
      float getFontSize()
      Returns the size of the font used to render this TextUnit.
      float getTheta()
      Returns the angle (in degrees) by which this TextUnit's baseline is rotated.
      boolean isStruckThrough()
      Returns true if this TextUnit is struck through (like this).
      boolean isUnderlined()
      Returns true if this TextUnit is underlined (like this).
      • Methods inherited from interface com.snowtide.pdf.layout.Bounded

        bounds
    • Method Detail

      • getCharCode

        int getCharCode()
        Returns the 'raw' character code used to encode this TextUnit in the source PDF document.

        In many cases, this character code is the same as the character's Unicode code point. Otherwise, the font and encoding information in force when the character code was read from the PDF document dictates that a particular character sequence be rendered instead of the Unicode character corresponding to this code; that sequence is available from getCharacterSequence().

      • getCharacterSequence

        char[] getCharacterSequence()
        Returns the characters that should be rendered for this TextUnit instead of the 'raw' character code provided by getCharCode().

        This method may return null, in which case the Unicode character corresponding to the 'raw' character code should be used when rendering this TextUnit.
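The fallback rule described for getCharCode() and getCharacterSequence() can be sketched as follows. `CharSource` is a hypothetical stand-in that mirrors only those two method signatures so the example compiles on its own, and `textOf` and `unit` are illustrative helper names, not part of this API.

```java
public class TextUnitText {
    /** Hypothetical stand-in mirroring the two TextUnit accessors documented above. */
    interface CharSource {
        int getCharCode();
        char[] getCharacterSequence();
    }

    /**
     * Returns the text to render for a unit: the explicit character sequence
     * when one is provided, otherwise the raw character code interpreted as a
     * Unicode code point.
     */
    static String textOf(CharSource unit) {
        char[] sequence = unit.getCharacterSequence();
        if (sequence != null) {
            return new String(sequence);
        }
        // Character.toChars throws IllegalArgumentException for invalid code points.
        return new String(Character.toChars(unit.getCharCode()));
    }

    /** Builds a fixed-value CharSource for demonstration purposes. */
    static CharSource unit(final int charCode, final char[] sequence) {
        return new CharSource() {
            @Override public int getCharCode() { return charCode; }
            @Override public char[] getCharacterSequence() { return sequence; }
        };
    }

    public static void main(String[] args) {
        System.out.println(textOf(unit(65, null)));                   // prints "A"
        System.out.println(textOf(unit(12, new char[] {'f', 'i'})));  // prints "fi"
    }
}
```

With a real TextUnit, the same null check would apply directly to the value returned by getCharacterSequence().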

      • getFont

        Font getFont()
        Returns the Font that was in force when this TextUnit was output.
      • getFontSize

        float getFontSize()
        Returns the size of the font used to render this TextUnit.
      • isUnderlined

        boolean isUnderlined()
        Returns true if this TextUnit is underlined (like this). While this will report an appropriate value for text that is rotated by a "regular" angle (90°, -90°, 180°), it will always return false for text that is rotated by any other angle (e.g. 30°, -45°, 16°).
      • isStruckThrough

        boolean isStruckThrough()
        Returns true if this TextUnit is struck through (like this). This will report an appropriate value for text that is not rotated, and will always return false otherwise.
      • getTheta

        float getTheta()
        Returns the angle (in degrees) by which this TextUnit's baseline is rotated.
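Because isUnderlined() and isStruckThrough() always return false at certain rotations, a caller may want to know whether a false result is meaningful. The sketch below classifies an angle such as the one returned by getTheta(). It assumes that "regular" means any multiple of 90° (including 0°), that "not rotated" means a multiple of 360°, and that a small tolerance is acceptable; the class, method names, and tolerance are all hypothetical.

```java
public class RotationChecks {
    /** Assumed tolerance, in degrees, for comparing floating-point angles. */
    static final float EPS = 0.01f;

    /** True when the angle is (approximately) a multiple of 90 degrees. */
    static boolean isRegularAngle(float thetaDegrees) {
        float remainder = Math.abs(thetaDegrees % 90f);
        return remainder < EPS || 90f - remainder < EPS;
    }

    /** True when the angle is (approximately) a multiple of 360 degrees. */
    static boolean isUnrotated(float thetaDegrees) {
        float remainder = Math.abs(thetaDegrees % 360f);
        return remainder < EPS || 360f - remainder < EPS;
    }

    public static void main(String[] args) {
        System.out.println(isRegularAngle(-90f)); // prints "true"
        System.out.println(isRegularAngle(30f));  // prints "false"
        System.out.println(isUnrotated(180f));    // prints "false"
    }
}
```

Under these assumptions, a false from isUnderlined() is only informative when isRegularAngle(getTheta()) holds, and a false from isStruckThrough() only when isUnrotated(getTheta()) holds.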