Class LazyTextExtractorField

  • All Implemented Interfaces:
    Serializable, org.apache.lucene.document.Fieldable

    public class LazyTextExtractorField
    extends org.apache.lucene.document.AbstractField
    LazyTextExtractorField implements a Lucene field with a String value that is lazily initialized from a given Reader. In addition, this class provides a method to check whether the purpose of the reader is to extract text and whether that extraction has already finished.
    See Also:
    isExtractorFinished(), Serialized Form
    • Field Summary

      • Fields inherited from class org.apache.lucene.document.AbstractField

        binaryLength, binaryOffset, boost, fieldsData, indexOptions, isBinary, isIndexed, isStored, isTokenized, lazy, name, omitNorms, storeOffsetWithTermVector, storePositionWithTermVector, storeTermVector, tokenStream
    • Constructor Summary

      Constructors 
      Constructor Description
      LazyTextExtractorField(org.apache.tika.parser.Parser parser, InternalValue value, org.apache.tika.metadata.Metadata metadata, Executor executor, boolean highlighting, int maxFieldLength, boolean withNorms)
      Creates a new LazyTextExtractorField.
    • Method Summary

      Modifier and Type Method Description
      byte[] binaryValue()  
      void dispose()
      Releases all resources associated with this field.
      boolean isExtractorFinished()
      Checks whether the text extraction task has finished.
      Reader readerValue()  
      String stringValue()
      Returns the extracted text.
      org.apache.lucene.analysis.TokenStream tokenStreamValue()  
      • Methods inherited from class org.apache.lucene.document.AbstractField

        getBinaryLength, getBinaryOffset, getBinaryValue, getBinaryValue, getBoost, getIndexOptions, getOmitNorms, getOmitTermFreqAndPositions, isBinary, isIndexed, isLazy, isStored, isStoreOffsetWithTermVector, isStorePositionWithTermVector, isTermVectorStored, isTokenized, name, setBoost, setIndexOptions, setOmitNorms, setOmitTermFreqAndPositions, setStoreTermVector, toString
    • Constructor Detail

      • LazyTextExtractorField

        public LazyTextExtractorField(org.apache.tika.parser.Parser parser,
                                      InternalValue value,
                                      org.apache.tika.metadata.Metadata metadata,
                                      Executor executor,
                                      boolean highlighting,
                                      int maxFieldLength,
                                      boolean withNorms)
        Creates a new LazyTextExtractorField.
        Parameters:
        parser -
        value -
        metadata -
        executor -
        highlighting - set to true to enable result highlighting support
        maxFieldLength -
        withNorms -
    • Method Detail

      • stringValue

        public String stringValue()
        Returns the extracted text. This method blocks until the text extraction task has been completed.
        Returns:
        the string value of this field
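        The blocking behaviour described above can be illustrated with a minimal, self-contained sketch. It is not the Jackrabbit implementation: the class LazyTextSketch, its constructor, and the choice to return an empty string on failure are all assumptions made for illustration. Only the idea (extraction runs on an Executor, and stringValue() waits for it) comes from this page.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executor;
import java.util.concurrent.FutureTask;

/** Hypothetical stand-in for a lazily extracted field value; not Jackrabbit code. */
class LazyTextSketch {
    private final FutureTask<String> task;

    LazyTextSketch(Callable<String> extraction, Executor executor) {
        this.task = new FutureTask<>(extraction);
        executor.execute(task); // extraction starts in the background
    }

    /** Blocks until the extraction task has finished, then returns its text. */
    String stringValue() {
        try {
            return task.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return "";
        } catch (ExecutionException e) {
            return ""; // assumption: a failed extraction yields empty text
        }
    }

    /** True once the background task has completed (normally or not). */
    boolean isExtractorFinished() {
        return task.isDone();
    }
}
```

        A caller that invokes stringValue() right after construction simply waits for the background task, which is why long-running extractions can stall the calling thread.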
      • readerValue

        public Reader readerValue()
        Returns:
        always null
      • binaryValue

        public byte[] binaryValue()
        Returns:
        always null
      • tokenStreamValue

        public org.apache.lucene.analysis.TokenStream tokenStreamValue()
        Returns:
        always null
      • isExtractorFinished

        public boolean isExtractorFinished()
        Checks whether the text extraction task has finished.
        Returns:
        true if the extracted text is available
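        Because stringValue() blocks, a caller may want to check isExtractorFinished() first and read the text only when it is available. The sketch below models that pattern with a CompletableFuture. The class NonBlockingReadSketch and the method textIfAvailable() are hypothetical names introduced here, not part of this class.

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;

/** Hypothetical model of a field whose text arrives asynchronously. */
class NonBlockingReadSketch {
    private final CompletableFuture<String> extraction;

    NonBlockingReadSketch(CompletableFuture<String> extraction) {
        this.extraction = extraction;
    }

    /** Mirrors the documented check: true once the text is available. */
    boolean isExtractorFinished() {
        return extraction.isDone();
    }

    /** Blocking read; join() waits for completion. */
    String stringValue() {
        return extraction.join();
    }

    /** Returns the text only if extraction has finished; otherwise returns empty without waiting. */
    Optional<String> textIfAvailable() {
        return isExtractorFinished()
                ? Optional.ofNullable(stringValue())
                : Optional.empty();
    }
}
```

        In this sketch a caller that receives an empty Optional can defer the work and retry later instead of stalling on the extraction.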
      • dispose

        public void dispose()
        Releases all resources associated with this field.
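        One way to make sure dispose() is always called is a try/finally block around whatever consumes the field. The sketch below is illustrative only: the DisposableField interface and the useAndDispose helper are hypothetical, and this page does not state what dispose() releases internally.

```java
import java.util.function.Consumer;

/** Hypothetical minimal view of a field that owns resources. */
interface DisposableField {
    String stringValue();
    void dispose();
}

class DisposeSketch {
    /** Passes the field's text to the consumer and always disposes the field afterwards. */
    static void useAndDispose(DisposableField field, Consumer<String> consumer) {
        try {
            consumer.accept(field.stringValue());
        } finally {
            field.dispose(); // runs even if the consumer throws
        }
    }
}
```

        With this pattern the field is released whether or not the consumer succeeds.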