Class PostingsWriterBase

java.lang.Object
org.apache.lucene.codecs.PostingsConsumer
org.apache.lucene.codecs.PostingsWriterBase
All Implemented Interfaces:
Closeable, AutoCloseable
Direct Known Subclasses:
Lucene41PostingsWriter

public abstract class PostingsWriterBase extends PostingsConsumer implements Closeable
Extension of PostingsConsumer to support pluggable term dictionaries.

This class contains additional hooks to interact with the provided term dictionaries such as BlockTreeTermsWriter. If you want to re-use an existing implementation and are only interested in customizing the format of the postings list, extend this class instead.

  • Constructor Details

    • PostingsWriterBase

      protected PostingsWriterBase()
      Sole constructor. (For invocation by subclass constructors, typically implicit.)
  • Method Details

    • init

      public abstract void init(IndexOutput termsOut) throws IOException
      Called once after startup, before any terms have been added. Implementations typically write a header to the provided termsOut.
      Throws:
      IOException
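
A minimal sketch of the "write a header" step described for init, assuming a plain java.io.DataOutputStream as a stand-in for IndexOutput. The class name HeaderSketch, the MAGIC value, and the VERSION value are hypothetical and are not part of Lucene; a real implementation would use whatever header convention its codec defines.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical stand-in for a postings writer's init step; not a Lucene class. */
final class HeaderSketch {
    static final int MAGIC = 0x504F5354; // hypothetical marker, ASCII "POST"
    static final int VERSION = 1;        // hypothetical format version

    /** Writes a fixed-size header before any term data, as init is expected to do. */
    static void init(DataOutputStream termsOut) throws IOException {
        termsOut.writeInt(MAGIC);
        termsOut.writeInt(VERSION);
    }

    /** Convenience for inspecting the header bytes in memory. */
    static byte[] headerBytes() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            init(out);
        } catch (IOException e) {
            // An in-memory stream should not fail; rethrow unchecked if it does.
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }
}
```

A matching reader would check the same marker and version before decoding any terms, which is why the header is written once, up front.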
    • newTermState

      public abstract BlockTermState newTermState() throws IOException
      Returns a newly created, empty TermState.
      Throws:
      IOException
    • startTerm

      public abstract void startTerm() throws IOException
      Starts a new term. Note that a matching call to finishTerm(BlockTermState) is made only if the term has at least one document.
      Throws:
      IOException
    • finishTerm

      public abstract void finishTerm(BlockTermState state) throws IOException
      Finishes the current term. The provided BlockTermState contains the term's summary statistics and, when this method returns, will hold metadata from the PBF.
      Throws:
      IOException
    • encodeTerm

      public abstract void encodeTerm(long[] longs, DataOutput out, FieldInfo fieldInfo, BlockTermState state, boolean absolute) throws IOException
      Encodes the term's metadata as a long[] and a byte[]. absolute controls whether the current term is written in full (true) or delta-encoded against the previous term (false). The elements of longs are usually file pointers, so each one increases monotonically as new terms are consumed. out is used to write generic bytes, which need not be monotonic. NOTE: longs may sometimes contain "don't care" values that are unused. For example, if the format inlines some postings data in the term dictionary, the pointer to the postings list is defined for some terms but not for others. In that case the postings writer should always reuse the last value, so that each element of the long[] metadata remains monotonic.
      Throws:
      IOException
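
The delta encoding and the "don't care" rule described for encodeTerm can be sketched without any Lucene types. This is an illustration under stated assumptions, not the library's implementation: the class MetadataDeltaSketch, its encode method, and the UNDEFINED marker are hypothetical names, and plain long values stand in for the file pointers held in a BlockTermState.

```java
import java.util.Arrays;

/** Hypothetical illustration of monotonic long[] metadata; not a Lucene class. */
final class MetadataDeltaSketch {
    /** Hypothetical marker for a slot that has no pointer for this term. */
    static final long UNDEFINED = -1L;

    // Last value seen in each slot; this is the base for the next delta.
    private final long[] last;

    MetadataDeltaSketch(int longsSize) {
        this.last = new long[longsSize];
    }

    /**
     * Returns the values to store for one term. With absolute == true the
     * base is reset to zero, so the full values are returned; otherwise each
     * slot holds the difference from the previous term.
     */
    long[] encode(long[] pointers, boolean absolute) {
        if (absolute) {
            Arrays.fill(last, 0L);
        }
        long[] longs = new long[last.length];
        for (int i = 0; i < last.length; i++) {
            // A "don't care" slot reuses the last value, so its delta is 0
            // and the slot stays monotonic.
            long value = (pointers[i] == UNDEFINED) ? last[i] : pointers[i];
            longs[i] = value - last[i];
            last[i] = value;
        }
        return longs;
    }
}
```

Because a skipped slot carries the previous value forward, the next term that does define the slot is still encoded as a non-negative delta.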
    • setField

      public abstract int setField(FieldInfo fieldInfo)
      Sets the current field for writing and returns the length of the long[] metadata, which is fixed per field. Called when writing switches to another field.
    • close

      public abstract void close() throws IOException
      Specified by:
      close in interface AutoCloseable
      Specified by:
      close in interface Closeable
      Throws:
      IOException
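
The call order implied by the method descriptions above can be sketched with a recording stand-in. Everything here is hypothetical: RecordingWriter only mirrors the method names of this class with simplified signatures, the field name "body" is invented, and the driver assumes that init runs before the first setField. The point is the pairing rule: startTerm is called for every term, while finishTerm follows only when the term has at least one document.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the call sequence; not a Lucene class. */
final class WriterLifecycleSketch {

    /** Stand-in writer that records the order in which it is called. */
    static final class RecordingWriter {
        final List<String> calls = new ArrayList<>();

        void init() { calls.add("init"); }

        int setField(String field) {
            calls.add("setField(" + field + ")");
            return 1; // hypothetical length of the per-field long[] metadata
        }

        void startTerm() { calls.add("startTerm"); }

        void finishTerm(String term) { calls.add("finishTerm(" + term + ")"); }

        void close() { calls.add("close"); }
    }

    /** Drives the writer the way a term dictionary might, for a single field. */
    static List<String> drive(Map<String, Integer> docFreqByTerm) {
        RecordingWriter writer = new RecordingWriter();
        writer.init();
        writer.setField("body");
        for (Map.Entry<String, Integer> entry : docFreqByTerm.entrySet()) {
            writer.startTerm();
            // finishTerm is skipped for a term that ended up with no documents.
            if (entry.getValue() > 0) {
                writer.finishTerm(entry.getKey());
            }
        }
        writer.close();
        return writer.calls;
    }
}
```

Passing an insertion-ordered map with one term that has documents and one that has none yields a startTerm entry for both terms but a finishTerm entry only for the first.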