Interface TermToBytesRefAttribute

All Superinterfaces:
Attribute
All Known Implementing Classes:
CharTermAttributeImpl, NumericTokenStream.NumericTermAttributeImpl, Token

public interface TermToBytesRefAttribute extends Attribute
TermsHashPerField requests this attribute when it indexes the contents. The attribute can be used to customize the final byte[] encoding of terms.

Consumers of this attribute call getBytesRef() up-front, and then invoke fillBytesRef() for each term. Example:

   final TermToBytesRefAttribute termAtt = tokenStream.getAttribute(TermToBytesRefAttribute.class);
   final BytesRef bytes = termAtt.getBytesRef();

   while (tokenStream.incrementToken()) {

     // you must call termAtt.fillBytesRef() before doing something with the bytes.
     // this encodes the term value (internally it might be a char[], etc) into the bytes.
     int hashCode = termAtt.fillBytesRef();

     if (isInteresting(bytes)) {
     
       // because the bytes are reused by the attribute (like CharTermAttribute's char[] buffer),
       // you should make a copy if you need persistent access to the bytes, otherwise they will
       // be rewritten across calls to incrementToken()

       doSomethingWith(new BytesRef(bytes));
     }
   }
   ...
 
  • Method Summary

    Modifier and Type    Method            Description
    int                  fillBytesRef()    Updates the bytes returned by getBytesRef() to contain this term's final encoding, and returns its hashcode.
    BytesRef             getBytesRef()     Retrieve this attribute's BytesRef.
  • Method Details

    • fillBytesRef

      int fillBytesRef()
      Updates the bytes returned by getBytesRef() to contain this term's final encoding, and returns its hashcode.
      Returns:
      the hashcode as defined by BytesRef.hashCode():
        int hash = 0;
        for (int i = termBytes.offset; i < termBytes.offset+termBytes.length; i++) {
          hash = 31*hash + termBytes.bytes[i];
        }
       
      If your code can calculate the hash on the fly while it encodes the term, do so here for performance. Otherwise, just return termBytes.hashCode().
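      As a rough illustration of computing the hash while encoding, the sketch below folds the hash loop shown above into the same pass that writes the bytes. It is not Lucene code: TermHashSketch, hash and fillAndHash are hypothetical names, plain arrays stand in for BytesRef, and the term is assumed to be ASCII-only so that one char maps to one byte.

```java
// Stand-in sketch: plain arrays replace Lucene's BytesRef so the example runs on its own.
final class TermHashSketch {

    // Hash over bytes[offset .. offset+length), using the loop from the Returns section.
    static int hash(byte[] bytes, int offset, int length) {
        int hash = 0;
        for (int i = offset; i < offset + length; i++) {
            hash = 31 * hash + bytes[i];
        }
        return hash;
    }

    // Encodes an ASCII-only term into dest and accumulates the hash in the same pass.
    // Assumption: every char is below 0x80 and dest has room for termLength bytes.
    static int fillAndHash(char[] term, int termLength, byte[] dest) {
        int hash = 0;
        for (int i = 0; i < termLength; i++) {
            byte b = (byte) term[i];
            dest[i] = b;            // write the encoded byte
            hash = 31 * hash + b;   // update the hash without a second pass
        }
        return hash;
    }
}
```

      Under these assumptions, fillAndHash returns the same value that hash would compute over the filled bytes afterwards, so a consumer sees no difference apart from the saved second pass.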
    • getBytesRef

      BytesRef getBytesRef()
      Retrieve this attribute's BytesRef. The bytes are updated from the current term when the consumer calls fillBytesRef().
      Returns:
      this Attribute's internal BytesRef.
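      Because the same BytesRef is handed out once and refilled for every term, a reference kept by the consumer changes under it. The sketch below mimics that behaviour with a hypothetical stand-in class (ReusedBufferSketch is not part of Lucene, and its 16-byte ASCII buffer is an assumption made for brevity); it only illustrates why a copy is needed for persistent access.

```java
import java.util.Arrays;

// Stand-in for an attribute that owns one reusable buffer; this is not a Lucene class.
final class ReusedBufferSketch {
    private final byte[] shared = new byte[16]; // assumption: terms fit in 16 bytes
    private int length;

    // Plays the role of getBytesRef(): always hands out the same array.
    byte[] buffer() { return shared; }

    int length() { return length; }

    // Plays the role of fillBytesRef(): overwrites the shared array with the next ASCII term.
    void fill(String asciiTerm) {
        length = asciiTerm.length();
        for (int i = 0; i < length; i++) {
            shared[i] = (byte) asciiTerm.charAt(i);
        }
    }

    // Snapshot of the current term that survives later fill() calls.
    byte[] copyCurrent() { return Arrays.copyOf(shared, length); }
}
```

      If a consumer calls fill("foo"), takes copyCurrent(), and then calls fill("bar"), the array obtained from buffer() now starts with 'b' while the copy still holds "foo". This mirrors the new BytesRef(bytes) copy in the consumer example.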