Interface TermToBytesRefAttribute
- All Superinterfaces:
Attribute
- All Known Implementing Classes:
CharTermAttributeImpl, NumericTokenStream.NumericTermAttributeImpl, Token
This attribute is requested by TermsHashPerField to index the contents.
This attribute can be used to customize the final byte[] encoding of terms.
Consumers of this attribute call getBytesRef() up-front, and then invoke fillBytesRef() for each term. Example:
  final TermToBytesRefAttribute termAtt = tokenStream.getAttribute(TermToBytesRefAttribute.class);
  final BytesRef bytes = termAtt.getBytesRef();

  while (tokenStream.incrementToken()) {
    // you must call termAtt.fillBytesRef() before doing something with the bytes.
    // this encodes the term value (internally it might be a char[], etc) into the bytes.
    int hashCode = termAtt.fillBytesRef();

    if (isInteresting(bytes)) {
      // because the bytes are reused by the attribute (like CharTermAttribute's char[] buffer),
      // you should make a copy if you need persistent access to the bytes, otherwise they will
      // be rewritten across calls to incrementToken()
      doSomethingWith(new BytesRef(bytes));
    }
  }
  ...
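The copy-before-keeping rule in the example above can be illustrated without Lucene on the classpath. The following is a minimal sketch under stated assumptions: `Slice`, `TermSource`, `getSlice()`, `fillSlice()` and `collect()` are hypothetical stand-ins invented for this illustration, not Lucene classes or methods. They only mimic the contract described here, in which one byte container is fetched once and refilled for each term.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ReusedBytesSketch {

    /** Hypothetical stand-in for BytesRef: a slice of a reusable byte[]. */
    static final class Slice {
        byte[] bytes = new byte[0];
        int offset;
        int length;

        /** Deep copy of the current contents. */
        Slice copy() {
            Slice c = new Slice();
            c.bytes = Arrays.copyOfRange(bytes, offset, offset + length);
            c.length = length;
            return c;
        }

        String utf8() {
            return new String(bytes, offset, length, StandardCharsets.UTF_8);
        }
    }

    /** Hypothetical stand-in for the attribute: one Slice, refilled per term. */
    static final class TermSource {
        private final Slice shared = new Slice();
        private String current = "";

        void setTerm(String term) { current = term; }

        Slice getSlice() { return shared; }

        /** Encodes the current term into the shared slice and returns its hash. */
        int fillSlice() {
            byte[] encoded = current.getBytes(StandardCharsets.UTF_8);
            if (shared.bytes.length < encoded.length) {
                shared.bytes = new byte[encoded.length];
            }
            System.arraycopy(encoded, 0, shared.bytes, 0, encoded.length);
            shared.offset = 0;
            shared.length = encoded.length;
            int hash = 0;
            for (int i = 0; i < shared.length; i++) {
                hash = 31 * hash + shared.bytes[i];
            }
            return hash;
        }
    }

    /** Keeps every term, by reference or as a copy, then decodes what was kept. */
    static List<String> collect(boolean copy, String... terms) {
        TermSource source = new TermSource();
        Slice slice = source.getSlice();   // fetched once, up-front
        List<Slice> kept = new ArrayList<>();
        for (String term : terms) {
            source.setTerm(term);
            source.fillSlice();            // must precede any read of slice
            kept.add(copy ? slice.copy() : slice);
        }
        List<String> decoded = new ArrayList<>();
        for (Slice s : kept) {
            decoded.add(s.utf8());
        }
        return decoded;
    }

    public static void main(String[] args) {
        System.out.println(collect(false, "alpha", "beta", "gamma")); // prints [gamma, gamma, gamma]
        System.out.println(collect(true, "alpha", "beta", "gamma"));  // prints [alpha, beta, gamma]
    }
}
```

Keeping the reference stores the same object three times, so every entry shows the last term. Copying preserves each term as it was when it was seen.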
-
Method Summary
Modifier and Type | Method | Description
int | fillBytesRef() | Updates the bytes returned by getBytesRef() to contain this term's final encoding, and returns its hashcode.
BytesRef | getBytesRef() | Retrieve this attribute's BytesRef.
-
Method Details
-
fillBytesRef
int fillBytesRef()
Updates the bytes returned by getBytesRef() to contain this term's final encoding, and returns its hashcode.
- Returns:
- the hashcode as defined by BytesRef.hashCode():
  int hash = 0;
  for (int i = termBytes.offset; i < termBytes.offset + termBytes.length; i++) {
    hash = 31 * hash + termBytes.bytes[i];
  }
Implement this for performance reasons, if your code can calculate the hash on-the-fly. If this is not the case, just return termBytes.hashCode().
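To make "calculate the hash on-the-fly" concrete, here is a minimal sketch that accumulates the hash in the same loop that writes the encoded bytes, so no second pass over the buffer is needed. It assumes ASCII-only terms to keep the encoding trivial. `OnTheFlyHashSketch`, `encodeAndHash` and `referenceHash` are hypothetical names for this illustration, and plain `byte[]` arrays stand in for `BytesRef`.

```java
final class OnTheFlyHashSketch {

    /** The loop from the description above, applied to a plain byte[] slice. */
    static int referenceHash(byte[] bytes, int offset, int length) {
        int hash = 0;
        for (int i = offset; i < offset + length; i++) {
            hash = 31 * hash + bytes[i];
        }
        return hash;
    }

    /**
     * Encodes an ASCII-only term into dest, starting at index 0, and
     * accumulates the hash in the same pass. Returns the hash; the number
     * of bytes written equals term.length().
     */
    static int encodeAndHash(CharSequence term, byte[] dest) {
        if (dest.length < term.length()) {
            throw new IllegalArgumentException("dest too small");
        }
        int hash = 0;
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            if (c > 0x7F) {
                // Assumption of this sketch: one byte per character.
                throw new IllegalArgumentException("ASCII-only sketch");
            }
            byte b = (byte) c;
            dest[i] = b;              // write the encoded byte
            hash = 31 * hash + b;     // fold it into the hash immediately
        }
        return hash;
    }

    public static void main(String[] args) {
        byte[] dest = new byte[16];
        String term = "lucene";
        int onTheFly = encodeAndHash(term, dest);
        int twoPass = referenceHash(dest, 0, term.length());
        System.out.println(onTheFly == twoPass); // prints true
    }
}
```

The single-pass result matches the two-pass loop because both fold the same bytes in the same order. An implementation whose encoding step cannot expose bytes one at a time would fall back to hashing the filled buffer afterwards.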
-
getBytesRef
BytesRef getBytesRef()
Retrieve this attribute's BytesRef. The bytes are updated from the current term when the consumer calls fillBytesRef().
- Returns:
- this Attribute's internal BytesRef.
-