Class DefaultHighlighter

  • Direct Known Subclasses:
    WeightedHighlighter

    public class DefaultHighlighter
    extends Object
    An adapted version of the FulltextHighlighter posted in issue LUCENE-644.

    Important: for this highlighter to function properly, the field must be indexed with a term vector that stores token offsets.
    Use the Field constructor Field(String, String, Field.Store, Field.Index, Field.TermVector), where the last argument is either Field.TermVector.WITH_POSITIONS_OFFSETS or Field.TermVector.WITH_OFFSETS.

    See Also:
    TermPositionVector, TermFreqVector
    • Constructor Detail

      • DefaultHighlighter

        protected DefaultHighlighter()
    • Method Detail

      • highlight

        public static String highlight​(org.apache.lucene.index.TermPositionVector tvec,
                                       Set<org.apache.lucene.index.Term[]> queryTerms,
                                       String text,
                                       String excerptStart,
                                       String excerptEnd,
                                       String fragmentStart,
                                       String fragmentEnd,
                                       String hlStart,
                                       String hlEnd,
                                       int maxFragments,
                                       int surround)
                                throws IOException
        Parameters:
        tvec - the term position vector for this hit
        queryTerms - the query terms.
        text - the original text that was used to create the tokens.
        excerptStart - this string is prepended to the excerpt
        excerptEnd - this string is appended to the excerpt
        fragmentStart - this string is prepended to every fragment
        fragmentEnd - this string is appended to every fragment
        hlStart - the string prepended to a highlighted token, for example "<b>"
        hlEnd - the string appended to a highlighted token, for example "</b>"
        maxFragments - the maximum number of fragments
        surround - the maximum number of chars surrounding a highlighted token
        Returns:
        a String with text fragments where tokens from the query are highlighted
        Throws:
        IOException
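        The way the marker parameters nest can be illustrated with a minimal, self-contained sketch. It is not this class's implementation: the class name OffsetHighlightSketch is hypothetical, match positions are passed as plain {start, end} character offsets instead of a TermPositionVector, one fragment is built per match, and no escaping is applied.

```java
import java.util.List;

// Hypothetical illustration only; not the DefaultHighlighter implementation.
final class OffsetHighlightSketch {

    // offsets: {start, end} character offsets of matched tokens (end exclusive),
    // assumed sorted and non-overlapping.
    static String highlight(String text, List<int[]> offsets,
                            String excerptStart, String excerptEnd,
                            String fragmentStart, String fragmentEnd,
                            String hlStart, String hlEnd,
                            int maxFragments, int surround) {
        StringBuilder sb = new StringBuilder(excerptStart);
        int fragments = 0;
        for (int[] o : offsets) {
            if (fragments == maxFragments) {
                break; // never emit more than maxFragments fragments
            }
            fragments++;
            // keep at most `surround` characters on each side of the token
            int from = Math.max(0, o[0] - surround);
            int to = Math.min(text.length(), o[1] + surround);
            sb.append(fragmentStart)
              .append(text, from, o[0])
              .append(hlStart).append(text, o[0], o[1]).append(hlEnd)
              .append(text, o[1], to)
              .append(fragmentEnd);
        }
        return sb.append(excerptEnd).toString();
    }
}
```

        In this sketch the excerpt markers wrap the whole result, the fragment markers wrap each fragment, and the highlight markers wrap each matched token. The "<div>" and "<span>" values used to exercise it are arbitrary choices, not defaults of this class.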
      • highlight

        public static String highlight​(org.apache.lucene.index.TermPositionVector tvec,
                                       Set<org.apache.lucene.index.Term[]> queryTerms,
                                       String text,
                                       int maxFragments,
                                       int surround)
                                throws IOException
        Parameters:
        tvec - the term position vector for this hit
        queryTerms - the query terms.
        text - the original text that was used to create the tokens.
        maxFragments - the maximum number of fragments
        surround - the maximum number of chars surrounding a highlighted token
        Returns:
        a String with text fragments where tokens from the query are highlighted
        Throws:
        IOException
      • createDefaultExcerpt

        protected String createDefaultExcerpt​(String text,
                                              String excerptStart,
                                              String excerptEnd,
                                              String fragmentStart,
                                              String fragmentEnd,
                                              int maxLength)
                                       throws IOException
        Creates a default excerpt with the given text.
        Parameters:
        text - the text.
        excerptStart - the excerpt start.
        excerptEnd - the excerpt end.
        fragmentStart - the fragment start.
        fragmentEnd - the fragment end.
        maxLength - the maximum length of the fragment.
        Returns:
        a default excerpt.
        Throws:
        IOException - if an error occurs while reading from the text.
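        As a rough illustration of how these parameters might combine, the sketch below wraps the first maxLength characters of the text in a single fragment inside the excerpt markers. The truncation rule and the class name DefaultExcerptSketch are assumptions made for this example; the actual method may choose the cut-off point differently and may escape the text.

```java
// Hypothetical illustration only; the real method's truncation rule may differ.
final class DefaultExcerptSketch {

    static String createDefaultExcerpt(String text,
                                       String excerptStart, String excerptEnd,
                                       String fragmentStart, String fragmentEnd,
                                       int maxLength) {
        // clamp the cut-off so it stays within the bounds of the text
        int end = Math.min(text.length(), Math.max(0, maxLength));
        return excerptStart + fragmentStart
                + text.substring(0, end)
                + fragmentEnd + excerptEnd;
    }
}
```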
      • escape

        protected String escape​(String input)
        Escapes the input text so that it is suitable for the output format.

        By default, this method performs XML escaping. Subclasses can override it for other output formats.

        Parameters:
        input - raw text.
        Returns:
        the suitably escaped text.
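        A minimal sketch of what XML escaping typically involves is shown below, replacing the five characters that have predefined XML entities. The class name XmlEscapeSketch is hypothetical, and the exact set of characters the default implementation escapes is an assumption.

```java
// Hypothetical illustration only; the default implementation may escape a different set.
final class XmlEscapeSketch {

    static String escape(String input) {
        StringBuilder sb = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&apos;"); break;
                default:   sb.append(c);        // all other characters pass through
            }
        }
        return sb.toString();
    }
}
```

        A subclass that targets a different output format would override escape with the rules of that format.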