Package org.apache.lucene.search.spans
The calculus of spans.
A span is a <doc,startPosition,endPosition> tuple.
The following span query operators are implemented:
- A SpanTermQuery matches all spans containing a particular Term.
- A SpanNearQuery matches spans which occur near one another, and can be used to implement things like phrase search (when constructed from SpanTermQuerys) and inter-phrase proximity (when constructed from other SpanNearQuerys).
- A SpanOrQuery merges spans from a number of other SpanQuerys.
- A SpanNotQuery removes spans matching one SpanQuery which overlap (or come near) another. This can be used, e.g., to implement within-paragraph search.
- A SpanFirstQuery matches spans matching q whose end position is less than n. This can be used to constrain matches to the first part of the document.
- A SpanPositionRangeQuery is a more general form of SpanFirstQuery that can constrain matches to arbitrary portions of the document.
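The filtering and merging operators can be illustrated with a small model in plain Java. This is only a sketch and not Lucene's implementation: the class SpanModel, its nested Span type, and the methods positionRange, first, or, and not are hypothetical names introduced here. The model assumes that endPosition is exclusive (one past the last matched position), that a span counts as lying in the first n positions when its end is at most n, and that not drops only spans that overlap an excluded span in the same document (no "comes near" margin).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SpanModel {
    /** A <doc,startPosition,endPosition> tuple; end is assumed to be exclusive. */
    public static final class Span {
        public final int doc, start, end;
        public Span(int doc, int start, int end) {
            this.doc = doc; this.start = start; this.end = end;
        }
        @Override public String toString() {
            return "<" + doc + "," + start + "," + end + ">";
        }
    }

    /** Keeps spans lying entirely inside positions [from, to). */
    public static List<Span> positionRange(List<Span> spans, int from, int to) {
        List<Span> out = new ArrayList<>();
        for (Span s : spans) {
            if (s.start >= from && s.end <= to) out.add(s);
        }
        return out;
    }

    /** Special case of positionRange that only constrains the end. */
    public static List<Span> first(List<Span> spans, int n) {
        return positionRange(spans, 0, n);
    }

    /** Merges the spans of two inputs (no sorting or de-duplication in this sketch). */
    public static List<Span> or(List<Span> a, List<Span> b) {
        List<Span> out = new ArrayList<>(a);
        out.addAll(b);
        return out;
    }

    /** Drops every included span that overlaps an excluded span in the same document. */
    public static List<Span> not(List<Span> include, List<Span> exclude) {
        List<Span> out = new ArrayList<>();
        for (Span s : include) {
            boolean overlaps = false;
            for (Span x : exclude) {
                if (s.doc == x.doc && s.start < x.end && x.start < s.end) overlaps = true;
            }
            if (!overlaps) out.add(s);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Span> hits = Arrays.asList(
            new Span(0, 2, 4), new Span(0, 150, 152), new Span(1, 7, 9));
        List<Span> breaks = Arrays.asList(new Span(1, 8, 9));
        System.out.println(first(hits, 100));  // prints [<0,2,4>, <1,7,9>]
        System.out.println(not(hits, breaks)); // prints [<0,2,4>, <0,150,152>]
    }
}
```

In this model, within-paragraph search corresponds to passing the positions of paragraph-break markers as the exclude list, so that any candidate span crossing a break is dropped.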
For example, a span query which matches "John Kerry" within ten words of "George Bush" within the first 100 words of the document could be constructed with:
// One span query per term in the "content" field.
SpanQuery john = new SpanTermQuery(new Term("content", "john"));
SpanQuery kerry = new SpanTermQuery(new Term("content", "kerry"));
SpanQuery george = new SpanTermQuery(new Term("content", "george"));
SpanQuery bush = new SpanTermQuery(new Term("content", "bush"));
// Slop 0, in order: the two terms must be adjacent, i.e. an exact phrase.
SpanQuery johnKerry =
    new SpanNearQuery(new SpanQuery[] {john, kerry}, 0, true);
SpanQuery georgeBush =
    new SpanNearQuery(new SpanQuery[] {george, bush}, 0, true);
// Slop 10, any order: the two phrases within ten words of each other.
SpanQuery johnKerryNearGeorgeBush =
    new SpanNearQuery(new SpanQuery[] {johnKerry, georgeBush}, 10, false);
// Restrict matches to the first 100 words of the document.
SpanQuery johnKerryNearGeorgeBushAtStart =
    new SpanFirstQuery(johnKerryNearGeorgeBush, 100);
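To make the slop and ordering arguments of the near queries more concrete, here is a toy model in plain Java. It is a sketch under stated assumptions, not Lucene's matching algorithm: NearModel, positions, and orderedNear are hypothetical names, text is tokenized by splitting on whitespace, and slop is taken to be the number of positions strictly between the two terms.

```java
import java.util.ArrayList;
import java.util.List;

public class NearModel {
    /** Positions at which term occurs in the lower-cased, whitespace-tokenized text. */
    public static List<Integer> positions(String text, String term) {
        String[] tokens = text.toLowerCase().split("\\s+");
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            if (tokens[i].equals(term)) out.add(i);
        }
        return out;
    }

    /**
     * Ordered near match of two single terms: first must precede second with
     * at most slop positions between them. Each match is reported as
     * "start-end", where end is exclusive (an assumption of this sketch).
     */
    public static List<String> orderedNear(String text, String first, String second, int slop) {
        List<String> spans = new ArrayList<>();
        for (int a : positions(text, first)) {
            for (int b : positions(text, second)) {
                int gap = b - (a + 1); // positions strictly between the two terms
                if (gap >= 0 && gap <= slop) spans.add(a + "-" + (b + 1));
            }
        }
        return spans;
    }

    public static void main(String[] args) {
        String text = "senator john kerry met john f kerry";
        System.out.println(orderedNear(text, "john", "kerry", 0)); // prints [1-3]
        System.out.println(orderedNear(text, "john", "kerry", 1)); // prints [1-3, 4-7]
    }
}
```

With slop 0 only the adjacent pair matches, which is why an ordered near query with zero slop behaves like a phrase search. Raising the slop to 1 also admits "john f kerry".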
Span queries may be freely intermixed with other Lucene queries. So, for example, the above query can be restricted to documents which also use the word "iraq" with:
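// Both clauses are required (Occur.MUST), as the original true/false flags intended.
BooleanQuery query = new BooleanQuery();
query.add(johnKerryNearGeorgeBushAtStart, BooleanClause.Occur.MUST);
query.add(new TermQuery(new Term("content", "iraq")), BooleanClause.Occur.MUST);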
Class Description
- Wrapper to allow SpanQuery objects to participate in composite single-field SpanQueries by 'lying' about their search field.
- A Spans that is formed from the ordered subspans of a SpanNearQuery where the subspans do not overlap and have a maximum slop between them.
- Similar to NearSpansOrdered, but for the unordered case.
- Matches spans near the beginning of a field.
- SpanMultiTermQueryWrapper<Q extends MultiTermQuery>: Wraps any MultiTermQuery as a SpanQuery, so it can be nested within other SpanQuery classes.
- Abstract class that defines how the query is rewritten.
- A rewrite method that first translates each term into a SpanTermQuery in a BooleanClause.Occur.SHOULD clause in a BooleanQuery, and keeps the scores as computed by the query.
- Only return those matches that have a specific payload at the given position.
- Matches spans which are near one another.
- Removes matches which overlap with another SpanQuery or are within x tokens before or y tokens after another SpanQuery.
- Matches the union of its clauses.
- Only return those matches that have a specific payload at the given position.
- Base class for filtering a SpanQuery based on the position of a match.
- Return value for SpanPositionCheckQuery.acceptPosition(Spans).
- Checks to see if the SpanPositionCheckQuery.getMatch() lies between a start and end position.
- Base class for span-based queries.
- Expert: an enumeration of span matches.
- Public for extension only.
- Matches spans containing a term.
- Expert-only.
- Expert: Public for extension only.