Apache Jackrabbit : SpellChecker

The Lucene-based query handler implementation supports a pluggable spell checker mechanism. Spell checking is not available by default; you have to configure it first. See the spellCheckerClass parameter of the query handler configuration. Jackrabbit currently provides an implementation in the sandbox area, which uses the lucene-spellchecker contrib (http://wiki.apache.org/jakarta-lucene/SpellChecker). The dictionary is derived from the fulltext indexed content of the workspace and updated periodically. You can configure the refresh interval by picking one of the available inner classes of org.apache.jackrabbit.core.query.lucene.spell.LuceneSpellChecker:

  • OneMinuteRefreshInterval
  • FiveMinutesRefreshInterval
  • ThirtyMinutesRefreshInterval
  • OneHourRefreshInterval
  • SixHoursRefreshInterval
  • TwelveHoursRefreshInterval
  • OneDayRefreshInterval

For example, if you want a refresh interval of six hours, the class name is org.apache.jackrabbit.core.query.lucene.spell.LuceneSpellChecker$SixHoursRefreshInterval. If you use org.apache.jackrabbit.core.query.lucene.spell.LuceneSpellChecker itself, the refresh interval is one hour.
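The naming rule can be sketched in a few lines of Java. This is only an illustration: SpellCheckerClassNames and forMinutes are hypothetical names that do not exist in Jackrabbit, and the sketch merely joins the outer class name and one of the inner class names listed above with the `$` separator that Java uses for nested classes. The resulting string is what you would supply as the value of spellCheckerClass.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical helper, not part of Jackrabbit: maps a refresh interval in
 * minutes to the spell checker class name described in the text.
 */
final class SpellCheckerClassNames {

    /** Outer class; used on its own it refreshes every hour. */
    static final String BASE =
            "org.apache.jackrabbit.core.query.lucene.spell.LuceneSpellChecker";

    /** Interval in minutes -> inner class name, as listed on this page. */
    private static final Map<Integer, String> INTERVALS = new LinkedHashMap<>();
    static {
        INTERVALS.put(1, "OneMinuteRefreshInterval");
        INTERVALS.put(5, "FiveMinutesRefreshInterval");
        INTERVALS.put(30, "ThirtyMinutesRefreshInterval");
        INTERVALS.put(60, "OneHourRefreshInterval");
        INTERVALS.put(360, "SixHoursRefreshInterval");
        INTERVALS.put(720, "TwelveHoursRefreshInterval");
        INTERVALS.put(1440, "OneDayRefreshInterval");
    }

    /** Returns the fully qualified class name for the given interval. */
    static String forMinutes(int minutes) {
        String inner = INTERVALS.get(minutes);
        if (inner == null) {
            throw new IllegalArgumentException(
                    "No refresh interval class for " + minutes
                    + " minutes; choose one of " + INTERVALS.keySet());
        }
        // Nested classes are addressed as Outer$Inner in class names.
        return BASE + "$" + inner;
    }
}
```

Calling SpellCheckerClassNames.forMinutes(360) yields the six-hour class name given above; an interval without a matching inner class is rejected rather than rounded.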

The spell checker dictionary is stored as a Lucene index under <workspace-name>/index/spellchecker. If it does not exist, a background thread creates it on startup. The dictionary refresh also runs in a background thread so that it does not block regular queries.

How do I use it?

You can spell check a fulltext statement with either an XPath or an SQL query:

    // qm is a javax.jcr.query.QueryManager, e.g. session.getWorkspace().getQueryManager()
    // rep:spellcheck('jackrabit') will always evaluate to true
    Query query = qm.createQuery("/jcr:root[rep:spellcheck('jackrabit')]/(rep:spellcheck())", Query.XPATH);
    RowIterator rows = query.execute().getRows();
    // the above query will always return the root node no matter what string we check
    Row r = rows.nextRow();
    // get the result of the spell checking
    Value v = r.getValue("rep:spellcheck()");
    if (v == null) {
        // no suggestion returned, the spelling is correct or the spell checker
        // does not know how to correct it.
    } else {
        String suggestion = v.getString();
    }

And the same using SQL:

    // qm is a javax.jcr.query.QueryManager, e.g. session.getWorkspace().getQueryManager()
    // SPELLCHECK('jackrabit') will always evaluate to true
    Query query = qm.createQuery("SELECT rep:spellcheck() FROM nt:base WHERE jcr:path = '/' AND SPELLCHECK('jackrabit')", Query.SQL);
    RowIterator rows = query.execute().getRows();
    // the above query will always return the root node no matter what string we check
    Row r = rows.nextRow();
    // get the result of the spell checking
    Value v = r.getValue("rep:spellcheck()");
    if (v == null) {
        // no suggestion returned, the spelling is correct or the spell checker
        // does not know how to correct it.
    } else {
        String suggestion = v.getString();
    }
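When the text to check comes from user input, the statement has to be assembled at runtime. The following is a minimal sketch of how the two statements above could be built; SpellCheckStatements and its methods are hypothetical names, not Jackrabbit API. It assumes that an apostrophe inside a string literal is escaped by doubling it, which is the usual convention in both XPath and SQL string literals.

```java
/**
 * Hypothetical helper, not part of Jackrabbit: builds the spell check
 * statements shown on this page for an arbitrary input string.
 */
final class SpellCheckStatements {

    /** Wraps text in single quotes, doubling any embedded apostrophe (assumed escaping rule). */
    static String quote(String text) {
        return "'" + text.replace("'", "''") + "'";
    }

    /** XPath form; pass the result to createQuery with Query.XPATH. */
    static String xpath(String text) {
        return "/jcr:root[rep:spellcheck(" + quote(text) + ")]/(rep:spellcheck())";
    }

    /** SQL form; pass the result to createQuery with Query.SQL. */
    static String sql(String text) {
        return "SELECT rep:spellcheck() FROM nt:base WHERE jcr:path = '/' AND SPELLCHECK("
                + quote(text) + ")";
    }
}
```

For the input jackrabit, both methods reproduce the statements used in the examples above; the suggestion is then read from the rep:spellcheck() column exactly as shown there.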