Class LevenshteinAutomata

java.lang.Object
org.apache.lucene.util.automaton.LevenshteinAutomata

public class LevenshteinAutomata extends Object
Class to construct DFAs that match a word within some edit distance.

Implements the algorithm described in: Schulz and Mihov: Fast String Correction with Levenshtein Automata

  • Field Details

    • MAXIMUM_SUPPORTED_DISTANCE

      public static final int MAXIMUM_SUPPORTED_DISTANCE
      See Also:
  • Constructor Details

    • LevenshteinAutomata

      public LevenshteinAutomata(String input, boolean withTranspositions)
      Create a new LevenshteinAutomata for some input String. Optionally count transpositions as a primitive edit.
    • LevenshteinAutomata

      public LevenshteinAutomata(int[] word, int alphaMax, boolean withTranspositions)
      Expert: specify a custom maximum possible symbol (alphaMax); default is Character.MAX_CODE_POINT.
  • Method Details

    • toAutomaton

      public Automaton toAutomaton(int n)
      Compute a DFA that accepts all strings within an edit distance of n.

      All automata have the following properties:

      • They are deterministic (DFA).
      • There are no transitions to dead states.
      • They are not minimal (some transitions could be combined).