Class Automaton

java.lang.Object
org.apache.lucene.util.automaton.Automaton
All Implemented Interfaces:
Cloneable

public class Automaton extends Object implements Cloneable
Finite-state automaton with regular expression operations.

Class invariants:

  • An automaton is either represented explicitly (with State and Transition objects) or with a singleton string (see getSingleton() and expandSingleton()) in case the automaton is known to accept exactly one string. (Implicitly, all states and transitions of an automaton are reachable from its initial state.)
  • Automata are always reduced (see reduce()) and have no transitions to dead states (see removeDeadTransitions()).
  • If an automaton is nondeterministic, then isDeterministic() returns false (but the converse is not required).
  • Automata provided as input to operations are generally assumed to be disjoint.

If the states or transitions are manipulated manually, the restoreInvariant() and setDeterministic(boolean) methods should be used afterwards to restore representation invariants that are assumed by the built-in automata operations.

Note: This class has internal mutable state and is not thread safe. It is the caller's responsibility to ensure any necessary synchronization if you wish to use the same Automaton from multiple threads. In general it is instead recommended to use a RunAutomaton for multithreaded matching: it is immutable, thread safe, and much faster.

  • Field Details

    • MINIMIZE_HOPCROFT

      public static final int MINIMIZE_HOPCROFT
      Minimize using Hopcroft's O(n log n) algorithm. This is regarded as one of the most generally efficient algorithms that exist.
      See Also:
  • Constructor Details

    • Automaton

      public Automaton(State initial)
      Constructs a new automaton that accepts the empty language. Using this constructor, automata can be constructed manually from State and Transition objects.
      See Also:
    • Automaton

      public Automaton()
  • Method Details

    • setMinimization

      public static void setMinimization(int algorithm)
      Selects minimization algorithm (default: MINIMIZE_HOPCROFT).
      Parameters:
      algorithm - minimization algorithm
    • setMinimizeAlways

      public static void setMinimizeAlways(boolean flag)
      Sets or resets minimize always flag. If this flag is set, then MinimizationOperations.minimize(Automaton) will automatically be invoked after all operations that otherwise may produce non-minimal automata. By default, the flag is not set.
      Parameters:
      flag - if true, the flag is set
    • setAllowMutate

      public static boolean setAllowMutate(boolean flag)
      Sets or resets allow mutate flag. If this flag is set, then all automata operations may modify automata given as input; otherwise, operations will always leave input automata languages unmodified. By default, the flag is not set.
      Parameters:
      flag - if true, the flag is set
      Returns:
      previous value of the flag
    • getSingleton

      public String getSingleton()
      Returns the singleton string for this automaton. An automaton that accepts exactly one string may be represented in singleton mode. In that case, this method may be used to obtain the string.
      Returns:
      string, null if this automaton is not in singleton mode.
    • getInitialState

      public State getInitialState()
      Gets initial state.
      Returns:
      state
    • isDeterministic

      public boolean isDeterministic()
      Returns deterministic flag for this automaton.
      Returns:
      true if the automaton is definitely deterministic, false if the automaton may be nondeterministic
    • setDeterministic

      public void setDeterministic(boolean deterministic)
      Sets deterministic flag for this automaton. This method should (only) be used if automata are constructed manually.
      Parameters:
      deterministic - true if the automaton is definitely deterministic, false if the automaton may be nondeterministic
    • setInfo

      public void setInfo(Object info)
      Associates extra information with this automaton.
      Parameters:
      info - extra information
    • getInfo

      public Object getInfo()
      Returns extra information associated with this automaton.
      Returns:
      extra information
      See Also:
    • getNumberedStates

      public State[] getNumberedStates()
    • setNumberedStates

      public void setNumberedStates(State[] states)
    • setNumberedStates

      public void setNumberedStates(State[] states, int count)
    • clearNumberedStates

      public void clearNumberedStates()
    • getAcceptStates

      public Set<State> getAcceptStates()
      Returns the set of reachable accept states.
      Returns:
      set of State objects
    • restoreInvariant

      public void restoreInvariant()
      Restores representation invariant. This method must be invoked before any built-in automata operation is performed if automaton states or transitions are manipulated manually.
      See Also:
    • reduce

      public void reduce()
      Reduces this automaton. An automaton is "reduced" by combining overlapping and adjacent edge intervals with same destination.
    • removeDeadTransitions

      public void removeDeadTransitions()
      Removes transitions to dead states and calls reduce(). (A state is "dead" if no accept state is reachable from it.)
    • getSortedTransitions

      public Transition[][] getSortedTransitions()
      Returns a sorted array of transitions for each state (and sets state numbers).
    • expandSingleton

      public void expandSingleton()
      Expands singleton representation to normal representation. Does nothing if not in singleton representation.
    • getNumberOfStates

      public int getNumberOfStates()
      Returns the number of states in this automaton.
    • getNumberOfTransitions

      public int getNumberOfTransitions()
      Returns the number of transitions in this automaton. This number is counted as the total number of edges, where one edge may be a character interval.
    • equals

      public boolean equals(Object obj)
      Overrides:
      equals in class Object
    • hashCode

      public int hashCode()
      Overrides:
      hashCode in class Object
    • toString

      public String toString()
      Returns a string representation of this automaton.
      Overrides:
      toString in class Object
    • toDot

      public String toDot()
      Returns Graphviz Dot representation of this automaton.
    • clone

      public Automaton clone()
      Returns a clone of this automaton.
      Overrides:
      clone in class Object
    • concatenate

      public Automaton concatenate(Automaton a)
    • concatenate

      public static Automaton concatenate(List<Automaton> l)
    • optional

      public Automaton optional()
    • repeat

      public Automaton repeat()
    • repeat

      public Automaton repeat(int min)
    • repeat

      public Automaton repeat(int min, int max)
    • complement

      public Automaton complement()
    • minus

      public Automaton minus(Automaton a)
    • intersection

      public Automaton intersection(Automaton a)
    • subsetOf

      public boolean subsetOf(Automaton a)
    • union

      public Automaton union(Automaton a)
    • union

      public static Automaton union(Collection<Automaton> l)
    • determinize

      public void determinize()
    • isEmptyString

      public boolean isEmptyString()
    • minimize

      public static Automaton minimize(Automaton a)
      See MinimizationOperations.minimize(Automaton). Returns the automaton being given as argument.