# Namespace Lucene.Net.Util.Automaton

### Classes

#### Automaton

Finite-state automaton with regular expression operations.

Class invariants:

- An automaton is either represented explicitly (with State and Transition objects) or with a singleton string (see Singleton and ExpandSingleton()) in case the automaton is known to accept exactly one string. (Implicitly, all states and transitions of an automaton are reachable from its initial state.)
- Automata are always reduced (see Reduce()) and have no transitions to dead states (see RemoveDeadTransitions()).
- If an automaton is nondeterministic, then IsDeterministic
returns
`false`

(but the converse is not required). - Automata provided as input to operations are generally assumed to be disjoint.

If the states or transitions are manipulated manually, the RestoreInvariant() method and IsDeterministic setter should be used afterwards to restore representation invariants that are assumed by the built-in automata operations.

Note: this class has internal mutable state and is not thread safe. It is the caller's responsibility to ensure any necessary synchronization if you wish to use the same Automaton from multiple threads. In general it is instead recommended to use a RunAutomaton for multithreaded matching: it is immutable, thread safe, and much faster.

@lucene.experimental#### AutomatonTestUtil

#### AutomatonTestUtil.RandomAcceptedStrings

#### BasicAutomata

Construction of basic automata.

@lucene.experimental

#### BasicOperations

Basic automata operations.

@lucene.experimental

#### ByteRunAutomaton

Automaton representation for matching UTF-8 byte[].

#### CharacterRunAutomaton

Automaton representation for matching char[].

#### CompiledAutomaton

Immutable class holding compiled details for a given Automaton. The Automaton is deterministic, must not have dead states but is not necessarily minimal.

@lucene.experimental

#### LevenshteinAutomata

Class to construct DFAs that match a word within some edit distance.

Implements the algorithm described in: Schulz and Mihov: Fast String Correction with Levenshtein Automata

@lucene.experimental

#### MinimizationOperations

Operations for minimizing automata.

@lucene.experimental

#### RegExp

Regular Expression extension to Automaton.

Regular expressions are built from the following abstract syntax:

regexp::=unionexp | |

| | |

unionexp::=interexp |unionexp(union) | |

|interexp | |

interexp::=concatexp &interexp(intersection)[OPTIONAL] | |

|concatexp | |

concatexp::=repeatexp concatexp(concatenation) | |

|repeatexp | |

repeatexp::=repeatexp (zero or one occurrence)? | |

|repeatexp (zero or more occurrences)* | |

|repeatexp (one or more occurrences)+ | |

|repeatexp ({n} occurrences)n | |

|repeatexp ({n,} or more occurrences)n | |

|repeatexp ({n,m} to n occurrences, including both)m | |

|complexp | |

complexp::=~complexp(complement)[OPTIONAL] | |

|charclassexp | |

charclassexp::=[charclasses (character class)] | |

|[^charclasses (negated character class)] | |

|simpleexp | |

charclasses::=charclass charclasses | |

|charclass | |

charclass::=charexp -charexp(character range, including end-points) | |

|charexp | |

simpleexp::=charexp | |

|(any single character). | |

|(the empty language)[OPTIONAL]# | |

|(any string)[OPTIONAL]@ | |

| <Unicode string without double-quotes> "(a string)" | |

|((the empty string)) | |

|(unionexp (precedence override)) | |

| <identifier> <(named automaton)[OPTIONAL]> | |

|(numerical interval)[OPTIONAL]<n-m> | |

charexp::=<Unicode character>(a single non-reserved character) | |

| <Unicode character> (a single character)</strong> |

The productions marked [OPTIONAL] are only allowed if
specified by the syntax flags passed to the RegExp constructor.
The reserved characters used in the (enabled) syntax must be escaped with
backslash (`</code>) or double-quotes (`

`"..."`

). (In
contrast to other regexp syntaxes, this is required also in character
classes.) Be aware that dash (`-`

) has a special meaning in
*charclass* expressions. An identifier is a string not containing right
angle bracket (`>`

) or dash (`-`

). Numerical
intervals are specified by non-negative decimal integers and include both end
points, and if `n`

and `m`

have the same number
of digits, then the conforming strings must have that length (i.e. prefixed
by 0's).

```
@lucene.experimental
```

```
```

#### RunAutomaton

Finite-state automaton with fast run operation.

@lucene.experimental

#### SpecialOperations

Special automata operations.

@lucene.experimental

#### State

Automaton state.

@lucene.experimental

#### StatePair

Pair of states.

@lucene.experimental

#### Transition

Automaton transition.

A transition, which belongs to a source state, consists of a Unicode codepoint interval and a destination state.

@lucene.experimental

#### UTF32ToUTF8

Converts UTF-32 automata to the equivalent UTF-8 representation.

@lucene.internal

### Interfaces

#### IAutomatonProvider

Automaton provider for RegExp.

@lucene.experimental

### Enums

#### CompiledAutomaton.AUTOMATON_TYPE

Automata are compiled into different internal forms for the most efficient execution depending upon the language they accept.