XFuzzySuggester

java.lang.Object
- org.apache.lucene.search.suggest.Lookup
  - org.apache.lucene.search.suggest.analyzing.XAnalyzingSuggester
    - org.apache.lucene.search.suggest.analyzing.XFuzzySuggester

All Implemented Interfaces:

org.apache.lucene.util.Accountable
```
public final class XFuzzySuggester
extends XAnalyzingSuggester
```
Implements a fuzzy AnalyzingSuggester. The similarity measurement is based on the Damerau-Levenshtein (optimal string alignment) algorithm, though you can explicitly choose classic Levenshtein by passing false for the transpositions parameter.
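
The difference between the two distance measures can be illustrated with a small standalone sketch. It does not use Lucene's LevenshteinAutomata; the class `EditDistanceSketch` and its `distance` method are hypothetical names introduced here, and the distance is counted in UTF-16 chars for simplicity rather than in analyzed bytes.

```java
// Illustrative only: a plain dynamic-programming edit distance, not the
// automaton-based approach the suggester relies on.
class EditDistanceSketch {

    /**
     * Edit distance between a and b. With transpositions == true, swapping two
     * adjacent characters costs one edit (optimal string alignment); with
     * transpositions == false, this is the classic Levenshtein distance.
     */
    static int distance(String a, String b, boolean transpositions) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i; // delete everything
        for (int j = 0; j <= b.length(); j++) d[0][j] = j; // insert everything
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int best = Math.min(
                        Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                        d[i - 1][j - 1] + cost);
                // An adjacent swap counts as a single edit when enabled.
                if (transpositions && i > 1 && j > 1
                        && a.charAt(i - 1) == b.charAt(j - 2)
                        && a.charAt(i - 2) == b.charAt(j - 1)) {
                    best = Math.min(best, d[i - 2][j - 2] + 1);
                }
                d[i][j] = best;
            }
        }
        return d[a.length()][b.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("lucene", "lucnee", true));  // 1: one swap of "en"
        System.out.println(distance("lucene", "lucnee", false)); // 2: two single-character edits
    }
}
```

With a maximum of 1 edit, the misspelling "lucnee" would therefore be within range only when transpositions are enabled.
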
At most, this suggester matches terms within 2 edits; higher distances are not supported. Note that the fuzzy distance is measured in "byte space" on the bytes returned by the TokenStream's TermToBytesRefAttribute, usually UTF-8. By default, the analyzed form must be at least 3 bytes long (DEFAULT_MIN_FUZZY_LENGTH) before any edits are considered. Furthermore, the first 1 byte (DEFAULT_NON_FUZZY_PREFIX) cannot be edited, and up to 1 edit (DEFAULT_MAX_EDITS) is allowed. If the unicodeAware parameter of the constructor is set to true, maxEdits, minFuzzyLength, transpositions and nonFuzzyPrefix are measured in Unicode code points (actual letters) instead of bytes.
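
To make the byte-space versus code-point distinction concrete, the following standalone sketch compares the two length measures for a few strings. It assumes UTF-8 as the term encoding and skips analysis entirely; the class `FuzzyLengthSketch` and its helper methods are hypothetical names used only for illustration.

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: compares the length of a string in UTF-8 bytes with its
// length in Unicode code points. No analyzer is involved.
class FuzzyLengthSketch {

    /** Length in "byte space", assuming the term is encoded as UTF-8. */
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Length in Unicode code points, the unit used when unicodeAware is true. */
    static int codePointLength(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // "fur" with u-umlaut, and a two-character CJK string, written with
        // escapes so the source file stays ASCII.
        String[] keys = { "abc", "f\u00fcr", "\u65e5\u672c" };
        for (String key : keys) {
            System.out.println(utf8Length(key) + " bytes, "
                    + codePointLength(key) + " code points");
        }
    }
}
```

The last key is 6 bytes but only 2 code points, so under a minimum fuzzy length of 3 it would qualify for edits when measured in bytes but not when measured in code points.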

NOTE: This suggester does not boost suggestions that required no edits over suggestions that did require edits. This is a known limitation.

Note: complex query analyzers can have a significant impact on the lookup performance. It's recommended to not use analyzers that drop or inject terms like synonyms to keep the complexity of the prefix intersection low for good lookup performance. At index time, complex analyzers can safely be used.

Nested Class Summary
- Nested classes/interfaces inherited from class org.apache.lucene.search.suggest.analyzing.XAnalyzingSuggester
  XAnalyzingSuggester.XBuilder
- Nested classes/interfaces inherited from class org.apache.lucene.search.suggest.Lookup
  org.apache.lucene.search.suggest.Lookup.LookupPriorityQueue, org.apache.lucene.search.suggest.Lookup.LookupResult

Field Summary

Fields

Modifier and Type	Field and Description
`static int`	`DEFAULT_MAX_EDITS` The default maximum number of edits for fuzzy suggestions.
`static int`	`DEFAULT_MIN_FUZZY_LENGTH` The default minimum length of the key passed to `XAnalyzingSuggester.lookup(java.lang.CharSequence, java.util.Set<org.apache.lucene.util.BytesRef>, boolean, int)` before any edits are allowed.
`static int`	`DEFAULT_NON_FUZZY_PREFIX` The default prefix length where edits are not allowed.
`static boolean`	`DEFAULT_TRANSPOSITIONS` The default transposition value passed to `LevenshteinAutomata`
`static boolean`	`DEFAULT_UNICODE_AWARE` Measure maxEdits, minFuzzyLength, transpositions and nonFuzzyPrefix parameters in Unicode code points (actual letters) instead of bytes.

Fields inherited from class org.apache.lucene.search.suggest.analyzing.XAnalyzingSuggester
END_BYTE, EXACT_FIRST, HOLE_CHARACTER, PAYLOAD_SEP, PRESERVE_SEP, SEP_LABEL

Fields inherited from class org.apache.lucene.search.suggest.Lookup
CHARSEQUENCE_COMPARATOR

Constructor Summary

Constructors

Constructor and Description
`XFuzzySuggester(org.apache.lucene.analysis.Analyzer analyzer)` Creates a `FuzzySuggester` instance initialized with default values.
`XFuzzySuggester(org.apache.lucene.analysis.Analyzer indexAnalyzer, org.apache.lucene.analysis.Analyzer queryAnalyzer)` Creates a `FuzzySuggester` instance with an index & a query analyzer initialized with default values.
`XFuzzySuggester(org.apache.lucene.analysis.Analyzer indexAnalyzer, org.apache.lucene.util.automaton.Automaton queryPrefix, org.apache.lucene.analysis.Analyzer queryAnalyzer, int options, int maxSurfaceFormsPerAnalyzedForm, int maxGraphExpansions, int maxEdits, boolean transpositions, int nonFuzzyPrefix, int minFuzzyLength, boolean unicodeAware, org.apache.lucene.util.fst.FST<org.apache.lucene.util.fst.PairOutputs.Pair<Long,org.apache.lucene.util.BytesRef>> fst, boolean hasPayloads, int maxAnalyzedPathsForOneInput, int sepLabel, int payloadSep, int endByte, int holeCharacter)` Creates a `FuzzySuggester` instance.

Method Summary

Modifier and Type	Method and Description
`protected org.apache.lucene.util.automaton.Automaton`	`convertAutomaton(org.apache.lucene.util.automaton.Automaton a)`
`protected List<org.apache.lucene.search.suggest.analyzing.FSTUtil.Path<org.apache.lucene.util.fst.PairOutputs.Pair<Long,org.apache.lucene.util.BytesRef>>>`	`getFullPrefixPaths(List<org.apache.lucene.search.suggest.analyzing.FSTUtil.Path<org.apache.lucene.util.fst.PairOutputs.Pair<Long,org.apache.lucene.util.BytesRef>>> prefixPaths, org.apache.lucene.util.automaton.Automaton lookupAutomaton, org.apache.lucene.util.fst.FST<org.apache.lucene.util.fst.PairOutputs.Pair<Long,org.apache.lucene.util.BytesRef>> fst)` Returns all completion paths to initialize the search.
`org.apache.lucene.analysis.TokenStreamToAutomaton`	`getTokenStreamToAutomaton()`

Methods inherited from class org.apache.lucene.search.suggest.analyzing.XAnalyzingSuggester
build, decodeWeight, encodeWeight, get, getCount, load, load, lookup, ramBytesUsed, store, store, toFiniteStrings, toFiniteStrings

Methods inherited from class org.apache.lucene.search.suggest.Lookup
build, lookup

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Detail
- DEFAULT_UNICODE_AWARE
```
public static final boolean DEFAULT_UNICODE_AWARE
```
  Measure maxEdits, minFuzzyLength, transpositions and nonFuzzyPrefix parameters in Unicode code points (actual letters) instead of bytes.
  
  See Also:
  
  Constant Field Values
- DEFAULT_MIN_FUZZY_LENGTH
```
public static final int DEFAULT_MIN_FUZZY_LENGTH
```
  The default minimum length of the key passed to XAnalyzingSuggester.lookup(java.lang.CharSequence, java.util.Set<org.apache.lucene.util.BytesRef>, boolean, int) before any edits are allowed.
  
  See Also:
  
  Constant Field Values
- DEFAULT_NON_FUZZY_PREFIX
```
public static final int DEFAULT_NON_FUZZY_PREFIX
```
  The default prefix length where edits are not allowed.
  
  See Also:
  
  Constant Field Values
- DEFAULT_MAX_EDITS
```
public static final int DEFAULT_MAX_EDITS
```
  The default maximum number of edits for fuzzy suggestions.
  
  See Also:
  
  Constant Field Values
- DEFAULT_TRANSPOSITIONS
```
public static final boolean DEFAULT_TRANSPOSITIONS
```
  The default transposition value passed to LevenshteinAutomata
  
  See Also:
  
  Constant Field Values

Constructor Detail
- XFuzzySuggester
```
public XFuzzySuggester(org.apache.lucene.analysis.Analyzer analyzer)
```
  Creates a FuzzySuggester instance initialized with default values.
  
  Parameters:
  
  analyzer - the analyzer used for this suggester
- XFuzzySuggester
```
public XFuzzySuggester(org.apache.lucene.analysis.Analyzer indexAnalyzer,
                       org.apache.lucene.analysis.Analyzer queryAnalyzer)
```
  Creates a FuzzySuggester instance with an index & a query analyzer initialized with default values.
  
  Parameters:
  
  indexAnalyzer - Analyzer that will be used for analyzing suggestions while building the index.
  
  queryAnalyzer - Analyzer that will be used for analyzing query text during lookup
- XFuzzySuggester
```
public XFuzzySuggester(org.apache.lucene.analysis.Analyzer indexAnalyzer,
                       org.apache.lucene.util.automaton.Automaton queryPrefix,
                       org.apache.lucene.analysis.Analyzer queryAnalyzer,
                       int options,
                       int maxSurfaceFormsPerAnalyzedForm,
                       int maxGraphExpansions,
                       int maxEdits,
                       boolean transpositions,
                       int nonFuzzyPrefix,
                       int minFuzzyLength,
                       boolean unicodeAware,
                       org.apache.lucene.util.fst.FST<org.apache.lucene.util.fst.PairOutputs.Pair<Long,org.apache.lucene.util.BytesRef>> fst,
                       boolean hasPayloads,
                       int maxAnalyzedPathsForOneInput,
                       int sepLabel,
                       int payloadSep,
                       int endByte,
                       int holeCharacter)
```
  Creates a FuzzySuggester instance.
  
  Parameters:
  
  indexAnalyzer - Analyzer that will be used for analyzing suggestions while building the index.
  
  queryAnalyzer - Analyzer that will be used for analyzing query text during lookup
  
  options - see XAnalyzingSuggester.EXACT_FIRST, XAnalyzingSuggester.PRESERVE_SEP
  
  maxSurfaceFormsPerAnalyzedForm - Maximum number of surface forms to keep for a single analyzed form. When there are too many surface forms we discard the lowest weighted ones.
  
  maxGraphExpansions - Maximum number of graph paths to expand from the analyzed form. Set this to -1 for no limit.
  
  maxEdits - must be >= 0 and <= LevenshteinAutomata.MAXIMUM_SUPPORTED_DISTANCE.
  
  transpositions - true if transpositions should be treated as a primitive edit operation. If this is false, comparisons will implement the classic Levenshtein algorithm.
  
  nonFuzzyPrefix - length of common (non-fuzzy) prefix (see default DEFAULT_NON_FUZZY_PREFIX)
  
  minFuzzyLength - minimum length of lookup key before any edits are allowed (see default DEFAULT_MIN_FUZZY_LENGTH)
  
  sepLabel - separation label
  
  payloadSep - payload separator byte
  
  endByte - end byte marker byte
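
How maxEdits, transpositions, nonFuzzyPrefix and minFuzzyLength interact can be approximated with a simplified model. The sketch below is an assumption inferred from the parameter descriptions above, not the class's actual logic, which works with automata intersected against the FST. The class `FuzzyMatchSketch` and its methods are hypothetical, lengths are counted in UTF-16 chars, and no analysis is applied.

```java
// Simplified model (an assumption, not the real implementation): a key matches a
// candidate if the non-fuzzy prefix is identical and the rest of the key is
// within maxEdits of some prefix of the rest of the candidate.
class FuzzyMatchSketch {

    /** Edit distance; adjacent swaps cost one edit when transpositions is true. */
    static int distance(String a, String b, boolean transpositions) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int best = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                        d[i - 1][j - 1] + cost);
                if (transpositions && i > 1 && j > 1
                        && a.charAt(i - 1) == b.charAt(j - 2)
                        && a.charAt(i - 2) == b.charAt(j - 1)) {
                    best = Math.min(best, d[i - 2][j - 2] + 1);
                }
                d[i][j] = best;
            }
        }
        return d[a.length()][b.length()];
    }

    static boolean matches(String key, String candidate, int maxEdits,
                           boolean transpositions, int nonFuzzyPrefix, int minFuzzyLength) {
        // Keys that are too short get no edits at all: plain prefix matching.
        if (key.length() <= nonFuzzyPrefix || key.length() < minFuzzyLength) {
            return candidate.startsWith(key);
        }
        // The leading nonFuzzyPrefix characters must match exactly.
        if (!candidate.startsWith(key.substring(0, nonFuzzyPrefix))) {
            return false;
        }
        String fuzzyKey = key.substring(nonFuzzyPrefix);
        String rest = candidate.substring(nonFuzzyPrefix);
        // Completion semantics: any prefix of the candidate may absorb the key.
        for (int end = 0; end <= rest.length(); end++) {
            if (distance(fuzzyKey, rest.substring(0, end), transpositions) <= maxEdits) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Parameter values mirror the defaults quoted in the class description.
        System.out.println(matches("lucnee", "lucene search", 1, true, 1, 3)); // true
        System.out.println(matches("xucene", "lucene search", 1, true, 1, 3)); // false: first character is fixed
        System.out.println(matches("lx", "lucene search", 1, true, 1, 3));     // false: key shorter than 3
    }
}
```

In this model, lowering minFuzzyLength to 2 would let the key "lx" match, because the single character after the fixed prefix could then be edited away.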

Method Detail

getFullPrefixPaths

protected List<org.apache.lucene.search.suggest.analyzing.FSTUtil.Path<org.apache.lucene.util.fst.PairOutputs.Pair<Long,org.apache.lucene.util.BytesRef>>> getFullPrefixPaths(
    List<org.apache.lucene.search.suggest.analyzing.FSTUtil.Path<org.apache.lucene.util.fst.PairOutputs.Pair<Long,org.apache.lucene.util.BytesRef>>> prefixPaths,
    org.apache.lucene.util.automaton.Automaton lookupAutomaton,
    org.apache.lucene.util.fst.FST<org.apache.lucene.util.fst.PairOutputs.Pair<Long,org.apache.lucene.util.BytesRef>> fst)
  throws IOException

Description copied from class: XAnalyzingSuggester

Returns all completion paths to initialize the search.

Overrides: getFullPrefixPaths in class XAnalyzingSuggester
Throws: IOException

convertAutomaton

protected org.apache.lucene.util.automaton.Automaton convertAutomaton(org.apache.lucene.util.automaton.Automaton a)

getTokenStreamToAutomaton

public org.apache.lucene.analysis.TokenStreamToAutomaton getTokenStreamToAutomaton()
