Motivations

To allow typos on a word, Meilisearch must register it in a specialized data structure: the FST (finite-state transducer). Registering words in the FST can be time-consuming during indexing, so limiting the number of registered words reduces the time spent recomputing the data structure. Two kinds of terms are worth excluding from the FST:

  1. Numbers, because typo tolerance is not consistent with how numbers work: it freely replaces, inserts, or removes a digit without considering the numeric difference. 2024 will equally match 2025, 2004, and 20240, which doesn't really improve search accuracy.
  2. (Postponed to v1.16) Rare terms. If a term appears in only a few documents, or is unique in the whole dataset, it is most likely a “randomly generated term” such as a UUID, an “encoded term” such as base64, or a hash. An end-user is unlikely to search for such a term with a typo; on the other hand, these terms can overload the FST when there are many of them, for example one UUID per document in the database.
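The first point can be checked with a small sketch. It uses the plain Levenshtein distance as a stand-in for typo counting (an assumption: Meilisearch's own typo rules may differ in detail) and compares it with the numeric difference for the three examples above. The names `edit_distance`, `query`, and `candidates` are illustrative only.

```python
def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance: the minimum number of single-character
    insertions, deletions, or substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


query = "2024"
candidates = ["2025", "2004", "20240"]

# Typo distance versus numeric distance for each candidate.
typo_distances = {c: edit_distance(query, c) for c in candidates}
numeric_differences = {c: abs(int(c) - int(query)) for c in candidates}

for c in candidates:
    print(f"{query} vs {c}: typos={typo_distances[c]}, "
          f"numeric difference={numeric_differences[c]}")
```

Under this assumption, all three candidates sit one typo away from the query even though their numeric differences range from 1 to 18216, which is why typo matching says little about how close two numbers are.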

API/behavior changes

The typoTolerance setting will be extended with additional sub-settings, allowing the user to deactivate the typo tolerance on kinds of terms automatically detected by Meilisearch. Below is the list of the new sub-settings:

disableOnNumbers (boolean): If set to true, Meilisearch will auto-detect numbers in any field listed in the searchableAttributes; these numbers will no longer match a search query with 1 or 2 typos, even if the query word is long enough to allow typos.

curl \
-X PATCH 'MEILISEARCH_URL/indexes/movies/settings/typo-tolerance' \
-H 'Content-Type: application/json' \
--data-binary '{ "disableOnNumbers": false }'
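The intended effect of disableOnNumbers on matching can be sketched as follows. This is a minimal illustration, not Meilisearch's implementation: the length thresholds are assumed to be the default minWordSizeForTypos values (5 characters for one typo, 9 for two), and `is_number` is a hypothetical stand-in for the engine's own number detection.

```python
# Assumed defaults for minWordSizeForTypos (oneTypo / twoTypos).
ONE_TYPO_MIN_LEN = 5
TWO_TYPOS_MIN_LEN = 9


def is_number(term: str) -> bool:
    """Hypothetical number detection: ASCII digits only.
    The real detection rules are not specified here."""
    return term.isascii() and term.isdigit()


def max_typos(term: str, disable_on_numbers: bool) -> int:
    """Return how many typos a query term may tolerate."""
    if disable_on_numbers and is_number(term):
        # Numbers must match exactly, whatever their length.
        return 0
    if len(term) >= TWO_TYPOS_MIN_LEN:
        return 2
    if len(term) >= ONE_TYPO_MIN_LEN:
        return 1
    return 0


for term in ["20240", "123456789", "matrix"]:
    print(term,
          "| setting off:", max_typos(term, disable_on_numbers=False),
          "| setting on:", max_typos(term, disable_on_numbers=True))
```

With the setting enabled, only the numeric terms drop to zero tolerated typos; ordinary words such as "matrix" keep their usual allowance.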

Telemetry

List all the new or updated telemetry

Name | Description | Example
e.g. infos.log_level | e.g. “value of --log-level” | e.g. “debug”

Error handling