I am trying to build something using the Bloodhound search engine, and I noticed that it takes two tokenizers, one for datums and one for queries.
The initializer code example given in the documentation looks like this:
    var engine = new Bloodhound({
      local: ['dog', 'pig', 'moose'],
      queryTokenizer: Bloodhound.tokenizers.whitespace,
      datumTokenizer: Bloodhound.tokenizers.whitespace
    });
What do these two tokenizers do?
EDIT
Bloodhound documentation defines these two as follows:
datumTokenizer – A function with the signature (datum) that transforms a datum into an array of string tokens. Required.
queryTokenizer – A function with the signature (query) that transforms a query into an array of string tokens. Required.
This still doesn't explain what the difference is between a datum and a query.
A datum is one of the elements of the index that is searched through, and the query is what is being searched for. If either contains more than one token (or word, when the whitespace tokenizer is used), the engine needs a function that splits it into those tokens. See more info on why tokenization is needed.
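To make the distinction concrete, here is a minimal sketch that does not use Bloodhound at all. `whitespaceTokenizer` is a hypothetical stand-in for `Bloodhound.tokenizers.whitespace`, and `search` is a deliberately simplified matcher that assumes a datum matches when every query token is a prefix of one of its tokens. Bloodhound's real index is more involved, so treat this only as an illustration of where each tokenizer is applied.

```javascript
// Hypothetical stand-in for Bloodhound.tokenizers.whitespace:
// splits a string on runs of whitespace.
function whitespaceTokenizer(str) {
  const s = String(str).trim();
  return s ? s.split(/\s+/) : [];
}

// Simplified matcher (an assumption, not Bloodhound's actual index):
// a datum matches when every query token is a prefix of some datum token.
function search(data, query, datumTokenizer, queryTokenizer) {
  // The query tokenizer is applied to what the user typed.
  const queryTokens = queryTokenizer(query).map(t => t.toLowerCase());
  return data.filter(datum => {
    // The datum tokenizer is applied to each entry being indexed.
    const datumTokens = datumTokenizer(datum).map(t => t.toLowerCase());
    return queryTokens.every(q => datumTokens.some(d => d.startsWith(q)));
  });
}

const data = ['big dog', 'small pig', 'moose'];
const results = search(data, 'dog bi', whitespaceTokenizer, whitespaceTokenizer);
console.log(results); // [ 'big dog' ]
```

The datum tokenizer runs over each entry in `local` (and would be the place to pick out fields if the entries were objects), while the query tokenizer runs over the user's input. They are separate options because the two inputs can have different shapes.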