| Now it returns each CJK character as a separate token. This pairs well with a CJK bigram filter for indexing. |
|---|
| regexp_tokenizer |
| single_token |
| unicode_word_boundary |
| whitespace_tokenizer |
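
The commit note above describes emitting each CJK character as its own token so that a bigram filter can combine neighbours at index time. A minimal sketch of that idea follows, not the repository's actual implementation. The names `is_cjk`, `tokenize`, and `cjk_bigrams` are hypothetical, and the code-point ranges used to decide what counts as CJK are an assumption.

```python
from typing import List


def is_cjk(ch: str) -> bool:
    """Assumption: treat CJK Unified Ideographs, Hiragana and Katakana as CJK.

    A real tokenizer may cover more (or different) Unicode ranges.
    """
    cp = ord(ch)
    return (0x4E00 <= cp <= 0x9FFF) or (0x3040 <= cp <= 0x30FF)


def tokenize(text: str) -> List[str]:
    """Split text into tokens, emitting every CJK character separately."""
    tokens: List[str] = []
    word: List[str] = []

    def flush_word() -> None:
        if word:
            tokens.append("".join(word))
            word.clear()

    for ch in text:
        if is_cjk(ch):
            flush_word()          # end any pending non-CJK word
            tokens.append(ch)     # each CJK character is its own token
        elif ch.isalnum():
            word.append(ch)       # accumulate a non-CJK word
        else:
            flush_word()          # whitespace or punctuation ends a word
    flush_word()
    return tokens


def cjk_bigrams(tokens: List[str]) -> List[str]:
    """Join adjacent single-character CJK tokens into overlapping bigrams.

    Simplification: tokens carry no position information here, so CJK
    characters separated only by whitespace in the input are still paired.
    """
    out: List[str] = []
    run: List[str] = []

    def flush_run() -> None:
        if len(run) == 1:
            out.append(run[0])    # a lone CJK character stays a unigram
        else:
            for a, b in zip(run, run[1:]):
                out.append(a + b)
        run.clear()

    for tok in tokens:
        if len(tok) == 1 and is_cjk(tok):
            run.append(tok)
        else:
            flush_run()
            out.append(tok)
    flush_run()
    return out


if __name__ == "__main__":
    toks = tokenize("hello 東京都 world")
    print(toks)
    print(cjk_bigrams(toks))
```

Keeping the tokenizer output at one character per token leaves the pairing decision to the filter, which is why the two stages are shown as separate functions here.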