7 Commits

Author SHA1 Message Date
Marty Schoch
6b4c86b35a changed whitespace tokenizer to work better on cjk input
now it will return each cjk character as a separate token
this will pair well with a cjk bigram filter for indexing
2014-09-07 14:11:01 -04:00
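
The behavior described in commit 6b4c86b35a can be sketched roughly as follows. This is an illustrative sketch only, not the code from the commit: the `Token` struct, the `isCJK` helper, the `Tokenize` function, and the choice of Unicode scripts treated as CJK are all assumptions made for the example. It splits on whitespace as usual, but emits every CJK character as its own token, so that a later bigram filter could pair adjacent characters.

```go
package main

import (
	"unicode"
	"unicode/utf8"
)

// Token is a hypothetical, simplified token: the term text plus
// byte offsets into the original input.
type Token struct {
	Term  string
	Start int
	End   int
}

// isCJK is an assumed definition of "CJK input": Han ideographs,
// Hiragana, Katakana, and Hangul. The real tokenizer may differ.
func isCJK(r rune) bool {
	return unicode.In(r, unicode.Han, unicode.Hiragana, unicode.Katakana, unicode.Hangul)
}

// Tokenize splits input on whitespace, but additionally returns each
// CJK character as a separate single-character token.
func Tokenize(input string) []Token {
	var tokens []Token
	start := -1 // byte offset where the current non-CJK run began, or -1

	// flush emits the pending non-CJK run, if any, ending at byte offset end.
	flush := func(end int) {
		if start >= 0 {
			tokens = append(tokens, Token{Term: input[start:end], Start: start, End: end})
			start = -1
		}
	}

	for i, r := range input {
		switch {
		case unicode.IsSpace(r):
			flush(i)
		case isCJK(r):
			// A CJK character ends any pending run and stands alone.
			flush(i)
			end := i + utf8.RuneLen(r)
			tokens = append(tokens, Token{Term: input[i:end], Start: i, End: end})
		default:
			if start < 0 {
				start = i
			}
		}
	}
	flush(len(input))
	return tokens
}
```

Under these assumptions, `Tokenize("hello 世界 go")` yields four tokens (`hello`, `世`, `界`, `go`), and empty input yields no tokens.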
Marty Schoch
7a7eb2e94c add newline between license and package
this avoids cluttering godocs with the license
2014-09-02 10:54:50 -04:00
Marty Schoch
1161361bea rename imports from couchbaselabs to blevesearch 2014-08-28 15:38:57 -04:00
Marty Schoch
b48dc87afa added test case clarifying whitespace tokenizer on empty input 2014-08-19 10:43:52 -04:00
Marty Schoch
c526a38369 major refactor of analysis files, now wired up to registry
ultimately this is to make it more convenient for us to wire up
different elements of the analysis pipeline, without having to
preload everything into memory before we need it

separately the index layer now has a mechanism for storing
internal key/value pairs.  this is expected to be used to
store the mapping, and possibly other pieces of data by the
top layer, but not exposed to the user at the top.
2014-08-13 21:14:47 -04:00
Marty Schoch
25540c736a introduced token type 2014-07-31 13:54:12 -04:00
Marty Schoch
3d842dfaf2 initial commit 2014-04-17 16:55:53 -04:00