bleve/analysis
Steve Yen 918732f3d8 unicode.Tokenize() allocs backing array of Tokens
Previously, unicode.Tokenize() allocated Tokens one at a time, on an
as-needed basis.

This change allocates a "backing array" of Tokens, so that it goes to
the runtime object allocator much less often.  It makes a heuristic
guess at the backing array size by using the average token (segment)
length seen so far.

Results from micro-benchmarks (null-firestorm, bleve-blast) suggest a
modest throughput improvement, perhaps under 0.5 MB/second.
2016-01-02 12:21:25 -08:00
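The backing-array pattern described above can be sketched as follows. This is a minimal illustration, not bleve's actual code: the `Token` struct here is a simplified stand-in for bleve's `analysis.Token`, the whitespace splitting stands in for the real unicode segmenter, and the `avgTokenLen` parameter is an assumed way of feeding in the "average segment length seen so far" heuristic.

```go
package main

import "fmt"

// Token is a simplified stand-in for bleve's analysis.Token;
// the real struct carries more fields (type, key term, etc.).
type Token struct {
	Term     []byte
	Position int
	Start    int
	End      int
}

// tokenize sketches the backing-array idea: guess how many tokens the
// input will produce from an average token length, allocate them all in
// one slice, and hand out pointers into that slice instead of
// allocating each Token individually.
func tokenize(input []byte, avgTokenLen int) []*Token {
	if avgTokenLen < 1 {
		avgTokenLen = 1
	}
	guess := len(input)/avgTokenLen + 1
	backing := make([]Token, 0, guess) // one allocation covers many tokens
	tokens := make([]*Token, 0, guess)

	pos, start := 0, 0
	for i := 0; i <= len(input); i++ {
		atEnd := i == len(input)
		if atEnd || input[i] == ' ' { // whitespace split stands in for the unicode segmenter
			if i > start {
				pos++
				if len(backing) == cap(backing) {
					// Guess was too small: start a fresh backing array.
					// Earlier pointers stay valid; they still point into
					// the old array, which append would otherwise abandon.
					backing = make([]Token, 0, guess)
				}
				backing = append(backing, Token{
					Term:     input[start:i],
					Position: pos,
					Start:    start,
					End:      i,
				})
				tokens = append(tokens, &backing[len(backing)-1])
			}
			start = i + 1
		}
	}
	return tokens
}

func main() {
	for _, t := range tokenize([]byte("hello bleve world"), 5) {
		fmt.Printf("%s [%d,%d)\n", t.Term, t.Start, t.End)
	}
	// prints:
	// hello [0,5)
	// bleve [6,11)
	// world [12,17)
}
```

Note the fresh-array step when the guess runs out: appending past capacity in place would reallocate the slice and silently invalidate every pointer already handed out, so a new backing array is started instead.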
analyzers              add support for a "web" tokenizer/analyzer                  2015-11-30 14:27:18 -05:00
byte_array_converters  major refactor of bleve configuration                       2015-09-16 17:10:59 -04:00
char_filters           add newline between license and package                     2014-09-02 10:54:50 -04:00
datetime_parsers       major refactor of bleve configuration                       2015-09-16 17:10:59 -04:00
language               Merge pull request #238 from ikawaha/ja-morph-analyzer      2015-09-28 17:05:46 -04:00
token_filters          token_map: document it along with stop_token_filter         2015-11-05 14:07:54 +01:00
token_map              token_map: document it along with stop_token_filter         2015-11-05 14:07:54 +01:00
tokenizers             unicode.Tokenize() allocs backing array of Tokens           2016-01-02 12:21:25 -08:00
benchmark_test.go      minor speed up in token frequency calculations              2015-09-04 18:57:39 -04:00
freq_test.go           minor speed up in token frequency calculations              2015-09-04 18:57:39 -04:00
freq.go                TokenFrequency() alloc's all TokenLocations up front        2016-01-02 12:21:17 -08:00
test_words.txt         major refactor of analysis files, now wired up to registry  2014-08-13 21:14:47 -04:00
token_map_test.go      fix issues identified by errcheck                           2015-04-07 14:52:00 -04:00
token_map.go           first pass at checking errors that were ignored             2015-03-06 14:46:29 -05:00
type.go                token_map: document it along with stop_token_filter         2015-11-05 14:07:54 +01:00
util_test.go           add newline between license and package                     2014-09-02 10:54:50 -04:00
util.go                fix issues with lucene stemmer                              2015-03-11 11:14:29 -04:00