bleve/analysis/token_filters/porter
Marty Schoch 44ff6ced8a improve perf of porter stemmer
1.  the porter stemmer offers a method that does NOT lowercase; however,
to use it we must first convert to runes ourselves, so we now do this

2.  now we can invoke the version that skips lowercasing, since we
already lowercase before stemming through a separate filter

because the stemmer modifies the runes in place, we have no way
to know whether anything changed, so we must always encode back
into the term byte slice
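
The flow described above (decode the term to runes, stem without lowercasing, always re-encode) can be sketched roughly as follows. This is a minimal illustration, not the code in porter.go: stemInPlace, encodeRunes and stemTerm are hypothetical names, and stemInPlace only strips a trailing 's' as a stand-in for the real Porter stemmer.

```go
package main

import (
	"bytes"
	"fmt"
	"unicode/utf8"
)

// stemInPlace is a hypothetical stand-in for a stemmer entry point that
// skips lowercasing. It only drops one trailing 's'; it is NOT the Porter
// algorithm. Like the real stemmer, it gives no signal about whether the
// input was changed.
func stemInPlace(runes []rune) []rune {
	if n := len(runes); n > 1 && runes[n-1] == 's' {
		return runes[:n-1]
	}
	return runes
}

// encodeRunes re-encodes runes as UTF-8 bytes (an assumed helper, playing
// the role the commit message assigns to analysis.BuildTermFromRunes).
func encodeRunes(runes []rune) []byte {
	buf := make([]byte, len(runes)*utf8.UTFMax)
	n := 0
	for _, r := range runes {
		n += utf8.EncodeRune(buf[n:], r)
	}
	return buf[:n]
}

// stemTerm assumes the term was already lowercased by an earlier filter.
// Because the stemmer cannot report changes, it always re-encodes.
func stemTerm(term []byte) []byte {
	runes := bytes.Runes(term) // convert to runes ourselves
	stemmed := stemInPlace(runes)
	return encodeRunes(stemmed)
}

func main() {
	fmt.Println(string(stemTerm([]byte("walks"))))
}
```

The re-encode on every term is the cost that the last paragraph of this message proposes to avoid.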

added a unit test that catches the problem found

NOTE: this uses analysis.BuildTermFromRunes, so the perf gain is
only visible with the other PR also merged

future gains are possible if we update the stemmer to let us
know whether changes were made, thus skipping re-encoding to
[]byte when no changes were actually made
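
A rough sketch of that possible future shape, assuming a stemmer that reports whether it changed anything. No such API is claimed to exist: stemReportingChange and stemTerm are hypothetical names, and the stemming rule is again a trailing-'s' stand-in rather than the Porter algorithm.

```go
package main

import (
	"bytes"
	"fmt"
)

// stemReportingChange is a hypothetical stemmer variant that returns the
// stemmed runes plus a flag saying whether it modified the input.
func stemReportingChange(runes []rune) ([]rune, bool) {
	if n := len(runes); n > 1 && runes[n-1] == 's' {
		return runes[:n-1], true
	}
	return runes, false
}

// stemTerm returns the original byte slice untouched when nothing changed,
// and only pays for re-encoding when the stem differs.
func stemTerm(term []byte) []byte {
	stemmed, changed := stemReportingChange(bytes.Runes(term))
	if !changed {
		return term // no re-encoding needed
	}
	return []byte(string(stemmed))
}

func main() {
	fmt.Println(string(stemTerm([]byte("walks"))))
	fmt.Println(string(stemTerm([]byte("walk"))))
}
```

With this shape, unchanged terms keep their original backing array, so the only remaining per-term cost is the decode to runes.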
2016-09-11 20:13:15 -04:00
..
porter_test.go improve perf of porter stemmer 2016-09-11 20:13:15 -04:00
porter.go improve perf of porter stemmer 2016-09-11 20:13:15 -04:00