bleve/analysis
Marty Schoch e472b3e807 add support for a "web" tokenizer/analyzer
The goal of the "web" tokenizer is to recognize web things like
- email addresses
- URLs
- twitter @handles and #hashtags

This implementation uses regexp exceptions.  There will most
likely be endless debate about the regular expressions. These
were chosen as "good enough for now".

There is also a "web" analyzer.  This is just the "standard"
analyzer, but using the "web" tokenizer instead of the "unicode"
one.  NOTE: after processing the exceptions, the "web" tokenizer
still falls back to the standard "unicode" tokenizer for the
remaining text.

Many users can simply set their mapping's default analyzer
to "web".

closes #269
2015-11-30 14:27:18 -05:00
..
analyzers add support for a "web" tokenizer/analyzer 2015-11-30 14:27:18 -05:00
byte_array_converters major refactor of bleve configuration 2015-09-16 17:10:59 -04:00
char_filters add newline between license and package 2014-09-02 10:54:50 -04:00
datetime_parsers major refactor of bleve configuration 2015-09-16 17:10:59 -04:00
language Merge pull request #238 from ikawaha/ja-morph-analyzer 2015-09-28 17:05:46 -04:00
token_filters token_map: document it along with stop_token_filter 2015-11-05 14:07:54 +01:00
token_map token_map: document it along with stop_token_filter 2015-11-05 14:07:54 +01:00
tokenizers add support for a "web" tokenizer/analyzer 2015-11-30 14:27:18 -05:00
benchmark_test.go minor speed up in token frequency calculations 2015-09-04 18:57:39 -04:00
freq_test.go minor speed up in token frequency calculations 2015-09-04 18:57:39 -04:00
freq.go doc: document Token, TokenFrequencies and Field structs 2015-10-09 12:32:44 +02:00
test_words.txt major refactor of analysis files, now wired up to registry 2014-08-13 21:14:47 -04:00
token_map_test.go fix issues identified by errcheck 2015-04-07 14:52:00 -04:00
token_map.go first pass at checking errors that were ignored 2015-03-06 14:46:29 -05:00
type.go token_map: document it along with stop_token_filter 2015-11-05 14:07:54 +01:00
util_test.go add newline between license and package 2014-09-02 10:54:50 -04:00
util.go fix issues with lucene stemmer 2015-03-11 11:14:29 -04:00