Commit Graph

32 Commits

Author SHA1 Message Date
Marty Schoch
1dcd06e412 add ability to define custom analysis as part of index mapping
now, as part of your index mapping you can create custom
analysis components.  these custom analysis components
are serialized as part of the mapping, and reused
as you would expect on subsequent accesses.
2014-09-01 13:55:23 -04:00
Marty Schoch
2ee7289bc8 major refactor of search package
this started initially to relocate highlighting into
a self contained package, which would then also use
the registry
however, it turned into a much larger refactor in
order to avoid cyclic imports
now facets, searchers, scorers and collectors
are also broken out into subpackages of search
2014-09-01 11:15:38 -04:00
Marty Schoch
209f808722 improve go docs at the top level
part of #79
2014-08-31 10:55:22 -04:00
Marty Schoch
7bfad18d40 moved byte array converts into the analysis package 2014-08-29 19:23:21 -04:00
Marty Schoch
77c998a7a2 made config private and fixed broken test 2014-08-29 15:32:36 -04:00
Marty Schoch
37d3f0205d cleanup spacing between license and package 2014-08-29 14:18:36 -04:00
Marty Schoch
1161361bea rename imports from couchbaselabs to blevesearch 2014-08-28 15:38:57 -04:00
Marty Schoch
ef59abe4c9 added build tag 'leveldb' to enable this kv store
by default we now use the pure go boltdb kv store
it is less tested at this point but appears to work
tests pass, and moves us closer to the goal of being
able to just "go get" bleve
2014-08-25 15:18:24 -04:00
Marty Schoch
e8959d03ae added build tag 'icu' to enable functionality dependent on it 2014-08-25 12:22:01 -04:00
Marty Schoch
21ef6e9878 added build tag for things depending on libstemmer 2014-08-25 12:06:10 -04:00
Marty Schoch
f37bb77794 added build tag to enable cld2 2014-08-25 11:24:20 -04:00
Marty Schoch
27f001bc14 overhauled top-level New/Open API
New is now used to create new indexes
Open is used to open existing indexes
calls to Open no longer specify a mapping because the mapping
is serialized and stored along with the index
2014-08-20 16:58:20 -04:00
Marty Schoch
c526a38369 major refactor of analysis files, now wired up to registry
ultimately this is to make it more convenient for us to wire up
different elements of the analysis pipeline, without having to
preload everything into memory before we need it

separately the index layer now has a mechanism for storing
internal key/value pairs.  this is expected to be used to
store the mapping, and possibly other pieces of data by the
top layer, but not exposed to the user at the top.
2014-08-13 21:14:47 -04:00
Marty Schoch
3481ec9cef added hindi stemmer
closes #40
2014-08-11 22:29:47 -04:00
Marty Schoch
c65f7415ff added hindi normalizer
closes #64
2014-08-11 19:51:47 -04:00
Marty Schoch
cd0e3fd85b added german normalizer
updated german analyzer to use this normalizer
closes #65
2014-08-11 19:25:37 -04:00
Marty Schoch
a4707ebb4e configured zero width non joiner char filter, and persian analyzer 2014-08-11 18:57:04 -04:00
Marty Schoch
4ccd69ed45 added arabic normalizer
closes #63
2014-08-11 18:35:35 -04:00
Marty Schoch
73b252f6a6 added persian normalizer
closes #67
2014-08-11 18:15:41 -04:00
Marty Schoch
42895649de further streamlined the API
introduced concept of byte array converters
right now only wired up to top-level index mapping
allowing the removal of the JSON methods, now at the top level
we default to parsing []byte as JSON, override if that's not
the behavior you want.

future enhancements will allow use of these byte array converters
to control how byte arrays are handled elsewhere in documents
this would allow for handling binary attachments, etc. in the future

closes #59
2014-08-11 12:47:29 -04:00
Marty Schoch
e21b7f4436 added sorani normalizer and stemmer, now have analyzer
closes #43
2014-08-08 09:38:28 -04:00
Marty Schoch
ef35ea1985 added czech stop word list
closes #36
2014-08-07 22:32:49 -04:00
Marty Schoch
9a777aaa80 added token truncate filter
closes #49
2014-08-06 20:39:42 -04:00
Marty Schoch
d84187fd24 added apostrophe filter to improve turkish analyzer
closes #27
2014-08-06 08:50:00 -04:00
Marty Schoch
78da6fd65d added support for a default field
this works at the config and index mapping levels
2014-08-06 08:23:29 -04:00
Marty Schoch
79ab2b9b3d added unicode normalization filter 2014-08-04 21:59:57 -04:00
Marty Schoch
2c0bf23fac added elision filter
defined article word maps for french, italian, irish and catalan
defined elision filters for these same languages
updated analyzers for french and italian to use this new filter
irish and catalan still depend on other missing pieces
closes #25
2014-08-03 19:17:35 -04:00
Marty Schoch
0960cab0ae refactored StopWordsMap into WordMap so it can be reused
the ElisionFilter will need a word list of articles, and we plan to reuse this
2014-08-03 17:46:35 -04:00
Marty Schoch
00d6f9700b added support for date range fields and queries
closes #9 and closes #11
2014-08-03 17:19:04 -04:00
Marty Schoch
3eb63a887b improved stop word support and related config
stop words can be loaded from files/bytes, closes #19
stop words loaded for large list of languages, closes #20
defined language specific analyzers for as much as possible right now, closes #21
opened new issues for some of the remaining gaps
2014-07-30 19:29:52 -04:00
Marty Schoch
216767953c introduced a config option to disable creating indexes if they don't already exist
closes #23 and closes #24
2014-07-30 14:29:26 -04:00
Marty Schoch
2968d3538a major refactor, apologies for the large commit
removed analyzers (these are now built as needed through config)
removed html character filter (now built as needed through config)
added missing license header
changed constructor signature of filters that cannot return errors
filter constructors that can have errors, now have Must variant which panics
change cld2 tokenizer into filter (should only see lower-case input)
new top level index api, closes #5
refactored index tests to not rely directly on analyzers
moved query objects to top-level
new top level search api, closes #12
top score collector allows skipping results
index mapping supports _all by default, closes #3 and closes #6
index mapping supports disabled sections, closes #7
new http sub package with reusable http.Handler's, closes #22
2014-07-30 12:30:38 -04:00
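
Commit 2968d3538a notes that filter constructors which can fail now have a Must variant which panics. A minimal sketch of that pattern follows. The names TruncateFilter, NewTruncateFilter, and MustNewTruncateFilter are hypothetical and chosen for illustration only; they are not taken from the bleve source, and the real constructors may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// TruncateFilter is a hypothetical token filter used only for illustration.
type TruncateFilter struct {
	maxLen int
}

// NewTruncateFilter validates its configuration and reports problems
// through an error return value.
func NewTruncateFilter(maxLen int) (*TruncateFilter, error) {
	if maxLen <= 0 {
		return nil, errors.New("maxLen must be positive")
	}
	return &TruncateFilter{maxLen: maxLen}, nil
}

// MustNewTruncateFilter wraps NewTruncateFilter and panics on error.
// It suits configuration that is known to be valid at compile time.
func MustNewTruncateFilter(maxLen int) *TruncateFilter {
	f, err := NewTruncateFilter(maxLen)
	if err != nil {
		panic(err)
	}
	return f
}

// Filter truncates each token to at most maxLen bytes (not runes).
func (f *TruncateFilter) Filter(tokens []string) []string {
	out := make([]string, 0, len(tokens))
	for _, tok := range tokens {
		if len(tok) > f.maxLen {
			tok = tok[:f.maxLen]
		}
		out = append(out, tok)
	}
	return out
}

func main() {
	// Known-good configuration: the Must variant keeps the call site short.
	f := MustNewTruncateFilter(3)
	fmt.Println(f.Filter([]string{"search", "go"}))

	// Configuration from an untrusted source: handle the error instead.
	if _, err := NewTruncateFilter(0); err != nil {
		fmt.Println("error:", err)
	}
}
```

The same convention appears in the Go standard library, for example regexp.Compile and regexp.MustCompile.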