bleve

Author	SHA1	Message	Date
Steve Yen	ea0a8657f3	added cznicb in-memory kvstore (no reader isolation)	2015-01-13 17:35:28 -08:00
Steve Yen	db82eae3f4	go fmt	2015-01-13 11:04:45 -08:00
Steve Yen	603c3af8bb	added gtreap in-memory, copy-on-write KVStore	2015-01-12 11:26:21 -08:00
Marty Schoch	5978f50b8c	added ability to log slow searches closes #88	2014-12-28 19:34:16 -08:00
Marty Schoch	0ddfa774ec	clean up logging to use package level *log.Logger; by default messages go to ioutil.Discard	2014-12-28 12:14:48 -08:00
Marty Schoch	d452b2a10e	add support for dictionary based compound word filter partially addresses #115	2014-11-18 15:18:42 -05:00
Marty Schoch	cf3643f292	added pure go tokenizer to do unicode word boundary segmentation	2014-10-17 18:07:48 -04:00
Marty Schoch	97902e2619	text analysis now moved out of index write lock onto goroutine: 1. text analysis is now done before the write lock is acquired 2. there is now a pool of analysis workers 3. the size of this pool is configurable 4. this allows for documents in a batch to be analyzed concurrently. as a part of benchmarking these changes i've also introduced a new null storage implementation. this should never be used, as it does not actually build an index. it does however let us go through all the normal indexing machinery, without incurring any indexing I/O. this is very helpful in measuring improvements made to the text analysis pipeline, which are often overshadowed by indexing times in benchmarks actually building an index.	2014-09-24 08:13:14 -04:00
Marty Schoch	8c16d68c00	include cjk analyzer in default config	2014-09-11 10:44:14 -04:00
Marty Schoch	8debf26cb7	changed many components to not have defaults; many of these defaults were arbitrary, and not having defaults lets us more easily flag them for configuration; added a shingle filter; introduce new token type for shingles	2014-09-09 18:15:14 -04:00
Marty Schoch	933d99c576	rename the configurable token map from standard to custom; this makes it consistent with the "custom" analyzer which operates similarly; also, added it to the config.go so it's registered and available for use	2014-09-07 14:09:38 -04:00
Marty Schoch	1dcd06e412	add ability to define custom analysis as part of index mapping; now, as part of your index mapping you can create custom analysis components. these custom analysis components are serialized as part of the mapping, and reused as you would expect on subsequent accesses.	2014-09-01 13:55:23 -04:00
Marty Schoch	2ee7289bc8	major refactor of search package; this started initially to relocate highlighting into a self contained package, which would then also use the registry; however, it turned into a much larger refactor in order to avoid cyclic imports; now facets, searchers, scorers and collectors are also broken out into subpackages of search	2014-09-01 11:15:38 -04:00
Marty Schoch	209f808722	improve go docs at the top level part of #79	2014-08-31 10:55:22 -04:00
Marty Schoch	7bfad18d40	moved byte array converts into the analysis package	2014-08-29 19:23:21 -04:00
Marty Schoch	77c998a7a2	made config private and fixed broken test	2014-08-29 15:32:36 -04:00
Marty Schoch	37d3f0205d	cleanup spacing between license and package	2014-08-29 14:18:36 -04:00
Marty Schoch	1161361bea	rename imports from couchbaselabs to blevesearch	2014-08-28 15:38:57 -04:00
Marty Schoch	ef59abe4c9	added build tag 'leveldb' to enable this kv store; by default we now use the pure go boltdb kv store; it is less tested at this point but appears to work; tests pass, and moves us closer to the goal of being able to just "go get" bleve	2014-08-25 15:18:24 -04:00
Marty Schoch	e8959d03ae	added build tag 'icu' to enable functionality dependent on it	2014-08-25 12:22:01 -04:00
Marty Schoch	21ef6e9878	added build tag for things depending on libstemmer	2014-08-25 12:06:10 -04:00
Marty Schoch	f37bb77794	added build tag to enable cld2	2014-08-25 11:24:20 -04:00
Marty Schoch	27f001bc14	overhauled top-level New/Open API; New is now used to create new indexes; Open is used to open existing indexes; calls to Open no longer specify a mapping because the mapping is serialized and stored along with the index	2014-08-20 16:58:20 -04:00
Marty Schoch	c526a38369	major refactor of analysis files, now wired up to registry; ultimately this is to make it more convenient for us to wire up different elements of the analysis pipeline, without having to preload everything into memory before we need it; separately the index layer now has a mechanism for storing internal key/value pairs. this is expected to be used to store the mapping, and possibly other pieces of data by the top layer, but not exposed to the user at the top.	2014-08-13 21:14:47 -04:00
Marty Schoch	3481ec9cef	added hindi stemmer closes #40	2014-08-11 22:29:47 -04:00
Marty Schoch	c65f7415ff	added hindi normalizer closes #64	2014-08-11 19:51:47 -04:00
Marty Schoch	cd0e3fd85b	added german normalizer updated german analyzer to use this normalizer closes #65	2014-08-11 19:25:37 -04:00
Marty Schoch	a4707ebb4e	configured zero width non joiner char filter, and persian analyzer	2014-08-11 18:57:04 -04:00
Marty Schoch	4ccd69ed45	added arabic normalizer closes #63	2014-08-11 18:35:35 -04:00
Marty Schoch	73b252f6a6	added persian normalizer closes #67	2014-08-11 18:15:41 -04:00
Marty Schoch	42895649de	further streamlined the API; introduced concept of byte array converters; right now only wired up to top-level index mapping, allowing the removal of the JSON methods; now at the top level we default to parsing []byte as JSON, override if that's not the behavior you want. future enhancements will allow use of these byte array converters to control how byte arrays are handled elsewhere in documents; this would allow for handling binary attachments, etc in the future; closes #59	2014-08-11 12:47:29 -04:00
Marty Schoch	e21b7f4436	added sorani normalizer and stemmer, now have analyzer closes #43	2014-08-08 09:38:28 -04:00
Marty Schoch	ef35ea1985	added czech stop word list closes #36	2014-08-07 22:32:49 -04:00
Marty Schoch	9a777aaa80	added token truncate filter closes #49	2014-08-06 20:39:42 -04:00
Marty Schoch	d84187fd24	added apostrophe filter to improve turkish analyzer closes #27	2014-08-06 08:50:00 -04:00
Marty Schoch	78da6fd65d	added support for a default field this works at the config and index mapping levels	2014-08-06 08:23:29 -04:00
Marty Schoch	79ab2b9b3d	added unicode normalization filter	2014-08-04 21:59:57 -04:00
Marty Schoch	2c0bf23fac	added elision filter; defined article word maps for french, italian, irish and catalan; defined elision filters for these same languages; updated analyzers for french and italian to use this new filter; irish and catalan still depend on other missing pieces; closes #25	2014-08-03 19:17:35 -04:00
Marty Schoch	0960cab0ae	refactored StopWordsMap into WordMap so it can be reused; the ElisionFilter will need a word list of articles and plan to reuse this	2014-08-03 17:46:35 -04:00
Marty Schoch	00d6f9700b	added support for date range fields and queries closes #9 and closes #11	2014-08-03 17:19:04 -04:00
Marty Schoch	3eb63a887b	improved stop word support and related config; stop words can be loaded from files/bytes, closes #19; stop words loaded for large list of languages, closes #20; defined language specific analyzers for as much as possible right now, closes #21; opened new issues for some of the remaining gaps	2014-07-30 19:29:52 -04:00
Marty Schoch	216767953c	introduced a config option to disable creating indexes if they don't already exist closes #23 and closes #24	2014-07-30 14:29:26 -04:00
Marty Schoch	2968d3538a	major refactor, apologies for the large commit; removed analyzers (these are now built as needed through config); removed html character filter (now built as needed through config); added missing license header; changed constructor signature of filters that cannot return errors; filter constructors that can have errors, now have Must variant which panics; change cld2 tokenizer into filter (should only see lower-case input); new top level index api, closes #5; refactored index tests to not rely directly on analyzers; moved query objects to top-level; new top level search api, closes #12; top score collector allows skipping results; index mapping supports _all by default, closes #3 and closes #6; index mapping supports disabled sections, closes #7; new http sub package with reusable http.Handler's, closes #22	2014-07-30 12:30:38 -04:00

43 Commits