bleve

Author	SHA1	Message	Date
Rob McColl	414d21a541	Add comment about JSON serialization of kvconfig	2016-10-19 10:19:14 -04:00
Marty Schoch	2332455bd2	nicer formatting of license header	2016-10-02 10:13:14 -04:00
Marty Schoch	79cc39a67e	refactor mapping to interface and move into separate package. the index mapping contains some relatively messy logic and the top-level bleve package only cares about a relatively small portion of this. the motivation for this change is to codify the part that the top-level bleve package cares about into an interface, then move all the details into its own package. NOTE: the top-level bleve package still has hard dependency on the actual implementation (for now) because it must deserialize mappings from JSON and simply assumes it is this one instance. this is seen as OK for now, and this issue could be revisited in a future change. moving the logic into a separate package is seen as a simplification of top-level bleve, even though we still depend on the one particular implementation.	2016-09-29 14:53:18 -04:00
Marty Schoch	fb0f4bbecd	BREAKING CHANGE - new method to create memory only index. Previously bleve allowed you to create a memory-only index by simply passing "" as the path argument to the New() method. This was not clear when reading the code, and led to some problematic error cases as well. Now, to create a memory-only index one should use the NewMemOnly() method. Passing "" as the path argument to the New() method will now return os.ErrInvalid. Advanced users calling NewUsing() can create disk-based or memory-only indexes, but the change here is that passing "" as the path argument no longer defaults you into getting a memory-only index. Instead, the KV store is selected manually, just as it is for the disk-based solutions. Here is an example use of the NewUsing() method to create a memory-only index: NewUsing("", indexMapping, Config.DefaultIndexType, Config.DefaultMemKVStore, nil). Config.DefaultMemKVStore is just a new default value added to the configuration; it currently points to gtreap.Name (which could have been used directly instead for more control). closes #427	2016-09-27 14:11:40 -04:00
Marty Schoch	3fd2a64872	BREAKING CHANGE - removed DumpXXX() methods from bleve.Index. The DumpXXX() methods were always documented as internal and unsupported. However, now they are being removed from the public top-level API. They are still available on the internal IndexReader, which can be accessed using the Advanced() method. The DocCount() and DumpXXX() methods on the internal index have moved to the internal index reader, since they logically operate on a snapshot of an index.	2016-09-13 12:40:01 -04:00
Marty Schoch	d7292ed891	add support for gathering stats via map for easier consumption	2016-03-07 18:37:46 -05:00
Marty Schoch	0b2380d9bf	introduce ability for searches to timeout or be cancelled. our implementation uses: golang.org/x/net/context. New method SearchInContext() allows the user to run a search in the provided context. If that context is cancelled or exceeds its deadline Bleve will attempt to stop and return as soon as possible. This is a best effort attempt at this time and may not be in a timely manner. If the caller must return very near the timeout, the call should also be wrapped in a goroutine. The IndexAlias implementation is affected in a slightly more complex way. In order to return partial results when a timeout occurs on some indexes, the timeout is strictly enforced, and at the moment this does introduce an additional goroutine. The Bleve implementation honoring the context is currently very coarse-grained. Specifically we check the Done() channel between each DocumentMatch produced during the search. In the future we will propagate the context deeper into the internals of Bleve, and this will allow finer-grained timeout behavior.	2016-03-02 17:30:21 -05:00
opennota	8517feb1c6	Fix some typos	2016-01-15 05:46:27 +07:00
Marty Schoch	ab67b2f642	Merge pull request #267 from pmezard/doc-dump-methods index: document DumpAll, DumpDoc and DumpFields methods	2016-01-05 09:55:35 -05:00
Marty Schoch	aa7658bbb0	give indexes names, make stats available via expvar by default	2015-12-06 14:01:03 -05:00
Patrick Mezard	03b78deb5c	index: do not mention locking in DumpAll documentation. The behaviour depends on the nature of the KVStore.	2015-11-13 17:01:18 +01:00
Patrick Mezard	97529b1925	index: document DumpAll, DumpDoc and DumpFields methods	2015-11-03 18:11:02 +01:00
Patrick Mezard	2fa334fc27	doc: talk about "documents" not "indexed or stored documents"	2015-10-20 20:24:24 +02:00
Patrick Mezard	b174c137fd	doc: document DocIDReader, and some Index bits	2015-10-20 20:24:24 +02:00
Patrick Mezard	ed1bdbf599	doc: document field analyzer resolution	2015-10-02 17:00:45 +02:00
Marty Schoch	09bde6ca87	Merge pull request #237 from pmezard/document-mapping-rules doc: document indexed value/mappings interactions	2015-09-29 12:51:46 -04:00
Marty Schoch	cddf90c0ee	don't allow operations on empty doc id. fixes #239	2015-09-28 17:00:08 -04:00
Patrick Mezard	f72172a902	doc: document indexed value/mappings interactions. This is not the final word on this but it would have helped me a lot starting with bleve. I left out details about value processing and custom parsers. I also ignored that named FieldMapping can directly resolve value fields without having a parent SubDocumentMapping because it did not appear in any example I read. Let's consider this as a starting point for documentation improvements.	2015-09-23 19:57:38 +02:00
Marty Schoch	dbb93b75a4	refactoring to allow pluggable index encodings. this lays the foundation for supporting the new firestorm indexing scheme. i'm merging these changes ahead of the rest of the firestorm branch so i can continue to make changes to the analysis pipeline in parallel	2015-09-02 13:12:08 -04:00
Marty Schoch	328bc73ed0	clarify Batch is not threadsafe in docs. in some limited cases we can detect unsafe usage; in these cases, do not trip over ourselves and panic, instead return a strongly typed error upside_down.UnsafeBatchUseDetected. also, introduced Batch.Reset() to allow batch reuse; this is currently still experimental. closes #195	2015-05-15 15:04:52 -04:00
Marty Schoch	8581e73cef	added String method for Batch. also changed Batch methods to pointer receiver. closes #180	2015-04-08 10:41:42 -04:00
Marty Schoch	522f9d5cc7	significant change to index format, support dictionary rows. this introduces disk format v4. now the summary rows for a term are stored in their own "dictionary row" format; previously the same information was stored in special term frequency rows. this now allows us to easily iterate all the terms for a field in sorted order (useful for many other fuzzy data structures). at the top-level of bleve you can now browse terms within a field using the following api on the Index interface: FieldDict(field string) (index.FieldDict, error); FieldDictRange(field string, startTerm []byte, endTerm []byte) (index.FieldDict, error); FieldDictPrefix(field string, termPrefix []byte) (index.FieldDict, error). fixes #127	2015-03-10 16:22:19 -04:00
Marty Schoch	af356acff0	changed batch behavior. now created through the index itself. mapping problems reported early at the time data is added to the batch; previously these were not reported until the batch was executed	2015-03-09 08:20:39 -04:00
Marty Schoch	1368d7b3b4	NewUsing persists the provided config to index meta. new method OpenUsing allows user to override values in the persisted config. example would be opening the index, but using a different buffer size for leveldb (not actually supported yet, but that is the idea). closes #138	2015-01-06 17:19:46 -05:00
Marty Schoch	68712cd142	support for accessing the underlying index/store impls. now you can access the underlying index/store implementations using the Advanced() method. this is intended for advanced usage only, and can lead to problems if misused. also, there is a new method NewUsing(...) which allows callers of the top-level API to choose which underlying k/v store they want to use.	2014-12-27 13:23:46 -08:00
Marty Schoch	6141a5aad3	make batch internals private closes #119	2014-11-25 11:11:28 -05:00
Marty Schoch	c7443fe52b	refactored API a bit more. things can return error now. in a couple of places we had to swallow errors because they didn't fit the existing API. in these cases and proactively in a few others we now return error as well. also the batch API has been updated to allow performing set/delete internal within the batch	2014-10-31 09:40:23 -04:00
Marty Schoch	0500a572af	exposed Get/Set/Delete Internal methods. these are to be used to store side-channel information along with the index	2014-10-22 16:03:55 -04:00
Marty Schoch	64b0066121	added support for tracking index stats and exposing via expvar. closes #83	2014-10-02 11:12:49 -07:00
Marty Schoch	209f808722	improve go docs at the top level. part of #79	2014-08-31 10:55:22 -04:00
Marty Schoch	37d3f0205d	cleanup spacing between license and package	2014-08-29 14:18:36 -04:00
Marty Schoch	1161361bea	rename imports from couchbaselabs to blevesearch	2014-08-28 15:38:57 -04:00
Marty Schoch	34afb0929e	made it safe to use bleve.Index object from multiple threads. an RWMutex ensures that only one write op is done at a time, and that all other ops have finished prior to closing	2014-08-25 09:06:53 -04:00
Marty Schoch	27f001bc14	overhauled top-level New/Open API. New is now used to create new indexes. Open is used to open existing indexes. calls to Open no longer specify a mapping because the mapping is serialized and stored along with the index	2014-08-20 16:58:20 -04:00
Marty Schoch	c33f1668f7	refactor dump methods. improved test coverage	2014-08-15 13:12:55 -04:00
Marty Schoch	e5d4e6f1e4	refactored index layer to support batch operations. this change was then exposed at the higher levels. also the beer-sample app was upgraded to index in batches of 100 by default. this yielded an indexing speed up from 27s to 16s. closes #57	2014-08-11 16:27:18 -04:00
Marty Schoch	42895649de	further streamlined the API. introduced concept of byte array converters, right now only wired up to top-level index mapping, allowing the removal of the JSON methods. now at the top level we default to parsing []byte as JSON, override if that's not the behavior you want. future enhancements will allow use of these byte array converters to control how byte arrays are handled elsewhere in documents. this would allow for handling binary attachments, etc in the future. closes #59	2014-08-11 12:47:29 -04:00
Marty Schoch	4ae9eb895c	added method to list fields in the index. also added a corresponding http handler	2014-07-31 11:47:36 -04:00
Marty Schoch	216767953c	introduced a config option to disable creating indexes if they don't already exist. closes #23 and closes #24	2014-07-30 14:29:26 -04:00
Marty Schoch	2968d3538a	major refactor, apologies for the large commit. removed analyzers (these are now built as needed through config); removed html character filter (now built as needed through config); added missing license header; changed constructor signature of filters that cannot return errors; filter constructors that can have errors now have Must variant which panics; change cdl2 tokenizer into filter (should only see lower-case input); new top level index api, closes #5; refactored index tests to not rely directly on analyzers; moved query objects to top-level; new top level search api, closes #12; top score collector allows skipping results; index mapping supports _all by default, closes #3 and closes #6; index mapping supports disabled sections, closes #7; new http sub package with reusable http.Handler's, closes #22	2014-07-30 12:30:38 -04:00

40 Commits