bleve

Author	SHA1	Message	Date
Marty Schoch	79cc39a67e	refactor mapping to interface and move into separate package. the index mapping contains some relatively messy logic and the top-level bleve package only cares about a relatively small portion of this. the motivation for this change is to codify the part that the top-level bleve package cares about into an interface, then move all the details into its own package. NOTE: the top-level bleve package still has hard dependency on the actual implementation (for now) because it must deserialize mappings from JSON and simply assumes it is this one instance. this is seen as OK for now, and this issue could be revisited in a future change. moving the logic into a separate package is seen as a simplification of top-level bleve, even though we still depend on the one particular implementation.	2016-09-29 14:53:18 -04:00
Marty Schoch	fb0f4bbecd	BREAKING CHANGE - new method to create memory only index Previously bleve allowed you to create a memory-only index by simply passing "" as the path argument to the New() method. This was not clear when reading the code, and led to some problematic error cases as well. Now, to create a memory-only index one should use the NewMemOnly() method. Passing "" as the path argument to the New() method will now return os.ErrInvalid. Advanced users calling NewUsing() can create disk-based or memory-only indexes, but the change here is that passing "" as the path argument no longer defaults you into getting a memory-only index. Instead, the KV store is selected manually, just as it is for the disk-based solutions. Here is an example use of the NewUsing() method to create a memory-only index: NewUsing("", indexMapping, Config.DefaultIndexType, Config.DefaultMemKVStore, nil) Config.DefaultMemKVStore is just a new default value added to the configuration, it currently points to gtreap.Name (which could have been used directly instead for more control) closes #427	2016-09-27 14:11:40 -04:00
Marty Schoch	3fd2a64872	BREAKING CHANGE - removed DumpXXX() methods from bleve.Index The DumpXXX() methods were always documented as internal and unsupported. However, now they are being removed from the public top-level API. They are still available on the internal IndexReader, which can be accessed using the Advanced() method. The DocCount() and DumpXXX() methods on the internal index have moved to the internal index reader, since they logically operate on a snapshot of an index.	2016-09-13 12:40:01 -04:00
Marty Schoch	47c239ca7b	refactored data structure out of collector the TopNCollector now can either use a heap or a list i did not code it to use an interface, because this is a very hot loop during searching. rather, it lets bleve developers easily toggle between the two (or other ideas) by changing 2 lines The list is faster in the benchmark, but causes more allocations. The list is once again the default (for now). To switch to the heap implementation, change: store collectStoreList to store collectStoreHeap and newStoreList(... to newStoreHeap(...	2016-08-26 10:29:50 -04:00
Marty Schoch	0bb69a9a1c	Merge branch 'master' of https://github.com/dtylman/bleve into sort-by-field-try2	2016-08-12 14:23:55 -04:00
Danny Tylman	0d6a2b565f	closes #110	2016-08-10 11:44:31 +03:00
Danny Tylman	5164e70f6e	Adding sort to SearchRequest.	2016-08-09 16:18:53 +03:00
Marty Schoch	5aa9e95468	major refactor of index/search API index id's are now opaque (until finally returned to top-level user) - the TermFieldDoc's returned by TermFieldReader no longer contain doc id - instead they return an opaque IndexInternalID - items returned are still in the "natural index order" - but that is no longer guaranteed to be "doc id order" - correct behavior requires that they all follow the same order - but not any particular order - new API FinalizeDocID which converts index internal ID's to public string ID - APIs used internally which previously took doc id now take IndexInternalID - that is DocumentFieldTerms() and DocumentFieldTermsForFields() - however, APIs that are used externally do not reflect this change - that is Document() - DocumentIDReader follows the same changes, but this is less obvious - behavior clarified, used to iterate doc ids, BUT NOT in doc id order - method STILL available to iterate doc ids in range - but again, you won't get them in any meaningful order - new method to iterate actual doc ids from list of possible ids - this was introduced to make the DocIDSearcher continue working searchers now work with the new opaque index internal doc ids - they return new DocumentMatchInternal (which does not have string ID) scorers also work with these opaque index internal doc ids - they return DocumentMatchInternal (which does not have string ID) collectors now also perform a final step of converting the final result - they STILL return traditional DocumentMatch (with string ID) - but they now also require an IndexReader (so that they can do the conversion)	2016-07-31 13:46:18 -04:00
Marty Schoch	389e18a779	attempt to support google app engine the default configuration, which sets the default kv engine to boltdb is now done in file protected with the !appengine build tag. this at least lets the analysis-wizzard app run locally in the appengine simulator. this still has not been tested on the real appengine, and further changes may be required.	2016-07-29 21:29:05 -04:00
slavikm	6d830a9f3e	Load the document only once for both fields and highlighter	2016-04-28 11:12:33 -07:00
Marty Schoch	d0c6dbc9cf	unregister index from expvar stats on close	2016-04-20 11:43:14 -04:00
Marty Schoch	709b418823	properly initialize index stats object for in memory indexes	2016-04-20 11:37:51 -04:00
Marty Schoch	2b82387eae	export the Validate method on mapping objects	2016-03-28 17:14:41 -04:00
Marty Schoch	e76b1dd8f3	better logging on index mapping corruption	2016-03-11 16:36:43 -05:00
Marty Schoch	d7292ed891	add support for gathering stats via map for easier consumption	2016-03-07 18:37:46 -05:00
Marty Schoch	0b2380d9bf	introduce ability for searches to timeout or be cancelled our implementation uses: golang.org/x/net/context New method SearchInContext() allows the user to run a search in the provided context. If that context is cancelled or exceeds its deadline Bleve will attempt to stop and return as soon as possible. This is a best effort attempt at this time and may not be in a timely manner. If the caller must return very near the timeout, the call should also be wrapped in a goroutine. The IndexAlias implementation is affected in a slightly more complex way. In order to return partial results when a timeout occurs on some indexes, the timeout is strictly enforced, and at the moment this does introduce an additional goroutine. The Bleve implementation honoring the context is currently very coarse-grained. Specifically we check the Done() channel between each DocumentMatch produced during the search. In the future we will propagate the context deeper into the internals of Bleve, and this will allow finer-grained timeout behavior.	2016-03-02 17:30:21 -05:00
Marty Schoch	214b67ad66	SearchResult now includes a Status section the Status section can report on the number of total/fail/success indexes when querying across multiple indexes through IndexAlias Further, searching an IndexAlias will now return partial results, the burden is on the caller to check the number of failed indexes and decide how to handle this situation.	2016-02-22 16:50:40 -05:00
Marty Schoch	f38e3e1b24	remove temporary error and replace with permanent check	2016-02-03 10:23:49 -05:00
Marty Schoch	c5dea9e882	fix accessing store via Advanced() method which was broken	2016-02-02 11:54:18 -05:00
Marty Schoch	a236737a68	temporary workaround to avoid crashing when an index is not behaving consistently with the API contracts	2016-02-01 12:31:26 -05:00
Marty Schoch	10e2207179	adding logging for unexplained observed behavior MB-17298 it would appear that a document lookup for an id fails but that is a document id that was returned as a search hit since we're using a stable snapshot, this should not happen	2016-01-25 10:45:58 -05:00
opennota	8517feb1c6	Fix some typos	2016-01-15 05:46:27 +07:00
slavikm	680be52f87	Implemented boolean field support	2016-01-11 17:18:03 -08:00
Marty Schoch	d73beac3b9	search result hits now have a field with the name of the index this allows you to figure out where a result actually came from when using aliases	2015-12-08 13:55:04 -05:00
Marty Schoch	aa7658bbb0	give indexes names, make stats available via expvar by default	2015-12-06 14:01:03 -05:00
Marty Schoch	64ce81c283	Merge branch 'master' into newkvstore	2015-09-29 14:06:27 -04:00
Marty Schoch	da40935e22	Merge branch 'codesimplification' of https://github.com/Shugyousha/bleve into Shugyousha-codesimplification	2015-09-29 13:02:56 -04:00
Marty Schoch	cddf90c0ee	don't allow operations on empty doc id fixes #239	2015-09-28 17:00:08 -04:00
Marty Schoch	1c9feaf792	fix backwards compatibility when index meta does not specify the index type	2015-09-25 09:57:09 -07:00
Marty Schoch	900f1b4a67	major kvstore interface and impl overhaul clarified the interface contract	2015-09-23 11:25:47 -07:00
Silvan Jegen	3414701fca	Simplify returns	2015-09-21 20:47:10 +02:00
Marty Schoch	f81b2be334	major refactor of bleve configuration see #221 for full details	2015-09-16 17:10:59 -04:00
Marty Schoch	dbb93b75a4	refactoring to allow pluggable index encodings this lays the foundation for supporting the new firestorm indexing scheme. i'm merging these changes ahead of the rest of the firestorm branch so i can continue to make changes to the analysis pipeline in parallel	2015-09-02 13:12:08 -04:00
Marty Schoch	a9c07acbfa	refactor of kvstore api to support native merge in rocksdb refactor to share code in emulated batch refactor to share code in emulated merge refactor index kvstore benchmarks to share more code refactor index kvstore benchmarks to be more repeatable	2015-04-24 17:13:50 -04:00
Marty Schoch	11262c793f	fix bug, internal ops must check that index is open possibly fixes https://github.com/couchbaselabs/cbft/issues/49	2015-04-03 18:05:24 -04:00
Sathyanarayanan Gunasekaran	93e749bc0c	Check all return errors - Fix the following errors found by errcheck : $ bleve git:(master) errcheck github.com/blevesearch/bleve github.com/blevesearch/bleve/index_impl.go:206:25 defer indexReader.Close() github.com/blevesearch/bleve/index_impl.go:317:25 defer indexReader.Close() github.com/blevesearch/bleve/index_impl.go:353:25 defer indexReader.Close() github.com/blevesearch/bleve/index_impl.go:359:22 defer searcher.Close() github.com/blevesearch/bleve/index_impl.go:497:25 defer indexReader.Close() github.com/blevesearch/bleve/index_impl.go:644:20 defer reader.Close() github.com/blevesearch/bleve/index_meta.go:67:27 defer indexMetaFile.Close()	2015-03-11 01:28:51 -04:00
Marty Schoch	522f9d5cc7	significant change to index format, support dictionary rows this introduces disk format v4 now the summary rows for a term are stored in their own "dictionary row" format, previously the same information was stored in special term frequency rows this now allows us to easily iterate all the terms for a field in sorted order (useful for many other fuzzy data structures) at the top-level of bleve you can now browse terms within a field using the following api on the Index interface: FieldDict(field string) (index.FieldDict, error) FieldDictRange(field string, startTerm []byte, endTerm []byte) (index.FieldDict, error) FieldDictPrefix(field string, termPrefix []byte) (index.FieldDict, error) fixes #127	2015-03-10 16:22:19 -04:00
Marty Schoch	af356acff0	changed batch behavior now created through the index itself mapping problems reported early at the time data is added to the batch, previously these were not reported until the batch was executed	2015-03-09 08:20:39 -04:00
Marty Schoch	0771f813ce	SearchResult Took field now returns full time in Search() likewise, MultiSearch used by aliases spanning multiple indexes will also return full time in MultiSearch() closes #163	2015-02-19 12:11:40 +05:30
Marty Schoch	ba978ea27e	improving log messages	2015-01-16 14:07:47 -05:00
Marty Schoch	1368d7b3b4	NewUsing persists the provided config to index meta new method OpenUsing allows user to override values in the persisted config example would be opening the index, but using a different buffer size for leveldb (not actually supported yet, but that is the idea) closes #138	2015-01-06 17:19:46 -05:00
Marty Schoch	435058a928	fix go vet issue	2014-12-28 19:44:03 -08:00
Marty Schoch	5978f50b8c	added ability to log slow searches closes #88	2014-12-28 19:34:16 -08:00
Marty Schoch	68712cd142	support for accessing the underlying index/store impls now you can access the underlying index/store implementations using the Advanced() method. this is intended for advanced usage only, and can lead to problems if misused. also, there is a new method NewUsing(...) which allows callers of the top-level API to choose which underlying k/v store they want to use.	2014-12-27 13:23:46 -08:00
Silvan Jegen	ef18dfe4cd	Fix typos in comments and strings	2014-12-18 18:43:12 +01:00
Marty Schoch	6141a5aad3	make batch internals private closes #119	2014-11-25 11:11:28 -05:00
Marty Schoch	c7443fe52b	refactored API a bit more things can return error now in a couple of places we had to swallow errors because they didn't fit the existing API. in these cases and proactively in a few others we now return error as well. also the batch API has been updated to allow performing set/delete internal within the batch	2014-10-31 09:40:23 -04:00
Marty Schoch	0500a572af	exposed Get/Set/Delete Internal methods these are to be used to store side-channel information along with the index	2014-10-22 16:03:55 -04:00
Marty Schoch	7bf44e1ba7	added ability to return all document fields by requesting field *	2014-10-15 19:16:16 -04:00
Marty Schoch	64b0066121	added support for tracking index stats and exposing via expvar closes #83	2014-10-02 11:12:49 -07:00

76 Commits