bleve

Author	SHA1	Message	Date
indraniel	a88d714778	+ add a goleveldb index upside-down benchmark test	2015-04-10 11:08:02 -05:00
Marty Schoch	539aeb8dc7	fix errors identified by errcheck part of #169	2015-04-07 18:05:41 -04:00
Marty Schoch	ba6b3c8bb3	fix more issues identified by errcheck part of #169	2015-04-07 16:45:23 -04:00
Marty Schoch	56c4a09de1	fix issues identified by errcheck part of #169	2015-04-07 15:39:56 -04:00
Marty Schoch	93e01a803e	fix issues identified by errcheck part of #169	2015-04-07 14:52:00 -04:00
Marty Schoch	f1ec73e764	fix issues identified by errcheck part of #169	2015-04-07 13:26:54 -04:00
Marty Schoch	56a30a3574	fix issues identified by errcheck part of #169	2015-04-07 13:05:47 -04:00
Marty Schoch	d2e9409413	fix issues identified by errcheck part of #169	2015-04-07 12:04:59 -04:00
Marty Schoch	867110e03b	major improvements to index row encoding. improvements uncovered some issues with how k/v data was copied or not. to address this, the kv abstraction layer now lets an impl specify whether the bytes returned are safe to use after a reader (or writer, since writers are also readers) is closed. See index/store/KVReader - BytesSafeAfterClose() bool. false is the safe value if you're not sure; it will cause index impls to copy the data. Some kv impls have already created a copy at the C-api barrier, in which case they can safely return true. Overall this yields ~25% speedup for searches with leveldb. It yields ~10% speedup for boltdb. Returning stored fields is now slower with boltdb, as previously we were returning unsafe bytes.	2015-04-03 16:50:48 -04:00
Marty Schoch	a44a7c01af	rewrite to use fixed size []byte instead of buffer; removes unchecked errors in calls to buffer.Write and also benchmarks considerably faster	2015-03-11 15:12:13 -04:00
Marty Schoch	522f9d5cc7	significant change to index format, support dictionary rows. this introduces disk format v4. now the summary rows for a term are stored in their own "dictionary row" format; previously the same information was stored in special term frequency rows. this now allows us to easily iterate all the terms for a field in sorted order (useful for many other fuzzy data structures). at the top-level of bleve you can now browse terms within a field using the following api on the Index interface: FieldDict(field string) (index.FieldDict, error); FieldDictRange(field string, startTerm []byte, endTerm []byte) (index.FieldDict, error); FieldDictPrefix(field string, termPrefix []byte) (index.FieldDict, error). fixes #127	2015-03-10 16:22:19 -04:00
Marty Schoch	4e14f4e4ef	change path for forestdb test to correctly clean up. this is due to forestdb auto-compaction using the provided path as just the prefix, so if we're not careful we end up with many stray files lying around. here, we create a sub-directory first, and just nuke the whole subdir when we're done	2015-03-10 14:05:58 -04:00
Marty Schoch	300ec79c96	first pass at checking errors that were ignored part of #169	2015-03-06 14:46:29 -05:00
Marty Schoch	a2ad7634f2	update term freq rows to use varint where possible. benchmark (old ns/op, new ns/op, delta): BenchmarkLevelDBIndexing1Workers 1138292 657901 -42.20%; BenchmarkLevelDBIndexing2Workers 1619323 647628 -60.01%; BenchmarkLevelDBIndexing4Workers 1172845 636478 -45.73%; BenchmarkLevelDBIndexing1Workers10Batch 465556545 448153394 -3.74%; BenchmarkLevelDBIndexing2Workers10Batch 504203911 449657355 -10.82%; BenchmarkLevelDBIndexing4Workers10Batch 510766435 439839335 -13.89%; BenchmarkLevelDBIndexing1Workers100Batch 307657846 268976464 -12.57%; BenchmarkLevelDBIndexing2Workers100Batch 302257400 269110215 -10.97%; BenchmarkLevelDBIndexing4Workers100Batch 305320485 259084902 -15.14%; BenchmarkLevelDBIndexing1Workers1000Batch 301320576 258070231 -14.35%; BenchmarkLevelDBIndexing2Workers1000Batch 334174454 261175641 -21.84%; BenchmarkLevelDBIndexing4Workers1000Batch 267732436 261461739 -2.34%. closes #165	2015-03-06 13:00:53 -05:00
Marty Schoch	c566d34264	bump index format version number, start checking version on open	2015-02-17 17:16:31 +05:30
Steve Yen	38ee9be353	added some batch size 1000 microbenchmarks	2015-01-30 15:58:39 -08:00
Marty Schoch	ba978ea27e	improving log messages	2015-01-16 14:07:47 -05:00
Steve Yen	ea0a8657f3	added cznicb in-memory kvstore (no reader isolation)	2015-01-13 17:35:28 -08:00
Marty Schoch	362d240b09	added configurable options to leveldb	2015-01-13 16:24:51 -05:00
Steve Yen	1fa80ffc40	pass config to forestdb Open()	2015-01-13 11:04:02 -08:00
Steve Yen	603c3af8bb	added gtreap in-memory, copy-on-write KVStore	2015-01-12 11:26:21 -08:00
Marty Schoch	d68c52e621	adding forestdb benchmark	2015-01-12 12:56:37 -05:00
Silvan Jegen	ef18dfe4cd	Fix typos in comments and strings	2014-12-18 18:43:12 +01:00
Sergey Avseyev	a8351be5a6	Update protobuf imports	2014-12-10 01:24:59 +03:00
Marty Schoch	453d4cf770	change to always return stored fields in UTC	2014-11-26 15:36:34 -05:00
Silvan Jegen	e3a2d3b58b	Remove unneeded else clauses	2014-11-20 20:34:05 +01:00
Marty Schoch	c7443fe52b	refactored API a bit: more things can return error now. in a couple of places we had to swallow errors because they didn't fit the existing API. in these cases, and proactively in a few others, we now return error as well. also the batch API has been updated to allow performing set/delete internal within the batch	2014-10-31 09:40:23 -04:00
Marty Schoch	64b0066121	added support for tracking index stats and exposing via expvar closes #83	2014-10-02 11:12:49 -07:00
Marty Schoch	97902e2619	text analysis now moved out of index write lock onto goroutine. 1. text analysis is now done before the write lock is acquired 2. there is now a pool of analysis workers 3. the size of this pool is configurable 4. this allows for documents in a batch to be analyzed concurrently. as a part of benchmarking these changes i've also introduced a new null storage implementation. this should never be used, as it does not actually build an index. it does however let us go through all the normal indexing machinery, without incurring any indexing I/O. this is very helpful in measuring improvements made to the text analysis pipeline, which are often overshadowed by indexing times in benchmarks actually building an index.	2014-09-24 08:13:14 -04:00
Marty Schoch	198ca1ad4d	major refactor of kvstore/index internals, see below. In the index/store package: introduce KVReader (creates snapshot; all read operations consistent from this snapshot; must close to release); introduce KVWriter (only one writer active; access to all operations; allows for consistent read-modify-write; must close to release); introduce AssociativeMerge operation on batch (allows efficient read-modify-write for associative operations; used to consolidate updates to the term summary rows; saves 1 set and 1 get op per shared instance of term in field). In the index package: introduced an IndexReader, which exposes a consistent snapshot of the index for searching. At top level: all searches now operate on a consistent snapshot of the index	2014-09-12 17:21:35 -04:00
Marty Schoch	7819deb447	added boltdb benchmark, same as others	2014-09-12 16:55:50 -04:00
Marty Schoch	2294b24b9d	remove forestdb for now; no benefit in maintaining this for the time being	2014-09-12 16:55:11 -04:00
Marty Schoch	9d2187706e	another round of golint	2014-09-03 19:53:59 -04:00
Marty Schoch	e1b77956d4	more golint cleanups	2014-09-03 18:47:02 -04:00
Marty Schoch	377ae090d0	additional golint issues resolved	2014-09-03 18:17:26 -04:00
Marty Schoch	d534b0836b	converted ALL_CAPS constants to CamelCase	2014-09-03 17:48:40 -04:00
Marty Schoch	8e6c8e5644	continued refactoring of the mapping code; also renamed some constants that didn't follow go conventions	2014-09-03 13:02:10 -04:00
Marty Schoch	45e1b2dfc6	removing gouchstore store impl. this implementation didn't really adhere to the contract, and now that we have boltdb we have a better pure go impl	2014-09-02 13:56:35 -04:00
Marty Schoch	7a7eb2e94c	add newline between license and package; this avoids cluttering godocs with the license	2014-09-02 10:54:50 -04:00
Marty Schoch	1161361bea	rename imports from couchbaselabs to blevesearch	2014-08-28 15:38:57 -04:00
Marty Schoch	ef59abe4c9	added build tag 'leveldb' to enable this kv store. by default we now use the pure go boltdb kv store. it is less tested at this point but appears to work; tests pass, and this moves us closer to the goal of being able to just "go get" bleve	2014-08-25 15:18:24 -04:00
Marty Schoch	45a7a6dd8e	fix two missing Close calls holding iterators open	2014-08-25 15:13:15 -04:00
Marty Schoch	3309c698f8	fixed Document() behavior to return nil when doc doesn't exist	2014-08-25 08:55:14 -04:00
Marty Schoch	27f001bc14	overhauled top-level New/Open API. New is now used to create new indexes; Open is used to open existing indexes. calls to Open no longer specify a mapping because the mapping is serialized and stored along with the index	2014-08-20 16:58:20 -04:00
Marty Schoch	a08a7f5b2a	fix broken tests	2014-08-19 10:02:33 -04:00
Marty Schoch	082a5b0b03	major change to fields: now can track array positions for field values. stored fields now include this in the key, and the back index now uses protobufs to simplify serialization. closes #73	2014-08-19 08:58:26 -04:00
Marty Schoch	c33f1668f7	refactor dump methods, improved test coverage	2014-08-15 13:12:55 -04:00
Marty Schoch	4d53db9fc8	fixed bug with internal get/set/delete, added tests	2014-08-15 09:39:41 -04:00
Marty Schoch	c526a38369	major refactor of analysis files, now wired up to registry. ultimately this is to make it more convenient for us to wire up different elements of the analysis pipeline, without having to preload everything into memory before we need it. separately, the index layer now has a mechanism for storing internal key/value pairs. this is expected to be used to store the mapping, and possibly other pieces of data by the top layer, but not exposed to the user at the top.	2014-08-13 21:14:47 -04:00
Marty Schoch	e5d4e6f1e4	refactored index layer to support batch operations. this change was then exposed at the higher levels. also the beer-sample app was upgraded to index in batches of 100 by default. this yielded an indexing speedup from 27s to 16s. closes #57	2014-08-11 16:27:18 -04:00

74 Commits