bleve

Author	SHA1	Message	Date
Marty Schoch	9db850a53e	Merge branch 'fix/MaxVarintLen64' of https://github.com/tukdesk/bleve into tukdesk-fix/MaxVarintLen64	2015-07-31 15:16:16 -04:00
Marty Schoch	3682c25467	update to correctly work with composite fields; also updated search results to return array positions	2015-07-31 11:16:11 -04:00
Marty Schoch	c1c4941dde	Merge branch 'feature/term_vector' of https://github.com/tukdesk/bleve into tukdesk-feature/term_vector	2015-07-29 14:31:15 -04:00
Marty Schoch	1b28f6218b	additional row validation	2015-07-13 15:22:54 -04:00
Marty Schoch	7be7ecdf8e	fix batch indexing bug, incremented docCount before commit; fixes #211	2015-06-08 14:14:05 -04:00
Marty Schoch	2768c2da3c	fix previous sloppy fix which hadn't been adequately tested	2015-05-27 19:15:55 -07:00
Marty Schoch	201fb91171	fix up to correctly trim off separator even though it should never be present	2015-05-27 19:10:12 -07:00
Marty Schoch	a58592ceff	fix case where NewBackIndexRowKV returns nil, nil; the logic for reading the docID from the keys in this row relies on the keys NEVER containing the byte separator character (0xff), which is OK as we require that all keys be valid utf-8; however, it turns out that in the case where this rule was violated, we would panic, because we return nil, nil and later try to print the doc id	2015-05-27 19:04:57 -07:00
dtynn	59c97ae577	use binary.MaxVarintLen64	2015-05-26 15:35:31 +08:00
Marty Schoch	e0887f9113	fix tests which deadlock boltdb due to deferred cleanup; fixes #209	2015-05-21 12:29:31 -04:00
dtynn	b4f7496031	update the index format version number	2015-05-18 15:16:35 +08:00
dtynn	89dc2c22bc	update TermVector	2015-05-17 13:07:14 +08:00
Marty Schoch	8f70def63b	properly use the stored array positions when loading a document; fixes #205	2015-05-15 15:47:54 -04:00
Marty Schoch	328bc73ed0	clarify in docs that Batch is not threadsafe; in some limited cases we can detect unsafe usage, and in these cases we do not trip over ourselves and panic, but instead return a strongly typed error, upside_down.UnsafeBatchUseDetected; also introduced Batch.Reset() to allow batch reuse, which is currently still experimental; closes #195	2015-05-15 15:04:52 -04:00
Marty Schoch	57cd67fa88	fix data race on index metadata (docCount); closes #198	2015-05-08 08:07:20 -04:00
Marty Schoch	57358088ec	fix row merging bug; trying to be clever, we reused the memory allocated for the left operand when doing partial merges; this had been tested to be safe, in general. however, the implementation was then written such that we always reused globally defined operands, which meant that we mutated the operands which were intended to always represent +1/-1; this then cascades quickly to making increment/decrement values much larger/smaller than they should be; related to #197	2015-05-06 11:00:04 -04:00
Marty Schoch	30a0ba1f9b	fix bug, dictionary row encoding buffer too small; we incorrectly created a []byte of length 8 but the max for a uvarint is 10; closes #197	2015-05-06 10:04:02 -04:00
Marty Schoch	ee47d1c21a	standardize on including 1000 sized batches	2015-04-24 17:31:34 -04:00
Marty Schoch	452fea6a24	adding initial impl of rocksdb kv store	2015-04-24 17:19:44 -04:00
Marty Schoch	a9c07acbfa	refactor of kvstore api to support native merge in rocksdb; refactor to share code in emulated batch; refactor to share code in emulated merge; refactor index kvstore benchmarks to share more code; refactor index kvstore benchmarks to be more repeatable	2015-04-24 17:13:50 -04:00
Marty Schoch	d5dc66313f	change variable name conflicting when both LevelDB benchmarks run	2015-04-10 15:03:44 -04:00
Marty Schoch	d5caad4405	changed GoLevelDB benchmark names to be different from LevelDB; this will allow for easier comparison when running both versions at the same time	2015-04-10 15:00:56 -04:00
Marty Schoch	5f66bd84c7	fix issues identified by errcheck	2015-04-10 14:59:05 -04:00
indraniel	a88d714778	+ add a goleveldb index upside-down benchmark test	2015-04-10 11:08:02 -05:00
Marty Schoch	539aeb8dc7	fix errors identified by errcheck; part of #169	2015-04-07 18:05:41 -04:00
Marty Schoch	ba6b3c8bb3	fix more issues identified by errcheck; part of #169	2015-04-07 16:45:23 -04:00
Marty Schoch	56c4a09de1	fix issues identified by errcheck; part of #169	2015-04-07 15:39:56 -04:00
Marty Schoch	93e01a803e	fix issues identified by errcheck; part of #169	2015-04-07 14:52:00 -04:00
Marty Schoch	f1ec73e764	fix issues identified by errcheck; part of #169	2015-04-07 13:26:54 -04:00
Marty Schoch	56a30a3574	fix issues identified by errcheck; part of #169	2015-04-07 13:05:47 -04:00
Marty Schoch	d2e9409413	fix issues identified by errcheck; part of #169	2015-04-07 12:04:59 -04:00
Marty Schoch	867110e03b	major improvements to index row encoding; improvements uncovered some issues with how k/v data was copied or not. to address this, kv abstraction layer now lets impl specify if the bytes returned are safe to use after a reader (or writer, since writers are also readers) is closed. See index/store/KVReader - BytesSafeAfterClose() bool; false is the safe value if you're not sure, it will cause index impls to copy the data. Some kv impls have already created a copy at the C-api barrier, in which case they can safely return true. Overall this yields ~25% speedup for searches with leveldb. It yields ~10% speedup for boltdb. Returning stored fields is now slower with boltdb, as previously we were returning unsafe bytes.	2015-04-03 16:50:48 -04:00
Marty Schoch	a44a7c01af	rewrite to use fixed size []byte instead of buffer; removes unchecked errors in calls to buffer.Write and also benchmarks considerably faster	2015-03-11 15:12:13 -04:00
Marty Schoch	522f9d5cc7	significant change to index format, support dictionary rows; this introduces disk format v4. now the summary rows for a term are stored in their own "dictionary row" format, previously the same information was stored in special term frequency rows. this now allows us to easily iterate all the terms for a field in sorted order (useful for many other fuzzy data structures). at the top-level of bleve you can now browse terms within a field using the following api on the Index interface: FieldDict(field string) (index.FieldDict, error); FieldDictRange(field string, startTerm []byte, endTerm []byte) (index.FieldDict, error); FieldDictPrefix(field string, termPrefix []byte) (index.FieldDict, error). fixes #127	2015-03-10 16:22:19 -04:00
Marty Schoch	4e14f4e4ef	change path for forestdb test to correctly cleanup; this is due to forestdb auto-compaction using the provided path as just the prefix, so if we're not careful we end up with many stray files lying around. here, we create a sub-directory first, and just nuke the whole subdir when we're done	2015-03-10 14:05:58 -04:00
Marty Schoch	300ec79c96	first pass at checking errors that were ignored; part of #169	2015-03-06 14:46:29 -05:00
Marty Schoch	a2ad7634f2	update term freq rows to use varint where possible; benchmark (old ns/op, new ns/op, delta): BenchmarkLevelDBIndexing1Workers 1138292 657901 -42.20%; BenchmarkLevelDBIndexing2Workers 1619323 647628 -60.01%; BenchmarkLevelDBIndexing4Workers 1172845 636478 -45.73%; BenchmarkLevelDBIndexing1Workers10Batch 465556545 448153394 -3.74%; BenchmarkLevelDBIndexing2Workers10Batch 504203911 449657355 -10.82%; BenchmarkLevelDBIndexing4Workers10Batch 510766435 439839335 -13.89%; BenchmarkLevelDBIndexing1Workers100Batch 307657846 268976464 -12.57%; BenchmarkLevelDBIndexing2Workers100Batch 302257400 269110215 -10.97%; BenchmarkLevelDBIndexing4Workers100Batch 305320485 259084902 -15.14%; BenchmarkLevelDBIndexing1Workers1000Batch 301320576 258070231 -14.35%; BenchmarkLevelDBIndexing2Workers1000Batch 334174454 261175641 -21.84%; BenchmarkLevelDBIndexing4Workers1000Batch 267732436 261461739 -2.34%; closes #165	2015-03-06 13:00:53 -05:00
Marty Schoch	c566d34264	bump index format version number, start checking version on open	2015-02-17 17:16:31 +05:30
Steve Yen	38ee9be353	added some batch size 1000 microbenchmarks	2015-01-30 15:58:39 -08:00
Marty Schoch	ba978ea27e	improving log messages	2015-01-16 14:07:47 -05:00
Steve Yen	ea0a8657f3	added cznicb in-memory kvstore (no reader isolation)	2015-01-13 17:35:28 -08:00
Marty Schoch	362d240b09	added configurable options to leveldb	2015-01-13 16:24:51 -05:00
Steve Yen	1fa80ffc40	pass config to forestdb Open()	2015-01-13 11:04:02 -08:00
Steve Yen	603c3af8bb	added gtreap in-memory, copy-on-write KVStore	2015-01-12 11:26:21 -08:00
Marty Schoch	d68c52e621	adding forestdb benchmark	2015-01-12 12:56:37 -05:00
Silvan Jegen	ef18dfe4cd	Fix typos in comments and strings	2014-12-18 18:43:12 +01:00
Sergey Avseyev	a8351be5a6	Update protobuf imports	2014-12-10 01:24:59 +03:00
Marty Schoch	453d4cf770	change to always return stored fields in UTC	2014-11-26 15:36:34 -05:00
Silvan Jegen	e3a2d3b58b	Remove unneeded else clauses	2014-11-20 20:34:05 +01:00
Marty Schoch	c7443fe52b	refactored API a bit; more things can return error now. in a couple of places we had to swallow errors because they didn't fit the existing API. in these cases, and proactively in a few others, we now return error as well. also the batch API has been updated to allow performing set/delete internal within the batch	2014-10-31 09:40:23 -04:00

97 Commits