bleve

Author	SHA1	Message	Date
Steve Yen	846912d083	upside_down udc.termVectorsFromTokenFreq rows append optimization	2016-01-07 00:48:34 -08:00
Steve Yen	8b980bd2ef	firestorm avoid extra goroutine, similar to upside_down	2016-01-07 00:43:27 -08:00
Steve Yen	4eee8821f9	upside_down storeField/indexField append to provided arrays Taking another optimization from firestorm, upside_down's storeField()/indexField() funcs now also append() to passed-in arrays rather than always allocating their own arrays.	2016-01-07 00:13:46 -08:00
Steve Yen	82b8b3468e	upside_down analysis converts to docIDBytes once	2016-01-06 23:38:02 -08:00
Steve Yen	89d17f01ef	analyze locations only if includeTermVectors enabled With this change, TermLocations are computed and maintained only if includeTermVectors is enabled, for higher performance.	2016-01-05 12:46:46 -08:00
Marty Schoch	8efbd556a3	fix indexing bug with data coming from arrays fixes #295	2015-12-21 14:59:32 -05:00
Marty Schoch	30651065e9	fix panic on insufficiently sized buffer adds test case to reproduce original problem fixes #264	2015-10-30 18:25:38 -04:00
Marty Schoch	817c317c90	Merge branch 'master' into newkvstore	2015-10-19 12:04:07 -04:00
Marty Schoch	faceecf87b	make row buffer size constant/configurable also handle case where it is insufficiently sized	2015-10-19 12:03:38 -04:00
Marty Schoch	c9471d5739	Merge pull request #244 from kevgs/master reducing allocation count	2015-10-16 15:51:30 -04:00
Marty Schoch	4c6bc23043	rewrite to keep using same buffer when possible	2015-10-13 14:04:56 -07:00
Marty Schoch	8de860bf12	2 more places that used old Key()	2015-10-13 12:35:08 -07:00
Patrick Mezard	8c928539ee	upside_down: no need for a goroutine to enqueue AnalysisWork. It boils down to: 1. client sends some work and a notification channel to a single worker, then waits; 2. worker processes the work; 3. worker sends the result to the client using the notification channel. I do not see any problem with this, even with unbuffered channels.	2015-10-12 10:42:14 +02:00
Marty Schoch	0f05d1d3ca	Merge branch 'master' into newkvstore	2015-10-09 10:33:41 -04:00
Patrick Mezard	aee82f8b49	upside_down: simplify return code in batchRows()	2015-10-09 09:57:12 +02:00
Marty Schoch	e28eb749d7	bump up buffer size	2015-10-06 16:45:38 -04:00
Marty Schoch	71cbb13e07	modify code to reuse buffer for kv generation	2015-10-05 17:49:50 -04:00
Kosov Eugene	a61c350888	reducing allocation count	2015-10-05 22:57:10 +03:00
Marty Schoch	d06b526cbf	more refactoring	2015-09-28 16:50:27 -04:00
Marty Schoch	900f1b4a67	major kvstore interface and impl overhaul clarified the interface contract	2015-09-23 11:25:47 -07:00
Marty Schoch	dbb93b75a4	refactoring to allow pluggable index encodings this lays the foundation for supporting the new firestorm indexing scheme. i'm merging these changes ahead of the rest of the firestorm branch so i can continue to make changes to the analysis pipeline in parallel	2015-09-02 13:12:08 -04:00
Marty Schoch	3682c25467	update to correctly work with composite fields also updated search results to return array positions	2015-07-31 11:16:11 -04:00
Marty Schoch	c1c4941dde	Merge branch 'feature/term_vector' of https://github.com/tukdesk/bleve into tukdesk-feature/term_vector	2015-07-29 14:31:15 -04:00
Marty Schoch	7be7ecdf8e	fix batch indexing bug, incremented docCount before commit fixes #211	2015-06-08 14:14:05 -04:00
dtynn	b4f7496031	update the index format version number	2015-05-18 15:16:35 +08:00
dtynn	89dc2c22bc	update TermVector	2015-05-17 13:07:14 +08:00
Marty Schoch	8f70def63b	properly use the stored array positions when loading a document fixes #205	2015-05-15 15:47:54 -04:00
Marty Schoch	328bc73ed0	clarify Batch is not threadsafe in docs. in some limited cases we can detect unsafe usage; in these cases, do not trip over ourselves and panic, instead return a strongly typed error upside_down.UnsafeBatchUseDetected. also, introduced Batch.Reset() to allow batch reuse; this is currently still experimental. closes #195	2015-05-15 15:04:52 -04:00
Marty Schoch	57cd67fa88	fix data race on index metadata (docCount) closes #198	2015-05-08 08:07:20 -04:00
Marty Schoch	a9c07acbfa	refactor of kvstore api to support native merge in rocksdb refactor to share code in emulated batch refactor to share code in emulated merge refactor index kvstore benchmarks to share more code refactor index kvstore benchmarks to be more repeatable	2015-04-24 17:13:50 -04:00
Marty Schoch	f1ec73e764	fix issues identified by errcheck part of #169	2015-04-07 13:26:54 -04:00
Marty Schoch	522f9d5cc7	significant change to index format, support dictionary rows. this introduces disk format v4. now the summary rows for a term are stored in their own "dictionary row" format; previously the same information was stored in special term frequency rows. this now allows us to easily iterate all the terms for a field in sorted order (useful for many other fuzzy data structures). at the top-level of bleve you can now browse terms within a field using the following api on the Index interface: FieldDict(field string) (index.FieldDict, error); FieldDictRange(field string, startTerm []byte, endTerm []byte) (index.FieldDict, error); FieldDictPrefix(field string, termPrefix []byte) (index.FieldDict, error). fixes #127	2015-03-10 16:22:19 -04:00
Marty Schoch	a2ad7634f2	update term freq rows to use varint where possible benchmark old ns/op new ns/op delta BenchmarkLevelDBIndexing1Workers 1138292 657901 -42.20% BenchmarkLevelDBIndexing2Workers 1619323 647628 -60.01% BenchmarkLevelDBIndexing4Workers 1172845 636478 -45.73% BenchmarkLevelDBIndexing1Workers10Batch 465556545 448153394 -3.74% BenchmarkLevelDBIndexing2Workers10Batch 504203911 449657355 -10.82% BenchmarkLevelDBIndexing4Workers10Batch 510766435 439839335 -13.89% BenchmarkLevelDBIndexing1Workers100Batch 307657846 268976464 -12.57% BenchmarkLevelDBIndexing2Workers100Batch 302257400 269110215 -10.97% BenchmarkLevelDBIndexing4Workers100Batch 305320485 259084902 -15.14% BenchmarkLevelDBIndexing1Workers1000Batch 301320576 258070231 -14.35% BenchmarkLevelDBIndexing2Workers1000Batch 334174454 261175641 -21.84% BenchmarkLevelDBIndexing4Workers1000Batch 267732436 261461739 -2.34% closes #165	2015-03-06 13:00:53 -05:00
Marty Schoch	c566d34264	bump index format version number, start checking version on open	2015-02-17 17:16:31 +05:30
Marty Schoch	ba978ea27e	improving log messages	2015-01-16 14:07:47 -05:00
Silvan Jegen	ef18dfe4cd	Fix typos in comments and strings	2014-12-18 18:43:12 +01:00
Sergey Avseyev	a8351be5a6	Update protobuf imports	2014-12-10 01:24:59 +03:00
Marty Schoch	c7443fe52b	refactored API a bit; more things can return error now. in a couple of places we had to swallow errors because they didn't fit the existing API. in these cases and proactively in a few others we now return error as well. also the batch API has been updated to allow performing set/delete internal within the batch	2014-10-31 09:40:23 -04:00
Marty Schoch	64b0066121	added support for tracking index stats and exposing via expvar closes #83	2014-10-02 11:12:49 -07:00
Marty Schoch	97902e2619	text analysis now moved out of index write lock onto goroutine. 1. text analysis is now done before the write lock is acquired; 2. there is now a pool of analysis workers; 3. the size of this pool is configurable; 4. this allows for documents in a batch to be analyzed concurrently. as a part of benchmarking these changes i've also introduced a new null storage implementation. this should never be used, as it does not actually build an index. it does however let us go through all the normal indexing machinery, without incurring any indexing I/O. this is very helpful in measuring improvements made to the text analysis pipeline, which are often overshadowed by indexing times in benchmarks actually building an index.	2014-09-24 08:13:14 -04:00
Marty Schoch	198ca1ad4d	major refactor of kvstore/index internals, see below. In the index/store package: introduce KVReader (creates snapshot; all read operations consistent from this snapshot; must close to release); introduce KVWriter (only one writer active; access to all operations; allows for consistent read-modify-write; must close to release); introduce AssociativeMerge operation on batch (allows efficient read-modify-write for associative operations; used to consolidate updates to the term summary rows; saves 1 set and 1 get op per shared instance of term in field). In the index package: introduced an IndexReader (exposes a consistent snapshot of the index for searching). At top level: all searches now operate on a consistent snapshot of the index	2014-09-12 17:21:35 -04:00
Marty Schoch	9d2187706e	another round of golint	2014-09-03 19:53:59 -04:00
Marty Schoch	377ae090d0	additional golint issues resolved	2014-09-03 18:17:26 -04:00
Marty Schoch	d534b0836b	converted ALL_CAPS constants to CamelCase	2014-09-03 17:48:40 -04:00
Marty Schoch	7a7eb2e94c	add newline between license and package this avoids cluttering godocs with the license	2014-09-02 10:54:50 -04:00
Marty Schoch	1161361bea	rename imports from couchbaselabs to blevesearch	2014-08-28 15:38:57 -04:00
Marty Schoch	45a7a6dd8e	fix two missing Close calls holding iterators open	2014-08-25 15:13:15 -04:00
Marty Schoch	3309c698f8	fixed Document() behavior to return nil when doc doesn't exist	2014-08-25 08:55:14 -04:00
Marty Schoch	082a5b0b03	major change to fields now can track array positions for field values stored fields now include this in the key and the back index now uses protobufs to simplify serialization closes #73	2014-08-19 08:58:26 -04:00
Marty Schoch	c33f1668f7	refactor dump methods improved test coverage	2014-08-15 13:12:55 -04:00
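
Several commits above (4eee8821f9, 846912d083, a61c350888) describe the same allocation-saving idea: a helper appends to a slice passed in by the caller instead of allocating and returning its own. The following is only a minimal sketch of that general Go pattern, not bleve's actual code; the `row` type and the `appendFieldRows` function are hypothetical names invented for illustration.

```go
package main

import "fmt"

// row is a hypothetical stand-in for an index row; it is not a bleve type.
type row struct {
	field uint16
	term  string
}

// appendFieldRows appends one row per term to the caller-provided slice and
// returns the extended slice. The caller controls capacity, so no new slice
// is allocated here as long as the provided one has room.
func appendFieldRows(field uint16, terms []string, rows []row) []row {
	for _, t := range terms {
		rows = append(rows, row{field: field, term: t})
	}
	return rows
}

func main() {
	fields := [][]string{{"quick", "brown"}, {"fox"}}

	// Size a single backing array for all fields up front,
	// instead of letting each field allocate its own slice.
	total := 0
	for _, terms := range fields {
		total += len(terms)
	}
	rows := make([]row, 0, total)

	for i, terms := range fields {
		rows = appendFieldRows(uint16(i), terms, rows)
	}
	fmt.Println(len(rows), cap(rows))
}
```

Because the slice is pre-sized, every `append` in the sketch writes into the same backing array; the trade-off is that the caller has to estimate the capacity before the helpers run.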


68 Commits