bleve

Author	SHA1	Message	Date
Sreekanth Sivasankaran	67a5814fbe	MB-22410: deleting/editing index definition with large dirty write queue can be very slow. Adding a configurable forced store close	2017-03-01 18:58:32 +05:30
Sreekanth Sivasankaran	324e4237cf	adding configurable Abort Close	2017-03-01 16:23:56 +05:30
Sundar Sridharan	74c7de0dcf	re-order childSnapshot declaration	2017-02-21 15:54:04 -08:00
Sundar Sridharan	04d428656e	Add Snapshot interface methods for moss child collections feature	2017-02-20 15:03:45 -08:00
Steve Yen	0b70a1bcb8	use inlined prealloc'ed termFreqRow in upsidedown termFieldReader	2017-02-08 18:23:13 -08:00
Steve Yen	31fecc3663	avoid row alloc's in upsidedown termFieldReader constructor	2017-02-08 18:14:30 -08:00
Marty Schoch	606fd6344b	INDEX FORMAT CHANGE: change back index row value Previously term entries were encoded pairwise (field/term), so you'd have data like: F1/T1 F1/T2 F1/T3 F2/T4 F3/T5 As you can see, even though field 1 has 3 terms, we repeat the F1 part in the encoded data. This is a bit wasteful. In the new format we encode it as a list of terms for each field: F1/T1,T2,T3 F2/T4 F3/T5 When fields have multiple terms, this saves space. In unit tests there is no additional waste even in the case that a field has only a single value. Here are the results of an indexing test case (beer-search): $ benchcmp indexing-before.txt indexing-after.txt benchmark old ns/op new ns/op delta BenchmarkIndexing-4 11275835988 10745514321 -4.70% benchmark old allocs new allocs delta BenchmarkIndexing-4 25230685 22480494 -10.90% benchmark old bytes new bytes delta BenchmarkIndexing-4 4802816224 4741641856 -1.27% And here are the results of a MatchAll search building a facet on the "abv" field: $ benchcmp facet-before.txt facet-after.txt benchmark old ns/op new ns/op delta BenchmarkFacets-4 439762100 228064575 -48.14% benchmark old allocs new allocs delta BenchmarkFacets-4 9460208 3723286 -60.64% benchmark old bytes new bytes delta BenchmarkFacets-4 260784261 151746483 -41.81% Although we expect the index to be smaller in many cases, the beer-search index is about the same in this case. However, this may be due to the underlying storage (boltdb) in this case. Finally, the index version was bumped from 5 to 7, since smolder also used version 6, which could lead to some confusion.	2017-01-24 15:39:38 -05:00
Steve Yen	5927224e15	optimize mergeOldAndNew for case of first time a doc is seen	2017-01-09 22:48:58 -08:00
Steve Yen	790f2e3e32	optimize by alloc'ing arrays of TermFrequencyRow/TermVector	2017-01-09 22:42:00 -08:00
Steve Yen	8f4726ab10	use struct{}{} idiom instead of additional mark var	2017-01-09 10:17:26 -08:00
Steve Yen	302cac72c4	optimize mergeOldAndNew when non-update case	2017-01-08 17:59:49 -08:00
Steve Yen	931d133024	go fmt and go vet	2017-01-07 22:14:22 -08:00
Steve Yen	40780254ae	optimize upsidedown mergeOldAndNew existing key maps The optimization is to provide a better initial size to the map constructor and to use a 0-byte-sized struct{} as the map values.	2017-01-07 22:05:55 -08:00
Steve Yen	c2bafa2a51	optimize term vectors/locations via preallocated arrays The change should hit the allocator less often when processing term vectors/locations as it preallocates larger, contiguous arrays of records upfront.	2017-01-07 12:34:06 -08:00
Steve Yen	8b140d84c4	minor optimization of upsidedown backIndexRowForDoc This change might allow a smart enough golang compiler to perhaps allocate a backIndexRow on the stack rather than the heap.	2017-01-07 11:49:42 -08:00
Steve Yen	c21d27e15a	upsidedown TermFieldReader checks includeTermVectors flag param The flag was part of the API, but wasn't previously checked.	2017-01-05 21:10:27 -08:00
Steve Yen	37490864ce	bleve/index/store/moss - accessor for underlying mossStore This change adds methods that provide access to the actual, underlying mossStore instance in the bleve/index/store/moss KVStore adaptor. This enables applications to utilize advanced, mossStore-specific features (such as partial rollback of indexes). See also https://issues.couchbase.com/browse/MB-17805	2016-12-05 12:25:29 -08:00
Patrick Mezard	c81fd6fdb0	index: DocIDReader.Next() returns nil when done not io.EOF	2016-11-20 19:05:35 +01:00
Steve Yen	2a8237e8cc	optimize FacetsBuilder with cached fields & avoid some allocs	2016-10-25 15:34:48 -07:00
Steve Yen	a941a0f318	simplify DocumentFieldTerms append() usage	2016-10-25 15:30:19 -07:00
Rob McColl	12c404aec0	Update kvstore.go	2016-10-19 10:31:00 -04:00
Rob McColl	2b26218591	Fix NumDeletes doc copy/paste err s/Merge/Delete/g	2016-10-17 12:42:21 -04:00
Marty Schoch	77b79a2684	Merge pull request #466 from steveyen/optimize-fieldDict-reader-with-prealloc Optimize upside-down's field dict reader with preallocated objects	2016-10-13 14:09:54 +02:00
Steve Yen	62e6f1f648	reuse incrementBytes() in moss KV store integration In this commit, I saw that there was a simple incrementBytes() implementation elsewhere in bleve that seemed simpler than using the big int package. Edge case note: if the input bytes would overflow in incrementBytes(), such as with an input of [0xff 0xff 0xff], it returns nil. moss then treats a nil endKeyExclusive iterator param as a logical "higher-than-topmost" key, which produces the prefix iteration behavior that we want for this edge situation.	2016-10-12 09:34:36 -07:00
Steve Yen	01fb59d293	optimize upside-down DictionaryRow for fewer parsing alloc's	2016-10-12 09:22:50 -07:00
Steve Yen	2d72b542c0	optimize upside-down FieldDict reader with prealloc'ed objects As part of this commit, there's also a newly added DictionaryRow.parseDictionaryK() helper method.	2016-10-12 09:18:58 -07:00
Marty Schoch	2f48d7fb02	fix misspellings	2016-10-02 12:11:15 -04:00
Marty Schoch	2332455bd2	nicer formatting of license header	2016-10-02 10:13:14 -04:00
Marty Schoch	6bf9dd59ab	BREAKING CHANGE - additional package renaming I recently learned that package names should also prefer the singular form, not the plural form.	2016-10-01 17:20:59 -04:00
Steve Yen	004e157963	field cache also tracks fieldIndex -> fieldName reverse mapping	2016-10-01 13:06:03 -07:00
Steve Yen	c362ab302e	fix tracking of termSearchersFinished stats	2016-09-30 16:11:30 -07:00
Marty Schoch	caf5256f74	don't export internal timers from metrics kvstore	2016-09-30 15:52:16 -04:00
Marty Schoch	f90856b8d3	BREAKING CHANGE - rename upside_down to upsidedown	2016-09-30 12:36:38 -04:00
Marty Schoch	35da361bfa	BREAKING CHANGE - renamed packages to be shorter and not use _ this commit only addresses the analysis sub-package	2016-09-30 12:36:10 -04:00
Marty Schoch	73d0951b2a	don't panic on missing backindex row part of #419	2016-09-27 22:16:45 -04:00
Marty Schoch	fb0f4bbecd	BREAKING CHANGE - new method to create memory only index Previously bleve allowed you to create a memory-only index by simply passing "" as the path argument to the New() method. This was not clear when reading the code, and led to some problematic error cases as well. Now, to create a memory-only index one should use the NewMemOnly() method. Passing "" as the path argument to the New() method will now return os.ErrInvalid. Advanced users calling NewUsing() can create disk-based or memory-only indexes, but the change here is that passing "" as the path argument no longer defaults you into getting a memory-only index. Instead, the KV store is selected manually, just as it is for the disk-based solutions. Here is an example use of the NewUsing() method to create a memory-only index: NewUsing("", indexMapping, Config.DefaultIndexType, Config.DefaultMemKVStore, nil) Config.DefaultMemKVStore is just a new default value added to the configuration; it currently points to gtreap.Name (which could have been used directly instead for more control). closes #427	2016-09-27 14:11:40 -04:00
Marty Schoch	1f79f65b6a	Merge pull request #450 from mschoch/bug449 fix logic in Advance() of UpsideDownCouchDocIDReader	2016-09-26 12:44:09 -04:00
Marty Schoch	981812ff70	fix logic in Advance() of UpsideDownCouchDocIDReader also added unit tests for newUpsideDownCouchDocIDReaderOnly use cases fixes #449	2016-09-26 12:36:24 -04:00
Steve Yen	10cab1826d	added upside_down TermFrequencyRow.KeyAppendTo() API This is a cleanup commit that's followup to a code review discussion on a previous Advance() perf-optimization PR... https://github.com/blevesearch/bleve/pull/443	2016-09-23 09:22:42 -07:00
Steve Yen	988dfb02e9	moss kvstore iterator Seek() invokes underlying moss SeekTo() API	2016-09-22 17:46:06 -07:00
Steve Yen	5f5b5d3b80	optimize upside_down TermFieldReader.Advance() to reuse memory On a dev laptop, bleve-query benchmark on wiki dataset using query-string of "+text:afternoon +text:coffee" previously had throughput of 1222qps, and with this change hits 1940qps.	2016-09-22 17:46:06 -07:00
Steve Yen	bcec199c89	issue 441 - upside_down termFieldReader doesn't call Next() early This change to upside_down term-field-reader no longer moves the underlying iterator forward preemptively. Instead, it will invoke Next() on the underlying iterator only when the caller invokes the term-field-reader's Next(). There's a special case to handle the situation on the first Next() invocation after the term-field-reader is created.	2016-09-22 09:18:29 -07:00
slavikm	3eec1ae16c	Satisfy errcheck	2016-09-21 17:56:03 +03:00
slavikm	40c1dc076f	Now, without the rollback	2016-09-21 16:15:06 +03:00
slavikm	588f379962	Commit if there is no error, rollback otherwise	2016-09-21 16:13:47 +03:00
slavikm	ac49306077	Make sure that the transaction is closed if there is an error	2016-09-21 14:32:05 +03:00
Marty Schoch	e68f6ca9e6	Merge pull request #432 from steveyen/perf-skip-0xff-scan skip termFrequencyRow 0xFF scan as term length is already known	2016-09-18 12:20:21 -04:00
Steve Yen	b5d2c32b46	skip termFrequencyRow 0xFF scan as term length is already known This commit modifies the upside_down TermFrequencyRow parseKDoc() to skip the ByteSeparator (0xFF) scan, as we already know the term's length in the UpsideDownCouchTermFieldReader. On my dev box, results from bleve-query test on high frequency terms went from previous 107qps to 124qps.	2016-09-18 08:56:05 -07:00
Marty Schoch	3fd2a64872	BREAKING CHANGE - removed DumpXXX() methods from bleve.Index The DumpXXX() methods were always documented as internal and unsupported. However, now they are being removed from the public top-level API. They are still available on the internal IndexReader, which can be accessed using the Advanced() method. The DocCount() and DumpXXX() methods on the internal index have moved to the internal index reader, since they logically operate on a snapshot of an index.	2016-09-13 12:40:01 -04:00
Marty Schoch	e1fb860a86	removed unused AsyncIndex interface	2016-09-13 08:42:36 -04:00

419 Commits