bleve

Author	SHA1	Message	Date
Sreekanth Sivasankaran	127aaa06bf	Merge `1ef41101ba` into `7a98d75fc5`	2018-03-28 20:51:54 +00:00
Steve Yen	7a98d75fc5	Merge pull request #860 from steveyen/optimize-docInternalToNumber optimize docInternalToNumber() to avoid allocations	2018-03-28 10:21:30 -07:00
Steve Yen	b955bdcd72	scorch optimize docInternalToNumber() to avoid allocations docInternalToNumber() no longer allocates a reader instance and a heap uint64 to hold the result.	2018-03-28 10:08:21 -07:00
Steve Yen	fd07fdb862	Merge pull request #859 from steveyen/optimize-when-zero-hits optimizations in the case of zero hits	2018-03-27 16:04:03 -07:00
Steve Yen	013d06d756	scorch TermFieldReader() reuses string(term)	2018-03-27 15:39:33 -07:00
Steve Yen	596d990eb9	scorch zap optimize when zero hits Instead of allocating brand-new empty postingsList/Iterator instances, reuse some empty singletons.	2018-03-27 15:39:33 -07:00
Sreekanth Sivasankaran	6c6c1419b5	Merge pull request #855 from blevesearch/tfr_advance TermFieldReader Advance optimisation	2018-03-27 22:49:48 +05:30
Abhinav Dangeti	1fcfc0a5f1	Merge pull request #842 from abhinavdangeti/segment-tests Unit tests for segments with docs with non-overlapping fields	2018-03-27 09:54:02 -07:00
Sreekanth Sivasankaran	db6a2c274f	adding nil check	2018-03-27 22:10:09 +05:30
Sreekanth Sivasankaran	72ac352961	TermFieldReader Advance optimization skips to the target segment and avoids unnecessary reads of freq, loc, norm details	2018-03-27 20:18:16 +05:30
Steve Yen	e9ca76be78	Merge pull request #850 from steveyen/more-reuse-optimizations More buffer & slice reuse optimizations	2018-03-26 13:07:46 -07:00
Steve Yen	1cab701f85	scorch zap postingsIter skips freq/norm/locs parsing if allowed In this optimization, the zap PostingsIterator skips the parsing of freq/norm/locs chunks based on the includeFreq\|Norm\|Locs flags. In bleve-query microbenchmark on dev macbookpro, with 50K en-wiki docs, on a medium frequency term search that does not ask for term vectors, throughput was ~750 q/sec before the change and went to ~1400 q/sec after the change.	2018-03-26 09:49:44 -07:00
Steve Yen	192621f402	scorch includeFreq/Norm/Locs params for postingsList.Iterator API This commit adds boolean flag params to the scorch PostingsList.Iterator() method, so that the caller can specify whether freq/norm/locs information is needed or not. Future changes can leverage these params for optimizations.	2018-03-26 09:49:44 -07:00
Steve Yen	fc7584f5a0	scorch zap prealloc extra locs for future growth	2018-03-26 09:49:44 -07:00
Steve Yen	3f4b161850	scorch zap postingsIter reuses array positions slice	2018-03-26 09:49:44 -07:00
Steve Yen	db792717a6	scorch zap postingsIter reuses nextLocs/nextSegmentLocs The previous code would inefficiently throw away the nextLocs and would also throw away the []segment.Location slice if there were no locations, such as if it was a 1-hit postings list. This change tries to reuse the nextLocs/nextSegmentLocs for all cases.	2018-03-26 09:49:44 -07:00
Steve Yen	6540b197d4	scorch zap provide full buffer capacity to snappy Encode/Decode() The snappy Encode/Decode() APIs accept an optional destination buffer param where their encoded/decoded output results will be placed, but they only check that the buffer has enough len() rather than enough capacity before deciding to allocate a new buffer.	2018-03-26 09:49:44 -07:00
Steve Yen	84424edcad	scorch zap sync.Pool for reusable VisitDocument() data structures As part of this, snappy.Decode() is also provided a reused buffer for decompression.	2018-03-26 09:49:44 -07:00
Steve Yen	33b1f065dc	Merge pull request #857 from steveyen/replace-locsBitmap-attempt2 optimization to replace locations bitmap, attempt #2	2018-03-26 09:49:17 -07:00
Steve Yen	ba644f3893	scorch zap fix postingsIter.nextBytes() when 1-bit encoded The previous commit's optimization that replaced the locsBitmap was incorrectly handling the case when there was a 1-bit encoding optimization in the postingsIterator.nextBytes() method, incorrectly generating the freq-norm bytes. Also as part of this change, more unused locsBitmap's were removed.	2018-03-26 09:19:00 -07:00
Steve Yen	7a19e6fd7e	scorch zap replace locsBitmap w/ 1 bit from freq-norm varint encoding This is attempt #2 of the optimization that replaces the locsBitmap, without any changes from the original commit attempt. A commit that follows this one contains the actual fix. See also... - commit `621b58dd83` (the 1st attempt) - commit `49a4ee60ba` (the revert) ------------- The original commit message body from 621b58 was... NOTE: this is a zap file format change. The separate "postings locations" roaring Bitmap that encoded whether a posting has locations info is now replaced by the least significant bit in the freq varint encoded in the freq-norm chunkedIntCoder. encode/decodeFreqHasLocs() are added as helper functions.	2018-03-23 12:50:24 -07:00
Steve Yen	1f7faf7e01	Merge pull request #856 from steveyen/revert-locsBitmap-replacement Revert "scorch zap replace locsBitmap w/ 1 bit from freq-norm varint …	2018-03-23 11:20:45 -07:00
Steve Yen	49a4ee60ba	Revert "scorch zap replace locsBitmap w/ 1 bit from freq-norm varint encoding" Testing with the cbft application led to cbft process exits... AsyncError exit()... error reading location field: EOF -- main.initBleveOptions.func1() at init_bleve.go:85 This reverts commit `621b58dd83`.	2018-03-23 10:01:30 -07:00
Steve Yen	c6df65286c	Merge pull request #854 from steveyen/replace-locsBitmap replace locs bitmap with 1 bit from freq-norm varint encoding	2018-03-22 19:39:32 -07:00
Abhinav Dangeti	2384c41098	Merge pull request #851 from abhinavdangeti/master MB-28782: Error handling in merger/persister when index is closed	2018-03-22 18:07:11 -07:00
Steve Yen	67f75005c4	fix cmd/bleve help string for internal command	2018-03-22 17:43:07 -07:00
Steve Yen	621b58dd83	scorch zap replace locsBitmap w/ 1 bit from freq-norm varint encoding NOTE: this is a zap file format change. The separate "postings locations" roaring Bitmap that encoded whether a posting has locations info is now replaced by the least significant bit in the freq varint encoded in the freq-norm chunkedIntCoder. encode/decodeFreqHasLocs() are added as helper functions.	2018-03-22 17:43:07 -07:00
abhinavdangeti	18cfcd11d1	MB-28782: Error handling in merger/persister when index is closed When the index is closed, do not fire an AsyncError (fatal) from either the merger or the persister that is actively working. This is quite a probable situation, so exit the loop within the goroutine.	2018-03-22 14:29:59 -07:00
Steve Yen	a7c4237d00	Merge pull request #852 from steveyen/scorch-zap-postingsIterator-allNChunk-bug PostingsIterator.nextDocNum() maintains allNChunk correctly	2018-03-22 13:26:14 -07:00
Steve Yen	7e32a35af5	Merge pull request #853 from steveyen/scorch-cmd-ascii-help-fix fix cmd/bleve scorch ascii cmd help text	2018-03-22 11:00:49 -07:00
Steve Yen	6b78dd4184	fix cmd/bleve scorch ascii cmd help text Initially, there was a typo with an extra space char, but then I realized some copy-paste corrections were needed as well.	2018-03-22 06:48:42 -07:00
Steve Yen	b506fae4f7	scorch zap postingsItr remove unused offset/locoffset fields	2018-03-21 18:00:14 -07:00
Steve Yen	d1e2b55c72	scorch zap postingsItr.nextDocNum() maintains allNChunk correctly When PostingsIterator.nextDocNum() moves the 'all' roaring bitmap iterator forwards, it was incorrectly not keeping the allNChunk value aligned.	2018-03-21 17:57:54 -07:00
Abhinav Dangeti	ae27aa2f14	Merge pull request #848 from abhinavdangeti/curr Getting rid of panics added for debugging MB-28719,MB-28781	2018-03-20 15:14:22 -07:00
Abhinav Dangeti	a4d88f8a12	Merge pull request #833 from abhinavdangeti/master Return an error when the snapshotEpoch is invalid	2018-03-20 15:04:23 -07:00
Marty Schoch	110cfa3074	Merge pull request #847 from mschoch/fix-scorch-missing-invalid-field fix MB-28719 and MB-28781 invalid/missing field in scorch	2018-03-20 17:52:52 -04:00
abhinavdangeti	0e3c57c465	Revert "scorch zap getField() which panics if the field is unknown" This reverts commit `85b4a31e2a`.	2018-03-20 14:51:33 -07:00
abhinavdangeti	844845b5d2	Revert "scorch zap panic if mergeFields() sees unsorted fields" This reverts commit `2f4d3d8587`.	2018-03-20 14:51:25 -07:00
Marty Schoch	35ea1d4423	fix MB-28719 and MB-28781 invalid/missing field in scorch Use of sync.Pool to reuse the interim structure relied on resetting the fieldsInv slice. However, actual segments continued to use this same fieldsInv slice after returning it to the pool. Simple fix is to nil out fieldsInv slice in reset method and let the newly built segment keep the one from the interim struct.	2018-03-20 17:41:56 -04:00
Steve Yen	e88cb783e2	Merge pull request #845 from steveyen/MB-28719-related-assertions MB-28719 related assertions	2018-03-20 11:38:31 -07:00
Steve Yen	2f4d3d8587	scorch zap panic if mergeFields() sees unsorted fields mergeFields depends on the fields from the various segments being sorted for the fieldsSame comparison to work. Of note, the 'fieldi > 1' guard skips the 0th field, which should always be the '_id' field.	2018-03-20 11:17:46 -07:00
Steve Yen	85b4a31e2a	scorch zap getField() which panics if the field is unknown	2018-03-20 11:12:18 -07:00
abhinavdangeti	85df86ba17	Unit tests for segments with docs with non-overlapping fields	2018-03-19 12:37:50 -07:00
Marty Schoch	65e16a7d96	Merge pull request #841 from mschoch/improve-cmdline improve command-line tool for zap	2018-03-19 15:06:17 -04:00
Steve Yen	0492b33c2e	Merge pull request #840 from steveyen/MB-28781 MB-28781 - check if fields are the same before using merge optimization of copying term/norm/loc bytes	2018-03-19 11:58:53 -07:00
Marty Schoch	e9b228bcdd	improve command-line tool for zap correctly handle/print additional loc bitmap address this fixes bitmap length that is output instantiate roaring bitmap and print it out removed some unnecessary debug logging updated dict command to print 1-hit encoded vals this makes dict command usable for seeing which doc ids are in a segment and their corresponding doc number	2018-03-19 14:57:30 -04:00
Steve Yen	f65ba5c0f4	MB-28781 - scorch zap merge freq/loc copying only when fieldsSame The optimization recently introduced in commit `530a3d24cf`, ("scorch zap optimize merge by byte copying freq/norm/loc's") was to byte-copy freq/norm/loc data directly during merging. But, it was incorrect if the fields were different across segments. This change now performs that byte-copying merging optimization only when the fields are the same across segments, and if not, leverages the old approach of deserializing & re-serializing the freq/norm/loc information, which has the important step of remapping fieldID's. See also: https://issues.couchbase.com/browse/MB-28781	2018-03-19 11:26:51 -07:00
Steve Yen	c881146270	scorch zap mergeTermFreqNormLocsByCopying() helper func	2018-03-19 10:36:23 -07:00
Sreekanth Sivasankaran	1ef41101ba	vellum adoption for regex and fuzzy queries	2018-03-19 17:29:29 +05:30
Sreekanth Sivasankaran	cf8e0d63bb	Merge pull request #837 from blevesearch/docnum_missing_fix MB-28753 - docNumber "xx" not found err with updates	2018-03-19 14:36:30 +05:30
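
Commits `621b58dd83` and `7a19e6fd7e` above describe replacing the separate locations bitmap with the least significant bit of the freq varint, and name the helpers encode/decodeFreqHasLocs(). The following is a minimal sketch of that idea, not the repository's code: the function bodies and signatures are assumptions (freq shifted left by one, flag in bit 0), and the standard library's `encoding/binary` varint stands in for the chunkedIntCoder.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeFreqHasLocs packs the "has locations" flag into the least
// significant bit of the value that will be varint-encoded.
// Assumed layout: (freq << 1) | flag.
func encodeFreqHasLocs(freq uint64, hasLocs bool) uint64 {
	rv := freq << 1
	if hasLocs {
		rv |= 1
	}
	return rv
}

// decodeFreqHasLocs reverses encodeFreqHasLocs.
func decodeFreqHasLocs(freqHasLocs uint64) (uint64, bool) {
	return freqHasLocs >> 1, freqHasLocs&1 != 0
}

func main() {
	// Round-trip through a varint, as a stand-in for the freq-norm chunk.
	buf := make([]byte, binary.MaxVarintLen64)
	n := binary.PutUvarint(buf, encodeFreqHasLocs(3, true))

	v, _ := binary.Uvarint(buf[:n])
	freq, hasLocs := decodeFreqHasLocs(v)
	fmt.Println(freq, hasLocs)
}
```

Because the flag shares the freq varint, a reader learns whether locations exist without consulting a second structure; the cost is one bit of range in the freq value.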

1889 Commits
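
Commit `35ea1d4423` describes a reuse bug in which a reset that only truncated the fieldsInv slice left built segments sharing a backing array with the pooled struct. The sketch below illustrates that aliasing and the described fix (setting the slice to nil in reset). The types `interim` and `segment`, the methods `resetShared` and `resetOwned`, and the `build` helper are hypothetical stand-ins, and a single struct is reused directly instead of going through sync.Pool so that the behavior is deterministic.

```go
package main

import "fmt"

// interim is a hypothetical stand-in for the reusable builder struct.
type interim struct {
	fieldsInv []string
}

// segment is a hypothetical stand-in for a built segment.
type segment struct {
	fieldsInv []string
}

// resetShared truncates the slice but keeps its backing array,
// so the next build overwrites data a segment may still reference.
func (s *interim) resetShared() { s.fieldsInv = s.fieldsInv[:0] }

// resetOwned drops the reference, leaving the array to the segment.
func (s *interim) resetOwned() { s.fieldsInv = nil }

// build fills the interim struct, hands its slice to a new segment,
// then resets the struct for reuse.
func build(s *interim, fields []string, reset func(*interim)) *segment {
	s.fieldsInv = append(s.fieldsInv, fields...)
	seg := &segment{fieldsInv: s.fieldsInv}
	reset(s)
	return seg
}

func main() {
	shared := &interim{}
	a := build(shared, []string{"_id", "title"}, (*interim).resetShared)
	build(shared, []string{"_id", "body"}, (*interim).resetShared)
	fmt.Println(a.fieldsInv) // segment a now sees the second build's fields

	owned := &interim{}
	b := build(owned, []string{"_id", "title"}, (*interim).resetOwned)
	build(owned, []string{"_id", "body"}, (*interim).resetOwned)
	fmt.Println(b.fieldsInv) // segment b keeps its own fields
}
```

The nil reset trades one allocation per build for safety: the struct itself is still reused, but the slice handed to a segment is never written to again.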