bleve

Author	SHA1	Message	Date
Steve Yen	621b58dd83	scorch zap replace locsBitmap w/ 1 bit from freq-norm varint encoding NOTE: this is a zap file format change. The separate "postings locations" roaring Bitmap that encoded whether a posting has locations info is now replaced by the least significant bit in the freq varint encoded in the freq-norm chunkedIntCoder. encode/decodeFreqHasLocs() are added as helper functions.	2018-03-22 17:43:07 -07:00
Steve Yen	a7c4237d00	Merge pull request #852 from steveyen/scorch-zap-postingsIterator-allNChunk-bug PostingsIterator.nextDocNum() maintains allNChunk correctly	2018-03-22 13:26:14 -07:00
Steve Yen	7e32a35af5	Merge pull request #853 from steveyen/scorch-cmd-ascii-help-fix fix cmd/bleve scorch ascii cmd help text	2018-03-22 11:00:49 -07:00
Steve Yen	6b78dd4184	fix cmd/bleve scorch ascii cmd help text Initially, there was a typo with an extra space char, but then I realized there was some copypasting corrections.	2018-03-22 06:48:42 -07:00
Steve Yen	b506fae4f7	scorch zap postingsItr remove unused offset/locoffset fields	2018-03-21 18:00:14 -07:00
Steve Yen	d1e2b55c72	scorch zap postingsItr.nextDocNum() maintains allNChunk correctly When PostingsIterator.nextDocNum() moves the 'all' roaring bitmap iterator forwards, it was incorrectly not keeping the allNChunk value aligned.	2018-03-21 17:57:54 -07:00
Abhinav Dangeti	ae27aa2f14	Merge pull request #848 from abhinavdangeti/curr Getting rid of panics added for debugging MB-28719,MB-28781	2018-03-20 15:14:22 -07:00
Abhinav Dangeti	a4d88f8a12	Merge pull request #833 from abhinavdangeti/master Return an error when the snapshotEpoch is invalid	2018-03-20 15:04:23 -07:00
Marty Schoch	110cfa3074	Merge pull request #847 from mschoch/fix-scorch-missing-invalid-field fix MB-28719 and MB-28781 invalid/missing field in scorch	2018-03-20 17:52:52 -04:00
abhinavdangeti	0e3c57c465	Revert "scorch zap getField() which panics if the field is unknown" This reverts commit `85b4a31e2a`.	2018-03-20 14:51:33 -07:00
abhinavdangeti	844845b5d2	Revert "scorch zap panic if mergeFields() sees unsorted fields" This reverts commit `2f4d3d8587`.	2018-03-20 14:51:25 -07:00
Marty Schoch	35ea1d4423	fix MB-28719 and MB-28781 invalid/missing field in scorch Use of sync.Pool to reuse the interim structure relied on resetting the fieldsInv slice. However, actual segments continued to use this same fieldsInv slice after returning it to the pool. Simple fix is to nil out fieldsInv slice in reset method and let the newly built segment keep the one from the interim struct.	2018-03-20 17:41:56 -04:00
Steve Yen	e88cb783e2	Merge pull request #845 from steveyen/MB-28719-related-assertions MB-28719 related assertions	2018-03-20 11:38:31 -07:00
Steve Yen	2f4d3d8587	scorch zap panic if mergeFields() sees unsorted fields mergeFields depends on the fields from the various segments being sorted for the fieldsSame comparison to work. Of note, the 'fieldi > 1' guard skips the 0th field, which should always be the '_id' field.	2018-03-20 11:17:46 -07:00
Steve Yen	85b4a31e2a	scorch zap getField() which panics if the field is unknown	2018-03-20 11:12:18 -07:00
Marty Schoch	65e16a7d96	Merge pull request #841 from mschoch/improve-cmdline improve command-line tool for zap	2018-03-19 15:06:17 -04:00
Steve Yen	0492b33c2e	Merge pull request #840 from steveyen/MB-28781 MB-28781 - check if fields are the same before using merge optimization of copying term/norm/loc bytes	2018-03-19 11:58:53 -07:00
Marty Schoch	e9b228bcdd	improve command-line tool for zap correctly handle/print additional loc bitmap address this fixes bitmap length that is output instantiate roaring bitmap and print it out removed some unnecessary debug logging updated dict command to print 1-hit encoded vals this makes dict command usable for seeing which doc ids are in a segment and their corresponding doc number	2018-03-19 14:57:30 -04:00
Steve Yen	f65ba5c0f4	MB-28781 - scorch zap merge freq/loc copying only when fieldsSame The optimization recently introduced in commit `530a3d24cf`, ("scorch zap optimize merge by byte copying freq/norm/loc's") was to byte-copy freq/norm/loc data directly during merging. But, it was incorrect if the fields were different across segments. This change now performs that byte-copying merging optimization only when the fields are the same across segments, and if not, leverages the old approach of deserializing & re-serializing the freq/norm/loc information, which has the important step of remapping fieldID's. See also: https://issues.couchbase.com/browse/MB-28781	2018-03-19 11:26:51 -07:00
Steve Yen	c881146270	scorch zap mergeTermFreqNormLocsByCopying() helper func	2018-03-19 10:36:23 -07:00
Sreekanth Sivasankaran	cf8e0d63bb	Merge pull request #837 from blevesearch/docnum_missing_fix MB-28753 - docNumber "xx" not found err with updates	2018-03-19 14:36:30 +05:30
Sreekanth Sivasankaran	980ce9ebb3	MB-28753 - document number "xxx" not found err with update workload Introducer was incorrectly updating the offsets slice of segments, by considering only the live doc count while computing the "running". This can result in incorrectly computing the residing segment as well as the local doc numbers while loading a document after a search hit.	2018-03-19 12:11:37 +05:30
Steve Yen	6693a89441	Merge pull request #835 from steveyen/use-1MB-buffer-for-file-merger scorch zap file merger uses 1MB buffered writer	2018-03-18 09:06:52 -07:00
Steve Yen	5df53c8e1f	scorch zap file merger uses 1MB buffered writer pprof of bleve-blast was showing file merging was in syscall/write a lot. The bufio.NewWriter() provides a default buffer size of 4K, which is too small, and using bufio.NewWriterSize(1MB buffer size) leads to syscall/write dropping out of the file merging flame graphs.	2018-03-16 11:49:53 -07:00
abhinavdangeti	60bdf6d247	Return an error when the snapshotEpoch is invalid Avoiding this stacktrace (SIGSEGV) while using bleve scorch cmd-line utility when snapshotEpoch provided is invalid: github.com/blevesearch/bleve/index/scorch.(*IndexSnapshot).Segments(...) /Users/abhinavdangeti/Documents/couchbaseV/godeps/src/github.com/blevesearch/bleve/index/scorch/snapshot_index.go:56 github.com/blevesearch/bleve/cmd/bleve/cmd/scorch.glob..func1(0x1f347e0, 0xc4201f1400, 0x2, 0x2, 0x0, 0x0) /Users/abhinavdangeti/Documents/couchbaseV/godeps/src/github.com/blevesearch/bleve/cmd/bleve/cmd/scorch/ascii.go:43 +0xe4 github.com/blevesearch/bleve/cmd/bleve/vendor/github.com/spf13/cobra.(*Command).execute(0x1f347e0, 0xc4201f12e0, 0x2, 0x2, 0x1f347e0, 0xc4201f12e0) /Users/abhinavdangeti/Documents/couchbaseV/godeps/src/github.com/blevesearch/bleve/cmd/bleve/vendor/github.com/spf13/cobra/command.go:646 +0x3e8 github.com/blevesearch/bleve/cmd/bleve/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0x1f334c0, 0x0, 0x0, 0x0) /Users/abhinavdangeti/Documents/couchbaseV/godeps/src/github.com/blevesearch/bleve/cmd/bleve/vendor/github.com/spf13/cobra/command.go:737 +0x2fe github.com/blevesearch/bleve/cmd/bleve/vendor/github.com/spf13/cobra.(*Command).Execute(0x1f334c0, 0x0, 0x0) /Users/abhinavdangeti/Documents/couchbaseV/godeps/src/github.com/blevesearch/bleve/cmd/bleve/vendor/github.com/spf13/cobra/command.go:695 +0x2b github.com/blevesearch/bleve/cmd/bleve/cmd.Execute() /Users/abhinavdangeti/Documents/couchbaseV/godeps/src/github.com/blevesearch/bleve/cmd/bleve/cmd/root.go:74 +0x31 main.main() /Users/abhinavdangeti/Documents/couchbaseV/goproj/src/github.com/couchbase/cbft/cmd/cbft-bleve/main.go:39 +0x1cb	2018-03-16 11:43:25 -07:00
Steve Yen	3d1cea0556	Merge pull request #834 from steveyen/more-mem-reuse-and-optimizations More mem reuse and optimizations	2018-03-16 11:41:10 -07:00
Steve Yen	b411e65234	scorch zap optimize postingsIterator reuse of freq/locChunkOffsets	2018-03-16 11:22:50 -07:00
Steve Yen	e52eb84e37	scorch zap optimize merge when deletion bitmap is empty This change detects whether a deletion bitmap is empty, and treats that as a nil bitmap, which allows further postings iterator codepaths to avoid roaring bitmap operations (like, AndNot(docNums, drops)).	2018-03-16 11:22:50 -07:00
Steve Yen	5411d9ae4f	Merge pull request #826 from steveyen/scorch-estimate-buf-size estimate interim buffer size based on previous results	2018-03-16 11:22:42 -07:00
Marty Schoch	4f33b4be44	Merge pull request #832 from mschoch/rename-size-full rename SizeFull to Size	2018-03-16 12:10:48 -04:00
Marty Schoch	11ff31c2f9	rename SizeFull to Size	2018-03-16 11:31:47 -04:00
Marty Schoch	9a87593fd7	Merge pull request #830 from mschoch/avoid-app-herder-hot-lock memoize the size of an entire index snapshot	2018-03-16 11:28:22 -04:00
Marty Schoch	f1c26e29f0	Merge branch 'master' into avoid-app-herder-hot-lock	2018-03-16 10:30:34 -04:00
Marty Schoch	dee639ccc0	Merge pull request #829 from abhinavdangeti/master Do not account IndexReader's size in the query RAM estimate	2018-03-16 10:28:24 -04:00
Sreekanth Sivasankaran	53bf29763b	Merge pull request #821 from blevesearch/minor_docvalue_space_savings docValue space savings	2018-03-16 09:12:13 +05:30
Sreekanth Sivasankaran	53c3cab512	Merge branch 'master' into minor_docvalue_space_savings	2018-03-16 08:53:57 +05:30
Sreekanth Sivasankaran	23cebae5a8	Merge pull request #815 from blevesearch/loadchunk_minor minor optimisation to loadChunk method	2018-03-16 08:15:37 +05:30
Marty Schoch	45e0e5c666	memoize the size of an entire index snapshot by memoizing the size of index snapshots and their constituent parts, we significantly reduce the amount of time that the lock is held in the app_herder, when calculating the total memory used	2018-03-15 17:25:05 -04:00
abhinavdangeti	65fed52d0b	Do not account IndexReader's size in the query RAM estimate Since it's just the pointer size of the IndexReader that is being accounted for while estimating the RAM needed to execute a search query, get rid of the Size() API in the IndexReader interface.	2018-03-15 13:23:58 -07:00
Sreekanth Sivasankaran	d1155c223a	zap version bump, changed the offset slice format, UTs	2018-03-15 23:25:53 +05:30
Steve Yen	d1b84d4578	Merge pull request #828 from blevesearch/minor_fixes posting iterator array positions clean up	2018-03-15 09:31:15 -07:00
Sreekanth Sivasankaran	1775602958	posting iterator array positions clean up, max segment size limit adjustment for hit-1 optimisation	2018-03-15 14:40:00 +05:30
Sreekanth Sivasankaran	441065a41b	comments, simplification	2018-03-15 13:11:29 +05:30
Steve Yen	4af65a7846	scorch zap prealloc buf via estimate from previous interim work	2018-03-14 09:32:14 -07:00
Steve Yen	985082d5d2	Merge pull request #824 from steveyen/reuse-interim-vellum scorch zap optimize interim's reuse of vellum builders	2018-03-14 08:10:43 -07:00
Steve Yen	7578ff7cb8	scorch zap optimize interim's reuse of vellum builders Since interim structs are now sync.Pool'ed, we can now also hold onto and reuse the associated vellum builder.	2018-03-14 07:49:28 -07:00
Abhinav Dangeti	2c69a5651b	Merge pull request #825 from abhinavdangeti/master MB-27385: De-duplicate the list of requested fields	2018-03-13 14:33:13 -07:00
abhinavdangeti	715144d632	MB-27385: De-duplicate the list of requested fields De-duplicate the list of fields provided by the client as part of the search request, so as to not inadvertently load the same stored field more than once.	2018-03-13 14:19:02 -07:00
Steve Yen	62afdf4ac1	Merge pull request #823 from blevesearch/max_segment_size adding maxsegment size limit checks	2018-03-13 07:52:27 -07:00
Sreekanth Sivasankaran	debbcd7d47	adding maxsegment size limit checks	2018-03-13 17:35:54 +05:30
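
Commit `621b58dd83` describes replacing the separate "postings locations" bitmap with the least significant bit of the freq value. The following is a minimal sketch of that bit-packing idea, not the code from the commit: the helper names come from the commit message, but their signatures and the `uint64` type are assumptions made for illustration.

```go
package main

import "fmt"

// encodeFreqHasLocs packs a term frequency and a "has locations" flag into a
// single integer: the frequency is shifted into the upper bits and the flag
// occupies the least significant bit. (Signature assumed for illustration.)
func encodeFreqHasLocs(freq uint64, hasLocs bool) uint64 {
	rv := freq << 1
	if hasLocs {
		rv |= 1
	}
	return rv
}

// decodeFreqHasLocs reverses encodeFreqHasLocs, returning the frequency and
// whether the low bit was set.
func decodeFreqHasLocs(v uint64) (uint64, bool) {
	return v >> 1, v&1 == 1
}

func main() {
	packed := encodeFreqHasLocs(3, true)
	freq, hasLocs := decodeFreqHasLocs(packed)
	fmt.Println(packed, freq, hasLocs) // prints "7 3 true"
}
```

Because the flag travels inside a value that is already written for every posting, no extra bitmap has to be stored or consulted; the cost is one bit of range in the encoded frequency.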

1860 Commits