bleve

Author	SHA1	Message	Date
Sreekanth Sivasankaran	c45822347f	Merge branch 'master' into mergeplanner_options	2018-02-26 15:59:20 +05:30
Sreekanth Sivasankaran	e4cc79a9ad	adopting json parsing on options, fixed the inadvertent option modification	2018-02-26 15:56:30 +05:30
Sreekanth Sivasankaran	4109e327ff	Merge pull request #771 from sreekanth-cb/merge_handling_empty_seg_tasks Fix for empty segment merge handling	2018-02-24 10:48:31 +05:30
Sreekanth Sivasankaran	683e195ac4	adding empty segment handling during introduction cleaning up the segment live size check	2018-02-24 07:03:27 +05:30
Abhinav Dangeti	1929ceb1f5	Merge pull request #781 from abhinavdangeti/upsidedown-missing-close Handle case where store snapshot isn't closed in upsidedown's Batch() API	2018-02-23 15:02:44 -08:00
abhinavdangeti	da70758635	Handle case where store snapshot isn't closed in upsidedown's Batch() API	2018-02-23 14:47:22 -08:00
Steve Yen	19080c1ae5	Merge pull request #779 from steveyen/wip-in-mem-seg-merging merging of in-memory segments during persistSnapshot	2018-02-23 14:02:02 -08:00
Steve Yen	c50d9b4023	scorch conditional merging during persistSnapshot() As part of this change, there are new helper methods -- persistSnapshotMaybeMerge() and persistSnapshotDirect().	2018-02-23 09:17:02 -08:00
Sreekanth Sivasankaran	a1db057656	configurable mergePlanner options mergePlanner options are parsed from the scorch configs parameters	2018-02-23 16:09:37 +05:30
Steve Yen	a0b7508da7	scorch zap mergeSegmentBases() func As part of this, zap.MergeToWriter() now returns more information -- enough so that callers can now create their own SegmentBase instances. Also, the fieldsMap maintained and returned by zap.MergeToWriter() is now a mapping from fieldName ==> fieldID+1 (instead of the previous mapping from fieldName ==> fieldID). This makes it similar to how fieldsMap are handled in other parts of zap to avoid "zero value" issues.	2018-02-19 14:13:31 -08:00
Steve Yen	720010783e	scorch zap InitSegmentBase() helper func Refactored out a zap.InitSegmentBase() func so that non-zap packages can create SegmentBase instances.	2018-02-19 14:13:31 -08:00
Steve Yen	656220ca9d	Merge pull request #769 from steveyen/scorch-rollback-ignores-unsafeBatch scorch rollback ignores unsafeBatch flag	2018-02-15 18:51:59 -08:00
Sreekanth Sivasankaran	606a270669	Fix for empty segment merge handling Avoid creating new files with empty segment tasks during the merge operation, and skip the incorrect appending of a newer segment during merge.	2018-02-15 16:44:20 +05:30
Steve Yen	030469a351	Merge pull request #767 from steveyen/persistSnapshot-err-handling improvements to err handling in persistSnapshot(), etc	2018-02-13 14:53:42 -08:00
Steve Yen	2651ba4b19	Merge pull request #773 from steveyen/merge-enumerator scorch zap segment merging via a new enumerator instead of vellum.MergeIterator	2018-02-13 13:05:39 -08:00
Steve Yen	57fc03258e	scorch rollback ignores unsafeBatch flag See also: https://github.com/blevesearch/bleve/issues/760	2018-02-13 10:21:42 -08:00
Steve Yen	29663c2795	Merge pull request #770 from steveyen/optimize-prealloced-postings-iterator scorch zap segment merging reuses prealloc'ed PostingsIterator	2018-02-13 10:02:42 -08:00
Steve Yen	fe544f3352	scorch zap merge uses enumerator for vellum.Iterator's	2018-02-12 21:28:46 -08:00
Steve Yen	a073424e5a	scorch zap dict.postingsListFromOffset() method A helper method that can create a PostingsList if the caller already knows the postingsOffset.	2018-02-12 20:54:07 -08:00
Steve Yen	2158e06c40	scorch zap merge collects dicts & itrs in lock-step The theory with this change is that the dicts and itrs should be positionally in "lock-step" with paired entries. And, since later code also uses the same array indexing to access the drops and newDocNums, those also need to be positionally in pair-wise lock-step, too.	2018-02-12 20:54:07 -08:00
Steve Yen	95a4f37e5c	scorch zap enumerator impl that joins multiple vellum iterators Unlike vellum's MergeIterator, the enumerator introduced in this commit doesn't merge when there are matching keys across iterators. Instead, the enumerator implementation provides a traversal of all the tuples of (key, iteratorIndex, val) from the underlying vellum iterators, ordered by key ASC, iteratorIndex ASC.	2018-02-12 20:54:06 -08:00
Steve Yen	a4c54c4389	Merge pull request #772 from abhinavdangeti/master Update vendor'ed revision for moss to the latest	2018-02-12 11:12:44 -08:00
abhinavdangeti	846235593c	Update vendor'ed revision for moss to the latest	2018-02-12 10:04:34 -08:00
Steve Yen	e37c563c56	scorch zap merge move fieldDvLocsOffset var declaration Move the var declaration to nearer where it's used.	2018-02-08 18:03:09 -08:00
Steve Yen	f177f07613	scorch zap segment merging reuses prealloc'ed PostingsIterator During zap segment merging, a new zap PostingsIterator was allocated for every field X segment X term. This change optimizes by reusing a single PostingsIterator instance per persistMergedRest() invocation. And, also unused fields are removed from the PostingsIterator.	2018-02-08 17:24:30 -08:00
Steve Yen	6f5f90cd41	scorch zap segment cleanup handling for some edge cases Two cases in this commit... If we're shutting down, the merger might not have handed off its latest merged segment to the introducer yet, so the merger still owns the segment and needs to Close() that segment itself. In persistSnapshot(), there might be cases where the persister might not be able to swap in its newly persisted segments -- so, the persistSnapshot() needs to Close() those segments itself.	2018-02-08 14:04:04 -08:00
Steve Yen	83272a9629	scorch persistSnapshot() err handling & propagation	2018-02-08 14:03:59 -08:00
Steve Yen	dee6a2b1c6	scorch persistSnapshot() consistently uses err to commit vs abort Some codepaths in persistSnapshot() were saving errors into an err2 local variable, which might lead incorrectly to commit during an error situation rather than abort.	2018-02-08 14:02:35 -08:00
Steve Yen	7b9fe0a216	Merge pull request #768 from steveyen/issue-764 scorch uses segment.id to encode boltdb sub-bucket key	2018-02-08 13:51:11 -08:00
Steve Yen	91ac0d011a	scorch uses segment.id to encode boltdb sub-bucket key fixes #764	2018-02-08 13:25:16 -08:00
Steve Yen	8a7990427f	Merge pull request #765 from steveyen/more-TestIndexRollback-fixes fix for TestIndexRollback unit tests	2018-02-08 12:45:28 -08:00
Steve Yen	1552caeab9	Merge pull request #766 from steveyen/scorch-persistSnapshot-comment scorch persistSnapshot comments update	2018-02-08 12:41:01 -08:00
Steve Yen	d0644fec12	scorch persistSnapshot comments update See also: https://github.com/blevesearch/bleve/issues/763	2018-02-08 12:22:58 -08:00
Steve Yen	99852accb0	scorch RollbackPoints() no error at start & fix TestIndexRollback When a scorch is just opened and is "empty", RollbackPoints() no longer considers that an error situation. Also, this commit makes the TestIndexRollback unit tests a bit more forgiving to races, as we were seeing failures sometimes in travis-CI environments (TestIndexRollback was passing fine on my dev macbook). The theory is the double-looping in the persisterLoop would sometimes be racy, leading to 1 or 2 rollback points.	2018-02-08 11:45:25 -08:00
Marty Schoch	ea20b1be42	Merge pull request #755 from steveyen/optimize-zap-merge-byte-copy-storedDocs optimize zap merge byte copy stored docs	2018-02-08 12:27:50 -05:00
Steve Yen	ed4826b189	scorch zap merge optimization to byte-copy storedDocs The optimization to byte-copy all the storedDocs for a given segment during merging kicks in when the fields are the same across all segments and when there are no deletions for that given segment. This can happen, for example, during data loading or insert-only scenarios. As part of this commit, the Segment.copyStoredDocs() method was added, which uses a single Write() call to copy all the stored docs bytes of a segment to a writer in one shot. And, getDocStoredMetaAndCompressed() was refactored into a related helper function, getDocStoredOffsets(), which provides the storedDocs metadata (offsets & lengths) for a doc.	2018-02-08 09:08:35 -08:00
Steve Yen	0b50a20cac	scorch zap move docDropped const to earlier in file	2018-02-08 09:06:31 -08:00
Steve Yen	822457542e	scorch zap VERSION bump: check whether fields are the same at merge COMPATIBILITY NOTE: scorch zap version bumped in this commit. The version bump is because mergeFields() now computes whether fields are the same across segments and it relies on the previous commit where fieldID's are assigned in field name sorted order (albeit with _id field always having fieldID of 0). Potential future commits might rely on this info that "fields are the same across segments" for more optimizations, etc.	2018-02-08 09:06:30 -08:00
Steve Yen	ffdeb8055e	scorch sorts fields by name to assign fieldID's This is a stepping stone to allow easier future comparisons of field maps and potential merge optimizations. In bleve-blast tests on a 2015 macbook (50K wikipedia docs, 8 indexers, batch size 100, ssd), this does not seem to have a distinct effect on indexing throughput.	2018-02-08 09:06:30 -08:00
Marty Schoch	1af90936c4	Merge pull request #751 from sreekanth-cb/merger_persister_handshake_fix fix for merger persister handshake stalemate	2018-02-08 11:03:01 -05:00
Marty Schoch	0bcfb15ace	Merge pull request #754 from sreekanth-cb/mergeplan_edge_tuning tuning the edge for merge-task execution loop	2018-02-08 10:59:03 -05:00
Marty Schoch	534bd5ef4d	Merge pull request #753 from steveyen/zap-rollback-test-fixes scorch zap TestIndexRollback fixes	2018-02-08 10:57:41 -05:00
Marty Schoch	f531a248e7	Merge pull request #749 from sreekanth-cb/zapfile_cleanup_fix unblock the files for clean up, esp for merged new segment files	2018-02-08 10:53:41 -05:00
Steve Yen	3d729c73c1	Merge pull request #758 from steveyen/scorch-optimizations-20180207 scorch optimizations via struct reuse	2018-02-08 06:16:27 -08:00
Sreekanth Sivasankaran	feecce1eb2	fix for merger persister handshake stalemate The slow merger was lagging behind the fast persister to a persister notify send-loop while the persister waits for any new introductions from the introducer, totally blocking the merger. This fix, along with the deleted files eligibility flipping, brings the file count to around 6 to 11 files per shard for both travel and beer samples.	2018-02-08 11:00:21 +05:30
Steve Yen	a83ee0f364	scorch zap.MergeToWriter() takes SegmentBases instead of Segments This change turns zap.MergeToWriter() into a public func, so that it's now directly callable from outside packages (such as from scorch's top-level merger or persister). And, MergeToWriter() now takes input of SegmentBases instead of Segments, so that it can now work on either in-memory zap segments or file-based zap segments. This is yet another stepping stone towards in-memory merging of zap segments.	2018-02-07 14:38:13 -08:00
Steve Yen	8c2520d55c	scorch zap optimize via postingsList reuse pprof graphs were showing many postingsList allocations during merging, so this change optimizes by reusing postingsList memory in the merging loops.	2018-02-07 14:33:20 -08:00
Steve Yen	03c8b2b7ec	scorch mem segment optimizes DictEntry's across Next() calls This change optimizes the scorch/mem DictionaryIterator by reusing a DictEntry struct across multiple Next() calls. This follows the same optimization trick and Next() semantics as upsidedown's FieldDict implementation.	2018-02-07 14:17:48 -08:00
Steve Yen	78a7ae562f	Merge pull request #756 from steveyen/optimize-storedIndexOffset-loop scorch zap mergeStoredAndRemap loop optimization	2018-02-06 18:00:34 -08:00
Steve Yen	0dfd73d6cc	scorch zap mergeStoredAndRemap loop optimization This change avoids an array/slice access in a loop body.	2018-02-06 17:10:44 -08:00
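
Commit 95a4f37e5c above describes an enumerator that, unlike a merging iterator, yields every (key, iteratorIndex, val) tuple ordered by key ASC, then iteratorIndex ASC. The following is a minimal sketch of that ordering only, not the actual zap code: it uses plain sorted slices in place of vellum iterators, and the names entry, tuple and enumerate are hypothetical. It also assumes keys are unique within each source.

```go
package main

import "fmt"

// entry is one (key, val) pair from a single key-sorted source.
// Hypothetical stand-in for what a vellum iterator would yield.
type entry struct {
	key string
	val uint64
}

// tuple is one enumerated item: the key, the index of the source
// it came from, and that source's value for the key.
type tuple struct {
	key    string
	itrIdx int
	val    uint64
}

// enumerate walks several key-sorted sources and returns every tuple,
// ordered by key ascending and then by source index ascending.
// Matching keys across sources are NOT merged; each one is emitted.
// Assumes keys are unique within a single source.
func enumerate(sources [][]entry) []tuple {
	pos := make([]int, len(sources)) // current position in each source
	var out []tuple
	for {
		// Find the lowest key among the current heads of all sources.
		lowest, found := "", false
		for i, src := range sources {
			if pos[i] < len(src) && (!found || src[pos[i]].key < lowest) {
				lowest, found = src[pos[i]].key, true
			}
		}
		if !found {
			return out // every source is exhausted
		}
		// Emit that key once per source that holds it, in index order.
		for i, src := range sources {
			if pos[i] < len(src) && src[pos[i]].key == lowest {
				out = append(out, tuple{lowest, i, src[pos[i]].val})
				pos[i]++
			}
		}
	}
}

func main() {
	a := []entry{{"apple", 1}, {"cat", 2}}
	b := []entry{{"cat", 3}, {"dog", 4}}
	for _, t := range enumerate([][]entry{a, b}) {
		fmt.Printf("%s %d %d\n", t.key, t.itrIdx, t.val)
	}
}
```

Because "cat" appears in both sources, it is emitted twice (once for source 0, once for source 1), which is what lets a caller keep per-segment values in lock-step with the source index.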


1697 Commits
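
Commit a0b7508da7 in the log above notes that fieldsMap now maps fieldName ==> fieldID+1 to avoid "zero value" issues. A minimal sketch of why that offset helps, assuming a plain Go map: a missing key reads back as 0, so storing fieldID+1 keeps "not found" distinct from a real fieldID of 0. The helper names buildFieldsMap and lookupFieldID are hypothetical and are not part of zap.

```go
package main

import "fmt"

// buildFieldsMap assigns each field name its slice index as fieldID,
// but stores fieldID+1 so that 0 can never be a stored value.
func buildFieldsMap(fieldNames []string) map[string]uint16 {
	m := make(map[string]uint16, len(fieldNames))
	for id, name := range fieldNames {
		m[name] = uint16(id) + 1
	}
	return m
}

// lookupFieldID returns the real fieldID and whether the field is known.
// A missing key yields the map's zero value (0), which is unambiguous
// here because every stored value is at least 1.
func lookupFieldID(m map[string]uint16, name string) (uint16, bool) {
	v := m[name]
	if v == 0 {
		return 0, false
	}
	return v - 1, true
}

func main() {
	m := buildFieldsMap([]string{"_id", "title"})
	fmt.Println(lookupFieldID(m, "_id"))     // fieldID 0, known
	fmt.Println(lookupFieldID(m, "missing")) // unknown field
}
```

Without the +1 offset, a lookup of "_id" (fieldID 0) and a lookup of an unknown field would both return 0 from a bare map read.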