13 Commits

Author SHA1 Message Date
Steve Yen
192621f402 scorch includeFreq/Norm/Locs params for postingsList.Iterator API
This commit adds boolean flag params to the scorch
PostingsList.Iterator() method, so that the caller can specify whether
freq/norm/locs information is needed or not.

Future changes can leverage these params for optimizations.
2018-03-26 09:49:44 -07:00
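
The flag params described in commit 192621f402 could be sketched roughly as follows. This is a minimal illustration, not the scorch code: the `Posting`, `PostingsList`, and `PostingsIterator` types here are simplified, hypothetical stand-ins, and only the idea of passing `includeFreq`, `includeNorm`, and `includeLocs` booleans to `Iterator()` comes from the commit message.

```go
package main

import "fmt"

// Posting is a simplified, hypothetical posting (not the scorch type).
type Posting struct {
	DocNum uint64
	Freq   uint64
	Norm   float64
	Locs   []uint64
}

// PostingsList is a hypothetical in-memory postings list.
type PostingsList struct {
	postings []Posting
}

// PostingsIterator remembers which optional details the caller asked for.
type PostingsIterator struct {
	list                                  *PostingsList
	pos                                   int
	includeFreq, includeNorm, includeLocs bool
}

// Iterator lets the caller state up front whether freq/norm/locs are needed.
func (pl *PostingsList) Iterator(includeFreq, includeNorm, includeLocs bool) *PostingsIterator {
	return &PostingsIterator{
		list:        pl,
		includeFreq: includeFreq,
		includeNorm: includeNorm,
		includeLocs: includeLocs,
	}
}

// Next returns the next posting, or nil when exhausted. Only the requested
// fields are populated; the rest stay at their zero values.
func (it *PostingsIterator) Next() *Posting {
	if it.pos >= len(it.list.postings) {
		return nil
	}
	src := it.list.postings[it.pos]
	it.pos++
	out := &Posting{DocNum: src.DocNum}
	if it.includeFreq {
		out.Freq = src.Freq
	}
	if it.includeNorm {
		out.Norm = src.Norm
	}
	if it.includeLocs {
		out.Locs = src.Locs
	}
	return out
}

func main() {
	pl := &PostingsList{postings: []Posting{
		{DocNum: 1, Freq: 3, Norm: 0.5, Locs: []uint64{0, 4, 9}},
	}}
	// A caller that only needs doc numbers skips everything else.
	it := pl.Iterator(false, false, false)
	for p := it.Next(); p != nil; p = it.Next() {
		fmt.Println(p.DocNum, p.Freq, len(p.Locs))
	}
}
```

In this sketch the flags only control what is copied out. In a real on-disk segment, the same flags would presumably let the iterator skip decoding the corresponding data, which is the kind of optimization the commit message alludes to.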
Steve Yen
eade78be2f scorch zap unit tests no longer use mem.Segment 2018-03-09 15:23:58 -08:00
Steve Yen
eac9808990 scorch zap optimize FST val encoding for terms with 1 hit
NOTE: this is a scorch zap file format change / bump to version 4.

In this optimization, the uint64 val stored in the vellum FST (term
dictionary) now may either be a uint64 postingsOffset (same as before
this change) or a uint64 encoding of the docNum + norm (in the case
where a term appears in just a single doc).
2018-03-08 09:19:54 -08:00
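
One way to pack either a postings offset or a single (docNum, norm) pair into one uint64, as commit eac9808990 describes, is to reserve a high bit as a tag. The sketch below is an assumption made for illustration: the constant names, the choice of bit 63 as the tag, and the 31-bit field widths are hypothetical and should not be read as the actual zap version 4 layout.

```go
package main

import "fmt"

const (
	// oneHitFlag marks a value as a 1-hit encoding (assumed tag bit).
	oneHitFlag = uint64(1) << 63
	// mask31 keeps the low 31 bits of a field (assumed field width).
	mask31 = uint64(1)<<31 - 1
)

// encodeOneHit packs docNum and normBits into a single tagged uint64.
// Values wider than 31 bits are truncated in this sketch.
func encodeOneHit(docNum, normBits uint64) uint64 {
	return oneHitFlag | (normBits&mask31)<<31 | docNum&mask31
}

// isOneHit reports whether val is a 1-hit encoding rather than a
// plain postings offset.
func isOneHit(val uint64) bool {
	return val&oneHitFlag != 0
}

// decodeOneHit unpacks a value produced by encodeOneHit.
func decodeOneHit(val uint64) (docNum, normBits uint64) {
	return val & mask31, (val >> 31) & mask31
}

func main() {
	val := encodeOneHit(42, 1234)
	if isOneHit(val) {
		docNum, normBits := decodeOneHit(val)
		fmt.Println("1-hit term:", docNum, normBits)
	}

	// An untagged value is treated as a postings offset, as before.
	offset := uint64(4096)
	fmt.Println("is 1-hit:", isOneHit(offset))
}
```

The design trade-off is that the tag bit halves the usable offset range, in exchange for avoiding a separate postings list for every term that occurs in only one document.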
Sreekanth Sivasankaran
2a9739ee1b naming change, interface removal 2018-03-07 14:43:33 +05:30
Sreekanth Sivasankaran
395b0a312d adding UTs 2018-03-05 17:02:58 +05:30
Steve Yen
ed4826b189 scorch zap merge optimization to byte-copy storedDocs
The optimization to byte-copy all the storedDocs for a given segment
during merging kicks in when the fields are the same across all
segments and when there are no deletions for that given segment.  This
can happen, for example, during data loading or insert-only scenarios.

As part of this commit, the Segment.copyStoredDocs() method was added,
which uses a single Write() call to copy all the stored docs bytes of
a segment to a writer in one shot.

And, getDocStoredMetaAndCompressed() was refactored into a related
helper function, getDocStoredOffsets(), which provides the storedDocs
metadata (offsets & lengths) for a doc.
2018-02-08 09:08:35 -08:00
Steve Yen
8c2520d55c scorch zap optimize via postingsList reuse
pprof graphs were showing many postingsList allocations during
merging, so this change optimizes by reusing postingsList memory in the
merging loops.
2018-02-07 14:33:20 -08:00
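
The reuse pattern behind commit 8c2520d55c can be illustrated with a lookup that accepts a previously returned object and overwrites it instead of allocating a new one. The `dictionary` and `postingsList` types and the `prealloc` parameter name below are hypothetical and chosen for the sketch; they are not taken from the scorch source.

```go
package main

import "fmt"

// postingsList is a hypothetical, simplified postings list.
type postingsList struct {
	docNums []uint64
}

// dictionary maps a term to its doc numbers (a stand-in for a term dictionary).
type dictionary map[string][]uint64

// postingsList fills and returns prealloc when it is non-nil, so a caller
// looping over many terms allocates at most one postingsList.
func (d dictionary) postingsList(term string, prealloc *postingsList) *postingsList {
	rv := prealloc
	if rv == nil {
		rv = &postingsList{}
	}
	// Truncate to length 0 and append, which keeps the backing array.
	rv.docNums = append(rv.docNums[:0], d[term]...)
	return rv
}

func main() {
	dict := dictionary{
		"apple":  {1, 2, 3},
		"banana": {2},
	}

	var pl *postingsList // reused across loop iterations
	for _, term := range []string{"apple", "banana"} {
		pl = dict.postingsList(term, pl)
		fmt.Println(term, pl.docNums)
	}
}
```

The caller must finish with the contents before the next lookup, since each call overwrites the shared object. That restriction is the usual cost of this kind of allocation saving.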
Steve Yen
93b037cdbb scorch zap TestMergeWithUpdates() 2018-01-31 11:44:41 -08:00
Steve Yen
4dd64b68fa scorch zap TestMergeWithEmptySegment(s) 2018-01-30 22:27:40 -08:00
Steve Yen
684ee3c0e7 scorch zap DictIterator term count fixed and more merge unit tests
The zap DictionaryIterator Next() was incorrectly returning the
postingsList offset as the term count.  As part of this, refactored
out a PostingsList.read() helper method.

Also added more merge unit test scenarios, including merging a segment
for a few rounds to see if there are differences before/after merging.
2018-01-30 21:22:06 -08:00
Marty Schoch
a0e12b2640 add license to a few files missing it 2017-12-13 16:12:29 -05:00
Marty Schoch
f83c9f2a20 initial cut of merger that actually introduces changes 2017-12-13 13:41:03 -05:00
Marty Schoch
665c3c80ff initial cut of zap segment merging 2017-12-12 11:21:55 -05:00