This change produces less garbage by switching from a map[uint16]'s to
array's for the fieldLens and docMap, and then reusing those arrays
across multiple processDocument() calls.
NOTE: this is a scorch zap file format change / bump to version 4.
In this optimization, the uint64 val stored in the vellum FST (term
dictionary) now may either be a uint64 postingsOffset (same as before
this change) or a uint64 encoding of the docNum + norm (in the case
where a term appears in just a single doc).
Do not re-account for certain referenced data in the zap structures.
New estimates:
ESTIMATE BENCHMEM
TermQuery 11396 12437
MatchQuery 12244 12951
DisjunctionQuery (Term queries) 20644 20709
Before this change, writeRoaringWithLen() would leverage a reused
bytes.Buffer (#A) and invoke the roaring.WriteTo() API.
But, it turns out the roaring.WriteTo() API has a suboptimal
implementation, in that underneath-the-hood it converts the roaring
bitmap to a byte buffer (using roaring.ToBytes()), and then calls
Write(). But, that Write() turns out to be an additional memcpy into
the provided bytes.Buffer (#A).
By directly invoking roaring.ToBytes(), this change to
writeRoaringWithLen() avoids the extra memory allocation and memcpy.
This change leverages the ability for the chunkedIntCoder.Add() method
to accept multiple input param values (via the '...' param signature),
meaning there are fewer Add() invocations.
This API (unexported) will estimate the amount of memory needed to execute
a search query over an index before the collector begins data collection.
Sample estimates for certain queries:
{Size: 10, BenchmarkUpsidedownSearchOverhead}
ESTIMATE BENCHMEM
TermQuery 4616 4796
MatchQuery 5210 5405
DisjunctionQuery (Match queries) 7700 8447
DisjunctionQuery (Term queries) 6514 6591
ConjunctionQuery (Match queries) 7524 8175
Nested disjunction query (disjunction of disjunctions) 10306 10708
…
This change adds a zap PostingsIterator.nextBytes() method, which is
similar to Next(), but instead of returning a Posting instance,
nextBytes() returns the encoded freq/norm and location byte slices.
The zap merge code then provides those byte slices directly to the
intCoder's via a new method, intCoder.AddBytes(), thereby avoiding
having to encode many uvarint's.
Due to the usage rules of iterators, mem.PostingsIterator.Next() can
reuse its returned Postings instance.
Also, there's a micro optimization in persistDocValues() for one fewer
access to the docTermMap in the inner-loop.
This change allows the introducer to become the only goroutine to
modify the root, which in turn allows the introducer to greatly reduce
its root lock holding surface area.
There's now multiple competing merge activities (file-merging and
in-memory merging during persistence), so the simple math to
precalculate capacity for the slice of segments in introduceMerge() no
longer works for all cases and might have negative capacity.
This change removes that (sometimes wrong) precalculation, and instead
depends on append() to grow the slice correctly.