This unexported API estimates the amount of memory needed to execute
a search query over an index before the collector begins data collection.
Sample estimates for certain queries:
{Size: 10, BenchmarkUpsidedownSearchOverhead}
                                                        ESTIMATE  BENCHMEM
TermQuery                                                   4616      4796
MatchQuery                                                  5210      5405
DisjunctionQuery (Match queries)                            7700      8447
DisjunctionQuery (Term queries)                             6514      6591
ConjunctionQuery (Match queries)                            7524      8175
Nested disjunction query (disjunction of disjunctions)     10306     10708
…
Found with "versus" test (TestScorchVersusUpsideDownBoltSmallMNSAM),
which had a boolean query with a MustNot that was the same as the Must
parameters. This replicates a situation found by
Aruna/Mihir/testrunner/RQG (MB-27291). Example:
    "query": {
      "must_not": {"disjuncts": [
        {"field": "body", "match": "hello"}
      ]},
      "must": {"conjuncts": [
        {"field": "body", "match": "hello"}
      ]}
    }
The nested searchers along the MustNot pathway would end up looking
roughly like...
    booleanSearcher
      MustNot
        => disjunctionSearcher
           => disjunctionSearcher
              => termSearcher
On the first Next() call by the collector, each of the two disjunction
searchers would run its own Next() processing, which on the first call
also includes its initSearcher() processing.
This has the effect of driving the leaf termSearcher through two
Next() invocations.
That is, if there were 3 docs (doc-1, doc-2, doc-3), the leaf
termSearcher would at this point have moved to point to doc-3, while
the topmost MustNot would have received doc-1.
Next, the booleanSearcher's Must searcher would produce doc-2, so the
booleanSearcher would try to Advance() the MustNot searcher to doc-2.
But, in scorch, the leafmost termSearcher had already gotten past
doc-2 and would return its doc-3.
In upsidedown, in contrast, the leaf termSearcher would then drive the
KVStore iterator with a Seek(doc-2), and the KVStore iterator would
perform a backwards seek to reach doc-2.
In scorch, however, backwards iteration seeking isn't supported.
So, this fix checks the state of the disjunction searcher to see if we
already have the necessary state, so that we don't have to perform
actual Advance() calls on the underlying searchers. This not only
fixes the behavior w.r.t. scorch, but may also make upsidedown
slightly faster, as we avoid some backwards KVStore iterator seeks.
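The scenario above can be replayed with a small, self-contained model.
This is only a sketch under stated assumptions: termSearcher,
disjunctionSearcher, and the checkState flag here are simplified,
hypothetical stand-ins (integer doc IDs, a single disjunction level
instead of two nested ones), not the library's actual types. With
checkState off, Advance() always pushes the forward-only leaf and doc-2
is lost; with it on, the buffered doc is reused.

```go
package main

import "fmt"

// termSearcher is a hypothetical forward-only iterator over sorted doc IDs.
// A return value of 0 means the iterator is exhausted.
type termSearcher struct {
	docs []int
	pos  int
}

func (t *termSearcher) Next() int {
	if t.pos >= len(t.docs) {
		return 0
	}
	doc := t.docs[t.pos]
	t.pos++
	return doc
}

// Advance only moves forward; docs already consumed cannot be revisited.
func (t *termSearcher) Advance(target int) int {
	for t.pos < len(t.docs) && t.docs[t.pos] < target {
		t.pos++
	}
	return t.Next()
}

// disjunctionSearcher buffers one current doc per child in currs.
type disjunctionSearcher struct {
	children   []*termSearcher
	currs      []int
	inited     bool
	checkState bool // assumption: true models the fix, false the earlier behavior
}

func (d *disjunctionSearcher) init() {
	d.currs = make([]int, len(d.children))
	for i, c := range d.children {
		d.currs[i] = c.Next()
	}
	d.inited = true
}

// Next returns the smallest buffered doc and refills the children that held it.
func (d *disjunctionSearcher) Next() int {
	if !d.inited {
		d.init()
	}
	lowest := 0
	for _, doc := range d.currs {
		if doc != 0 && (lowest == 0 || doc < lowest) {
			lowest = doc
		}
	}
	if lowest == 0 {
		return 0
	}
	for i, c := range d.children {
		if d.currs[i] == lowest {
			d.currs[i] = c.Next()
		}
	}
	return lowest
}

func (d *disjunctionSearcher) Advance(target int) int {
	if !d.inited {
		d.init()
	}
	for i, c := range d.children {
		if d.checkState && d.currs[i] >= target {
			continue // buffered doc already satisfies target; leave the child alone
		}
		d.currs[i] = c.Advance(target)
	}
	return d.Next()
}

// firstNextThenAdvance replays the scenario: one Next(), then Advance(2).
func firstNextThenAdvance(checkState bool) (first, advanced int) {
	leaf := &termSearcher{docs: []int{1, 2, 3}}
	d := &disjunctionSearcher{children: []*termSearcher{leaf}, checkState: checkState}
	first = d.Next()
	advanced = d.Advance(2)
	return first, advanced
}

func main() {
	f1, a1 := firstNextThenAdvance(false)
	f2, a2 := firstNextThenAdvance(true)
	fmt.Println(f1, a1) // without the state check
	fmt.Println(f2, a2) // with the state check
}
```

In this model the first Next() already drives the leaf through two
Next() calls, so the unchecked Advance(2) can only land on doc-3.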
This is a change in search result behavior in that location
information is no longer provided by default with search results.
Although this looks like a wide-ranging change, it's mostly a
mechanical replacement of the explain bool flag with a new
search.SearcherOptions struct, which holds both the Explain bool flag
and the IncludeTermVectors bool flag.
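As a rough illustration of the replacement, the sketch below passes an
options struct where a bare explain bool would have gone. Only the
struct name and its two fields come from the description above;
termSearcher, newTermSearcher, and includesLocations are hypothetical
names for this example, and tying location output to
IncludeTermVectors is an assumption.

```go
package main

import "fmt"

// SearcherOptions groups the per-searcher flags named in the description.
type SearcherOptions struct {
	Explain            bool
	IncludeTermVectors bool
}

// termSearcher and newTermSearcher are hypothetical stand-ins used only
// to show how the options struct travels in place of a bare bool.
type termSearcher struct {
	term    string
	options SearcherOptions
}

func newTermSearcher(term string, options SearcherOptions) *termSearcher {
	return &termSearcher{term: term, options: options}
}

// includesLocations reports whether this searcher would request term
// vectors (assumed here to be the source of location information).
func (t *termSearcher) includesLocations() bool {
	return t.options.IncludeTermVectors
}

func main() {
	byDefault := newTermSearcher("hello", SearcherOptions{})
	withLocations := newTermSearcher("hello", SearcherOptions{IncludeTermVectors: true})
	fmt.Println(byDefault.includesLocations(), withLocations.includesLocations())
}
```

Because the zero value of the struct leaves both flags false, callers
that pass SearcherOptions{} get no location information unless they
opt in.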
This commit reverts a previous optimization attempt (3f588cd4a) that
tried to trim or shrink the array of child searchers in a
search-disjunction.
I am not yet sure why, but that optimization broke higher-level
boolean queries, so this commit reverts it to restore that
functionality.
Disjunction searchers are used heavily by higher-level searchers, like
prefix searchers. In that case, a disjunction searcher might have
many thousands of child searchers.
This commit adds an optimization that closes each child term searcher
as soon as it is finished and removes it from the disjunction
searcher's children.
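One way to picture the close-and-remove step is an in-place filter over
the slice of children. This is a minimal sketch, not the commit's
actual code: childSearcher, finished, and pruneFinished are
hypothetical names, and doc IDs are plain integers.

```go
package main

import "fmt"

// childSearcher is a hypothetical child of a disjunction: an iterator
// over sorted doc IDs that records whether Close() was called.
type childSearcher struct {
	docs   []int
	pos    int
	closed bool
}

func (c *childSearcher) finished() bool { return c.pos >= len(c.docs) }

func (c *childSearcher) Close() { c.closed = true }

// pruneFinished closes every finished child and returns the children
// that are still active. It filters in place, reusing the backing array.
func pruneFinished(children []*childSearcher) []*childSearcher {
	active := children[:0]
	for _, c := range children {
		if c.finished() {
			c.Close()
			continue
		}
		active = append(active, c)
	}
	// Clear the unused tail so closed children can be garbage collected.
	for i := len(active); i < len(children); i++ {
		children[i] = nil
	}
	return active
}

func main() {
	done := &childSearcher{docs: []int{1}, pos: 1}
	live := &childSearcher{docs: []int{2, 3}}
	active := pruneFinished([]*childSearcher{done, live})
	fmt.Println(len(active), done.closed, live.closed)
}
```

With many thousands of children, as in the prefix-searcher case,
dropping finished ones keeps later passes over the slice short.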