Commit Graph

22 Commits

Author SHA1 Message Date
abhinavdangeti
65fed52d0b Do not account IndexReader's size in the query RAM estimate
Since it's just the pointer size of the IndexReader that is
being accounted for while estimating the RAM needed to
execute a search query, get rid of the Size() API in the
IndexReader interface.
2018-03-15 13:23:58 -07:00
abhinavdangeti
7e36109b3c MB-28162: Provide API to estimate memory needed to run a search query
This API (unexported) will estimate the amount of memory needed to execute
a search query over an index before the collector begins data collection.

Sample estimates for certain queries:
{Size: 10, BenchmarkUpsidedownSearchOverhead}
                                                           ESTIMATE    BENCHMEM
TermQuery                                                  4616        4796
MatchQuery                                                 5210        5405
DisjunctionQuery (Match queries)                           7700        8447
DisjunctionQuery (Term queries)                            6514        6591
ConjunctionQuery (Match queries)                           7524        8175
Nested disjunction query (disjunction of disjunctions)     10306       10708
…
2018-03-06 13:53:42 -08:00
Marty Schoch
272da43c16 phrase searcher don't allow advance after end 2017-12-27 10:24:33 -08:00
Steve Yen
c7a342bc7d scorch conjuncts match phrase test passes
The conjunction searcher Advance() method now checks whether its curr
doc-matches suffice before advancing them.
2017-12-23 09:19:40 -08:00
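The check described in this commit can be sketched as follows. This is a minimal illustration, not the actual searcher code: the `cursor` type, its `advances` counter, and `advanceAll` are hypothetical names, and document IDs are plain integers here for simplicity.

```go
package main

import "fmt"

// cursor is a hypothetical stand-in for a child searcher that walks a
// sorted list of document IDs.
type cursor struct {
	ids      []int
	i        int
	advances int // counts how often advance was actually called
}

// curr returns the document the cursor currently points at, if any.
func (c *cursor) curr() (int, bool) {
	if c.i >= len(c.ids) {
		return 0, false
	}
	return c.ids[c.i], true
}

// advance moves the cursor to the first document ID >= target.
func (c *cursor) advance(target int) {
	c.advances++
	for c.i < len(c.ids) && c.ids[c.i] < target {
		c.i++
	}
}

// advanceAll advances only those cursors whose current document is still
// behind the target; cursors already at or past it are left untouched.
func advanceAll(cs []*cursor, target int) {
	for _, c := range cs {
		id, ok := c.curr()
		if !ok || id >= target {
			continue // exhausted, or current match already suffices
		}
		c.advance(target)
	}
}

func main() {
	a := &cursor{ids: []int{1, 5, 9}}
	b := &cursor{ids: []int{7, 8}}
	advanceAll([]*cursor{a, b}, 6)
	fmt.Println(a.advances, b.advances)
}
```

In this sketch only `a` is moved, because `b` already sits on a document at or beyond the target.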
Marty Schoch
c048833fcd added stringer method to phrase part
a failing test was producing unhelpful pointer addresses as
the only debug output.  this changes the output to print
the terms and locations as readable text

part of #629
2017-09-01 09:16:08 -04:00
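A stringer of the kind this commit describes might look like the sketch below. The type names, fields, and output format are assumptions made for illustration; only the idea (printing terms and locations instead of pointer addresses) comes from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// phrasePart is a hypothetical representation of one matched term and
// the position at which it was found.
type phrasePart struct {
	term string
	loc  uint64
}

// String implements fmt.Stringer so that a part prints as readable text
// rather than as a pointer address.
func (p *phrasePart) String() string {
	return fmt.Sprintf("[%s %d]", p.term, p.loc)
}

// phrasePath is a hypothetical ordered list of parts forming one match.
type phrasePath []*phrasePart

// String joins the readable form of every part in the path.
func (p phrasePath) String() string {
	parts := make([]string, len(p))
	for i, pp := range p {
		parts[i] = pp.String()
	}
	return strings.Join(parts, ", ")
}

func main() {
	path := phrasePath{{term: "quick", loc: 2}, {term: "fox", loc: 4}}
	// Without the String methods, %v would print the slice's pointer values.
	fmt.Printf("%v\n", path)
}
```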
Marty Schoch
2ba915b929 add additional parens to clarify logic 2017-02-10 20:22:32 -05:00
Marty Schoch
c6085d8cdc address initial code review comments 2017-02-10 15:22:14 -05:00
Marty Schoch
09d00829db phrase searcher now supports multi-phrase
backwards compatibility maintained through previous constructor
very basic test added (not sufficient)
2017-02-10 15:17:50 -05:00
Marty Schoch
9c8e1e82de add initial low-level support for multi-phrase
this adds basic multi-phrase support,
a shim to keep the top-level working
and unit tests for new multi-phrase cases
2017-02-10 13:16:05 -05:00
Marty Schoch
4e38c49287 move phrase search logic into phrase searcher
the logic of how a phrase search works should be an internal
detail of the phrase searcher.  further, these changes will
allow proper scoring of phrase matches, which require access
to the underlying searcher objects, which were hidden in the
previous approach.
2017-02-10 12:05:01 -05:00
Marty Schoch
8096d9fb90 remove use of float64 to represent int things
this originated from a misunderstanding of mine going back
several years.  the values need not be float64 just because
we plan to serialize them as json.

there are still larger questions about what the right type should
be, and where should any conversions go.  but, this commit
simply attempts to address the most egregious problems
2017-02-09 20:15:59 -05:00
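The point made in this commit can be illustrated with a small sketch. The struct names and the `pos` field are hypothetical; the example only shows that Go's `encoding/json` serializes integer fields directly, and that routing large integers through float64 can lose precision.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// hitInt and hitFloat are hypothetical result types that differ only in
// the Go type used for a position value.
type hitInt struct {
	Pos uint64 `json:"pos"`
}

type hitFloat struct {
	Pos float64 `json:"pos"`
}

// encode marshals v to JSON and returns it as a string.
func encode(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

// roundTrips reports whether n survives a conversion to float64 and back.
func roundTrips(n uint64) bool {
	return uint64(float64(n)) == n
}

func main() {
	// Both types produce the same JSON for small values.
	fmt.Println(encode(hitInt{Pos: 42}))
	fmt.Println(encode(hitFloat{Pos: 42}))

	// Above 2^53, float64 can no longer represent every integer.
	fmt.Println(roundTrips(42), roundTrips(1<<53+1))
}
```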
Marty Schoch
232fc80dad add support for phrase slop to internals of phrase searcher
phrase slop is not yet supported on the frontend
added lots of tests around slop
2017-02-09 15:59:51 -05:00
Marty Schoch
f82638c117 refactor phrase search to be recursive
a more correct solution that will enable us to extend in two
important ways:

1) support slop
2) support multi-phrase
2017-02-03 16:05:21 -05:00
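To show why a recursive formulation makes slop straightforward to add, here is a minimal sketch. It is not the searcher's actual algorithm: the function names, the 1-based integer positions, and the choice of absolute position difference as the slop cost are all assumptions made for illustration.

```go
package main

import "fmt"

// abs returns the absolute value of n.
func abs(n int) int {
	if n < 0 {
		return -n
	}
	return n
}

// matchRest tries to place the remaining terms after prevPos. Each term
// ideally sits at prevPos+1; any deviation is charged against the
// remaining slop budget, and the recursion continues with what is left.
func matchRest(prevPos int, rest []string, positions map[string][]int, slop int) bool {
	if len(rest) == 0 {
		return true // every term has been placed
	}
	want := prevPos + 1
	for _, pos := range positions[rest[0]] {
		dist := abs(pos - want)
		if dist <= slop && matchRest(pos, rest[1:], positions, slop-dist) {
			return true
		}
	}
	return false
}

// phraseMatches reports whether the terms form a phrase within the given
// slop, trying every occurrence of the first term as a starting point.
func phraseMatches(terms []string, positions map[string][]int, slop int) bool {
	if len(terms) == 0 {
		return false
	}
	for _, start := range positions[terms[0]] {
		if matchRest(start, terms[1:], positions, slop) {
			return true
		}
	}
	return false
}

func main() {
	// Term positions for the text "the quick brown fox".
	positions := map[string][]int{"the": {1}, "quick": {2}, "brown": {3}, "fox": {4}}
	fmt.Println(phraseMatches([]string{"quick", "brown"}, positions, 0))
	fmt.Println(phraseMatches([]string{"quick", "fox"}, positions, 0))
	fmt.Println(phraseMatches([]string{"quick", "fox"}, positions, 1))
}
```

Because the cost uses an absolute difference, this sketch also accepts out-of-order terms when the slop budget is large enough; a stricter variant would reject negative offsets.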
Marty Schoch
12a7257b5f remove duplicate code suggested by review from @steveyen 2017-01-31 15:12:06 -05:00
Marty Schoch
7fd8aeb50a refactor phrase search into separate methods
at the core, the Next() method moves another searcher forward
and checks each hit to see if it also satisfies the phrase
constraints.  the current implementation has 4 nested for loops.
these nested loops make it harder to read (indentation) and harder
to reason about (complexity).

this refactor does not remove any loops, it simply moves some
of the inner loops into separate methods so that one can
more easily reason about the parts separately.
2017-01-31 13:32:46 -05:00
Silvan Jegen
33e2432fc6 Initialize the return value as late as possible 2016-11-08 20:05:36 +01:00
Silvan Jegen
3dd363afaa Don't search the same term twice
We have searched for the first term in the phrase query already so we
can skip it. Before doing so we have to add the location of the first
term.
2016-11-08 20:05:04 +01:00
Silvan Jegen
d87b4f88bf Refactor phrase searching
Reduce nesting by using early continues.
2016-11-08 20:04:28 +01:00
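The early-continue style mentioned in this commit can be sketched as below. The `location` type and `positionsInField` function are hypothetical and exist only to show the pattern: each disqualifying condition exits the iteration immediately, so the main logic stays at one indentation level.

```go
package main

import "fmt"

// location is a hypothetical term occurrence: the field it appeared in
// and its position within that field.
type location struct {
	field string
	pos   uint64
}

// positionsInField collects positions from locs that belong to the given
// field and are at or after minPos. Each filter is an early continue
// instead of another level of nested if-blocks.
func positionsInField(locs []location, field string, minPos uint64) []uint64 {
	var out []uint64
	for _, l := range locs {
		if l.field != field {
			continue // wrong field, skip
		}
		if l.pos < minPos {
			continue // too early in the field, skip
		}
		out = append(out, l.pos)
	}
	return out
}

func main() {
	locs := []location{
		{field: "title", pos: 1},
		{field: "body", pos: 2},
		{field: "body", pos: 7},
	}
	fmt.Println(positionsInField(locs, "body", 3))
}
```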
Steve Yen
8230a7195f some simplification / DRY for phrase searcher 2016-10-12 09:26:31 -07:00
Marty Schoch
3a276153a3 actually rename packages to singular, not just directory name 2016-10-02 10:29:39 -04:00
Marty Schoch
2332455bd2 nicer formatting of license header 2016-10-02 10:13:14 -04:00
Marty Schoch
6bf9dd59ab BREAKING CHANGE - additional package renaming
i recently learned that package names should also prefer the
singular form, not the plural form
2016-10-01 17:20:59 -04:00