1755 Commits

Author SHA1 Message Date
abhinavdangeti 7e36109b3c MB-28162: Provide API to estimate memory needed to run a search query
This API (unexported) will estimate the amount of memory needed to execute
a search query over an index before the collector begins data collection.

Sample estimates for certain queries:
{Size: 10, BenchmarkUpsidedownSearchOverhead}
                                                           ESTIMATE    BENCHMEM
TermQuery                                                  4616        4796
MatchQuery                                                 5210        5405
DisjunctionQuery (Match queries)                           7700        8447
DisjunctionQuery (Term queries)                            6514        6591
ConjunctionQuery (Match queries)                           7524        8175
Nested disjunction query (disjunction of disjunctions)     10306       10708
…
2018-03-06 13:53:42 -08:00
Steve Yen 2b005f1e23
Merge pull request #801 from steveyen/scorch-postings-itr-byte-copy
scorch merge optimizations via tf/loc byte copy & reader/decoder reuse
2018-03-06 13:41:58 -08:00
Steve Yen 5b86da85f3 scorch zap optimize postings itr with tf/loc reader/decoder reuse 2018-03-06 13:30:59 -08:00
Steve Yen 530a3d24cf scorch zap optimize merge by byte copying freq/norm/loc's
This change adds a zap PostingsIterator.nextBytes() method, which is
similar to Next(), but instead of returning a Posting instance,
nextBytes() returns the encoded freq/norm and location byte slices.

The zap merge code then provides those byte slices directly to the
intCoders via a new method, intCoder.AddBytes(), thereby avoiding
having to encode many uvarints.
2018-03-06 13:30:59 -08:00
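
The byte-copy idea in the commit above can be illustrated with a minimal sketch. It is not the zap code: encodeUvarints, reencode and copyBytes are hypothetical helpers, and the standard library's encoding/binary stands in for the project's own coder. It only shows that appending already-encoded uvarint bytes gives the same output as decoding and re-encoding them, which is why the decode step can be skipped when the values themselves do not change.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// encodeUvarints appends the uvarint encoding of each value to dst.
// Hypothetical helper, not part of bleve or zap.
func encodeUvarints(dst []byte, vals ...uint64) []byte {
	var tmp [binary.MaxVarintLen64]byte
	for _, v := range vals {
		n := binary.PutUvarint(tmp[:], v)
		dst = append(dst, tmp[:n]...)
	}
	return dst
}

// reencode is the slow path: decode every uvarint in src, then encode it again.
func reencode(dst, src []byte) []byte {
	for len(src) > 0 {
		v, n := binary.Uvarint(src)
		if n <= 0 {
			panic("malformed uvarint")
		}
		dst = encodeUvarints(dst, v)
		src = src[n:]
	}
	return dst
}

// copyBytes is the fast path: append the already-encoded bytes unchanged.
func copyBytes(dst, src []byte) []byte {
	return append(dst, src...)
}

func main() {
	src := encodeUvarints(nil, 3, 300, 70000)
	slow := reencode(nil, src)
	fast := copyBytes(nil, src)
	// Both paths produce identical bytes; the fast path does no decoding.
	fmt.Println(bytes.Equal(slow, fast))
}
```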
Steve Yen 655268bec8 scorch zap postings iterator nextDocNum() helper method
Refactored out a nextDocNum() helper method from Next() that future
optimizations can use.
2018-03-06 07:55:26 -08:00
Steve Yen 502e64c256 scorch zap Posting doesn't use iterator field 2018-03-05 16:33:13 -08:00
Steve Yen 16174c589d
Merge pull request #799 from steveyen/scorch-optimizations
more scorch optimizations
2018-03-05 15:59:19 -08:00
Steve Yen 8f8fd511b7 scorch zap access freqs[offset] outside loop 2018-03-05 12:02:33 -08:00
Steve Yen a338386a03 scorch build optimize freq/loc slice capacity 2018-03-05 12:02:33 -08:00
Steve Yen 856778ad7b scorch zap build prealloc docNumbers capacity 2018-03-05 12:02:33 -08:00
Steve Yen 8c0881eab2 scorch zap build reuses mem postingsList/Iterator structs 2018-03-05 12:02:33 -08:00
Steve Yen 85761c6a57 go fmt 2018-03-05 12:02:33 -08:00
Steve Yen d44c5ad568 scorch stats MaxBatchIntroTime bug fix and more timing stats
Added timing stats for in-mem zap merging and file-based zap merging.
2018-03-05 12:02:33 -08:00
Steve Yen c5ab1f61d7
Merge pull request #795 from steveyen/scorch-mem-optimizations
More scorch micro optimizations when processing mem segments
2018-03-05 12:02:15 -08:00
Steve Yen 884da6f93a scorch optimize mem processDocument() norm calculation
This change moves the norm calculation outside of the inner loop.
2018-03-03 11:58:30 -08:00
Steve Yen 6ae799052a scorch mem optimize processDocument() stored field 2018-03-03 11:52:33 -08:00
Steve Yen b7cfef81c9 scorch optimize mem processDocument() dict access
This change moves the dict lookup to outside of the loop.
2018-03-03 11:43:25 -08:00
Steve Yen 88c740095b scorch optimizations for mem.PostingsIterator.Next() & docTermMap
Due to the usage rules of iterators, mem.PostingsIterator.Next() can
reuse its returned Postings instance.

Also, there's a micro optimization in persistDocValues() for one fewer
access to the docTermMap in the inner-loop.
2018-03-03 11:31:18 -08:00
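
A minimal sketch of the iterator reuse pattern described in the commit above, assuming the usual rule that a value returned by Next() is only valid until the next call. The posting and postingsIterator types here are hypothetical and much simpler than the real mem segment types.

```go
package main

import "fmt"

// posting is a hypothetical, simplified posting record.
type posting struct {
	docNum uint64
	freq   uint64
}

// postingsIterator walks parallel slices and hands out one reused posting.
type postingsIterator struct {
	docNums []uint64
	freqs   []uint64
	pos     int
	reuse   posting // overwritten and returned by every Next() call
}

// Next returns the next posting, or nil when exhausted. The returned
// pointer is only valid until the following call to Next.
func (it *postingsIterator) Next() *posting {
	if it.pos >= len(it.docNums) {
		return nil
	}
	it.reuse = posting{docNum: it.docNums[it.pos], freq: it.freqs[it.pos]}
	it.pos++
	return &it.reuse
}

func main() {
	it := &postingsIterator{docNums: []uint64{1, 5}, freqs: []uint64{2, 7}}
	first := it.Next()
	saved := *first // a caller that needs to keep a posting must copy it
	second := it.Next()
	fmt.Println(first == second, saved.docNum, second.docNum)
}
```

The trade-off is one fewer allocation per posting in exchange for a stricter contract on callers, which must copy any posting they want to retain.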
Steve Yen 4ebf3f1d44
Merge pull request #794 from steveyen/persister-uses-introducer
scorch persister goes through introducer to affect root
2018-03-02 16:23:17 -08:00
Steve Yen a5253bfe2b scorch persister goes through introducer to affect root
This change allows the introducer to become the only goroutine to
modify the root, which in turn allows the introducer to greatly reduce
its root lock holding surface area.
2018-03-02 16:14:28 -08:00
Marty Schoch 33641ef9d3
Merge pull request #793 from mschoch/remove-reader
remove unnecessary scorch reader wrapper
2018-03-02 14:13:31 -08:00
Marty Schoch 30acc55d05 remove unnecessary scorch reader wrapper
we now use *IndexSnapshot directly
2018-03-02 14:03:54 -08:00
Steve Yen fd6bfb0113
Merge pull request #792 from steveyen/more-stats
adding more scorch related stats
2018-03-02 13:41:06 -08:00
Steve Yen d61d9e4cf6 scorch stats MaxBatchIntroTime and TotBatchIntroTime 2018-03-02 13:33:06 -08:00
Steve Yen 868a66279e scorch indexing time stat
Looks like this was forgotten along the way -- the stat for analysis
time was tracked correctly, but indexing time wasn't.
2018-03-02 11:07:39 -08:00
Steve Yen 3fe7e2e4f4
Merge pull request #791 from steveyen/scorch-stats-gauge
renamed to CurOnDiskBytes/Files as those are gauges
2018-03-01 22:47:21 -08:00
Steve Yen 7e5bb0bd8d renamed to CurOnDiskBytes/Files as those are gauges 2018-03-01 14:13:43 -08:00
Marty Schoch bbfda08cf7
Merge pull request #790 from mschoch/use-vellum-reset
update to use new vellum Reset API
2018-03-01 10:06:38 -08:00
Marty Schoch 0363b24dd4 update to use new vellum Reset API 2018-03-01 09:37:39 -08:00
Steve Yen 39f9cee910
Merge pull request #789 from steveyen/sreekanth-cb-scorch_stats
adding stats for scorch, with no gauges
2018-02-28 17:41:10 -08:00
Steve Yen 1e6243e21c
Merge pull request #788 from steveyen/intcoder-encoder-never-nil
scorch zap intcoder encoder is never nil
2018-02-28 17:32:25 -08:00
Steve Yen 1b661ef844 stats cleanup, renaming, gauges replaced with counters 2018-02-28 17:03:28 -08:00
Steve Yen 7d46d2c7ae scorch zap intcoder encoder is never nil 2018-02-28 10:09:21 -08:00
Sreekanth Sivasankaran 4b742505aa adding stats for scorch 2018-02-28 15:31:55 +05:30
Steve Yen 56c2acd990
Merge pull request #775 from steveyen/posting-reuse-reader
Postings codepaths reuse readers and location slices
2018-02-27 19:26:20 -08:00
Steve Yen dd7d93ee5e scorch zap loadChunk reuses Location slices 2018-02-27 18:01:48 -08:00
Steve Yen 4dbb4b1495 scorch zap posting reuses freqNorm & loc reader and decoder 2018-02-27 18:01:48 -08:00
Steve Yen 1733d4ee5e
Merge pull request #786 from steveyen/MB-28403
MB-28403: scorch introduceMerge doesn't prealloc segments capacity
2018-02-27 15:26:27 -08:00
Steve Yen a32362ba2e MB-28403: scorch introduceMerge doesn't prealloc segments capacity
There are now multiple competing merge activities (file-merging and
in-memory merging during persistence), so the simple math to
precalculate capacity for the slice of segments in introduceMerge() no
longer works for all cases and might have negative capacity.

This change removes that (sometimes wrong) precalculation, and instead
depends on append() to grow the slice correctly.
2018-02-27 15:14:34 -08:00
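
The failure mode described in the commit above can be sketched as follows. The arithmetic is illustrative only and is not the calculation that introduceMerge() used; prealloc and keepSegments are hypothetical names. The sketch shows that make() with a negative, non-constant capacity panics at run time, while append() on a nil slice simply grows as needed.

```go
package main

import "fmt"

// prealloc tries to allocate a slice with the estimated capacity and
// converts the run-time panic from a negative capacity into an error.
func prealloc(estimate int) (segs []int, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("prealloc failed: %v", r)
		}
	}()
	return make([]int, 0, estimate), nil
}

// keepSegments skips the capacity estimate and lets append grow the slice.
func keepSegments(ids []int, dropped map[int]bool) []int {
	var kept []int
	for _, id := range ids {
		if !dropped[id] {
			kept = append(kept, id)
		}
	}
	return kept
}

func main() {
	ids := []int{1, 2, 3}
	// Illustrative only: two competing merges each report dropping two
	// of the three segments, so the naive estimate goes negative.
	estimate := len(ids) - 2 - 2
	if _, err := prealloc(estimate); err != nil {
		fmt.Println(err)
	}
	fmt.Println(keepSegments(ids, map[int]bool{1: true, 2: true}))
}
```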
Marty Schoch b5ce0b046f
Merge pull request #785 from mschoch/use-new-context-pkg
BREAKING API CHANGE - use stdlib context pkg
2018-02-27 14:17:08 -08:00
Marty Schoch 8063132766 fix new issues found by go vet when using stdlib context pkg 2018-02-27 11:57:21 -08:00
Marty Schoch c74e08f039 BREAKING API CHANGE - use stdlib context pkg
update all references to context to use std lib pkg
2018-02-27 11:33:43 -08:00
Marty Schoch f58a205ae8 remove 1.6 from travis, add "1.10" 2018-02-27 11:29:16 -08:00
Steve Yen 1a319cdf5b
Merge pull request #784 from steveyen/drops-loop-optimization
scorch zap merge optimize drops lookup to outside of loop
2018-02-27 09:52:56 -08:00
Steve Yen 3f1dcb6078 scorch zap merge optimize drops lookup to outside of loop 2018-02-27 09:23:29 -08:00
Marty Schoch b8bb7922eb
Merge pull request #782 from steveyen/scorch-intcoder-optimizations
Various scorch optimizations around merge & chunkedIntCoder
2018-02-26 17:57:00 -05:00
Steve Yen 99ed127176 scorch zap merge optimize newDocNums lookup to outside of loop
And, also a "go fmt".
2018-02-26 14:23:55 -08:00
Steve Yen 98d5d7bd81 scorch zap chunkedIntCoder optimizations
The optimizations / changes include...

- reuse of a memory buf when serializing varints.

- reuse of a govarint.U64Base128Encoder instance, as it's a thin
  wrapper around an underlying chunkBuf, so calling Reset() on the
  chunkBuf is enough for encoder reuse.

- chunkedIntCoder.Write() method was changed to invoke w.Write() less
  often by forming a larger, reused buf.  Profiling and analysis
  showed w.Write() was getting called a lot, often with tiny 1 or 2
  byte inputs.  The theory is w.Write() and its underlying memmove()
  can be more efficient when provided with larger bufs.

- some repeated code removal, by reusing the Close() method.
2018-02-26 14:17:09 -08:00
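
The write-batching point in the commit above can be sketched like this. It is not the chunkedIntCoder code: countingWriter, writeEach and writeBatched are hypothetical, and encoding/binary stands in for govarint. The sketch counts how often the underlying Write() is called when each varint is written separately, compared with assembling them into one reused buffer first.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"io"
)

// countingWriter records how many Write calls and bytes it receives.
type countingWriter struct {
	calls int
	bytes int
}

func (c *countingWriter) Write(p []byte) (int, error) {
	c.calls++
	c.bytes += len(p)
	return len(p), nil
}

// writeEach issues one tiny Write per encoded value.
func writeEach(w io.Writer, vals []uint64) error {
	var tmp [binary.MaxVarintLen64]byte
	for _, v := range vals {
		n := binary.PutUvarint(tmp[:], v)
		if _, err := w.Write(tmp[:n]); err != nil {
			return err
		}
	}
	return nil
}

// writeBatched encodes all values into buf (reused across calls) and
// issues a single Write. It returns buf so the caller can reuse it.
func writeBatched(w io.Writer, buf []byte, vals []uint64) ([]byte, error) {
	var tmp [binary.MaxVarintLen64]byte
	buf = buf[:0]
	for _, v := range vals {
		n := binary.PutUvarint(tmp[:], v)
		buf = append(buf, tmp[:n]...)
	}
	_, err := w.Write(buf)
	return buf, err
}

func main() {
	vals := []uint64{1, 2, 300}
	each, batched := &countingWriter{}, &countingWriter{}
	if err := writeEach(each, vals); err != nil {
		panic(err)
	}
	if _, err := writeBatched(batched, nil, vals); err != nil {
		panic(err)
	}
	fmt.Println(each.calls, each.bytes, batched.calls, batched.bytes)
}
```

Both paths emit the same bytes; only the number of Write() calls differs.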
Steve Yen ce2332e111 scorch zap merge reuses tf/locEncoder across terms
The finishTerm() helper func that's invoked on every outer loop resets
the tf/locEncoders so they can be safely reused.
2018-02-26 11:37:11 -08:00
Marty Schoch eca31dfd27
Merge pull request #777 from sreekanth-cb/persister_pause
pausing persister until merging catches up
2018-02-26 14:36:07 -05:00