1837 Commits

Author SHA1 Message Date
Abhinav Dangeti
b38a61d4cf
Merge pull request #812 from abhinavdangeti/search-callbacks
MB-28562: Support search query callbacks before and after execution
2018-03-08 15:11:32 -08:00
Steve Yen
614e6f19f0
Merge pull request #813 from steveyen/reuse-fieldLens-docMap
scorch mem processDocument reuses fieldLens/docMap arrays
2018-03-08 14:31:20 -08:00
abhinavdangeti
40f63baeb9 MB-28562: Support search query callbacks before and after execution
+ SearchQueryStartCallback
+ SearchQueryEndCallback
2018-03-08 13:35:51 -08:00
Steve Yen
25beba615d scorch mem processDocument reuses fieldLens/docMap arrays
This change produces less garbage by switching from map[uint16]s to
arrays for fieldLens and docMap, and then reusing those arrays
across multiple processDocument() calls.
2018-03-08 13:04:51 -08:00
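The map-to-array switch described in 25beba615d can be illustrated with a minimal sketch. The type and method names below (docState, reset) are hypothetical and not taken from the scorch code; the sketch only shows the general idea of indexing by field ID into a slice whose backing array survives across calls.

```go
package main

import "fmt"

// docState is a hypothetical stand-in for per-document scratch state.
type docState struct {
	fieldLens []int // indexed by field ID; replaces a map[uint16]int
}

// reset sizes the scratch slice for numFields fields and zeroes it,
// reusing the existing backing array whenever its capacity allows.
func (s *docState) reset(numFields int) {
	if cap(s.fieldLens) < numFields {
		s.fieldLens = make([]int, numFields)
		return
	}
	s.fieldLens = s.fieldLens[:numFields]
	for i := range s.fieldLens {
		s.fieldLens[i] = 0
	}
}

// processDocument accumulates token counts per field into the reused slice.
// The returned slice is only valid until the next processDocument call.
func (s *docState) processDocument(tokensPerField []int) []int {
	s.reset(len(tokensPerField))
	for fieldID, n := range tokensPerField {
		s.fieldLens[fieldID] += n
	}
	return s.fieldLens
}

func main() {
	var s docState
	fmt.Println(s.processDocument([]int{3, 5}))
	fmt.Println(s.processDocument([]int{1, 2}))
}
```

After the first call the slice is allocated once; later documents with the same number of fields reuse it rather than creating a new map per document.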
Steve Yen
e9bbca4270
Merge pull request #811 from steveyen/chumbawamba
scorch zap optimize FST val encoding for terms with 1 hit
2018-03-08 13:04:24 -08:00
Abhinav Dangeti
676c85c935
Merge pull request #810 from abhinavdangeti/master
MB-28163: Provide an API to estimate the RAM needed for SearchResult
2018-03-08 12:01:17 -08:00
Steve Yen
eac9808990 scorch zap optimize FST val encoding for terms with 1 hit
NOTE: this is a scorch zap file format change / bump to version 4.

In this optimization, the uint64 val stored in the vellum FST (term
dictionary) may now be either a uint64 postingsOffset (same as before
this change) or a uint64 encoding of the docNum + norm (in the case
where a term appears in just a single doc).
2018-03-08 09:19:54 -08:00
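One way to picture the dual-purpose FST val from eac9808990 is a tagged uint64. The bit layout below (flag in the top bit, 31 norm bits, 32 docNum bits) is an assumption made for illustration and should not be read as the zap version 4 format; the function names are likewise hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// oneHitFlag marks a val as an inline single-hit encoding (hypothetical layout).
const oneHitFlag = uint64(1) << 63

// encode1Hit packs a docNum and the low 31 bits of a norm into one uint64.
func encode1Hit(docNum uint32, normBits uint32) uint64 {
	return oneHitFlag | uint64(normBits&0x7fffffff)<<32 | uint64(docNum)
}

// decodeVal reports whether val is a 1-hit encoding. If so, docNum and
// normBits are set; otherwise postingsOffset carries val unchanged.
func decodeVal(val uint64) (oneHit bool, docNum, normBits uint32, postingsOffset uint64) {
	if val&oneHitFlag == 0 {
		return false, 0, 0, val
	}
	return true, uint32(val), uint32(val>>32) & 0x7fffffff, 0
}

func main() {
	// A non-negative float32 has a zero sign bit, so its bits fit in 31 bits.
	v := encode1Hit(42, math.Float32bits(0.5))
	oneHit, docNum, normBits, _ := decodeVal(v)
	fmt.Println(oneHit, docNum, math.Float32frombits(normBits))

	oneHit, _, _, off := decodeVal(1024)
	fmt.Println(oneHit, off)
}
```

With a scheme like this, a term that appears in a single doc needs no postings list on disk at all, at the cost of reserving part of the offset space for the flag.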
abhinavdangeti
33ef4ce35e MB-28163: Provide an API to estimate the RAM needed for SearchResult
exported API: MemoryNeededForSearchResult(req *SearchRequest)
2018-03-07 12:08:16 -08:00
Steve Yen
f04226d10b
Merge pull request #809 from steveyen/TestRoaringSizes
added TestRoaringSizes()
2018-03-07 12:01:24 -08:00
Steve Yen
1e2bb14f13 added TestRoaringSizes() 2018-03-07 10:53:24 -08:00
Steve Yen
0ec4a1935a
Merge pull request #808 from steveyen/more-scorch-optimizing
err fix and more scorch optimizing
2018-03-07 10:39:20 -08:00
Abhinav Dangeti
06be1ad72e
Merge pull request #806 from abhinavdangeti/master
Fixing the scorch search request memory estimate
2018-03-07 10:11:24 -08:00
Steve Yen
2b5da7a819 go fmt 2018-03-07 09:12:55 -08:00
Steve Yen
59eb70d020 scorch zap remove unused chunkedIntCoder field 2018-03-07 09:11:10 -08:00
Steve Yen
79f28b7c93 scorch fix persistDocValues() err return 2018-03-07 09:11:10 -08:00
Steve Yen
8c0f402d4b scorch zap optimize processDocument() loc inner loop 2018-03-07 09:11:10 -08:00
Steve Yen
15242af465
Merge pull request #805 from steveyen/optimize-scorch-mem-processField
Optimize scorch processField() inner loop and writeRoaringWithLen()
2018-03-07 09:09:57 -08:00
Sreekanth Sivasankaran
c813165d4b
Merge pull request #798 from blevesearch/compaction_bytes_stats
adding compaction_written_bytes/sec stats to scorch
2018-03-07 22:29:13 +05:30
Sreekanth Sivasankaran
73ed8e248d
fixing the indentation issues.
It looks like they were introduced during the web-based conflict resolution.
2018-03-07 18:34:54 +05:30
Sreekanth Sivasankaran
e0369a3553
Merge branch 'master' into compaction_bytes_stats 2018-03-07 14:47:33 +05:30
Sreekanth Sivasankaran
2a9739ee1b naming change, interface removal 2018-03-07 14:43:33 +05:30
abhinavdangeti
5c721226cf Fixing the scorch search request memory estimate
Do not re-account for certain referenced data in the zap structures.

New estimates:

                                    ESTIMATE    BENCHMEM
TermQuery                           11396       12437
MatchQuery                          12244       12951
DisjunctionQuery (Term queries)     20644       20709
2018-03-06 16:03:10 -08:00
Steve Yen
8841d79d26 scorch optimize mem processField inner-loop 2018-03-06 15:26:54 -08:00
Steve Yen
dde6c2e01b scorch zap optimize writeRoaringWithLen()
Before this change, writeRoaringWithLen() would leverage a reused
bytes.Buffer (#A) and invoke the roaring.WriteTo() API.

But it turns out the roaring.WriteTo() API has a suboptimal
implementation, in that under the hood it converts the roaring
bitmap to a byte buffer (using roaring.ToBytes()), and then calls
Write().  That Write() turns out to be an additional memcpy into
the provided bytes.Buffer (#A).

By directly invoking roaring.ToBytes(), this change to
writeRoaringWithLen() avoids the extra memory allocation and memcpy.
2018-03-06 14:59:20 -08:00
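The shape of the dde6c2e01b change can be sketched without the roaring dependency. Here fakeBitmap and sliceWriter are hypothetical stand-ins, and writeWithLen is an illustrative helper rather than the actual writeRoaringWithLen(); the point is only that the serialized bytes go straight to the destination writer after a uvarint length prefix, with no intermediate buffer in between.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"io"
)

// serializer is a hypothetical stand-in for a bitmap exposing ToBytes().
type serializer interface {
	ToBytes() ([]byte, error)
}

// fakeBitmap pretends its own bytes are the serialized form.
type fakeBitmap []byte

func (f fakeBitmap) ToBytes() ([]byte, error) { return []byte(f), nil }

// sliceWriter is a tiny io.Writer that appends to a byte slice.
type sliceWriter struct{ data []byte }

func (w *sliceWriter) Write(p []byte) (int, error) {
	w.data = append(w.data, p...)
	return len(p), nil
}

// writeWithLen writes a uvarint length prefix and then the serialized bytes
// directly to w, skipping any staging buffer.
func writeWithLen(s serializer, w io.Writer) (int, error) {
	buf, err := s.ToBytes()
	if err != nil {
		return 0, err
	}
	var lenBuf [binary.MaxVarintLen64]byte
	n := binary.PutUvarint(lenBuf[:], uint64(len(buf)))
	written, err := w.Write(lenBuf[:n])
	if err != nil {
		return written, err
	}
	m, err := w.Write(buf)
	return written + m, err
}

func main() {
	w := &sliceWriter{}
	n, err := writeWithLen(fakeBitmap("abc"), w)
	fmt.Println(n, err, w.data)
}
```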
Steve Yen
ae81806435
Merge pull request #802 from steveyen/use-chunkedIntCoder-multival-input
scorch zap optimize chunkedIntCoder.Add() calls to use multiple vals
2018-03-06 14:23:44 -08:00
Steve Yen
b62ca996f6 scorch zap optimize chunkedIntCoder.Add() calls to use multiple vals
This change leverages the ability of the chunkedIntCoder.Add() method
to accept multiple input param values (via the '...' param signature),
meaning there are fewer Add() invocations.
2018-03-06 14:11:41 -08:00
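The effect of the variadic signature mentioned in b62ca996f6 can be shown with a simplified coder. The intCoder type below is hypothetical and much smaller than the real chunkedIntCoder (no chunking, and the Add signature is an assumption), but it demonstrates that one multi-value call produces the same bytes as several single-value calls.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// intCoder is a hypothetical, simplified uvarint encoder.
type intCoder struct {
	buf   []byte
	calls int // counts Add invocations, for illustration only
}

// Add appends each value as a uvarint; the variadic form lets a caller
// pass several values in a single invocation.
func (c *intCoder) Add(vals ...uint64) {
	c.calls++
	var tmp [binary.MaxVarintLen64]byte
	for _, v := range vals {
		n := binary.PutUvarint(tmp[:], v)
		c.buf = append(c.buf, tmp[:n]...)
	}
}

func main() {
	var batched, single intCoder
	batched.Add(1, 300, 5) // one call
	single.Add(1)          // three calls, same output
	single.Add(300)
	single.Add(5)
	fmt.Println(batched.calls, single.calls, string(batched.buf) == string(single.buf))
}
```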
Abhinav Dangeti
79d376ecfb
Merge pull request #803 from abhinavdangeti/master
Address build breakage after rebase
2018-03-06 14:09:08 -08:00
abhinavdangeti
38b6c522b0 Address build breakage after rebase
Removed attribute: iterator of type Posting
2018-03-06 14:00:54 -08:00
abhinavdangeti
96071c085c MB-28163: Register a callback with context to estimate RAM for search
This callback, if registered with the context, will invoke the API to
estimate the memory needed to execute a search query. The callback,
defined on the client side, is responsible for deciding whether to
continue with the search or abort it, based on its threshold settings.
2018-03-06 13:53:42 -08:00
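The callback flow described in 96071c085c might look roughly like the sketch below. The context key, callback type, and function names are all hypothetical and are not the identifiers used by the library; the sketch only shows a client-supplied callback travelling in a context.Context and vetoing a search whose estimate exceeds a threshold.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

type ctxKey string

// estimateCallbackKey is a hypothetical context key, not the library's own.
const estimateCallbackKey ctxKey = "estimate-callback"

// estimateCallback receives the estimated bytes and may veto the search.
type estimateCallback func(estimatedBytes uint64) error

var errTooExpensive = errors.New("estimated memory exceeds threshold")

// search consults the callback, if one is registered, before doing any work.
func search(ctx context.Context, estimatedBytes uint64) error {
	if cb, ok := ctx.Value(estimateCallbackKey).(estimateCallback); ok {
		if err := cb(estimatedBytes); err != nil {
			return err
		}
	}
	return nil // the actual search would run here
}

// runWithLimit registers a client-side threshold callback and runs search.
func runWithLimit(limit, estimatedBytes uint64) error {
	cb := estimateCallback(func(n uint64) error {
		if n > limit {
			return errTooExpensive
		}
		return nil
	})
	ctx := context.WithValue(context.Background(), estimateCallbackKey, cb)
	return search(ctx, estimatedBytes)
}

func main() {
	fmt.Println(runWithLimit(10000, 4616)) // under the limit
	fmt.Println(runWithLimit(4000, 4616))  // over the limit
}
```

Keeping the threshold decision in the callback means the search code needs no knowledge of the client's memory budget.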
abhinavdangeti
7e36109b3c MB-28162: Provide API to estimate memory needed to run a search query
This API (unexported) will estimate the amount of memory needed to execute
a search query over an index before the collector begins data collection.

Sample estimates for certain queries:
{Size: 10, BenchmarkUpsidedownSearchOverhead}
                                                           ESTIMATE    BENCHMEM
TermQuery                                                  4616        4796
MatchQuery                                                 5210        5405
DisjunctionQuery (Match queries)                           7700        8447
DisjunctionQuery (Term queries)                            6514        6591
ConjunctionQuery (Match queries)                           7524        8175
Nested disjunction query (disjunction of disjunctions)     10306       10708
…
2018-03-06 13:53:42 -08:00
Steve Yen
2b005f1e23
Merge pull request #801 from steveyen/scorch-postings-itr-byte-copy
scorch merge optimizations via tf/loc byte copy & reader/decoder reuse
2018-03-06 13:41:58 -08:00
Steve Yen
5b86da85f3 scorch zap optimize postings itr with tf/loc reader/decoder reuse 2018-03-06 13:30:59 -08:00
Steve Yen
530a3d24cf scorch zap optimize merge by byte copying freq/norm/loc's
This change adds a zap PostingsIterator.nextBytes() method, which is
similar to Next(), but instead of returning a Posting instance,
nextBytes() returns the encoded freq/norm and location byte slices.

The zap merge code then provides those byte slices directly to the
intCoders via a new method, intCoder.AddBytes(), thereby avoiding
having to encode many uvarints.
2018-03-06 13:30:59 -08:00
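The byte-copy idea from 530a3d24cf can be illustrated with a simplified coder. The intCoder type and the two merge helpers below are hypothetical; they only show that appending already-encoded uvarint bytes yields the same output as decoding each value and encoding it again.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// intCoder is a hypothetical, simplified uvarint encoder.
type intCoder struct{ buf []byte }

// Add encodes each value as a uvarint.
func (c *intCoder) Add(vals ...uint64) {
	var tmp [binary.MaxVarintLen64]byte
	for _, v := range vals {
		n := binary.PutUvarint(tmp[:], v)
		c.buf = append(c.buf, tmp[:n]...)
	}
}

// AddBytes appends bytes that are already uvarint-encoded.
func (c *intCoder) AddBytes(encoded []byte) {
	c.buf = append(c.buf, encoded...)
}

// mergeByReencoding decodes every uvarint in src and encodes it again.
func mergeByReencoding(src []byte) []byte {
	var dst intCoder
	for len(src) > 0 {
		v, n := binary.Uvarint(src)
		if n <= 0 {
			break // malformed input; stop rather than loop forever
		}
		dst.Add(v)
		src = src[n:]
	}
	return dst.buf
}

// mergeByCopy skips decoding and copies the encoded bytes as-is.
func mergeByCopy(src []byte) []byte {
	var dst intCoder
	dst.AddBytes(src)
	return dst.buf
}

func main() {
	var src intCoder
	src.Add(7, 300, 1)
	a, b := mergeByReencoding(src.buf), mergeByCopy(src.buf)
	fmt.Println(string(a) == string(b), len(b))
}
```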
Steve Yen
655268bec8 scorch zap postings iterator nextDocNum() helper method
Refactored out a nextDocNum() helper method from Next() that future
optimizations can use.
2018-03-06 07:55:26 -08:00
Sreekanth Sivasankaran
fa5de8e09a making NumSnapshotsToKeep configurable 2018-03-06 16:22:11 +05:30
Steve Yen
502e64c256 scorch zap Posting doesn't use iterator field 2018-03-05 16:33:13 -08:00
Steve Yen
16174c589d
Merge pull request #799 from steveyen/scorch-optimizations
more scorch optimizations
2018-03-05 15:59:19 -08:00
Steve Yen
8f8fd511b7 scorch zap access freqs[offset] outside loop 2018-03-05 12:02:33 -08:00
Steve Yen
a338386a03 scorch build optimize freq/loc slice capacity 2018-03-05 12:02:33 -08:00
Steve Yen
856778ad7b scorch zap build prealloc docNumbers capacity 2018-03-05 12:02:33 -08:00
Steve Yen
8c0881eab2 scorch zap build reuses mem postingsList/Iterator structs 2018-03-05 12:02:33 -08:00
Steve Yen
85761c6a57 go fmt 2018-03-05 12:02:33 -08:00
Steve Yen
d44c5ad568 scorch stats MaxBatchIntroTime bug fix and more timing stats
Added timing stats for in-mem zap merging and file-based zap merging.
2018-03-05 12:02:33 -08:00
Steve Yen
c5ab1f61d7
Merge pull request #795 from steveyen/scorch-mem-optimizations
More scorch micro optimizations when processing mem segments
2018-03-05 12:02:15 -08:00
Sreekanth Sivasankaran
395b0a312d adding UTs 2018-03-05 17:02:58 +05:30
Sreekanth Sivasankaran
dec265c481 adding compaction_written_bytes/sec stats to scorch 2018-03-05 16:32:57 +05:30
Steve Yen
884da6f93a scorch optimize mem processDocument() norm calculation
This change moves the norm calculation outside of the inner loop.
2018-03-03 11:58:30 -08:00
Steve Yen
6ae799052a scorch mem optimize processDocument() stored field 2018-03-03 11:52:33 -08:00
Steve Yen
b7cfef81c9 scorch optimize mem processDocument() dict access
This change moves the dict lookup to outside of the loop.
2018-03-03 11:43:25 -08:00
Steve Yen
88c740095b scorch optimizations for mem.PostingsIterator.Next() & docTermMap
Due to the usage rules of iterators, mem.PostingsIterator.Next() can
reuse its returned Postings instance.

Also, there's a micro optimization in persistDocValues() for one fewer
access to the docTermMap in the inner-loop.
2018-03-03 11:31:18 -08:00
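The instance reuse described in 88c740095b relies on callers not holding on to a posting after advancing the iterator. A minimal sketch, using hypothetical Posting and PostingsIterator types that are far simpler than the scorch ones:

```go
package main

import "fmt"

// Posting is a hypothetical, simplified posting.
type Posting struct {
	DocNum uint64
	Freq   uint64
}

// PostingsIterator owns one Posting and overwrites it on every Next().
type PostingsIterator struct {
	docNums []uint64
	freqs   []uint64
	pos     int
	reuse   Posting // the single instance handed back to callers
}

// Next returns a pointer to the iterator's own Posting, or nil when done.
// The returned value is only valid until the following Next() call.
func (it *PostingsIterator) Next() *Posting {
	if it.pos >= len(it.docNums) {
		return nil
	}
	it.reuse = Posting{DocNum: it.docNums[it.pos], Freq: it.freqs[it.pos]}
	it.pos++
	return &it.reuse
}

func main() {
	it := &PostingsIterator{docNums: []uint64{1, 4}, freqs: []uint64{2, 9}}
	for p := it.Next(); p != nil; p = it.Next() {
		fmt.Println(p.DocNum, p.Freq)
	}
}
```

A caller that needs to keep a posting past the next call has to copy the struct, which is the trade-off for allocating nothing per iteration.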