0
0
Commit Graph

44 Commits

Author SHA1 Message Date
Marty Schoch
4e82a8a0ca
Merge pull request #726 from sreekanth-cb/docValue_configs
DocValue Config, new API Changes
2018-01-10 18:11:18 -05:00
Sreekanth Sivasankaran
53aef2104e fixing err handling in UTs, name changes 2018-01-10 22:00:26 +05:30
abhinavdangeti
43bfcc00c9 Do not account mmap'ed part of zap segments in MemoryUsed
This API is designed to only emit the dirty "unpersisted"
bytes only. This does not included the mmap'ed part in the
zap segments (disk).
2018-01-09 09:43:53 -08:00
Sreekanth Sivasankaran
4c256f5669 DocValue Config, new API Changes
-VisitableDocValueFields API for persisted DV field list
-making dv configs overridable at field level
-enabling on the fly/runtime un inverting of doc values
-few UT updates
2018-01-08 10:58:33 +05:30
Marty Schoch
c691cd2bb5 refactor scorch/zap command-line tools under bleve
zap command-line tool added to main bleve command-line tool
this required physical relocation due to the vendoring used
only on the bleve command-line tool (unforseen limitation)

a new scorch command-line tool has also been introduced
and for the same reasons it is physically store under
the top-level bleve command-line tool as well
2018-01-05 10:17:18 -05:00
Sreekanth Sivasankaran
71a726bbf6 perf issue was due to duplicate fieldIDs getting
inserted to the list of dv enabled fields list -
DocValueFields in mem segment.
Moved back to the original type `DocValueFields map[uint16]bool`
for easy look up to check whether the fieldID is
configured for dv storage.
2018-01-04 15:34:55 +05:30
Sreekanth Sivasankaran
f42ecb0ac7 docvalue "zap-path" cmd to print out the dv disk sizes 2018-01-04 13:58:51 +05:30
Sreekanth Sivasankaran
448201243a removed redundant buf writer, and checks 2017-12-30 16:54:06 +05:30
Sreekanth Sivasankaran
61ba81e964 Merge branch 'scorch', remote-tracking branch 'origin' into docValue_persisted 2017-12-30 16:52:51 +05:30
abhinavdangeti
5c26f5a86d Tracking memory consumption for a scorch index
+ Track memory usage at a segment level
+ Add a new scorch API: MemoryUsed()
    - Aggregate the memory consumption across
      segments when API is invoked.

+ TODO:
    - Revisit the second iteration if it can be gotten
      rid off, and the size accounted for during the first
      run while building an in-mem segment.
    - Accounting for pointer and slice overhead.
2017-12-29 10:20:11 -07:00
Sreekanth Sivasankaran
c8df014c0c Updated readme, zap version, added new docvalue cmd,
fixed the footer and fields cmd,
interface name updated
2017-12-29 21:39:29 +05:30
Sreekanth Sivasankaran
8abac42796 errCheck fixes 2017-12-28 13:23:57 +05:30
Sreekanth Sivasankaran
0272451093 adding checks for robustness 2017-12-28 13:05:25 +05:30
Sreekanth Sivasankaran
76f827f469 docValue persist changes
docValues are persisted along with the index,
in a columnar fashion per field with variable
sized chunking for quick look up.
-naive chunk level caching is added per field
-data part inside a chunk is snappy compressed
-metaHeader inside the chunk index the dv values
 inside the uncompressed data part
-all the fields are docValue persisted in this iteration
2017-12-28 12:05:33 +05:30
Steve Yen
67e0e5973b scorch mergeStoredAndRemap() memory reuse
In mergeStoredAndRemap(), instead of allocating new hashmaps for each
document, this commit reuses some arrays that are indexed by fieldId.
2017-12-20 15:18:22 -08:00
Steve Yen
c155255506 scorch optimize zap.Merge() to reuse some buffers 2017-12-20 14:59:53 -08:00
Steve Yen
1abbfadf0d scorch simplify err check after vellum load 2017-12-19 22:34:39 -08:00
Steve Yen
8f8333e01b scorch optimize zap Count()
This proposed approach avoids building a temporary AndNot() bitmap,
following the same kind of optimization used by mem segments.
2017-12-19 18:02:27 -08:00
Steve Yen
d0e4f85026 scorch avoid extra clone by using roaring.AndNot(x, y) 2017-12-19 13:37:04 -08:00
Steve Yen
f6b506134b import couchbase/vellum instead of couchbaselabs/vellum
Also, scrubbed an old couchbaselabs/moss reference in comments.

Also, go fmt.
2017-12-19 10:49:57 -08:00
Steve Yen
730d906a50 scorch reuses Posting instance in PostingsIterator.Next()
With this change, there are no more memory allocations in the calls to
PostingsIterator.Next() in the micro benchmarks of bleve-query.  On a
dev macbook, on an index of 50K wikipedia docs, using high frequency
search of "text:date"...

   400 qps - upsidedown/moss
   565 qps - scorch before
   680 qps - scorch after
2017-12-18 16:15:38 -08:00
Marty Schoch
b5aa4ed22b return err not panic 2017-12-14 17:41:02 -05:00
Marty Schoch
e1b0c61e2a fix bug in handling iterator-done 2017-12-13 22:08:06 -05:00
Steve Yen
c13ff85aaf scorch ref-counting
Future commits will provide actual cleanup when ref-counts reach 0.
2017-12-13 14:48:07 -08:00
Marty Schoch
a0e12b2640 add license to a few files missing it 2017-12-13 16:12:29 -05:00
Marty Schoch
85e15628ee major refactoring of posting details 2017-12-13 16:10:06 -05:00
Marty Schoch
6e2207c445 additional refactoring of build/merge 2017-12-13 15:22:13 -05:00
Marty Schoch
50441e5065 refactor to reuse shared code 2017-12-13 14:41:20 -05:00
Marty Schoch
289dc398bd more refacotring of build/merge 2017-12-13 14:26:11 -05:00
Marty Schoch
1cd3fd7fbe extrac common functionality between build/merge 2017-12-13 14:06:54 -05:00
Marty Schoch
f83c9f2a20 initial cut of merger that actually introduces changes 2017-12-13 13:41:03 -05:00
Marty Schoch
c15c3c11cd extra protection if dict address is 0 (empty segment) 2017-12-13 13:31:18 -05:00
Marty Schoch
57121e40a8 fix issues identified by errcheck 2017-12-12 11:41:14 -05:00
Marty Schoch
665c3c80ff initial cut of zap segment merging 2017-12-12 11:21:55 -05:00
Marty Schoch
927216df8c fix postings list count impl 2017-12-12 08:42:13 -05:00
Marty Schoch
58ef21a88a fix golint issue 2017-12-11 16:24:46 -05:00
Marty Schoch
f246e0e4c0 update README for zap file format changes 2017-12-11 16:22:29 -05:00
Marty Schoch
74b2eeb14d refactor where we do some work so we can return error 2017-12-11 15:59:36 -05:00
Marty Schoch
f13b786609 fix up issues to get all bleve unit tests passing for scorch
make scorch default
2017-12-11 15:47:41 -05:00
Marty Schoch
8280859bb8 handle read-only and in-mem only cases 2017-12-11 09:07:01 -05:00
Marty Schoch
e8cc7ac0bf add new fields command to zap cmd-line util 2017-12-11 09:05:50 -05:00
Marty Schoch
dc0adc8827 add fsync 2017-12-09 20:52:01 -05:00
Marty Schoch
e0d9828cd0 add more detail to the readme 2017-12-09 14:42:36 -05:00
Marty Schoch
9781d9b089 add initial version of zap file format 2017-12-09 14:28:33 -05:00