Steve Yen
d777d7c365
scorch mem segment comments consistency
2018-01-15 11:08:21 -08:00
Marty Schoch
4e82a8a0ca
Merge pull request #726 from sreekanth-cb/docValue_configs
...
DocValue Config, new API Changes
2018-01-10 18:11:18 -05:00
Sreekanth Sivasankaran
53aef2104e
fixing err handling in UTs, name changes
2018-01-10 22:00:26 +05:30
abhinavdangeti
43bfcc00c9
Do not account mmap'ed part of zap segments in MemoryUsed
...
This API is designed to emit only the dirty, "unpersisted"
bytes. It does not include the mmap'ed part of the
zap segments (disk).
2018-01-09 09:43:53 -08:00
Sreekanth Sivasankaran
4c256f5669
DocValue Config, new API Changes
...
-VisitableDocValueFields API for persisted DV field list
-making dv configs overridable at field level
-enabling on the fly/runtime un inverting of doc values
-few UT updates
2018-01-08 10:58:33 +05:30
Marty Schoch
c691cd2bb5
refactor scorch/zap command-line tools under bleve
...
zap command-line tool added to main bleve command-line tool
this required physical relocation due to the vendoring used
only on the bleve command-line tool (unforeseen limitation)
a new scorch command-line tool has also been introduced
and for the same reasons it is physically stored under
the top-level bleve command-line tool as well
2018-01-05 10:17:18 -05:00
Sreekanth Sivasankaran
71a726bbf6
perf issue was due to duplicate fieldIDs getting
...
inserted into the list of dv enabled fields -
DocValueFields in mem segment.
Moved back to the original type `DocValueFields map[uint16]bool`
for easy look up to check whether the fieldID is
configured for dv storage.
2018-01-04 15:34:55 +05:30
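A minimal sketch of why a map-backed set avoids the duplicate problem described in the commit above. Only the type `DocValueFields map[uint16]bool` comes from the commit message; `markSlice` and the sample field IDs are hypothetical and are not taken from the bleve source.

```go
package main

import "fmt"

// DocValueFields mirrors the type named in the commit message: a set of
// field IDs. Everything else in this sketch is hypothetical.
type DocValueFields map[uint16]bool

// markSlice is the duplicate-prone alternative: every call appends,
// even when the field ID was already recorded.
func markSlice(fields []uint16, fieldID uint16) []uint16 {
	return append(fields, fieldID)
}

func main() {
	// The same field IDs are seen repeatedly, once per document.
	seen := []uint16{1, 2, 1, 1, 2}

	var asSlice []uint16
	asMap := DocValueFields{}
	for _, id := range seen {
		asSlice = markSlice(asSlice, id)
		asMap[id] = true // setting an existing key does not grow the map
	}

	fmt.Println("slice entries:", len(asSlice))
	fmt.Println("map entries:", len(asMap))
	// A missing key yields the zero value false, so a lookup is one expression.
	fmt.Println("field 2 enabled:", asMap[2], "field 7 enabled:", asMap[7])
}
```

The slice grows with every document, while the map stays at one entry per distinct field ID and doubles as a membership check.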
Sreekanth Sivasankaran
f42ecb0ac7
docvalue "zap-path" cmd to print out the dv disk sizes
2018-01-04 13:58:51 +05:30
Sreekanth Sivasankaran
448201243a
removed redundant buf writer, and checks
2017-12-30 16:54:06 +05:30
Sreekanth Sivasankaran
61ba81e964
Merge branch 'scorch', remote-tracking branch 'origin' into docValue_persisted
2017-12-30 16:52:51 +05:30
abhinavdangeti
5c26f5a86d
Tracking memory consumption for a scorch index
...
+ Track memory usage at a segment level
+ Add a new scorch API: MemoryUsed()
- Aggregate the memory consumption across
segments when API is invoked.
+ TODO:
- Revisit whether the second iteration can be gotten
rid of, with the size accounted for during the first
run while building an in-mem segment.
- Accounting for pointer and slice overhead.
2017-12-29 10:20:11 -07:00
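A rough sketch of the aggregation described in the commit above: each segment reports its own size, and the index-level call sums those sizes on demand. Only the name `MemoryUsed` appears in the commit message; the `segment` interface, `SizeInBytes`, and `memSegment` are assumptions made for illustration and may not match the scorch code.

```go
package main

import "fmt"

// segment is a hypothetical interface: anything that can report its own size.
type segment interface {
	SizeInBytes() uint64
}

// memSegment is a hypothetical in-memory segment holding only terms.
type memSegment struct {
	terms []string
}

// SizeInBytes counts the bytes of the term data only. Pointer and slice
// overhead is ignored, which the commit lists as a TODO.
func (m memSegment) SizeInBytes() uint64 {
	var n uint64
	for _, t := range m.terms {
		n += uint64(len(t))
	}
	return n
}

// MemoryUsed sums the per-segment sizes at the moment it is called.
func MemoryUsed(segs []segment) uint64 {
	var total uint64
	for _, s := range segs {
		total += s.SizeInBytes()
	}
	return total
}

func main() {
	segs := []segment{
		memSegment{terms: []string{"a", "bc"}},
		memSegment{terms: []string{"def"}},
	}
	fmt.Println("bytes tracked:", MemoryUsed(segs))
}
```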
Sreekanth Sivasankaran
c8df014c0c
Updated readme, zap version, added new docvalue cmd,
...
fixed the footer and fields cmd,
interface name updated
2017-12-29 21:39:29 +05:30
Sreekanth Sivasankaran
8abac42796
errCheck fixes
2017-12-28 13:23:57 +05:30
Sreekanth Sivasankaran
0272451093
adding checks for robustness
2017-12-28 13:05:25 +05:30
Sreekanth Sivasankaran
76f827f469
docValue persist changes
...
docValues are persisted along with the index,
in a columnar fashion per field with variable
sized chunking for quick look up.
-naive chunk level caching is added per field
-data part inside a chunk is snappy compressed
-metaHeader inside the chunk indexes the dv values
inside the uncompressed data part
-all the fields are docValue persisted in this iteration
2017-12-28 12:05:33 +05:30
Steve Yen
67e0e5973b
scorch mergeStoredAndRemap() memory reuse
...
In mergeStoredAndRemap(), instead of allocating new hashmaps for each
document, this commit reuses some arrays that are indexed by fieldId.
2017-12-20 15:18:22 -08:00
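The reuse pattern described in the commit above can be sketched as follows. This is not the actual `mergeStoredAndRemap()` code; `storedField`, `collect`, and the `visit` callback are hypothetical names. The idea shown is a slice indexed by field ID that is allocated once and truncated for each document, in place of a fresh map per document.

```go
package main

import "fmt"

// storedField is a hypothetical stand-in for one stored value of a document.
type storedField struct {
	fieldID uint16
	value   string
}

// collect groups each document's values by field ID. The vals slice is
// allocated once and truncated per document, so its backing arrays are
// reused. The visit callback must not retain vals after it returns.
func collect(docs [][]storedField, numFields int, visit func(doc int, vals [][]string)) {
	vals := make([][]string, numFields)
	for i, doc := range docs {
		for f := range vals {
			vals[f] = vals[f][:0] // keep capacity, drop the previous document's values
		}
		for _, sf := range doc {
			vals[sf.fieldID] = append(vals[sf.fieldID], sf.value)
		}
		visit(i, vals)
	}
}

func main() {
	docs := [][]storedField{
		{{0, "doc-a"}, {1, "x"}, {1, "y"}},
		{{1, "z"}},
	}
	collect(docs, 2, func(doc int, vals [][]string) {
		fmt.Println("doc", doc, "field 0:", len(vals[0]), "field 1:", len(vals[1]))
	})
}
```

The trade-off is that callers only see valid data for the duration of the callback, which is the usual price of buffer reuse.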
Steve Yen
c155255506
scorch optimize zap.Merge() to reuse some buffers
2017-12-20 14:59:53 -08:00
Steve Yen
1abbfadf0d
scorch simplify err check after vellum load
2017-12-19 22:34:39 -08:00
Steve Yen
8f8333e01b
scorch optimize zap Count()
...
This proposed approach avoids building a temporary AndNot() bitmap,
following the same kind of optimization used by mem segments.
2017-12-19 18:02:27 -08:00
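One way to count the postings that survive an exclusion set without materializing the difference is to subtract the size of the intersection from the size of the postings set. The sketch below illustrates that identity with a plain `[]uint64` bitset and the standard `math/bits` package. It is an assumption about the general technique, not the code from this commit, and `bitset`, `countWithTemp`, and `countDirect` are hypothetical names.

```go
package main

import (
	"fmt"
	"math/bits"
)

// bitset is a hypothetical uncompressed bitmap standing in for a
// compressed bitmap type.
type bitset []uint64

// wordAt returns word i of b, or 0 when b is shorter than i+1 words.
func wordAt(b bitset, i int) uint64 {
	if i < len(b) {
		return b[i]
	}
	return 0
}

// countWithTemp builds the AndNot result first and then counts it.
func countWithTemp(postings, except bitset) int {
	tmp := make(bitset, len(postings)) // temporary allocation
	for i, w := range postings {
		tmp[i] = w &^ wordAt(except, i)
	}
	n := 0
	for _, w := range tmp {
		n += bits.OnesCount64(w)
	}
	return n
}

// countDirect uses |postings \ except| = |postings| - |postings AND except|
// word by word, so no temporary bitmap is allocated.
func countDirect(postings, except bitset) int {
	n := 0
	for i, w := range postings {
		n += bits.OnesCount64(w) - bits.OnesCount64(w&wordAt(except, i))
	}
	return n
}

func main() {
	postings := bitset{1<<1 | 1<<3 | 1<<5, 1 << 2} // docs 1, 3, 5, 66
	except := bitset{1 << 3, 1 << 2}               // docs 3, 66 excluded
	fmt.Println(countWithTemp(postings, except) == countDirect(postings, except))
}
```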
Steve Yen
d0e4f85026
scorch avoid extra clone by using roaring.AndNot(x, y)
2017-12-19 13:37:04 -08:00
Steve Yen
f6b506134b
import couchbase/vellum instead of couchbaselabs/vellum
...
Also, scrubbed an old couchbaselabs/moss reference in comments.
Also, go fmt.
2017-12-19 10:49:57 -08:00
Steve Yen
730d906a50
scorch reuses Posting instance in PostingsIterator.Next()
...
With this change, there are no more memory allocations in the calls to
PostingsIterator.Next() in the micro benchmarks of bleve-query. On a
dev macbook, on an index of 50K wikipedia docs, using high frequency
search of "text:date"...
400 qps - upsidedown/moss
565 qps - scorch before
680 qps - scorch after
2017-12-18 16:15:38 -08:00
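A minimal sketch of the instance-reuse idea behind the commit above. The names `Posting` and `PostingsIterator.Next()` come from the commit message; the fields and the slice-backed iterator are hypothetical simplifications and do not reflect the real scorch types.

```go
package main

import "fmt"

// Posting is a simplified stand-in holding only a document number.
type Posting struct {
	DocNum uint64
}

// PostingsIterator walks a hypothetical in-memory list of document numbers.
type PostingsIterator struct {
	docNums []uint64
	pos     int
	next    Posting // reused on every call to Next
}

// Next returns a pointer to the iterator-owned Posting, or nil when the
// iterator is exhausted. The returned value is only valid until the
// following call to Next, because the same instance is overwritten.
func (it *PostingsIterator) Next() *Posting {
	if it.pos >= len(it.docNums) {
		return nil
	}
	it.next = Posting{DocNum: it.docNums[it.pos]}
	it.pos++
	return &it.next
}

func main() {
	it := &PostingsIterator{docNums: []uint64{3, 5, 8}}
	for p := it.Next(); p != nil; p = it.Next() {
		fmt.Println("doc", p.DocNum)
	}
}
```

Callers that need to keep a posting beyond the next call have to copy it, which is the cost of removing the per-call allocation.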
Marty Schoch
b5aa4ed22b
return err not panic
2017-12-14 17:41:02 -05:00
Marty Schoch
e1b0c61e2a
fix bug in handling iterator-done
2017-12-13 22:08:06 -05:00
Steve Yen
c13ff85aaf
scorch ref-counting
...
Future commits will provide actual cleanup when ref-counts reach 0.
2017-12-13 14:48:07 -08:00
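A small sketch of reference counting with a cleanup hook, using the standard `sync/atomic` package. The commit above only introduces the counts and defers cleanup to later commits, so the `onZero` callback here is an assumption about where such cleanup could attach. `refCounted`, `AddRef`, and `DecRef` are hypothetical names chosen for this example.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// refCounted is a hypothetical holder for a shared resource such as a segment.
type refCounted struct {
	refs   int64
	onZero func() // cleanup to run when the last reference is released
}

// newRefCounted starts with one reference owned by the creator.
func newRefCounted(onZero func()) *refCounted {
	return &refCounted{refs: 1, onZero: onZero}
}

// AddRef records one more holder of the resource.
func (r *refCounted) AddRef() {
	atomic.AddInt64(&r.refs, 1)
}

// DecRef releases one reference and runs the cleanup hook exactly once,
// when the count reaches zero. It reports whether cleanup ran.
func (r *refCounted) DecRef() bool {
	if atomic.AddInt64(&r.refs, -1) == 0 {
		if r.onZero != nil {
			r.onZero()
		}
		return true
	}
	return false
}

func main() {
	seg := newRefCounted(func() { fmt.Println("cleanup") })
	seg.AddRef()              // a reader takes a reference
	fmt.Println(seg.DecRef()) // reader done, creator still holds one
	fmt.Println(seg.DecRef()) // last reference released
}
```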
Marty Schoch
a0e12b2640
add license to a few files missing it
2017-12-13 16:12:29 -05:00
Marty Schoch
85e15628ee
major refactoring of posting details
2017-12-13 16:10:06 -05:00
Marty Schoch
6e2207c445
additional refactoring of build/merge
2017-12-13 15:22:13 -05:00
Marty Schoch
50441e5065
refactor to reuse shared code
2017-12-13 14:41:20 -05:00
Marty Schoch
289dc398bd
more refactoring of build/merge
2017-12-13 14:26:11 -05:00
Marty Schoch
1cd3fd7fbe
extract common functionality between build/merge
2017-12-13 14:06:54 -05:00
Marty Schoch
f83c9f2a20
initial cut of merger that actually introduces changes
2017-12-13 13:41:03 -05:00
Marty Schoch
c15c3c11cd
extra protection if dict address is 0 (empty segment)
2017-12-13 13:31:18 -05:00
Marty Schoch
57121e40a8
fix issues identified by errcheck
2017-12-12 11:41:14 -05:00
Marty Schoch
665c3c80ff
initial cut of zap segment merging
2017-12-12 11:21:55 -05:00
Marty Schoch
927216df8c
fix postings list count impl
2017-12-12 08:42:13 -05:00
Marty Schoch
58ef21a88a
fix golint issue
2017-12-11 16:24:46 -05:00
Marty Schoch
f246e0e4c0
update README for zap file format changes
2017-12-11 16:22:29 -05:00
Marty Schoch
74b2eeb14d
refactor where we do some work so we can return error
2017-12-11 15:59:36 -05:00
Marty Schoch
f13b786609
fix up issues to get all bleve unit tests passing for scorch
...
make scorch default
2017-12-11 15:47:41 -05:00
Marty Schoch
d7eb223e14
remove bolt segment format
...
upcoming breaking changes and no desire to maintain
2017-12-11 10:20:26 -05:00
Marty Schoch
8280859bb8
handle read-only and in-mem only cases
2017-12-11 09:07:01 -05:00
Marty Schoch
e8cc7ac0bf
add new fields command to zap cmd-line util
2017-12-11 09:05:50 -05:00
Marty Schoch
dc0adc8827
add fsync
2017-12-09 20:52:01 -05:00
Marty Schoch
e0d9828cd0
add more detail to the readme
2017-12-09 14:42:36 -05:00
Marty Schoch
9781d9b089
add initial version of zap file format
2017-12-09 14:28:33 -05:00
Marty Schoch
ff2e6b98e4
added empty segment
2017-12-09 12:43:02 -05:00
Marty Schoch
adac4f41db
initial version of scorch which persists index to disk
2017-12-06 18:33:47 -05:00
Marty Schoch
b1346b4c8a
add readme describing our use of bolt as a segment format
2017-12-05 16:09:00 -05:00
Marty Schoch
898a6b1e85
fix errcheck issues
2017-12-05 13:32:57 -05:00
Marty Schoch
ece27ef215
adding initial version of bolt persisted segment
2017-12-05 13:05:12 -05:00
Marty Schoch
f6be841668
add test for postings list count method
2017-12-05 13:01:36 -05:00
Marty Schoch
30e9d6daa5
add better testing of array positions
2017-12-05 12:54:44 -05:00
Marty Schoch
8d9d45115f
add test of location field
2017-12-05 12:20:06 -05:00
Marty Schoch
8f0350865b
add test for segment fields method
2017-12-05 12:17:56 -05:00
Marty Schoch
7a6b5483f2
add validation that all locations were seen
2017-12-05 11:58:05 -05:00
Marty Schoch
e08fdab54a
remove todo item
2017-12-05 10:13:27 -05:00
Marty Schoch
87e2627551
added dictionary tests to mem segment
2017-12-05 09:49:41 -05:00
Marty Schoch
ed067f45dd
added Close() method to Segment
2017-12-05 09:31:02 -05:00
Marty Schoch
22ffc8940e
update segment API to return error in key places
2017-12-04 18:06:06 -05:00
Marty Schoch
b74cf4b081
add copyright header to all new files in scorch
2017-12-01 15:42:50 -05:00
Marty Schoch
89aa02cf5b
fix highlighting of composite fields
...
updated log statements for refactored names
2017-12-01 15:12:08 -05:00
Marty Schoch
cff14f1212
fix crash in DocNumbers when segment is empty
2017-12-01 09:50:27 -05:00
Marty Schoch
eb256f78bc
switch to constant referring to id field id 0
...
this avoids potentially mutating something that is intended
to be immutable
2017-12-01 09:30:07 -05:00
Marty Schoch
395458ce83
refactor to make mem segment contents exported
2017-12-01 07:26:47 -05:00
Marty Schoch
848aca4639
fix issues identified by errcheck
2017-11-29 13:34:15 -05:00
Marty Schoch
23f6dc1cc6
working in-memory version
2017-11-29 11:33:35 -05:00