0
0
Commit Graph

125 Commits

Author SHA1 Message Date
Patrick Mezard
e85c9c542e row: expose TermFrequencyRow term and freq fields
Rows content is an implementation detail of bleve index and may change
in the future. That said, they also contains information valuable to
assess the quality of the index or understand its performances. So, as
long as we agree that type asserting rows should only be done if you
know what you are doing and are ready to deal with future changes, I see
no reason to hide the row fields from external packages.

Fix #268
2015-11-17 17:21:26 +01:00
Marty Schoch
30651065e9 fix panic on insufficiently sized buffer
adds test case to reproduce original problem
fixes #264
2015-10-30 18:25:38 -04:00
Marty Schoch
2bd3ef4080 copy relevant k/v pairs before advancing underlying iterator 2015-10-28 12:23:54 -04:00
Marty Schoch
d1b07f4909 fix dump methods to properly copy keys and values 2015-10-28 12:06:44 -04:00
Marty Schoch
6cc21346dc fix errcheck issues 2015-10-19 14:27:03 -04:00
Marty Schoch
817c317c90 Merge branch 'master' into newkvstore 2015-10-19 12:04:07 -04:00
Marty Schoch
faceecf87b make row buffer size constant/configurable
also handle case where it is insufficiently sized
2015-10-19 12:03:38 -04:00
Marty Schoch
c9471d5739 Merge pull request #244 from kevgs/master
reducing allocation count
2015-10-16 15:51:30 -04:00
Marty Schoch
e6d0fc8d95 Merge pull request #247 from pmezard/remove-update-goroutine
upside_down: no need for a goroutine to enqueue AnalysisWork
2015-10-16 10:15:55 -04:00
Marty Schoch
4c6bc23043 rewrite to keep using same buffer when possible 2015-10-13 14:04:56 -07:00
Marty Schoch
8de860bf12 2 more places that used old Key() 2015-10-13 12:35:08 -07:00
Marty Schoch
5f594d1acc Merge branch 'master' into newkvstore 2015-10-12 18:07:04 -07:00
Marty Schoch
08572e4925 move literals outside loop for more predicatble test results 2015-10-12 18:06:38 -07:00
Patrick Mezard
8c928539ee upside_down: no need for a goroutine to enqueue AnalysisWork
It boils down to:
1. client sends some work and a notification channel to a single worker,
   then waits.
2. worker processes the work
3. worker sends the result to the client using the notification channel

I do not see any problem with this, even with unbuffered channels.
2015-10-12 10:42:14 +02:00
Marty Schoch
95e06538f3 fix benchmarks for the x kvstores 2015-10-09 11:09:42 -04:00
Marty Schoch
0f05d1d3ca Merge branch 'master' into newkvstore 2015-10-09 10:33:41 -04:00
Patrick Mezard
aee82f8b49 upside_down: simplify return code in batchRows() 2015-10-09 09:57:12 +02:00
Marty Schoch
e28eb749d7 bump up buffer size 2015-10-06 16:45:38 -04:00
Marty Schoch
71cbb13e07 modify code to reuse buffer for kv generation 2015-10-05 17:49:50 -04:00
Kosov Eugene
a61c350888 reducing allocation count 2015-10-05 22:57:10 +03:00
Marty Schoch
d06b526cbf more refactoring 2015-09-28 16:50:27 -04:00
Marty Schoch
900f1b4a67 major kvstore interface and impl overhaul
clarified the interface contract
2015-09-23 11:25:47 -07:00
Marty Schoch
f81b2be334 major refactor of bleve configuration
see #221 for full details
2015-09-16 17:10:59 -04:00
Marty Schoch
17c64d37c7 add similar benchmarks from firestorm 2015-09-10 08:13:52 -04:00
Marty Schoch
dbb93b75a4 refactoring to allow pluggable index encodings
this lays the foundation for supporting the new firestorm
indexing scheme.  i'm merging these changes ahead of
the rest of the firestorm branch so i can continue
to make changes to the analysis pipeline in parallel
2015-09-02 13:12:08 -04:00
Marty Schoch
3e60ca24ec support using end key on forestdb iterator for term freq lookup
also additoanl forestdb configs
2015-08-18 16:22:02 -04:00
Marty Schoch
01667dfff3 faster protobufs with gogo 2015-08-12 13:18:23 -04:00
Marty Schoch
7df66b4857 fix broken benchmark cause by index row encoding change 2015-08-06 14:48:04 -04:00
Marty Schoch
9db850a53e Merge branch 'fix/MaxVarintLen64' of https://github.com/tukdesk/bleve into tukdesk-fix/MaxVarintLen64 2015-07-31 15:16:16 -04:00
Marty Schoch
3682c25467 update to correctly work with composite fields
also updated search results to return array positions
2015-07-31 11:16:11 -04:00
Marty Schoch
c1c4941dde Merge branch 'feature/term_vector' of https://github.com/tukdesk/bleve into tukdesk-feature/term_vector 2015-07-29 14:31:15 -04:00
Marty Schoch
1b28f6218b additional row validation 2015-07-13 15:22:54 -04:00
Marty Schoch
7be7ecdf8e fix batch indexing bug, incremented docCount before commit
fixes #211
2015-06-08 14:14:05 -04:00
Marty Schoch
2768c2da3c fix previous sloppy fix which hadn't been adequately tested 2015-05-27 19:15:55 -07:00
Marty Schoch
201fb91171 fix up to correctly trim off separator
even though it should never be present
2015-05-27 19:10:12 -07:00
Marty Schoch
a58592ceff fix case where NewBackIndexRowKV returns nil, nil
the logic for reading the docID from the keys
in this row relies on the keys NEVER containing
the byte separator character (0xff), this is OK
as we require that all keys be valid utf-8
however, it turns out that in the case where this
rule was violated, we would panic, because we
return nil, nil and later try to print the doc id
2015-05-27 19:04:57 -07:00
dtynn
59c97ae577 use binary.MaxVarintLen64 2015-05-26 15:35:31 +08:00
Marty Schoch
e0887f9113 fix tests which deadlock boltdb due to deferred cleanup
fixes #209
2015-05-21 12:29:31 -04:00
dtynn
b4f7496031 update the index format version number 2015-05-18 15:16:35 +08:00
dtynn
89dc2c22bc update TermVector 2015-05-17 13:07:14 +08:00
Marty Schoch
8f70def63b properly use the stored array positions when loading a document
fixes #205
2015-05-15 15:47:54 -04:00
Marty Schoch
328bc73ed0 clarify Batch is not threadsafe in docs
in some limited cases we can detect unsafe usage
in these cases, do not trip over ourselves and panic
instead return a strongly typed error upside_down.UnsafeBatchUseDetected
also, introduced Batch.Reset() to allow batch reuse
this is currently still experimental
closes #195
2015-05-15 15:04:52 -04:00
Marty Schoch
57cd67fa88 fix data race on index metadata (docCount)
closes #198
2015-05-08 08:07:20 -04:00
Marty Schoch
57358088ec fix row merging bug
trying to be clever, we reused the memory allocated for the left
operand when doing partial merges
this had been tested to be safe, in general.  however, the
implementation was then written such that we always reused
globally defined operands, this meant that we mutated
the operands which were intended to always represent
+1/-1
this then cascades quickly to making increment/decrement
values much larger/smaller than they should be
related to #197
2015-05-06 11:00:04 -04:00
Marty Schoch
30a0ba1f9b fix bug, dictionary row encoding buffer too small
we incorrectly created a []byte of length 8
but the max for a uvarint is 10
closes #197
2015-05-06 10:04:02 -04:00
Marty Schoch
ee47d1c21a standardize on including 1000 sized batches 2015-04-24 17:31:34 -04:00
Marty Schoch
452fea6a24 adding initial impl of rocksdb kv store 2015-04-24 17:19:44 -04:00
Marty Schoch
a9c07acbfa refactor of kvstore api to support native merge in rocksdb
refactor to share code in emulated batch
refactor to share code in emulated merge
refactor index kvstore benchmarks to share more code
refactor index kvstore benchmarks to be more repeatable
2015-04-24 17:13:50 -04:00
Marty Schoch
d5dc66313f change variable name conflicting when both LevelDB bencharmks run 2015-04-10 15:03:44 -04:00
Marty Schoch
d5caad4405 changed GoLevelDB benchmark names to be different from LevelDB
this will allow for easier comparision when running both
versions at the same time
2015-04-10 15:00:56 -04:00