Commit Graph

757 Commits

Author SHA1 Message Date
Marty Schoch
f0d282f5f8 add test case for seeing prefix iterators outside of range
similar to #256 except for prefix iterators
includes fix for boltdb and gtreap which had incorrect behavior
2015-10-26 16:14:29 -04:00
Marty Schoch
0ba164322b Merge pull request #261 from pmezard/improve-bleve-dump
bleve_dump: improve online help and error handling
2015-10-26 11:04:55 -04:00
Patrick Mezard
bc977048f7 bleve_dump: improve online help and error handling
I was expecting:

  $ bleve_dump -index /index -docID 123 -fields html

to print only the terms for "html" field in specified document. The
command now detects extra arguments and flag collisions.
2015-10-26 10:37:13 +01:00
Marty Schoch
f649998490 Merge pull request #257 from pmezard/pretty-print-queries
query: add DumpQuery to expand string query and format them as JSON
2015-10-23 09:28:36 -04:00
Patrick Mezard
c9619f0359 query: add DumpQuery to expand string query and format them as JSON
This is convenient to see either complicated queries built
programmatically, or to make sure the query parser does what it is
expected to do.

Note only queries made of bleve basic queries are supported. If we
wanted to support external queries, for instance string queries with an
alternative parser, I suggest to introduce some kind of:

type ExpandableQuery interface {
    Query
    Expand(*IndexMapping) (Query, error)
}

and type assert to that instead of *queryStringQuery.
2015-10-23 14:52:42 +02:00
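The ExpandableQuery idea floated in c9619f0359 might look roughly like the sketch below. Only the interface shape comes from the commit message. Query and IndexMapping here are simplified stand-ins rather than bleve's real types, and termQuery, conjunctionQuery, stringQuery and expand are hypothetical names invented for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// Query and IndexMapping are simplified stand-ins (assumptions),
// not the real bleve types.
type Query interface {
	String() string
}

type IndexMapping struct{}

// ExpandableQuery has the shape proposed in the commit message.
type ExpandableQuery interface {
	Query
	Expand(*IndexMapping) (Query, error)
}

// termQuery is a hypothetical basic query that needs no expansion.
type termQuery struct{ term string }

func (q termQuery) String() string { return "term:" + q.term }

// conjunctionQuery is a hypothetical basic query combining sub-queries.
type conjunctionQuery struct{ parts []Query }

func (q conjunctionQuery) String() string {
	s := make([]string, len(q.parts))
	for i, p := range q.parts {
		s[i] = p.String()
	}
	return "and(" + strings.Join(s, ", ") + ")"
}

// stringQuery is a hypothetical external query with its own tiny "parser":
// it splits on whitespace and requires every word.
type stringQuery struct{ text string }

func (q stringQuery) String() string { return "string:" + q.text }

func (q stringQuery) Expand(*IndexMapping) (Query, error) {
	fields := strings.Fields(q.text)
	parts := make([]Query, 0, len(fields))
	for _, f := range fields {
		parts = append(parts, termQuery{term: f})
	}
	return conjunctionQuery{parts: parts}, nil
}

// expand type-asserts to the interface instead of a concrete query type,
// so any query that implements Expand is handled the same way.
func expand(q Query, m *IndexMapping) (Query, error) {
	if eq, ok := q.(ExpandableQuery); ok {
		return eq.Expand(m)
	}
	return q, nil
}

func main() {
	q, err := expand(stringQuery{text: "hello world"}, &IndexMapping{})
	if err != nil {
		panic(err)
	}
	fmt.Println(q)
}
```

Asserting to an interface keeps the dumping code unaware of which parser produced the query, which is the point the commit message makes about external queries.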
Marty Schoch
89bc8c3a93 Merge pull request #253 from pmezard/document-index-interfaces
doc: document DocIDReader, and some Index bits
2015-10-20 14:47:35 -04:00
Patrick Mezard
5100e00f20 doc: DocIDReader.Advance() is no longer implementation dependent 2015-10-20 20:32:23 +02:00
Patrick Mezard
2fa334fc27 doc: talk about "documents" not "indexed or stored documents" 2015-10-20 20:24:24 +02:00
Patrick Mezard
b174c137fd doc: document DocIDReader, and some Index bits 2015-10-20 20:24:24 +02:00
Marty Schoch
74780b028e Merge pull request #256 from pmezard/fix-rangeiterator-seek
Fix rangeiterator seek
2015-10-20 14:08:57 -04:00
Patrick Mezard
da72d0c2b9 store_test: deduplicate store initialization 2015-10-20 19:21:01 +02:00
Patrick Mezard
873f483804 gtreap: RangeIterator.Seek should not move before start 2015-10-20 19:12:30 +02:00
Patrick Mezard
5d7628ba3b boltdb: fix RangeIterator outside of range seeks
Two issues:
- Seeking before i.start and iterating returned keys before i.start
- Seeking after the store last key did not invalidate the iterator and
  could cause infinite loops.
2015-10-20 19:09:51 +02:00
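The two behaviours fixed in 5d7628ba3b can be illustrated with a small in-memory iterator. This is a sketch only and not the boltdb store code: rangeIterator, newRangeIterator and collect are hypothetical names, the keys live in a sorted slice, and the range is assumed to be half-open, [start, end).

```go
package main

import (
	"fmt"
	"sort"
)

// rangeIterator walks a sorted key slice restricted to [start, end).
// It is a hypothetical stand-in for a KV store range iterator.
type rangeIterator struct {
	keys  []string // must be sorted ascending
	start string
	end   string
	pos   int
}

func newRangeIterator(keys []string, start, end string) *rangeIterator {
	it := &rangeIterator{keys: keys, start: start, end: end}
	it.Seek(start)
	return it
}

// Seek positions the iterator at the first key >= key, but never before
// start: a seek below the range is clamped to the start of the range.
func (it *rangeIterator) Seek(key string) {
	if key < it.start {
		key = it.start
	}
	it.pos = sort.SearchStrings(it.keys, key)
}

// Valid reports false once the position is past the last key or at/after
// end, so a seek beyond the last key invalidates the iterator.
func (it *rangeIterator) Valid() bool {
	return it.pos < len(it.keys) && it.keys[it.pos] < it.end
}

func (it *rangeIterator) Next() { it.pos++ }

// Key must only be called while Valid() is true.
func (it *rangeIterator) Key() string { return it.keys[it.pos] }

// collect drains the iterator from its current position.
func collect(it *rangeIterator) []string {
	var out []string
	for ; it.Valid(); it.Next() {
		out = append(out, it.Key())
	}
	return out
}

func main() {
	keys := []string{"a", "b", "c", "d", "e"}
	it := newRangeIterator(keys, "b", "e")

	it.Seek("a") // before start: clamped to "b"
	fmt.Println(collect(it))

	it.Seek("z") // after the last key: iterator becomes invalid
	fmt.Println(it.Valid())
}
```

Because Valid() is the loop condition, an invalidated iterator ends the loop immediately instead of spinning, which is the infinite-loop case the commit message describes.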
Patrick Mezard
aada2e7333 store_test: test RangeIterator.Seek on goleveldb 2015-10-20 19:09:38 +02:00
Marty Schoch
6cc21346dc fix errcheck issues 2015-10-19 14:27:03 -04:00
Marty Schoch
817c317c90 Merge branch 'master' into newkvstore 2015-10-19 12:04:07 -04:00
Marty Schoch
faceecf87b make row buffer size constant/configurable
also handle case where it is insufficiently sized
2015-10-19 12:03:38 -04:00
Marty Schoch
f0ee9a3c66 removed commented code and unused functions 2015-10-19 11:13:03 -04:00
Marty Schoch
c9471d5739 Merge pull request #244 from kevgs/master
reducing allocation count
2015-10-16 15:51:30 -04:00
Marty Schoch
390781b379 Merge branch 'Shugyousha-simplebleveindex' 2015-10-16 13:11:51 -04:00
Marty Schoch
9528e09b1c rearrange code to avoid global variable rv
also check possible error returned by filepath.Walk
2015-10-16 13:10:43 -04:00
Marty Schoch
5c26f96606 Merge branch 'simplebleveindex' of https://github.com/Shugyousha/bleve into Shugyousha-simplebleveindex 2015-10-16 13:09:05 -04:00
Marty Schoch
e6d0fc8d95 Merge pull request #247 from pmezard/remove-update-goroutine
upside_down: no need for a goroutine to enqueue AnalysisWork
2015-10-16 10:15:55 -04:00
Marty Schoch
52f0112b0f allow bleve_dump to process files wrapped in metrics kv store 2015-10-14 12:46:55 -07:00
Silvan Jegen
0ef988a318 Use filepath.Walk instead of rolling our own func 2015-10-14 16:44:00 +02:00
Marty Schoch
4c6bc23043 rewrite to keep using same buffer when possible 2015-10-13 14:04:56 -07:00
Marty Schoch
8de860bf12 2 more places that used old Key() 2015-10-13 12:35:08 -07:00
Marty Schoch
5f594d1acc Merge branch 'master' into newkvstore 2015-10-12 18:07:04 -07:00
Marty Schoch
08572e4925 move literals outside loop for more predictable test results 2015-10-12 18:06:38 -07:00
Marty Schoch
22e5bab8ff Merge branch 'master' into newkvstore 2015-10-12 18:02:11 -07:00
Marty Schoch
f43fa4294a simplify prefix coding
based on discussion here:
https://github.com/blevesearch/blevesearch.github.io-hugo/pull/2
2015-10-12 14:53:17 -07:00
Patrick Mezard
8c928539ee upside_down: no need for a goroutine to enqueue AnalysisWork
It boils down to:
1. client sends some work and a notification channel to a single worker,
   then waits.
2. worker processes the work
3. worker sends the result to the client using the notification channel

I do not see any problem with this, even with unbuffered channels.
2015-10-12 10:42:14 +02:00
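The three steps listed in 8c928539ee can be sketched with plain unbuffered channels. This is an illustration, not the upside_down code: analysisWork, analysisResult, worker and analyze are hypothetical names, and counting whitespace-separated words stands in for real analysis.

```go
package main

import (
	"fmt"
	"strings"
)

// analysisResult is a hypothetical result type.
type analysisResult struct {
	doc    string
	tokens int
}

// analysisWork carries the work plus the channel used to notify the client.
type analysisWork struct {
	doc string
	rc  chan analysisResult
}

// worker processes each piece of work (step 2) and sends the result back
// on the notification channel supplied with it (step 3).
func worker(queue <-chan analysisWork) {
	for w := range queue {
		w.rc <- analysisResult{doc: w.doc, tokens: len(strings.Fields(w.doc))}
	}
}

// analyze sends the work directly from the calling goroutine and then
// waits for the result (step 1). No helper goroutine is needed to enqueue:
// the send blocks until the worker receives, and the worker's reply blocks
// until this function receives, so neither side can get stuck.
func analyze(queue chan<- analysisWork, doc string) analysisResult {
	rc := make(chan analysisResult) // unbuffered
	queue <- analysisWork{doc: doc, rc: rc}
	return <-rc
}

func main() {
	queue := make(chan analysisWork) // unbuffered
	go worker(queue)

	res := analyze(queue, "a quick brown fox")
	fmt.Println(res.tokens)

	close(queue) // lets the worker's range loop end
}
```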
Marty Schoch
95e06538f3 fix benchmarks for the x kvstores 2015-10-09 11:09:42 -04:00
Marty Schoch
0f05d1d3ca Merge branch 'master' into newkvstore 2015-10-09 10:33:41 -04:00
Marty Schoch
9b16fa6528 Merge pull request #245 from pmezard/minor-doc-and-code-changes
Minor doc and code changes
2015-10-09 09:27:13 -04:00
Patrick Mezard
e2fa3d6351 doc: document Token, TokenFrequencies and Field structs
It helps understanding what is going on in indexing code.
ArrayPositions() was particularly puzzling.
2015-10-09 12:32:44 +02:00
Patrick Mezard
aee82f8b49 upside_down: simplify return code in batchRows() 2015-10-09 09:57:12 +02:00
Marty Schoch
e28eb749d7 bump up buffer size 2015-10-06 16:45:38 -04:00
Marty Schoch
71cbb13e07 modify code to reuse buffer for kv generation 2015-10-05 17:49:50 -04:00
Kosov Eugene
a61c350888 reducing allocation count 2015-10-05 22:57:10 +03:00
Marty Schoch
c0335e9fe4 Merge pull request #243 from pmezard/document-fields-retrieval
doc: document field values storage and retrieval
2015-10-05 10:52:23 -04:00
Patrick Mezard
ee8af9cfa3 doc: document field values storage and retrieval 2015-10-04 11:25:58 +02:00
Marty Schoch
e3a185b1c5 Merge pull request #242 from pmezard/add-boltdb-nosync-option
boltdb: add "nosync" option to force boltdb.DB.NoSync=true
2015-10-03 09:50:55 -04:00
Patrick Mezard
9d5407be13 boltdb: add "nosync" option to force boltdb.DB.NoSync=true
Use this option when rebuilding indexes from scratch. In my small case
(~17000 json documents), it reduces indexing from 520s to 250s.

I did not add any test: short of forced indexing termination, it only
has performance effects, which are hard to test. Unknown options are
currently ignored.

Issue #240
2015-10-03 14:26:48 +02:00
Marty Schoch
66700be4f7 Merge branch 'master' into newkvstore 2015-10-02 13:13:28 -04:00
Marty Schoch
6a9e2252c4 Merge pull request #241 from pmezard/document-field-analyzer-a-bit
Document field analyzer a bit
2015-10-02 13:12:56 -04:00
Patrick Mezard
2f48c16c84 doc: document IndexMapping.AddCustomAnalyzer 2015-10-02 17:38:07 +02:00
Patrick Mezard
ed1bdbf599 doc: document field analyzer resolution 2015-10-02 17:00:45 +02:00
Patrick Mezard
498e4a0de7 simplify FieldMapping.analyzerForField()
I stumbled onto that while trying to understand how analyzers are
resolved. The new code looks simpler to me and removes useless calls to
DocumentMapping.defaultAnalyzerName() when an analyzer is set at
FieldMapping level.

The slight change to TestStoredFieldPreserved avoids a stacktrace when
the test fails.
2015-10-02 15:45:43 +02:00
Marty Schoch
64ce81c283 Merge branch 'master' into newkvstore 2015-09-29 14:06:27 -04:00