indraniel
3a70401835
+ make copies of the []bytes returned by goleveldb
...
- The byte strings returned by goleveldb aren't necessarily safe. See
the following google group thread:
https://groups.google.com/forum/#!topic/bleve/aHZ8gmihLiY
This code change is based on the gist created here:
https://groups.google.com/forum/#!topic/bleve/aHZ8gmihLiY
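A minimal sketch of the defensive copy described above, for illustration only. The helper name copyBytes is hypothetical and is not taken from goleveldb or from this commit; the slice in main merely stands in for bytes handed back by the store.

```go
package main

import "fmt"

// copyBytes returns a new slice holding the same contents as b.
// A nil input stays nil so that "key not found" remains distinguishable
// from an empty value. (Hypothetical helper, not part of goleveldb.)
func copyBytes(b []byte) []byte {
	if b == nil {
		return nil
	}
	out := make([]byte, len(b))
	copy(out, b)
	return out
}

func main() {
	// storeOwned stands in for a slice whose backing array the store may reuse.
	storeOwned := []byte("value")
	safe := copyBytes(storeOwned)

	// Simulate the store overwriting its buffer after the call returned.
	storeOwned[0] = 'X'

	fmt.Println(string(storeOwned), string(safe))
}
```

The copy costs one allocation per value, which is the price of not depending on the lifetime of the store's internal buffers.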
2015-04-10 11:08:02 -05:00
indraniel
8e1c75a1cf
+ comment out the index function for goleveldb
...
- don't want to make goleveldb the default kv store yet.
2015-04-10 11:08:02 -05:00
indraniel
cb50368cdb
+ add a goleveldb config file
2015-04-10 11:08:02 -05:00
indraniel
a88d714778
+ add a goleveldb index upside-down benchmark test
2015-04-10 11:08:02 -05:00
indraniel
a0a2a61050
+ keep 'get' consistent with levigo implementation
...
- this change keeps the method behavior consistent with the
levigo/leveldb implementation.
- don't issue an err if a key isn't found
2015-04-10 11:08:02 -05:00
indraniel
5e55fa2866
+ keep 'getWithSnapshot' consistent with levigo implementation
...
- this change keeps the method behavior consistent with the
levigo/leveldb implementation.
- the leveldb store_test.go and goleveldb store_test.go are now
identical.
2015-04-10 11:08:02 -05:00
indraniel
caa19e6c36
+ initial stub of goleveldb package
...
- This is a first-pass introduction. Things may not be working
correctly yet.
2015-04-10 11:08:02 -05:00
Steve Yen
8ed7dc1505
added godoc badge to README
2015-04-09 08:53:29 -07:00
Steve Yen
3f2c90f7c1
moved all badges to the top of README
2015-04-09 08:48:12 -07:00
Steve Yen
86f2b844b4
added config_metrics.go which builds on debug tag
...
When the debug tag is used, the config_metrics.go file
ensures the metrics decorator KV store is imported.
2015-04-08 22:08:04 -07:00
Marty Schoch
0f16eccd6b
new tokenizer that allows you to pre-identify tokens with regexp
...
name "exception"
configure with list of regexp string "exceptions"
these exception regexps match sequences you want treated
as a single token. these sequences are NOT sent to the
underlying tokenizer
configure "tokenizer" with the name of the tokenizer that should be
used for processing all text regions not matching exceptions
An example configuration with simple patterns to match URLs and
email addresses:
map[string]interface{}{
"type": "exception",
"tokenizer": "unicode",
"exceptions": []interface{}{
`[hH][tT][tT][pP][sS]?://(\S)*`,
`[fF][iI][lL][eE]://(\S)*`,
`[fF][tT][pP]://(\S)*`,
`\S+@\S+`,
},
}
2015-04-08 15:31:58 -04:00
Marty Schoch
056d74901e
fix test to actually work reliably
2015-04-08 11:17:34 -04:00
Marty Schoch
8581e73cef
added String method for Batch
...
also changed Batch methods to pointer receiver
closes #180
2015-04-08 10:41:42 -04:00
Marty Schoch
683c0a7a54
adding errcheck to travis builds
...
closes #169
2015-04-07 18:11:29 -04:00
Marty Schoch
539aeb8dc7
fix errors identified by errcheck
...
part of #169
2015-04-07 18:05:41 -04:00
Marty Schoch
ba6b3c8bb3
fix more issues identified by errcheck
...
part of #169
2015-04-07 16:45:23 -04:00
Marty Schoch
ab24772bf0
fix issues identified by errcheck
...
part of #169
2015-04-07 16:34:29 -04:00
Marty Schoch
56c4a09de1
fix issues identified by errcheck
...
part of #169
2015-04-07 15:39:56 -04:00
Marty Schoch
c8d974048a
fix issues identified by errcheck
...
part of #169
2015-04-07 14:59:35 -04:00
Marty Schoch
93e01a803e
fix issues identified by errcheck
...
part of #169
2015-04-07 14:52:00 -04:00
Marty Schoch
f1ec73e764
fix issues identified by errcheck
...
part of #169
2015-04-07 13:26:54 -04:00
Marty Schoch
56a30a3574
fix issues identified by errcheck
...
part of #169
2015-04-07 13:05:47 -04:00
Marty Schoch
d2e9409413
fix issues identified by errcheck
...
part of #169
2015-04-07 12:04:59 -04:00
Marty Schoch
24729541b5
fix issues identified by errcheck
...
also add bulkindex utility to gitignore
part of #169
2015-04-07 11:42:46 -04:00
Marty Schoch
35a4333bce
fix issues identified by errcheck
...
part of #169
2015-04-07 11:39:01 -04:00
Marty Schoch
de2e3f4d72
fix improper call to fmt.Errorf instead of Printf
2015-04-07 11:24:01 -04:00
Marty Schoch
dd921d31e3
undoing f92ab131e4
...
we now guarantee bytes were copied earlier in the chain
the kv store is NOT responsible for making an additional copy
closes #181
2015-04-07 11:12:28 -04:00
Marty Schoch
443c0252e0
fix another metrics BytesSafeAfterClose() loop
...
closes #184
2015-04-03 21:17:23 -04:00
Steve Yen
efc39a6857
fix metrics BytesSafeAfterClose() loop
...
fixes issue 184
2015-04-03 16:36:32 -07:00
Marty Schoch
11262c793f
fix bug, internal ops must check that index is open
...
possibly fixes https://github.com/couchbaselabs/cbft/issues/49
2015-04-03 18:05:24 -04:00
Marty Schoch
867110e03b
major improvements to index row encoding
...
improvements uncovered some issues with how k/v data was copied
or not. to address this, kv abstraction layer now lets impl
specify if the bytes returned are safe to use after a reader
(or writer, since writers are also readers) is closed
See index/store/KVReader - BytesSafeAfterClose() bool
false is the safe value if you're not sure
it will cause index impls to copy the data
Some kv impls have already created a copy at the C-api barrier
in which case they can safely return true.
Overall this yields ~25% speedup for searches with leveldb.
It yields ~10% speedup for boltdb.
Returning stored fields is now slower with boltdb, as previously
we were returning unsafe bytes.
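A rough sketch of how a caller might act on BytesSafeAfterClose(). Only that method name comes from this commit; the kvReader interface here is a simplified stand-in rather than the real index/store/KVReader, and reusingReader and getStable are hypothetical names.

```go
package main

import "fmt"

// kvReader is a simplified stand-in for the store's reader abstraction.
type kvReader interface {
	Get(key []byte) ([]byte, error)
	BytesSafeAfterClose() bool
}

// reusingReader hands out slices backed by a buffer it keeps reusing,
// so its values are not safe to hold on to.
type reusingReader struct {
	data map[string][]byte
	buf  []byte
}

func (r *reusingReader) Get(key []byte) ([]byte, error) {
	r.buf = append(r.buf[:0], r.data[string(key)]...)
	return r.buf, nil
}

// false is the safe answer: callers will copy.
func (r *reusingReader) BytesSafeAfterClose() bool { return false }

// getStable returns a value the caller may keep, copying only when the
// reader reports that its bytes are not safe.
func getStable(r kvReader, key []byte) ([]byte, error) {
	v, err := r.Get(key)
	if err != nil || v == nil || r.BytesSafeAfterClose() {
		return v, err
	}
	out := make([]byte, len(v))
	copy(out, v)
	return out, nil
}

func main() {
	r := &reusingReader{data: map[string][]byte{
		"a": []byte("one"),
		"b": []byte("two"),
	}}
	first, _ := getStable(r, []byte("a"))
	_, _ = getStable(r, []byte("b")) // overwrites the reader's buffer
	fmt.Println(string(first))
}
```

A store that already copies at its own boundary would return true and skip the second copy, which is where the reported speedup would come from.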
2015-04-03 16:50:48 -04:00
Marty Schoch
52712b9537
add missing index close causing tests to sometimes fail
2015-04-03 16:41:11 -04:00
Steve Yen
dbf50b7f29
KVStore gtreap allows only 1 writer at a time
2015-03-26 16:40:18 -07:00
Steve Yen
f92ab131e4
KVStore gtreap implementation copies value bytes
2015-03-26 14:46:37 -07:00
Steve Yen
78453dab7d
metrics KVStore now tracks last 100 errors
2015-03-19 18:41:16 -07:00
Marty Schoch
62645f10a2
Merge pull request #179 from gsathya/add_index_tests
...
Add tests for Index
2015-03-19 16:56:45 -04:00
Sathyanarayanan Gunasekaran
5c7aa21643
Add test for index.Stats
2015-03-19 14:06:59 -04:00
Sathyanarayanan Gunasekaran
d9a7a2e3a0
Add test for index.FieldDictPrefix
2015-03-19 14:06:59 -04:00
Sathyanarayanan Gunasekaran
5b4ee3e598
Add test for index.FieldDictRange
2015-03-19 14:06:59 -04:00
Marty Schoch
6f185f8cc0
fix highlighting bug when terms overlap (ngram analysis)
...
fixes #178
2015-03-18 14:34:47 -04:00
Marty Schoch
a41f229b14
added regexp and wildcard queries
...
fixes #152
2015-03-11 16:57:22 -04:00
Marty Schoch
183fcd4b14
added a missing check for errors
2015-03-11 16:56:01 -04:00
Marty Schoch
a44a7c01af
rewrite to use fixed size []byte instead of buffer
...
removes unchecked errors in calls to buffer.Write
and also benchmarks considerably faster
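The idea can be illustrated as follows. The row layout (one prefix byte, a 2-byte little-endian field id, then the term) and the name encodeKey are assumptions made for this sketch, not the actual encoding changed by this commit.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeKey writes a prefix byte, a 2-byte little-endian field id and the
// term into one slice sized exactly for the result. Unlike buffer.Write,
// none of these operations returns an error that could go unchecked.
func encodeKey(prefix byte, field uint16, term []byte) []byte {
	buf := make([]byte, 1+2+len(term))
	buf[0] = prefix
	binary.LittleEndian.PutUint16(buf[1:3], field)
	copy(buf[3:], term)
	return buf
}

func main() {
	key := encodeKey('t', 1, []byte("go"))
	fmt.Printf("%q\n", key)
}
```

Because the final size is known up front, there is a single allocation and no buffer growth.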
2015-03-11 15:12:13 -04:00
Marty Schoch
50bd082257
fixed issues with portuguese analyzer
...
fixes #70
2015-03-11 14:22:11 -04:00
Marty Schoch
7970f42c29
fix issues with italian analyzer
...
switch it to not require icu/libstemmer
fixes #69
2015-03-11 11:48:13 -04:00
Marty Schoch
eeaf514848
switch fr to not require icu/libstemmer
...
also corrected copy/paste bug in test
2015-03-11 11:46:33 -04:00
Marty Schoch
8ae30fb6f0
fix issues with lucene stemmer
...
fixes issue #68
2015-03-11 11:14:29 -04:00
Marty Schoch
b5a79c8ecc
Merge pull request #173 from gsathya/fix_return_err
...
Check all return errors
2015-03-11 08:30:42 -04:00
Sathyanarayanan Gunasekaran
93e749bc0c
Check all return errors
...
- Fix the following errors found by errcheck :
$ bleve git:(master) errcheck github.com/blevesearch/bleve
github.com/blevesearch/bleve/index_impl.go:206:25 defer indexReader.Close()
github.com/blevesearch/bleve/index_impl.go:317:25 defer indexReader.Close()
github.com/blevesearch/bleve/index_impl.go:353:25 defer indexReader.Close()
github.com/blevesearch/bleve/index_impl.go:359:22 defer searcher.Close()
github.com/blevesearch/bleve/index_impl.go:497:25 defer indexReader.Close()
github.com/blevesearch/bleve/index_impl.go:644:20 defer reader.Close()
github.com/blevesearch/bleve/index_meta.go:67:27 defer indexMetaFile.Close()
2015-03-11 01:28:51 -04:00
Marty Schoch
522f9d5cc7
significant change to index format, support dictionary rows
...
this introduces disk format v4
now the summary rows for a term are stored in their own
"dictionary row" format, previously the same information
was stored in special term frequency rows
this now allows us to easily iterate all the terms for a field
in sorted order (useful for many other fuzzy data structures)
at the top-level of bleve you can now browse terms within a field
using the following api on the Index interface:
FieldDict(field string) (index.FieldDict, error)
FieldDictRange(field string, startTerm []byte, endTerm []byte) (index.FieldDict, error)
FieldDictPrefix(field string, termPrefix []byte) (index.FieldDict, error)
fixes #127
2015-03-10 16:22:19 -04:00