Commit Graph

796 Commits

Author SHA1 Message Date
indraniel
caa19e6c36 + initial stub of goleveldb package
- This is a first-pass introduction. Things may not be working
    correctly yet.
2015-04-10 11:08:02 -05:00
Steve Yen
8ed7dc1505 added godoc badge to README 2015-04-09 08:53:29 -07:00
Steve Yen
3f2c90f7c1 moved all badges to the top of README 2015-04-09 08:48:12 -07:00
Steve Yen
86f2b844b4 added config_metrics.go which builds on debug tag
When the debug tag is used, the config_metrics.go file
ensures the metrics decorator KV store is imported.
2015-04-08 22:08:04 -07:00
Marty Schoch
0f16eccd6b new tokenizer that allows you to pre-identify tokens with regexp
name "exception"
configure with list of regexp string "exceptions"
these exception regexps match sequences you want treated
as a single token.  these sequences are NOT sent to the
underlying tokenizer
configure "tokenizer" is the named tokenizer that should be
used for processing all text regions not matching exceptions

An example configuration with simple patterns to match URLs and
email addresses:

map[string]interface{}{
	"type":      "exception",
	"tokenizer": "unicode",
	"exceptions": []interface{}{
		`[hH][tT][tT][pP][sS]?://(\S)*`,
		`[fF][iI][lL][eE]://(\S)*`,
		`[fF][tT][pP]://(\S)*`,
		`\S+@\S+`,
	},
}
2015-04-08 15:31:58 -04:00
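The behavior described in 0f16eccd6b can be illustrated with a small standalone sketch. It is not bleve's implementation: `exceptionTokenize` and `whitespaceTokenize` are hypothetical names, and a whitespace splitter stands in for the "unicode" tokenizer. It only shows the idea that text matching an exception pattern is kept as one token while the remaining regions go to the underlying tokenizer.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// whitespaceTokenize is a stand-in for the underlying named tokenizer.
// (Assumption: the real "unicode" tokenizer segments text differently.)
func whitespaceTokenize(s string) []string {
	return strings.Fields(s)
}

// exceptionTokenize is a hypothetical helper, not bleve's code.
// Text matching any exception pattern is emitted as a single token;
// the regions in between are passed to the fallback tokenizer.
func exceptionTokenize(text string, exceptions []string, fallback func(string) []string) []string {
	// Join the patterns into one alternation so a single scan finds them all.
	re := regexp.MustCompile("(?:" + strings.Join(exceptions, ")|(?:") + ")")
	var tokens []string
	last := 0
	for _, loc := range re.FindAllStringIndex(text, -1) {
		tokens = append(tokens, fallback(text[last:loc[0]])...)
		tokens = append(tokens, text[loc[0]:loc[1]])
		last = loc[1]
	}
	return append(tokens, fallback(text[last:])...)
}

func main() {
	exceptions := []string{
		`[hH][tT][tT][pP][sS]?://(\S)*`,
		`[fF][iI][lL][eE]://(\S)*`,
		`[fF][tT][pP]://(\S)*`,
		`\S+@\S+`,
	}
	text := "mail bob@example.com or see http://example.com/a now"
	fmt.Printf("%q\n", exceptionTokenize(text, exceptions, whitespaceTokenize))
}
```

The URL and the email address each come back as one token, and the surrounding words are split by the fallback.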
Marty Schoch
056d74901e fix test to actually work reliably 2015-04-08 11:17:34 -04:00
Marty Schoch
8581e73cef added String method for Batch
also changed Batch methods to pointer receiver
closes #180
2015-04-08 10:41:42 -04:00
Marty Schoch
683c0a7a54 adding errcheck to travis builds
closes #169
2015-04-07 18:11:29 -04:00
Marty Schoch
539aeb8dc7 fix errors identified by errcheck
part of #169
2015-04-07 18:05:41 -04:00
Marty Schoch
ba6b3c8bb3 fix more issues identified by errcheck
part of #169
2015-04-07 16:45:23 -04:00
Marty Schoch
ab24772bf0 fix issues identified by errcheck
part of #169
2015-04-07 16:34:29 -04:00
Marty Schoch
56c4a09de1 fix issues identified by errcheck
part of #169
2015-04-07 15:39:56 -04:00
Marty Schoch
c8d974048a fix issues identified by errcheck
part of #169
2015-04-07 14:59:35 -04:00
Marty Schoch
93e01a803e fix issues identified by errcheck
part of #169
2015-04-07 14:52:00 -04:00
Marty Schoch
f1ec73e764 fix issues identified by errcheck
part of #169
2015-04-07 13:26:54 -04:00
Marty Schoch
56a30a3574 fix issues identified by errcheck
part of #169
2015-04-07 13:05:47 -04:00
Marty Schoch
d2e9409413 fix issues identified by errcheck
part of #169
2015-04-07 12:04:59 -04:00
Marty Schoch
24729541b5 fix issues identified by errcheck
also add bulkindex utility to gitignore
part of #169
2015-04-07 11:42:46 -04:00
Marty Schoch
35a4333bce fix issues identified by errcheck
part of #169
2015-04-07 11:39:01 -04:00
Marty Schoch
de2e3f4d72 fix improper call to fmt.Errorf instead of Printf 2015-04-07 11:24:01 -04:00
Marty Schoch
dd921d31e3 undoing f92ab131e4
we now guarantee bytes were copied earlier in the chain
the kv store is NOT responsible for making an additional copy
closes #181
2015-04-07 11:12:28 -04:00
Marty Schoch
443c0252e0 fix another metrics BytesSafeAfterClose() loop
closes #184
2015-04-03 21:17:23 -04:00
Steve Yen
efc39a6857 fix metrics BytesSafeAfterClose() loop
fixes issue 184
2015-04-03 16:36:32 -07:00
Marty Schoch
11262c793f fix bug, internal ops must check that index is open
possibly fixes https://github.com/couchbaselabs/cbft/issues/49
2015-04-03 18:05:24 -04:00
Marty Schoch
867110e03b major improvements to index row encoding
improvements uncovered some issues with how k/v data was copied
or not.  to address this, kv abstraction layer now lets impl
specify if the bytes returned are safe to use after a reader
(or writer, since writers are also readers) is closed
See index/store/KVReader - BytesSafeAfterClose() bool
false is the safe value if you're not sure
it will cause index impls to copy the data
Some kv impls already have created a copy at the C-api barrier
in which case they can safely return true.

Overall this yields ~25% speedup for searches with leveldb.
It yields ~10% speedup for boltdb.
Returning stored fields is now slower with boltdb, as previously
we were returning unsafe bytes.
2015-04-03 16:50:48 -04:00
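A minimal sketch of the copy decision described in 867110e03b, under stated assumptions: the `KVReader` interface below is cut down to two methods for illustration, and `memReader` and `getStable` are hypothetical names that do not come from the repository. Only the method name `BytesSafeAfterClose() bool` is taken from the commit message.

```go
package main

import "fmt"

// KVReader is a simplified stand-in for the reader interface named in the
// commit message; only the methods needed for the illustration appear.
type KVReader interface {
	Get(key []byte) ([]byte, error)
	BytesSafeAfterClose() bool
}

// memReader is a hypothetical store whose Get returns its internal buffer,
// so callers must copy anything they want to keep.
type memReader struct {
	data map[string][]byte
}

func (r *memReader) Get(key []byte) ([]byte, error) {
	return r.data[string(key)], nil
}

// BytesSafeAfterClose returns false, the safe value when unsure.
func (r *memReader) BytesSafeAfterClose() bool {
	return false
}

// getStable returns bytes that stay valid after the reader is closed,
// copying only when the store does not already guarantee that.
func getStable(r KVReader, key []byte) ([]byte, error) {
	val, err := r.Get(key)
	if err != nil || val == nil {
		return val, err
	}
	if r.BytesSafeAfterClose() {
		return val, nil
	}
	out := make([]byte, len(val))
	copy(out, val)
	return out, nil
}

func main() {
	r := &memReader{data: map[string][]byte{"k": []byte("value")}}
	val, _ := getStable(r, []byte("k"))
	r.data["k"][0] = 'X' // simulate the store reusing its buffer
	fmt.Println(string(val))
}
```

A store that already copies at its API boundary would return true and skip the second copy, which is where the reported speedup would come from.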
Marty Schoch
52712b9537 add missing index close causing tests to sometimes fail 2015-04-03 16:41:11 -04:00
Steve Yen
dbf50b7f29 KVStore gtreap allows only 1 writer at a time 2015-03-26 16:40:18 -07:00
Steve Yen
f92ab131e4 KVStore gtreap implementation copies value bytes 2015-03-26 14:46:37 -07:00
Steve Yen
78453dab7d metrics KVStore now tracks last 100 errors 2015-03-19 18:41:16 -07:00
Marty Schoch
62645f10a2 Merge pull request #179 from gsathya/add_index_tests
Add tests for Index
2015-03-19 16:56:45 -04:00
Sathyanarayanan Gunasekaran
5c7aa21643 Add test for index.Stats 2015-03-19 14:06:59 -04:00
Sathyanarayanan Gunasekaran
d9a7a2e3a0 Add test for index.FieldDictPrefix 2015-03-19 14:06:59 -04:00
Sathyanarayanan Gunasekaran
5b4ee3e598 Add test for index.FieldDictRange 2015-03-19 14:06:59 -04:00
Marty Schoch
6f185f8cc0 fix highlighting bug when terms overlap (ngram analysis)
fixes #178
2015-03-18 14:34:47 -04:00
Marty Schoch
a41f229b14 added regexp and wildcard queries
fixes #152
2015-03-11 16:57:22 -04:00
Marty Schoch
183fcd4b14 added a missing check for errors 2015-03-11 16:56:01 -04:00
Marty Schoch
a44a7c01af rewrite to use fixed size []byte instead of buffer
removes unchecked errors in calls to buffer.Write
and also benchmarks considerably faster
2015-03-11 15:12:13 -04:00
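The approach named in a44a7c01af can be sketched as follows. The row layout here (prefix byte, 16-bit field id, term, separator, varint frequency) is an assumption made up for illustration and is not bleve's on-disk format; `encodeRow` is a hypothetical name. The point is only that writing into a pre-sized `[]byte` involves no `Write` calls whose errors could go unchecked.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeRow is a hypothetical row encoder, not bleve's actual format.
// It sizes one []byte up front and writes into it directly.
func encodeRow(prefix byte, field uint16, term []byte, freq uint64) []byte {
	// 1 prefix byte + 2 field bytes + term + 1 separator + largest varint.
	buf := make([]byte, 1+2+len(term)+1+binary.MaxVarintLen64)
	buf[0] = prefix
	binary.LittleEndian.PutUint16(buf[1:3], field)
	n := 3 + copy(buf[3:], term)
	buf[n] = 0xff // separator byte, an assumption for this sketch
	n++
	n += binary.PutUvarint(buf[n:], freq)
	return buf[:n] // trim the unused tail of the varint reservation
}

func main() {
	row := encodeRow('t', 1, []byte("go"), 300)
	fmt.Printf("% x\n", row)
}
```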
Marty Schoch
50bd082257 fixed issues with portuguese analyzer
fixes #70
2015-03-11 14:22:11 -04:00
Marty Schoch
7970f42c29 fix issues with italian analyzer
switch it to not require icu/libstemmer
fixes #69
2015-03-11 11:48:13 -04:00
Marty Schoch
eeaf514848 switch fr to not require icu/libstemmer
also corrected copy/paste bug in test
2015-03-11 11:46:33 -04:00
Marty Schoch
8ae30fb6f0 fix issues with lucene stemmer
fixes issue #68
2015-03-11 11:14:29 -04:00
Marty Schoch
b5a79c8ecc Merge pull request #173 from gsathya/fix_return_err
Check all return errors
2015-03-11 08:30:42 -04:00
Sathyanarayanan Gunasekaran
93e749bc0c Check all return errors
- Fix the following errors found by errcheck :
  $ bleve git:(master) errcheck github.com/blevesearch/bleve
  github.com/blevesearch/bleve/index_impl.go:206:25  defer indexReader.Close()
  github.com/blevesearch/bleve/index_impl.go:317:25  defer indexReader.Close()
  github.com/blevesearch/bleve/index_impl.go:353:25  defer indexReader.Close()
  github.com/blevesearch/bleve/index_impl.go:359:22  defer searcher.Close()
  github.com/blevesearch/bleve/index_impl.go:497:25  defer indexReader.Close()
  github.com/blevesearch/bleve/index_impl.go:644:20  defer reader.Close()
  github.com/blevesearch/bleve/index_meta.go:67:27   defer indexMetaFile.Close()
2015-03-11 01:28:51 -04:00
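One common way to satisfy errcheck for a deferred `Close()` like those listed in 93e749bc0c is to capture the error in a named return value. The sketch below uses hypothetical names (`fakeReader`, `countDocs`, `errClose`) and does not claim to be the change made in that commit.

```go
package main

import (
	"errors"
	"fmt"
)

// errClose is a sample error used by the illustration.
var errClose = errors.New("close failed")

// fakeReader is a hypothetical stand-in for an index reader whose Close can fail.
type fakeReader struct {
	closeErr error
}

func (r *fakeReader) Close() error {
	return r.closeErr
}

// countDocs uses a named return value so the deferred function can surface
// a Close error instead of silently dropping it.
func countDocs(r *fakeReader) (n int, err error) {
	defer func() {
		// Keep the first error: only report the Close error if nothing
		// else has failed already.
		if cerr := r.Close(); cerr != nil && err == nil {
			err = cerr
		}
	}()
	return 42, nil
}

func main() {
	_, err := countDocs(&fakeReader{closeErr: errClose})
	fmt.Println(err)
}
```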
Marty Schoch
522f9d5cc7 significant change to index format, support dictionary rows
this introduces disk format v4
now the summary rows for a term are stored in their own
"dictionary row" format, previously the same information
was stored in special term frequency rows
this now allows us to easily iterate all the terms for a field
in sorted order (useful for many other fuzzy data structures)

at the top-level of bleve you can now browse terms within a field
using the following api on the Index interface:

  FieldDict(field string) (index.FieldDict, error)
  FieldDictRange(field string, startTerm []byte, endTerm []byte) (index.FieldDict, error)
  FieldDictPrefix(field string, termPrefix []byte) (index.FieldDict, error)

fixes #127
2015-03-10 16:22:19 -04:00
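To illustrate why keeping terms in sorted order makes the range and prefix lookups in 522f9d5cc7 straightforward, here is a toy in-memory sketch. `termDict`, `rangeTerms`, and `prefixTerms` are hypothetical names, and treating both range bounds as inclusive is an assumption; none of this reflects the actual dictionary-row storage.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// termDict is a hypothetical in-memory stand-in for per-field dictionary
// rows: terms kept in sorted order with a count for each.
type termDict struct {
	terms  []string
	counts map[string]int
}

func newTermDict(counts map[string]int) *termDict {
	terms := make([]string, 0, len(counts))
	for t := range counts {
		terms = append(terms, t)
	}
	sort.Strings(terms)
	return &termDict{terms: terms, counts: counts}
}

// rangeTerms returns terms t with start <= t <= end, in sorted order.
// (Assumption: both bounds are inclusive.)
func (d *termDict) rangeTerms(start, end string) []string {
	lo := sort.SearchStrings(d.terms, start)
	hi := sort.Search(len(d.terms), func(i int) bool { return d.terms[i] > end })
	if lo > hi {
		return nil
	}
	return d.terms[lo:hi]
}

// prefixTerms returns terms beginning with prefix, in sorted order.
// Binary search finds the first candidate; matches are contiguous after it.
func (d *termDict) prefixTerms(prefix string) []string {
	lo := sort.SearchStrings(d.terms, prefix)
	hi := lo
	for hi < len(d.terms) && strings.HasPrefix(d.terms[hi], prefix) {
		hi++
	}
	return d.terms[lo:hi]
}

func main() {
	d := newTermDict(map[string]int{"cat": 3, "car": 1, "dog": 2, "cart": 5})
	for _, t := range d.prefixTerms("car") {
		fmt.Println(t, d.counts[t])
	}
	fmt.Println(d.rangeTerms("cart", "dog"))
}
```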
Marty Schoch
4e14f4e4ef change path for forestdb test to correctly cleanup
this is due to forestdb auto-compaction using the provided
path as just the prefix, so if we're not careful we end
up with many stray files lying around
here, we create a sub-directory first, and just nuke the
whole subdir when we're done
2015-03-10 14:05:58 -04:00
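The cleanup pattern described in 4e14f4e4ef can be sketched with the standard library alone. `withStoreDir`, `writeStrayFiles`, and `dirExists` are hypothetical helpers, and the file suffixes are invented to simulate a store that treats its path as a prefix; no forestdb API is used.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeStrayFiles simulates a store that writes several files sharing
// the given path prefix. The suffixes are made up for this sketch.
func writeStrayFiles(prefix string) error {
	for _, suffix := range []string{".0", ".1"} {
		if err := os.WriteFile(prefix+suffix, []byte("x"), 0o600); err != nil {
			return err
		}
	}
	return nil
}

// dirExists reports whether dir is still present on disk.
func dirExists(dir string) bool {
	_, err := os.Stat(dir)
	return err == nil
}

// withStoreDir creates a dedicated sub-directory, hands fn a path prefix
// inside it, and removes the whole sub-directory afterwards, so any extra
// files created alongside the prefix are cleaned up too.
func withStoreDir(fn func(pathPrefix string) error) (dir string, err error) {
	dir, err = os.MkdirTemp("", "storetest")
	if err != nil {
		return "", err
	}
	defer func() {
		if rerr := os.RemoveAll(dir); rerr != nil && err == nil {
			err = rerr
		}
	}()
	return dir, fn(filepath.Join(dir, "test"))
}

func main() {
	dir, err := withStoreDir(writeStrayFiles)
	fmt.Println(err, dirExists(dir))
}
```

Because the test only ever removes the sub-directory, it does not need to know how many files the store created or what they were named.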
Marty Schoch
0df0a6fcb2 better logging on which test failed in integration tests 2015-03-10 14:05:30 -04:00
Marty Schoch
18dabdb946 fix compilation of bulk index utility 2015-03-09 08:20:40 -04:00
Marty Schoch
af356acff0 changed batch behavior
now created through the index itself
mapping problems reported early at the time
data is added to the batch, previously these
were not reported until the batch was executed
2015-03-09 08:20:39 -04:00
Marty Schoch
eaccd74c93 Merge pull request #134 from Shugyousha/numfacet
Add a benchmark for the numeric facet builder and use sort.Sort in it (just like for the terms one)
2015-03-06 14:50:30 -05:00
Marty Schoch
300ec79c96 first pass at checking errors that were ignored
part of #169
2015-03-06 14:46:29 -05:00