1195 Commits

Author SHA1 Message Date
Marty Schoch
9174872ba2 add the bleve check tool
bleve check was a consistency checking tool originally developed
as a part of cbft.  currently it checks that the term dictionary
counts match the number of postings for the term.  in the future
additional checks could be added.  this tool has been back
ported to bleve as we've now adopted a single common tool for
both cbft and bleve.
2016-10-22 06:11:50 -04:00
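
The consistency check described in this commit can be illustrated with a minimal sketch. It is not the bleve check tool's actual code: the function name checkCounts and the in-memory map representation of the term dictionary and postings are assumptions made for illustration only.

```go
package main

import (
	"fmt"
	"sort"
)

// checkCounts is a hypothetical helper. It compares the count stored in a
// term dictionary against the number of postings actually present for each
// term, and returns the sorted list of terms where the two disagree.
func checkCounts(dictCounts map[string]uint64, postings map[string][]uint64) []string {
	var bad []string
	// Every dictionary entry must match the length of its postings list.
	for term, count := range dictCounts {
		if uint64(len(postings[term])) != count {
			bad = append(bad, term)
		}
	}
	// Postings that exist without any dictionary entry are also inconsistent.
	for term, docs := range postings {
		if _, ok := dictCounts[term]; !ok && len(docs) > 0 {
			bad = append(bad, term)
		}
	}
	sort.Strings(bad)
	return bad
}

func main() {
	dict := map[string]uint64{"beer": 2, "wine": 1}
	post := map[string][]uint64{"beer": {1, 5}, "wine": {}}
	fmt.Println(checkCounts(dict, post)) // prints "[wine]"
}
```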
Marty Schoch
77b79a2684 Merge pull request #466 from steveyen/optimize-fieldDict-reader-with-prealloc
Optimize upside-down's field dict reader with preallocated objects
2016-10-13 14:09:54 +02:00
Marty Schoch
5c7a2264a2 Merge pull request #473 from steveyen/reuse-incrementBytes-in-moss-kv-integration
reuse incrementBytes() in moss KV store integration
2016-10-13 14:03:46 +02:00
Marty Schoch
cee18d302e Merge pull request #475 from steveyen/phrase-searcher-simplifications-dry
some simplification / DRY for phrase searcher
2016-10-12 23:07:35 +02:00
Marty Schoch
ac02b206e5 Merge pull request #476 from steveyen/optimize-fuzzy-searcher-prefixTerm-loop
end fuzzy searcher prefixTerm construction loop early
2016-10-12 23:04:50 +02:00
Marty Schoch
8b409d7ab9 Merge pull request #477 from steveyen/fix-BleveQueryType-json-marshaling
fix BleveQueryTime json marshaling with double-quoting
2016-10-12 23:04:08 +02:00
Steve Yen
32e459f6b6 fix BleveQueryTime json marshaling with double-quoting
See also MB-21322 found by Mihir Kamdar.
2016-10-12 11:39:08 -07:00
Steve Yen
1a994ce2a7 end fuzzy searcher prefixTerm construction loop early 2016-10-12 09:51:36 -07:00
Steve Yen
6a38fa3719 go fmt 2016-10-12 09:39:43 -07:00
Steve Yen
62e6f1f648 reuse incrementBytes() in moss KV store integration
In this commit, I saw that there was a simple incrementBytes()
implementation elsewhere in bleve that seemed simpler than using the
big int package.

Edge case note: if the input bytes would overflow in incrementBytes(),
such as with an input of [0xff 0xff 0xff], it returns nil.  moss then
treats a nil endKeyExclusive iterator param as a logical
"higher-than-topmost" key, which produces the prefix iteration
behavior that we want for this edge situation.
2016-10-12 09:34:36 -07:00
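
The edge case in this commit message can be sketched as follows. This is an illustrative version written from the description above, not the code in bleve: it assumes an incrementBytes() that adds one to the byte string with carry and returns nil when every byte is 0xff, so the caller can pass the result as an exclusive end key for prefix iteration.

```go
package main

import "fmt"

// incrementBytes returns a copy of in incremented by one (with carry), which
// serves as an exclusive upper bound for all keys that start with in.
// It returns nil when the increment overflows, e.g. for [0xff 0xff 0xff].
func incrementBytes(in []byte) []byte {
	out := make([]byte, len(in))
	copy(out, in)
	for i := len(out) - 1; i >= 0; i-- {
		out[i]++
		if out[i] != 0 {
			return out // this byte did not wrap, so no further carry
		}
	}
	return nil // every byte wrapped around: no finite upper bound
}

func main() {
	fmt.Printf("%q\n", incrementBytes([]byte("ab")))             // prints "ac"
	fmt.Println(incrementBytes([]byte{0xff, 0xff, 0xff}) == nil) // prints "true"
}
```

Per the commit message, a nil end key is treated by moss as a logical "higher-than-topmost" key, so the overflow case still iterates to the end of the key space.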
Steve Yen
8230a7195f some simplification / DRY for phrase searcher 2016-10-12 09:26:31 -07:00
Steve Yen
01fb59d293 optimize upside-down DictionaryRow for fewer parsing alloc's 2016-10-12 09:22:50 -07:00
Steve Yen
2d72b542c0 optimize upside-down FieldDict reader with prealloc'ed objects
As part of this commit, there's also a newly added
DictionaryRow.parseDictionaryK() helper method.
2016-10-12 09:18:58 -07:00
Marty Schoch
d026a44230 Merge pull request #474 from steveyen/simplify-AddLocation
simplify TermLocationMap.AddLocation()
2016-10-12 14:20:50 +02:00
Marty Schoch
bddc064069 Merge pull request #471 from steveyen/remove-extra-indirection-LevenshteinDistance
removed extra level of pointer indirection from LevenshteinDistance()'s params
2016-10-12 14:05:34 +02:00
Marty Schoch
4160fb296f Merge pull request #470 from daschl/sigma
Address special unicode sigma at end of term when lowercasing.
2016-10-12 14:03:17 +02:00
Marty Schoch
483f06ef5b Merge pull request #467 from steveyen/optimize-disjunction-searcher-shrink-children
optimize disjunction searcher to trim child searchers array earlier
2016-10-12 14:00:19 +02:00
Marty Schoch
b76cbc805e Merge pull request #465 from steveyen/cleanup-when-PrefixSearcher-error
close resources when we encounter an error on PrefixSearcher initialization
2016-10-12 13:39:28 +02:00
Marty Schoch
4e16818656 Merge pull request #464 from steveyen/check-FieldDictPrefix-err
check for error when prefix searcher starts a FieldDictPrefix reader
2016-10-12 13:36:04 +02:00
Marty Schoch
155827aeef Merge pull request #462 from steveyen/master
log slow queries only when Config.SlowSearchLogThreshold > 0
2016-10-12 13:34:15 +02:00
Steve Yen
e72c8be353 simplify TermLocationMap.AddLocation() 2016-10-11 12:15:28 -07:00
Steve Yen
b6c97ddbfe removed extra ptr indirection from LevenshteinDistance 2016-10-11 08:49:10 -07:00
Michael Nitschinger
7e656dad32 Address special unicode sigma at end of term when lowercasing.
Σ maps to σ, except at the end of a word where it maps to ς.
This is the only conditional (contextual) but language-independent
mapping in unicode.
2016-10-11 12:37:08 +02:00
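
The mapping described above can be sketched as below. This is a simplified illustration, not the filter from the commit: the function name toLowerWithFinalSigma is hypothetical, the whole input is treated as a single term, and only the position of the sigma is checked (the full Unicode rule also looks at the surrounding cased letters).

```go
package main

import (
	"fmt"
	"unicode"
)

// toLowerWithFinalSigma lowercases a single term. A capital sigma in the
// last position of a multi-rune term becomes the final form 'ς'; every
// other rune goes through unicode.ToLower, which maps 'Σ' to 'σ'.
func toLowerWithFinalSigma(term string) string {
	runes := []rune(term)
	for i, r := range runes {
		if r == 'Σ' && i > 0 && i == len(runes)-1 {
			runes[i] = 'ς'
			continue
		}
		runes[i] = unicode.ToLower(r)
	}
	return string(runes)
}

func main() {
	fmt.Println(toLowerWithFinalSigma("ΣΟΦΟΣ")) // prints "σοφος"
}
```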
Marty Schoch
586c6ee1a3 Merge pull request #469 from daschl/optim-lowercase
Skip already lowercased runes on transformation.
2016-10-11 12:12:21 +02:00
Michael Nitschinger
ff35d75aa4 Skip already lowercased runes on transformation.
The LowerCaseFilter works on the original slice to avoid allocations,
so skipping already lowercased runes avoids unnecessary work.

benchmark                      old ns/op     new ns/op     delta
BenchmarkLowerCaseFilter-8     1302          815           -37.40%
2016-10-11 12:03:26 +02:00
Steve Yen
3f588cd4ae optimize disjunction searcher to trim child searchers array earlier
Disjunction searchers are used heavily by higher-level searchers, like
prefix searchers.  In that case, a disjunction searcher might have
many thousands of child searchers.

This commit adds an optimization to close each child term searcher as
soon as a child searcher is finished and remove it from the
disjunction searcher's children.
2016-10-10 22:47:11 -07:00
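
The trimming step described above can be sketched roughly as follows. The childSearcher type and the trimExhausted function are hypothetical stand-ins for illustration and do not reflect bleve's actual searcher interfaces.

```go
package main

import "fmt"

// childSearcher is a hypothetical stand-in for a term searcher.
type childSearcher struct {
	docs   []int // remaining matching document numbers
	closed bool
}

func (c *childSearcher) exhausted() bool { return len(c.docs) == 0 }
func (c *childSearcher) close()          { c.closed = true }

// trimExhausted closes every exhausted child and returns the slice with
// those children removed, reusing the original backing array.
func trimExhausted(children []*childSearcher) []*childSearcher {
	kept := children[:0]
	for _, c := range children {
		if c.exhausted() {
			c.close() // release the child's resources right away
			continue
		}
		kept = append(kept, c)
	}
	// Clear the unused tail so closed children can be garbage collected.
	for i := len(kept); i < len(children); i++ {
		children[i] = nil
	}
	return kept
}

func main() {
	a := &childSearcher{docs: []int{1, 4}}
	b := &childSearcher{} // already exhausted
	c := &childSearcher{docs: []int{7}}
	children := trimExhausted([]*childSearcher{a, b, c})
	fmt.Println(len(children), b.closed) // prints "2 true"
}
```

With many thousands of children, dropping finished ones early keeps later passes over the children slice short and frees their resources sooner.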
Steve Yen
535b746b41 close resources when error on PrefixSearcher initialization 2016-10-10 17:29:59 -07:00
Steve Yen
2a022830f0 check FieldDictPrefix err result in prefix searcher 2016-10-10 15:35:54 -07:00
Steve Yen
21b3d592b8 log slow queries only when Config.SlowSearchLogThreshold > 0 2016-10-10 11:34:32 -07:00
Marty Schoch
de0c26718d Merge pull request #461 from bcampbell/master
Settle on default fuzziness of 1 (for now)
2016-10-04 08:36:02 -04:00
Ben Campbell
11f18333fb Settle on default fuzziness of 1 (for now)
see https://groups.google.com/d/msg/bleve/vkVxnLMlXow/5qM1jL0ZEgAJ
2016-10-04 15:00:50 +13:00
Marty Schoch
dfc78ca725 simplify, per gofmt -s recommendation 2016-10-02 12:14:53 -04:00
Marty Schoch
2f48d7fb02 fix misspellings 2016-10-02 12:11:15 -04:00
Marty Schoch
2dc2130633 additional golint cleanups 2016-10-02 12:00:01 -04:00
Marty Schoch
efb1ea7e64 fix golint comment 2016-10-02 11:56:37 -04:00
Marty Schoch
8e784c362b another golint suggestion 2016-10-02 11:54:04 -04:00
Marty Schoch
abeca559cd don't export unnecessary method 2016-10-02 11:50:58 -04:00
Marty Schoch
1b4ee737e0 more golint fixes 2016-10-02 11:46:27 -04:00
Marty Schoch
ce572091eb additional golint cleanups 2016-10-02 11:44:34 -04:00
Marty Schoch
ee6b698edb rename variable with _ 2016-10-02 11:32:46 -04:00
Marty Schoch
667371dbec more golint simplifications 2016-10-02 11:30:58 -04:00
Marty Schoch
c36eb74ead address some golint suggestions 2016-10-02 11:14:09 -04:00
Marty Schoch
f05dc237ab fix comment in wrong format 2016-10-02 11:10:05 -04:00
Marty Schoch
f3dc89699d address golint warnings 2016-10-02 10:47:40 -04:00
Marty Schoch
cd6b409971 fix code i carelessly broke 2016-10-02 10:39:20 -04:00
Marty Schoch
d4d3e7a043 address golint naming issues 2016-10-02 10:35:24 -04:00
Marty Schoch
3a276153a3 actually rename packages to singular, not just directory name 2016-10-02 10:29:39 -04:00
Marty Schoch
2332455bd2 nicer formatting of license header 2016-10-02 10:13:14 -04:00
Marty Schoch
c452804e3d Merge pull request #460 from mschoch/morename
BREAKING CHANGE - additional package renaming
2016-10-02 09:00:22 -04:00
Marty Schoch
6bf9dd59ab BREAKING CHANGE - additional package renaming
i recently learned that package names should also prefer the
singular form, not the plural form
2016-10-01 17:20:59 -04:00