bleve

Author	SHA1	Message	Date
Marty Schoch	d40cfb0870	Merge pull request #521 from mschoch/improved-backindex-row INDEX FORMAT CHANGE: change back index row value	2017-01-24 16:07:47 -05:00
Marty Schoch	606fd6344b	INDEX FORMAT CHANGE: change back index row value Previously term entries were encoded pairwise (field/term), so you'd have data like: F1/T1 F1/T2 F1/T3 F2/T4 F3/T5 As you can see, even though field 1 has 3 terms, we repeat the F1 part in the encoded data. This is a bit wasteful. In the new format we encode it as a list of terms for each field: F1/T1,T2,T3 F2/T4 F3/T5 When fields have multiple terms, this saves space. In unit tests there is no additional waste even in the case that a field has only a single value. Here are the results of an indexing test case (beer-search): $ benchcmp indexing-before.txt indexing-after.txt benchmark old ns/op new ns/op delta BenchmarkIndexing-4 11275835988 10745514321 -4.70% benchmark old allocs new allocs delta BenchmarkIndexing-4 25230685 22480494 -10.90% benchmark old bytes new bytes delta BenchmarkIndexing-4 4802816224 4741641856 -1.27% And here are the results of a MatchAll search building a facet on the "abv" field: $ benchcmp facet-before.txt facet-after.txt benchmark old ns/op new ns/op delta BenchmarkFacets-4 439762100 228064575 -48.14% benchmark old allocs new allocs delta BenchmarkFacets-4 9460208 3723286 -60.64% benchmark old bytes new bytes delta BenchmarkFacets-4 260784261 151746483 -41.81% Although we expect the index to be smaller in many cases, the beer-search index is about the same in this case. However, this may be due to the underlying storage (boltdb) in this case. Finally, the index version was bumped from 5 to 7, since smolder also used version 6, which could lead to some confusion.	2017-01-24 15:39:38 -05:00
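The change from pairwise field/term entries to one term list per field can be sketched as a simple grouping step. This is only an illustration under assumed names: `fieldTerm`, `fieldTerms`, and `groupByField` are hypothetical and are not bleve's actual row types or encoder.

```go
package main

import "fmt"

// fieldTerm is a hypothetical (field, term) pair, standing in for one
// entry of the old pairwise back index encoding.
type fieldTerm struct {
	Field uint16
	Term  string
}

// fieldTerms is a hypothetical grouped entry: one field ID followed by
// all of its terms, as in the new encoding.
type fieldTerms struct {
	Field uint16
	Terms []string
}

// groupByField collapses pairwise entries into one entry per field,
// preserving the order in which fields and terms first appear.
func groupByField(pairs []fieldTerm) []fieldTerms {
	var out []fieldTerms
	pos := make(map[uint16]int, len(pairs)) // field -> index in out
	for _, p := range pairs {
		i, ok := pos[p.Field]
		if !ok {
			i = len(out)
			pos[p.Field] = i
			out = append(out, fieldTerms{Field: p.Field})
		}
		out[i].Terms = append(out[i].Terms, p.Term)
	}
	return out
}

func main() {
	// The F1/T1 F1/T2 F1/T3 F2/T4 F3/T5 example from the commit message.
	pairs := []fieldTerm{{1, "T1"}, {1, "T2"}, {1, "T3"}, {2, "T4"}, {3, "T5"}}
	for _, e := range groupByField(pairs) {
		fmt.Println(e.Field, e.Terms)
	}
}
```

With the grouped form, the field identifier is stored once per field rather than once per term, which is where the space saving for multi-term fields comes from.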
Marty Schoch	f94a790156	Merge pull request #520 from mschoch/faster_regexp improve performance of regular expression and wildcard queries	2017-01-18 16:31:49 -05:00
Marty Schoch	b55c9043b9	improve performance of regular expression and wildcard queries While researching an observed performance issue with wildcard queries, it was observed that the LiteralPrefix() method on the regexp.Regexp struct did not always behave as expected. In particular, when the pattern starts with ^, AND involves some backtracking, the LiteralPrefix() seems to always be the empty string. The side-effect of this is that we rely on having a helpful prefix, to reduce the number of terms in the term dictionary that need to be visited. This change now makes the searcher enforce start/end on the term directly, by using FindStringIndex() instead of Match(). Next, we also modified WildcardQuery and RegexpQuery to no longer include the ^ and $ modifiers. Documentation was also updated to instruct users that they should not include the ^ and $ modifiers in their patterns.	2017-01-18 16:22:16 -05:00
Marty Schoch	72731336bf	Merge pull request #517 from minagawa-sho/fix-confusing-variable-name fix the confusing variable name	2017-01-14 09:00:34 -05:00
Sho Minagawa	5537688394	fix the confusing variable name	2017-01-14 20:26:08 +09:00
Marty Schoch	269cc302e3	Merge pull request #514 from steveyen/master more upsidedown optimizations	2017-01-10 09:15:04 -05:00
Steve Yen	5927224e15	optimize mergeOldAndNew for case of first time a doc is seen	2017-01-09 22:48:58 -08:00
Steve Yen	790f2e3e32	optimize by alloc'ing arrays of TermFrequencyRow/TermVector	2017-01-09 22:42:00 -08:00
Marty Schoch	8cd6040b63	Merge pull request #512 from steveyen/master API change: optional SearchRequest.IncludeLocations flag	2017-01-09 14:19:17 -05:00
Marty Schoch	ae219d6397	Merge pull request #489 from Shugyousha/refactorphrasesearch Refactor PhraseSearcher	2017-01-09 14:13:22 -05:00
Steve Yen	8f4726ab10	use struct{}{} idiom instead of additional mark var	2017-01-09 10:17:26 -08:00
Marty Schoch	d081ed712a	Merge pull request #513 from mosuka/master renamed detect_lang to detectlang	2017-01-09 09:17:57 -05:00
Minoru Osuka	63c0d9a4d2	renamed detect_lang to detectlang	2017-01-09 16:51:48 +09:00
Steve Yen	302cac72c4	optimize mergeOldAndNew when non-update case	2017-01-08 17:59:49 -08:00
Steve Yen	931d133024	go fmt and go vet	2017-01-07 22:14:22 -08:00
Steve Yen	40780254ae	optimize upsidedown mergeOldAndNew existing key maps The optimization is to provide a better initial size to the map constructor and to use a 0-byte-sized struct{} as the map values.	2017-01-07 22:05:55 -08:00
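The two ideas in this commit message, sizing the map up front and using zero-byte struct{} values, can be shown in a few lines. The function name `keySet` and its input are hypothetical and not taken from the upsidedown code.

```go
package main

import "fmt"

// keySet builds a set of row keys. The map is given an initial size hint
// and uses struct{} values, which occupy zero bytes.
func keySet(keys [][]byte) map[string]struct{} {
	set := make(map[string]struct{}, len(keys))
	for _, k := range keys {
		set[string(k)] = struct{}{}
	}
	return set
}

func main() {
	set := keySet([][]byte{[]byte("a"), []byte("b"), []byte("a")})
	_, ok := set["a"]
	fmt.Println(len(set), ok)
}
```

The size hint lets the runtime reserve space once instead of growing the map repeatedly, and membership is tested with the two-value lookup form rather than by inspecting a stored value.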
Steve Yen	c2bafa2a51	optimize term vectors/locations via preallocated arrays The change should hit the allocator less often when processing term vectors/locations as it preallocates larger, contiguous arrays of records upfront.	2017-01-07 12:34:06 -08:00
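A minimal sketch of the preallocation idea follows, assuming a hypothetical `location` record and `allocLocations` helper; bleve's real TermVector types and call sites differ.

```go
package main

import "fmt"

// location is a hypothetical stand-in for a term vector/location record.
type location struct {
	Field           uint16
	Pos, Start, End uint64
}

// allocLocations makes one contiguous backing array for n records and
// returns pointers into it. That is two allocations in total (records
// plus pointer slice) rather than one allocation per record.
func allocLocations(n int) []*location {
	backing := make([]location, n)
	ptrs := make([]*location, n)
	for i := range backing {
		ptrs[i] = &backing[i]
	}
	return ptrs
}

func main() {
	locs := allocLocations(4)
	locs[2].Pos = 9
	fmt.Println(len(locs), locs[2].Pos)
}
```

A trade-off worth noting: the whole backing array stays reachable for as long as any single pointer into it is retained.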
Steve Yen	8b140d84c4	minor optimization of upsidedown backIndexRowForDoc This change might allow a smart enough golang compiler to perhaps allocate a backIndexRow on the stack rather than the heap.	2017-01-07 11:49:42 -08:00
Steve Yen	89a1cefde1	API change: optional SearchRequest.IncludeLocations flag This is a change in search result behavior in that location information is no longer provided by default with search results. Although this looks like a wide-ranging change, it's mostly a mechanical replacement of the explain bool flag with a new search.SearcherOptions struct, which holds both the Explain bool flag and the IncludeTermVectors bool flag.	2017-01-05 21:11:22 -08:00
Steve Yen	c21d27e15a	upsidedown TermFieldReader checks includeTermVectors flag param The flag was part of the API, but wasn't previously checked.	2017-01-05 21:10:27 -08:00
Marty Schoch	3b2bc30b54	fix type identification when object indexed is pointer to struct fixes #508	2016-12-08 08:07:38 -05:00
Marty Schoch	d4f21a6290	Merge pull request #503 from steveyen/master bleve/index/store/moss - accessor for underlying mossStore	2016-12-05 16:41:05 -05:00
Steve Yen	37490864ce	bleve/index/store/moss - accessor for underlying mossStore This change adds methods that provide access to the actual, underlying mossStore instance in the bleve/index/store/moss KVStore adaptor. This enables applications to utilize advanced, mossStore-specific features (such as partial rollback of indexes). See also https://issues.couchbase.com/browse/MB-17805	2016-12-05 12:25:29 -08:00
Marty Schoch	c351931701	Merge branch 'slavikm-master4'	2016-11-28 15:00:48 -05:00
Marty Schoch	c927e124dd	Merge branch 'master' of https://github.com/slavikm/bleve into slavikm-master4	2016-11-28 14:03:35 -05:00
slavikm	75c8c0e2b1	Revert the nil protection which is not needed	2016-11-23 09:26:07 -08:00
slavikm	20b847f04e	Added protection against nil Boost	2016-11-22 13:04:36 -08:00
slavikm	a4c94e440e	Added missing boost getters	2016-11-22 12:50:08 -08:00
Marty Schoch	58fe9b9562	Merge pull request #502 from pmezard/fix-docidreader-next-doc index: DocIDReader.Next() returns nil when done not io.EOF	2016-11-20 13:16:35 -05:00
Patrick Mezard	c81fd6fdb0	index: DocIDReader.Next() returns nil when done not io.EOF	2016-11-20 19:05:35 +01:00
Marty Schoch	3da28dfbc1	Merge pull request #499 from mschoch/498 add support for parsing BoolFieldQuery from JSON	2016-11-16 11:50:44 -05:00
Marty Schoch	d372602f3c	add support for parsing BoolFieldQuery from JSON presence of the "bool" key triggers parsing as a BoolFieldQuery fixes #498	2016-11-15 10:29:11 -05:00
slavikm	187d6013df	Make sure getters follow the Go convention	2016-11-14 15:30:07 -08:00
slavikm	339ddbe0fa	Added getters to boost and field query interfaces	2016-11-14 14:02:43 -08:00
Silvan Jegen	1a6a4c493b	Check locations in the phrase searcher as well	2016-11-08 20:05:36 +01:00
Silvan Jegen	33e2432fc6	Initialize the return value as late as possible	2016-11-08 20:05:36 +01:00
Silvan Jegen	3dd363afaa	Don't search the same term twice We have searched for the first term in the phrase query already so we can skip it. Before doing so we have to add the location of the first term.	2016-11-08 20:05:04 +01:00
Silvan Jegen	d87b4f88bf	Refactor phrase searching Reduce nesting by using early continues.	2016-11-08 20:04:28 +01:00
Marty Schoch	bcaea084c5	Merge pull request #496 from mschoch/fix495 fix date facets when using MultiSearch	2016-11-04 15:06:57 -04:00
Marty Schoch	8e2159cbe4	Merge pull request #494 from steveyen/MB-21474 simplified MultiSearch requires that indexes honor context deadlines	2016-11-04 15:06:47 -04:00
Marty Schoch	647bfd10ad	fix date facets when using MultiSearch changed date parsing to NOT update internal state of the date range object (avoids races) second, when marshaling a facet date range, we now use the string version, if the time.Time is zero and the string version is not ""	2016-11-04 14:02:01 -04:00
Steve Yen	dc2b6cd656	simplified MultiSearch requires that indexes honor context deadlines MultiSearch previously had its own timeout checking. This commit removes that timeout checking, so that now MultiSearch instead depends upon the bleve.Index implementations to perform their own context deadline/timeout checking. Because deadline/timeout checking is now handled by the bleve.Index implementations, this change allows applications to provide richer error and status results during timeouts.	2016-11-03 16:44:20 -07:00
Marty Schoch	a3f8953c9f	Merge pull request #493 from mschoch/fix492 numeric range facet merging compare range values not pointers	2016-11-03 16:28:04 -04:00
Marty Schoch	e5ec831250	numeric range facet merging compare range values not pointers fix #492	2016-11-03 15:48:46 -04:00
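The difference between comparing pointers and comparing the values behind them is easy to miss when range bounds are optional. The sketch below uses a hypothetical `numericRange` type with pointer bounds; it is not bleve's facet type, only an illustration of the class of bug the commit title describes.

```go
package main

import "fmt"

// numericRange is a hypothetical range with optional bounds; nil means
// the bound is absent.
type numericRange struct {
	Min, Max *float64
}

// sameBound compares the values behind two optional bounds. Two nil
// bounds are equal; a nil and a non-nil bound are not.
func sameBound(a, b *float64) bool {
	if a == nil || b == nil {
		return a == b
	}
	return *a == *b
}

// same reports whether two ranges have equal bounds by value.
func (r numericRange) same(o numericRange) bool {
	return sameBound(r.Min, o.Min) && sameBound(r.Max, o.Max)
}

func main() {
	lo1, lo2 := 1.0, 1.0
	a := numericRange{Min: &lo1}
	b := numericRange{Min: &lo2}
	// First value: pointer comparison. Second value: value comparison.
	fmt.Println(a.Min == b.Min, a.same(b))
}
```

Ranges that arrive from different indexes will generally hold distinct pointers even when the bounds are numerically identical, so a pointer comparison would treat them as different ranges during merging.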
Marty Schoch	81c76f2a4a	Merge pull request #490 from steveyen/master optimize NewRegexpSearcher to return its disjunction searcher	2016-10-31 10:39:50 -04:00
Steve Yen	adc409e823	optimize NewRegexpSearcher to return its disjunction searcher This minor optimization removes an unnecessary wrapper around the disjunction searcher.	2016-10-27 13:16:41 -07:00
Marty Schoch	1bd6451581	Merge pull request #487 from steveyen/optimize-facets-builder Minor facets builder optimizations	2016-10-26 14:14:25 -04:00
Marty Schoch	f45584bf54	Merge pull request #486 from robmccoll/feature/overridejsontag adding override for "json" in struct tags, tests	2016-10-26 14:11:22 -04:00
Steve Yen	2a8237e8cc	optimize FacetsBuilder with cached fields & avoid some allocs	2016-10-25 15:34:48 -07:00

1608 Commits