1124 Commits

Author SHA1 Message Date
Marty Schoch
79cc39a67e refactor mapping to interface and move into separate package
the index mapping contains some relatively messy logic
and the top-level bleve package only cares about a relatively
small portion of this
the motivation for this change is to codify the part that the
top-level bleve package cares about into an interface
then move all the details into its own package

NOTE: the top-level bleve package still has hard dependency on
the actual implementation (for now) because it must deserialize
mappings from JSON and simply assumes it is this one instance.
this is seen as OK for now, and this issue could be revisited
in a future change.  moving the logic into a separate package
is seen as a simplification of top-level bleve, even though
we still depend on the one particular implementation.
2016-09-29 14:53:18 -04:00
Marty Schoch
97393d0273 adding appenginevm build tag 2016-09-28 18:31:36 -04:00
Marty Schoch
b7c1139d8e Merge branch 'dtylman-master' 2016-09-28 08:30:30 -04:00
Marty Schoch
12bd29db80 add unit test for skipping more than number of hits found 2016-09-28 08:28:20 -04:00
Danny Tylman
fe0151287e closes #453
panic in collectStoreSlice.Final
2016-09-28 13:25:34 +03:00
Marty Schoch
73d0951b2a don't panic on missing backindex row
part of #419
2016-09-27 22:16:45 -04:00
Marty Schoch
1077cb6012 Merge pull request #452 from mschoch/memonly
BREAKING CHANGE - new method to create memory only index
2016-09-27 21:23:13 -04:00
Marty Schoch
fb0f4bbecd BREAKING CHANGE - new method to create memory only index
Previously bleve allowed you to create a memory-only index by
simply passing "" as the path argument to the New() method.

This was not clear when reading the code, and led to some
problematic error cases as well.

Now, to create a memory-only index one should use the
NewMemOnly() method.  Passing "" as the path argument
to the New() method will now return os.ErrInvalid.

Advanced users calling NewUsing() can create disk-based or
memory-only indexes, but the change here is that passing ""
as the path argument no longer defaults you into getting
a memory-only index.  Instead, the KV store is selected
manually, just as it is for the disk-based solutions.

Here is an example use of the NewUsing() method to create
a memory-only index:

NewUsing("", indexMapping, Config.DefaultIndexType,
         Config.DefaultMemKVStore, nil)

Config.DefaultMemKVStore is just a new default value
added to the configuration; it currently points to
gtreap.Name (which could have been used directly
instead for more control).

closes #427
2016-09-27 14:11:40 -04:00
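
The contract described in fb0f4bbecd can be sketched with stand-in types. This is not bleve's code: the `Index` struct is hypothetical, and the real `New()` and `NewMemOnly()` also take an index mapping, which is left out here to keep the sketch self-contained. It only illustrates the behavior the message states, namely that an empty path is rejected with os.ErrInvalid and that a memory-only index has to be requested explicitly.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Index is a hypothetical stand-in for a bleve index. It only records
// whether the index lives in memory or at a path on disk.
type Index struct {
	Path    string
	MemOnly bool
}

// New mimics the contract from the commit message: an empty path is
// rejected with os.ErrInvalid instead of silently selecting a
// memory-only index.
func New(path string) (*Index, error) {
	if path == "" {
		return nil, os.ErrInvalid
	}
	return &Index{Path: path}, nil
}

// NewMemOnly makes the memory-only choice explicit at the call site.
func NewMemOnly() (*Index, error) {
	return &Index{MemOnly: true}, nil
}

func main() {
	if _, err := New(""); errors.Is(err, os.ErrInvalid) {
		fmt.Println("empty path rejected")
	}
	idx, err := NewMemOnly()
	if err != nil {
		fmt.Println("unexpected error:", err)
		return
	}
	fmt.Println("memory-only:", idx.MemOnly)
}
```

Callers that relied on the old `New("")` shortcut would see the error immediately, which is the point of marking the change as breaking.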
Marty Schoch
3cf7e00b50 remove binary accidentally committed to repo
update .gitignore to prevent this in the future
2016-09-27 13:05:50 -04:00
Marty Schoch
e66861d5ba Merge pull request #451 from steveyen/master
Revert "optimize when disjunction query has only a single child"
2016-09-26 17:57:45 -04:00
Steve Yen
2b3e6ee836 Revert "optimize when disjunction query has only a single child"
See also: https://issues.couchbase.com/browse/MB-21046

This reverts commit 6d6fae2895.

It turns out that boolean searcher was depending on its 'should'
constituent (a disjunction query) and its min state changes, so a
rewrite wasn't safe/correct given this situation.
2016-09-26 14:46:21 -07:00
Marty Schoch
1f79f65b6a Merge pull request #450 from mschoch/bug449
fix logic in Advance() of UpsideDownCouchDocIDReader
2016-09-26 12:44:09 -04:00
Marty Schoch
981812ff70 fix logic in Advance() of UpsideDownCouchDocIDReader
also added unit tests for newUpsideDownCouchDocIDReaderOnly
use cases
fixes #449
2016-09-26 12:36:24 -04:00
Marty Schoch
2d2ee6f350 Merge pull request #447 from steveyen/upside_down-TFR-KeyAppendTo
added upside_down TermFrequencyRow.KeyAppendTo() API
2016-09-23 17:56:34 -04:00
Steve Yen
10cab1826d added upside_down TermFrequencyRow.KeyAppendTo() API
This is a cleanup commit that's a follow-up to a code review discussion
on a previous Advance() perf-optimization PR...

https://github.com/blevesearch/bleve/pull/443
2016-09-23 09:22:42 -07:00
Marty Schoch
1d81d34a5a Merge pull request #446 from steveyen/perf-locations-alloc
perf avoid locations alloc
2016-09-23 12:19:39 -04:00
Steve Yen
134f7b7479 update moss gvt manifest for SeekTo() API 2016-09-23 07:18:14 -07:00
Steve Yen
647a039a6f optimize disjunction scorer to avoid locations alloc if unneeded 2016-09-22 18:19:21 -07:00
Steve Yen
988dfb02e9 moss kvstore iterator Seek() invokes underlying moss SeekTo() API 2016-09-22 17:46:06 -07:00
Steve Yen
5f5b5d3b80 optimize upside_down TermFieldReader.Advance() to reuse memory
On a dev laptop, bleve-query benchmark on wiki dataset using
query-string of "+text:afternoon +text:coffee" previously had
throughput of 1222qps, and with this change hits 1940qps.
2016-09-22 17:46:06 -07:00
Steve Yen
e380245cd8 optimize ConjunctionSearcher to avoid extra id comparisons
This commit avoids extra id comparisons in the ConjunctionSearcher,
such as a comparison of the currs[maxIDIdx] to itself.
2016-09-22 17:46:06 -07:00
Marty Schoch
6566ace8bf Merge pull request #442 from steveyen/issue-441
issue 441 - upside_down termFieldReader doesn't call Next() early
2016-09-22 14:22:37 -04:00
Steve Yen
bcec199c89 issue 441 - upside_down termFieldReader doesn't call Next() early
This change to upside_down term-field-reader no longer moves the
underlying iterator forward preemptively.  Instead, it will invoke
Next() on the underlying iterator only when the caller invokes the
term-field-reader's Next().

There's a special case to handle the situation on the first Next()
invocation after the term-field-reader is created.
2016-09-22 09:18:29 -07:00
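
The lazy-advance behavior described in bcec199c89 might look roughly like the sketch below. The `sliceIterator` and `reader` types are hypothetical and much simpler than the upside_down term-field-reader; the assumption is that the underlying iterator is already positioned on its first entry when the reader is created, so the first Next() must return that entry without advancing.

```go
package main

import "fmt"

// sliceIterator is a hypothetical stand-in for a KV store iterator.
// It starts out positioned on its first item.
type sliceIterator struct {
	items []string
	pos   int
}

// Current returns the item under the cursor, or false when exhausted.
func (it *sliceIterator) Current() (string, bool) {
	if it.pos < len(it.items) {
		return it.items[it.pos], true
	}
	return "", false
}

// Next moves the cursor forward by one item.
func (it *sliceIterator) Next() { it.pos++ }

// reader advances the underlying iterator only when its own Next()
// is called, never ahead of the caller.
type reader struct {
	it      *sliceIterator
	started bool
}

// Next returns the next item. The first call is the special case:
// the iterator already sits on the first item, so it is not advanced.
func (r *reader) Next() (string, bool) {
	if r.started {
		r.it.Next()
	}
	r.started = true
	return r.it.Current()
}

func main() {
	r := &reader{it: &sliceIterator{items: []string{"a", "b"}}}
	for v, ok := r.Next(); ok; v, ok = r.Next() {
		fmt.Println(v)
	}
}
```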
Marty Schoch
47a98fcf1b Merge pull request #439 from steveyen/perf-boolean-searcher2
optimize when boolean query has should constituents only
2016-09-21 14:49:37 -04:00
Marty Schoch
4de6e9a2b9 Merge branch 'slavikm-master3' 2016-09-21 13:13:46 -04:00
Marty Schoch
da766263f9 Merge branch 'master' of https://github.com/slavikm/bleve into slavikm-master3 2016-09-21 13:04:07 -04:00
Marty Schoch
0236043f65 rewrite links suitable for blevesearch website 2016-09-21 12:58:18 -04:00
slavikm
3eec1ae16c Satisfy errcheck 2016-09-21 17:56:03 +03:00
slavikm
40c1dc076f Now, without the rollback 2016-09-21 16:15:06 +03:00
slavikm
588f379962 Commit if there is no error, rollback otherwise 2016-09-21 16:13:47 +03:00
slavikm
ac49306077 Make sure that the transaction is closed if there is an error 2016-09-21 14:32:05 +03:00
Steve Yen
6d6fae2895 optimize when disjunction query has only a single child
On my dev laptop, the bleve-query benchmark of query-string
"+text:afternoon +text:coffee" (which gets parsed into a conjunction of
disjunctions) had throughput of 308qps before this change, and after
this change was 342qps.
2016-09-20 23:09:26 -07:00
Steve Yen
38bd2fc058 ConjunctionSearcher avoids one internal id comparison 2016-09-20 23:03:15 -07:00
Steve Yen
cdfa2710fb moved ConjunctionSearcher fields for alignment 2016-09-20 22:29:40 -07:00
Steve Yen
e344582021 optimize DisjunctionSearcher.Next()
This change simplifies and removes the DisjunctionSearcher.currentID
tracking, and instead utilizes the matching/matchingIdxs slices
for tracking the required information.

As the core of the optimization, the previous code used two loop
passes to compare the internal ID's to the currentID field.  This
commit instead optimizes to have a single pass to both compare the
internalID's and to also maintain the matching/matchingIdxs arrays.

On my dev box, using a bleve-query benchmark on a wiki dataset, with
query-string of "text:afternoon text:coffee", the previous code had
throughput of 958qps, and this commit has 1174qps.
2016-09-20 19:22:37 -07:00
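
As an illustration of the single-pass idea in e344582021, the sketch below finds the smallest current ID among the child searchers and records which children hold it, all in one loop. It is not the DisjunctionSearcher code: IDs are plain strings here rather than bleve's internal IDs, an empty string is assumed to mark an exhausted child, and `matchingMin` is a hypothetical name.

```go
package main

import "fmt"

// matchingMin scans currs once. It returns the smallest ID and the
// positions in currs that hold it. idxs is reused as scratch space so
// its backing array can be kept between calls. An empty string in
// currs marks an exhausted child and is skipped.
func matchingMin(currs []string, idxs []int) (string, []int) {
	idxs = idxs[:0]
	lowest := ""
	for i, id := range currs {
		if id == "" {
			continue
		}
		switch {
		case len(idxs) == 0 || id < lowest:
			// New minimum: restart the list of matching positions.
			lowest = id
			idxs = append(idxs[:0], i)
		case id == lowest:
			// Same ID as the current minimum: another match.
			idxs = append(idxs, i)
		}
	}
	return lowest, idxs
}

func main() {
	id, idxs := matchingMin([]string{"b", "a", "", "a", "c"}, nil)
	fmt.Println(id, idxs)
}
```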
Steve Yen
75281a1f9f change DisjunctionSearcher.min type from float64 to int 2016-09-20 18:51:40 -07:00
Steve Yen
16dac98f71 optimization when boolean query has should constituents only
A common search case is when a user performs a query-string query,
such as for "the lazy dog".  That would be parsed into a boolean query
with a nil Must child, a nil MustNot child, and a non-nil Should child
(a disjunction query for "the", "lazy", "dog").

The optimization in this case is to return just the Should child
directly, skipping any additional Must and MustNot overhead.

On a dev box bleve-query benchmark on a wiki index with a query string
of "text:afternoon text:coffee", the throughput was previously 873qps
and with this change hits 940qps.
2016-09-20 18:26:00 -07:00
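
The shortcut described in 16dac98f71 can be sketched as a constructor that hands back the Should child when nothing else is present. The `Searcher` interface and all types below are hypothetical and far smaller than bleve's; the sketch only shows the shape of the check.

```go
package main

import "fmt"

// Searcher is a hypothetical, minimal interface for this sketch.
type Searcher interface {
	Name() string
}

type termSearcher struct{ term string }

func (t termSearcher) Name() string { return "term:" + t.term }

type disjunctionSearcher struct{ children []Searcher }

func (d disjunctionSearcher) Name() string { return "disjunction" }

type booleanSearcher struct{ must, should, mustNot Searcher }

func (b booleanSearcher) Name() string { return "boolean" }

// newBooleanSearcher returns the should child directly when must and
// mustNot are both nil, skipping the boolean wrapper entirely.
func newBooleanSearcher(must, should, mustNot Searcher) Searcher {
	if must == nil && mustNot == nil && should != nil {
		return should
	}
	return booleanSearcher{must: must, should: should, mustNot: mustNot}
}

func main() {
	should := disjunctionSearcher{children: []Searcher{
		termSearcher{"the"}, termSearcher{"lazy"}, termSearcher{"dog"},
	}}
	fmt.Println(newBooleanSearcher(nil, should, nil).Name())
	fmt.Println(newBooleanSearcher(termSearcher{"quick"}, should, nil).Name())
}
```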
Steve Yen
46a46357a7 simplify BooleanSearcher mustSearcher else logic 2016-09-20 18:11:04 -07:00
Marty Schoch
949ea6397c Merge pull request #438 from mschoch/buildtagdocs
add build tag protecting merge-coverprofile
2016-09-20 14:44:14 -04:00
Marty Schoch
85b61a8631 add build tag protecting merge-coverprofile
this should prevent people that run:
go get github.com/blevesearch/bleve/...
from getting a useless "docs" program in their bin/ dir
2016-09-20 14:29:01 -04:00
Marty Schoch
60ef1c89dc Merge pull request #430 from mschoch/newblevetool
migrated all bleve utils into single bleve command
2016-09-20 14:11:35 -04:00
Marty Schoch
0d52d2f8ea add build tag to ignore gendocs by default 2016-09-20 13:58:59 -04:00
Marty Schoch
81e676de79 improved usage and added utility to generate markdown docs 2016-09-20 13:42:45 -04:00
Marty Schoch
b896537eff Merge pull request #437 from steveyen/perf-boolean-searcher
optimize boolean search Next() with fewer id comparisons
2016-09-20 12:59:38 -04:00
Steve Yen
3acad78875 optimize boolean search Next() with fewer id comparisons
This change to the BooleanSearcher.Next() tries to perform fewer
internal id comparisons.
2016-09-20 09:43:01 -07:00
Marty Schoch
58a5ac2c45 Merge pull request #433 from steveyen/perf-misc
miscellaneous search perf tweaks
2016-09-18 14:12:34 -04:00
Steve Yen
26b621e916 reuse backing array of matches for boolean searcher
The reused backing array of constituent matches should help avoid
additional memory allocations.
2016-09-18 10:43:29 -07:00
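
The backing-array reuse mentioned in 26b621e916 relies on a common Go idiom: truncate a slice to length zero and append into it again. The sketch below uses a hypothetical `collector` type with string IDs, where an empty string is assumed to mean that a constituent did not match; it is not the boolean searcher's code.

```go
package main

import "fmt"

// collector keeps a scratch slice whose backing array is reused
// across calls instead of allocating a new slice each time.
type collector struct {
	matches []string
}

// collect gathers the non-empty IDs from hits. The returned slice
// aliases c.matches and is only valid until the next call.
func (c *collector) collect(hits []string) []string {
	c.matches = c.matches[:0] // keep capacity, drop old contents
	for _, id := range hits {
		if id != "" {
			c.matches = append(c.matches, id)
		}
	}
	return c.matches
}

func main() {
	c := &collector{}
	first := c.collect([]string{"a", "", "b", "c"})
	fmt.Println(first, cap(first))
	second := c.collect([]string{"", "d", "e"})
	fmt.Println(second, cap(second))
}
```

The trade-off is that results from an earlier call are overwritten by the next one, so a caller that needs to keep them must copy them first.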
Steve Yen
dd7cb14a56 disjunction searcher avoids second ID.Equals() comparison
Optimization for DisjunctionSearcher, where an extra matchingIdxs
helps track the currs that were matching.  This avoids the previous
code's second loop through the currs slice.
2016-09-18 10:43:16 -07:00
Steve Yen
090c08eb46 upside_down disjunction searcher reuses matching slice 2016-09-18 10:43:16 -07:00
Marty Schoch
e68f6ca9e6 Merge pull request #432 from steveyen/perf-skip-0xff-scan
skip termFrequencyRow 0xFF scan as term length is already known
2016-09-18 12:20:21 -04:00