
90 Commits

Author SHA1 Message Date
abhinavdangeti 715144d632 MB-27385: De-duplicate the list of requested fields
De-duplicate the list of fields provided by the client as part
of the search request, so as to not inadvertently load the same
stored field more than once.
2018-03-13 14:19:02 -07:00
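The de-duplication described in this commit can be sketched as an order-preserving pass over the requested field names. This is a minimal illustration only: `dedupeFields` is a hypothetical helper name, not the function added by the commit, and the real change may differ in detail.

```go
package main

import "fmt"

// dedupeFields returns the requested field names with duplicates removed,
// keeping the first occurrence of each name in its original position.
// Hypothetical helper for illustration; not taken from the commit itself.
func dedupeFields(fields []string) []string {
	seen := make(map[string]struct{}, len(fields))
	out := make([]string, 0, len(fields))
	for _, f := range fields {
		if _, ok := seen[f]; ok {
			continue // already requested, so the stored field is loaded only once
		}
		seen[f] = struct{}{}
		out = append(out, f)
	}
	return out
}

func main() {
	fmt.Println(dedupeFields([]string{"title", "body", "title", "*", "body"}))
	// prints [title body *]
}
```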
abhinavdangeti 40f63baeb9 MB-28562: Support search query callbacks before and after execution
+ SearchQueryStartCallback
+ SearchQueryEndCallback
2018-03-08 13:35:51 -08:00
abhinavdangeti 96071c085c MB-28163: Register a callback with context to estimate RAM for search
This callback, if registered with the context, will invoke the API to estimate
the memory needed to execute a search query. The callback defined at
the client side will be responsible for determining whether to
continue with the search or abort based on the threshold settings.
2018-03-06 13:53:42 -08:00
abhinavdangeti 7e36109b3c MB-28162: Provide API to estimate memory needed to run a search query
This API (unexported) will estimate the amount of memory needed to execute
a search query over an index before the collector begins data collection.

Sample estimates for certain queries:
{Size: 10, BenchmarkUpsidedownSearchOverhead}
                                                           ESTIMATE    BENCHMEM
TermQuery                                                  4616        4796
MatchQuery                                                 5210        5405
DisjunctionQuery (Match queries)                           7700        8447
DisjunctionQuery (Term queries)                            6514        6591
ConjunctionQuery (Match queries)                           7524        8175
Nested disjunction query (disjunction of disjunctions)     10306       10708
…
2018-03-06 13:53:42 -08:00
Marty Schoch c74e08f039 BREAKING API CHANGE - use stdlib context pkg
update all references to context to use std lib pkg
2018-02-27 11:33:43 -08:00
Seif Lotfy 06b4daed87 Add new IndexAdvanced function 2017-04-12 00:31:51 +02:00
Marty Schoch 3ad13236ec fix geopoint fields to be able to be stored and retrieved 2017-03-31 09:40:54 -04:00
Steve Yen 89a1cefde1 API change: optional SearchRequest.IncludeLocations flag
This is a change in search result behavior in that location
information is no longer provided by default with search results.

Although this looks like a wide-ranging change, it's mostly a
mechanical replacement of the explain bool flag with a new
search.SearcherOptions struct, which holds both the Explain bool flag
and the IncludeTermVectors bool flag.
2017-01-05 21:11:22 -08:00
Marty Schoch 647bfd10ad fix date facets when using MultiSearch
changed date parsing to NOT update internal state of the date
range object (avoids races)

second, when marshaling a facet date range, we now use the
string version, if the time.Time is zero and the string version
is not ""
2016-11-04 14:02:01 -04:00
Steve Yen 21b3d592b8 log slow queries only when Config.SlowSearchLogThreshold > 0 2016-10-10 11:34:32 -07:00
Marty Schoch 3a276153a3 actually rename packages to singular, not just directory name 2016-10-02 10:29:39 -04:00
Marty Schoch 2332455bd2 nicer formatting of license header 2016-10-02 10:13:14 -04:00
Marty Schoch 6bf9dd59ab BREAKING CHANGE - additional package renaming
i recently learned that package names should also prefer the
singular form, not the plural form
2016-10-01 17:20:59 -04:00
Marty Schoch f90856b8d3 BREAKING CHANGE - rename upside_down to upsidedown 2016-09-30 12:36:38 -04:00
Marty Schoch 79cc39a67e refactor mapping to interface and move into separate package
the index mapping contains some relatively messy logic
and the top-level bleve package only cares about a relatively
small portion of this
the motivation for this change is to codify the part that the
top-level bleve package cares about into an interface
then move all the details into its own package

NOTE: the top-level bleve package still has hard dependency on
the actual implementation (for now) because it must deserialize
mappings from JSON and simply assumes it is this one instance.
this is seen as OK for now, and this issue could be revisited
in a future change.  moving the logic into a separate package
is seen as a simplification of top-level bleve, even though
we still depend on the one particular implementation.
2016-09-29 14:53:18 -04:00
Marty Schoch fb0f4bbecd BREAKING CHANGE - new method to create memory only index
Previously bleve allowed you to create a memory-only index by
simply passing "" as the path argument to the New() method.

This was not clear when reading the code, and led to some
problematic error cases as well.

Now, to create a memory-only index one should use the
NewMemOnly() method.  Passing "" as the path argument
to the New() method will now return os.ErrInvalid.

Advanced users calling NewUsing() can create disk-based or
memory-only indexes, but the change here is that passing ""
as the path argument no longer defaults you into getting
a memory-only index.  Instead, the KV store is selected
manually, just as it is for the disk-based solutions.

Here is an example use of the NewUsing() method to create
a memory-only index:

NewUsing("", indexMapping, Config.DefaultIndexType,
         Config.DefaultMemKVStore, nil)

Config.DefaultMemKVStore is just a new default value
added to the configuration; it currently points to
gtreap.Name (which could have been used directly
instead for more control)

closes #427
2016-09-27 14:11:40 -04:00
Marty Schoch 3fd2a64872 BREAKING CHANGE - removed DumpXXX() methods from bleve.Index
The DumpXXX() methods were always documented as internal and
unsupported.  However, now they are being removed from the
public top-level API.  They are still available on the internal
IndexReader, which can be accessed using the Advanced() method.

The DocCount() and DumpXXX() methods on the internal index
have moved to the internal index reader, since they logically
operate on a snapshot of an index.
2016-09-13 12:40:01 -04:00
Marty Schoch 47c239ca7b refactored data structure out of collector
the TopNCollector now can either use a heap or a list

i did not code it to use an interface, because this is a very hot
loop during searching.  rather, it lets bleve developers easily
toggle between the two (or other ideas) by changing 2 lines

The list is faster in the benchmark, but causes more allocations.
The list is once again the default (for now).

To switch to the heap implementation, change:

store *collectStoreList
to
store *collectStoreHeap

and

newStoreList(...
to
newStoreHeap(...
2016-08-26 10:29:50 -04:00
Marty Schoch 0bb69a9a1c Merge branch 'master' of https://github.com/dtylman/bleve into sort-by-field-try2 2016-08-12 14:23:55 -04:00
Danny Tylman 0d6a2b565f closes #110 2016-08-10 11:44:31 +03:00
Danny Tylman 5164e70f6e Adding sort to SearchRequest. 2016-08-09 16:18:53 +03:00
Marty Schoch 5aa9e95468 major refactor of index/search API
index id's are now opaque (until finally returned to top-level user)
 - the TermFieldDoc's returned by TermFieldReader no longer contain doc id
 - instead they return an opaque IndexInternalID
 - items returned are still in the "natural index order"
 - but that is no longer guaranteed to be "doc id order"
 - correct behavior requires that they all follow the same order
 - but not any particular order

 - new API FinalizeDocID which converts index internal ID's to public string ID

 - APIs used internally which previously took doc id now take IndexInternalID
     - that is DocumentFieldTerms() and DocumentFieldTermsForFields()
 - however, APIs that are used externally do not reflect this change
     - that is Document()

 - DocumentIDReader follows the same changes, but this is less obvious
     - behavior clarified, used to iterate doc ids, BUT NOT in doc id order
     - method STILL available to iterate doc ids in range
     - but again, you won't get them in any meaningful order
     - new method to iterate actual doc ids from list of possible ids
         - this was introduced to make the DocIDSearcher continue working

searchers now work with the new opaque index internal doc ids
 - they return new DocumentMatchInternal (which does not have string ID)
scorers also work with these opaque index internal doc ids
 - they return DocumentMatchInternal (which does not have string ID)
collectors now also perform a final step of converting the final result
 - they STILL return traditional DocumentMatch (with string ID)
 - but they now also require an IndexReader (so that they can do the conversion)
2016-07-31 13:46:18 -04:00
Marty Schoch 389e18a779 attempt to support google app engine
the default configuration, which sets the default kv engine
to boltdb, is now done in a file protected with the !appengine
build tag.  this at least lets the analysis-wizzard app
run locally in the appengine simulator.

this still has not been tested on the real appengine, and further
changes may be required.
2016-07-29 21:29:05 -04:00
slavikm 6d830a9f3e Load the document only once for both fields and highlighter 2016-04-28 11:12:33 -07:00
Marty Schoch d0c6dbc9cf unregister index from expvar stats on close 2016-04-20 11:43:14 -04:00
Marty Schoch 709b418823 properly initialize index stats object for in memory indexes 2016-04-20 11:37:51 -04:00
Marty Schoch 2b82387eae export the Validate method on mapping objects 2016-03-28 17:14:41 -04:00
Marty Schoch e76b1dd8f3 better logging on index mapping corruption 2016-03-11 16:36:43 -05:00
Marty Schoch d7292ed891 add support for gathering stats via map for easier consumption 2016-03-07 18:37:46 -05:00
Marty Schoch 0b2380d9bf introduce ability for searches to timeout or be cancelled
our implementation uses: golang.org/x/net/context

New method SearchInContext() allows the user to run a search
in the provided context.  If that context is cancelled or
exceeds its deadline Bleve will attempt to stop and return
as soon as possible.  This is a *best effort* attempt at this
time and may *not* be in a timely manner.  If the caller must
return very near the timeout, the call should also be wrapped
in a goroutine.

The IndexAlias implementation is affected in a slightly more
complex way.  In order to return partial results when a timeout
occurs on some indexes, the timeout is strictly enforced, and
at the moment this does introduce an additional goroutine.

The Bleve implementation honoring the context is currently
very coarse-grained.  Specifically we check the Done() channel
between each DocumentMatch produced during the search.  In the
future we will propagate the context deeper into the internals
of Bleve, and this will allow finer-grained timeout behavior.
2016-03-02 17:30:21 -05:00
Marty Schoch 214b67ad66 SearchResult now includes a Status section
the Status section can report on the number of total/fail/success
indexes when querying across multiple indexes through IndexAlias

Further, searching an IndexAlias will now return partial results,
the burden is on the caller to check the number of failed
indexes and decide how to handle this situation.
2016-02-22 16:50:40 -05:00
Marty Schoch f38e3e1b24 remove temporary error and replace with permanent check 2016-02-03 10:23:49 -05:00
Marty Schoch c5dea9e882 fix accessing store via Advanced() method which was broken 2016-02-02 11:54:18 -05:00
Marty Schoch a236737a68 temporary workaround to avoid crashing when an index is not
behaving consistently with the API contracts
2016-02-01 12:31:26 -05:00
Marty Schoch 10e2207179 adding logging for unexplained observed behavior MB-17298
it would appear that a document lookup for an id fails
but that is a document id that was returned as a search hit
since we're using a stable snapshot, this should not happen
2016-01-25 10:45:58 -05:00
opennota 8517feb1c6 Fix some typos 2016-01-15 05:46:27 +07:00
slavikm 680be52f87 Implemented boolean field support 2016-01-11 17:18:03 -08:00
Marty Schoch d73beac3b9 search result hits now have a field with the name of the index
this allows you to figure out where a result actually came
from when using aliases
2015-12-08 13:55:04 -05:00
Marty Schoch aa7658bbb0 give indexes names, make stats available via expvar by default 2015-12-06 14:01:03 -05:00
Marty Schoch 64ce81c283 Merge branch 'master' into newkvstore 2015-09-29 14:06:27 -04:00
Marty Schoch da40935e22 Merge branch 'codesimplification' of https://github.com/Shugyousha/bleve into Shugyousha-codesimplification 2015-09-29 13:02:56 -04:00
Marty Schoch cddf90c0ee don't allow operations on empty doc id
fixes #239
2015-09-28 17:00:08 -04:00
Marty Schoch 1c9feaf792 fix backwards compatibility when index meta does not specify
the index type
2015-09-25 09:57:09 -07:00
Marty Schoch 900f1b4a67 major kvstore interface and impl overhaul
clarified the interface contract
2015-09-23 11:25:47 -07:00
Silvan Jegen 3414701fca Simplify returns 2015-09-21 20:47:10 +02:00
Marty Schoch f81b2be334 major refactor of bleve configuration
see #221 for full details
2015-09-16 17:10:59 -04:00
Marty Schoch dbb93b75a4 refactoring to allow pluggable index encodings
this lays the foundation for supporting the new firestorm
indexing scheme.  i'm merging these changes ahead of
the rest of the firestorm branch so i can continue
to make changes to the analysis pipeline in parallel
2015-09-02 13:12:08 -04:00
Marty Schoch a9c07acbfa refactor of kvstore api to support native merge in rocksdb
refactor to share code in emulated batch
refactor to share code in emulated merge
refactor index kvstore benchmarks to share more code
refactor index kvstore benchmarks to be more repeatable
2015-04-24 17:13:50 -04:00
Marty Schoch 11262c793f fix bug, internal ops must check that index is open
possibly fixes https://github.com/couchbaselabs/cbft/issues/49
2015-04-03 18:05:24 -04:00
Sathyanarayanan Gunasekaran 93e749bc0c Check all return errors
- Fix the following errors found by errcheck :
  $ bleve git:(master) errcheck github.com/blevesearch/bleve
  github.com/blevesearch/bleve/index_impl.go:206:25  defer indexReader.Close()
  github.com/blevesearch/bleve/index_impl.go:317:25  defer indexReader.Close()
  github.com/blevesearch/bleve/index_impl.go:353:25  defer indexReader.Close()
  github.com/blevesearch/bleve/index_impl.go:359:22  defer searcher.Close()
  github.com/blevesearch/bleve/index_impl.go:497:25  defer indexReader.Close()
  github.com/blevesearch/bleve/index_impl.go:644:20  defer reader.Close()
  github.com/blevesearch/bleve/index_meta.go:67:27   defer indexMetaFile.Close()
2015-03-11 01:28:51 -04:00