gibheer/bleve

Author	SHA1	Message	Date
abhinavdangeti	7e36109b3c	MB-28162: Provide API to estimate memory needed to run a search query. This API (unexported) will estimate the amount of memory needed to execute a search query over an index before the collector begins data collection. Sample estimates for certain queries ({Size: 10, BenchmarkUpsidedownSearchOverhead}), given as ESTIMATE / BENCHMEM: TermQuery 4616 / 4796; MatchQuery 5210 / 5405; DisjunctionQuery (Match queries) 7700 / 8447; DisjunctionQuery (Term queries) 6514 / 6591; ConjunctionQuery (Match queries) 7524 / 8175; Nested disjunction query (disjunction of disjunctions) 10306 / 10708 …	2018-03-06 13:53:42 -08:00
Marty Schoch	0eba2a3f0c	reduce garbage created while processing facets. previously we parsed/returned large sections of the document's back index row in order to compute facet information. this would require parsing the protobuf of the entire back index row. unfortunately this creates considerable garbage. this new version introduces a visitor/callback approach to working with data inside the back index row. the benefit of this approach is that we can let the higher-level code see values, prior to any copies of data being made or intermediate garbage being created. implementations of the callback must copy any value which they would like to retain beyond the callback. NOTE: this approach duplicates code from the automatically generated protobuf code. NOTE: this approach assumes that the "field" field is serialized before the "terms" field. This is guaranteed by our currently generated protobuf encoder, and is recommended by the protobuf spec. But decoders SHOULD support them occurring in any order, which we do not.	2017-03-02 17:00:46 -05:00
Marty Schoch	e5ec831250	numeric range facet merging: compare range values, not pointers. fix #492	2016-11-03 15:48:46 -04:00
Steve Yen	2a8237e8cc	optimize FacetsBuilder with cached fields & avoid some allocs	2016-10-25 15:34:48 -07:00
Marty Schoch	2332455bd2	nicer formatting of license header	2016-10-02 10:13:14 -04:00
Marty Schoch	27ba6187bc	adds support for more complex field sorts with object (not string). previously from JSON we would just deserialize strings like "-abv" or "city" or "_id" or "_score" as simple sorts on fields, ids or scores respectively. while this is simple and compact, it can be ambiguous (for example if you have a field starting with - or if you have a field named "_id" already). also, this simple syntax doesn't allow us to specify more complex options to deal with type/mode/missing. we keep support for the simple string syntax, but now also recognize a more expressive syntax like: { "by": "field", "field": "abv", "desc": true, "type": "string", "mode": "min", "missing": "first" }. type, mode and missing are optional and default to "auto", "default", and "last" respectively	2016-08-17 14:33:51 -07:00
Marty Schoch	750e0ac16c	change sort field impl to use indexed values not stored values	2016-08-17 09:20:44 -07:00
Marty Schoch	e188fe35f7	switch back to single DocumentMatch struct instead of separate DocumentMatch/DocumentMatchInternal. rules are simple: everything operates on the IndexInternalID field until the results are returned, then ID is set correctly. the IndexInternalID field is not exported to JSON	2016-08-01 14:58:02 -04:00
Marty Schoch	5aa9e95468	major refactor of index/search API. index id's are now opaque (until finally returned to top-level user) - the TermFieldDoc's returned by TermFieldReader no longer contain doc id - instead they return an opaque IndexInternalID - items returned are still in the "natural index order" - but that is no longer guaranteed to be "doc id order" - correct behavior requires that they all follow the same order - but not any particular order - new API FinalizeDocID which converts index internal ID's to public string ID - APIs used internally which previously took doc id now take IndexInternalID - that is DocumentFieldTerms() and DocumentFieldTermsForFields() - however, APIs that are used externally do not reflect this change - that is Document() - DocumentIDReader follows the same changes, but this is less obvious - behavior clarified, used to iterate doc ids, BUT NOT in doc id order - method STILL available to iterate doc ids in range - but again, you won't get them in any meaningful order - new method to iterate actual doc ids from list of possible ids - this was introduced to make the DocIDSearcher continue working. searchers now work with the new opaque index internal doc ids - they return new DocumentMatchInternal (which does not have string ID). scorers also work with these opaque index internal doc ids - they return DocumentMatchInternal (which does not have string ID). collectors now also perform a final step of converting the final result - they STILL return traditional DocumentMatch (with string ID) - but they now also require an IndexReader (so that they can do the conversion)	2016-07-31 13:46:18 -04:00
slavikm	fc990bc2d1	Remove the field IDs from outside of the index	2016-07-19 20:42:45 -07:00
slavikm	ce64c17be1	Do field cache only once per search	2016-07-17 16:29:17 -07:00
slavikm	9a9b630a6d	Make facets much faster	2016-07-17 15:31:35 -07:00
Marty Schoch	f1abf6beb3	facets now also have secondary sort. in case of term facets, secondary sort (after count) is on the term; for date and numeric facets, secondary sort is on the facet name. fixes #335	2016-03-14 12:02:30 -04:00
Marty Schoch	6c988de5b5	fix date facet merging for searches on index aliases. previously we incorrectly identified matching buckets by comparing string pointers. this worked in the unit test but not in real applications, since the strings result from date parsing inside the facet collector and are therefore different pointers	2016-02-23 15:33:07 -05:00
Marty Schoch	51a59cb05c	initial impl of Index Aliases. an IndexAlias allows you to easily work with one logical Index while changing the actual Index it's pointing to behind the scenes. Changing which actual Index is backing an IndexAlias can be done atomically so that your application smoothly transitions from one Index to another. A separate use of IndexAlias is allowed when the IndexAlias is defined to point to multiple Indexes. In this case only the Search() operation is supported, but the Search will be run on each of the underlying indexes in parallel, and the results will be merged.	2014-10-29 09:22:11 -04:00
Marty Schoch	198ca1ad4d	major refactor of kvstore/index internals, see below. In the index/store package: introduce KVReader (creates snapshot; all read operations consistent from this snapshot; must close to release); introduce KVWriter (only one writer active; access to all operations; allows for consistent read-modify-write; must close to release); introduce AssociativeMerge operation on batch (allows efficient read-modify-write for associative operations; used to consolidate updates to the term summary rows; saves 1 set and 1 get op per shared instance of term in field). In the index package: introduced an IndexReader, which exposes a consistent snapshot of the index for searching. At top level: all searches now operate on a consistent snapshot of the index	2014-09-12 17:21:35 -04:00
Marty Schoch	7a7eb2e94c	add newline between license and package. this avoids cluttering godocs with the license	2014-09-02 10:54:50 -04:00
Marty Schoch	1161361bea	rename imports from couchbaselabs to blevesearch	2014-08-28 15:38:57 -04:00
Marty Schoch	7bbaa8ecd5	added support for returning facet results with requests. supports terms, numeric ranges, and date ranges. closes #14	2014-08-11 11:03:29 -04:00

19 Commits