bleve

Author	SHA1	Message	Date
Steve Yen	c7a342bc7d	scorch conjuncts match phrase test passes The conjunction searcher Advance() method now checks if its curr doc-matches suffices before advancing them.	2017-12-23 09:19:40 -08:00
Steve Yen	d425a3be86	scorch fix disjunction searcher Advance() Found with "versus" test (TestScorchVersusUpsideDownBoltSmallMNSAM), which had a boolean query with a MustNot that was the same as the Must parameters. This replicates a situation found by Aruna/Mihir/testrunner/RQG (MB-27291). Example: "query": { "must_not": {"disjuncts": [ {"field": "body", "match": "hello"} ]}, "must": {"conjuncts": [ {"field": "body", "match": "hello"} ]} } The nested searchers along the MustNot pathway would end up looking roughly like... booleanSearcher MustNot => disjunctionSearcher => disjunctionSearcher => termSearcher On the first Next() call by the collector, the two disjunction searchers would run through their respective Next() method processing, which includes their initSearcher() processing on the first time. This has the effect of driving the leaf termSearcher through two Next() invocations. That is, if there were 3 docs (doc-1, doc-2, doc-3), the leaf termSearcher would at this point have moved to point to doc-3, while the topmost MustNot would have received doc-1. Next, the booleanSearcher's Must searcher would produce doc-2, so the booleanSearcher would try to Advance() the MustNot searcher to doc-2. But, in scorch, the leafmost termSearcher had already gotten past doc-2 and would return its doc-3. In upsidedown, in contrast, the leaf termSearcher would then drive the KVStore iterator with a Seek(doc-2), and the KVStore iterator would perform a backwards seek to reach doc-2. In scorch, however, backwards iteration seeking isn't supported. So, this fix checks the state of the disjunction searcher to see if we already have the necessary state so that we don't have to perform actual Advance()'es on the underlying searchers. This not only fixes the behavior w.r.t. scorch, but also can have an effect of potentially making upsidedown slightly faster as we're avoiding some backwards KVStore iterator seeks.	2017-12-21 18:20:04 -08:00
Steve Yen	93c787ca09	scorch versus_test.go passes errcheck	2017-12-21 16:49:39 -08:00
Steve Yen	b3e41335e1	scorch compared to upsidedown/bolt using templated, generated searches This is somewhat like a simple, unit-test'ish version of testrunner's random query generator, where this does not have a dependency on an external elasticsearch server, and instead depends on functional correctness when comparing to upsidedown/bolt.	2017-12-21 16:43:52 -08:00
Marty Schoch	1eba5541f2	introduce new query TermRange The term range query is not often used in full-text queries, but can be useful when filtering on keyword indexed text terms in the index. The JSON syntax to do a TermRange query is the same as for NumericRange, but the min/max values must be string and not float64.	2017-03-31 22:04:00 -04:00
Marty Schoch	9790574610	update to geo query parsing and top-level bleve accessibility - make geo queries accessible from top-level bleve - update query parsing to support same geo point formats as document parsing - add constructor for easier sorting by geo distance in Go - additional integration tests using alternate (GeoJSON) style points	2017-03-30 15:23:27 -04:00
Marty Schoch	5636536583	fixed typo and formatted searches.json through jq .	2017-03-29 19:33:54 -04:00
Marty Schoch	7f89ff9493	add geo integration tests	2017-03-29 18:57:35 -04:00
Marty Schoch	a5d1d7974c	add query support for multi-phrase when parsing json, when we encounter the key "terms", we first try to parse as traditional phrase query, then if that fails, we also try parsing it as multi-phrase	2017-02-10 16:46:38 -05:00
Steve Yen	89a1cefde1	API change: optional SearchRequest.IncludeLocations flag This is a change in search result behavior in that location information is no longer provided by default with search results. Although this looks like a wide-ranging change, it's mostly a mechanical replacement of the explain bool flag with a new search.SearcherOptions struct, which holds both the Explain bool flag and the IncludeTermVectors bool flag.	2017-01-05 21:11:22 -08:00
Marty Schoch	2332455bd2	nicer formatting of license header	2016-10-02 10:13:14 -04:00
Marty Schoch	d7298a6e97	remove commented out section found by @steveyen code review	2016-09-30 12:36:52 -04:00
Marty Schoch	35da361bfa	BREAKING CHANGE - renamed packages to be shorter and not use _ this commit only addresses the analysis sub-package	2016-09-30 12:36:10 -04:00
Marty Schoch	79cc39a67e	refactor mapping to interface and move into separate package the index mapping contains some relatively messy logic and the top-level bleve package only cares about a relatively small portion of this the motivation for this change is to codify the part that the top-level bleve package cares about into an interface then move all the details into its own package NOTE: the top-level bleve package still has hard dependency on the actual implementation (for now) because it must deserialize mappings from JSON and simply assumes it is this one instance. this is seen as OK for now, and this issue could be revisited in a future change. moving the logic into a separate package is seen as a simplification of top-level bleve, even though we still depend on the one particular implementation.	2016-09-29 14:53:18 -04:00
Marty Schoch	e1fb860a86	removed unused AsyncIndex interface	2016-09-13 08:42:36 -04:00
Marty Schoch	04fd62dec3	further tweaks, now all bleve tests pass	2016-09-11 20:29:15 -04:00
Marty Schoch	1ae938b781	add integration tests for sorting	2016-08-20 14:45:53 -04:00
Marty Schoch	9089de251f	remove byte_array_conveters fixes #392 fixes #100	2016-07-01 10:21:41 -04:00
Marty Schoch	2043bb4bf8	fix pagination bug introduced by collector optimization fixes #378 this bug was introduced by: `f2aba116c4` theory of operation for this collector (top N, skip K) - collect the highest scoring N+K results - if K > 0, skip K and return the next N internal details - the top N+K are kept in a list - the list is ordered from lowest scoring (first) to highest scoring (last) - as a hit comes in, we find where this new hit would fit into this list - if this caused the list to get too big, trim off the head (lowest scoring hit) theory of the optimization - we were not tracking the lowest score in the list - so if the score was lower than the lowest score, we would add/remove it - by keeping track of the lowest score in the list, we can avoid these ops problem with the optimization - the optimization worked by returning early - by returning early there was a subtle change to documents which had the same score - the reason is that which docs end up in the top N+K changed by returning early - why was that? docs are coming in, in order by key ascending - when finding the correct position to insert a hit into the list, we checked <, not <= the score - this has the subtle effect that docs with the same score end up in reverse order for example consider the following in progress list: doc ids [ c a b ] scores [ 1 5 9 ] if we now see doc d with score 5, we get: doc ids [ c a d b ] scores [ 1 5 5 9 ] While that appears in order (a, d) it is actually reverse order, because when we produce the top N we start at the end. theory of the fix - previous pagination depended on later hits with the same score "bumping" earlier hits with the same score off the bottom of the list - however, if we change the logic to <= instead of <, now the list in the previous example would look like: doc ids [ c d a b ] scores [ 1 5 5 9 ] - this small change means that now earlier (lower id) will score higher, and thus we no longer depend on later hits bumping things down, which means returning early is a valid thing to do NOTE: this does depend on the hits coming back in order by ID. this is not something strictly guaranteed, but it was the same assumption that allowed the original behavior This also has the side-effect that 2 hits with the same score come back in ascending ID order, which is somehow more pleasing to me than reverse order.	2016-06-01 11:35:18 -04:00
Marty Schoch	7ec37d6533	add support for wildcard and regexp queries to query string you can now use terms like: test?string* and similar text in query strings to perform wildcard searches. also if you use: /aregexp/ it will perform a regexp search as well	2016-04-08 15:56:02 -04:00
Marty Schoch	5badbfdb0e	allow running integration tests on alternate kvstore	2016-03-07 08:40:15 -05:00
Marty Schoch	5408083ab5	from JSON parsing regexp/wildcard queries defaulted to boost of 0 having boost of 0 led to invalid scores of NaN added integration test for wildcard query added ability to run single integration test at a time added assertion that score is not NaN/+Inf/-Inf	2016-02-23 09:22:39 -05:00
Marty Schoch	c07fa47551	added test case to verify boost is working	2016-02-05 13:10:01 -05:00
Marty Schoch	0bddafb9e1	properly anchor regexp patterns to end of term added integration tests for regexp anchoring fixes #329	2016-01-21 13:44:38 -05:00
Marty Schoch	b7c03dae1a	boolean query defaults to minShould of 0 fixes #258	2016-01-12 16:30:10 -05:00
Marty Schoch	8efbd556a3	fix indexing bug with data coming from arrays fixes #295	2015-12-21 14:59:32 -05:00
Marty Schoch	7bb58e1be4	add ability for integration test to check hit locations	2015-12-21 14:42:43 -05:00
Marty Schoch	f7698f1f15	support match_all, match_none and docid queries via JSON also fixed bug in docIDQuery execution which would cause not matching the highest docID passed in if it was in fact a valid ID	2015-12-16 14:53:14 -05:00
Marty Schoch	84ec206fec	add some tests for index names in results	2015-12-08 14:38:46 -05:00
Marty Schoch	b4d4ee2fff	fix incorrect results returned by phrase search previously phrase searcher would not validate that consecutive terms were actually occurring in the same array position fixes #292	2015-12-06 15:55:00 -05:00
Marty Schoch	a73a178923	fix incorrect prefix search behavior avoids double incrementing of end term when reading term dict fixes #293	2015-12-04 14:07:16 -05:00
Marty Schoch	699c86073a	make existing integration tests work with firestorm	2015-12-01 12:29:56 -05:00
Marty Schoch	f81b2be334	major refactor of bleve configuration see #221 for full details	2015-09-16 17:10:59 -04:00
Marty Schoch	f35e2e42df	fix highlighting to work on fields containing arrays fixes #170	2015-07-31 14:43:12 -04:00
Marty Schoch	2a8f319689	added test case for query string containing only MUST NOT clause	2015-07-13 15:30:19 -04:00
Marty Schoch	7f0961424d	updated tests for <mark></mark>	2015-07-06 18:00:05 -04:00
Marty Schoch	539aeb8dc7	fix errors identified by errcheck part of #169	2015-04-07 18:05:41 -04:00
Marty Schoch	56c4a09de1	fix issues identified by errcheck part of #169	2015-04-07 15:39:56 -04:00
Marty Schoch	0df0a6fcb2	better logging on which test failed in integration tests	2015-03-10 14:05:30 -04:00
Marty Schoch	a69fa1e91d	adding tests based on problems found with fosdem dataset	2015-01-22 09:57:26 -05:00
Silvan Jegen	ef18dfe4cd	Fix typos in comments and strings	2014-12-18 18:43:12 +01:00
Marty Schoch	a2c3fa262a	add more test cases of index tests fields, highlighting, document field loading	2014-11-26 15:36:58 -05:00
Marty Schoch	65fe69d705	added integration tests for facets	2014-11-25 17:18:16 -05:00
Marty Schoch	67beaca6d6	fix to phrase/phrase match search involving stop words closes #122	2014-11-25 10:07:54 -05:00
Marty Schoch	12ec3173fa	added integration test for fuzzy search	2014-11-21 14:01:48 -05:00
Marty Schoch	68a2b9614d	refactored integration tests into separate package also made integration tests declarative you can now easily define new datasets/mappings/searches/results	2014-11-19 15:58:15 -05:00

46 Commits
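
The collector behavior described in commit 2043bb4bf8 (top N, skip K, with the `<=` insertion rule and the early return) can be sketched in Go roughly as follows. This is a minimal illustration under stated assumptions, not bleve's actual collector: the `Hit` and `TopNCollector` types and the `page` helper are hypothetical names introduced here, and hits are assumed to arrive in ascending ID order, as the commit message notes.

```go
package main

import (
	"fmt"
	"strings"
)

// Hit is a hypothetical stand-in for a search hit (not bleve's type).
type Hit struct {
	ID    string
	Score float64
}

// TopNCollector keeps the best size+skip hits, ordered lowest score first.
type TopNCollector struct {
	size, skip int
	hits       []Hit
}

func NewTopNCollector(size, skip int) *TopNCollector {
	return &TopNCollector{size: size, skip: skip}
}

// Collect adds a hit. Hits are assumed to arrive in ascending ID order.
func (c *TopNCollector) Collect(h Hit) {
	limit := c.size + c.skip
	if limit <= 0 {
		return
	}
	// Early return: the list is full and this hit scores below the lowest kept hit.
	if len(c.hits) == limit && h.Score < c.hits[0].Score {
		return
	}
	// Stop at the first kept hit whose score is >= the new score (the "<=" rule),
	// so a later hit with an equal score is placed before the earlier one.
	pos := 0
	for pos < len(c.hits) && c.hits[pos].Score < h.Score {
		pos++
	}
	c.hits = append(c.hits, Hit{})
	copy(c.hits[pos+1:], c.hits[pos:])
	c.hits[pos] = h
	// Trim the head (lowest scoring hit) if the list grew past N+K.
	if len(c.hits) > limit {
		c.hits = c.hits[1:]
	}
}

// Results walks from the end (highest score), skips K hits, and returns up to N.
func (c *TopNCollector) Results() []Hit {
	out := make([]Hit, 0, c.size)
	for i := len(c.hits) - 1 - c.skip; i >= 0 && len(out) < c.size; i-- {
		out = append(out, c.hits[i])
	}
	return out
}

// page collects all hits and returns the IDs of one result page, space separated.
func page(size, skip int, in []Hit) string {
	c := NewTopNCollector(size, skip)
	for _, h := range in {
		c.Collect(h)
	}
	ids := []string{}
	for _, h := range c.Results() {
		ids = append(ids, h.ID)
	}
	return strings.Join(ids, " ")
}

func main() {
	// Same docs and scores as the commit's example, fed in ascending ID order.
	in := []Hit{{"a", 5}, {"b", 9}, {"c", 1}, {"d", 5}}
	fmt.Println(page(2, 0, in)) // first page
	fmt.Println(page(2, 2, in)) // second page
}
```

With the commit's example data, the internal list for N+K = 4 ends up as `[ c d a b ]`, so the first page prints `b a` and the second prints `d c`: the two hits with score 5 come back in ascending ID order and the pages do not overlap.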