bleve

Author	SHA1	Message	Date
Marty Schoch	1eba5541f2	introduce new query TermRange. The term range query is not often used in full-text queries, but can be useful when filtering on keyword indexed text terms in the index. The JSON syntax to do a TermRange query is the same as for NumericRange, but the min/max values must be string and not float64.	2017-03-31 22:04:00 -04:00
Marty Schoch	2043bb4bf8	fix pagination bug introduced by collector optimization
fixes #378
this bug was introduced by: `f2aba116c4`

theory of operation for this collector (top N, skip K)
- collect the highest scoring N+K results
- if K > 0, skip K and return the next N

internal details
- the top N+K are kept in a list
- the list is ordered from lowest scoring (first) to highest scoring (last)
- as a hit comes in, we find where this new hit would fit into this list
- if this caused the list to get too big, trim off the head (lowest scoring hit)

theory of the optimization
- we were not tracking the lowest score in the list
- so if the score was lower than the lowest score, we would add/remove it
- by keeping track of the lowest score in the list, we can avoid these ops

problem with the optimization
- the optimization worked by returning early
- by returning early there was a subtle change to documents which had the same score
- the reason is that which docs end up in the top N+K changed by returning early
- why was that? docs are coming in, in order by key ascending
- when finding the correct position to insert a hit into the list, we checked <, not <= the score
- this has the subtle effect that docs with the same score end up in reverse order

for example consider the following in progress list:
doc ids [ c a b ]
scores  [ 1 5 9 ]
if we now see doc d with score 5, we get:
doc ids [ c a d b ]
scores  [ 1 5 5 9 ]
While that appears in order (a, d) it is actually reverse order, because when we produce the top N we start at the end.

theory of the fix
- previous pagination depended on later hits with the same score "bumping" earlier hits with the same score off the bottom of the list
- however, if we change the logic to <= instead of <, now the list in the previous example would look like:
doc ids [ c d a b ]
scores  [ 1 5 5 9 ]
- this small change means that now earlier (lower id) will score higher, and thus we no longer depend on later hits bumping things down, which means returning early is a valid thing to do

NOTE: this does depend on the hits coming back in order by ID. this is not something strictly guaranteed, but it was the same assumption that allowed the original behavior

This also has the side-effect that 2 hits with the same score come back in ascending ID order, which is somehow more pleasing to me than reverse order.	2016-06-01 11:35:18 -04:00
Marty Schoch	7ec37d6533	add support for wildcard and regexp queries to query string. You can now use terms like test?string* and similar text in query strings to perform wildcard searches. Also, if you use /aregexp/ it will perform a regexp search as well.	2016-04-08 15:56:02 -04:00
Marty Schoch	5408083ab5	from JSON parsing, regexp/wildcard queries defaulted to boost of 0; having boost of 0 led to invalid scores of NaN. Added integration test for wildcard query, added ability to run a single integration test at a time, and added assertion that score is not NaN/+Inf/-Inf.	2016-02-23 09:22:39 -05:00
Marty Schoch	c07fa47551	added test case to verify boost is working	2016-02-05 13:10:01 -05:00
Marty Schoch	0bddafb9e1	properly anchor regexp patterns to end of term added integration tests for regexp anchoring fixes #329	2016-01-21 13:44:38 -05:00
Marty Schoch	b7c03dae1a	boolean query defaults to minShould of 0 fixes #258	2016-01-12 16:30:10 -05:00
Marty Schoch	f7698f1f15	support match_all, match_none and docid queries via JSON. Also fixed bug in docIDQuery execution which would fail to match the highest docID passed in, even if it was in fact a valid ID.	2015-12-16 14:53:14 -05:00
Marty Schoch	a73a178923	fix incorrect prefix search behavior avoids double incrementing of end term when reading term dict fixes #293	2015-12-04 14:07:16 -05:00
Marty Schoch	f35e2e42df	fix highlighting to work on fields containing arrays fixes #170	2015-07-31 14:43:12 -04:00
Marty Schoch	2a8f319689	added test case for query string containing only MUST NOT clause	2015-07-13 15:30:19 -04:00
Marty Schoch	7f0961424d	updated tests for <mark></mark>	2015-07-06 18:00:05 -04:00
Marty Schoch	a2c3fa262a	add more test cases of index tests fields, highlighting, document field loading	2014-11-26 15:36:58 -05:00
Marty Schoch	12ec3173fa	added integration test for fuzzy search	2014-11-21 14:01:48 -05:00
Marty Schoch	68a2b9614d	refactored integration tests into separate package. Also made integration tests declarative: you can now easily define new datasets/mappings/searches/results.	2014-11-19 15:58:15 -05:00

15 Commits