26 Commits

Author SHA1 Message Date
Marty Schoch 1eba5541f2 introduce new query TermRange
The term range query is not often used in full-text queries, but
can be useful when filtering on keyword indexed text terms in
the index.

The JSON syntax to do a TermRange query is the same as for
NumericRange, but the min/max values must be string and not
float64.
2017-03-31 22:04:00 -04:00
Marty Schoch 9790574610 update to geo query parsing and top-level bleve accessibility
- make geo queries accessible from top-level bleve
- update query parsing to support same geo point formats as
  document parsing
- add constructor for easier sorting by geo distance in Go
- additional integration tests using alternate (GeoJSON) style points
2017-03-30 15:23:27 -04:00
Marty Schoch 5636536583 fixed typo and formatted searches.json through jq . 2017-03-29 19:33:54 -04:00
Marty Schoch 7f89ff9493 add geo integration tests 2017-03-29 18:57:35 -04:00
Marty Schoch a5d1d7974c add query support for multi-phrase
when parsing json, when we encounter the key "terms", we first
try to parse as traditional phrase query, then if that fails,
we also try parsing it as multi-phrase
2017-02-10 16:46:38 -05:00
Steve Yen 89a1cefde1 API change: optional SearchRequest.IncludeLocations flag
This is a change in search result behavior in that location
information is no longer provided by default with search results.

Although this looks like a wide-ranging change, it's mostly a
mechanical replacement of the explain bool flag with a new
search.SearcherOptions struct, which holds both the Explain bool flag
and the IncludeTermVectors bool flag.
2017-01-05 21:11:22 -08:00
Marty Schoch 1ae938b781 add integration tests for sorting 2016-08-20 14:45:53 -04:00
Marty Schoch 2043bb4bf8 fix pagination bug introduced by collector optimization
fixes #378

this bug was introduced by:
f2aba116c4

theory of operation for this collector (top N, skip K)

- collect the highest scoring N+K results
- if K > 0, skip K and return the next N

internal details

- the top N+K are kept in a list
- the list is ordered from lowest scoring (first) to highest scoring (last)
- as a hit comes in, we find where this new hit would fit into this list
- if this caused the list to get too big, trim off the head (lowest scoring hit)

theory of the optimization

- we were not tracking the lowest score in the list
- so if the score was lower than the lowest score, we would add/remove it
- by keeping track of the lowest score in the list, we can avoid these ops

problem with the optimization
- the optimization worked by returning early
- by returning early there was a subtle change to documents which had the same score
- the reason is that which docs end up in the top N+K changed by returning early
- why was that? docs are coming in, in order by key ascending
- when finding the correct position to insert a hit into the list, we checked <, not <= the score
- this has the subtle effect that docs with the same score end up in reverse order

for example consider the following in progress list:

doc ids [   c    a    b  ]
scores  [   1    5    9  ]

if we now see doc d with score 5, we get:

doc ids [   c    a    d    b  ]
scores  [   1    5    5    9  ]

While that appears in order (a, d) it is actually reverse order, because when we
produce the top N we start at the end.

theory of the fix

- previous pagination depended on later hits with the same score "bumping" earlier
hits with the same score off the bottom of the list
- however, if we change the logic to <= instead of <, now the list in the previous
example would look like:

doc ids [   c    d    a    b  ]
scores  [   1    5    5    9  ]

- this small change means that among hits with the same score, earlier ones
(lower id) now rank higher, and thus we no longer depend on later hits bumping
things down, which means returning early is a valid thing to do

NOTE: this does depend on the hits coming back in order by ID.  this is not
something strictly guaranteed, but it was the same assumption that allowed the
original behavior

This also has the side-effect that 2 hits with the same score come back in
ascending ID order, which is somehow more pleasing to me than reverse order.
2016-06-01 11:35:18 -04:00
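The collector behaviour described in the commit above (top N, skip K, with the `<=` comparison and the early return) can be sketched as follows. This is a minimal sketch, not the bleve collector: the `hit` and `collector` types and their methods are hypothetical, and hits are assumed to arrive in ascending ID order, as the commit message notes.

```go
package main

import "fmt"

// hit is a hypothetical search result: a document ID and its score.
type hit struct {
	id    string
	score float64
}

// collector keeps the best size+skip hits, ordered from lowest scoring
// (first) to highest scoring (last).
type collector struct {
	size, skip int
	hits       []hit
}

// collect inserts h into the ordered list, trimming the head if needed.
func (c *collector) collect(h hit) {
	limit := c.size + c.skip
	// Early return: once the list is full, a hit that does not beat the
	// lowest score would be inserted at the head and trimmed right away.
	if len(c.hits) == limit && (limit == 0 || h.score <= c.hits[0].score) {
		return
	}
	// Stop at the first entry whose score is >= the new score, so a later
	// hit with an equal score lands below the earlier one.
	pos := 0
	for pos < len(c.hits) && c.hits[pos].score < h.score {
		pos++
	}
	c.hits = append(c.hits, hit{})
	copy(c.hits[pos+1:], c.hits[pos:])
	c.hits[pos] = h
	if len(c.hits) > limit {
		c.hits = c.hits[1:] // trim the head (lowest scoring hit)
	}
}

// results walks the list from the end, skips the first skip hits and
// returns the IDs of the next size hits.
func (c *collector) results() []string {
	var ids []string
	for i := len(c.hits) - 1 - c.skip; i >= 0 && len(ids) < c.size; i-- {
		ids = append(ids, c.hits[i].id)
	}
	return ids
}

func main() {
	// Same docs and scores as the example in the commit message,
	// arriving in ascending ID order.
	c := &collector{size: 2, skip: 1}
	for _, h := range []hit{{"a", 5}, {"b", 9}, {"c", 1}, {"d", 5}} {
		c.collect(h)
	}
	fmt.Println(c.results()) // [a d]
}
```

With size 2 and skip 1 the top hit `b` is skipped, and the two hits tied on score 5 come back in ascending ID order.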
Marty Schoch 7ec37d6533 add support for wildcard and regexp queries to query string
you can now use terms like:

test?string*

and similar text in query strings to perform wildcard
searches.  also if you use:

/aregexp/

it will perform a regexp search as well
2016-04-08 15:56:02 -04:00
Marty Schoch 5408083ab5 regexp/wildcard queries parsed from JSON defaulted to boost of 0
having a boost of 0 led to invalid scores of NaN
added integration test for wildcard query
added ability to run a single integration test at a time
added assertion that score is not NaN/+Inf/-Inf
2016-02-23 09:22:39 -05:00
Marty Schoch c07fa47551 added test case to verify boost is working 2016-02-05 13:10:01 -05:00
Marty Schoch 0bddafb9e1 properly anchor regexp patterns to end of term
added integration tests for regexp anchoring
fixes #329
2016-01-21 13:44:38 -05:00
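The anchoring problem fixed in the commit above can be illustrated with Go's standard `regexp` package, where an unanchored pattern matches any substring. The helper `matchesWholeTerm` is hypothetical and shows only the general idea of requiring the pattern to cover the whole term; the commit's actual change is not shown here.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchesWholeTerm reports whether pattern matches the entire term by
// wrapping it in ^(?:...)$ before compiling.
func matchesWholeTerm(pattern, term string) (bool, error) {
	re, err := regexp.Compile(`^(?:` + pattern + `)$`)
	if err != nil {
		return false, err
	}
	return re.MatchString(term), nil
}

func main() {
	// Unanchored: "mart" is found inside "marty".
	fmt.Println(regexp.MustCompile("mart").MatchString("marty")) // true

	// Anchored to both ends of the term: "mart" no longer matches "marty".
	ok, _ := matchesWholeTerm("mart", "marty")
	fmt.Println(ok) // false

	ok, _ = matchesWholeTerm("mart.*", "marty")
	fmt.Println(ok) // true
}
```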
Marty Schoch b7c03dae1a boolean query defaults to minShould of 0
fixes #258
2016-01-12 16:30:10 -05:00
Marty Schoch 8efbd556a3 fix indexing bug with data coming from arrays
fixes #295
2015-12-21 14:59:32 -05:00
Marty Schoch f7698f1f15 support match_all, match_none and docid queries via JSON
also fixed a bug in docIDQuery execution that failed to match the
highest docID passed in, even when it was in fact a valid ID
2015-12-16 14:53:14 -05:00
Marty Schoch b4d4ee2fff fix incorrect results returned by phrase search
previously phrase searcher would not validate that consecutive
terms were actually occurring in the same array position

fixes #292
2015-12-06 15:55:00 -05:00
Marty Schoch a73a178923 fix incorrect prefix search behavior
avoids double incrementing of end term when reading term dict
fixes #293
2015-12-04 14:07:16 -05:00
Marty Schoch f35e2e42df fix highlighting to work on fields containing arrays
fixes #170
2015-07-31 14:43:12 -04:00
Marty Schoch 2a8f319689 added test case for query string containing only MUST NOT clause 2015-07-13 15:30:19 -04:00
Marty Schoch 7f0961424d updated tests for <mark></mark> 2015-07-06 18:00:05 -04:00
Marty Schoch a69fa1e91d adding tests based on problems found with fosdem dataset 2015-01-22 09:57:26 -05:00
Marty Schoch a2c3fa262a add more test cases of index
tests fields, highlighting, document field loading
2014-11-26 15:36:58 -05:00
Marty Schoch 65fe69d705 added integration tests for facets 2014-11-25 17:18:16 -05:00
Marty Schoch 67beaca6d6 fix to phrase/phrase match search involving stop words
closes #122
2014-11-25 10:07:54 -05:00
Marty Schoch 12ec3173fa added integration test for fuzzy search 2014-11-21 14:01:48 -05:00
Marty Schoch 68a2b9614d refactored integration tests into separate package
also made integration tests declarative
you can now easily define new datasets/mappings/searches/results
2014-11-19 15:58:15 -05:00