39 Commits

Author SHA1 Message Date
Nicholas Wiersma b12c902457 Allow pre parsing query strings 2017-04-25 16:30:47 +02:00
Marty Schoch 1eba5541f2 introduce new query TermRange
The term range query is not often used in full-text queries, but
can be useful when filtering on keyword indexed text terms in
the index.

The JSON syntax to do a TermRange query is the same as for
NumericRange, but the min/max values must be string and not
float64.
2017-03-31 22:04:00 -04:00
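
The TermRange commit above can be illustrated with a minimal sketch. The `termRange` type, its field names, and the JSON keys (`min`, `max`, `inclusive_min`, `inclusive_max`, `field`) are assumptions modeled on the commit's statement that the syntax matches NumericRange; this is not bleve's actual implementation. It shows string bounds compared lexicographically and a numeric `min` being rejected at parse time.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// termRange is a hypothetical stand-in for the query described above.
// Min and Max are strings (not float64), as the commit message requires.
type termRange struct {
	Min          *string `json:"min,omitempty"`
	Max          *string `json:"max,omitempty"`
	InclusiveMin bool    `json:"inclusive_min"`
	InclusiveMax bool    `json:"inclusive_max"`
	Field        string  `json:"field"`
}

// parseTermRange decodes the assumed JSON form; a numeric min or max
// fails because the target fields are strings.
func parseTermRange(raw string) (termRange, error) {
	var r termRange
	err := json.Unmarshal([]byte(raw), &r)
	return r, err
}

// matches reports whether term falls inside the range, using Go's
// byte-wise lexicographic string comparison.
func (r termRange) matches(term string) bool {
	if r.Min != nil {
		if term < *r.Min || (term == *r.Min && !r.InclusiveMin) {
			return false
		}
	}
	if r.Max != nil {
		if term > *r.Max || (term == *r.Max && !r.InclusiveMax) {
			return false
		}
	}
	return true
}

func main() {
	r, err := parseTermRange(`{"min":"apple","max":"mango","inclusive_min":true,"field":"tag"}`)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, t := range []string{"apple", "banana", "mango", "zebra"} {
		fmt.Println(t, r.matches(t))
	}
	_, err = parseTermRange(`{"min": 1.5, "field": "tag"}`)
	fmt.Println("numeric min rejected:", err != nil)
}
```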
Marty Schoch 6554e9624f geo review comments from sreekanth
also one fix came from steve, i must have forgotten to push that
commit up before merging
2017-03-31 08:41:40 -04:00
Marty Schoch 9790574610 update to geo query parsing and top-level bleve accessibility
- make geo queries accessible from top-level bleve
- update query parsing to support same geo point formats as
  document parsing
- add constructor for easier sorting by geo distance in Go
- additional integration tests using alternate (GeoJSON) style points
2017-03-30 15:23:27 -04:00
Marty Schoch 6507e31787 improved geo searcher unit tests
also added flag for bounding box searcher to optionally not
check boundaries.  this is useful when other searchers are going
to check every point anyway by some other criteria.
2017-03-29 16:57:58 -04:00
Marty Schoch a16efa5e78 add experimental support for indexing/query geo points
New field type GeoPointField, or "geopoint" in mapping JSON.

Currently structs and maps are considered when a mapping explicitly
marks a field as type "geopoint".  Several variants of "lon", "lng", and "lat"
are looked for in map keys, struct field names, or method names.

New query type GeoBoundingBoxQuery searches for documents which have a
GeoPointField indexed with a value that is inside the specified bounding box.

New query type GeoDistanceQuery searches for documents which have a
GeoPointField indexed with a value that is less than or equal to the
specified distance from the specified location.

New sort by method "geo_distance".  Hits can be sorted by their distance
from the specified location.

New geo utility package with all routines ported from Lucene.

New FilteringSearcher, which wraps an existing Searcher, but filters
all hits with a user-provided callback.
2017-03-24 17:22:21 -07:00
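
As a rough illustration of the geo commit above, the sketch below combines a bounding-box test, a distance test, and a callback-based filter. Everything here is hypothetical: the `point` type, the function names, and the use of the standard haversine formula are assumptions, and the routines ported from Lucene may compute distance differently. The box check assumes the box does not cross the antimeridian.

```go
package main

import (
	"fmt"
	"math"
)

// earthRadiusKm is the mean Earth radius; an assumed constant for this sketch.
const earthRadiusKm = 6371.0088

// point is a hypothetical lon/lat pair in degrees.
type point struct{ Lon, Lat float64 }

// haversineKm returns the great-circle distance between a and b in kilometers.
func haversineKm(a, b point) float64 {
	lat1, lat2 := a.Lat*math.Pi/180, b.Lat*math.Pi/180
	dLat := lat2 - lat1
	dLon := (b.Lon - a.Lon) * math.Pi / 180
	h := math.Sin(dLat/2)*math.Sin(dLat/2) +
		math.Cos(lat1)*math.Cos(lat2)*math.Sin(dLon/2)*math.Sin(dLon/2)
	return 2 * earthRadiusKm * math.Asin(math.Min(1, math.Sqrt(h)))
}

// inBox reports whether p lies inside the box given by its top-left and
// bottom-right corners. Boxes crossing the antimeridian are not handled.
func inBox(p, topLeft, bottomRight point) bool {
	return p.Lon >= topLeft.Lon && p.Lon <= bottomRight.Lon &&
		p.Lat <= topLeft.Lat && p.Lat >= bottomRight.Lat
}

// filter keeps only the hits accepted by the user-provided callback,
// in the spirit of a searcher that wraps another and filters its hits.
func filter(hits []point, accept func(point) bool) []point {
	var kept []point
	for _, h := range hits {
		if accept(h) {
			kept = append(kept, h)
		}
	}
	return kept
}

func main() {
	center := point{Lon: 0, Lat: 0}
	hits := []point{{Lon: 0.5, Lat: 0.5}, {Lon: 3, Lat: 3}, {Lon: -0.2, Lat: 0.1}}

	// Coarse pass: bounding box. Exact pass: distance from the center.
	inside := filter(hits, func(p point) bool {
		return inBox(p, point{Lon: -1, Lat: 1}, point{Lon: 1, Lat: -1})
	})
	near := filter(inside, func(p point) bool {
		return haversineKm(center, p) <= 50
	})
	fmt.Println("in box:", inside)
	fmt.Println("within 50 km:", near)
}
```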
Marty Schoch 0aab8d7fb9 fix query string parsing of numeric ranges with negative value
fixes #550
2017-03-16 11:11:28 -04:00
Marty Schoch bbab4d39ee improve error checking when parsing numbers 2017-02-24 16:30:09 -05:00
Marty Schoch 23bf986632 allow for exact numeric matches with field:val syntax
previously, the only way to get numeric matching was with the
range operators :> :>= :< :<=

now, when we encounter field:val if the val can be parsed as a
number, then we do a disjunction search, which includes
searching for val as a text term and as an exact numeric value
2017-02-24 13:41:59 -05:00
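
The field:val behavior described above might be sketched as follows. The `clause` type and `clausesFor` function are hypothetical and only model the decision: always search for the value as a text term, and add an exact numeric clause when the value parses as a number. `strconv.ParseFloat` is used here as an assumed stand-in for whatever number parsing the query parser performs.

```go
package main

import (
	"fmt"
	"strconv"
)

// clause is a hypothetical description of one disjunct.
type clause struct {
	Kind  string // "term" or "numeric"
	Field string
	Text  string  // set for term clauses
	Num   float64 // set for numeric clauses
}

// clausesFor returns the disjuncts for a field:val expression: the text
// term always, plus an exact numeric match when val parses as a number.
func clausesFor(field, val string) []clause {
	out := []clause{{Kind: "term", Field: field, Text: val}}
	if n, err := strconv.ParseFloat(val, 64); err == nil {
		out = append(out, clause{Kind: "numeric", Field: field, Num: n})
	}
	return out
}

func main() {
	for _, val := range []string{"42", "-7.5", "forty"} {
		fmt.Printf("age:%s -> %+v\n", val, clausesFor("age", val))
	}
}
```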
Marty Schoch f6563ed9f5 switch from go tool yacc to goyacc as of Go 1.8
does not imply need for Go 1.8 to use, just for developers
to regenerate the query parser
2017-02-24 09:23:00 -05:00
Marty Schoch f391b991c2 improve query string compatibility
1) disjunction and conjunction queries now support a
"query string mode".  By default they do not operate
in this mode.  When in this mode, any disjunct/conjunct
which evaluates to a MatchNone searcher will be removed
from the disjunction/conjunction.  If the query ends
up with NO conjuncts/disjuncts, it will itself
return the MatchNone searcher.

2) boolean query also supports a query string mode.  when in
this mode, the Must, Should and MustNot searchers are all put
into query string mode.

3) rewriting of negation-only queries (like -foo) now takes into
account the rewriting rules above, and those are handled first.
this means that we rewrite correctly in case of +stopword -foo

4) the empty query string is now valid, and returns 0 hits.
previously this was considered a validation error.
2017-02-23 13:04:18 -05:00
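
Rule 1 of the commit above can be sketched with a small model. The `node` type and `combine` function are hypothetical and stand in for searcher construction: in query string mode, children that evaluate to MatchNone are dropped, and a compound left with no children itself becomes MatchNone.

```go
package main

import "fmt"

// node is a hypothetical, simplified stand-in for a searcher tree.
type node struct {
	Kind     string // "term", "match_none", "disjunction" or "conjunction"
	Term     string
	Children []node
}

// combine builds a disjunction or conjunction. Outside query string mode
// the children are kept untouched; inside it, MatchNone children are
// removed and an empty result collapses to MatchNone.
func combine(kind string, children []node, queryStringMode bool) node {
	if !queryStringMode {
		return node{Kind: kind, Children: children}
	}
	kept := make([]node, 0, len(children))
	for _, c := range children {
		if c.Kind != "match_none" {
			kept = append(kept, c)
		}
	}
	if len(kept) == 0 {
		return node{Kind: "match_none"}
	}
	return node{Kind: kind, Children: kept}
}

func main() {
	none := node{Kind: "match_none"}
	foo := node{Kind: "term", Term: "foo"}

	fmt.Printf("%+v\n", combine("conjunction", []node{none, foo}, false))
	fmt.Printf("%+v\n", combine("conjunction", []node{none, foo}, true))
	fmt.Printf("%+v\n", combine("disjunction", []node{none, none}, true))
}
```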
Marty Schoch 56a79528c3 update match_phrase query to handle multiple tokens in same pos
we now use a multiphrase query in all cases
internally it's optimized to be the same as regular phrase query
anyway, and we simply map all the tokens in the stream into
a multi-phrase query with the appropriate structure
2017-02-10 17:12:13 -05:00
Marty Schoch a5d1d7974c add query support for multi-phrase
when parsing json, when we encounter the key "terms", we first
try to parse as traditional phrase query, then if that fails,
we also try parsing it as multi-phrase
2017-02-10 16:46:38 -05:00
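
The fallback parsing described above can be sketched with encoding/json alone. The `parseTerms` function and its return shape are assumptions for illustration: the value under "terms" is first decoded as a flat list of strings (a traditional phrase), and only if that fails is it decoded as a list of lists (a multi-phrase).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// parseTerms returns one slice of alternatives per phrase position.
// A traditional phrase yields exactly one term per position.
func parseTerms(raw []byte) ([][]string, error) {
	var obj struct {
		Terms json.RawMessage `json:"terms"`
	}
	if err := json.Unmarshal(raw, &obj); err != nil {
		return nil, err
	}

	// First attempt: traditional phrase, e.g. ["quick", "fox"].
	var phrase []string
	if err := json.Unmarshal(obj.Terms, &phrase); err == nil {
		out := make([][]string, len(phrase))
		for i, t := range phrase {
			out[i] = []string{t}
		}
		return out, nil
	}

	// Second attempt: multi-phrase, e.g. [["quick", "fast"], ["fox"]].
	var multi [][]string
	if err := json.Unmarshal(obj.Terms, &multi); err != nil {
		return nil, err
	}
	return multi, nil
}

func main() {
	for _, in := range []string{
		`{"terms": ["quick", "fox"]}`,
		`{"terms": [["quick", "fast"], ["fox"]]}`,
		`{"terms": 5}`,
	} {
		terms, err := parseTerms([]byte(in))
		fmt.Println(terms, err)
	}
}
```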
Marty Schoch 4e38c49287 move phrase search logic into phrase searcher
the logic of how a phrase search works should be an internal
detail of the phrase searcher.  further, these changes will
allow proper scoring of phrase matches, which require access
to the underlying searcher objects, which were hidden in the
previous approach.
2017-02-10 12:05:01 -05:00
Marty Schoch b55c9043b9 improve performance of regular expression and wildcard queries
While researching an observed performance issue with wildcard
queries, it was observed that the LiteralPrefix() method on
the regexp.Regexp struct did not always behave as expected.

In particular, when the pattern starts with ^, AND involves
some backtracking, the LiteralPrefix() seems to always be the
empty string.

The side-effect of this is that we rely on having a helpful
prefix, to reduce the number of terms in the term dictionary
that need to be visited.

This change now makes the searcher enforce start/end on the term
directly, by using FindStringIndex() instead of Match().
Next, we also modified WildcardQuery and RegexpQuery to no
longer include the ^ and $ modifiers.

Documentation was also updated to instruct users that they should
not include the ^ and $ modifiers in their patterns.
2017-01-18 16:22:16 -05:00
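
A minimal sketch of the approach described above, using only the standard regexp package. The `termMatcher` type is hypothetical; it compiles the pattern without ^ and $, exposes `LiteralPrefix()` for narrowing the term dictionary scan, and enforces start/end itself by checking that the span returned by `FindStringIndex()` covers the whole term. One caveat of this sketch: `FindStringIndex()` reports the leftmost match with Go's leftmost-first preference, so for patterns with alternations that span is not necessarily the longest possible match.

```go
package main

import (
	"fmt"
	"regexp"
)

// termMatcher is a hypothetical wrapper around an unanchored pattern.
type termMatcher struct {
	re *regexp.Regexp
}

// newTermMatcher compiles the pattern as given, without adding ^ or $.
func newTermMatcher(pattern string) (*termMatcher, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	return &termMatcher{re: re}, nil
}

// prefix returns the literal string every match must begin with, which
// can be used to limit how many dictionary terms are visited.
func (m *termMatcher) prefix() string {
	p, _ := m.re.LiteralPrefix()
	return p
}

// matches reports whether the match found by FindStringIndex spans the
// entire term, which enforces start and end without anchors.
func (m *termMatcher) matches(term string) bool {
	loc := m.re.FindStringIndex(term)
	return loc != nil && loc[0] == 0 && loc[1] == len(term)
}

func main() {
	m, err := newTermMatcher(`mart.*`)
	if err != nil {
		fmt.Println("bad pattern:", err)
		return
	}
	fmt.Printf("literal prefix: %q\n", m.prefix())
	for _, term := range []string{"marty", "smart", "mar"} {
		fmt.Println(term, m.matches(term))
	}
}
```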
Steve Yen 89a1cefde1 API change: optional SearchRequest.IncludeLocations flag
This is a change in search result behavior in that location
information is no longer provided by default with search results.

Although this looks like a wide-ranging change, it's mostly a
mechanical replacement of the explain bool flag with a new
search.SearcherOptions struct, which holds both the Explain bool flag
and the IncludeTermVectors bool flag.
2017-01-05 21:11:22 -08:00
Marty Schoch c927e124dd Merge branch 'master' of https://github.com/slavikm/bleve into slavikm-master4 2016-11-28 14:03:35 -05:00
slavikm 75c8c0e2b1 Revert the nil protection which is not needed 2016-11-23 09:26:07 -08:00
slavikm 20b847f04e Added protection against nil Boost 2016-11-22 13:04:36 -08:00
slavikm a4c94e440e Added missing boost getters 2016-11-22 12:50:08 -08:00
Marty Schoch d372602f3c add support for parsing BoolFieldQuery from JSON
presence of the "bool" key triggers parsing as a BoolFieldQuery
fixes #498
2016-11-15 10:29:11 -05:00
slavikm 187d6013df Make sure getters follow the Go convention 2016-11-14 15:30:07 -08:00
slavikm 339ddbe0fa Added getters to boost and field query interfaces 2016-11-14 14:02:43 -08:00
Steve Yen 32e459f6b6 fix BleveQueryTime json marshaling with double-quoting
See also MB-21322 found by Mihir Kamdar.
2016-10-12 11:39:08 -07:00
Ben Campbell 11f18333fb Settle on default fuzziness of 1 (for now)
see https://groups.google.com/d/msg/bleve/vkVxnLMlXow/5qM1jL0ZEgAJ
2016-10-04 15:00:50 +13:00
Marty Schoch 2f48d7fb02 fix misspellings 2016-10-02 12:11:15 -04:00
Marty Schoch abeca559cd don't export unnecessary method 2016-10-02 11:50:58 -04:00
Marty Schoch 3a276153a3 actually rename packages to singular, not just directory name 2016-10-02 10:29:39 -04:00
Marty Schoch 2332455bd2 nicer formatting of license header 2016-10-02 10:13:14 -04:00
Marty Schoch 6bf9dd59ab BREAKING CHANGE - additional package renaming
i recently learned that package names should also prefer the
singular form, not the plural form
2016-10-01 17:20:59 -04:00
Steve Yen a9cb8779c3 more careful Close()'ing and cleanup of searchers
From diagnosing a recent issue where the termSearchersFinished stats
were incorrectly tracked, I ended up scouring the Close() / cleanup
codepaths.

This change takes more care in Close()'ing child searchers, especially
in error situations.  This can be important to allow underlying
kvstore's to release resources.
2016-09-30 16:07:01 -07:00
Marty Schoch c487f29a46 BREAKING CHANGE - rename numeric_util to numeric 2016-09-30 12:36:43 -04:00
Marty Schoch 35da361bfa BREAKING CHANGE - renamed packages to be shorter and not use _
this commit only addresses the analysis sub-package
2016-09-30 12:36:10 -04:00
Marty Schoch b863add129 address code review comments from @steveyen 2016-09-29 14:54:17 -04:00
Marty Schoch 073c4d0ebd fix issues identified by go vet 2016-09-29 14:54:17 -04:00
Marty Schoch 226efaebd8 remove query prefix from filenames, now in query package 2016-09-29 14:54:17 -04:00
Marty Schoch ee17941f7f switch DateRangeQuery to use time.Time instead of string
as we are a Go library, this is the much more natural way to
express such queries.

strings are still supported through json marshal
and unmarshal, as well as inside query string queries

as before we use the package level QueryDateTimeParser to
determine which date time parser to use for parsing

only when serializing out to json, we consult a new package
variable: QueryDateTimeFormat

this addresses the longstanding PR #255
2016-09-29 14:54:16 -04:00
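
The DateRangeQuery change above can be approximated as follows. The `dateRange` type, the JSON keys `start` and `end`, and the lowercase `queryDateTimeFormat` variable are assumptions made for this sketch; in particular, a single RFC 3339 layout is used for both parsing and formatting, whereas the commit describes a separate, configurable date time parser.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// queryDateTimeFormat is a hypothetical package variable consulted when
// serializing, mirroring the idea in the commit message.
var queryDateTimeFormat = time.RFC3339

// dateRange holds time.Time values in Go code and strings in JSON.
type dateRange struct {
	Start, End time.Time
}

// dateRangeJSON is the assumed wire form.
type dateRangeJSON struct {
	Start string `json:"start"`
	End   string `json:"end"`
}

// MarshalJSON formats both bounds with queryDateTimeFormat.
func (d dateRange) MarshalJSON() ([]byte, error) {
	return json.Marshal(dateRangeJSON{
		Start: d.Start.Format(queryDateTimeFormat),
		End:   d.End.Format(queryDateTimeFormat),
	})
}

// UnmarshalJSON parses both bounds from strings.
func (d *dateRange) UnmarshalJSON(raw []byte) error {
	var tmp dateRangeJSON
	if err := json.Unmarshal(raw, &tmp); err != nil {
		return err
	}
	start, err := time.Parse(queryDateTimeFormat, tmp.Start)
	if err != nil {
		return err
	}
	end, err := time.Parse(queryDateTimeFormat, tmp.End)
	if err != nil {
		return err
	}
	d.Start, d.End = start, end
	return nil
}

// roundTrip decodes a JSON date range and encodes it again.
func roundTrip(raw string) (string, error) {
	var d dateRange
	if err := json.Unmarshal([]byte(raw), &d); err != nil {
		return "", err
	}
	out, err := json.Marshal(d)
	return string(out), err
}

func main() {
	out, err := roundTrip(`{"start":"2016-09-01T00:00:00Z","end":"2016-09-29T14:54:16-04:00"}`)
	fmt.Println(out, err)
}
```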
Marty Schoch a265218f76 heavier refactor of Query interface to simplify
Boostable, Fieldable, Validatable broken out into separate
interfaces.  This allows them to be discoverable when
needed, but ignorable otherwise.  The top-level bleve package
only ever cares about Validatable and even that is optional.

Also, this change goes further to make the structure names
more reasonable, for cases where you're directly interacting
with the structures.
2016-09-29 14:54:16 -04:00
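
The interface split described above lends itself to discovery by type assertion. The sketch below is an assumption about the general shape, not the package's real definitions: the lowercase interface names, method sets, and the two query types are all hypothetical. Callers that care about validation or boosting check for the optional interface; everyone else can ignore it.

```go
package main

import (
	"errors"
	"fmt"
)

// query is the minimal hypothetical interface every query satisfies.
type query interface {
	name() string
}

// boostable and validatable are optional capabilities.
type boostable interface {
	setBoost(b float64)
	boost() float64
}

type validatable interface {
	validate() error
}

// termQuery supports both optional interfaces.
type termQuery struct {
	term     string
	boostVal float64
}

func (q *termQuery) name() string       { return "term" }
func (q *termQuery) setBoost(b float64) { q.boostVal = b }
func (q *termQuery) boost() float64     { return q.boostVal }
func (q *termQuery) validate() error {
	if q.term == "" {
		return errors.New("term query requires a term")
	}
	return nil
}

// matchAllQuery supports neither optional interface.
type matchAllQuery struct{}

func (matchAllQuery) name() string { return "match_all" }

// validateIfPossible validates only queries that opt in.
func validateIfPossible(q query) error {
	if v, ok := q.(validatable); ok {
		return v.validate()
	}
	return nil
}

// applyBoost sets the boost when supported and reports whether it did.
func applyBoost(q query, b float64) bool {
	if bq, ok := q.(boostable); ok {
		bq.setBoost(b)
		return true
	}
	return false
}

func main() {
	queries := []query{&termQuery{term: "bleve"}, &termQuery{}, matchAllQuery{}}
	for _, q := range queries {
		fmt.Println(q.name(), "boosted:", applyBoost(q, 2), "valid:", validateIfPossible(q) == nil)
	}
}
```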
Marty Schoch 9ec2ddd757 initial refactor of query into separate package 2016-09-29 14:54:16 -04:00