
28 Commits

Author SHA1 Message Date
Marty Schoch
f82638c117 refactor phrase search to be recursive
a more correct solution that will enable us to extend in two
important ways:

1) support slop
2) support multi-phrase
2017-02-03 16:05:21 -05:00
Marty Schoch
12a7257b5f remove duplicate code suggested by review from @steveyen 2017-01-31 15:12:06 -05:00
Marty Schoch
7fd8aeb50a refactor phrase search into separate methods
at the core, the Next() method moves another searcher forward
and checks each hit to see if it also satisfies the phrase
constraints.  the current implementation has 4 nested for loops.
these nested loops make it harder to read (indentation) and harder
to reason about (complexity).

this refactor does not remove any loops, it simply moves some
of the inner loops into separate methods so that one can
more easily reason about the parts separately.
2017-01-31 13:32:46 -05:00
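As a rough illustration of the idea in the commit above (inner loops moved into small helper functions), here is a minimal sketch of an exact phrase check. It is not the actual bleve code: the `positions` type and the `hasPosition`, `phraseStartsAt`, and `matchPhrase` helpers are hypothetical names invented for this example, and it assumes token positions for each term are already available for one field of one document.

```go
package main

import "fmt"

// positions is a hypothetical stand-in for term vector data: it maps each
// term to the token positions where that term occurs.
type positions map[string][]int

// hasPosition reports whether want appears in ps.
func hasPosition(ps []int, want int) bool {
	for _, p := range ps {
		if p == want {
			return true
		}
	}
	return false
}

// phraseStartsAt reports whether every term after the first one occurs at
// the consecutive positions following start.
func phraseStartsAt(pos positions, terms []string, start int) bool {
	for i, t := range terms[1:] {
		if !hasPosition(pos[t], start+i+1) {
			return false
		}
	}
	return true
}

// matchPhrase returns every start position at which the whole phrase occurs.
// The outer loop walks occurrences of the first term; the inner loops live
// in the helpers above so each part can be read on its own.
func matchPhrase(pos positions, terms []string) []int {
	if len(terms) == 0 {
		return nil
	}
	var starts []int
	for _, start := range pos[terms[0]] {
		if phraseStartsAt(pos, terms, start) {
			starts = append(starts, start)
		}
	}
	return starts
}

func main() {
	pos := positions{
		"quick": {1, 7},
		"brown": {2},
		"fox":   {3, 8},
	}
	fmt.Println(matchPhrase(pos, []string{"quick", "brown", "fox"}))
}
```

The loop count is unchanged by this split; only the nesting depth at any one place is reduced, which is the trade-off the commit message describes.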
Marty Schoch
b55c9043b9 improve performance of regular expression and wildcard queries
While researching an observed performance issue with wildcard
queries, it was observed that the LiteralPrefix() method on
the regexp.Regexp struct did not always behave as expected.

In particular, when the pattern starts with ^, AND involves
some backtracking, the LiteralPrefix() seems to always be the
empty string.

The side-effect of this is that we rely on having a helpful
prefix, to reduce the number of terms in the term dictionary
that need to be visited.

This change now makes the searcher enforce start/end on the term
directly, by using FindStringIndex() instead of Match().
Next, we also modified WildcardQuery and RegexpQuery to no
longer include the ^ and $ modifiers.

Documentation was also updated to instruct users that they should
not include the ^ and $ modifiers in their patterns.
2017-01-18 16:22:16 -05:00
Marty Schoch
8cd6040b63 Merge pull request #512 from steveyen/master
API change: optional SearchRequest.IncludeLocations flag
2017-01-09 14:19:17 -05:00
Steve Yen
89a1cefde1 API change: optional SearchRequest.IncludeLocations flag
This is a change in search result behavior in that location
information is no longer provided by default with search results.

Although this looks like a wide-ranging change, it's mostly a
mechanical replacement of the explain bool flag with a new
search.SearcherOptions struct, which holds both the Explain bool flag
and the IncludeTermVectors bool flag.
2017-01-05 21:11:22 -08:00
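A minimal sketch of the kind of mechanical replacement described in the commit above. Only the names `SearcherOptions`, `Explain`, `IncludeTermVectors`, and `IncludeLocations` come from the commit message; the `searchRequest` type, the `optionsFor` helper, and the mapping from `IncludeLocations` to `IncludeTermVectors` are assumptions made for illustration.

```go
package main

import "fmt"

// SearcherOptions groups the per-search flags that previously would have
// been passed as a lone explain bool. Field names follow the commit message.
type SearcherOptions struct {
	Explain            bool
	IncludeTermVectors bool
}

// searchRequest is a hypothetical, cut-down request type for this sketch.
type searchRequest struct {
	Explain          bool
	IncludeLocations bool
}

// optionsFor is a hypothetical helper. It assumes that location information
// requires term vectors, so the request flag is carried over directly.
func optionsFor(req searchRequest) SearcherOptions {
	return SearcherOptions{
		Explain:            req.Explain,
		IncludeTermVectors: req.IncludeLocations,
	}
}

func main() {
	// With the zero value, locations are not requested, which matches the
	// new default behavior described in the commit message.
	fmt.Printf("%+v\n", optionsFor(searchRequest{}))
	fmt.Printf("%+v\n", optionsFor(searchRequest{IncludeLocations: true}))
}
```

Passing a struct rather than a growing list of bool parameters means later flags can be added without touching every constructor signature again.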
Silvan Jegen
1a6a4c493b Check locations in the phrase searcher as well 2016-11-08 20:05:36 +01:00
Silvan Jegen
33e2432fc6 Initialize the return value as late as possible 2016-11-08 20:05:36 +01:00
Silvan Jegen
3dd363afaa Don't search the same term twice
We have searched for the first term in the phrase query already so we
can skip it. Before doing so we have to add the location of the first
term.
2016-11-08 20:05:04 +01:00
Silvan Jegen
d87b4f88bf Refactor phrase searching
Reduce nesting by using early continues.
2016-11-08 20:04:28 +01:00
Steve Yen
adc409e823 optimize NewRegexpSearcher to return its disjunction searcher
This minor optimization removes an unnecessary wrapper around the
disjunction searcher.
2016-10-27 13:16:41 -07:00
Steve Yen
58c3b5c9b8 revert optimization that trims search-disjunction child searchers
This commit reverts a previous optimization attempt 3f588cd4a that
tried to trim or shrink the array of child searchers in a
search-disjunction.

Although I am not sure why at the moment, that optimization
broke higher-level boolean queries, so I am reverting it to
restore that functionality.
2016-10-18 14:38:34 -07:00
Marty Schoch
5c7a2264a2 Merge pull request #473 from steveyen/reuse-incrementBytes-in-moss-kv-integration
reuse incrementBytes() in moss KV store integration
2016-10-13 14:03:46 +02:00
Marty Schoch
cee18d302e Merge pull request #475 from steveyen/phrase-searcher-simplifications-dry
some simplification / DRY for phrase searcher
2016-10-12 23:07:35 +02:00
Steve Yen
1a994ce2a7 end fuzzy searcher prefixTerm construction loop early 2016-10-12 09:51:36 -07:00
Steve Yen
6a38fa3719 go fmt 2016-10-12 09:39:43 -07:00
Steve Yen
8230a7195f some simplification / DRY for phrase searcher 2016-10-12 09:26:31 -07:00
Marty Schoch
bddc064069 Merge pull request #471 from steveyen/remove-extra-indirection-LevenshteinDistance
removed extra level of pointer indirection from LevenshteinDistance()'s params
2016-10-12 14:05:34 +02:00
Marty Schoch
483f06ef5b Merge pull request #467 from steveyen/optimize-disjunction-searcher-shrink-children
optimize disjunction searcher to trim child searchers array earlier
2016-10-12 14:00:19 +02:00
Marty Schoch
b76cbc805e Merge pull request #465 from steveyen/cleanup-when-PrefixSearcher-error
close resources when we encounter an error on PrefixSearcher initialization
2016-10-12 13:39:28 +02:00
Steve Yen
b6c97ddbfe removed extra ptr indirection from LevenshteinDistance 2016-10-11 08:49:10 -07:00
Steve Yen
3f588cd4ae optimize disjunction searcher to trim child searchers array earlier
Disjunction searchers are used heavily by higher-level searchers, like
prefix searchers.  In that case, a disjunction searcher might have
many thousands of child searchers.

This commit adds an optimization to close each child term searcher as
soon as a child searcher is finished and remove it from the
disjunction searcher's children.
2016-10-10 22:47:11 -07:00
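The trimming idea in the commit above might look roughly like the sketch below. The `child` interface, `fakeChild` type, and `trimFinished` function are hypothetical names for this example and do not reflect the actual searcher types. Note that, per the log, this optimization was reverted in 58c3b5c9b8 because it broke higher-level boolean queries, so treat this only as an illustration of the technique.

```go
package main

import "fmt"

// child is a hypothetical, minimal view of a child searcher.
type child interface {
	Done() bool
	Close() error
}

// fakeChild is a test double that records whether it was closed.
type fakeChild struct {
	name   string
	done   bool
	closed bool
}

func (c *fakeChild) Done() bool   { return c.done }
func (c *fakeChild) Close() error { c.closed = true; return nil }

// trimFinished closes every finished child and returns the remaining ones.
// It filters in place, reusing the backing array of the input slice, and
// reports the first Close error it encounters.
func trimFinished(children []child) ([]child, error) {
	kept := children[:0]
	var firstErr error
	for _, c := range children {
		if c.Done() {
			if err := c.Close(); err != nil && firstErr == nil {
				firstErr = err
			}
			continue
		}
		kept = append(kept, c)
	}
	return kept, firstErr
}

func main() {
	a := &fakeChild{name: "a", done: true}
	b := &fakeChild{name: "b"}
	kept, err := trimFinished([]child{a, b})
	fmt.Println(len(kept), err, a.closed, b.closed)
}
```

Because the slice is filtered in place, any other code that holds the original slice or relies on child indexes would see changed contents, which is one way an optimization like this can interfere with callers.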
Steve Yen
535b746b41 close resources when error on PrefixSearcher initialization 2016-10-10 17:29:59 -07:00
Steve Yen
2a022830f0 check FieldDictPrefix err result in prefix searcher 2016-10-10 15:35:54 -07:00
Marty Schoch
8e784c362b another golint suggestion 2016-10-02 11:54:04 -04:00
Marty Schoch
3a276153a3 actually rename packages to singular, not just directory name 2016-10-02 10:29:39 -04:00
Marty Schoch
2332455bd2 nicer formatting of license header 2016-10-02 10:13:14 -04:00
Marty Schoch
6bf9dd59ab BREAKING CHANGE - additional package renaming
i recently learned that package names should also prefer the
singular form, not the plural form
2016-10-01 17:20:59 -04:00