We recently introduced support for indexing the content of
things implementing TextMarshaler. Since interfaces are often
implemented via pointer receivers, we added support to
introspect pointers (previously we just dereferenced them and
traversed into their underlying structs). However, in doing so
we neglected to consider the case where the pointer does
implement the interface we care about, but happens to be nil.
fixes #603
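
A minimal sketch of the nil-pointer case described above, not the actual bleve code. The stamp type and the textOf helper are hypothetical names used only for illustration; the point is that a nil pointer can satisfy encoding.TextMarshaler, so it has to be checked before MarshalText is called.

```go
package main

import (
	"encoding"
	"fmt"
	"reflect"
)

// stamp is a hypothetical type. Its pointer receiver implements
// encoding.TextMarshaler and dereferences the receiver, so calling
// MarshalText on a nil *stamp would panic.
type stamp struct{ label string }

func (s *stamp) MarshalText() ([]byte, error) {
	return []byte(s.label), nil
}

// textOf (hypothetical helper) returns the marshaled text for v.
// It reports false when v is a nil pointer or does not implement
// encoding.TextMarshaler.
func textOf(v interface{}) (string, bool) {
	rv := reflect.ValueOf(v)
	// The guard this fix is about: a nil pointer still satisfies the
	// interface, so check for nil before using it.
	if rv.Kind() == reflect.Ptr && rv.IsNil() {
		return "", false
	}
	tm, ok := v.(encoding.TextMarshaler)
	if !ok {
		return "", false
	}
	b, err := tm.MarshalText()
	if err != nil {
		return "", false
	}
	return string(b), true
}

func main() {
	var missing *stamp
	fmt.Println(textOf(missing))
	fmt.Println(textOf(&stamp{label: "hello"}))
}
```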
previously, all numeric terms required to implement a numeric
range search were passed to the disjunction query (possibly
exceeding the disjunction clause limit)
now, after producing the list of terms, we filter them against
the terms which actually exist in the term dictionary. the
theory is that this will often greatly reduce the number of terms
and therefore reduce the likelihood that you would run into the
disjunction term limit in practice.
because the term dictionary interface does not have a seek API
and we're reluctant to add that now, i chose to do a binary
search of the terms, which either finds the term, or not. then
subsequent binary searches can proceed from that position,
since both the list of terms and the term dictionary are sorted.
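
The filtering idea can be sketched roughly as follows. This is a simplification under stated assumptions: the term dictionary is modelled as a sorted in-memory []string rather than the real dictionary interface, and filterExisting is a hypothetical name. Each binary search starts from where the previous one ended, which is valid because both lists are sorted.

```go
package main

import (
	"fmt"
	"sort"
)

// filterExisting (hypothetical) returns the candidates that also
// appear in dict. Both slices must be sorted in ascending order.
func filterExisting(candidates, dict []string) []string {
	var kept []string
	from := 0 // position in dict where the next search may start
	for _, term := range candidates {
		rest := dict[from:]
		// Binary search for the first entry >= term in the remainder.
		i := sort.SearchStrings(rest, term)
		from += i
		if i < len(rest) && rest[i] == term {
			kept = append(kept, term)
		}
	}
	return kept
}

func main() {
	dict := []string{"b", "c", "d", "e"}
	candidates := []string{"a", "c", "d", "z"}
	fmt.Println(filterExisting(candidates, dict))
}
```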
Many existing structs already have a Type field or method which
conflicts with the bleve Classifier interface. To address this
without breaking existing applications, we introduce an
alternate BleveType() method which will be checked first. The
interface describing this method is private, as it should never
need to be referenced outside this package.
fixes #283
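
A rough sketch of the lookup order described above. The interface names, the typeOf helper and the invoice type are assumptions made for illustration; only the Type() and BleveType() method names come from this change.

```go
package main

import "fmt"

// classifier mirrors the existing Type() convention.
type classifier interface {
	Type() string
}

// bleveClassifier is the unexported alternate, checked first.
type bleveClassifier interface {
	BleveType() string
}

// typeOf (hypothetical) prefers BleveType(), then falls back to
// Type(), then to the supplied default.
func typeOf(v interface{}, fallback string) string {
	if bc, ok := v.(bleveClassifier); ok {
		return bc.BleveType()
	}
	if c, ok := v.(classifier); ok {
		return c.Type()
	}
	return fallback
}

// invoice is a hypothetical application struct whose Type() method
// already means something unrelated to indexing.
type invoice struct{}

func (invoice) Type() string      { return "pdf" }
func (invoice) BleveType() string { return "invoice" }

func main() {
	fmt.Println(typeOf(invoice{}, "_default"))
	fmt.Println(typeOf(struct{}{}, "_default"))
}
```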
Sometimes you have structs which contain data which isn't
exported, or for which the correct data to index isn't just the
contents of its exported fields. In these cases your struct
can implement TextMarshaler to return a suitable text
representation.
Previously bleve did not recognize this interface or do anything
to use it. Now, if the field containing such a struct is
explicitly mapped as "text" and if the struct (or pointer to it)
implements TextMarshaler, we index a text field with the
contents returned by MarshalText().
For backwards compatibility, dynamic mappings will never use
this feature, and will continue to traverse into the struct
and index the exported fields directly.
fixes #281
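
As an illustration only, the decision described above might look like the sketch below. The version type and the textForField helper are hypothetical, and the mappedAsText flag stands in for the explicit "text" field mapping. The helper tries the value first and then a pointer to a copy of it, since the method may be declared on the pointer receiver.

```go
package main

import (
	"encoding"
	"fmt"
	"reflect"
)

// version is a hypothetical struct with no exported fields.
type version struct{ major, minor int }

func (v *version) MarshalText() ([]byte, error) {
	return []byte(fmt.Sprintf("%d.%d", v.major, v.minor)), nil
}

// textForField (hypothetical) returns the text to index for v, or
// false when the caller should traverse the struct fields instead.
func textForField(v interface{}, mappedAsText bool) (string, bool) {
	rv := reflect.ValueOf(v)
	if !mappedAsText || !rv.IsValid() {
		return "", false // dynamic mappings never use MarshalText
	}
	if rv.Kind() == reflect.Ptr && rv.IsNil() {
		return "", false
	}
	tm, ok := v.(encoding.TextMarshaler)
	if !ok && rv.Kind() != reflect.Ptr {
		// Try a pointer to a copy of the value.
		p := reflect.New(rv.Type())
		p.Elem().Set(rv)
		tm, ok = p.Interface().(encoding.TextMarshaler)
	}
	if !ok {
		return "", false
	}
	b, err := tm.MarshalText()
	if err != nil {
		return "", false
	}
	return string(b), true
}

func main() {
	fmt.Println(textForField(version{1, 2}, true))
	fmt.Println(textForField(version{1, 2}, false))
}
```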
if min and max are the same term
and the term is in dictionary
and both min and max are set to exclusive
then we would panic attempting to access element -1 of a slice.
now, after trimming the slice, we recheck that the length is > 0
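
A simplified sketch of the guarded trimming, assuming the matching terms are held in a sorted []string. trimExclusive is a hypothetical name and not the actual searcher code.

```go
package main

import "fmt"

// trimExclusive (hypothetical) drops the boundary terms from a sorted
// slice when the corresponding bound is exclusive.
func trimExclusive(terms []string, min, max string, inclusiveMin, inclusiveMax bool) []string {
	if len(terms) > 0 && !inclusiveMin && terms[0] == min {
		terms = terms[1:]
	}
	// Recheck the length: the trim above may have emptied the slice,
	// in which case terms[len(terms)-1] would index element -1.
	if len(terms) > 0 && !inclusiveMax && terms[len(terms)-1] == max {
		terms = terms[:len(terms)-1]
	}
	return terms
}

func main() {
	// min and max are the same term, present in the dictionary,
	// and both bounds are exclusive.
	fmt.Println(len(trimExclusive([]string{"m"}, "m", "m", false, false)))
}
```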
this introduces a new light Spanish stemmer
and moves the other pure Go Spanish analysis components back
in from the blevex package
the libstemmer version of the stemmer will remain in blevex
this introduces a new German light stemmer
and moves the other pure Go German analysis components back
in from the blevex package
the libstemmer version of the stemmer will remain in blevex
there was a bug where if the circle described by the point
distance query crossed the poles, then we incorrectly built
a box around it. this resulted in incorrect search results.
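
For illustration, one common way to handle the pole case is sketched below. This is not necessarily how the fix was implemented. It assumes a spherical Earth with a mean radius of 6371 km, the box type and boundingBox function are hypothetical, and date-line wrapping is deliberately left out. When the circle reaches a pole, the box is clamped at that pole and spans every longitude.

```go
package main

import (
	"fmt"
	"math"
)

const earthRadiusKm = 6371.0 // assumed mean Earth radius

// box is a hypothetical bounding box in degrees.
type box struct{ minLat, maxLat, minLon, maxLon float64 }

// boundingBox (hypothetical) returns a box enclosing the circle of
// radius distKm around the point. Date-line wrapping is not handled.
func boundingBox(latDeg, lonDeg, distKm float64) box {
	rRad := distKm / earthRadiusKm // angular radius
	rDeg := rRad * 180 / math.Pi
	minLat, maxLat := latDeg-rDeg, latDeg+rDeg
	if minLat <= -90 || maxLat >= 90 {
		// The circle contains a pole: clamp the latitude and
		// cover all longitudes.
		return box{math.Max(minLat, -90), math.Min(maxLat, 90), -180, 180}
	}
	latRad := latDeg * math.Pi / 180
	// Clamp to 1 to guard against rounding just past the asin domain.
	ratio := math.Min(1, math.Sin(rRad)/math.Cos(latRad))
	dLon := math.Asin(ratio) * 180 / math.Pi
	return box{minLat, maxLat, lonDeg - dLon, lonDeg + dLon}
}

func main() {
	fmt.Printf("%+v\n", boundingBox(89, 10, 500))
	fmt.Printf("%+v\n", boundingBox(0, 0, 100))
}
```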
When performing a MultiSearch, we create child SearchRequests
from the original SearchRequest. In doing so we copy many fields.
But, copying of the SortOrder was incorrect, as this contains
state, and distinct SortOrder objects must be used. This change
introduces a Copy() method to the SearchSort interface, and
to the SortOrder types. MultiSearch now creates a new copy of
the SortOrder for each child request.
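
The shape of the change can be sketched as follows. The lowercase types here are hypothetical stand-ins rather than the real SearchSort and SortOrder definitions, and the values slice stands in for whatever per-request state a sort holds.

```go
package main

import "fmt"

// searchSort is a hypothetical stand-in for the sort interface.
type searchSort interface {
	Copy() searchSort
	Record(value string)
	Seen() int
}

// sortField holds configuration plus per-request state.
type sortField struct {
	field  string
	desc   bool
	values []string // state that must not be shared between requests
}

// Copy duplicates the configuration and starts with fresh state.
func (s *sortField) Copy() searchSort {
	return &sortField{field: s.field, desc: s.desc}
}

func (s *sortField) Record(v string) { s.values = append(s.values, v) }
func (s *sortField) Seen() int       { return len(s.values) }

type sortOrder []searchSort

// Copy returns a new sortOrder whose elements are independent copies.
func (so sortOrder) Copy() sortOrder {
	out := make(sortOrder, len(so))
	for i, s := range so {
		out[i] = s.Copy()
	}
	return out
}

func main() {
	parent := sortOrder{&sortField{field: "age"}}
	child := parent.Copy() // one copy per child request
	child[0].Record("42")
	fmt.Println(parent[0].Seen(), child[0].Seen())
}
```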
The term range query is not often used in full-text queries, but
can be useful when filtering on keyword indexed text terms in
the index.
The JSON syntax to do a TermRange query is the same as for
NumericRange, but the min/max values must be strings and not
float64.
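
As a hedged illustration of the JSON shape, the sketch below decodes a query body into a hypothetical termRange struct. The key names (min, max, inclusive_min, inclusive_max, field) are assumed to match the numeric range form and should be checked against the actual query definition.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// termRange is a hypothetical mirror of the query body. The JSON key
// names are assumed from the numeric range form.
type termRange struct {
	Min          *string `json:"min,omitempty"`
	Max          *string `json:"max,omitempty"`
	InclusiveMin *bool   `json:"inclusive_min,omitempty"`
	InclusiveMax *bool   `json:"inclusive_max,omitempty"`
	Field        string  `json:"field,omitempty"`
}

// parseTermRange decodes a query body. A numeric min or max is
// rejected because the fields are strings.
func parseTermRange(body string) (*termRange, error) {
	var q termRange
	if err := json.Unmarshal([]byte(body), &q); err != nil {
		return nil, err
	}
	return &q, nil
}

func main() {
	q, err := parseTermRange(`{"min": "apple", "max": "cherry", "field": "fruit"}`)
	fmt.Println(*q.Min, *q.Max, q.Field, err)

	_, err = parseTermRange(`{"min": 1.5, "field": "fruit"}`)
	fmt.Println(err != nil)
}
```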
- TermSearcher has an alternate constructor for when the term is []byte, which
can avoid copying in some cases. TermScorer is updated to accept a []byte term.
Also removed a few struct fields which were not being used.
- New MultiTermSearcher searches for documents containing any of a list of
terms. Current implementation simply uses DisjunctionSearcher.
- Several other searcher constructors now simply build a list of terms and
then delegate to the MultiTermSearcher
  - NewPrefixSearcher
  - NewRegexpSearcher
  - NewFuzzySearcher
  - NewNumericRangeSearcher
- NewGeoBoundingBoxSearcher and NewGeoPointDistanceSearcher make use of
the MultiTermSearcher internally, and follow the pattern of returning
an existing search.Searcher, as opposed to their own wrapping struct.
- Callback filter functions used in NewGeoBoundingBoxSearcher and
NewGeoPointDistanceSearcher have been extracted into separate functions
which makes the code much easier to read.
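
The delegation pattern in the list above can be sketched with a toy in-memory index. Everything here is hypothetical (postings, multiTermSearch, prefixSearch) and none of it is the real searcher API. A specialised search only builds a list of terms and then hands it to a single multi-term routine that returns the union of matching documents.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// postings is a toy index: term -> IDs of documents containing it.
type postings map[string][]int

// multiTermSearch returns the sorted IDs of documents containing any
// of the given terms (a disjunction).
func multiTermSearch(idx postings, terms []string) []int {
	seen := make(map[int]bool)
	var docs []int
	for _, t := range terms {
		for _, id := range idx[t] {
			if !seen[id] {
				seen[id] = true
				docs = append(docs, id)
			}
		}
	}
	sort.Ints(docs)
	return docs
}

// prefixSearch only collects the candidate terms, then delegates.
func prefixSearch(idx postings, prefix string) []int {
	var terms []string
	for t := range idx {
		if strings.HasPrefix(t, prefix) {
			terms = append(terms, t)
		}
	}
	return multiTermSearch(idx, terms)
}

func main() {
	idx := postings{"car": {1, 3}, "cart": {2, 3}, "dog": {4}}
	fmt.Println(prefixSearch(idx, "car"))
}
```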
- make geo queries accessible from top-level bleve
- update query parsing to support same geo point formats as
document parsing
- add constructor for easier sorting by geo distance in Go
- additional integration tests using alternate (GeoJSON) style points