Instead of cloning an input bitmap, the roaring.Or(x, y)
implementation fills a brand new result bitmap, which should allow
for more efficient packing and memory utilization.
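the idea can be sketched roughly as follows, using plain Go maps as stand-in containers (the real code operates on roaring bitmap containers from the RoaringBitmap library; `orInto` is an illustrative name, not the actual function):

```go
package main

import "fmt"

// orInto fills a brand-new result set from both inputs, rather than
// cloning x and then merging y into the clone. building the result
// fresh lets the container size and pack itself for the final union
// up front. (sketch only; the real implementation works on roaring
// bitmap containers, not Go maps.)
func orInto(x, y map[uint32]struct{}) map[uint32]struct{} {
	rv := make(map[uint32]struct{}, len(x)+len(y))
	for k := range x {
		rv[k] = struct{}{}
	}
	for k := range y {
		rv[k] = struct{}{}
	}
	return rv
}

func main() {
	x := map[uint32]struct{}{1: {}, 2: {}}
	y := map[uint32]struct{}{2: {}, 3: {}}
	// the union of {1,2} and {2,3} has 3 members
	fmt.Println(len(orInto(x, y)))
}
```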
a failing test was producing unhelpful pointer addresses as
the only debug output. this changes the output to print
the terms and locations as readable text.
part of #629
the implementation of the doc id search requires that the list
of ids be sorted. however, when doing a multisearch across
many indexes at once, the list of doc ids in the query is shared.
deeper in the implementation, the search of each shard attempts
to sort this list, resulting in a data race.
this is one example of a potentially larger problem; however,
it has been decided to fix this data race, even though larger
issues of data ownership may remain unresolved.
this fix makes a copy of the list of doc ids, just prior to
sorting the list. subsequently, all use of the list is on the
copy that was made, not the original.
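a minimal sketch of the copy-before-sort pattern, assuming a shared slice of doc ids (`searchShard` is an illustrative name, not the actual bleve function):

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// searchShard sorts a private copy of the shared doc id list, so that
// concurrent searches across many indexes never mutate the caller's
// slice. all subsequent work uses the copy, never the original.
func searchShard(ids []string) []string {
	sorted := make([]string, len(ids))
	copy(sorted, ids)
	sort.Strings(sorted)
	return sorted
}

func main() {
	shared := []string{"c", "a", "b"}
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ { // simulate a multisearch across shards
		wg.Add(1)
		go func() {
			defer wg.Done()
			searchShard(shared)
		}()
	}
	wg.Wait()
	fmt.Println(shared) // the shared list keeps its original order: [c a b]
}
```

without the copy, each shard's sort.Strings call would write to the same shared backing array concurrently, which the race detector flags.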
fixes #518
previously the query string queries were modified to aid in
compatibility with other search systems. this change:
f391b991c2
has a problem when combined with:
77101ae424
due to the introduction of MatchNoneSearchers being returned
in a case where previously they never would be.
the fix for now is to simply return disjunction queries on 0
terms instead. this ultimately also matches nothing, but avoids
triggering the logic which handles match none searchers in a
special way.
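a toy sketch of why a zero-clause disjunction is a safe stand-in for match-none (illustrative types only; bleve's real query and searcher interfaces differ):

```go
package main

import "fmt"

// searcher is a toy stand-in for a query searcher.
type searcher interface {
	matches(doc string) bool
}

// disjunction matches if any clause matches; with zero clauses it
// naturally matches nothing, without ever producing a special
// match-none searcher that downstream logic would have to handle.
type disjunction struct{ clauses []searcher }

func (d disjunction) matches(doc string) bool {
	for _, c := range d.clauses {
		if c.matches(doc) {
			return true
		}
	}
	return false // zero clauses: nothing ever matches
}

func main() {
	none := disjunction{} // the fix: a disjunction over 0 terms
	fmt.Println(none.matches("anything")) // false
}
```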
We recently introduced support for indexing the content of
things implementing TextMarshaler. Since interfaces are often
implemented via pointer receivers, we added support to
introspect pointers (previously we just dereferenced them and
traversed into their underlying structs). However, in doing so
we neglected to consider the case where the pointer does
implement the interface we care about, but happens to be nil.
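the hazard and the guard can be sketched like this (`marshalIfPossible` and the `name` type are illustrative, not the actual bleve code):

```go
package main

import (
	"encoding"
	"fmt"
	"reflect"
)

type name struct{ s string }

// pointer receiver: *name implements encoding.TextMarshaler, name does not.
func (n *name) MarshalText() ([]byte, error) { return []byte(n.s), nil }

// marshalIfPossible sketches the guard: a typed nil pointer still
// satisfies the interface, so calling MarshalText on it would panic
// inside the method. check for nil before invoking.
func marshalIfPossible(v interface{}) (string, bool) {
	tm, ok := v.(encoding.TextMarshaler)
	if !ok {
		return "", false
	}
	rv := reflect.ValueOf(v)
	if rv.Kind() == reflect.Ptr && rv.IsNil() {
		return "", false // nil pointer is skipped rather than dereferenced
	}
	b, err := tm.MarshalText()
	if err != nil {
		return "", false
	}
	return string(b), true
}

func main() {
	var p *name // nil, yet it type-asserts to TextMarshaler
	fmt.Println(marshalIfPossible(p))
	fmt.Println(marshalIfPossible(&name{"ok"}))
}
```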
fixes #603
previously, all numeric terms required to implement a numeric
range search were passed to the disjunction query (possibly
exceeding the disjunction clause limit)
now, after producing the list of terms, we filter them against
the terms which actually exist in the term dictionary. the
theory is that this will often greatly reduce the number of terms
and therefore reduce the likelihood that you would run into the
disjunction term limit in practice.
because the term dictionary interface does not have a seek API
and we're reluctant to add that now, i chose to do a binary
search of the terms, which either finds the term or not.
subsequent binary searches can then proceed from that position,
since both the list of terms and the term dictionary are sorted.
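the filtering step can be sketched as follows, assuming the term dictionary is a sorted slice of strings (`filterExisting` is an illustrative name; the real term dictionary is not a plain []string):

```go
package main

import (
	"fmt"
	"sort"
)

// filterExisting keeps only the candidate terms that actually appear
// in the sorted term dictionary. each binary search resumes from
// where the previous one left off, since both lists are sorted.
func filterExisting(candidates, dict []string) []string {
	var rv []string
	from := 0
	for _, term := range candidates {
		// search within dict[from:], then shift back to an absolute index
		i := from + sort.SearchStrings(dict[from:], term)
		if i < len(dict) && dict[i] == term {
			rv = append(rv, term)
		}
		from = i // later candidates sort >= term, so resume here
	}
	return rv
}

func main() {
	dict := []string{"apple", "banana", "cherry", "durian"}
	candidates := []string{"apple", "blueberry", "cherry", "zebra"}
	fmt.Println(filterExisting(candidates, dict)) // [apple cherry]
}
```

resuming each search from the previous position keeps the whole pass close to a single merge over the two sorted lists, rather than a full binary search of the dictionary per term.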