Commit Graph

1402 Commits

Author SHA1 Message Date
Joachim Schwarm 4ddc50e86d typo in documentation 2017-11-21 16:35:07 +01:00
Marty Schoch 6eea5b78da Merge pull request #631 from dvrkps/patch-1
travis: update go versions
2017-09-12 09:10:15 -04:00
Davor Kapsa f0503355da travis: update go versions 2017-09-12 10:56:33 +02:00
Marty Schoch c048833fcd added stringer method to phrase part
a failing test was producing unhelpful pointer addresses as
the only debug output.  this changes the output to print
the terms and locations as readable text

part of #629
2017-09-01 09:16:08 -04:00
Marty Schoch 930c06dfec rewrote logic to be more obvious
found during code walkthrough on 8/24/2017
2017-08-25 09:30:16 -07:00
Marty Schoch b7a51dae2a Merge pull request #625 from steveyen/master
remove unused Document.Number property
2017-08-24 17:08:20 -07:00
Steve Yen 546700b2de fix comment typo 2017-08-24 16:25:10 -07:00
Steve Yen 87115cbfb7 remove unused Document.Number property 2017-08-24 16:21:26 -07:00
Marty Schoch 82a101aedd Merge pull request #623 from mschoch/fix-race-518
fix data race in doc id search
2017-08-08 08:17:03 -04:00
Marty Schoch cea119449e fix data race in doc id search
the implementation of the doc id search requires that the list
of ids be sorted.  however, when doing a multisearch across
many indexes at once, the list of doc ids in the query is shared.
deeper in the implementation, the search of each shard attempts
to sort this list, resulting in a data race.

this is one example of a potentially larger problem, however
it has been decided to fix this data race, even though larger
issues of data ownership may remain unresolved.

this fix makes a copy of the list of doc ids, just prior to
sorting the list.  subsequently, all use of the list is on the
copy that was made, not the original.

fixes #518
2017-08-07 15:11:35 -04:00
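The copy-before-sort fix described in cea119449e can be sketched as follows. This is a minimal illustration, not the code from the commit: the helper name `sortedCopy` and the use of plain string ids are assumptions made for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedCopy (hypothetical helper) returns a sorted copy of ids and leaves
// the caller's slice untouched, so several searches that share the same
// slice never write to it concurrently.
func sortedCopy(ids []string) []string {
	out := make([]string, len(ids))
	copy(out, ids)
	sort.Strings(out)
	return out
}

func main() {
	shared := []string{"c", "a", "b"} // list shared by every shard's search
	local := sortedCopy(shared)       // each search sorts its own copy
	fmt.Println(shared, local)        // prints "[c a b] [a b c]"
}
```

Sorting in place would mutate memory that other goroutines may be reading, which is the race the commit message describes; the copy costs one allocation per search.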
Marty Schoch 174f8ed44a Merge pull request #615 from ethantkoenig/fix/camel_case
Fix token start/end/position values in camelCase tokenizer
2017-06-28 13:18:15 -04:00
Ethan Koenig 0433f05d9c Fix test 2017-06-22 18:56:28 -04:00
Ethan Koenig 8994ad2e00 Fix token start/end/position values in camelCase tokenizer 2017-06-22 17:42:39 -04:00
Marty Schoch 011b168f7b Merge pull request #612 from sreekanth-cb/extend_setter_dateTimeRange
Adding a new bucket setter method for dateTimeRange
2017-06-14 12:31:07 -04:00
Sreekanth Sivasankaran 71afa918fe Adding a new bucket setter method for dateTimeRange 2017-06-12 15:53:27 +05:30
Marty Schoch 48ac9862db Merge pull request #607 from mschoch/fix-query-string-numeric
fix issue with numeric range queries in query string
2017-06-06 16:57:00 -04:00
Marty Schoch 4c801f2f01 fix issue with numeric range queries in query string
previously the query string queries were modified to aid in
compatibility with other search systems.  this change:
f391b991c2
has a problem when combined with:
77101ae424
due to the introduction of MatchNoneSearchers being returned
in a case where previously they never would.

the fix for now is to simply return disjunction queries on 0
terms instead.  this ultimately also matches nothing, but avoids
triggering the logic which handles match none searchers in a
special way.
2017-06-06 16:03:05 -04:00
Marty Schoch 9234339472 Merge pull request #605 from mschoch/fix-nil-ptr
fix nil ptr panic on newly introduced text marshaler support
2017-06-05 10:58:29 -04:00
Marty Schoch 7274dddd2e fix nil ptr panic on newly introduced text marshaler support
We recently introduced support for indexing the content of
things implementing TextMarshaler.  Since often times interfaces
are implemented via pointer receivers, we added support to
introspect pointers (previously we just dereferenced them and
traversed into their underlying structs).  However, in doing so
we neglected to consider the case where the pointer does
implement the interface we care about, but happens to be nil.

fixes #603
2017-06-05 10:08:10 -04:00
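The nil pointer case described in 7274dddd2e can be reproduced with the standard library alone. The sketch below is an assumption about the shape of such a guard, not bleve's actual code: the `label` type and the `textOf` helper are hypothetical.

```go
package main

import (
	"encoding"
	"fmt"
	"reflect"
)

// label is a hypothetical type whose MarshalText has a pointer receiver and
// dereferences it, so calling it on a nil *label would panic.
type label struct{ name string }

func (l *label) MarshalText() ([]byte, error) {
	return []byte(l.name), nil
}

// textOf (hypothetical helper) returns the marshaled text and true only when
// v implements encoding.TextMarshaler and is not a nil pointer.
func textOf(v interface{}) (string, bool) {
	if v == nil {
		return "", false
	}
	// A nil *label still satisfies the interface, so check for nil first.
	if rv := reflect.ValueOf(v); rv.Kind() == reflect.Ptr && rv.IsNil() {
		return "", false
	}
	tm, ok := v.(encoding.TextMarshaler)
	if !ok {
		return "", false
	}
	b, err := tm.MarshalText()
	if err != nil {
		return "", false
	}
	return string(b), true
}

func main() {
	var missing *label
	_, ok := textOf(missing)
	fmt.Println(ok) // false: the nil pointer is skipped instead of panicking

	text, ok := textOf(&label{name: "bleve"})
	fmt.Println(text, ok)
}
```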
Marty Schoch 3351c3b046 Merge pull request #602 from mschoch/filter-numeric-range
filter numeric range terms against the term dictionary
2017-05-31 13:20:32 -04:00
Marty Schoch 77101ae424 filter numeric range terms against the term dictionary
previously, all numeric terms required to implement a numeric
range search were passed to the disjunction query (possibly
exceeding the disjunction clause limit)

now, after producing the list of terms, we filter them against
the terms which actually exist in the term dictionary.  the
theory is that this will often greatly reduce the number of terms
and therefore reduce the likelihood that you would run into the
disjunction term limit in practice.

because the term dictionary interface does not have a seek API
and we're reluctant to add that now, i chose to do a binary
search of the terms, which either finds the term, or not. then
subsequent binary searches can proceed from that position,
since both the list of terms and the term dictionary are sorted.
2017-05-31 13:15:13 -04:00
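The filtering approach described in 77101ae424 (repeated binary searches that each resume from the previous position) can be sketched as below. This is an illustration under stated assumptions: the dictionary is modeled as an in-memory sorted slice of strings, and `filterExisting` is a hypothetical name, whereas the real term dictionary is accessed through an interface.

```go
package main

import (
	"fmt"
	"sort"
)

// filterExisting (hypothetical helper) keeps the candidate terms that are
// present in dict. Both slices must be sorted in ascending order.
func filterExisting(candidates, dict []string) []string {
	var kept []string
	start := 0
	for _, term := range candidates {
		// Search only dict[start:]: everything before start sorts below
		// the previous candidate, and therefore below this one too.
		i := start + sort.SearchStrings(dict[start:], term)
		if i < len(dict) && dict[i] == term {
			kept = append(kept, term)
		}
		start = i
		if start >= len(dict) {
			break // remaining candidates sort after every dictionary term
		}
	}
	return kept
}

func main() {
	candidates := []string{"a", "c", "d", "x"}
	dict := []string{"b", "c", "d", "e"}
	fmt.Println(filterExisting(candidates, dict)) // prints "[c d]"
}
```

Only the surviving terms would then be passed to the disjunction, which is what reduces the chance of hitting the clause limit.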
Marty Schoch cd5b307cde Merge pull request #598 from abhinavdangeti/master
MB-24560: Add moss store|collection histograms to stats
2017-05-26 12:07:14 -04:00
abhinavdangeti 8ec88a6cb0 MB-24560: Add moss store|collection histograms to stats 2017-05-25 16:32:36 -07:00
Marty Schoch 96b1993795 Merge pull request #583 from nrwiersma/master
Allow pre parsing query strings
2017-05-25 14:14:48 -04:00
Marty Schoch 6fde5eb61d Merge pull request #592 from rsdoiel/master
Adding --sort-by to query option for bleve command
2017-05-25 11:50:02 -04:00
Marty Schoch 3885595cb8 Merge pull request #597 from caseymrm/patch-1
Add comments to Location struct
2017-05-25 11:48:42 -04:00
Casey Muller 68b07c9e09 Review feedback 2017-05-25 08:32:10 -07:00
Casey Muller 875e19ebd9 Add comments to Location struct
Closes #596
2017-05-25 08:23:39 -07:00
Marty Schoch 67bef4e679 Merge pull request #595 from sreekanth-cb/sortgeodistance_field_exports
Exposing lon/lat fields in SortGeoDistance struct
2017-05-22 07:33:35 -04:00
Sreekanth Sivasankaran e8374c400b Exposing lon/lat fields in SortGeoDistance struct 2017-05-22 11:51:37 +05:30
Marty Schoch 64c9c61a22 Merge pull request #593 from mschoch/add-text-marshaler
add support for mapping to recognize/use TextMarshaler interface
2017-05-19 09:54:09 -04:00
Marty Schoch b26b845bd6 Merge pull request #594 from mschoch/add-blevetype
add support for BleveType() alternative for type detection
2017-05-19 09:46:28 -04:00
Marty Schoch 0cbe211120 add support for BleveType() alternative for type detection
Many existing structs already have a Type field or method which
conflicts with the bleve Classifier interface.  To address this
without breaking existing applications, we introduce an
alternate BleveType() method which will be checked first.  The
interface describing this method is private, as it should never
need to be referenced outside this package.

fixes #283
2017-05-19 09:22:12 -04:00
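The lookup order described in 0cbe211120 (BleveType() checked before Type()) can be sketched with a type switch. The interface names, the `docType` helper, and the sample types below are assumptions for illustration and are not taken from the bleve source.

```go
package main

import "fmt"

// classifier mirrors the shape of a Type() string method.
type classifier interface{ Type() string }

// bleveClassifier is the alternate interface; kept unexported here because
// the commit message says it never needs to be referenced elsewhere.
type bleveClassifier interface{ BleveType() string }

// docType (hypothetical helper) prefers BleveType() and falls back to Type().
func docType(v interface{}) (string, bool) {
	switch d := v.(type) {
	case bleveClassifier: // cases are tried in order, so this wins
		return d.BleveType(), true
	case classifier:
		return d.Type(), true
	}
	return "", false
}

// event already uses Type() for an application-specific meaning.
type event struct{}

func (event) Type() string      { return "click" }
func (event) BleveType() string { return "event" }

// legacy only implements the original method.
type legacy struct{}

func (legacy) Type() string { return "legacy" }

func main() {
	t, _ := docType(event{})
	fmt.Println(t) // prints "event"
	t, _ = docType(legacy{})
	fmt.Println(t) // prints "legacy"
}
```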
Marty Schoch 9359a69ee5 add support for mapping to recognize/use TextMarshaler interface
Sometimes you have structs which contain data which isn't
exported, or for which the correct data to index isn't just the
contents of its exported fields.  In these cases your struct
can implement TextMarshaler to return a suitable text
representation.

Previously bleve did not recognize this interface or do anything
to use it.  Now, if the field containing such a struct is
explicitly mapped as "text" and if the struct (or pointer to it)
implements TextMarshaler, we index a text field with the
contents returned by MarshalText().

For backwards compatibility, dynamic mappings will never use
this feature, and will continue to traverse into the struct
and index the exported fields directly.

fixes #281
2017-05-18 15:08:33 -04:00
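From the application side, the situation described in 9359a69ee5 looks roughly like the sketch below: a struct with only unexported fields that supplies its own text form through the standard `encoding.TextMarshaler` interface. The `version` type is hypothetical, and the example deliberately stops short of showing any bleve mapping calls.

```go
package main

import (
	"encoding"
	"fmt"
	"strconv"
)

// version (hypothetical) has no exported fields, so traversing the struct
// would find nothing to index.
type version struct {
	major, minor int
}

// MarshalText returns the text representation that should be indexed.
func (v version) MarshalText() ([]byte, error) {
	return []byte(strconv.Itoa(v.major) + "." + strconv.Itoa(v.minor)), nil
}

// Compile-time checks: both the struct and a pointer to it satisfy the
// interface, matching the "struct (or pointer to it)" wording above.
var (
	_ encoding.TextMarshaler = version{}
	_ encoding.TextMarshaler = &version{}
)

func main() {
	text, err := version{major: 1, minor: 2}.MarshalText()
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(text)) // prints "1.2"
}
```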
R. S. Doiel 1d38f4791d added hyphen in query sort by option 2017-05-18 11:27:51 -07:00
R. S. Doiel c1db96946a Added -sortby, -b to query bleve command 2017-05-18 11:16:46 -07:00
Marty Schoch 5c9915c6f4 Merge pull request #589 from mschoch/fix-term-range-panic
fix panic in term range search
2017-05-05 23:18:56 -04:00
Marty Schoch 87f693fc57 fix panic in term range search
if min and max are the same term
and the term is in dictionary
and both min and max are set to exclusive
then we would panic attempting to access element -1 of a slice.

now, after trimming the slice, we recheck that the length is > 0
2017-05-05 23:13:04 -04:00
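The failure mode described in 87f693fc57 and the length recheck can be sketched as follows. The function name, its parameters, and the use of string terms are assumptions for illustration, not the code from the commit.

```go
package main

import "fmt"

// trimExclusive (hypothetical helper) drops the boundary terms from a sorted
// list when the corresponding bound is exclusive.
func trimExclusive(terms []string, min, max string, inclusiveMin, inclusiveMax bool) []string {
	if len(terms) > 0 && !inclusiveMin && terms[0] == min {
		terms = terms[1:]
	}
	// Recheck the length: the trim above may have emptied the slice, and
	// terms[len(terms)-1] would then index element -1.
	if len(terms) > 0 && !inclusiveMax && terms[len(terms)-1] == max {
		terms = terms[:len(terms)-1]
	}
	return terms
}

func main() {
	// min and max are the same term, it exists, and both bounds are exclusive.
	fmt.Println(len(trimExclusive([]string{"m"}, "m", "m", false, false))) // prints "0"
	fmt.Println(trimExclusive([]string{"a", "b", "c"}, "a", "c", false, false))
}
```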
Marty Schoch 15cb7a505a Merge pull request #564 from steveyen/master
optimize heap collector Final() for large counts
2017-04-29 19:48:52 -04:00
Marty Schoch 8435ce5054 Merge pull request #587 from mschoch/add-spanish
add a pure Go Spanish analyzer
2017-04-29 19:48:18 -04:00
Marty Schoch c0d5e75e70 Merge pull request #586 from mschoch/add-german
add a pure Go German analyzer
2017-04-29 19:46:08 -04:00
Marty Schoch b8de2df68d add a pure Go Spanish analyzer
this introduces a new light Spanish stemmer
and moves the other pure Go Spanish analysis components back
in from the blevex package

the libstemmer version of the stemmer will remain in blevex
2017-04-29 19:31:43 -04:00
Marty Schoch ce901a8870 add a pure Go German analyzer
this introduces a new German light stemmer
and moves the other pure Go German analysis components back
in from the blevex package

the libstemmer version of the stemmer will remain in blevex
2017-04-29 18:46:58 -04:00
Marty Schoch 11a45d6f9c Merge pull request #585 from mschoch/fix-geo-point-dist
fix geo point distance search
2017-04-27 18:26:34 -04:00
Marty Schoch 8df8d4e797 fix geo point distance search
there was a bug where if the circle described by the point
distance query crossed the poles, then we incorrectly built
a box around it.  this resulted in incorrect search results.
2017-04-27 17:28:07 -04:00
Marty Schoch 92c5f3e2e6 Merge pull request #584 from mschoch/more-collector-benchmarks
topn collector switch approach based on size+skip
2017-04-27 09:27:08 -04:00
Marty Schoch a4a34cc3b2 topn collector switch approach based on size+skip
we now use the slice store when size+skip <= 10
and use the heap store when size+skip > 10

here are the new perf numbers:

go test -run=xxx -bench=. -benchmem
BenchmarkTop10of0Scores-4            	 1000000	      1150 ns/op	    2304 B/op	      15 allocs/op
BenchmarkTop10of3Scores-4            	 1000000	      1417 ns/op	    2304 B/op	      18 allocs/op
BenchmarkTop10of10Scores-4           	 1000000	      2133 ns/op	    2312 B/op	      25 allocs/op
BenchmarkTop10of25Scores-4           	  500000	      3410 ns/op	    2464 B/op	      26 allocs/op
BenchmarkTop10of50Scores-4           	  300000	      5174 ns/op	    2464 B/op	      26 allocs/op
BenchmarkTop10of10000Scores-4        	    5000	    342955 ns/op	    2488 B/op	      26 allocs/op
BenchmarkTop100of0Scores-4           	  300000	      4796 ns/op	   18320 B/op	      15 allocs/op
BenchmarkTop100of3Scores-4           	  300000	      5160 ns/op	   18352 B/op	      19 allocs/op
BenchmarkTop100of10Scores-4          	  200000	      6354 ns/op	   18408 B/op	      26 allocs/op
BenchmarkTop100of25Scores-4          	  200000	     10023 ns/op	   18568 B/op	      41 allocs/op
BenchmarkTop100of50Scores-4          	  100000	     16821 ns/op	   18832 B/op	      66 allocs/op
BenchmarkTop100of10000Scores-4       	    3000	    508989 ns/op	   19760 B/op	     117 allocs/op
BenchmarkTop1000of10000Scores-4      	    1000	   1814198 ns/op	  184768 B/op	    1017 allocs/op
BenchmarkTop10000of100000Scores-4    	      50	  26623920 ns/op	 1939592 B/op	   19024 allocs/op
BenchmarkTop10of100000Scores-4       	     500	   3730204 ns/op	    2496 B/op	      26 allocs/op
BenchmarkTop100of100000Scores-4      	     300	   4057127 ns/op	   19912 B/op	     117 allocs/op
BenchmarkTop1000of100000Scores-4     	     200	   6390180 ns/op	  186200 B/op	    1017 allocs/op
BenchmarkTop10000of1000000Scores-4   	      20	  82785756 ns/op	 1963897 B/op	   19024 allocs/op
PASS
ok  	github.com/blevesearch/bleve/search/collector	31.537s

Previously with heap:

go test -run=xxx -bench=. -benchmem
BenchmarkTop10of0Scores-4            	 1000000	      1216 ns/op	    2288 B/op	      15 allocs/op
BenchmarkTop10of3Scores-4            	 1000000	      1593 ns/op	    2320 B/op	      19 allocs/op
BenchmarkTop10of10Scores-4           	  500000	      2734 ns/op	    2376 B/op	      26 allocs/op
BenchmarkTop10of25Scores-4           	  300000	      5077 ns/op	    2520 B/op	      27 allocs/op
BenchmarkTop10of50Scores-4           	  200000	      6875 ns/op	    2528 B/op	      27 allocs/op
BenchmarkTop10of10000Scores-4        	    3000	    351210 ns/op	    2552 B/op	      27 allocs/op
BenchmarkTop100of0Scores-4           	  300000	      4846 ns/op	   18304 B/op	      15 allocs/op
BenchmarkTop100of3Scores-4           	  300000	      5357 ns/op	   18336 B/op	      19 allocs/op
BenchmarkTop100of10Scores-4          	  200000	      6462 ns/op	   18392 B/op	      26 allocs/op
BenchmarkTop100of25Scores-4          	  200000	     10012 ns/op	   18552 B/op	      41 allocs/op
BenchmarkTop100of50Scores-4          	  100000	     17089 ns/op	   18816 B/op	      66 allocs/op
BenchmarkTop100of10000Scores-4       	    3000	    528193 ns/op	   19744 B/op	     117 allocs/op
BenchmarkTop1000of10000Scores-4      	    1000	   1859447 ns/op	  184752 B/op	    1017 allocs/op
BenchmarkTop10000of100000Scores-4    	      50	  28005664 ns/op	 1939576 B/op	   19024 allocs/op
BenchmarkTop10of100000Scores-4       	     300	   4120091 ns/op	    2560 B/op	      27 allocs/op
BenchmarkTop100of100000Scores-4      	     300	   4325227 ns/op	   19896 B/op	     117 allocs/op
BenchmarkTop1000of100000Scores-4     	     200	   6799804 ns/op	  186184 B/op	    1017 allocs/op
BenchmarkTop10000of1000000Scores-4   	      20	  88494230 ns/op	 1963881 B/op	   19024 allocs/op
PASS
ok  	github.com/blevesearch/bleve/search/collector	30.198s

Previously with slice:

go test -run=xxx -bench=. -benchmem
BenchmarkTop10of0Scores-4            	 1000000	      1202 ns/op	    2288 B/op	      15 allocs/op
BenchmarkTop10of3Scores-4            	 1000000	      1453 ns/op	    2288 B/op	      18 allocs/op
BenchmarkTop10of10Scores-4           	 1000000	      2162 ns/op	    2296 B/op	      25 allocs/op
BenchmarkTop10of25Scores-4           	  500000	      3420 ns/op	    2448 B/op	      26 allocs/op
BenchmarkTop10of50Scores-4           	  300000	      5336 ns/op	    2448 B/op	      26 allocs/op
BenchmarkTop10of10000Scores-4        	    5000	    356733 ns/op	    2472 B/op	      26 allocs/op
BenchmarkTop100of0Scores-4           	  300000	      4877 ns/op	   18304 B/op	      15 allocs/op
BenchmarkTop100of3Scores-4           	  300000	      5132 ns/op	   18304 B/op	      18 allocs/op
BenchmarkTop100of10Scores-4          	  200000	      5787 ns/op	   18312 B/op	      25 allocs/op
BenchmarkTop100of25Scores-4          	  200000	      8083 ns/op	   18344 B/op	      40 allocs/op
BenchmarkTop100of50Scores-4          	  100000	     14419 ns/op	   18400 B/op	      65 allocs/op
BenchmarkTop100of10000Scores-4       	    2000	    665401 ns/op	   18848 B/op	     116 allocs/op
BenchmarkTop1000of10000Scores-4      	     100	  15417063 ns/op	  176560 B/op	    1016 allocs/op
BenchmarkTop10000of100000Scores-4    	       1	1860011022 ns/op	 1857960 B/op	   19023 allocs/op
BenchmarkTop10of100000Scores-4       	     300	   4099276 ns/op	    2480 B/op	      26 allocs/op
BenchmarkTop100of100000Scores-4      	     300	   4533645 ns/op	   18984 B/op	     116 allocs/op
BenchmarkTop1000of100000Scores-4     	      50	  30519235 ns/op	  178008 B/op	    1016 allocs/op
BenchmarkTop10000of1000000Scores-4   	       1	3483977385 ns/op	 1882072 B/op	   19023 allocs/op
PASS
ok  	github.com/blevesearch/bleve/search/collector	31.666s

It appears that this successfully gets the best of both, in these particular benchmark sizes.
2017-04-27 08:57:13 -04:00
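The selection rule stated in a4a34cc3b2 (slice store when size+skip <= 10, heap store otherwise) reduces to a single comparison. The sketch below only models the decision; the constant and function names are hypothetical, and the stores themselves are not implemented.

```go
package main

import "fmt"

// sliceStoreLimit is the threshold quoted in the commit message; the name
// is an assumption made for this example.
const sliceStoreLimit = 10

// storeKind (hypothetical helper) reports which backing store would be
// chosen for a top-N collector with the given size and skip.
func storeKind(size, skip int) string {
	if size+skip > sliceStoreLimit {
		return "heap"
	}
	return "slice"
}

func main() {
	fmt.Println(storeKind(10, 0))  // prints "slice"
	fmt.Println(storeKind(10, 1))  // prints "heap"
	fmt.Println(storeKind(100, 0)) // prints "heap"
}
```

The benchmark numbers above suggest why the split sits where it does: the slice store is slightly faster for the Top10 cases, while it degrades badly for large result windows such as Top10000.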
Nicholas Wiersma b12c902457 Allow pre parsing query strings 2017-04-25 16:30:47 +02:00
Marty Schoch 17e21be71a Merge pull request #578 from seiflotfy/index-advanced
Add new IndexAdvanced function
2017-04-19 09:14:44 -04:00
Marty Schoch d855acf7fb Merge pull request #581 from mschoch/more-collector-benchmarks
add more collector benchmarks
2017-04-18 21:35:32 -04:00