Commit Graph

4 Commits

Author SHA1 Message Date
Marty Schoch
c5465eccb1 change from const to var so apps can adjust value 2016-08-31 16:43:50 -04:00
Marty Schoch
60efecc8e9 cap preallocation by the collector to reasonable value
the collector has optimizations to avoid allocation and reslicing
during the common case of searching for top hits

however, in some cases users request a very large number of
search hits to be returned (attempting to get them all).  this
caused unnecessary allocation of ram.

to address this we introduce a new constant PreAllocSizeSkipCap.
it defaults to a value of 1000.  if your search+skip is less than
this constant, you get the optimized behavior.  if your
search+skip is greater than this, we cap the preallocations to
this lower value.  additional space is acquired on an as-needed
basis by growing the DocumentMatchPool and reslicing the
collector backing slice.

applications can change the value of PreAllocSizeSkipCap to suit
their own needs

fixes #408
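
a minimal sketch of the capping idea described above, assuming that
"search+skip" means the requested number of hits plus the skip offset.
only the name PreAllocSizeSkipCap comes from this commit; docMatch and
newBacking are hypothetical stand-ins, not bleve's actual code.

```go
package main

import "fmt"

// PreAllocSizeSkipCap is the name used in the commit message. It is a var
// here so an application could adjust it before searching.
var PreAllocSizeSkipCap = 1000

// docMatch is a hypothetical stand-in for a search hit.
type docMatch struct {
	ID    string
	Score float64
}

// newBacking (hypothetical) preallocates room for size+skip hits, but never
// more than PreAllocSizeSkipCap. Larger result sets grow on demand.
func newBacking(size, skip int) []docMatch {
	backingSize := size + skip
	if backingSize > PreAllocSizeSkipCap {
		backingSize = PreAllocSizeSkipCap
	}
	return make([]docMatch, 0, backingSize)
}

func main() {
	small := newBacking(10, 0)      // common case: full preallocation
	large := newBacking(1000000, 0) // "give me everything": capped
	fmt.Println(cap(small), cap(large)) // prints "10 1000"

	// Going past the cap still works; append reslices as needed.
	for i := 0; i < 2000; i++ {
		large = append(large, docMatch{Score: float64(i)})
	}
	fmt.Println(len(large)) // prints "2000"
}
```

the trade-off is that very large requests pay for a few reallocations
as the slice grows, instead of reserving all the memory up front.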
2016-08-31 15:25:17 -04:00
Marty Schoch
c9310b906d introduced new collector store impl based on slice
counter-intuitively the list impl was faster than the heap
the theory was the heap did more comparisons and swapping
so even though it benefited from no interface and some cache
locality, it was still slower

the idea was to just use a raw slice kept in order
this avoids the need for an interface, but can take the same comparison
approach as the list

it seems to work out:

 go test -run=xxx -bench=. -benchmem -cpuprofile=cpu.out
BenchmarkTop10of100000Scores-4     	    5000	    299959 ns/op	    2600 B/op	      36 allocs/op
BenchmarkTop100of100000Scores-4    	    2000	    601104 ns/op	   20720 B/op	     216 allocs/op
BenchmarkTop10of1000000Scores-4    	     500	   3450196 ns/op	    2616 B/op	      36 allocs/op
BenchmarkTop100of1000000Scores-4   	     500	   3874276 ns/op	   20856 B/op	     216 allocs/op
PASS
ok  	github.com/blevesearch/bleve/search/collectors	7.440s
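
a rough sketch of the "raw slice kept in order" idea, for illustration
only. the types hit and topNSlice are hypothetical, and the insertion
point is found with sort.Search (a binary search), which is a choice made
for this sketch; the commit itself may well scan linearly as the list did.

```go
package main

import (
	"fmt"
	"sort"
)

// hit is a hypothetical stand-in for a scored search result.
type hit struct {
	ID    string
	Score float64
}

// topNSlice keeps at most n hits, ordered from highest to lowest score.
type topNSlice struct {
	n    int
	hits []hit
}

func newTopNSlice(n int) *topNSlice {
	// One extra slot so an insert never reallocates before truncation.
	return &topNSlice{n: n, hits: make([]hit, 0, n+1)}
}

// add inserts h at its ordered position and drops the lowest hit if the
// slice would exceed n entries.
func (s *topNSlice) add(h hit) {
	// First index whose score is lower than the new hit's score.
	i := sort.Search(len(s.hits), func(j int) bool {
		return s.hits[j].Score < h.Score
	})
	if i == s.n {
		return // the hit would land outside the top n
	}
	s.hits = append(s.hits, hit{})
	copy(s.hits[i+1:], s.hits[i:]) // shift the tail right by one
	s.hits[i] = h
	if len(s.hits) > s.n {
		s.hits = s.hits[:s.n]
	}
}

func main() {
	top := newTopNSlice(3)
	scores := []float64{3, 1, 4, 1.5, 9, 2.6}
	for i, sc := range scores {
		top.add(hit{ID: string(rune('a' + i)), Score: sc})
	}
	ids := make([]string, 0, len(top.hits))
	for _, h := range top.hits {
		ids = append(ids, h.ID)
	}
	fmt.Println(ids) // prints "[e c a]"
}
```

because the store is a concrete type over a plain slice, no interface
call is needed in the hot loop.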
2016-08-26 11:52:49 -04:00
Marty Schoch
47c239ca7b refactored data structure out of collector
the TopNCollector now can either use a heap or a list

i did not code it to use an interface, because this is a very hot
loop during searching.  rather, it lets bleve developers easily
toggle between the two (or other ideas) by changing 2 lines

The list is faster in the benchmark, but causes more allocations.
The list is once again the default (for now).

To switch to the heap implementation, change:

store *collectStoreList
to
store *collectStoreHeap

and

newStoreList(...
to
newStoreHeap(...
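
to illustrate the two-line toggle described above, here is a simplified
sketch. the names collectStoreList, collectStoreHeap, newStoreList and
newStoreHeap come from this commit message; their bodies, the add/len
methods and the topNCollector type below are hypothetical placeholders.

```go
package main

import "fmt"

// collectStoreList is a placeholder body for the list-backed store.
type collectStoreList struct{ scores []float64 }

func newStoreList(capacity int) *collectStoreList {
	return &collectStoreList{scores: make([]float64, 0, capacity)}
}
func (s *collectStoreList) add(score float64) { s.scores = append(s.scores, score) }
func (s *collectStoreList) len() int          { return len(s.scores) }

// collectStoreHeap is a placeholder body for the heap-backed store. It
// exposes the same method set, so the collector code compiles either way.
type collectStoreHeap struct{ scores []float64 }

func newStoreHeap(capacity int) *collectStoreHeap {
	return &collectStoreHeap{scores: make([]float64, 0, capacity)}
}
func (s *collectStoreHeap) add(score float64) { s.scores = append(s.scores, score) }
func (s *collectStoreHeap) len() int          { return len(s.scores) }

// topNCollector (hypothetical) holds a concrete store type rather than an
// interface, so calls in the hot loop are direct.
type topNCollector struct {
	store *collectStoreList // line 1 to change: *collectStoreHeap
}

func newTopNCollector(size int) *topNCollector {
	return &topNCollector{
		store: newStoreList(size), // line 2 to change: newStoreHeap(size)
	}
}

func main() {
	c := newTopNCollector(10)
	c.store.add(1.5)
	fmt.Println(c.store.len()) // prints "1"
}
```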
2016-08-26 10:29:50 -04:00