bleve

Author	SHA1	Message	Date
Marty Schoch	2332455bd2	nicer formatting of license header	2016-10-02 10:13:14 -04:00
Marty Schoch	c487f29a46	BREAKING CHANGE - rename numeric_util to numeric	2016-09-30 12:36:43 -04:00
Marty Schoch	79cc39a67e	refactor mapping to interface and move into separate package the index mapping contains some relatively messy logic and the top-level bleve package only cares about a relatively small portion of this the motivation for this change is to codify the part that the top-level bleve package cares about into an interface then move all the details into its own package NOTE: the top-level bleve package still has hard dependency on the actual implementation (for now) because it must deserialize mappings from JSON and simply assumes it is this one instance. this is seen as OK for now, and this issue could be revisited in a future change. moving the logic into a separate package is seen as a simplification of top-level bleve, even though we still depend on the one particular implementation.	2016-09-29 14:53:18 -04:00
Marty Schoch	3fd2a64872	BREAKING CHANGE - removed DumpXXX() methods from bleve.Index The DumpXXX() methods were always documented as internal and unsupported. However, now they are being removed from the public top-level API. They are still available on the internal IndexReader, which can be accessed using the Advanced() method. The DocCount() and DumpXXX() methods on the internal index have moved to the internal index reader, since they logically operate on a snapshot of an index.	2016-09-13 12:40:01 -04:00
Marty Schoch	60750c1614	improved implementation to address perf regressions primary change is going back to sort values be []string and not []interface{}, this avoids allocations converting into the interface{} that sounds obvious, so why didn't we just do that first? because a common (default) sort is score, which is naturally a number, not a string (like terms). converting into the number was also expensive, and the common case. so, this solution also makes the change to NOT put the score into the sort value list. instead you see the dummy value "_score". this is just a placeholder, the actual sort impl knows that field of the sort is the score, and will sort using the actual score. also, several other aspects of the benchmark were cleaned up so that unnecessary allocations do not pollute the cpu profiles Here are the updated benchmarks: $ go test -run=xxx -bench=. -benchmem -cpuprofile=cpu.out BenchmarkTop10of100000Scores-4 3000 465809 ns/op 2548 B/op 33 allocs/op BenchmarkTop100of100000Scores-4 2000 626488 ns/op 21484 B/op 213 allocs/op BenchmarkTop10of1000000Scores-4 300 5107658 ns/op 2560 B/op 33 allocs/op BenchmarkTop100of1000000Scores-4 300 5275403 ns/op 21624 B/op 213 allocs/op PASS ok github.com/blevesearch/bleve/search/collectors 7.188s Prior to this PR, master reported: $ go test -run=xxx -bench=. -benchmem BenchmarkTop10of100000Scores-4 3000 453269 ns/op 360161 B/op 42 allocs/op BenchmarkTop100of100000Scores-4 2000 519131 ns/op 388275 B/op 219 allocs/op BenchmarkTop10of1000000Scores-4 200 7459004 ns/op 4628236 B/op 52 allocs/op BenchmarkTop100of1000000Scores-4 200 8064864 ns/op 4656596 B/op 232 allocs/op PASS ok github.com/blevesearch/bleve/search/collectors 7.385s So, we're pretty close on the smaller datasets, and we scale better on the larger datasets. We also show fewer allocations and bytes in all cases (some of this is artificial due to test cleanup).	2016-08-25 15:47:07 -04:00
Marty Schoch	ce0b299d6f	switch sort impl to use interface this improves perf in the case where we're not doing any sorting as we avoid allocating memory and converting scores into numeric terms	2016-08-24 19:02:22 -04:00
Marty Schoch	0322ecd441	adjust new sort functionality to also work with MultiSearch	2016-08-24 14:07:10 -04:00
Marty Schoch	2a703376ea	fix ineffectual assignments	2016-04-02 22:42:56 -04:00
Marty Schoch	194ee82c80	gofmt simplifications	2016-04-02 21:54:33 -04:00
Marty Schoch	d7292ed891	add support for gathering stats via map for easier consumption	2016-03-07 18:37:46 -05:00
Marty Schoch	0b2380d9bf	introduce ability for searches to timeout or be cancelled our implementation uses: golang.org/x/net/context New method SearchInContext() allows the user to run a search in the provided context. If that context is cancelled or exceeds its deadline Bleve will attempt to stop and return as soon as possible. This is a best effort attempt at this time and may not be in a timely manner. If the caller must return very near the timeout, the call should also be wrapped in a goroutine. The IndexAlias implementation is affected in a slightly more complex way. In order to return partial results when a timeout occurs on some indexes, the timeout is strictly enforced, and at the moment this does introduce an additional goroutine. The Bleve implementation honoring the context is currently very coarse-grained. Specifically we check the Done() channel between each DocumentMatch produced during the search. In the future we will propagate the context deeper into the internals of Bleve, and this will allow finer-grained timeout behavior.	2016-03-02 17:30:21 -05:00
Marty Schoch	496fd365fd	fix broken test expectations	2016-02-23 13:05:16 -05:00
Marty Schoch	214b67ad66	SearchResult now includes a Status section the Status section can report on the number of total/fail/success indexes when querying across multiple indexes through IndexAlias Further, searching an IndexAlias will now return partial results, the burden is on the caller to check the number of failed indexes and decide how to handle this situation.	2016-02-22 16:50:40 -05:00
Marty Schoch	84ec206fec	add some tests for index names in results	2015-12-08 14:38:46 -05:00
Marty Schoch	aa7658bbb0	give indexes names, make stats available via expvar by default	2015-12-06 14:01:03 -05:00
Marty Schoch	93e01a803e	fix issues identified by errcheck part of #169	2015-04-07 14:52:00 -04:00
Marty Schoch	522f9d5cc7	significant change to index format, support dictionary rows this introduces disk format v4 now the summary rows for a term are stored in their own "dictionary row" format, previously the same information was stored in special term frequency rows this now allows us to easily iterate all the terms for a field in sorted order (useful for many other fuzzy data structures) at the top-level of bleve you can now browse terms within a field using the following api on the Index interface: FieldDict(field string) (index.FieldDict, error) FieldDictRange(field string, startTerm []byte, endTerm []byte) (index.FieldDict, error) FieldDictPrefix(field string, termPrefix []byte) (index.FieldDict, error) fixes #127	2015-03-10 16:22:19 -04:00
Marty Schoch	af356acff0	changed batch behavior now created through the index itself mapping problems reported early at the time data is added to the batch, previously these were not reported until the batch was executed	2015-03-09 08:20:39 -04:00
Marty Schoch	0771f813ce	SearchResult Took field now returns full time in Search() likewise, MultiSearch used by aliases spanning multiple will also return full time in MultiSearch() closes #163	2015-02-19 12:11:40 +05:30
Marty Schoch	daeaa2c129	fix bad math in multi search, and return original request in res related to #164	2015-02-18 17:24:22 +05:30
Marty Schoch	68712cd142	support for accessing the underlying index/store impls now you can access the underlying index/store implementations using the Advanced() method. this is intended for advanced usage only, and can lead to problems if misused. also, there is a new method NewUsing(...) which allows callers of the top-level API to choose which underlying k/v store they want to use.	2014-12-27 13:23:46 -08:00
Marty Schoch	0355525d93	added another set of tests for IndexAlias with single Index	2014-11-25 14:56:42 -05:00
Marty Schoch	5fa93c8540	added index alias tests for multiple aliases	2014-11-25 14:25:56 -05:00
Marty Schoch	b3841fa335	added more tests for MultiSearch	2014-11-25 13:50:15 -05:00
Marty Schoch	3c886276ed	fix error message typo	2014-11-24 17:14:44 -05:00
Marty Schoch	69d69e4516	fix panic in MultiSearch when all indexes return error fixes #126	2014-11-24 17:12:16 -05:00

26 Commits