
71 Commits

Author SHA1 Message Date
Marty Schoch 9ec2ddd757 initial refactor of query into separate package 2016-09-29 14:54:16 -04:00
Marty Schoch 79cc39a67e refactor mapping to interface and move into separate package
the index mapping contains some relatively messy logic
and the top-level bleve package only cares about a relatively
small portion of this
the motivation for this change is to codify the part that the
top-level bleve package cares about into an interface
then move all the details into its own package

NOTE: the top-level bleve package still has hard dependency on
the actual implementation (for now) because it must deserialize
mappings from JSON and simply assumes it is this one instance.
this is seen as OK for now, and this issue could be revisited
in a future change.  moving the logic into a separate package
is seen as a simplification of top-level bleve, even though
we still depend on the one particular implementation.
2016-09-29 14:53:18 -04:00
Marty Schoch fb0f4bbecd BREAKING CHANGE - new method to create memory only index
Previously bleve allowed you to create a memory-only index by
simply passing "" as the path argument to the New() method.

This was not clear when reading the code, and led to some
problematic error cases as well.

Now, to create a memory-only index one should use the
NewMemOnly() method.  Passing "" as the path argument
to the New() method will now return os.ErrInvalid.

Advanced users calling NewUsing() can create disk-based or
memory-only indexes, but the change here is that passing ""
as the path argument no longer defaults you into getting
a memory-only index.  Instead, the KV store is selected
manually, just as it is for the disk-based solutions.

Here is an example use of the NewUsing() method to create
a memory-only index:

NewUsing("", indexMapping, Config.DefaultIndexType,
         Config.DefaultMemKVStore, nil)

Config.DefaultMemKVStore is just a new default value
added to the configuration, it currently points to
gtreap.Name (which could have been used directly
instead for more control)

closes #427
2016-09-27 14:11:40 -04:00
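The behavior change described in fb0f4bbecd can be sketched with standard-library Go only. This is a minimal illustration, not bleve's code: the `index` type and the `newIndex` and `newMemOnlyIndex` functions are hypothetical stand-ins for New() and NewMemOnly(), and only the os.ErrInvalid return for an empty path is taken from the commit message.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// index is a hypothetical stand-in for an index handle; it is not bleve's type.
type index struct {
	path    string
	memOnly bool
}

// newIndex mimics the described behavior of New(): an empty path is
// rejected with os.ErrInvalid instead of silently meaning "memory only".
func newIndex(path string) (*index, error) {
	if path == "" {
		return nil, os.ErrInvalid
	}
	return &index{path: path}, nil
}

// newMemOnlyIndex mimics the explicit memory-only constructor.
func newMemOnlyIndex() *index {
	return &index{memOnly: true}
}

func main() {
	_, err := newIndex("")
	fmt.Println("empty path rejected:", errors.Is(err, os.ErrInvalid))
	fmt.Println("memory only:", newMemOnlyIndex().memOnly)
}
```

Making the memory-only case a separate constructor means the intent is visible at the call site, which is the readability problem the commit message describes.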
Marty Schoch 04fd62dec3 further tweaks, now all bleve tests pass 2016-09-11 20:29:15 -04:00
Marty Schoch 60efecc8e9 cap preallocation by the collector to reasonable value
the collector has optimizations to avoid allocation and reslicing
during the common case of searching for top hits

however, in some cases users request a very large number of
search hits to be returned (attempting to get them all).  this
caused unnecessary allocation of RAM.

to address this we introduce a new constant PreAllocSizeSkipCap
it defaults to the value 1000.  if your search+skip is less than
this constant, you get the optimized behavior.  if your
search+skip is greater than this, we cap the preallocations to
this lower value.  additional space is acquired on an as needed
basis by growing the DocumentMatchPool and reslicing the
collector backing slice

applications can change the value of PreAllocSizeSkipCap to suit
their own needs

fixes #408
2016-08-31 15:25:17 -04:00
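The capping rule in 60efecc8e9 can be illustrated roughly as follows. The name PreAllocSizeSkipCap and its default of 1000 come from the commit message; the `backingSize` helper and the use of a plain int slice are assumptions made for this sketch, not the collector's real code.

```go
package main

import "fmt"

// PreAllocSizeSkipCap mirrors the setting named in the commit message.
// It is a variable here so that an application could change it.
var PreAllocSizeSkipCap = 1000

// backingSize (hypothetical helper) returns how many slots to preallocate
// for a request of size+skip hits, capped at PreAllocSizeSkipCap.
func backingSize(size, skip int) int {
	n := size + skip
	if n > PreAllocSizeSkipCap {
		n = PreAllocSizeSkipCap
	}
	return n
}

func main() {
	// Common case: the whole request fits in the preallocation.
	small := make([]int, 0, backingSize(10, 0))
	// Huge request: only the capped amount is allocated up front.
	large := make([]int, 0, backingSize(1000000, 0))
	fmt.Println(cap(small), cap(large))

	// Space beyond the cap is acquired on demand as hits arrive.
	for i := 0; i < 1500; i++ {
		large = append(large, i)
	}
	fmt.Println(len(large))
}
```

The trade-off is that very large requests pay for slice growth during the search instead of a single large allocation before it starts.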
Marty Schoch 27f5c6ec92 expose simple string slice based sorting to top-level bleve
this change means simple sort requirements no longer require
importing the search package (high-level API goal)

also the sort test at the top-level was changed to use this form
2016-08-17 14:49:06 -07:00
Marty Schoch 750e0ac16c change sort field impl to use indexed values not stored values 2016-08-17 09:20:44 -07:00
Marty Schoch 0bb69a9a1c Merge branch 'master' of https://github.com/dtylman/bleve into sort-by-field-try2 2016-08-12 14:23:55 -04:00
Danny Tylman b585c5786b removing mock data generation packages from unit-tests
fixing wrong sort order on certain fields
2016-08-11 11:35:08 +03:00
Danny Tylman 5164e70f6e Adding sort to SearchRequest. 2016-08-09 16:18:53 +03:00
Marty Schoch aa3ae3d39c enable read_only mode for boltdb indexes
fixes #405
2016-08-06 10:47:34 -04:00
Marty Schoch 9089de251f remove byte_array_conveters
fixes #392
fixes #100
2016-07-01 10:21:41 -04:00
Marty Schoch b8a2fbb887 fix data race in bleve batch reuse
Currently a bleve batch is built by a user goroutine
Then read by a bleve goroutine
This is still safe when used correctly
However, Reset() will modify the map, which is now a data race

This fix is to simply make batch.Reset() alloc new maps.
This provides a data-access pattern that can be used safely.
Also, this thread argues that creating a new map may be faster
than trying to reuse an existing one:

https://groups.google.com/d/msg/golang-nuts/UvUm3LA1u8g/jGv_FobNpN0J

Separate but related, I have opted to remove the "unsafe batch"
checking that we did.  This was always limited anyway, and now
users of Go 1.6 are just as likely to get a panic from the
runtime for concurrent map access anyway.  So, the price paid
by us (additional mutex) is not worth it.

fixes #360 and #260
2016-04-08 15:32:13 -04:00
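The idea behind b8a2fbb887, having Reset() allocate new maps instead of clearing the existing ones, can be shown with a small sketch. The `batch` type below is hypothetical and much simpler than bleve's Batch; it only demonstrates that a reader still holding the old map is unaffected by later writes.

```go
package main

import "fmt"

// batch is a hypothetical, simplified stand-in for a bleve batch.
type batch struct {
	ops map[string]string
}

func newBatch() *batch {
	return &batch{ops: make(map[string]string)}
}

// Set records a document under the given id.
func (b *batch) Set(id, doc string) {
	b.ops[id] = doc
}

// Reset allocates a fresh map rather than deleting keys in place, so any
// code that captured the old map never sees it mutated.
func (b *batch) Reset() {
	b.ops = make(map[string]string)
}

func main() {
	b := newBatch()
	b.Set("a", "first")

	inFlight := b.ops // simulates the indexer still reading the old map

	b.Reset()
	b.Set("b", "second")

	fmt.Println(len(inFlight), len(b.ops))
}
```

Note that this only makes the map contents safe to read after hand-off. The batch itself is still not meant to be shared between goroutines without synchronization.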
Marty Schoch e00577f265 change registry cache implementation to allow concurrent use
previously we just used a Go builtin map
this was not safe for concurrent read/write and upon upgrading
to Go 1.6 we were notified of the problem

fixes #349
2016-03-07 16:05:34 -05:00
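The commit message for e00577f265 does not say how the registry cache was made safe, so the following is only one common approach, offered as an assumption: guarding the map with a sync.RWMutex. The `cache` type and its methods are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// cache is a hypothetical registry cache guarded by a read/write mutex.
type cache struct {
	mu    sync.RWMutex
	items map[string]int
}

func newCache() *cache {
	return &cache{items: make(map[string]int)}
}

// get takes the read lock, so many readers can proceed at once.
func (c *cache) get(name string) (int, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	v, ok := c.items[name]
	return v, ok
}

// put takes the write lock, excluding readers and other writers.
func (c *cache) put(name string, v int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.items[name] = v
}

func main() {
	c := newCache()
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			c.put(fmt.Sprintf("analyzer-%d", n), n)
			c.get("analyzer-0")
		}(i)
	}
	wg.Wait()
	// Safe to read directly: all goroutines have finished.
	fmt.Println(len(c.items))
}
```

Without the mutex, the same program performs unsynchronized concurrent map reads and writes, which is the class of problem the Go 1.6 runtime began reporting.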
Marty Schoch 0b2380d9bf introduce ability for searches to timeout or be cancelled
our implementation uses: golang.org/x/net/context

New method SearchInContext() allows the user to run a search
in the provided context.  If that context is cancelled or
exceeds its deadline Bleve will attempt to stop and return
as soon as possible.  This is a *best effort* attempt at this
time and may *not* be in a timely manner.  If the caller must
return very near the timeout, the call should also be wrapped
in a goroutine.

The IndexAlias implementation is affected in a slightly more
complex way.  In order to return partial results when a timeout
occurs on some indexes, the timeout is strictly enforced, and
at the moment this does introduce an additional goroutine.

The Bleve implementation honoring the context is currently
very coarse-grained.  Specifically we check the Done() channel
between each DocumentMatch produced during the search.  In the
future we will propagate the context deeper into the internals
of Bleve, and this will allow finer-grained timeout behavior.
2016-03-02 17:30:21 -05:00
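The per-match cancellation check described in 0b2380d9bf looks roughly like the sketch below. It uses the standard library `context` package (available since Go 1.7) rather than golang.org/x/net/context, and the `collect` and `cancelledContext` functions are hypothetical illustrations, not bleve's SearchInContext().

```go
package main

import (
	"context"
	"fmt"
)

// collect walks a hypothetical list of matches and checks ctx.Done()
// between each one, returning whatever was gathered so far on cancel.
func collect(ctx context.Context, matches []string) ([]string, error) {
	var out []string
	for _, m := range matches {
		select {
		case <-ctx.Done():
			return out, ctx.Err()
		default:
		}
		out = append(out, m)
	}
	return out, nil
}

// cancelledContext returns a context that is already cancelled.
func cancelledContext() context.Context {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	return ctx
}

func main() {
	hits, err := collect(context.Background(), []string{"a", "b"})
	fmt.Println(len(hits), err)

	hits, err = collect(cancelledContext(), []string{"a", "b"})
	fmt.Println(len(hits), err)
}
```

Because the check happens only between matches, a single slow step is not interrupted. That is the best-effort limitation the commit message calls out.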
Marty Schoch e523bf757e test slow timer with different way to avoid windows 15ms timer 2016-02-09 15:48:08 -05:00
Marty Schoch 9a1e6e1905 fix some test failures on windows 2016-02-09 13:33:11 -05:00
Marty Schoch 2479ddef2e fixed errcheck issues 2016-01-13 17:10:13 -05:00
slavikm 680be52f87 Implemented boolean field support 2016-01-11 17:18:03 -08:00
Marty Schoch 8efbd556a3 fix indexing bug with data coming from arrays
fixes #295
2015-12-21 14:59:32 -05:00
Steve Yen 2d4cd7a696 go fmt index_text.go 2015-11-23 09:28:09 -08:00
Marty Schoch 97735ac2b6 set github issue number in testcase name 2015-11-23 08:41:34 -05:00
Mark Mindenhall 17d8391b2f Fixes datetime mapping from JSON, using DateTimeFieldMapping 2015-11-20 19:15:35 -07:00
Patrick Mezard 498e4a0de7 simplify FieldMapping.analyzerForField()
I stumbled onto that while trying to understand how analyzers are
resolved. The new code looks simpler to me and removes useless calls to
DocumentMapping.defaultAnalyzerName() when an analyzer is set at
FieldMapping level.

The slight change to TestStoredFieldPreserved avoids a stacktrace when
the test fails.
2015-10-02 15:45:43 +02:00
Patrick Mezard 7db27aeba1 implement document static mappings
DocumentMapping.Dynamic was ignored by everything but the marshalling
code.

Issue #235
2015-09-29 11:32:36 +02:00
Marty Schoch cddf90c0ee don't allow operations on empty doc id
fixes #239
2015-09-28 17:00:08 -04:00
Marty Schoch f81b2be334 major refactor of bleve configuration
see #221 for full details
2015-09-16 17:10:59 -04:00
Marty Schoch 3682c25467 update to correctly work with composite fields
also updated search results to return array positions
2015-07-31 11:16:11 -04:00
Marty Schoch 2af47cea75 fix query string query syntax when term starts with a number
fixes #207
2015-05-21 15:43:13 -04:00
Marty Schoch 5b8c9f2d73 added unit test for bug #207 2015-05-21 07:49:41 -04:00
Marty Schoch 8f70def63b properly use the stored array positions when loading a document
fixes #205
2015-05-15 15:47:54 -04:00
Marty Schoch 328bc73ed0 clarify Batch is not threadsafe in docs
in some limited cases we can detect unsafe usage
in these cases, do not trip over ourselves and panic
instead return a strongly typed error upside_down.UnsafeBatchUseDetected
also, introduced Batch.Reset() to allow batch reuse
this is currently still experimental
closes #195
2015-05-15 15:04:52 -04:00
Marty Schoch 7b871fde6a add test comparing search that matches everything with doc count 2015-05-09 14:51:07 -04:00
Marty Schoch 6f28f3e5bd check error returned in test to make errcheck happy 2015-05-08 08:40:46 -04:00
Marty Schoch 57cd67fa88 fix data race on index metadata (docCount)
closes #198
2015-05-08 08:07:20 -04:00
Marty Schoch 056d74901e fix test to actually work reliably 2015-04-08 11:17:34 -04:00
Marty Schoch 8581e73cef added String method for Batch
also changed Batch methods to pointer receiver
closes #180
2015-04-08 10:41:42 -04:00
Marty Schoch 539aeb8dc7 fix errors identified by errcheck
part of #169
2015-04-07 18:05:41 -04:00
Marty Schoch 56c4a09de1 fix issues identified by errcheck
part of #169
2015-04-07 15:39:56 -04:00
Marty Schoch 93e01a803e fix issues identified by errcheck
part of #169
2015-04-07 14:52:00 -04:00
Marty Schoch 52712b9537 add missing index close causing tests to sometimes fail 2015-04-03 16:41:11 -04:00
Sathyanarayanan Gunasekaran 5c7aa21643 Add test for index.Stats 2015-03-19 14:06:59 -04:00
Sathyanarayanan Gunasekaran d9a7a2e3a0 Add test for index.FieldDictPrefix 2015-03-19 14:06:59 -04:00
Sathyanarayanan Gunasekaran 5b4ee3e598 Add test for index.FieldDictRange 2015-03-19 14:06:59 -04:00
Marty Schoch 522f9d5cc7 significant change to index format, support dictionary rows
this introduces disk format v4
now the summary rows for a term are stored in their own
"dictionary row" format, previously the same information
was stored in special term frequency rows
this now allows us to easily iterate all the terms for a field
in sorted order (useful for many other fuzzy data structures)

at the top-level of bleve you can now browse terms within a field
using the following api on the Index interface:

  FieldDict(field string) (index.FieldDict, error)
  FieldDictRange(field string, startTerm []byte, endTerm []byte) (index.FieldDict, error)
  FieldDictPrefix(field string, termPrefix []byte) (index.FieldDict, error)

fixes #127
2015-03-10 16:22:19 -04:00
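To show why sorted term storage makes prefix and range browsing cheap, here is a sketch over an in-memory sorted slice. It does not use bleve's FieldDict types. The `termsWithPrefix` and `termsInRange` helpers are hypothetical, and the inclusive end bound is an assumption.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// termsWithPrefix returns the terms that start with prefix.
// The input must be sorted; matching terms are then contiguous.
func termsWithPrefix(sorted []string, prefix string) []string {
	start := sort.SearchStrings(sorted, prefix)
	end := start
	for end < len(sorted) && strings.HasPrefix(sorted[end], prefix) {
		end++
	}
	return sorted[start:end]
}

// termsInRange returns terms t with startTerm <= t <= endTerm.
// Treating the end bound as inclusive is an assumption of this sketch.
func termsInRange(sorted []string, startTerm, endTerm string) []string {
	lo := sort.SearchStrings(sorted, startTerm)
	hi := sort.Search(len(sorted), func(i int) bool {
		return sorted[i] > endTerm
	})
	if hi < lo {
		return nil
	}
	return sorted[lo:hi]
}

func main() {
	terms := []string{"bat", "beer", "beet", "cat"}
	fmt.Println(termsWithPrefix(terms, "bee"))
	fmt.Println(termsInRange(terms, "beer", "cat"))
}
```

Both lookups use a binary search to find the starting position and then read forward, which is the access pattern a sorted dictionary row layout allows.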
Marty Schoch af356acff0 changed batch behavior
now created through the index itself
mapping problems reported early at the time
data is added to the batch, previously these
were not reported until the batch was executed
2015-03-09 08:20:39 -04:00
Marty Schoch 521d6101fd fix issue identified by go vet 2015-01-19 15:50:07 -05:00
Marty Schoch 7e3ba85b9d added test and fixed behavior to ensure correct value is stored
optimization introduced last week inadvertently meant we were
not preserving the original byte values of text fields that
were stored
2015-01-19 15:40:18 -05:00
Steve Yen d442713de6 typo in storage type error message 2015-01-06 09:18:36 -08:00
Marty Schoch 5978f50b8c added ability to log slow searches
closes #88
2014-12-28 19:34:16 -08:00