44 Commits

Author SHA1 Message Date
Marty Schoch c74e08f039 BREAKING API CHANGE - use stdlib context pkg
update all references to context to use std lib pkg
2018-02-27 11:33:43 -08:00
Andrey Khomenko dc9f994d95 Update index.go 2017-07-20 12:06:45 -04:00
Marty Schoch 0cbe211120 add support for BleveType() alternative for type detection
Many existing structs already have a Type field or method which
conflicts with the bleve Classifier interface.  To address this
without breaking existing applications, we introduce an
alternate BleveType() method which will be checked first.  The
interface describing this method is private, as it should never
need to be referenced outside this package.

fixes #283
2017-05-19 09:22:12 -04:00
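
The lookup order described in commit 0cbe211120 can be sketched roughly as follows. This is a minimal standalone illustration, not the package's actual code: the names bleveClassifier, determineType, Invoice and Beer are assumptions made for the example, and only the Classifier interface and the BleveType() method are named in the commit message.

```go
package main

import "fmt"

// Classifier mirrors the public interface named in the commit message.
type Classifier interface {
	Type() string
}

// bleveClassifier is the private alternative (the name is assumed here).
type bleveClassifier interface {
	BleveType() string
}

// Invoice already has a Type field, so it cannot also have a Type() method.
// It opts in to type detection through BleveType() instead.
type Invoice struct {
	Type string
}

func (Invoice) BleveType() string { return "invoice" }

// Beer has no conflict and uses the original Type() method.
type Beer struct{}

func (Beer) Type() string { return "beer" }

// determineType is a hypothetical helper: BleveType() is checked first,
// then Type(), and finally the supplied default is used.
func determineType(data interface{}, defaultType string) string {
	if c, ok := data.(bleveClassifier); ok {
		return c.BleveType()
	}
	if c, ok := data.(Classifier); ok {
		return c.Type()
	}
	return defaultType
}

func main() {
	fmt.Println(determineType(Invoice{Type: "paid"}, "_default"))
	fmt.Println(determineType(Beer{}, "_default"))
	fmt.Println(determineType(42, "_default"))
}
```

Keeping the second interface unexported means callers only need to add a method; they never have to reference the interface by name.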
Seif Lotfy 06b4daed87 Add new IndexAdvanced function 2017-04-12 00:31:51 +02:00
Rob McColl 414d21a541 Add comment about JSON serialization of kvconfig 2016-10-19 10:19:14 -04:00
Marty Schoch 2332455bd2 nicer formatting of license header 2016-10-02 10:13:14 -04:00
Marty Schoch 79cc39a67e refactor mapping to interface and move into separate package
the index mapping contains some relatively messy logic
and the top-level bleve package only cares about a relatively
small portion of this
the motivation for this change is to codify the part that the
top-level bleve package cares about into an interface
then move all the details into its own package

NOTE: the top-level bleve package still has hard dependency on
the actual implementation (for now) because it must deserialize
mappings from JSON and simply assumes it is this one instance.
this is seen as OK for now, and this issue could be revisited
in a future change.  moving the logic into a separate package
is seen as a simplification of top-level bleve, even though
we still depend on the one particular implementation.
2016-09-29 14:53:18 -04:00
Marty Schoch fb0f4bbecd BREAKING CHANGE - new method to create memory only index
Previously bleve allowed you to create a memory-only index by
simply passing "" as the path argument to the New() method.

This was not clear when reading the code, and led to some
problematic error cases as well.

Now, to create a memory-only index one should use the
NewMemOnly() method.  Passing "" as the path argument
to the New() method will now return os.ErrInvalid.

Advanced users calling NewUsing() can create disk-based or
memory-only indexes, but the change here is that passing ""
as the path argument no longer defaults you into getting
a memory-only index.  Instead, the KV store is selected
manually, just as it is for the disk-based solutions.

Here is an example use of the NewUsing() method to create
a memory-only index:

NewUsing("", indexMapping, Config.DefaultIndexType,
         Config.DefaultMemKVStore, nil)

Config.DefaultMemKVStore is just a new default value
added to the configuration; it currently points to
gtreap.Name (which could have been used directly
instead for more control).

closes #427
2016-09-27 14:11:40 -04:00
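
The contract described in commit fb0f4bbecd (an empty path is rejected, memory-only indexes are requested explicitly) can be modelled with a toy example. Everything here is hypothetical: toyIndex, newIndex and newMemOnly are stand-ins invented for illustration and do not reproduce the real New() and NewMemOnly() implementations. Only the use of os.ErrInvalid for an empty path comes from the commit message.

```go
package main

import (
	"fmt"
	"os"
)

// toyIndex is a hypothetical stand-in for an index handle.
type toyIndex struct {
	path    string
	memOnly bool
}

// newIndex models the new behaviour of New(): an empty path is an error
// rather than a silent request for a memory-only index.
func newIndex(path string) (*toyIndex, error) {
	if path == "" {
		return nil, os.ErrInvalid
	}
	return &toyIndex{path: path}, nil
}

// newMemOnly models NewMemOnly(): the caller states the intent explicitly.
func newMemOnly() (*toyIndex, error) {
	return &toyIndex{memOnly: true}, nil
}

func main() {
	if _, err := newIndex(""); err != nil {
		fmt.Println("empty path rejected:", err)
	}
	idx, _ := newMemOnly()
	fmt.Println("memory only:", idx.memOnly)
}
```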
Marty Schoch 3fd2a64872 BREAKING CHANGE - removed DumpXXX() methods from bleve.Index
The DumpXXX() methods were always documented as internal and
unsupported.  However, now they are being removed from the
public top-level API.  They are still available on the internal
IndexReader, which can be accessed using the Advanced() method.

The DocCount() and DumpXXX() methods on the internal index
have moved to the internal index reader, since they logically
operate on a snapshot of an index.
2016-09-13 12:40:01 -04:00
Marty Schoch d7292ed891 add support for gathering stats via map for easier consumption 2016-03-07 18:37:46 -05:00
Marty Schoch 0b2380d9bf introduce ability for searches to timeout or be cancelled
our implementation uses: golang.org/x/net/context

New method SearchInContext() allows the user to run a search
in the provided context.  If that context is cancelled or
exceeds its deadline Bleve will attempt to stop and return
as soon as possible.  This is a *best effort* attempt at this
time and may *not* be in a timely manner.  If the caller must
return very near the timeout, the call should also be wrapped
in a goroutine.

The IndexAlias implementation is affected in a slightly more
complex way.  In order to return partial results when a timeout
occurs on some indexes, the timeout is strictly enforced, and
at the moment this does introduce an additional goroutine.

The Bleve implementation honoring the context is currently
very coarse-grained.  Specifically we check the Done() channel
between each DocumentMatch produced during the search.  In the
future we will propagate the context deeper into the internals
of Bleve, and this will allow finer-grained timeout behavior.
2016-03-02 17:30:21 -05:00
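
The coarse-grained check described in commit 0b2380d9bf, polling Done() between each produced match, can be sketched as below. This is an illustration under stated assumptions rather than the actual search loop: collect, searchWithCancel and the string "matches" are hypothetical, and the sketch uses the standard library context package (adopted later in c74e08f039) instead of golang.org/x/net/context.

```go
package main

import (
	"context"
	"fmt"
)

// collect drains a hypothetical stream of matches, checking ctx.Done()
// once between each match. Work inside next() is never interrupted,
// which is why the behaviour is only best effort.
func collect(ctx context.Context, next func() (string, bool)) ([]string, error) {
	var hits []string
	for {
		select {
		case <-ctx.Done():
			// Return whatever was gathered so far along with the reason.
			return hits, ctx.Err()
		default:
		}
		hit, ok := next()
		if !ok {
			return hits, nil
		}
		hits = append(hits, hit)
	}
}

// searchWithCancel produces up to total matches and cancels the context
// once cancelAfter matches have been produced (0 means never cancel).
func searchWithCancel(total, cancelAfter int) (int, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	produced := 0
	next := func() (string, bool) {
		if produced == total {
			return "", false
		}
		produced++
		if produced == cancelAfter {
			cancel() // simulate the caller giving up mid-search
		}
		return fmt.Sprintf("doc-%d", produced), true
	}

	hits, err := collect(ctx, next)
	return len(hits), err
}

func main() {
	n, err := searchWithCancel(10, 3)
	fmt.Println(n, err)
	n, err = searchWithCancel(10, 0)
	fmt.Println(n, err)
}
```

Because the check happens only between matches, a slow next() call delays the return. A caller that must return very near the timeout would still wrap the call in a goroutine, as the commit message suggests.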
opennota 8517feb1c6 Fix some typos 2016-01-15 05:46:27 +07:00
Marty Schoch ab67b2f642 Merge pull request #267 from pmezard/doc-dump-methods
index: document DumpAll, DumpDoc and DumpFields methods
2016-01-05 09:55:35 -05:00
Marty Schoch aa7658bbb0 give indexes names, make stats available via expvar by default 2015-12-06 14:01:03 -05:00
Patrick Mezard 03b78deb5c index: do not mention locking in DumpAll documentation
The behaviour depends on the nature of the KVStore.
2015-11-13 17:01:18 +01:00
Patrick Mezard 97529b1925 index: document DumpAll, DumpDoc and DumpFields methods 2015-11-03 18:11:02 +01:00
Patrick Mezard 2fa334fc27 doc: talk about "documents" not "indexed or stored documents" 2015-10-20 20:24:24 +02:00
Patrick Mezard b174c137fd doc: document DocIDReader, and some Index bits 2015-10-20 20:24:24 +02:00
Patrick Mezard ed1bdbf599 doc: document field analyzer resolution 2015-10-02 17:00:45 +02:00
Marty Schoch 09bde6ca87 Merge pull request #237 from pmezard/document-mapping-rules
doc: document indexed value/mappings interactions
2015-09-29 12:51:46 -04:00
Marty Schoch cddf90c0ee don't allow operations on empty doc id
fixes #239
2015-09-28 17:00:08 -04:00
Patrick Mezard f72172a902 doc: document indexed value/mappings interactions
This is not the final word on this but it would have helped me a lot
starting with bleve. I left out details about value processing and
custom parsers. I also ignored that named FieldMapping can directly
resolve value fields without having a parent SubDocumentMapping because
it did not appear in any example I read.

Let's consider this as a starting point for documentation improvements.
2015-09-23 19:57:38 +02:00
Marty Schoch dbb93b75a4 refactoring to allow pluggable index encodings
this lays the foundation for supporting the new firestorm
indexing scheme.  i'm merging these changes ahead of
the rest of the firestorm branch so i can continue
to make changes to the analysis pipeline in parallel
2015-09-02 13:12:08 -04:00
Marty Schoch 328bc73ed0 clarify Batch is not threadsafe in docs
in some limited cases we can detect unsafe usage
in these cases, do not trip over ourselves and panic
instead return a strongly typed error upside_down.UnsafeBatchUseDetected
also, introduced Batch.Reset() to allow batch reuse
this is currently still experimental
closes #195
2015-05-15 15:04:52 -04:00
Marty Schoch 8581e73cef added String method for Batch
also changed Batch methods to pointer receiver
closes #180
2015-04-08 10:41:42 -04:00
Marty Schoch 522f9d5cc7 significant change to index format, support dictionary rows
this introduces disk format v4
now the summary rows for a term are stored in their own
"dictionary row" format, previously the same information
was stored in special term frequency rows
this now allows us to easily iterate all the terms for a field
in sorted order (useful for many other fuzzy data structures)

at the top-level of bleve you can now browse terms within a field
using the following api on the Index interface:

  FieldDict(field string) (index.FieldDict, error)
  FieldDictRange(field string, startTerm []byte, endTerm []byte) (index.FieldDict, error)
  FieldDictPrefix(field string, termPrefix []byte) (index.FieldDict, error)

fixes #127
2015-03-10 16:22:19 -04:00
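
The sorted-order browsing that commit 522f9d5cc7 enables can be illustrated with an in-memory stand-in. This sketch does not use the on-disk dictionary rows or the real index.FieldDict type: dictEntry, fieldDict, termRange and termPrefix are hypothetical names, and treating the end term as inclusive is an assumption of the example.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// dictEntry is a hypothetical summary row: a term and its document count.
type dictEntry struct {
	Term  string
	Count uint64
}

// fieldDict keeps the entries for one field sorted by term.
type fieldDict struct {
	entries []dictEntry
}

func newFieldDict(counts map[string]uint64) *fieldDict {
	entries := make([]dictEntry, 0, len(counts))
	for term, count := range counts {
		entries = append(entries, dictEntry{Term: term, Count: count})
	}
	sort.Slice(entries, func(i, j int) bool { return entries[i].Term < entries[j].Term })
	return &fieldDict{entries: entries}
}

// termRange returns entries with start <= term <= end (inclusive end assumed).
func (d *fieldDict) termRange(start, end string) []dictEntry {
	lo := sort.Search(len(d.entries), func(i int) bool { return d.entries[i].Term >= start })
	hi := sort.Search(len(d.entries), func(i int) bool { return d.entries[i].Term > end })
	if hi < lo {
		return nil
	}
	return d.entries[lo:hi]
}

// termPrefix returns entries whose term begins with prefix. Sorted order
// makes them contiguous, starting at the first term >= prefix.
func (d *fieldDict) termPrefix(prefix string) []dictEntry {
	lo := sort.Search(len(d.entries), func(i int) bool { return d.entries[i].Term >= prefix })
	hi := lo
	for hi < len(d.entries) && strings.HasPrefix(d.entries[hi].Term, prefix) {
		hi++
	}
	return d.entries[lo:hi]
}

func main() {
	d := newFieldDict(map[string]uint64{"beer": 3, "bear": 1, "beet": 2, "cat": 5, "apple": 1})
	for _, e := range d.termPrefix("bee") {
		fmt.Println(e.Term, e.Count)
	}
	for _, e := range d.termRange("b", "bz") {
		fmt.Println(e.Term, e.Count)
	}
}
```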
Marty Schoch af356acff0 changed batch behavior
now created through the index itself
mapping problems reported early at the time
data is added to the batch, previously these
were not reported until the batch was executed
2015-03-09 08:20:39 -04:00
Marty Schoch 1368d7b3b4 NewUsing persists the provided config to index meta
new method OpenUsing allows user to override values
in the persisted config
example would be opening the index, but using a different
buffer size for leveldb (not actually supported yet, but that
is the idea)
closes #138
2015-01-06 17:19:46 -05:00
Marty Schoch 68712cd142 support for accessing the underlying index/store impls
now you can access the underlying index/store implementations
using the Advanced() method.  this is intended for advanced
usage only, and can lead to problems if misused.

also, there is a new method NewUsing(...) which allows callers
of the top-level API to choose which underlying k/v store they
want to use.
2014-12-27 13:23:46 -08:00
Marty Schoch 6141a5aad3 make batch internals private
closes #119
2014-11-25 11:11:28 -05:00
Marty Schoch c7443fe52b refactored API a bit
more things can return error now
in a couple of places we had to swallow errors because they didn't
fit the existing API.  in these cases and proactively in a few
others we now return error as well.

also the batch API has been updated to allow performing
set/delete internal within the batch
2014-10-31 09:40:23 -04:00
Marty Schoch 0500a572af exposed Get/Set/Delete Internal methods
these are to be used to store side-channel information
along with the index
2014-10-22 16:03:55 -04:00
Marty Schoch 64b0066121 added support for tracking index stats and exposing via expvar
closes #83
2014-10-02 11:12:49 -07:00
Marty Schoch 209f808722 improve go docs at the top level
part of #79
2014-08-31 10:55:22 -04:00
Marty Schoch 37d3f0205d cleanup spacing between license and package 2014-08-29 14:18:36 -04:00
Marty Schoch 1161361bea rename imports from couchbaselabs to blevesearch 2014-08-28 15:38:57 -04:00
Marty Schoch 34afb0929e made it safe to use bleve.Index object from multiple threads
an RWMutex ensures that only one write op is done at a time, and
that all other ops have finished prior to closing
2014-08-25 09:06:53 -04:00
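
The locking discipline described in commit 34afb0929e can be sketched as follows. This is one possible arrangement, not the actual implementation: safeIndex, errClosed and concurrentWrites are hypothetical, and which operations take the exclusive lock and which take the shared lock is an assumption of the sketch.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errClosed = errors.New("index closed")

// safeIndex is a hypothetical index guarded by a sync.RWMutex.
type safeIndex struct {
	mu   sync.RWMutex
	open bool
	docs map[string]string
}

func newSafeIndex() *safeIndex {
	return &safeIndex{open: true, docs: make(map[string]string)}
}

// Index takes the exclusive lock, so only one write runs at a time.
func (i *safeIndex) Index(id, body string) error {
	i.mu.Lock()
	defer i.mu.Unlock()
	if !i.open {
		return errClosed
	}
	i.docs[id] = body
	return nil
}

// DocCount takes the shared lock, so reads may run concurrently.
func (i *safeIndex) DocCount() (int, error) {
	i.mu.RLock()
	defer i.mu.RUnlock()
	if !i.open {
		return 0, errClosed
	}
	return len(i.docs), nil
}

// Close takes the exclusive lock, which blocks until every in-flight
// read and write has released its lock.
func (i *safeIndex) Close() {
	i.mu.Lock()
	defer i.mu.Unlock()
	i.open = false
}

// concurrentWrites indexes n documents from n goroutines and returns the count.
func concurrentWrites(n int) int {
	idx := newSafeIndex()
	var wg sync.WaitGroup
	for k := 0; k < n; k++ {
		wg.Add(1)
		go func(k int) {
			defer wg.Done()
			_ = idx.Index(fmt.Sprintf("doc-%d", k), "body")
		}(k)
	}
	wg.Wait()
	count, _ := idx.DocCount()
	return count
}

func main() {
	fmt.Println(concurrentWrites(50))
	idx := newSafeIndex()
	idx.Close()
	fmt.Println(idx.Index("a", "b"))
}
```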
Marty Schoch 27f001bc14 overhauled top-level New/Open API
New is now used to create new indexes
Open is used to open existing indexes
calls to Open no longer specify a mapping because the mapping
is serialized and stored along with the index
2014-08-20 16:58:20 -04:00
Marty Schoch c33f1668f7 refactor dump methods
improved test coverage
2014-08-15 13:12:55 -04:00
Marty Schoch e5d4e6f1e4 refactored index layer to support batch operations
this change was then exposed at the higher levels
also the beer-sample app was upgraded to index in batches of 100
by default.  this yielded an indexing speedup from 27s to 16s.
closes #57
2014-08-11 16:27:18 -04:00
Marty Schoch 42895649de further streamlined the API
introduced concept of byte array converters
right now only wired up to top-level index mapping
allowing the removal of the JSON methods, now at the top level
we default to parsing []byte as JSON, override if that's not
the behavior you want.

future enhancements will allow use of these byte array converters
to control how byte arrays are handled elsewhere in documents
this would allow for handling binary attachments, etc in the future

closes #59
2014-08-11 12:47:29 -04:00
Marty Schoch 4ae9eb895c added method to list fields in the index
also added a corresponding http handler
2014-07-31 11:47:36 -04:00
Marty Schoch 216767953c introduced a config option to disable creating indexes if they don't already exist
closes #23 and closes #24
2014-07-30 14:29:26 -04:00
Marty Schoch 2968d3538a major refactor, apologies for the large commit
removed analyzers (these are now built as needed through config)
removed html character filter (now built as needed through config)
added missing license header
changed constructor signature of filters that cannot return errors
filter constructors that can have errors, now have Must variant which panics
change cdl2 tokenizer into filter (should only see lower-case input)
new top level index api, closes #5
refactored index tests to not rely directly on analyzers
moved query objects to top-level
new top level search api, closes #12
top score collector allows skipping results
index mapping supports _all by default, closes #3 and closes #6
index mapping supports disabled sections, closes #7
new http sub package with reusable http.Handler's, closes #22
2014-07-30 12:30:38 -04:00
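
The "Must variant which panics" convention mentioned in commit 2968d3538a follows the same pattern as regexp.MustCompile in the standard library. A minimal sketch, assuming a hypothetical regexp-based filter (patternFilter, newPatternFilter, mustNewPatternFilter and panics are invented for the example and are not the package's real filters):

```go
package main

import (
	"fmt"
	"regexp"
)

// patternFilter is a hypothetical filter that replaces every match of a
// regular expression with a single space.
type patternFilter struct {
	re *regexp.Regexp
}

// newPatternFilter can fail, because the pattern may not compile.
func newPatternFilter(pattern string) (*patternFilter, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	return &patternFilter{re: re}, nil
}

// mustNewPatternFilter is the Must variant: it panics instead of returning
// an error, which suits patterns that are fixed at compile time.
func mustNewPatternFilter(pattern string) *patternFilter {
	f, err := newPatternFilter(pattern)
	if err != nil {
		panic(err)
	}
	return f
}

func (f *patternFilter) Filter(input string) string {
	return f.re.ReplaceAllString(input, " ")
}

// panics reports whether calling fn panics.
func panics(fn func()) (p bool) {
	defer func() {
		if recover() != nil {
			p = true
		}
	}()
	fn()
	return false
}

func main() {
	f := mustNewPatternFilter(`<[^>]*>`)
	fmt.Println(f.Filter("<b>bold</b>"))

	_, err := newPatternFilter("(")
	fmt.Println(err != nil)
	fmt.Println(panics(func() { mustNewPatternFilter("(") }))
}
```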