Commit Graph

54 Commits

Author SHA1 Message Date
slavikm
680be52f87 Implemented boolean field support 2016-01-11 17:18:03 -08:00
Marty Schoch
d73beac3b9 search result hits now have a field with the name of the index
this allows you to figure out where a result actually came
from when using aliases
2015-12-08 13:55:04 -05:00
Marty Schoch
aa7658bbb0 give indexes names, make stats available via expvar by default 2015-12-06 14:01:03 -05:00
Marty Schoch
64ce81c283 Merge branch 'master' into newkvstore 2015-09-29 14:06:27 -04:00
Marty Schoch
da40935e22 Merge branch 'codesimplification' of https://github.com/Shugyousha/bleve into Shugyousha-codesimplification 2015-09-29 13:02:56 -04:00
Marty Schoch
cddf90c0ee don't allow operations on empty doc id
fixes #239
2015-09-28 17:00:08 -04:00
Marty Schoch
1c9feaf792 fix backwards compatibility when index meta does not specify
the index type
2015-09-25 09:57:09 -07:00
Marty Schoch
900f1b4a67 major kvstore interface and impl overhaul
clarified the interface contract
2015-09-23 11:25:47 -07:00
Silvan Jegen
3414701fca Simplify returns 2015-09-21 20:47:10 +02:00
Marty Schoch
f81b2be334 major refactor of bleve configuration
see #221 for full details
2015-09-16 17:10:59 -04:00
Marty Schoch
dbb93b75a4 refactoring to allow pluggable index encodings
this lays the foundation for supporting the new firestorm
indexing scheme.  i'm merging these changes ahead of
the rest of the firestorm branch so i can continue
to make changes to the analysis pipeline in parallel
2015-09-02 13:12:08 -04:00
Marty Schoch
a9c07acbfa refactor of kvstore api to support native merge in rocksdb
refactor to share code in emulated batch
refactor to share code in emulated merge
refactor index kvstore benchmarks to share more code
refactor index kvstore benchmarks to be more repeatable
2015-04-24 17:13:50 -04:00
Marty Schoch
11262c793f fix bug, internal ops must check that index is open
possibly fixes https://github.com/couchbaselabs/cbft/issues/49
2015-04-03 18:05:24 -04:00
Sathyanarayanan Gunasekaran
93e749bc0c Check all return errors
- Fix the following errors found by errcheck :
  $ bleve git:(master) errcheck github.com/blevesearch/bleve
  github.com/blevesearch/bleve/index_impl.go:206:25  defer indexReader.Close()
  github.com/blevesearch/bleve/index_impl.go:317:25  defer indexReader.Close()
  github.com/blevesearch/bleve/index_impl.go:353:25  defer indexReader.Close()
  github.com/blevesearch/bleve/index_impl.go:359:22  defer searcher.Close()
  github.com/blevesearch/bleve/index_impl.go:497:25  defer indexReader.Close()
  github.com/blevesearch/bleve/index_impl.go:644:20  defer reader.Close()
  github.com/blevesearch/bleve/index_meta.go:67:27   defer indexMetaFile.Close()
2015-03-11 01:28:51 -04:00
Marty Schoch
522f9d5cc7 significant change to index format, support dictionary rows
this introduces disk format v4
now the summary rows for a term are stored in their own
"dictionary row" format, previously the same information
was stored in special term frequency rows
this now allows us to easily iterate all the terms for a field
in sorted order (useful for many other fuzzy data structures)

at the top-level of bleve you can now browse terms within a field
using the following api on the Index interface:

  FieldDict(field string) (index.FieldDict, error)
  FieldDictRange(field string, startTerm []byte, endTerm []byte) (index.FieldDict, error)
  FieldDictPrefix(field string, termPrefix []byte) (index.FieldDict, error)

fixes #127
2015-03-10 16:22:19 -04:00
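The point of keeping terms in sorted order, as the commit above describes, is that range and prefix browsing become simple scans over a contiguous span. The sketch below illustrates that idea with a sorted slice from the standard library; it does not use bleve's FieldDict API or its row format. The helpers termsInRange and termsWithPrefix are hypothetical, and the inclusive end bound is an assumption made for this example.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// termsInRange returns the terms t with start <= t <= end.
// sorted must be in ascending order. The inclusive end is an assumption.
func termsInRange(sorted []string, start, end string) []string {
	lo := sort.SearchStrings(sorted, start)
	hi := sort.Search(len(sorted), func(i int) bool { return sorted[i] > end })
	if lo >= hi {
		return nil
	}
	return sorted[lo:hi]
}

// termsWithPrefix returns the contiguous run of terms that begin with prefix.
func termsWithPrefix(sorted []string, prefix string) []string {
	lo := sort.SearchStrings(sorted, prefix)
	hi := lo
	for hi < len(sorted) && strings.HasPrefix(sorted[hi], prefix) {
		hi++
	}
	return sorted[lo:hi]
}

func main() {
	terms := []string{"bar", "beer", "beet", "bleve", "cat"}
	fmt.Println(termsWithPrefix(terms, "be"))
	fmt.Println(termsInRange(terms, "beer", "bleve"))
}
```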
Marty Schoch
af356acff0 changed batch behavior
now created through the index itself
mapping problems reported early at the time
data is added to the batch, previously these
were not reported until the batch was executed
2015-03-09 08:20:39 -04:00
Marty Schoch
0771f813ce SearchResult Took field now returns full time in Search()
likewise, MultiSearch used by aliases spanning multiple
indexes will also return full time in MultiSearch()
closes #163
2015-02-19 12:11:40 +05:30
Marty Schoch
ba978ea27e improving log messages 2015-01-16 14:07:47 -05:00
Marty Schoch
1368d7b3b4 NewUsing persists the provided config to index meta
new method OpenUsing allows user to override values
in the persisted config
example would be opening the index, but using a different
buffer size for leveldb (not actually supported yet, but that
is the idea)
closes #138
2015-01-06 17:19:46 -05:00
Marty Schoch
435058a928 fix go vet issue 2014-12-28 19:44:03 -08:00
Marty Schoch
5978f50b8c added ability to log slow searches
closes #88
2014-12-28 19:34:16 -08:00
Marty Schoch
68712cd142 support for accessing the underlying index/store impls
now you can access the underlying index/store implementations
using the Advanced() method.  this is intended for advanced
usage only, and can lead to problems if misused.

also, there is a new method NewUsing(...) which allows callers
of the top-level API to choose which underlying k/v store they
want to use.
2014-12-27 13:23:46 -08:00
Silvan Jegen
ef18dfe4cd Fix typos in comments and strings 2014-12-18 18:43:12 +01:00
Marty Schoch
6141a5aad3 make batch internals private
closes #119
2014-11-25 11:11:28 -05:00
Marty Schoch
c7443fe52b refactored API a bit
more things can return error now
in a couple of places we had to swallow errors because they didn't
fit the existing API.  in these cases and proactively in a few
others we now return error as well.

also the batch API has been updated to allow performing
set/delete internal within the batch
2014-10-31 09:40:23 -04:00
Marty Schoch
0500a572af exposed Get/Set/Delete Internal methods
these are to be used to store side-channel information
along with the index
2014-10-22 16:03:55 -04:00
Marty Schoch
7bf44e1ba7 added ability to return all document fields by requesting field * 2014-10-15 19:16:16 -04:00
Marty Schoch
64b0066121 added support for tracking index stats and exposing via expvar
closes #83
2014-10-02 11:12:49 -07:00
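For readers unfamiliar with expvar, the sketch below shows how counters can be published with Go's standard expvar package. It is not bleve's stats code: the map name "toy_index_stats", the counter keys, and the helper functions are hypothetical.

```go
package main

import (
	"expvar"
	"fmt"
)

// stats is published once, under a hypothetical name. expvar.NewMap panics
// if the same name is registered twice, so it lives in a package-level var.
var stats = expvar.NewMap("toy_index_stats")

// recordIndexed adds n to the "updates" counter.
func recordIndexed(n int64) { stats.Add("updates", n) }

// recordSearch increments the "searches" counter.
func recordSearch() { stats.Add("searches", 1) }

// counter returns the current value of a counter, or 0 if it was never set.
func counter(name string) int64 {
	if v, ok := stats.Get(name).(*expvar.Int); ok {
		return v.Value()
	}
	return 0
}

func main() {
	recordIndexed(3)
	recordSearch()
	fmt.Println(counter("updates"), counter("searches"))
	fmt.Println(stats.String()) // JSON rendering of the whole map
}
```

Importing expvar also registers a handler at /debug/vars on http.DefaultServeMux, so a program that serves the default mux exposes these values over HTTP without extra code.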
Marty Schoch
97902e2619 text analysis now moved out of index write lock onto goroutine
1. text analysis is now done before the write lock is acquired
2. there is now a pool of analysis workers
3. the size of this pool is configurable
4. this allows for documents in a batch to be analyzed concurrently

as a part of benchmarking these changes i've also introduced a new
null storage implementation.  this should never be used, as it
does not actually build an index.  it does however let us go
through all the normal indexing machinery, without incurring
any indexing I/O.  this is very helpful in measuring improvements
made to the text analysis pipeline, which are often overshadowed
by indexing times in benchmarks actually building an index.
2014-09-24 08:13:14 -04:00
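The four points in the commit above can be sketched as follows: a configurable pool of workers analyzes the documents of a batch concurrently, and the write lock is taken only afterwards to apply the results. This uses the standard library only and is not bleve's implementation. The types toyIndex and analyzed, and the trivial lowercase-and-split analyze function, are stand-ins invented for this example.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// analyzed is the result of analyzing one document.
type analyzed struct {
	id     string
	tokens []string
}

// toyIndex is a tiny in-memory inverted index guarded by a mutex.
type toyIndex struct {
	mu       sync.Mutex
	postings map[string][]string // term -> document ids
}

// analyze is a stand-in for a real text analysis pipeline.
func analyze(id, text string) analyzed {
	return analyzed{id: id, tokens: strings.Fields(strings.ToLower(text))}
}

// indexBatch analyzes all docs using the given number of workers, then
// acquires the write lock once to apply the results.
func (ix *toyIndex) indexBatch(docs map[string]string, workers int) {
	if workers < 1 {
		workers = 1
	}
	type job struct{ id, text string }
	jobs := make(chan job)
	results := make(chan analyzed, len(docs)) // buffered so workers never block

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := range jobs {
				results <- analyze(j.id, j.text)
			}
		}()
	}
	for id, text := range docs {
		jobs <- job{id, text}
	}
	close(jobs)
	wg.Wait()
	close(results)

	// Analysis is finished; only now is the lock held.
	ix.mu.Lock()
	defer ix.mu.Unlock()
	for r := range results {
		for _, tok := range r.tokens {
			ix.postings[tok] = append(ix.postings[tok], r.id)
		}
	}
}

func main() {
	ix := &toyIndex{postings: map[string][]string{}}
	ix.indexBatch(map[string]string{"a": "Hello World", "b": "hello bleve"}, 4)
	fmt.Println(len(ix.postings["hello"]), len(ix.postings["world"]))
}
```

Keeping the CPU-bound analysis outside the critical section means the lock is held only for the comparatively short step of applying results.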
Marty Schoch
198ca1ad4d major refactor of kvstore/index internals, see below
In the index/store package
introduce KVReader
  creates snapshot
  all read operations consistent from this snapshot
  must close to release

introduce KVWriter
  only one writer active
  access to all operations
  allows for consistent read-modify-write
  must close to release

introduce AssociativeMerge operation on batch
  allows efficient read-modify-write
  for associative operations
  used to consolidate updates to the term summary rows
  saves 1 set and 1 get op per shared instance of term in field

In the index package
introduced an IndexReader
  exposes a consistent snapshot of the index for searching

At top level
  All searches now operate on a consistent snapshot of the index
2014-09-12 17:21:35 -04:00
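The reader contract described above (a reader sees one consistent snapshot, regardless of later writes) can be illustrated with a copy-on-write map. This is only a sketch of the idea, not the KVReader or KVWriter interfaces from the commit. The names snapStore and snapReader are hypothetical, and Close here merely drops the reference, whereas a real store would release an underlying snapshot.

```go
package main

import (
	"fmt"
	"sync"
)

// snapStore publishes an immutable map; every write installs a fresh copy.
type snapStore struct {
	mu   sync.Mutex
	data map[string]string // never mutated after being published
}

// snapReader holds the map that was current when the reader was created.
type snapReader struct{ snap map[string]string }

// Reader captures the current state as a snapshot.
func (s *snapStore) Reader() *snapReader {
	s.mu.Lock()
	defer s.mu.Unlock()
	return &snapReader{snap: s.data}
}

// Get reads from the snapshot only.
func (r *snapReader) Get(k string) (string, bool) {
	v, ok := r.snap[k]
	return v, ok
}

// Close releases the snapshot (here: just drops the reference).
func (r *snapReader) Close() { r.snap = nil }

// Set copies the current map, applies the write, and publishes the copy,
// so existing readers are unaffected.
func (s *snapStore) Set(k, v string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	next := make(map[string]string, len(s.data)+1)
	for kk, vv := range s.data {
		next[kk] = vv
	}
	next[k] = v
	s.data = next
}

func main() {
	s := &snapStore{data: map[string]string{}}
	s.Set("term", "v1")
	r := s.Reader()
	s.Set("term", "v2") // not visible through r
	old, _ := r.Get("term")
	r.Close()
	r2 := s.Reader()
	cur, _ := r2.Get("term")
	r2.Close()
	fmt.Println(old, cur)
}
```

Copying the whole map on every write is far too expensive for a real store; it is used here only because it makes the snapshot semantics easy to see.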
Marty Schoch
28980c4da1 fix issues identified by go lint 2014-09-02 17:40:46 -04:00
Marty Schoch
bbc6fadf69 changed error constants to camel case
all caps constants are not idiomatic go
2014-09-02 14:14:05 -04:00
Marty Schoch
2ee7289bc8 major refactor of search package
this started initially to relocate highlighting into
a self contained package, which would then also use
the registry
however, it turned into a much larger refactor in
order to avoid cyclic imports
now facets, searchers, scorers and collectors
are also broken out into subpackages of search
2014-09-01 11:15:38 -04:00
Marty Schoch
209f808722 improve go docs at the top level
part of #79
2014-08-31 10:55:22 -04:00
Marty Schoch
862205f184 fix deadlock
Search() would attempt to reacquire mutex when invoking Document()
should instead call index.Document(), read mutex is already
acquired

closes #87
2014-08-30 14:49:16 -04:00
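The deadlock fixed above follows a well-known pattern: Go's sync.RWMutex documentation prohibits recursive read locking, because a second RLock on the same goroutine can block forever once a writer is waiting. A common remedy, sketched below with hypothetical names (lockedIndex, documentLocked), is to have the public method take the lock and delegate to an internal helper that assumes the lock is already held.

```go
package main

import (
	"fmt"
	"sync"
)

type lockedIndex struct {
	mu   sync.RWMutex
	docs map[string]string
}

// Document is the public entry point and takes the read lock itself.
func (l *lockedIndex) Document(id string) (string, bool) {
	l.mu.RLock()
	defer l.mu.RUnlock()
	return l.documentLocked(id)
}

// documentLocked assumes the caller already holds l.mu.
func (l *lockedIndex) documentLocked(id string) (string, bool) {
	body, ok := l.docs[id]
	return body, ok
}

// Search holds the read lock for its whole duration, so it calls
// documentLocked rather than Document to avoid locking twice.
func (l *lockedIndex) Search(ids ...string) []string {
	l.mu.RLock()
	defer l.mu.RUnlock()
	var out []string
	for _, id := range ids {
		if body, ok := l.documentLocked(id); ok {
			out = append(out, body)
		}
	}
	return out
}

func main() {
	ix := &lockedIndex{docs: map[string]string{"1": "pale ale", "2": "stout"}}
	fmt.Println(ix.Search("1", "missing", "2"))
}
```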
Marty Schoch
8c6427959c made more of index mapping private 2014-08-30 00:06:16 -04:00
Marty Schoch
2ea1f526e7 made more mapping methods private 2014-08-29 23:50:47 -04:00
Marty Schoch
7313e7247e renamed SyntaxQuery QueryStringQuery
also made IndexMeta private
2014-08-29 15:19:02 -04:00
Marty Schoch
37d3f0205d cleanup spacing between license and package 2014-08-29 14:18:36 -04:00
Marty Schoch
1161361bea rename imports from couchbaselabs to blevesearch 2014-08-28 15:38:57 -04:00
Marty Schoch
c9423aa24b change default highlighter to return 1 fragment 2014-08-28 14:45:51 -04:00
Marty Schoch
45a7a6dd8e fix two missing Close calls holding iterators open 2014-08-25 15:13:15 -04:00
Marty Schoch
34afb0929e made it safe to use bleve.Index object from multiple threads
an RWMutex ensures that only one write op is done at a time, and
that all other ops have finished prior to closing
2014-08-25 09:06:53 -04:00
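The locking discipline described in the commit above can be sketched with sync.RWMutex: writes and Close take the exclusive lock, reads take the shared lock, so Close cannot proceed until in-flight operations finish. This is an illustration, not bleve's code. The type safeIndex and the error errClosed are hypothetical, and the open check on each operation is an addition made for this example.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errClosed is a hypothetical error returned after Close.
var errClosed = errors.New("index is closed")

type safeIndex struct {
	mu   sync.RWMutex
	open bool
	docs map[string]string
}

func newSafeIndex() *safeIndex {
	return &safeIndex{open: true, docs: map[string]string{}}
}

// Index is a write op: the exclusive lock allows only one at a time.
func (s *safeIndex) Index(id, body string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if !s.open {
		return errClosed
	}
	s.docs[id] = body
	return nil
}

// Get is a read op: many may hold the shared lock concurrently.
func (s *safeIndex) Get(id string) (string, error) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	if !s.open {
		return "", errClosed
	}
	return s.docs[id], nil
}

// Close waits for all in-flight ops by taking the exclusive lock.
func (s *safeIndex) Close() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.open = false
}

func main() {
	ix := newSafeIndex()
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			_ = ix.Index(fmt.Sprintf("doc-%d", n), "body")
		}(i)
	}
	wg.Wait()
	ix.Close()
	fmt.Println(ix.Index("late", "body"))
}
```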
Marty Schoch
27f001bc14 overhauled top-level New/Open API
New is now used to create new indexes
Open is used to open existing indexes
calls to Open no longer specify a mapping because the mapping
is serialized and stored along with the index
2014-08-20 16:58:20 -04:00
Marty Schoch
c33f1668f7 refactor dump methods
improved test coverage
2014-08-15 13:12:55 -04:00
Marty Schoch
c526a38369 major refactor of analysis files, now wired up to registry
ultimately this is to make it more convenient for us to wire up
different elements of the analysis pipeline, without having to
preload everything into memory before we need it

separately the index layer now has a mechanism for storing
internal key/value pairs.  this is expected to be used to
store the mapping, and possibly other pieces of data by the
top layer, but not exposed to the user at the top.
2014-08-13 21:14:47 -04:00
Marty Schoch
e5d4e6f1e4 refactored index layer to support batch operations
this change was then exposed at the higher levels
also the beer-sample app was upgraded to index in batches of 100
by default.  this yielded an indexing speedup from 27s to 16s.
closes #57
2014-08-11 16:27:18 -04:00
Marty Schoch
42895649de further streamlined the API
introduced concept of byte array converters
right now only wired up to top-level index mapping
allowing the removal of the JSON methods, now at the top level
we default to parsing []byte as JSON, override if that's not
the behavior you want.

future enhancements will allow use of these byte array converters
to control how byte arrays are handled elsewhere in documents
this would allow for handling binary attachments, etc. in the future

closes #59
2014-08-11 12:47:29 -04:00
Marty Schoch
7bbaa8ecd5 added support for returning facet results with requests
supports terms, numeric ranges, and date ranges
closes #14
2014-08-11 11:03:29 -04:00
Marty Schoch
41d4f67ee2 fix storing/retrieving numeric and date fields
also includes new ability to request stored fields be returned with results

closes #55 and closes #56 and closes #58
2014-08-06 13:52:20 -04:00