Commit Graph

261 Commits

Author SHA1 Message Date
Marty Schoch
64b0066121 added support for tracking index stats and exposing via expvar
closes #83
2014-10-02 11:12:49 -07:00
Marty Schoch
97902e2619 text analysis now moved out of index write lock onto goroutine
1. text analysis is now done before the write lock is acquired
2. there is now a pool of analysis workers
3. the size of this pool is configurable
4. this allows for documents in a batch to be analyzed concurrently

as a part of benchmarking these changes I've also introduced a new
null storage implementation.  this should never be used, as it
does not actually build an index.  it does however let us go
through all the normal indexing machinery, without incurring
any indexing I/O.  this is very helpful in measuring improvements
made to the text analysis pipeline, which are often overshadowed
by indexing times in benchmarks actually building an index.
2014-09-24 08:13:14 -04:00
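The four points in the commit above can be sketched roughly as follows. This is a minimal illustration, not bleve's actual code: the `index`, `analysisResult`, `analyze`, and `batch` names are hypothetical, and the "analyzer" is just lowercasing plus whitespace splitting. The point is only the ordering: documents are analyzed concurrently on a configurable pool of goroutines, and the write lock is taken once, after analysis has finished.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// analysisResult is a hypothetical stand-in for the output of text analysis.
type analysisResult struct {
	docID  string
	tokens []string
}

// analyze is a placeholder analyzer: lowercase, then split on whitespace.
func analyze(docID, text string) analysisResult {
	return analysisResult{docID: docID, tokens: strings.Fields(strings.ToLower(text))}
}

// index is a toy index whose postings map is guarded by a write lock.
type index struct {
	mu       sync.Mutex
	postings map[string][]string // term -> doc IDs
}

// batch analyzes all docs on a pool of `workers` goroutines and only
// then acquires the write lock to apply the results.
func (idx *index) batch(docs map[string]string, workers int) {
	type job struct{ id, text string }
	jobs := make(chan job)
	results := make(chan analysisResult)

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := range jobs {
				results <- analyze(j.id, j.text)
			}
		}()
	}
	// Feed the pool, then close results once every worker has exited.
	go func() {
		for id, text := range docs {
			jobs <- job{id, text}
		}
		close(jobs)
		wg.Wait()
		close(results)
	}()

	// Collect all analysis output before touching the lock.
	var analyzed []analysisResult
	for r := range results {
		analyzed = append(analyzed, r)
	}

	idx.mu.Lock()
	defer idx.mu.Unlock()
	for _, r := range analyzed {
		for _, t := range r.tokens {
			idx.postings[t] = append(idx.postings[t], r.docID)
		}
	}
}

func main() {
	idx := &index{postings: map[string][]string{}}
	idx.batch(map[string]string{"a": "Hello World", "b": "hello bleve"}, 2)
	fmt.Println(len(idx.postings["hello"]), len(idx.postings["world"])) // 2 1
}
```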
Marty Schoch
1dc466a800 modified token filters to avoid creating new token stream
often the result stream was the same length, so can reuse the
existing token stream
also, in cases where a new stream was required, set capacity to
the length of the input stream.  most output streams are at least
as long as the input, so this may avoid some subsequent resizing
2014-09-23 18:41:32 -04:00
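A rough sketch of the two cases described in the commit above. The `Token` and `TokenStream` types here are simplified stand-ins, and `prefixFilter` is a made-up filter used only to show a case where the output can be longer than the input; neither is meant to reproduce bleve's real filters.

```go
package main

import (
	"bytes"
	"fmt"
)

// Token and TokenStream are simplified, hypothetical stand-ins.
type Token struct{ Term []byte }
type TokenStream []*Token

// lowerCaseFilter produces exactly one output token per input token,
// so it rewrites tokens in place and returns the input stream itself.
func lowerCaseFilter(in TokenStream) TokenStream {
	for _, t := range in {
		t.Term = bytes.ToLower(t.Term)
	}
	return in
}

// prefixFilter (hypothetical) emits each token plus an n-byte prefix for
// longer terms. It needs a new stream, so it preallocates capacity equal
// to the input length; the output is at least that long.
func prefixFilter(in TokenStream, n int) TokenStream {
	out := make(TokenStream, 0, len(in))
	for _, t := range in {
		out = append(out, t)
		if len(t.Term) > n {
			out = append(out, &Token{Term: t.Term[:n]})
		}
	}
	return out
}

func main() {
	in := TokenStream{{Term: []byte("Hello")}, {Term: []byte("Go")}}
	lowered := lowerCaseFilter(in)
	fmt.Println(&lowered[0] == &in[0]) // true: same backing array, no new stream
	for _, t := range prefixFilter(lowered, 3) {
		fmt.Println(string(t.Term))
	}
}
```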
Marty Schoch
95e6e37e67 added build tag to fix running tests without tag 2014-09-16 11:28:44 -04:00
Marty Schoch
608b9163a3 Merge branch 'master' of github.com:blevesearch/bleve 2014-09-16 11:22:01 -04:00
Marty Schoch
55c0e84665 relocated kagome tokenizer and introduced ja analyzer 2014-09-16 11:21:29 -04:00
Silvan Jegen
29bdc094a9 Use byte positions instead of character positions 2014-09-14 13:19:30 +02:00
Marty Schoch
3dc66b5338 Merge pull request #99 from jingweno/patch-1
Update README.md
2014-09-13 22:45:21 -04:00
Jingwen Owen Ou
79691770c4 Update README.md
Fix broken example.
2014-09-13 19:38:24 -07:00
Silvan Jegen
a8ec7f7af2 Add tests for the Kagome tokenizer 2014-09-13 17:45:22 +02:00
Silvan Jegen
ebf100c097 Add the Kagome tokenizer for Japanese 2014-09-13 17:45:19 +02:00
Marty Schoch
198ca1ad4d major refactor of kvstore/index internals, see below
In the index/store package
introduce KVReader
  creates snapshot
  all read operations consistent from this snapshot
  must close to release

introduce KVWriter
  only one writer active
  access to all operations
  allows for consistent read-modify-write
  must close to release

introduce AssociativeMerge operation on batch
  allows efficient read-modify-write
  for associative operations
  used to consolidate updates to the term summary rows
  saves 1 set and 1 get op per shared instance of term in field

In the index package
introduced an IndexReader
  exposes a consistent snapshot of the index for searching

At top level
  All searches now operate on a consistent snapshot of the index
2014-09-12 17:21:35 -04:00
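The reader/writer/merge split described in the commit above might look roughly like this toy in-memory version. Every name here (`store`, `reader`, `writer`, `batch`, `Merge`, `Execute`) is an assumption made for illustration and does not reproduce the real interfaces in the index/store package. Snapshots are made by copying a map, which a real store would not do. The merge operation is plain addition, chosen because it is associative.

```go
package main

import (
	"fmt"
	"sync"
)

// store is a toy in-memory key/value store (hypothetical).
type store struct {
	mu      sync.RWMutex // guards data
	writeMu sync.Mutex   // admits only one active writer
	data    map[string]uint64
}

// reader is a snapshot: every read sees the state at creation time.
type reader struct{ snap map[string]uint64 }

func (s *store) Reader() *reader {
	s.mu.RLock()
	defer s.mu.RUnlock()
	snap := make(map[string]uint64, len(s.data))
	for k, v := range s.data {
		snap[k] = v
	}
	return &reader{snap: snap}
}

func (r *reader) Get(k string) uint64 { return r.snap[k] }
func (r *reader) Close()              { r.snap = nil } // release the snapshot

// batch consolidates associative updates per key, so repeated merges
// of the same key cost a single read-modify-write at execution time.
type batch struct{ deltas map[string]uint64 }

func newBatch() *batch                        { return &batch{deltas: map[string]uint64{}} }
func (b *batch) Merge(k string, delta uint64) { b.deltas[k] += delta }

// writer holds the single write slot until Close is called.
type writer struct{ s *store }

func (s *store) Writer() *writer {
	s.writeMu.Lock()
	return &writer{s: s}
}

func (w *writer) Execute(b *batch) {
	w.s.mu.Lock()
	defer w.s.mu.Unlock()
	for k, d := range b.deltas {
		w.s.data[k] += d
	}
}

func (w *writer) Close() { w.s.writeMu.Unlock() }

func main() {
	s := &store{data: map[string]uint64{}}

	w := s.Writer()
	b := newBatch()
	b.Merge("term:bleve", 1)
	b.Merge("term:bleve", 1) // consolidated with the merge above
	w.Execute(b)
	w.Close()

	r := s.Reader() // snapshot taken here

	w = s.Writer()
	b = newBatch()
	b.Merge("term:bleve", 5)
	w.Execute(b)
	w.Close()

	r2 := s.Reader()
	fmt.Println(r.Get("term:bleve"), r2.Get("term:bleve")) // 2 7
	r.Close()
	r2.Close()
}
```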
Marty Schoch
7819deb447 added boltdb benchmark, same as others 2014-09-12 16:55:50 -04:00
Marty Schoch
2294b24b9d remove forestdb for now
no benefit in maintaining this for the time being
2014-09-12 16:55:11 -04:00
Marty Schoch
8c16d68c00 include cjk analyzer in default config 2014-09-11 10:44:14 -04:00
Marty Schoch
1a1cf32a86 introducing cjk_bigram filter and cjk analyzer
closes #34
2014-09-11 10:39:05 -04:00
Marty Schoch
cb5ccd2b1d fix whitespace tokenizer
previously would fail to split ascii running into ideographic
2014-09-11 10:38:02 -04:00
Marty Schoch
8debf26cb7 changed many components to not have defaults
many of these defaults were arbitrary, and not having
defaults lets us more easily flag them for configuration
added a shingle filter
introduce new token type for shingles
2014-09-09 18:15:14 -04:00
Marty Schoch
8dd8fb8910 fix compilation 2014-09-07 14:13:32 -04:00
Marty Schoch
6b4c86b35a changed whitespace tokenizer to work better on cjk input
now it will return each cjk character as a separate token
this will pair well with a cjk bigram filter for indexing
2014-09-07 14:11:01 -04:00
Marty Schoch
933d99c576 rename the configurable token map from standard to custom
this makes it consistent with the "custom" analyzer
which operates similarly
also, added it to the config.go so it's registered and
available for use
2014-09-07 14:09:38 -04:00
Marty Schoch
22911888c4 refactor registry package and bleve_registry utility 2014-09-07 14:07:42 -04:00
Marty Schoch
9e78643bad icu tokenizer uses brk status to set token type
part of #34
2014-09-07 10:24:02 -04:00
Marty Schoch
44df73d317 apply doc fix patch from rakoo
closes #95
2014-09-07 09:09:47 -04:00
Marty Schoch
f87a22e24c added json struct tag to http doc count response 2014-09-05 12:16:26 -04:00
Marty Schoch
b1dd4215fc added features to readme 2014-09-04 15:09:19 -04:00
Marty Schoch
f384f9dead added link to wiki search to readme 2014-09-04 14:43:25 -04:00
Marty Schoch
d90697f725 added features to readme 2014-09-04 14:31:26 -04:00
Marty Schoch
afdb5f057f added convenience method to add field to highlight request 2014-09-04 10:13:13 -04:00
Marty Schoch
9d2187706e another round of golint 2014-09-03 19:53:59 -04:00
Marty Schoch
8b9255f52f even more golint cleanups 2014-09-03 19:32:27 -04:00
Marty Schoch
e21935f850 another round of golint cleanup 2014-09-03 19:16:46 -04:00
Marty Schoch
e1b77956d4 more golint cleanups 2014-09-03 18:47:02 -04:00
Marty Schoch
377ae090d0 additional golint issues resolved 2014-09-03 18:17:26 -04:00
Marty Schoch
d534b0836b converted ALL_CAPS constants to CamelCase 2014-09-03 17:48:40 -04:00
Marty Schoch
53b25195d6 further refactoring of index mappings 2014-09-03 16:40:10 -04:00
Marty Schoch
7fbd44224d get correct field first, then use it for looking up related 2014-09-03 16:09:51 -04:00
Marty Schoch
8e6c8e5644 continued refactoring of the mapping code
also renamed some constants that didn't follow go conventions
2014-09-03 13:02:10 -04:00
Marty Schoch
a151bda2ad moved some logic from mapping_index to mapping_document
part of #92
2014-09-03 10:51:21 -04:00
Marty Schoch
28980c4da1 fix issues identified by go lint 2014-09-02 17:40:46 -04:00
Marty Schoch
d75d836c09 change another variable capitalization 2014-09-02 14:22:21 -04:00
Marty Schoch
bbc6fadf69 changed error constants to camel case
all caps constants are not idiomatic go
2014-09-02 14:14:05 -04:00
Marty Schoch
f6a3831687 remove some unused vars 2014-09-02 13:58:27 -04:00
Marty Schoch
45e1b2dfc6 removing gouchstore store impl
this implementation didn't really adhere to the contract
and now that we have boltdb we have a better pure go impl
2014-09-02 13:56:35 -04:00
Marty Schoch
7a7eb2e94c add newline between license and package
this avoids cluttering godocs with the license
2014-09-02 10:54:50 -04:00
Marty Schoch
a1f0c02cab remove flag that is no longer used 2014-09-02 10:27:38 -04:00
Marty Schoch
3bc165d77b renamed/moved examples/bleve_index_json to utils/bleve_index 2014-09-01 16:14:29 -04:00
Marty Schoch
7c0ea53ea2 added utility bleve_create 2014-09-01 14:54:47 -04:00
Marty Schoch
5d435bd022 moved bleve_query from examples to utils 2014-09-01 14:45:02 -04:00
Marty Schoch
ac6176f14c removed old example from gitignore 2014-09-01 14:37:22 -04:00