606fd6344b
Previously, term entries were encoded pairwise (field/term), so the data looked like:

    F1/T1 F1/T2 F1/T3 F2/T4 F3/T5

Even though field 1 has 3 terms, the F1 part is repeated for each of them in the encoded data, which is wasteful. In the new format, each field is encoded once with a list of its terms:

    F1/T1,T2,T3 F2/T4 F3/T5

When fields have multiple terms, this saves space. In unit tests there is no additional waste even when a field has only a single value.

Here are the results of an indexing test case (beer-search):

    $ benchcmp indexing-before.txt indexing-after.txt
    benchmark               old ns/op       new ns/op       delta
    BenchmarkIndexing-4     11275835988     10745514321     -4.70%

    benchmark               old allocs      new allocs      delta
    BenchmarkIndexing-4     25230685        22480494        -10.90%

    benchmark               old bytes       new bytes       delta
    BenchmarkIndexing-4     4802816224      4741641856      -1.27%

And here are the results of a MatchAll search building a facet on the "abv" field:

    $ benchcmp facet-before.txt facet-after.txt
    benchmark             old ns/op     new ns/op     delta
    BenchmarkFacets-4     439762100     228064575     -48.14%

    benchmark             old allocs    new allocs    delta
    BenchmarkFacets-4     9460208       3723286       -60.64%

    benchmark             old bytes     new bytes     delta
    BenchmarkFacets-4     260784261     151746483     -41.81%

Although we expect the index to be smaller in many cases, the beer-search index is about the same size. This may be due to the underlying storage (boltdb).

Finally, the index version was bumped from 5 to 7, skipping 6, because smolder also used version 6 and reusing it could cause confusion.
||
---|---|---|
analysis_test.go | ||
analysis.go | ||
benchmark_all.sh | ||
benchmark_boltdb_test.go | ||
benchmark_common_test.go | ||
benchmark_cznicb_test.go | ||
benchmark_forestdb_test.go | ||
benchmark_goleveldb_test.go | ||
benchmark_gorocksdb_test.go | ||
benchmark_gtreap_test.go | ||
benchmark_leveldb_test.go | ||
benchmark_null_test.go | ||
dump_test.go | ||
dump.go | ||
field_dict_test.go | ||
field_dict.go | ||
index_reader.go | ||
reader_test.go | ||
reader.go | ||
row_merge_test.go | ||
row_merge.go | ||
row_test.go | ||
row.go | ||
stats.go | ||
upsidedown_test.go | ||
upsidedown.go | ||
upsidedown.pb.go | ||
upsidedown.proto |
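
The change from pairwise to per-field term encoding described in the commit message can be illustrated with a small Go sketch. The types `termEntry` and `termsEntry` and the function `groupByField` are hypothetical names invented for this example; they are not taken from the files listed above, and the real encoding may differ in its details.

```go
package main

import "fmt"

// termEntry is a hypothetical stand-in for the old pairwise layout:
// one (field, term) pair per entry, so the field is repeated per term.
type termEntry struct {
	Field uint16
	Term  string
}

// termsEntry is a hypothetical stand-in for the new layout:
// one entry per field, carrying all of that field's terms.
type termsEntry struct {
	Field uint16
	Terms []string
}

// groupByField converts pairwise entries into per-field entries,
// preserving the order in which fields and terms first appear.
func groupByField(pairs []termEntry) []termsEntry {
	var grouped []termsEntry
	pos := make(map[uint16]int) // field -> index into grouped
	for _, p := range pairs {
		i, ok := pos[p.Field]
		if !ok {
			i = len(grouped)
			pos[p.Field] = i
			grouped = append(grouped, termsEntry{Field: p.Field})
		}
		grouped[i].Terms = append(grouped[i].Terms, p.Term)
	}
	return grouped
}

func main() {
	// The example from the commit message: F1/T1 F1/T2 F1/T3 F2/T4 F3/T5.
	pairs := []termEntry{{1, "T1"}, {1, "T2"}, {1, "T3"}, {2, "T4"}, {3, "T5"}}
	grouped := groupByField(pairs)

	// Each entry writes its field identifier once, so the entry count
	// equals the number of field identifiers in the encoded data.
	fmt.Println("field IDs written, pairwise:", len(pairs))
	fmt.Println("field IDs written, grouped: ", len(grouped))
	for _, g := range grouped {
		fmt.Printf("F%d/%v\n", g.Field, g.Terms)
	}
}
```

For the five terms in the commit message's example, the pairwise layout writes a field identifier five times and the grouped layout writes it three times. A field with a single term still produces exactly one entry in either layout.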