bleve

gibheer

Author	SHA1	Message	Date
abhinavdangeti	18cfcd11d1	MB-28782: Error handling in merger/persister when index is closed When the index is closed, do not fire an AsyncError (fatal) from either the merger or the persister that is actively working. This is quite a probable situation, so exit the loop within the goroutine.	2018-03-22 14:29:59 -07:00
abhinavdangeti	60bdf6d247	Return an error when the snapshotEpoch is invalid Avoiding this stacktrace (SIGSEGV) while using bleve scorch cmd-line utility when snapshotEpoch provided is invalid: github.com/blevesearch/bleve/index/scorch.(IndexSnapshot).Segments(...) /Users/abhinavdangeti/Documents/couchbaseV/godeps/src/github.com/blevesearch/bleve/index/scorch/snapshot_index.go:56 github.com/blevesearch/bleve/cmd/bleve/cmd/scorch.glob..func1(0x1f347e0, 0xc4201f1400, 0x2, 0x2, 0x0, 0x0) /Users/abhinavdangeti/Documents/couchbaseV/godeps/src/github.com/blevesearch/bleve/cmd/bleve/cmd/scorch/ascii.go:43 +0xe4 github.com/blevesearch/bleve/cmd/bleve/vendor/github.com/spf13/cobra.(Command).execute(0x1f347e0, 0xc4201f12e0, 0x2, 0x2, 0x1f347e0, 0xc4201f12e0) /Users/abhinavdangeti/Documents/couchbaseV/godeps/src/github.com/blevesearch/bleve/cmd/bleve/vendor/github.com/spf13/cobra/command.go:646 +0x3e8 github.com/blevesearch/bleve/cmd/bleve/vendor/github.com/spf13/cobra.(Command).ExecuteC(0x1f334c0, 0x0, 0x0, 0x0) /Users/abhinavdangeti/Documents/couchbaseV/godeps/src/github.com/blevesearch/bleve/cmd/bleve/vendor/github.com/spf13/cobra/command.go:737 +0x2fe github.com/blevesearch/bleve/cmd/bleve/vendor/github.com/spf13/cobra.(Command).Execute(0x1f334c0, 0x0, 0x0) /Users/abhinavdangeti/Documents/couchbaseV/godeps/src/github.com/blevesearch/bleve/cmd/bleve/vendor/github.com/spf13/cobra/command.go:695 +0x2b github.com/blevesearch/bleve/cmd/bleve/cmd.Execute() /Users/abhinavdangeti/Documents/couchbaseV/godeps/src/github.com/blevesearch/bleve/cmd/bleve/cmd/root.go:74 +0x31 main.main() /Users/abhinavdangeti/Documents/couchbaseV/goproj/src/github.com/couchbase/cbft/cmd/cbft-bleve/main.go:39 +0x1cb	2018-03-16 11:43:25 -07:00
Sreekanth Sivasankaran	b04909d3ee	adding the integer parser utility	2018-03-09 11:05:17 +05:30
Sreekanth Sivasankaran	fa5de8e09a	making NumSnapshotsToKeep configurable	2018-03-06 16:22:11 +05:30
Steve Yen	a5253bfe2b	scorch persister goes through introducer to affect root This change allows the introducer to become the only goroutine to modify the root, which in turn allows the introducer to greatly reduce its root lock holding surface area.	2018-03-02 16:14:28 -08:00
Steve Yen	1b661ef844	stats cleanup, renaming, gauges replaced with counters	2018-02-28 17:03:28 -08:00
Sreekanth Sivasankaran	4b742505aa	adding stats for scorch	2018-02-28 15:31:55 +05:30
Steve Yen	99ed127176	scorch zap merge optimize newDocNums lookup to outside of loop And, also a "go fmt".	2018-02-26 14:23:55 -08:00
Sreekanth Sivasankaran	f0a65f041d	cleaning up the wait loop	2018-02-25 20:58:53 +05:30
Sreekanth Sivasankaran	3a571ad283	Merge branch 'master' into persister_pause	2018-02-24 23:57:20 +05:30
Sreekanth Sivasankaran	874829759b	cleaning up the wait loop	2018-02-24 23:53:49 +05:30
Steve Yen	c50d9b4023	scorch conditional merging during persistSnapshot() As part of this change, there are new helper methods -- persistSnapshotMaybeMerge() and persistSnapshotDirect().	2018-02-23 09:17:02 -08:00
Sreekanth Sivasankaran	a8ebf2a553	lowering epochDistance to 5, fixing the lastMergedEpoch value updates	2018-02-21 17:25:14 +05:30
Sreekanth Sivasankaran	35611f4287	Merge branch 'master' into persister_pause	2018-02-14 16:53:06 +05:30
Sreekanth Sivasankaran	6f2797bec3	Adding a pause to persister until the merger catches up	2018-02-14 16:39:26 +05:30
Steve Yen	6f5f90cd41	scorch zap segment cleanup handling for some edge cases Two cases in this commit... If we're shutting down, the merger might not have handed off its latest merged segment to the introducer yet, so the merger still owns the segment and needs to Close() that segment itself. In persistSnapshot(), there might be cases where the persister might not be able to swap in its newly persisted segments -- so, the persistSnapshot() needs to Close() those segments itself.	2018-02-08 14:04:04 -08:00
Steve Yen	83272a9629	scorch persistSnapshot() err handling & propagation	2018-02-08 14:03:59 -08:00
Steve Yen	dee6a2b1c6	scorch persistSnapshot() consistently uses err to commit vs abort Some codepaths in persistSnapshot() were saving errors into an err2 local variable, which might lead incorrectly to commit during an error situation rather than abort.	2018-02-08 14:02:35 -08:00
Steve Yen	91ac0d011a	scorch uses segment.id to encode boltdb sub-bucket key fixes #764	2018-02-08 13:25:16 -08:00
Steve Yen	d0644fec12	scorch persistSnapshot comments update See also: https://github.com/blevesearch/bleve/issues/763	2018-02-08 12:22:58 -08:00
Marty Schoch	1af90936c4	Merge pull request #751 from sreekanth-cb/merger_persister_handshake_fix fix for merger persister handshake stalemate	2018-02-08 11:03:01 -05:00
Sreekanth Sivasankaran	feecce1eb2	fix for merger persister handshake stalemate The slow merger was lagging behind the fast persister in a persister notify send-loop while the persister awaits any new introductions from the introducer, totally blocking the merger. This fix, along with the deleted files eligibility flipping, brings the file count to around 6 to 11 files per shard for both the travel and beer samples.	2018-02-08 11:00:21 +05:30
Sreekanth Sivasankaran	9636209ae5	Update persister.go comment updated	2018-02-05 20:49:30 +05:30
Sreekanth Sivasankaran	678c412157	unblock the files for clean up, esp for merged new segment files	2018-02-02 14:44:02 +05:30
Steve Yen	5a035dc9aa	scorch zap in-memory segment representation (SegmentBase) The zap SegmentBase struct is a refactoring of the zap Segment into the subset of fields that are needed for read-only ops, without any persistence related info. This allows us to use zap's optimized data encoding as scorch's in-memory segments. The zap Segment struct now embeds a zap SegmentBase struct, and layers on persistence. Both the zap Segment and zap SegmentBase implement scorch's Segment interface.	2018-01-27 11:35:10 -08:00
Marty Schoch	e756c7acf0	add initial support for async error callback	2018-01-05 16:43:16 -05:00
Marty Schoch	57a075afdb	improving command-line tool for scorch	2018-01-05 11:50:07 -05:00
Marty Schoch	c691cd2bb5	refactor scorch/zap command-line tools under bleve zap command-line tool added to main bleve command-line tool; this required physical relocation due to the vendoring used only on the bleve command-line tool (unforeseen limitation). A new scorch command-line tool has also been introduced, and for the same reasons it is physically stored under the top-level bleve command-line tool as well.	2018-01-05 10:17:18 -05:00
Marty Schoch	1a59a1bb99	attempt to fix core reference counting issues Observed problem: Persisted index state (in root bolt) would contain index snapshots which pointed to index files that did not exist. Debugging this uncovered two main problems: 1. At the end of persisting a snapshot, the persister creates a new index snapshot with the SAME epoch as the current root, only it replaces in-memory segments with the new disk based ones. This is problematic because reference counting an index segment triggers "eligible for deletion". And eligible for deletion is keyed by epoch. So having two separate instances going by the same epoch is problematic. Specifically, one of them gets to 0 before the other, and we wrongly conclude it's eligible for deletion, when in fact the "other" instance with same epoch is actually still in use. To address this problem, we have modified the behavior of the persister. Now, upon completion of persistence, ONLY if new files were actually created do we proceed to introduce a new snapshot. AND, this new snapshot now gets its own brand new epoch. BOTH of these are important because since the persister now also introduces a new epoch, it will see this epoch again in the future AND be expected to persist it. That is OK (mostly harmless), but we cannot allow it to form a loop. Checking that new files were actually introduced is what short-circuits the potential loop. The new epoch introduced by the persister, if seen again will not have any new segments that actually need persisting to disk, and the cycle is stopped. 2. The implementation of NumSnapshotsToKeep, and related code to delete old snapshots from the root bolt also contains problems. Specifically, the determination of which snapshots to keep vs delete did not consider which ones were actually persisted. So, let's say you had set NumSnapshotsToKeep to 3, if the introducer gets 3 snapshots ahead of the persister, what can happen is that the three snapshots we choose to keep are all in memory. We now wrongly delete all of the snapshots from the root bolt. But it gets worse, in this instant of time, we now have files on disk that nothing in the root bolt points to, so we also go ahead and delete those files. Those files were still being referenced by the in-memory snapshots. But, now even if they get persisted to disk, they simply have references to non-existent files. Opening up one of these indexes results in lost data (often everything). To address this problem, we made a large change to the way this section of code operates. First, we now start with a list of all epochs actually persisted in the root bolt. Second, we set aside NumSnapshotsToKeep of these snapshots to keep. Third, anything else in the eligibleForRemoval list will be deleted. I suspect this code is slower and less elegant, but I think it is more correct. Also, previously NumSnapshotsToKeep defaulted to 0, I have now defaulted it to 1, which feels like saner out-of-the-box behavior (though it's debatable if the original intent was perhaps instead for "extra" snapshots to keep, but with the variable named as it is, 1 makes more sense to me) Other minor changes included in this change: - Location of 'nextSnapshotEpoch', 'eligibleForRemoval', and 'ineligibleForRemoval' members of Scorch struct were moved into the paragraph with 'rootLock' to clarify that you must hold the lock to access it. - TestBatchRaceBug260 was updated to properly Close() the index, which leads to occasional test failures.	2018-01-03 12:05:00 -05:00
abhinavdangeti	055d3e12df	Adding onEvent callback support for scorch Event types: - EventKindCloseStart - EventKindClose - EventKindMergerProgress - EventKindPersisterProgress - EventKindBatchIntroductionStart - EventKindBatchIntroduction	2017-12-29 09:47:25 -07:00
abhinavdangeti	becd4677cd	Adding num_items_introduced, num_items_persisted stats + Adding new entries to the stats struct of scorch. + These stats are atomically incremented upon every segment introduction, and upon successful persistence.	2017-12-28 14:07:44 -07:00
Steve Yen	04ac9d5b1f	scorch removeOldBoltSnapshots() deletes from correct bucket	2017-12-20 14:46:48 -08:00
Steve Yen	34f5e2175f	scorch fix persister for lost notifications on no-data batches With the previous commit, there can be a scenario where batches that had internal-updates-only can be rapidly introduced by the app, but the persisted notifications on only the very last IndexSnapshot would be fired. The persisted notifications on the in-between batches might be missed. The solution was to track the persisted notification channels at a higher Scorch struct level, instead of tracking the persisted channels at the IndexSnapshot and SegmentSnapshot levels. Also, the persister double-check looping was simplified, which avoids a race where an introducer might incorrectly not notify the persister.	2017-12-17 12:30:05 -08:00
Steve Yen	ecbb3d2df4	scorch handles non-updating batches better This commit improves handling when an incoming batch has internal-data updates only and no doc updates. In this case, a nil segment instead of an empty segment instance is used in the segmentIntroduction. The segmentIntroduction, that is, might now hold only internal-data updates only. To handle synchronous persistence, a new field that's a slice of persisted notification channels is added to the IndexSnapshot struct, which the persister goroutine will close as each IndexSnapshot is persisted. Also, as part of this change, instead of checking the unsafeBatch flag in several places, we instead check for non-nil'ness of these persisted chan's.	2017-12-17 08:51:23 -08:00
Marty Schoch	a575be4d56	fix issue where we incorrectly seed the nextSegmentID on Open()	2017-12-15 19:26:23 -05:00
Steve Yen	506aa1c325	scorch fix data race w/ AddEligibleForRemoval Found from "go test -race ./..." WARNING: DATA RACE Read at 0x00c420088060 by goroutine 48: github.com/blevesearch/bleve/index/scorch.(Scorch).AddEligibleForRemoval() /Users/steveyen/go/src/github.com/blevesearch/bleve/index/scorch/scorch.go:348 +0x6d Previous write at 0x00c420088060 by goroutine 31: github.com/blevesearch/bleve/index/scorch.(Scorch).loadFromBolt.func1() /Users/steveyen/go/src/github.com/blevesearch/bleve/index/scorch/persister.go:332 +0x87b github.com/boltdb/bolt.(DB).View() /Users/steveyen/go/src/github.com/boltdb/bolt/db.go:629 +0xc1 github.com/blevesearch/bleve/index/scorch.(Scorch).loadFromBolt() /Users/steveyen/go/src/github.com/blevesearch/bleve/index/scorch/persister.go:290 +0xa1 github.com/blevesearch/bleve/index/scorch.(Scorch).Open() /Users/steveyen/go/src/github.com/blevesearch/bleve/index/scorch/scorch.go:121 +0x77f github.com/blevesearch/bleve/index/scorch.TestIndexOpenReopen() /Users/steveyen/go/src/github.com/blevesearch/bleve/index/scorch/scorch_test.go:115 +0x1351 testing.tRunner() /usr/local/Cellar/go/1.9/libexec/src/testing/testing.go:746 +0x16c Goroutine 48 (running) created at: github.com/blevesearch/bleve/index/scorch.(IndexSnapshot).DecRef() /Users/steveyen/go/src/github.com/blevesearch/bleve/index/scorch/snapshot_index.go:72 +0x23e github.com/blevesearch/bleve/index/scorch.(Scorch).loadFromBolt.func1() /Users/steveyen/go/src/github.com/blevesearch/bleve/index/scorch/persister.go:330 +0x8f4 github.com/boltdb/bolt.(DB).View() /Users/steveyen/go/src/github.com/boltdb/bolt/db.go:629 +0xc1 github.com/blevesearch/bleve/index/scorch.(Scorch).loadFromBolt() /Users/steveyen/go/src/github.com/blevesearch/bleve/index/scorch/persister.go:290 +0xa1 github.com/blevesearch/bleve/index/scorch.(Scorch).Open() /Users/steveyen/go/src/github.com/blevesearch/bleve/index/scorch/scorch.go:121 +0x77f github.com/blevesearch/bleve/index/scorch.TestIndexOpenReopen() /Users/steveyen/go/src/github.com/blevesearch/bleve/index/scorch/scorch_test.go:115 +0x1351 testing.tRunner() /usr/local/Cellar/go/1.9/libexec/src/testing/testing.go:746 +0x16c	2017-12-14 14:40:33 -08:00
Steve Yen	2be5eb4427	scorch tracks zap files that can't be removed yet A race & solution found by Marty Schoch... consider a case when the merger might grab a nextSegmentID, like 4, but takes awhile to complete. Meanwhile, the persister grabs the nextSegmentID of 5, but finishes its persistence work fast, and then loops to cleanup any old files. The simple approach of checking a "highest segment ID" of 5 is wrong now, because the deleter now thinks that segment 4's zap file is (incorrectly) ok to delete. The solution in this commit is to track an ephemeral map of filenames which are ineligibleForRemoval, because they're still being written (by the merger) and haven't been fully incorporated into the rootBolt yet. The merger adds to that ineligibleForRemoval map as it starts a merged zap file, the persister cleans up entries from that map when it persists zap filenames into the rootBolt, and the deleter (part of the persister's loop) consults the map before performing any actual zap file deletions.	2017-12-14 10:49:33 -08:00
Marty Schoch	bd742caf65	don't try to close a nil segment if err opening	2017-12-14 10:29:19 -05:00
Marty Schoch	149a26b5c1	merge deletion and cacheddocs fixes discussed in meeting	2017-12-14 10:27:39 -05:00
Steve Yen	b7dff6669f	scorch cleanup of *.zap files not listed in the rootBolt	2017-12-13 17:09:50 -08:00
Steve Yen	c0cc46a2be	scorch cleanup of the rootBolt of old snapshots A new global variable, NumSnapshotsToKeep, represents the default number of old snapshots that each scorch instance should maintain -- 0 is the default. Apps that need rollback'ability may want to increase this value in early initialization. The Scorch.eligibleForRemoval field tracks epochs which are safe to delete from the rootBolt. The eligibleForRemoval is appended to whenever the ref-count on an IndexSnapshot drops to 0. On startup, eligibleForRemoval is also initialized with any older epochs found in the rootBolt. The newly introduced Scorch.removeOldSnapshots() method is called on every cycle of the persisterLoop(), where it maintains the eligibleForRemoval slice to under a size defined by the NumSnapshotsToKeep. A future commit will remove actual storage files in order to match the "source of truth" information found in the rootBolt.	2018-01-03 15:53:31 -08:00
Steve Yen	c13ff85aaf	scorch ref-counting Future commits will provide actual cleanup when ref-counts reach 0.	2017-12-13 14:48:07 -08:00
Marty Schoch	cd45487cb3	fsync rootBolt when persisting snapshot	2017-12-13 13:55:06 -05:00
Marty Schoch	f83c9f2a20	initial cut of merger that actually introduces changes	2017-12-13 13:41:03 -05:00
Marty Schoch	414899618b	switch from bolt format to zap in the persister	2017-12-09 14:28:50 -05:00
Marty Schoch	e470105635	fix issues identified by errcheck	2017-12-06 18:36:14 -05:00
Marty Schoch	adac4f41db	initial version of scorch which persists index to disk	2017-12-06 18:33:47 -05:00

47 Commits
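
The message of commit 1a59a1bb99 describes a three-step retention rule: start from the epochs actually persisted in the root bolt, set aside NumSnapshotsToKeep of them, and delete anything else in the eligibleForRemoval list. The Go sketch below illustrates that rule only and is not bleve's code. The function name `epochsToRemove` and its parameters are hypothetical, and it assumes that "set aside" means protecting the newest persisted epochs, which the commit message does not state.

```go
package main

import (
	"fmt"
	"sort"
)

// epochsToRemove is a hypothetical helper, not part of bleve.
//   persisted: epochs currently recorded in the root bolt
//   eligible:  epochs whose ref-count has dropped to zero
//   numToKeep: plays the role of NumSnapshotsToKeep
// It returns the eligible epochs that are not protected.
func epochsToRemove(persisted, eligible []uint64, numToKeep int) []uint64 {
	// Step 1: work from a copy of the persisted epochs, newest first
	// (assumption: the newest persisted snapshots are the ones to keep).
	sorted := append([]uint64(nil), persisted...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] > sorted[j] })

	// Step 2: set aside up to numToKeep persisted epochs.
	if numToKeep < 0 {
		numToKeep = 0
	}
	if numToKeep > len(sorted) {
		numToKeep = len(sorted)
	}
	protected := make(map[uint64]struct{}, numToKeep)
	for _, e := range sorted[:numToKeep] {
		protected[e] = struct{}{}
	}

	// Step 3: everything else in the eligible list may be removed.
	var remove []uint64
	for _, e := range eligible {
		if _, keep := protected[e]; !keep {
			remove = append(remove, e)
		}
	}
	return remove
}

func main() {
	// Epochs 4 and 3 are the two newest persisted ones, so only 1 and 2 go.
	fmt.Println(epochsToRemove([]uint64{1, 2, 3, 4}, []uint64{1, 2, 3}, 2)) // prints [1 2]
}
```

Because the protected set is built only from persisted epochs, snapshots that exist only in memory never count toward the ones kept, which is the failure the commit message describes.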