WL#3523: Repeatable-Read Isolation in Falcon is implemented as Consistent ReadAffects: Falcon-6.0 — Status: Code-Review — Priority: MediumNote: Much of the following was written and/or described by Ann Harrison in a series of emails. The rest is by Kevin Lewis. Falcon implementation of Repeatable Read Isolation The Repeatable Read transaction isolation that is selected through the SQL syntax is implemented in Falcon as Consistent Read. The SQL standard defines "Repeatable Read" in a way that is not repeatable. The standard's definitions of transaction modes are based on the artifacts of locking systems. Repeatable read is exactly read/write record locking without range locking, so new records can appear in repeated queries. "Consistent read" is what you get with Mult-Version Concurrency Control (MVCC) which is the way Falcon handles concurrency. Locks are not needed to isolate transactions from one another. The isolation can be done in either a Consistent Read or Read Committed fashion with minimal extra code. Using the Falcon engine, when you select 'Repeatable Read', the only changes you see are those you make. Each record that undergoes changes in an MVCC database engine has a chain of prior records versions from newest to oldest. The transactions that wrap these changes are identifiable by their age. The chain never branches because only one transaction can change a record at a time. There are three tasks in the engine that need to use this prior record Chain; Reading the correct record version. Deleting (or Scavenging) old record versions. Checking to see if a duplicate key value exists. We will discuss those three tasks referencing the following hypothetical record chain. Each record version is identified by the transaction ID that made it; T34->T19->T18->T17->T10->T5->T3 Reading the correct record version. Assume that T34 is still active/pending and the rest were previously committed. T34 must have used the record version committed by T19 to make it's changes to this record. If transaction T20 is a Consistent Read transaction and needs to read this record, it will need to use a record version that was visible when it started. Assuming T19 was committed before T20 started it will use T19. But if T19 was still active when T20 started, T20 will have to use the record version committed by T18. If T19 was a Consistent Read Transaction, T18 must have committed before T19 started. If T19 is Read Committed, then it is possible for both T19 and T18 to be running when T20 started, which means T20 will need to read an even earlier version. In the same way, T15 will need to use T10, T5, or T3. T8 will need to use T5 or T3. Deleting (or Scavenging) old record versions. Periodically a background thread in Falcon will come along and clean up the record cache by deleting old record versions that are no longer needed. The trick is to determine which records may still be needed. If the oldest active transaction is T15, then you need at least record T10 in this chain. But it is more complicated than that. Suppose that T10 was active when T15 started. In that case, T15 would not see the change made by T10 to this record. The scavenger thread needs to know not to delete record versions made by T10 even if T15 is currently the oldest active transaction. So it is important for the scavenger thread to know what the oldest active transaction was when the currently oldest active transaction started. Falcon does this in the constructor of each Consistent Read transaction. It makes a list of each of the active transactions at the time each new Consistent Read transaction starts. In this way, it knows which committed changes are visible and which are not. Also, using this list, the Scavenger can determine the oldest transaction that was active when the oldest active transaction started, thereby keeping all record versions that may be needed by that oldest active transaction, and any other active transaction, and deleting all the rest of the old record versions. Checking to see if a duplicate key value exists. Transactions cannot insert or update a record to create two occurrences of the same key value. If this is attempted. an update conflict error will occur on the insert or update. There are three types of record versions which can cause a duplicate conflict. The first is a pending record with a duplicate value in a unique index. When this conflict occurs, Falcon will cause the transaction to wait on the completion of the conflicting transaction. This is appropriate since you don't know if the other transaction will commit or rollback. It is not appropriate to block on a unique key value that gets rolled back, so that it appears to the user that the conflict was a mistake. The second kind of conflict is with the currently committed record. Although Read Committed transactions can always read this record, even if it was committed after the transaction started, Consistent Read transactions may not be able to read this record. But a duplicate entry cannot be allowed because that record is committed, whether it is visible or not. The third type of update conflict is when a Consistent Read transaction attempts to create a key value that is the same as an older, but still visible record version. Even though a newer key value has been committed by a concurrent transaction, the Consistent Read transaction does not see it. Instead, it sees the old value. If this transaction tried to make another record with this value, it would 'appear' as if there are duplicates in a unique index. So this also, is not allowed in the Falcon storage engine. Let's take the previous record version stack as an example of these three kinds of duplicate conflicts. Record Versions T34-> T19 -> T18 -> T17 -> T10 -> T5 -> T3 Key values 6 5 4 3 2 1 0 As before, let's assume T34 is still active, T15 started while T10 was active and T15 is still active. Now T15 inserts a new record. What unique key values are not permissible? First, value 6 is still pending, so if it would be committed at some point in the future, it would become the only allowed key value 6. In this case T15 would wait for T34 to complete and then determine if there is a conflict. Second, value 5 is the currently committed key value, so it is not allowed. Third, a select of this record by T15 would see value 1, previously committed by T5. Thus, a value of 1 would also cause an update conflict. Of course, if T15 were a read committed transaction, it would only see value block on values 6 (with a wait state) and 5 (with an update conflict error). Changes to the Falcon Engine BUG#22211 had to do with the wait state that occured when a conflict was found with a pending record version. In this case, the pending record version is on a savepoint that is rolled back. But the transaction is committed. So when the waiting thread wakes up, it immediately returned the Update Conflict error, since the transaction committed. The fix is to note if a wait state occurred, and if so, retry the duplicate search from the current head of the record chain, just in case the duplicate found was not actually committed. As part of the fix for BUG#22211, it was noticed that Consistent Read transactions were not blocking on duplicate values within visible but old records. This gave the impression of having duplicate values within a unique index while the transaction was active. The change involved being able to skip previously committed record versions that need not be checked. A third aspect of these improvements to Falcon's Consistent Read behavior is that the scavenger thread was using the oldest active transaction instead of the oldest transaction that was active when the oldest active transaction was started. This is a subtle difference, and it would not be noticed unless a Consistent Read transaction lasted a very long time on a very busy server. No Comments yet |
VotesWatches0 members are watching this worklog
You must be logged in to track this worklog.
Provide Feedback
You must be logged in to comment
|