WL#4654: Falcon: new dependency manager

Affects: Falcon-6.0 — Status: Assigned — Priority: Medium

Falcon's current transaction dependency manager has due to optimizations
introduced for read-only transactions ended up in fairly complex code that has
caused several bugs in high-concurrency tests. These have been hard to find and
track down.

In order to both simplify the architecture and the implementation of the
dependency manager a new dependency manager will be made. This new dependency
manager is based on an idea from Jim Starkey.

The old dependency manager uses a states array and a dependency counter for each
transaction. When a transaction is started the states array is filled with
information and pointers to all currently active transactions. At the same time
the dependency counter in each of the currently active transactions is
incremented. As transactions commits, the state array and the dependency
counters are updated. When a committed transaction's dependency counter reaches
zero the transaction object can be deleted (given that its log records are fully
processed). The new dependency manager will eliminate the need for maintaining
both the states array and the dependency counter.

The new dependency manager will use a much simpler data structure for
determining how transactions relate to each other (the dependency between
transactions). The transaction manager will maintain an event counter that is
incremented each time an "event" occurs. When a transaction is started it is
assigned a "start event count" and put on the end of the active transaction
list. When the transaction commits it is given a "commit event count" and moved
from the active transaction list to the committed transaction list.

The dependency manager is used for two main purposes. First (and most
importantly) it is used to for determining the relative state between
transactions, particularly between active and committed transactions. The second
use if for determining when old committed transaction objects can be removed.

Using the assigned event counts a transaction can determine its relative state
to a committed transaction by comparing its "start event count" to the other
transaction's "commit event count". If the start event count is lower than the
other transaction's commit event count, it started before the other committed.
If the start event count is larger than the other's transaction's commit event
count, it started after the other transaction committed. 

Old committed transactions can be removed when their "commit event count" is
lower than the "start event count" of the oldest active transaction (and all
other requirements for removing a transaction is fulfilled).



Data structure:
===============

The new dependency manager relies on the following data structure:

1. The transaction objects needs to have a "start event" number and a "commit
event" number. Since the "start event" will be similar to the current
"transactionId" we will use this as the "start event". For the "commit event" a
new data member will be introduces called "commitId".

1. The transaction manager: The transaction manager will assign a "start event"
to the transaction when it starts (as the transactionId) and a "commit event" to
the transaction when it commits (as the commitId). These events has t be taken
from a "transition event counter" that is incremented each time a transaction
either
starts or commits (not for rollback). Since the transaction manager already have
such an counter that is currently used for generating transaction ids, this will
be used for this purpose.

This new data structure replaces the old transaction manager's states array and
dependencies counter.


Calling Transaction::commitRecords() and purging old transaction objects
========================================================================

One of the uses of the old (as well as the new) dependency manager is to
determine when it is safe to call Transaction::commitRecords() (and in most
cases also being able to delete/release the transaction object itself). When
using the old dependency manager one of the requirement was that the dependency
counter had reached zero before calling Transaction::commitRecords().

In the code using the old dependency manager Transaction::commitRecords() could
be called in four/five different places:

1. Transaction::commit()

2 a+b). Transaction::releaseDependency() called either from
Transaction::expungeTransaction() or Transaction::releaseDependencies().

3. Transaction::writeComplete() (callback from serial log/gopher thread)

4. TransactionManager::purgeTransaction() (clean-up of transaction by the
scavenger).

By instrumenting Falcon and tracing where the actual call to
Transaction::commitRecords initially is done showed that the majority of the
calls happened in Transaction::writeComplete() with some also called from
Transaction::releaseDependency.
This suggest the following for the new dependency manager for how and when to
call commitRecords() (and delete committed transaction objects):

* Case 1 above (in Transaction::commit()) is not necessary at all - remove it.

* Case 2a+b: will be removed since the code is part of the old dependency manger

* Case 3: is very useful but in order to re-implement the "old dependency
check" that just checks that the dependencies counter is zero, with the new
dependency manager we would need to introduce a shared lock on the active
transaction list to get the "startEvent" (also now known as the transactionId)
of the oldest active transaction in order to do the "new dependency check".
This does not seem like the best thing to do.

* Case 4: TransactionManager::purgeTransaction(), not really needed but can be
useful in order to delete transaction objects that otherwise would hang around
for a long time (see further down).

Proposal for when to call Transaction::commitRecords and delete old transaction
objects:

1. The best (and easiest) way to call Transaction::commitRecords() and delete
old transaction objects will be to do this each time we commit a record when we
anyway have the exclusive lock on both the active and committed lists. This
code will call commitRecords() and release the transaction objects from the
front of the committed transaction list as long as they have a commitId
(commitEvent) that is smaller than the transactionId (startEvent) of the oldest
active transactions. This code will be implemented in
TransactionManager.purgeTransactions().
The cost per transaction object should be very "similar" to the cost of doing
the same in the scavenger.

2. Handle situation where there for a long period of time is no update
transactions that commits (for instance during periods with only read-only
transactions) we still let the scavenger call
TransactionManager.purgeTransactions().

One extra benefit of this that instead of having five code paths that lead to
commitRecords() in the old implementation, the new implementation will be
simpler with commitRecords() only executed in a single place.

Handling of read-only transaction:
==================================

Much of the complexity of the old dependency manager was caused due to
having to optimize for read-only transactions. This will be much
simpler with the new dependency manager since there is no data
structures "pointing between transactions". Still, in this patch the
optimization for read-only transactions is not included and all
transaction goes through the regular commit code path (which is good
for testing of it).

Thus: Transaction::commitNoUpdate() is currently not called.





Transaction::visible() and Transaction::getRelativeState()
==========================================================

These have been updated to use the new dependency manager for
determining the correct state for transactions.


Transaction::getInfo()
======================

This still reports information about number of dependencies (equal to
zero). I did not remove this yet because this required some changes to
the information schema definiiton for this (I think). Will be removed
in a follow-up patch.

Question: Do we want to include information about the start and commit
event for transactions in this table?

You must be logged in to tag this worklog

No Comments yet

Votes

Not yet rated.
You must be logged in to vote.

Watches

0 members are watching this worklog
You must be logged in to track this worklog.

Provide Feedback

Please note:
HTML will be purified, but we allow for a number of HTML tags so that you have the flexibility to decorate your comment text to some extent. The comments allow the following HTML tags:

strong, b, em, blockquote, a, code, pre

To put code into your comment, simply encapsulate your code with
[code language="XXX"][/code], where XXX is any common language, for instance "PHP", "SQL", "C", etc.



You must be logged in to comment