User:Timothy Smith/Top-down merge flow
What is top-down flow?
With top-down merge flow, code changes are first pushed to the bleeding-edge branch, and then cherry-picked down into successively lower branches. Code is never merged from lower-versioned branches into higher-versioned branches.
This essay is related to the Development Cycle discussion. --Timothy Smith 01:21, 3 April 2009 (CEST)
Top-down Example
This example uses the team-tree structure in use during the MySQL 5.1 RC phase. It should be easy to adapt to other team branch structures.
Ramil is fixing an optimizer bug which is known to affect all versions of MySQL Server, starting from 5.0 (i.e., 5.0, 5.1, and 6.0). He clones the mysql-current-opt branch to mysql-opt-bugNNNNN, and starts hacking. After code review he pushes his fix to mysql-current-opt for testing (revision A). When the tests pass, he branches mysql-6.0-opt, cherry-picks revision A into it (possibly with some changes due to -current features which are not in -6.0) and pushes his merge to mysql-6.0-opt (revision B). His fix is deemed important, but somewhat risky, so he flags this bug to be back-ported to 5.1 and 5.0 after three months of exposure in the 6.0-beta version. When alerted by the bugs system, he branches mysql-5.1-opt and cherry-picks revision B into it, and stages that merge (revision C). Likewise with mysql-5.0-opt, creating one last merge (revision D).
Tatjana is the merge captain responsible for integrating the *-opt trees into main. She merges mysql-current into mysql-current-opt, mysql-6.0 into mysql-6.0-opt, and so on. If a conflict exists, she must resolve it during each merge. However, she does not need to merge from mysql-6.0-opt up to mysql-current-opt, or from mysql-5.1-opt up to mysql-6.0-opt, etc.
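The cherry-pick order in this example can be sketched with a toy model. This is only an illustration, not a description of any real tooling: each branch is modelled as a plain list of change labels, and `cherry_pick_down`, `BRANCH_ORDER`, and the label `new-feature` are hypothetical names invented here. It also glosses over the fact that each cherry-pick produces a distinct revision (A, B, C, D above).

```python
# Hypothetical model: branches ordered from newest to oldest.
BRANCH_ORDER = [
    "mysql-current-opt",
    "mysql-6.0-opt",
    "mysql-5.1-opt",
    "mysql-5.0-opt",
]


def cherry_pick_down(branches, fix, lowest):
    """Apply `fix` to the newest branch, then to each lower branch down to `lowest`.

    Only the named fix travels downward; nothing else is carried along.
    Returns the list of branches touched, newest first.
    """
    if lowest not in BRANCH_ORDER:
        raise ValueError(f"unknown branch: {lowest}")
    touched = []
    for name in BRANCH_ORDER:
        if fix not in branches[name]:
            branches[name].append(fix)
        touched.append(name)
        if name == lowest:
            break
    return touched


branches = {name: [] for name in BRANCH_ORDER}
# Unrelated work that exists only in the newest branch stays there.
branches["mysql-current-opt"].append("new-feature")

# The fix is back-ported as far as 5.1; the 5.0 back-port can happen later.
touched = cherry_pick_down(branches, "bugNNNNN-fix", lowest="mysql-5.1-opt")
print(touched)
```

The point of the sketch is that the lowest target version is just a parameter that can be chosen (or extended) after the fix already exists in the newest branch.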
How is it different from the current flow?
The current flow used in MySQL development is bottom-up. Code changes are first pushed to the lowest relevant branch, and then merged up into successively higher branches. No changes exist in lower-versioned branches which are not merged into higher-versioned branches.
Bottom-up Example
Ramil creates GCA clones of the mysql-*-opt branches for 5.0, 5.1 and 6.0. He starts hacking in mysql-5.0-bugNNNNN, and commits (revision A). After code review, he merges revision A from mysql-5.0-bugNNNNN into mysql-5.1-bugNNNNN (possibly with some changes due to -5.1 features which are not in -5.0) and commits (revision B). He repeats the process, merging revision B from mysql-5.1-bugNNNNN into mysql-6.0-bugNNNNN (revision C). Now he merges mysql-6.0-bugNNNNN into mysql-6.0-opt, mysql-5.1-bugNNNNN into mysql-5.1-opt, and mysql-5.0-bugNNNNN into mysql-5.0-opt. If there are no unmerged changesets, he can then merge his merges up from mysql-5.0-opt to mysql-5.1-opt, and from mysql-5.1-opt up to mysql-6.0-opt. Finally, he pushes mysql-6.0-opt, mysql-5.1-opt, and then mysql-5.0-opt.
Tatjana starts by merging up any unmerged changesets from mysql-5.0-opt to mysql-5.1-opt, and then from mysql-5.1-opt to mysql-6.0-opt. If she is unable to do these merges, she hunts down the developer who forgot to up-merge and waits for him to push the merges. Or she gives up, and leaves unmerged changes in lower versions, which eventually could build up to a serious merge jam. She then merges mysql-5.0 into mysql-5.0-opt, mysql-5.1 into mysql-5.1-opt, and so on. If a conflict exists, she must resolve it during each merge. If there are no unmerged changesets, she can merge up her merges from mysql-5.0-opt into mysql-5.1-opt, and from mysql-5.1-opt into mysql-6.0-opt. Finally, she pushes to mysql-6.0, mysql-5.1 and then mysql-5.0.
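Why the phrase "if there are no unmerged changesets" matters can be shown with a small sketch. It assumes a deliberately simplified model in which a branch is a list of revision labels and a merge pulls in every revision the target lacks; `merge_up` and all labels are hypothetical, and real merges also involve conflict resolution that is not modelled here.

```python
def merge_up(lower, higher):
    """Merge `lower` into `higher`; return the revisions that were pulled in.

    Unlike a cherry-pick, a merge brings along every revision the higher
    branch is missing, not just the one the developer cares about.
    """
    pulled = [rev for rev in lower if rev not in higher]
    higher.extend(pulled)
    return pulled


# Hypothetical state: someone pushed a fix to 5.0 and forgot to merge it up.
opt_50 = ["base", "someone-elses-unmerged-fix"]
opt_51 = ["base", "5.1-feature"]

# Ramil adds his own fix to 5.0 and merges up.
opt_50.append("ramil-fix")
pulled = merge_up(opt_50, opt_51)
print(pulled)
```

In this model Ramil's up-merge also carries the other developer's forgotten changeset, which is the situation that forces either a hunt for that developer or a growing backlog of unmerged changes.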
GCA Clones
GCA ((Greatest) Common Ancestor) clones are a set of branches with no unmerged changes in the lower-versioned branches. That is, branch N contains all of the revisions in branch N-1. With bzr, they can be created like this:
bzr branch mysql-6.0 60
bzr branch -rancestor:./60 mysql-5.1 51
bzr branch -rancestor:./51 mysql-5.0 50
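The property these clones are meant to have (branch N contains every revision in branch N-1) can be stated as a simple check. The sketch below assumes each branch is represented as a set of revision ids; `is_gca_chain` and the ids are hypothetical and say nothing about how bzr stores history.

```python
def is_gca_chain(branches):
    """Return True if each branch contains all revisions of the one before it.

    `branches` is a list of sets of revision ids, lowest version first.
    """
    return all(lower <= higher for lower, higher in zip(branches, branches[1:]))


# Hypothetical revision ids for 5.0, 5.1 and 6.0.
b50 = {"r1", "r2"}
b51 = {"r1", "r2", "r3"}
b60 = {"r1", "r2", "r3", "r4"}
print(is_gca_chain([b50, b51, b60]))

# A revision pushed to 5.0 but not merged up breaks the property.
b50_dirty = b50 | {"r5"}
print(is_gca_chain([b50_dirty, b51, b60]))
```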
Advantages of top-down flow
The main advantage is that code changes can be made first in the most volatile branch, and can mature there before the decision is made to back-port them to more stable branches. A developer does not need to wait to know the lowest version a bug should be fixed in before starting to work on the patch. She can find, report, fix, get reviewed, and push a bug fix without concern for its triage category or ultimate lowest version.
Similarly, back-porting of bug fixes to earlier versions, as is occasionally requested by Support or Community, is not an exceptional case. It is part of the normal flow, and doesn't require awkward handling (null-merging changes up, ignoring the redundant "this was pushed to version X" messages that result, etc.).
On a related note, the developer is focused on providing a correct fix in the development branch. He is encouraged to refactor appropriately, and code in the most appropriate way possible. When back-porting the fix, he can then reduce the changes as required to produce a minimal patch that is least likely to cause a regression. It is less likely for a developer to focus first on a minimal patch, and then to do the "extra" work of refactoring, etc., when merging up. That "forward porting" results in the perpetuation of stale, inappropriate abstractions, because we start working in the most conservative environment.
Less back-and-forth merging is required. Merging up, and merging up the merges, and null merging back-ports all go away. Especially with Bazaar, these merges lead to a lot of wasted time. Merge jams go away, because there is no need to merge up from lower branches.
Chances of performing an incorrect merge go way down, for the same reason. For example, it's not uncommon for a new error message to be added to mysql-5.0, and then merged up to mysql-5.1 and then mysql-6.0. The error message may end up at the end of each version's list of errors during the merge, and the merger doesn't realize that she has introduced an incompatibility between versions. While this is possible in the top-down flow (and is avoidable in bottom-up, if one is careful), it is less likely to happen in top-down. The person cherry-picking the fix into the lower version is focused on exactly that one fix, and is much more likely to be aware of the ramifications of the merge. A merge captain for a team (or, worse, someone from a different team who is stuck trying to overcome a merge jam) is more likely to miss such a problem when doing a bottom-up merge. This is just one example, and I know there are proper ways to make this problem go away, but the idea is sound.
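The error-message example can be made concrete with a short sketch. It rests on an assumption that is not stated in the text: that an error's number is determined by its position in the version's list. `FIRST_CODE`, `error_codes`, and the `ER_*` names are all hypothetical.

```python
FIRST_CODE = 1000  # hypothetical starting number


def error_codes(messages):
    """Assign numbers by list position (an assumption made for this sketch)."""
    return {name: FIRST_CODE + i for i, name in enumerate(messages)}


# 5.1 already has a message that 5.0 lacks.
errors_50 = ["ER_A", "ER_B"]
errors_51 = ["ER_A", "ER_B", "ER_51_ONLY"]

# The new message lands at the end of each version's list during the merge.
errors_50.append("ER_NEW")
errors_51.append("ER_NEW")

codes_50 = error_codes(errors_50)
codes_51 = error_codes(errors_51)
print(codes_50["ER_NEW"], codes_51["ER_NEW"])
```

Under that assumption the same message ends up with a different number in each version, which is the kind of cross-version incompatibility a merger can introduce without noticing.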
Disadvantages of top-down flow
The main challenge I see is the integration with bugs.mysql.com. In order to properly cherry-pick code changes, *all* changes that affect a certain Bug # or WL # must be properly marked as such. If a bug fix is accumulated into a single revision, then it's not a problem. But if follow-up fixes are required, these could be missed during a back-port if care isn't taken to record them properly.
If a developer knows the back-port will be required, and does it at the same time as the change is being made for the -current branch, then this is not a problem. All the work is done in a bug branch, and it is easy to grab the last N revisions and merge them into the lower branch. However, if the back-port happens at a later time, there is a risk of missing some part of the fix if it isn't tagged properly with the Bug or Worklog number.
The situation isn't much different for the bottom-up approach, however. If the developer makes the fix for 5.1 and up, and later finds that it should also go into 5.0, the same problem shows up when doing that back-port. The only "advantage" bottom-up has here is that it forces the developer to choose, before starting work on the bug, which versions will be fixed. That is, it removes this problem, usually, by eliminating flexibility.
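The tagging risk described in this section can be illustrated by a sketch that collects the revisions to back-port from a commit log. The log format, the revision ids, the bug number 12345, and the `Bug#NNNNN` message convention are all assumptions made for the example; `revisions_for_bug` is a hypothetical helper, not part of bzr or the bugs system.

```python
import re


def revisions_for_bug(log, bug_id):
    """Return ids of revisions whose commit message mentions the given bug.

    `log` is a list of (revision_id, message) pairs, oldest first.
    """
    pattern = re.compile(rf"\bBug\s*#\s*{re.escape(str(bug_id))}\b", re.IGNORECASE)
    return [rev for rev, msg in log if pattern.search(msg)]


# Hypothetical commit log for the development branch.
log = [
    ("r101", "Bug#12345: fix optimizer crash"),
    ("r102", "WL#999: new feature"),
    ("r103", "Follow-up for Bug #12345: fix test"),
    ("r104", "fix typo in previous patch"),  # untagged follow-up
    ("r105", "Bug#123456: unrelated"),
]

to_backport = revisions_for_bug(log, 12345)
print(to_backport)
```

Here the untagged follow-up `r104` is silently left out of the back-port, which is exactly the failure mode that careful tagging is meant to prevent.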