Hi Kent!
I would ask for clarification on the terms "Code base" and "code
stream". When I see "codebase" I think "repository". When I see "code
stream", I think "branch" or "codeline". Yet I get the impression you
are using the terms synonymously.
Also, when you write "code stream", you seem to be referring to one
particular kind/usage: that of maintaining a long-lived variant, either
for a concurrently maintained/supported release (often called
"multi-release" or "multi-project") or else for multiple market/platform
variations within a release (often called "multi-variant").
In my opinion:
1) Transient branches are fine (even ones that last more than a few
hours) and do not cause the waste/retrofitting you describe. But you do
need to follow some rules regarding integration structure and frequency.
2) Variant branches are evil, and the need for them should be solved
with good architecture/factoring or else with configuration that happens
at a later binding time.
3) Multiple release branches are often a necessity in order to support
multiple releases. Supporting multiple releases is highly undesirable,
but is often unavoidably mandated by the business/customer.
Permit me to elaborate at some considerable length ...
TRANSIENT BRANCHING
===================
There are other uses of branching besides multi-project and
multi-variant, and their purpose is not to create a separately evolving
divergent line. These do not create additional integration fan-out. They
do not need to be propagated or retrofitted to multiple places. They are
often little more than a separate sandbox with the ability to check in
"private versions" -- something some folks do anyway without using branches.
These "temporary" branches can be very useful, and I do not see them
having the bad/wasteful effects you describe. When I do, the reason is
not inherent to branching - the reason is that the interval between when
the code is branched and when it is merged back is too long, which can
happen just as easily in a developer or integration sandbox that doesn't
use a branch.
PRIVATE BRANCHES
----------------
In fact I often see private developer branches be a boon rather than a
bane, because they give the developers a "safe haven" to experiment, and
checkpoint their experiment, and possibly fail, without disrupting the
team. They can commit their changes once they know they won't "break"
the codeline for the rest of the team.
Until then, they can checkpoint their various stages of experimentation
for making a new test pass or a new refactoring or prior to
updating/rebasing/syncing-up with the codeline. I see this practice
often helps build confidence, making developers less fearful of
committing their changes more frequently.
Some worry it makes them more likely to check in on their branch and NOT
commit changes. I don't usually see this. What I see is that if they
didn't have the private versions, they still would not be inclined to
commit their changes - because they fear committing them. With the
private versions available on a private developer branch, at least the
intermediate experimental states are checked into the repository, while
still being prevented from "breaking" the codeline.
THIRD PARTY CODELINE
--------------------
This is another case where I regularly see the use of branching being a
help rather than a hindrance. If the reality is that you receive code
from a vendor/supplier which you subsequently modify (rather than
"extend"), then the reality is that you and your vendor are both
developing the same codebase in parallel.
- If you can manage to avoid that (say by having the vendor
incorporate your changes) that is ideal.
- If you cannot (and there may be valid business reasons, such as my
charging additional money to create additional value), then having a
separate vendor branch to represent the vendor's evolution of the code
in parallel tends to be the most effective way to integrate subsequent
vendor releases with your own custom modifications to the vendor's
previous release.
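
As a rough illustration of why the vendor branch helps, here is a
minimal Python sketch: the vendor's previous release acts as the common
ancestor in a three-way merge between your customized code and the
vendor's new release. The function name, the file names, and the
whole-file granularity are assumptions made for this example; real SCM
tools merge line by line.

```python
def three_way_merge(base, ours, theirs):
    """Merge two sets of files that both descend from `base`.

    Each argument maps a file path to its content (a simplified,
    hypothetical model). Returns (merged, conflicts).
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:        # both sides agree (or both deleted the file)
            result = o
        elif o == b:      # only the vendor changed it: take the vendor's
            result = t
        elif t == b:      # only we changed it: keep our customization
            result = o
        else:             # both changed it differently: needs a human
            conflicts.append(path)
            continue
        if result is not None:
            merged[path] = result
    return merged, conflicts


# Hypothetical example: vendor release 1.0 is the common ancestor.
vendor_1_0 = {"parser.c": "v1", "net.c": "v1", "ui.c": "v1"}
our_code = {"parser.c": "v1 + local fix", "net.c": "v1", "ui.c": "v1 + local"}
vendor_2_0 = {"parser.c": "v1", "net.c": "v2", "ui.c": "v2"}

merged, conflicts = three_way_merge(vendor_1_0, our_code, vendor_2_0)
```

Without the vendor branch there is no record of `vendor_1_0`, so every
difference between `our_code` and `vendor_2_0` looks like a potential
conflict. With it, only `ui.c` (changed on both sides) needs attention.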
ORGANIZATIONAL "COPING" BRANCHES
--------------------------------
Obviously, we'd rather get rid of the organizational silos/stovepipes
and all work together as one big happy collocated team. The common credo
there is "change your organization, or change your organization!" If you
choose to stick it out, then organizational change typically doesn't
happen overnight and an interim strategy is needed.
A common strategy here that I've seen be QUITE effective (despite the
fact that it adds another level of indirection) applies when development
has to satisfy some separate/independent group for integration, or QA,
or (etc.). Even though it's possible to create tags/labels for these
other groups to reference (rather than the tip of the codeline),
something about the multiple rhythms/cadences of activity on the
codeline makes it conceptually harder to deal with.
So even though, in theory, it should be possible not to branch, when the
two groups (e.g., development and integration) have incompatible
policies or tolerances or "rhythms", it often seems to work out better
to give each their own codeline, and let development maintain their own
"Active Development Line" while the other group "pulls" from that to
their own "Release Line" at their own pace/tolerance level.
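
The "pull at their own pace" idea can be sketched in Python roughly as
follows. This is a toy model, not a description of any particular SCM
tool: the `Commit` class, the `passed_full_regression` flag, and the
assumption that the release group makes no changes of its own (so the
Release Line is always a prefix of the Active Development Line) are all
hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    ident: str
    passed_full_regression: bool  # hypothetical acceptance flag


def pull_to_release(dev_line, release_line, acceptable):
    """Advance release_line to the newest acceptable commit on dev_line.

    Development keeps committing to dev_line at its own rhythm; the
    other group calls this whenever it is ready to take new work.
    Returns the commits that were pulled (possibly none).
    """
    already_pulled = len(release_line)
    # Walk backwards from the tip to find the newest acceptable commit.
    for i in range(len(dev_line) - 1, already_pulled - 1, -1):
        if acceptable(dev_line[i]):
            pulled = dev_line[already_pulled:i + 1]
            release_line.extend(pulled)
            return pulled
    return []  # nothing new meets the release group's tolerance yet
```

The point of the sketch is that the acceptance policy lives with the
group that pulls, so development does not have to slow its own rhythm
to match it.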
MULTIPLE MAINTENANCE
=====================
Branching for multiple releases is not desirable, BUT if you must
support multiple releases, THEN branching is one of the most effective
ways to do it. Supporting multiple releases is often a business decision
mandated by a customer.
At that point, we must acknowledge that supporting the non-latest
release(s) is akin to a very big and redundant "story" that is deemed to
be of enough business value to the customer that they are willing to pay
for it. If you agree to it, I'm inclined to suggest treating the
agreement as an SLA of sorts, and being clear about expectations of how
long you will do it for a release before retiring/obsoleting it.
Perhaps the cost of doing so could somehow be estimated, and added as a
"rider" or weight factor/multiplier for each "story" that had to be
multi-integrated. Then maybe it would help dissuade the customer from
asking for such concurrent release support, or at least have them ask
for a whole lot less of it (or be more willing to take the latest
release, or be more willing to explore their reasons for why they won't
and try to work those instead).
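
One way such a "rider" might be expressed is sketched below in Python.
The function name, the linear formula, and the default surcharge of 50%
per additional release are all assumptions chosen for illustration; a
real team would have to calibrate the factor from its own history.

```python
def weighted_estimate(base_points, supported_releases,
                      rider_per_extra_release=0.5):
    """Estimate for a story that must be integrated into several releases.

    rider_per_extra_release is a hypothetical surcharge: 0.5 means each
    additional supported release adds 50% of the base estimate.
    """
    if supported_releases < 1:
        raise ValueError("a story targets at least one release")
    extra_releases = supported_releases - 1
    return base_points * (1 + rider_per_extra_release * extra_releases)


# An 8-point story costs 8 points on one release, 16 across three.
single = weighted_estimate(8, 1)
triple = weighted_estimate(8, 3)
```

Showing the customer the weighted number next to the base number makes
the price of concurrent release support visible story by story.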
I nonetheless acknowledge that DualMaintenance or MultipleMaintenance
(as it is described on the Wiki) is often a business reality mandated by
business folk that technical folk don't get to second-guess. (See the
UseOneCodeline Wiki page for some more discussion.)
MULTIPLE VARIANTS
=================
Branching for multiple variants is precisely the kind of evil/waste
Kent's post describes. I see it happen primarily in two cases:
Case 1) Golly gee - project B is sort of like project A, let's create our
own branch of it at release "X" and "run with it from there". No need to
try to talk to or collaborate with the folks still developing project "A".
In the immortal words of "Cool Hand Luke": What we have here is a
failure to communicate. Actually it's worse than that: it's a failure to
even TRY. I have to think that the solution for such things should be
architectural (in the code structure) rather than temporal (in the
branching structure).
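
A minimal Python sketch of what an "architectural" solution could look
like: both projects share one codeline, and the differences between
them are isolated in configuration that is bound when the program runs.
Everything here (the invoice example, `VariantConfig`, the project
names, and the field names) is hypothetical and only meant to show the
shape of the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantConfig:
    currency: str
    tax_rate: float
    show_loyalty_points: bool


# One codeline; each variant is data, not a branch (hypothetical values).
VARIANTS = {
    "project_a": VariantConfig("USD", 0.0, False),
    "project_b": VariantConfig("EUR", 0.25, True),
}


def render_invoice(amount, variant_name):
    """Shared logic; the variant is selected at run time, not by branching."""
    cfg = VARIANTS[variant_name]
    total = amount * (1 + cfg.tax_rate)
    lines = [f"Total: {total:.2f} {cfg.currency}"]
    if cfg.show_loyalty_points:
        lines.append(f"Points earned: {int(amount)}")
    return lines


invoice_a = render_invoice(100, "project_a")  # ["Total: 100.00 USD"]
invoice_b = render_invoice(100, "project_b")
```

A fix to `render_invoice` reaches both projects at once, with nothing to
propagate between codelines.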
The codebase is a knowledge-base. Multi-variant branching of this sort
splinters the knowledge into "thought streams" that make the "whole"
into something very much incoherent, and which takes 40%-80% more effort
to support and maintain as a result of the added "integration fan-out".
Case 2) Customer Fubar wants fixes/enhancements 1, 2, and 3 ... but
Customer Bahfoo wants only 1, 5, and 7 ... and Customer Bazoo wants 2,
3, and 7. Why won't each one accept all the fixes/enhancements? It
typically seems to be that:
a) either not all of them have "value" to each one,
b) or they don't trust that we won't "destabilize" what they want when
we fix/enhance something they don't value,
c) or they claim competitive advantage in the order of delivery by
getting those changes before the other changes.
Here again in most of these cases I think the issue is a
failure/unwillingness to communicate/cooperate. In the case of a) and c)
it's the customers who won't cooperate and come together to speak with a
single unified voice-of-the-customer.
In the case of b) there is probably not enough communication between
customer and producer because trust has been lost. Trust needs to be
earned back - and adding a variant branch will likely have the opposite
of the desired effect.
MAINLINE
========
If you do branch, then mainlining is essential. The "Mainline" pattern
is sort of like "refactoring" for branches. It helps you minimize the
breadth and depth of the branching hierarchy and also organize it in
such a way that minimizes the size and complexity of the merging that
occurs between codelines.
CONCLUSION
==========
So as a blanket generalization, I'd have to disagree with the
proclamation that "There is only one code stream". I'd be willing to say
that "for a given project, there should be only one integration stream"
is a more feasible and more desirable ideal to strive for.
Then I'd add that we must acknowledge that anytime we develop+support
multiple releases and/or multiple variants in parallel, each additional
release/variant is in actuality an additional PROJECT with all the
associated implications of added cost, effort, management, and
administration. And if Multi-Tasking the same team-member is something
to be avoided, then Multi-Project-ing the same team|codebase is an even
grander scale thing to avoid like the plague, for many of the same
reasons (scaled-up or "super-sized" :-)
SEE ALSO
========
* <http://cmwiki.bradapp.net/BranchingAndMerging>
* <http://cmwiki.bradapp.net/ContinuousIntegration>
* <http://cmwiki.bradapp.net/AgileSCMArticles>
--
Brad Appleton <brad-***@public.gmane.org> www.bradapp.net
Software CM Patterns (www.scmpatterns.com)
Effective Teamwork, Practical Integration
"And miles to go before I sleep" --Robert Frost