Practice: Single Code Base

Discussion:

Kent Beck

2005-04-27 04:45:19 UTC

There is only one code stream. You can develop in a temporary branch, but
never let it live longer than a few hours.

Multiple code streams are an enormous source of waste in software
development. I fix a defect in the currently deployed software. Then I have
to retrofit the fix to all the other deployed versions and the active
development branch. Then you find that my fix broke something you were
working on and you interrupt me to fix my fix. And on and on.

There are legitimate reasons for having multiple versions of the source
code active at one time. Sometimes, though, all that is at work is simple
expedience, a micro-optimization taken without a view to the
macro-consequences. If you have multiple code bases, put a plan in place for
reducing them gradually. You can improve the build system to create several
products from a single code base. You can move the variation into
configuration files. Whatever you have to do, improve your process until you
no longer need them.
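
A minimal sketch of what moving the variation into configuration might look like, assuming a hypothetical invoicing product sold to two customers. The customer names, the `CustomerConfig` fields, and `render_invoice` are illustrative assumptions, not anything from the projects described here; in practice the settings would probably be loaded from per-customer configuration files rather than written inline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerConfig:
    """Everything that differs between customers lives here, not in a branch."""
    name: str
    currency: str
    tax_rate: float
    show_loyalty_points: bool = False


# Hypothetical per-customer settings (assumed for illustration only).
CONFIGS = {
    "acme": CustomerConfig("Acme", "USD", 0.07),
    "globex": CustomerConfig("Globex", "EUR", 0.20, show_loyalty_points=True),
}


def render_invoice(config: CustomerConfig, net_amount: float) -> list[str]:
    """One code path serves every customer; only the configuration varies."""
    total = round(net_amount * (1 + config.tax_rate), 2)
    lines = [f"{config.name} invoice", f"Total: {total:.2f} {config.currency}"]
    if config.show_loyalty_points:
        lines.append(f"Loyalty points earned: {int(net_amount)}")
    return lines


if __name__ == "__main__":
    for key in CONFIGS:
        print("\n".join(render_invoice(CONFIGS[key], 100.0)))
```

A defect fixed in `render_invoice` is fixed for every customer at once, which is the point of the practice: there is nothing to retrofit.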

One of my clients had seven different code bases for seven different
customers and it was costing them more than they could afford. Development
was taking far longer than it used to. Programmers were creating far more
defects than before. Programming just wasn't as fun as it had been
initially. When I pointed out the costs of the multiple code bases and the
impossibility of scaling such a practice, the client responded that they
simply couldn't afford the work of reuniting the code. I couldn't convince
the client to even try reducing from seven to six versions or adding the
next customer as a variation of one of the existing versions.

Don't make more versions of your source code. Rather than add more code
bases, fix the underlying design problem that is preventing you from running
from a single code base. If you have a legitimate reason for having multiple
versions, look at those reasons as assumptions to be challenged rather than
absolutes. It might take a while to unravel deep assumptions, but that
unraveling may open the door to the next round of improvement.

William Pietri

2005-04-27 19:19:52 UTC

Post by Kent Beck
There is only one code stream. You can develop in a temporary branch, but
never let it live longer than a few hours.

What experiences have people had with trying this in a multi-team
environment?

I've recently heard vigorous arguments for keeping teams on different
branches, and letting them integrate when ready. Since these are the
same arguments I heard for single-developer branches, I'm unconvinced. I
think a company with an agile spirit and good test coverage could
solve the real issues with minimal use of branching. But I've never
actually tried it at that scale, so I'd love to hear what others have
had work (or not work).

William

Ron Jeffries

2005-04-27 21:06:47 UTC

Post by William Pietri

Post by Kent Beck
There is only one code stream. You can develop in a temporary branch, but
never let it live longer than a few hours.

What experiences have people had with trying this in a multi-team
environment?
I've recently heard vigorous arguments for keeping teams on different
branches, and letting them integrate when ready. Since these are the
same arguments I heard for single-developer branches, I'm unconvinced. I
think a company with an agile spirit and good test coverage could
solve the real issues with minimal use of branching. But I've never
actually tried it at that scale, so I'd love to hear what others have
had work (or not work).

Is Brad Appleton on the list? He can work up pretty good support for
branching, if I recall ...

Ron Jeffries
www.XProgramming.com
I could be wrong, but I'm not. --Eagles, Victim of Love

Brad Appleton

2005-04-28 09:37:14 UTC

Post by William Pietri
I've recently heard vigorous arguments for keeping teams on different
branches, and letting them integrate when ready. Since these are the
same arguments I heard for single-developer branches, I'm unconvinced.

I think single-developer branches are different from separate
feature-team branches in this regard. Single-developer branches do not
cause or encourage increasing the interval between 'checkout' and
'commit'. It's like using a sandbox with a private-checkpointing facility
built in. It's not keeping in-progress code "separate" any longer than it
would be without the branch in the first place.

Feature-team branches are a different story. They typically promote
keeping partially completed (in-progress) features separate without
attempting to integrate with the rest of the world until the feature is
done, or at a significant milestone. The result is either big-bang
feature integration, or else incremental integration using
significant-sized increments.

However, it is possible to use feature/team branches in such a way that
everyone still stays reasonably up-to-date with everyone else. The basic
idea is that there are different tolerance levels of breakage. A
feature-branch can tolerate changes that might "break" features other
than the one currently under development in the feature branch, without
hindering any of the feature-teams. So in theory:
* we "commit" a change to the feature-branch instead of to the
project-wide branch. The "commit" shouldn't break the feature branch, but
it could conceivably break the project integration branch if it were
integrated there
* we commit the "state" of a feature-branch to the project-wide
integration branch whenever it wouldn't break the other features
* we update from the project integration branch to the feature branch
as frequently as feasible

The basic theory is the same as if multiple components were being
developed this way rather than multiple features. But with multiple
components, in theory at least, there should be no overlap between files,
whereas it would be possible for the same file to be touched by more
than one "feature".

However, it's still adding another non-private level of integration
(in both scope and frequency) that developer branches don't add.

I might try to see if I could do a "Continuous Staging" approach first
before resorting to feature-branches in this case (see
<http://www.cmcrossroads.com/newsletter/articles/agilemar04.pdf>)

--
Brad Appleton <brad-***@public.gmane.org> www.bradapp.net
Software CM Patterns (www.scmpatterns.com)
Effective Teamwork, Practical Integration
"And miles to go before I sleep" --Robert Frost

George Dinwiddie

2005-04-28 01:15:00 UTC

Post by William Pietri

Post by Kent Beck
There is only one code stream. You can develop in a temporary branch, but
never let it live longer than a few hours.

What experiences have people had with trying this in a multi-team
environment?

At my most recent previous job, we had a multi-team approach on a single
application, broken at somewhat arbitrary boundaries. The use of
branching was a constant headache. It took months, many broken builds,
and repeated integration hells to get people off of that idea.

At my current engagement, I've stumbled into a branch situation. There
are three applications, and four projects (all within a small team).
One of the projects is the stuff common to the other projects. As two
applications were being enhanced, the third was neglected and would no
longer even compile, much less work, with the common code.

As the build procedure was tricky and the coupling was strong, I
undertook to revamp the third application with a copy of the common
code. In this way I was able to make the build procedure for it much
more robust (requiring a single build instead of successive builds in
the right order), and to break some of the coupling for testability.

The downside is that I've had to work at keeping the two copies of the
common code more or less in sync, though with different build
procedures. This has been for a longer time than I thought it would be,
as I didn't realize how long I would be frozen from re-integrating
because of the imminent release of the other two apps.

- George

--
George Dinwiddie
iDIA Computing, LLC
gdinwiddie-HLNzD44B1AikJOqCEYON2AC/***@public.gmane.org
http://www.idiacomputing.com
http://www.agilemaryland.org

When I remember bygone days
I think how evening follows morn;
So many I loved were not yet dead,
So many I love were not yet born.
'The Middle' by Ogden Nash

SirGilligan

2005-04-27 20:08:20 UTC

I agree and I have seen it personally.

Recently I have experienced the costs of multiple code bases once
again. This is the scenario:

Internet applications running 24/7.
New architecture to address issues which takes several months to
develop and is not an evolution of the current solution.

The code was branched with full knowledge of the costs. The reason is
that during the several months spent developing the new system,
changes to the existing system would be necessary to
meet business needs. Suffice it to say that the system is very
dynamic and requires changes to code when changes to meta data, xml
files, ini files, and other configuration or definition files do not
suffice. (A hint as to why we are developing a new system.)

It has been an exercise in discipline to keep these two branches
correct. It has had its problems. We expected such. A problem that is
common to all multi-base situations is that a fix in one necessitates a
fix in the other. But it isn't simple. Our new system uses code from
the old (if it isn't broke...).

It isn't simple because if a problem is found in one branch and you
fix it, and you take the fix to the other branch, and that area of the
branch has changes, you have dependencies on those changes and possibly
on new code, so you can't just insert the fix.

If you are in the new code and you find a bug in some of the code
that came over and you fix it, you cannot go back to the old code and
insert the fix from the new code, because it will be coupled to many
things and will cause you to bring more and more of the new
system into the old. The coupling, the cohesion, the integrated
behaviors, all of this (and probably more) makes it intractable.

Currently we have the debate on the final steps. There are two camps
that are vocal about the problem and here are the ideas:

When the new system is complete we:
1) Merge the new system back into the old tree.
2) Overwrite the old system with the new system.

I am in camp (2) and feel strongly about it.
The camp (1) people have their reasons.

Just some current experiences for you guys to hear about. Nothing
new. I can't imagine 7 code bases.

Geoff

Brad Appleton

2005-04-28 09:19:29 UTC

Interesting story. So the branching was done because replacement
functionality (or replacement implementation of functionality) was
being done in parallel with maintenance of existing functionality?

If that is correct, do you think it could have been done without
branching? Instead of creating a new "node" in the version-branching
tree, could it have been solved with new "nodes" in the
directory-hierarchy and/or class-hierarchy or component|module
hierarchy? Would it have been possible to use patterns such as Bridge,
Factory, Wrapper-Facade, etc. to mock/stub old versus new functionality
(possibly even reusing many of the tests) and switch on/off the
old-vs-new functionality in a given build?
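
To make the question concrete, here is a rough sketch of the Factory idea: old and new implementations sit side by side in one codeline, and a build-time setting picks between them. It assumes a hypothetical pricing component; `LegacyPricing`, `NewPricing`, the `pricing_engine` flag, and the numbers are all invented for illustration and say nothing about the system under discussion.

```python
from typing import Protocol


class PricingEngine(Protocol):
    """The seam shared by the old and the new implementation."""
    def quote(self, quantity: int) -> int: ...


class LegacyPricing:
    def quote(self, quantity: int) -> int:
        return quantity * 10  # flat price per unit


class NewPricing:
    def quote(self, quantity: int) -> int:
        # Replacement behavior: a bulk discount from 100 units upward.
        unit_price = 9 if quantity >= 100 else 10
        return quantity * unit_price


_ENGINES = {"legacy": LegacyPricing, "new": NewPricing}


def make_pricing_engine(build_flags: dict) -> PricingEngine:
    """Factory: choose old or new for this build; defaults to the old one."""
    name = build_flags.get("pricing_engine", "legacy")
    if name not in _ENGINES:
        raise ValueError(f"unknown pricing engine: {name}")
    return _ENGINES[name]()


def check_shared_contract(engine: PricingEngine) -> None:
    """Tests that both implementations must pass, reused across old and new."""
    assert engine.quote(0) == 0
    assert engine.quote(1) == 10
```

Maintenance fixes go into the legacy class and new work into the new one, in the same tree, so neither has to be merged across branches later. Whether this is feasible depends on how much the two designs actually have in common.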



SirGilligan

2005-04-28 16:49:45 UTC

Post by Brad Appleton
Interesting story. So the branching was done because replacement
functionality (or replacement implementation of functionality) was
being done in parallel with maintenance of existing functionality?
If that is correct, do you think it could have been done without
branching? Instead of creating a new "node" in the version-branching
tree, could it have been solved with new "nodes" in the
directory-hierarchy and/or class-hierarchy or component|module
hierarchy? Would it have been possible to use patterns such as Bridge,
Factory, Wrapper-Facade, etc. to mock/stub old versus new functionality
(possibly even reusing many of the tests) and switch on/off the
old-vs-new functionality in a given build?

Excellent questions. I can tell you have given this some thought and
probably have been there before as well.

Creating the new nodes in the tree would have covered 85% of what we
are doing because that much is new. The 15% of changes to the
existing nodes become the issue.

Using some adapter pattern to go between the two implies a
translation of one domain to the other, and there are so few
similarities that it would be like mapping a connected graph of
objects into a tree structure without losing any connections! :-)

Geoff

Brad Appleton

2005-04-28 09:10:35 UTC

Hi Kent!

I would ask for clarification on the terms "Code base" and "code
stream". When I see "codebase" I think "repository". When I see "code
stream", I think "branch" or "codeline". Yet I get the impression you
are using the terms synonymously.

Also, when you write "code stream", you seem to be referring to one
particular kind/usage: that of maintaining a long-lived variant, either
for a concurrently maintained/supported release (often called
"multi-release" or "multi-project") or else for multiple market/platform
variations within a release (often called "multi-variant").

In my opinion:
1) Transient branches are fine (even ones that last more than a few
hours) and do not cause the waste/retrofitting you describe. But you do
need to follow some rules regarding integration structure and frequency.

2) Variant branches are evil, and should be solved with good
architecture/factoring or else configuration that happens at
later-binding-time.

3) Multiple release branches are often a necessity in order to support
multiple releases. Supporting multiple releases is highly
undesirable, but often unavoidably mandated by the business/customer.
Permit me to elaborate at some considerable length ...

TRANSIENT BRANCHING
===================

There are other uses of branching besides multi-project and
multi-variant, and their purpose is not to create a separately evolving
divergent line. These do not create additional integration fan-out. They
do not need to be propagated or retrofitted to multiple places. They are
often little more than a separate sandbox with the ability to check in
"private versions" -- something some folks do anyway without using branches.

These "temporary" branches can be very useful, and I do not see them
having the bad/wasteful effects you describe. When I do, the reason is not
inherent to branching; it is that the interval between when the code is
branched and when it is merged back is too long, which can happen in a
developer or integration sandbox just as easily without a branch.

PRIVATE BRANCHES
----------------

In fact I often see private developer branches be a boon rather than a
bane, because they give the developers a "safe haven" to experiment,
checkpoint their experiment, and possibly fail, without disrupting the
team. They can commit their changes once they know they won't "break" the
codeline for the rest of the team.

Until then, they can checkpoint the various stages of their
experimentation: while making a new test pass, during a refactoring, or
prior to updating/rebasing/syncing-up with the codeline. I see that this
practice often helps build confidence, making developers less fearful of
committing their changes more frequently.

Some worry it makes them more likely to check in on their branch and NOT
commit changes. I don't usually see this. What I see is that if they
didn't have the private versions, they still would not be inclined to
commit their changes, because they fear committing them. With the
private versions available on a private developer branch, at least the
intermediate experimental states are checked into the repository, while
still being prevented from "breaking" the codeline.

THIRD PARTY CODELINE
--------------------

This is another case where I regularly see the use of branching being a
help rather than a hindrance. If the reality is that you receive code
from a vendor/supplier which you subsequently modify (rather than
"extend"), then the reality is that you and your vendor are both
developing the same codebase in parallel.

- If you can manage to avoid that (say by having the vendor
incorporate your changes) that is ideal.

- If you cannot (and there may be valid business reasons, such as I'm
charging additional money to create additional value), then having a
separate vendor branch to represent the vendor's evolution of the code
in parallel tends to be the most effective way to integrate subsequent
vendor releases with your own custom modifications to the vendor's
previous release.

ORGANIZATIONAL "COPING" BRANCHES
--------------------------------

Obviously, we'd rather get rid of the organizational silos/stovepipes
and all work together as one big happy collocated team. The common credo
there is "change your organization, or change your organization!" If you
choose to stick it out, then organizational change typically doesn't
happen overnight and an interim strategy is needed.

A common strategy here that I've seen be QUITE effective (despite the fact
that it adds another level of indirection) applies when development has to
satisfy some separate/independent group for integration, or QA, or
the like. Even though it's possible to create tags/labels for these other
groups to reference (rather than the tip of the codeline), something
about the multiple rhythms/cadences of activity on the codeline makes that
conceptually harder to deal with.

So even though, in theory, it should be possible not to branch, when the
two groups (e.g., development and integration) have incompatible
policies or tolerances or "rhythms", it often seems to work out better
to give each its own codeline, and let development maintain its own
"Active Development Line" while the other group "pulls" from that to
its own "Release Line" at its own pace/tolerance level.

MULTIPLE MAINTENANCE
=====================

Branching for multiple releases is not desirable, BUT if you must
support multiple releases, THEN branching is one of the most effective
ways to do it. Supporting multiple releases is often a business decision
mandated by a customer.

At that point, we must acknowledge that supporting the non-latest
release(s) is akin to a very big and redundant "story" that is deemed to
be of enough business value to the customer that they are willing to pay
for it. If you agree to it, I'm inclined to suggest treating the
agreement as an SLA of sorts, and being clear about expectations of how
long you will do it for a release before retiring/obsoleting it.

Perhaps the cost of doing so could somehow be estimated, and added as a
"rider" or weight factor/multiplier for each "story" that had to be
multi-integrated. Then maybe it would help dissuade the customer from
asking for such concurrent release support, or at least have them ask
for a whole lot less of it (or be more willing to take the latest
release, or be more willing to explore their reasons for why they won't
and try to work on those instead).
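
As a rough illustration of such a "rider", here is one way the multiplier could be computed. The linear formula and the default of 25% extra per additional supported release are pure assumptions made up for the example, not measured figures; a team would have to calibrate them from its own history.

```python
def weighted_estimate(base_points: float,
                      supported_releases: int,
                      rider_per_release: float = 0.25) -> float:
    """Estimate for a story that must be integrated into every supported release.

    rider_per_release is an assumed surcharge for each release beyond the
    latest one (0.25 means 25% extra effort per additional release).
    """
    if supported_releases < 1:
        raise ValueError("at least one release must be supported")
    extra_releases = supported_releases - 1
    return base_points * (1 + rider_per_release * extra_releases)


if __name__ == "__main__":
    # A 4-point story costs 4 points on one release, 6 points on three.
    print(weighted_estimate(4, 1))
    print(weighted_estimate(4, 3))
```

Showing the customer both numbers side by side is one way to make the price of concurrent release support visible when stories are being chosen.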

I nonetheless acknowledge that DualMaintenance or MultipleMaintenance
(as it is described on the Wiki) is often a business reality mandated by
business folk that technical folk don't get to second-guess. (See the
UseOneCodeline Wiki page for some more discussion.)

MULTIPLE VARIANTS
=================

Branching for multiple variants is precisely the kind of evil/waste
Kent's post describes. I see it happen primarily in two cases:

Case 1) Golly gee - project B is sort of like project A, let's create our
own branch of it at release "X" and "run with it from there". No need to
try to talk to or collaborate with the folks still developing project "A".

In the immortal words of "Cool Hand Luke": What we have here is a
failure to communicate. Actually it's worse than that: it's a failure to
even TRY. I have to think that the solution for such things should be
architectural (in the code structure) rather than temporal (in the
branching structure).

The codebase is a knowledge-base. Multi-variant branching of this sort
splinters the knowledge into "thought streams" that make the "whole"
into something very much incoherent, and which takes
40%-80% more effort to support and maintain as a result of the added
"integration fan-out".

Case 2) Customer Fubar wants fix/enhancement 1, 2, and 3 ... but
Customer Bahfoo wants only 1, 5, and 7 ... and customer Bazoo wants 2,
3, and 7. Why won't each one accept all the fixes/enhancements? It seems
to typically be that:
a) either not all of them have "value" to each one,
b) or they don't trust that we won't "destabilize" what they want when
we fix/enhance something they don't value,
c) or they claim competitive advantage in the order of delivery by
getting those changes before the other changes.

Here again, in most of these cases I think the issue is a
failure/unwillingness to communicate/cooperate. In the case of a) and c)
it's the customers who won't cooperate and come together to speak with a
single unified voice-of-the-customer.

In the case of b) there is probably not enough communication between
customer and producer because trust has been lost. Trust needs to be
earned back - and adding a variant branch will likely have the opposite
of the desired effect.

MAINLINE
========

If you do branch, then mainlining is essential. The "Mainline" pattern
is sort of like "refactoring" for branches. It helps you minimize the
breadth and depth of the branching hierarchy and also organize it in
such a way that minimizes the size and complexity of the merging that
occurs between codelines.

CONCLUSION
==========

So, as a blanket generalization, I'd have to
disagree with the proclamation that "There is only one code stream". I'd
be willing to say that, "for a given project, there should be only one
integration stream" as a more feasible and more desirable ideal to
strive for.

Then I'd add that we must acknowledge that anytime we develop+support
multiple releases and/or multiple variants in parallel, each
additional release/variant is in actuality an additional PROJECT with
all the associated implications of added cost, effort, management, and
administration. And if Multi-Tasking the same team-member is something
to be avoided, then Multi-Project-ing the same team|codebase is an even
grander-scale thing to avoid like the plague, for many of the same
reasons (scaled-up or "super-sized" :-).

SEE ALSO
========

* <http://cmwiki.bradapp.net/BranchingAndMerging>
* <http://cmwiki.bradapp.net/ContinuousIntegration>
* <http://cmwiki.bradapp.net/AgileSCMArticles>


Kent Beck

2005-05-06 05:28:56 UTC

Brad,

Thank you for the extensive explanation. I think I understand the
configuration manager point of view better now.

The point of Single Code Base is that development is more effective if you
avoid the multiple maintenance problem. The less multiple maintenance you
have to do, the better. There are large-scale problems, like a relationship
with a customer where they won't use the latest release, where multiple
maintenance is a necessity. If you want software development to be more
effective, you should address the root cause of such problems and cope as
well as you can in the meantime. How does this compare to what you said?

Kent Beck
Three Rivers Institute

-----Original Message-----
Brad Appleton
Sent: Thursday, April 28, 2005 2:11 AM
Subject: Re: [xpe2e] Practice: Single Code Base
Hi Kent!
I would ask for clarification on the terms "Code base" and "code
stream". When I see "codebase" I think "repository". When I see "code
stream", I think "branch" or "codeline". Yet I get the impression you
are using the terms synonymously.
Also, when you write "code stream", you seem to be referring to one
particular kind/usage: that of maintaining a long-lived
variant, either
for a concurrently maintained/supported release (often called
"multi-release" or "multi-project") or else for multiple
market/platform
variations within a release (often called "multi-variant").
1) Transient branches are fine (even ones that last more than a few
hours) and do not cause the waste/retrofitting you describe.
But you do
need to follow some rules regarding integration structure and
frequency
2) Variant branches are evil, and should be solved with good
architecture/factoring or else configuration that happens at
later-binding-time
3) Multiple release branches are often a necessity in order
to support
multiple releases. And supporting multiple releases is highly
undesirable, but often unavoidably mandated by the business/customer
Permit me to elaborate at some considerable length ...
TRANSIENT BRANCHING
===================
There are other uses of branching besides multi-project and
multi-variant, and their purpose is not to create a
separately evolving
divergent line. These do not create additional
integration-fan out. They
do not need to be propagated or retrofitted to multiple
places. They are
often little more than a separate sandbox with the ability to checkin
"private versions" -- something some folks do anyway without
using branches.
These "temporary" branches can be very useful, and I do not see them
have the bad/wasteful effects you describe. When I do, the
reason is not
inherent to branching - the reason is because the interval/duration
between when the code is branched versus when it is
merged-back is too
long/infrequent, which is something that can happen in a developer or
integration sandbox just as easily as when it doesnt use a branch.
PRIVATE BRANCHES
----------------
In fact I often see private developer branches be a boon
rather than a
bane, because they give the developers a "safe haven" to
experiment, and
checkpoint their experiment, and possibly fail, without
disrupting the
team. They can commit their changes once they know it wont
"break" the
codeline for the rest of the time.
Until then, they can checkpoint their various stages of
experimentation
for making a new test pass or a new refactoring or prior to
updating/rebasing/syncing-up with the codeline. I see this practice
often helps build confidence, making developers less fearful of
committing their changes more frequently.
Some worry it makes them more likely to checkin on their
branch and NOT
commit changes. I dont usually see this. WHat I see is that if they
didnt have the private versions, they still would not be inclined to
commit their changes - because they fear committing them. With the
private versions available on a private developer branch, at
least the
intermediate experimental states are checked-into the
repository, while
still being prevented from "breaking" the codeline.
THIRD PARTY CODELINE
--------------------
This is another case where I regularly see the use of
branching being a
help rather than a hindrance. If the reality is that you receive code
from a vendor/supplier which you subsequently modify (rather than
"extend"), then the reality is that you and your vendor are both
developing the same codebase in parallel.
- If you can manage to avoid that (say by having the vendor
incorporate your changes) that is ideal.
- If you cannot (and there may be valid businesss reasons,
such as Im
charging additional money to create additional value), then having a
separate vendor branch to represent the vendor's evolution of
the code
in parallel tends to be the most effective way to integrate
subsequent
vendor releases with your own custom modifications to the vendor's
previous release.
ORGANIZATIONAL "COPING" BRANCHES
--------------------------------
Obviously, we'd rather get rid of the organizational silos/stovepipes
and all work together as one big happy collocated team. The
common credo
their is "change your organization, or change your
organization!" If you
choose to stick it out, then organizational change typically doesnt
happen overnight and an interim strategy is needed.
A common strategy hear that Ive seen QUITE effective (despite
the fact
that it adds another level of indirection) is when development has to
satisfy some separate/independent group for integration, or QA, or
(etc.). Even tho its possible to create tags/labels for these other
groups to reference (rather than the tip of the codeline), something
about the multiple rhythms/cadence of activity on the
codeline makes it
conceptually harder to deal with.
So even tho, in theory, it should be possible not to branch, when the
two groups (e.g., development and integration) have incompatible
policies or tolerances or "rhythms", it often seems to
work-out better
to give each their own codeline, and let development maintain
their own
"Active Development Line" while the other group "pulls" from that to
their own "Release Line" at their own pace/tolerance level.
MULTIPLE MAINTENANCE
=====================
Branching for multiple releases is not desirable, BUT if you must
support multiple releases, THEN branching is one of the most effective
ways to do it. Supporting multiple releases is often a business decision
mandated by a customer.
At that point, we must acknowledge that supporting the non-latest
release(s) is akin to a very big and redundant "story" that is deemed to
be of enough business value to the customer that they are willing to pay
for it. If you agree to it, I'm inclined to suggest treating the
agreement as an SLA of sorts, and being clear about expectations of how
long you will do it for a release before retiring/obsoleting it.
Perhaps the cost of doing so could somehow be estimated, and added as a
"rider" or weight factor/multiplier for each "story" that had to be
multi-integrated. Then maybe it would help dissuade the customer from
asking for such concurrent release support, or at least have them ask
for a whole lot less of it (or be more willing to take the latest
release, or be more willing to explore their reasons for why they won't
and try to work on those instead).
I nonetheless acknowledge that DualMaintenance or MultipleMaintenance
(as it is described on the Wiki) is often a business reality mandated by
business folk that technical folk don't get to second-guess. (See the
UseOneCodeline Wiki page for some more discussion.)
MULTIPLE VARIANTS
=================
Branching for multiple variants is precisely the kind of evil/waste
Case 1) Golly gee - project B is sort of like project A, let's create
our own branch of it at release "X" and "run with it from there". No
need to try to talk to or collaborate with the folks still developing
project "A".
In the immortal words of "Cool Hand Luke": What we have here is a
failure to communicate. Actually it's worse than that: it's a failure to
even TRY. I have to think that the solution for such things should be
architectural (in the code structure) rather than temporal (in the
branching structure).
The codebase is a knowledge-base. Multi-variant branching of this sort
splinters the knowledge into "thought streams" that make the "whole"
into something very much incoherent, and which takes an additional
40%-80% effort to support and maintain as a result of the added
"integration fan-out".
Case 2) Customer Fubar wants fix/enhancement 1, 2, and 3 ... but
Customer Bahfoo wants only 1, 5, and 7 ... and customer Bazoo wants 2,
3, and 7. Why won't each one accept all the fixes/enhancements? Seems to
a) Either not all of them have "value" to each one
b) or they don't trust that we won't "destabilize" what they want when
we fix/enhance something they don't value,
c) or they claim competitive advantage in the order of delivery by
getting those changes before the other changes
Here again in most of these cases I think the issue is a
failure/unwillingness to communicate/cooperate. In the case of a) and c)
it's the customers who won't cooperate and come together to speak with a
single unified voice-of-the-customer.
In the case of b) there is probably not enough communication between
customer and producer because trust has been lost. Trust needs to be
earned back - and adding a variant branch will likely have the opposite
of the desired effect.
MAINLINE
========
If you do branch, then mainlining is essential. The "Mainline" pattern
is sort of like "refactoring" for branches. It helps you minimize the
breadth and depth of the branching hierarchy and also organize it in
such a way that minimizes the size and complexity of the merging that
occurs between codelines.
CONCLUSION
==========
So I'd have to say that, as a blanket generalization, I disagree with
the proclamation that "There is only one code stream". I'd be willing to
say that "for a given project, there should be only one integration
stream" is a more feasible and more desirable ideal to strive for.
Then I'd add that we must acknowledge that any time we develop+support
multiple releases and/or multiple variants in parallel, each additional
release/variant is in actuality an additional PROJECT with all the
associated implications of added cost, effort, management, and
administration. And if Multi-Tasking the same team-member is something
to be avoided, then Multi-Project-ing the same team|codebase is an even
grander scale thing to avoid like the plague, for many of the same
reasons (scaled-up or "super-sized" :-)
SEE ALSO
========
* <http://cmwiki.bradapp.net/BranchingAndMerging>
* <http://cmwiki.bradapp.net/ContinuousIntegration>
* <http://cmwiki.bradapp.net/AgileSCMArticles>
--
Software CM Patterns (www.scmpatterns.com)
Effective Teamwork, Practical Integration
"And miles to go before I sleep" --Robert Frost

Brad Appleton

2005-05-06 06:57:10 UTC

Post by Kent Beck
The point of Single Code Base is that development is more effective if you
avoid the multiple maintenance problem. The less multiple maintenance you
have to do, the better. There are large-scale problems, like a relationship
with a customer where they won't use the latest release, where multiple
maintenance is a necessity. If you want software development to be more
effective, you should address the root cause of such problems and cope as
well as you can in the meantime. How does this compare to what you said?

I like the above much better because it names the problem (multiple
maintenance) rather than one particular technique (branching) for
solving it. This by itself helps others more quickly identify the root
cause IMHO.

What still confuses me is the usage of "codebase" -vs- "code stream". I
agree there is a single "codebase", but I think you mean to say
something stronger than "one codebase". I think you really mean "Single
Code-Stream", or (more accurately IMHO) "Single Release Stream" (feel
free to say "Codeline" or "Release-Line" if you prefer the term "line"
over "stream" :-)

If it were me, I would prefer to see "Single Release Stream" because I
feel it rules out multiple concurrent releases and variants without
excluding "transient" branches that all end up flowing back into the
sole release-stream.

Something about this also reminds me of the traditional advice for not
rushing to parallelize a computer program (or maybe it's more similar to
the guideline that a subroutine and/or loop should have exactly one
exit-point :-)

--
Brad Appleton <brad-***@public.gmane.org> www.bradapp.net
Software CM Patterns (www.scmpatterns.com)
Effective Teamwork, Practical Integration
"And miles to go before I sleep" --Robert Frost

Andrew McDonagh

2005-05-09 10:24:55 UTC

Post by Kent Beck
There is only one code stream. You can develop in a temporary branch, but
never let it live longer than a few hours.
Multiple code streams are an enormous source of waste in software
development. I fix a defect in the currently deployed software. Then I have
to retrofit the fix to all the other deployed versions and the active
development branch. Then you find that my fix broke something you were
working on and you interrupt me to fix my fix. And on and on.
There are legitimate reasons for having multiple versions of the source
code active at one time. Sometimes, though, all that is at work is simple
expedience, a micro-optimization taken without a view to the
macro-consequences. If you have multiple code bases, put a plan in place for
reducing them gradually. You can improve the build system to create several
products from a single code base. You can move the variation into
configuration files. Whatever you have to do, improve your process until you
no longer need them.
One of my clients had seven different code bases for seven different
customers and it was costing them more than they could afford. Development
was taking far longer than it used to. Programmers were creating far more
defects than before. Programming just wasn't as fun as it had been
initially. When I pointed out the costs of the multiple code bases and the
impossibility of scaling such a practice, the client responded that they
simply couldn't afford the work of reuniting the code. I couldn't convince
the client to even try reducing from seven to six versions or adding the
next customer as a variation of one of the existing versions.
Don't make more versions of your source code. Rather than add more code
bases, fix the underlying design problem that is preventing you from running
from a single code base. If you have a legitimate reason for having multiple
versions, look at those reasons as assumptions to be challenged rather than
absolutes. It might take a while to unravel deep assumptions, but that
unraveling may open the door to the next round of improvement.

On our product, we have to maintain 'at least' one version prior to the
latest version. Shortly before we had to start our second version, I
looked at most/all of the various techniques (Brad's book is excellent
for helping here) and quickly came to the conclusion you are referring
to above - one code base is a lot easier and safer than a multitude of
them. It took some selling to management, but I pushed for and won
branching to be done during build and runtime.

We are now onto our sixth version, using the same original code base -
we can still build each of the six versions today from the same
mainline, just by setting the desired version before starting the
release build.

There is a trade-off: the 'branch'ing still exists, albeit in code
rather than SCM, and so it's just as possible to introduce bugs into one
version but not another. However, this is largely mitigated because
we can build and test each of them at the same time from a single source
code base, rather than having to use the SCM tool's branching support.

Brad Appleton

2005-05-12 05:30:12 UTC

Hi Andrew!

Post by Andrew McDonagh
On our product, we have to maintain 'at least' one version prior to the
latest version. Shortly before we had to start our second version, I
looked at most/all of the various techniques (Brad's book is excellent
for helping here) and quickly came to the conclusion you are referring
to above - one code base is a lot easier and safer than a multitude of
them. It took some selling to management, but I pushed for and won
branching to be done during build and runtime.
We are now onto our sixth version, using the same original code base -
we can still build each of the six versions today from the same
mainline, just by setting the desired version before starting the
release build.

Thanks for the story (and the praise of the book :)

Post by Andrew McDonagh
There is a trade-off: the 'branch'ing still exists, albeit in code
rather than SCM, and so it's just as possible to introduce bugs into one
version but not another. However, this is largely mitigated because
we can build and test each of them at the same time from a single source
code base, rather than having to use the SCM tool's branching support.

Can you say more about what it means for the branching to be "in code
rather than SCM"? I didn't entirely understand and want to learn more
about it.

--
Brad Appleton <brad-***@public.gmane.org> www.bradapp.net
Software CM Patterns (www.scmpatterns.com)
Effective Teamwork, Practical Integration
"And miles to go before I sleep" --Robert Frost

Andrew McDonagh

2005-05-12 09:25:57 UTC

Post by Brad Appleton
Hi Andrew!

Hi Brad,

snipped.

Post by Brad Appleton
Thanks for the story (and the praise of the book :)

My pleasure, your help in SCM matters has served me (and others) very
well indeed over the 8 years I've been a developer, starting from CCIUG
days to now.

Post by Brad Appleton

Can you say more about what it means for the branching to be "in code
rather than SCM"? I didn't entirely understand and want to learn more
about it.

Sure...

As we know, different versions of the same product typically branch at
the SCM level. By 'branch' I mean a 'variant' code base
(project-oriented branching), based upon the previous version.

Mainline for Product

0
|
1---1.1---1.2---1.3 (no longer supported by company)
|
2---2.1 (only bug fixes and minor enhancements)
|
3 (Latest version of product - all singing all dancing best ever)

The problem I've seen in other projects has been to do with having to
support 2.0 & 2.1 whilst at the same time working on the latest version
3. Bugs fixed in 2.0 have to be merged into 2.1. If they exist in 3.0,
then they are merged there if possible. However, a lot of the time it's
not possible to merge the fixes between the 2.x and 3.0 branches because
the code has moved on so much, and so the fix has to be reimplemented.

Every now and again we'll have one customer who won't upgrade to the
latest version of any of the officially supported versions (2.x), yet
still wants and gets support from the company - in these cases it's a
business discussion as to whether we fix the issue the customer has, or
don't and convince them to upgrade.

All this leads to the usual problems of fixes being applied to multiple
branch lines, and all of the associated work with those activities.

Having been burnt enough by this in the past I wanted a situation where
we could mitigate the need for duplicating the work, either through
having to merge the fixes/enhancements or simply having to re-implement
them afresh on each branch.

So how is this achievable when we still need to have the concept of
'branching' the product - as we need to support differing functionality
for the various releases?

The options I came up with are:

1) Branch within the SCM as above - nope
2) Branch at build time - using build parameters, #defines,
configuration scripts.
3) Runtime - using various patterns (Strategy, Command), config files
and behaviour registration techniques.

In our case, runtime branching essentially uses a lot of the build-time
branching support for creating/modifying any necessary config files -
although these could also be created/modified at runtime by the
product's installation tool.
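
As a minimal sketch of the build-time half of this, a release build could
stamp the desired product version into a config file that the application
reads back at startup. It is written in Python purely for illustration;
the file name `product.ini`, the `[product]` section and both helper
functions are assumptions, not details of the actual product:

```python
import configparser
from pathlib import Path


def write_product_config(version: str, out_dir: Path) -> Path:
    """Build step: record which product version this build represents."""
    config = configparser.ConfigParser()
    config["product"] = {"version": version}
    path = out_dir / "product.ini"  # hypothetical file name
    with path.open("w") as handle:
        config.write(handle)
    return path


def read_product_version(path: Path) -> tuple[int, ...]:
    """Startup step: read the version back as ints, e.g. "2.1" -> (2, 1)."""
    config = configparser.ConfigParser()
    config.read(path)
    return tuple(int(part) for part in config["product"]["version"].split("."))
```

Setting the desired version before a release build then amounts to calling
`write_product_config` with a different version string; the source code
itself stays the same.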

Within our application, there is a FeatureRegistry object that uses the
config files to determine what the product version is.
Within the app's startup code, features are registered with the
FeatureRegistry and provide a minimum product version that they are
allowed to be used with. This really is as simple as a product version
object and the feature's ID.

Then any code that would make use of that feature checks the
FeatureRegistry to see if the feature is enabled or not - the
FeatureRegistry simply checks the minimum version number against what it
thinks the product version is.
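
A minimal sketch of such a registry, in Python purely for illustration
(the thread does not say what language the product is written in). Only
the name FeatureRegistry comes from the description above; the method
names `register` and `is_enabled`, the feature IDs, and the use of tuples
as version objects are all assumptions:

```python
class FeatureRegistry:
    """Maps feature IDs to the minimum product version allowed to use them."""

    def __init__(self, product_version: tuple[int, ...]) -> None:
        # In the real product this would come from the config files.
        self._product_version = product_version
        self._minimum_versions: dict[str, tuple[int, ...]] = {}

    def register(self, feature_id: str, minimum_version: tuple[int, ...]) -> None:
        self._minimum_versions[feature_id] = minimum_version

    def is_enabled(self, feature_id: str) -> bool:
        # Assumption: a feature that was never registered counts as disabled.
        minimum = self._minimum_versions.get(feature_id)
        return minimum is not None and self._product_version >= minimum


# Startup code for a build configured as version 2.1 (hypothetical features).
registry = FeatureRegistry(product_version=(2, 1))
registry.register("basic-report", (1, 0))
registry.register("pdf-export", (3, 0))

if registry.is_enabled("pdf-export"):
    print("PDF export available")
else:
    print("PDF export hidden in this version")
```

Tuples compare element by element, so `(2, 1) >= (1, 0)` holds while
`(2, 1) >= (3, 0)` does not, which is all the version check needs here.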

Using patterns like Strategy, Command, etc means we have a nice loosely
coupled design, which can change behaviour at runtime.
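
As a rough illustration of the Strategy side of this, the calling code can
ask whether a feature is enabled once and then work with whichever
strategy object it gets back. All class names, the `banner-export`
feature ID and the `is_enabled` callable are hypothetical; in practice
the callable would be the registry's lookup:

```python
from typing import Callable, Protocol


class ExportStrategy(Protocol):
    def export(self, text: str) -> str: ...


class PlainExport:
    """Behaviour available in every product version."""

    def export(self, text: str) -> str:
        return text


class BannerExport:
    """Hypothetical newer behaviour, only allowed from a later version."""

    def export(self, text: str) -> str:
        return f"== {text} =="


def choose_export(is_enabled: Callable[[str], bool]) -> ExportStrategy:
    # The version decision is made here, once; callers never test versions.
    return BannerExport() if is_enabled("banner-export") else PlainExport()
```

The rest of the code only sees an `ExportStrategy`, so adding a version
means registering one more feature rather than touching every call site.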

At present, there is no need to dynamically change the product's version
number at runtime, but should we need to, it wouldn't be too hard a
change to implement.

It's very much like how Java can change its look and feel at runtime, or
how a lot of products nowadays can have extra features enabled just by
buying a different licence key.

I believe (although I can't substantiate) the Quake game engine used a
simpler technique when they were developing Quake 2. They did start
with a branch, then continued with the Quake 1 code and extended it and
used config files (the Maps basically). This led to quicker release
times and the ability to give Quake 1 patches for bugs they fixed as they
went along.

HTH

Andrew

Brad Appleton

2005-05-14 22:07:53 UTC

GREAT story of how you eliminated the need for version-branching by
deferring the feature-set selection logic to later binding time (run-time)!

It sounds kind of like Doug Schmidt's "Service Configurator" pattern. It
also seemed somewhat reminiscent of how some license managers work to
look up if a given feature is licensed for use by the current
user/installation (particularly when the software supports multiple
feature levels, say for eval/demo, home/personal-use, small-business,
and enterprise).

So are some of your previous development CM problems now arising as
deployment (installation configuration) CM problems? Or are they new
problems instead? I know what kind of problems your solution avoided;
I'm curious about your thoughts and experiences with the set of issues
you "traded off" for instead, how you resolve them, and how/why you feel
the trade-off is worth it.

--
Brad Appleton <brad-***@public.gmane.org> www.bradapp.net
Software CM Patterns (www.scmpatterns.com)
Effective Teamwork, Practical Integration
"And miles to go before I sleep" --Robert Frost

Andrew McDonagh

2005-05-16 13:31:33 UTC

Post by Brad Appleton
GREAT story of how you eliminated the need for version-branching by
deferring the feature-set selection logic to later binding time (run-time)!

Cheers

Post by Brad Appleton
It sounds kind of like Doug Schmidt's "Service Configurator" pattern.

Yep, or 'Service Locator', 'Dependency Injection' or Mark Grand's
'Dynamic Linkage':

http://www.mindspring.com/~mgrand/pattern_synopses.htm#Dynamic%20Linkage
http://www.martinfowler.com/articles/injection.html#ServiceLocatorVsDependencyInjection

Post by Brad Appleton
It
also seemed somewhat reminiscent of how some license managers work to
look up if a given feature is licensed for use by the current
user/installation (particularly when the software supports multiple
feature levels, say for eval/demo, home/personal-use, small-business,
and enterprise).

Yes exactly - it's remarkably flexible, yet simple.

Post by Brad Appleton
So are some of your previous development CM problems now arising as
deployment (installation configuration) CM problems?

The previous problems included:

1) bugs not fixed in all release branches.

This does not happen.

2) Having to re-implement the bug fix due to the code base moving on
further in one release branch vs earlier ones.

This does not happen, the bug is fixed once for all runtime release
branches.

3) Likelihood of introducing new bugs as a result of fixing existing bugs.

This has happened, yet because of TDD and acceptance tests, so far only
two minor bugs (that I know of) have made it into the wild in the time
we have used this approach.

Post by Brad Appleton
Or are they new
problems instead? I know what kind of problems your solution avoided, I'm
curious about your thoughts and experiences with the set of issues you
"traded-off" for instead, how you resolve them, and how/why you feel the
trade-off is worth it.

There is a new problem area introduced by working this way, mainly that
of a feature, or worse a part of it, being enabled in a release when it
should not be enabled. Like above, thanks to TDD and acceptance tests,
we have only had 3 occurrences of this (that I know of).

For me, the trade-offs were mainly about the *perceived* safety of
fixing bugs on separate release branches vs fixing them within the single
code base. I realised it was actually a perception problem first, in
that it feels natural for most people to apply the 'it ain't broken,
don't fix it' rule when working with multiple release branches. There's
a lot of history of working this way and most people I know tend to have
it as a 'comfort zone' for developing in. By moving out of that
comfort zone, we naturally encounter resistance.

Secondly, from the XP context, I felt there was an issue of 'DTSTTCPW' -
was it simpler to have late/dynamic binding than static compile-time
binding & multiple code branches?

Depending upon the team's experience of SCMs and OO patterns, one is
simpler than the other. In the end, the shared dislike of our SCM tool
actually helped motivate the team to look at different approaches to the
'usual' one of multiple branches.

Finally, the other practises of XP (test first and TDD) also helped to
reassure the team that at least we'd be able to verify that the multiple
runtime release branches and their features worked, and were or were not
enabled as intended.
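
One way such a check could look, sketched in Python under assumptions: the
feature IDs, the version list and the expected sets below are invented
for illustration and are not taken from the actual product. The idea is
to run the same enablement rule for every supported version and compare
the result against an explicit expectation, so a feature leaking into an
older release fails the build:

```python
# Hypothetical table: feature ID -> first product version allowed to use it.
MINIMUM_VERSION = {
    "basic-report": (1, 0),
    "bulk-import": (2, 0),
    "pdf-export": (3, 0),
}

# Hypothetical list of releases still built from the single code base.
SUPPORTED_VERSIONS = [(2, 0), (2, 1), (3, 0)]


def enabled_features(product_version):
    """Return the set of feature IDs enabled for the given version."""
    return {
        feature_id
        for feature_id, minimum in MINIMUM_VERSION.items()
        if product_version >= minimum
    }


def check_all_versions():
    """Fail if any supported version exposes the wrong set of features."""
    expected = {
        (2, 0): {"basic-report", "bulk-import"},
        (2, 1): {"basic-report", "bulk-import"},
        (3, 0): {"basic-report", "bulk-import", "pdf-export"},
    }
    for version in SUPPORTED_VERSIONS:
        assert enabled_features(version) == expected[version], version


check_all_versions()
```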

There should be plenty of scope here for a new pattern in the next
edition of your SCM book ;-)

regards

Andrew