[p4] Changelist-at-a-time VS bulk integration

David Weintraub qazwart at gmail.com
Sat Dec 9 17:37:00 PST 2006


This is rather an interesting discussion. One of the earliest true SCM
products was Sablime (aka Sable) from AT&T/Bell Labs. Its big, powerful
feature was the integration of defect tracking and version control
*AND* its ability for you to cherry pick which fixes you wanted to
include in your build.

When you did a build, you selected the baseline, then a list of fixes
(called MRs) that you wanted to include in the next baseline. For
example, you could include the fix MR1004, but not MR1003. You then
created a new baseline based upon the old baseline and the MRs that
were fixed.

Sounds great until you select something to build, and you spend a few
hours working out dependencies. I want to build MR1004 but not MR1003.
However, MR1004 is dependent upon MR1002 which is dependent upon
MR1001. Now, I have to make a decision whether to include MR1002 and
MR1001 with MR1004 or to do a build and leave out MR1004.
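
To make that dependency chase concrete, here is a minimal Python sketch.
It is not how Sablime did it; the DEPENDS_ON table and the required_mrs
function are names I made up for illustration, and the table just encodes
the MR1004 -> MR1002 -> MR1001 chain from the example above.

```python
from collections import deque

# Hypothetical dependency table: each MR maps to the MRs it directly needs.
DEPENDS_ON = {
    "MR1004": ["MR1002"],
    "MR1002": ["MR1001"],
}


def required_mrs(wanted, depends_on):
    """Return the wanted MRs plus everything they depend on, transitively."""
    needed = set()
    queue = deque(wanted)
    while queue:
        mr = queue.popleft()
        if mr in needed:
            continue  # already handled, also protects against cycles
        needed.add(mr)
        queue.extend(depends_on.get(mr, ()))
    return needed


if __name__ == "__main__":
    # Asking for MR1004 alone drags in MR1002 and MR1001.
    print(sorted(required_mrs(["MR1004"], DEPENDS_ON)))
```

Note that this only follows dependencies somebody recorded. It cannot
see a dependency that was never declared.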

Even worse is that I could end up doing a build of a combination of
source code that nobody ever wrote and thus never really unit tested.
MR1004 might not depend directly upon MR1003, but MR1003 may have
changed a file that MR1004 didn't touch (and thus there is no recorded
dependency) yet still requires.

It really hasn't surprised me that no other version control system
emulated Sablime. Cherry picking which MRs to include was just not a
very smooth operation. What works better is the changelist concept
that Perforce came up with and is emulated in ClearCase UCM. Changes
are layers on your baseline. Each layer is a new baseline. You could
pick a particular layer to build, but you couldn't skip a layer since
that can cause dependency problems.

For this reason, I personally discourage cherry picking even if
Perforce could track it. If you want to merge changelist #1004 to
another branch, you should merge all changes up to changelist #1004.
Otherwise, you could be leaving out a change that changelist #1004
depended upon. Since that configuration wasn't tested on that branch,
you could be creating a problem by skipping that change.
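
For illustration, here is a small Python sketch of the two command lines
involved. The depot paths and the function names are hypothetical, and
the revision syntax is as I understand it (@N means everything up to
changelist N, and @N,@N means only changelist N), so check it against
your server's documentation before relying on it.

```python
def integrate_up_to(src, dst, change):
    """Arguments to integrate every change up to and including `change`."""
    return ["p4", "integrate", f"{src}@{change}", dst]


def integrate_only(src, dst, change):
    """Arguments to cherry-pick just `change` (the approach discouraged above)."""
    return ["p4", "integrate", f"{src}@{change},@{change}", dst]


if __name__ == "__main__":
    src = "//depot/main/..."  # hypothetical source branch
    dst = "//depot/rel/..."   # hypothetical target branch
    print(" ".join(integrate_up_to(src, dst, 1004)))
    print(" ".join(integrate_only(src, dst, 1004)))
```

The functions only build the argument lists. Actually running them, and
the resolve and submit that follow, are left out on purpose.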

Of course, there's going to be the time that changelist #1003
introduced a bug which was fixed in changelist #1005, but that
changelist #1004 contains a change that you need to merge onto the
other branch. However, changelist #1005 introduced another change you
don't want to merge onto that other branch. Now what do you do? Merge
to changelist #1004 knowing that you're introducing a bug? Merge to
changelist #1005 and include a change that you don't want? Or, allow
cherry picking?

Oh yes, the customer is a major customer, the release is running late,
and the CEO of the corporation has taken a sudden interest in all of
this. But, it's *YOUR* decision, so it's up to you. If you say no,
we'll just tell the CEO that it's your fault that the customer didn't
get the change they wanted. No pressure...

On 12/9/06, Oren Shemesh (oshemesh) <oshemesh at cisco.com> wrote:
> A bit about our experience:
>
> We used to do whole integrations (every few weeks, with the timing
> synced to release schedule, testing milestones, and such), and have found
> it to be bad. Large integrations put a lot of pressure on the person
> resolving conflicts, leading to sub-optimal decisions. If the resolved
> branch fails testing (We test before submitting), hunting down the bad
> resolve can be very cumbersome. All this leads to the integration being
> risky (in terms of schedule), so project schedules started being
> affected - leading us to a bad place.
>
> We now use a home-brewed script that uses "integrate preview" to get a
> list of all files that need integration, then sorts the relevant
> changelists and integrates each one in order by itself. It does so
> automatically, never integrating CLs out-of-order, and it stops whenever
> there are conflicts, or it detects that a CL is not fully integrated (As
> in the case where someone integrated part of the CL manually), or a
> deletion is about to be integrated - in short, it stops when a human
> needs to do/review the resolve. When it stops, it E-mails the author of
> the original changelist, which is very scalable (It runs on a dedicated
> machine and a dedicated workspace, which everyone is allowed to access -
> but only for this purpose). It only attempts to integrate changes
> that were built and tested OK (Using our nightly build and nightly
> regression), so it integrates only tested CLs.
>
> We are very happy with this script, we use it for all our production
> branches (i.e. release branches). We wrote it in order to improve the
> scalability and traceability of our merges, and it seems to pay off.
>
> For private branches we do whole integrations, but that's OK because
> there is usually no time pressure for them - it's just part of
> development work.
>
> However, to answer the original question, I would not mix whole
> integrations and cherry-picked ones as a methodology on the same pair of
> branches. To the best of my understanding, it should work, but I have
> seen cases of weird results - specifically when indirection is involved
> (probably because the selected 'base' is not always the proper one) -
> and I don't want to be the one debugging Perforce.
>
> Regards, Oren.
>
> -----Original Message-----
> From: perforce-user-bounces at perforce.com
> [mailto:perforce-user-bounces at perforce.com] On Behalf Of Brad Holt
> Sent: Saturday, December 09, 2006 2:19 AM
> To: Ed Mack; Rick Macdonald; Perforce Users Mailing List
> Subject: Re: [p4] Changelist-at-a-time VS bulk integration
>
> Same at our end.  I just wanted to point out some plusses and minuses to
> this by-change approach...
>
> Individual changes that were done out on the branch will remain itemized
> in the revision history back in main when the changes are integrated
> down.  If all the changes in the branch are batched up at once and
> integrated down, then it can make hunting for a particular change with
> revision history or time-lapse-view trickier (it depends on the history
> and resolve of the file when it returns to main).  Not a huge deal, but
> some of my folks used to complain about it quite a bit.
>
> Another advantage would be in the automation of the 2 processes.
> By-change lends itself to simpler farming out of resolves to the folks
> that made the changes causing the conflict.  You can just have them take
> care of the integration themselves if you like.  If all your conflicts
> are sitting in a single changelist, then you are stuck doing them since
> they are in your workspace, and part of a larger changelist.  So you
> gotta have someone come sit at your box (tricky when they are on the
> other side of the globe) to do their bit or somehow farm it out
> manually.
>
> A disadvantage to by-change is that all the branch changes do not get
> introduced atomically.  They will rather dribble in as they get
> resolved.  If there are dependent changes, then this can cause problems.
> We have had cases where an integration may have a few hundred conflicts
> that need resolving which will take some time to iron out.  If this were
> done as a mass integrate, then it may take a long time, but when it was
> finished, it could all be brought back and would conform to a build,
> label or some other benchmark on the branch which could then be tested
> against.  Integration by change cannot really achieve this without some
> additional policies, locking, protections, or maybe better automation
> than I cobbled.
>
> So there are goods and bads either way.  Generally as the p4 guy stuck with
> doing the integrations and resolves, I prefer by-change.  Others may
> prefer the mass integration.
>
>
> -----Original Message-----
> From: perforce-user-bounces at perforce.com
> [mailto:perforce-user-bounces at perforce.com] On Behalf Of Ed Mack
> Sent: Friday, December 08, 2006 2:07 PM
> To: Rick Macdonald; Perforce Users Mailing List
> Subject: Re: [p4] Changelist-at-a-time VS bulk integration
>
> Our CM team used to do the mass merges, but now we integrate by
> changelist almost exclusively.  We have a script that calls the
> (undocumented) "interchanges" command to spit out the unintegrated
> changes (using a list of branchspecs).  The script parses the output and
> formats email notifications to the changelist owners of their list of
> changelists that need integrating.
>
> This works very well for us.
>
> Ed
>
> > -----Original Message-----
> > From: perforce-user-bounces at perforce.com
> > [mailto:perforce-user-bounces at perforce.com] On Behalf Of Rick
> > Macdonald
> > Sent: Friday, December 08, 2006 12:25 PM
> > To: Perforce Users Mailing List
> > Subject: [p4] Changelist-at-a-time VS bulk integration
> >
> > The current thread about Cherry-picking reminds me that I
> > never heard or
> > thought about integrating one changelist at a time. I just do
> > everything
> > at once and hope for the best. Mostly it's fine. Things can get
> > complicated, but it's always been doable.
> >
> > I first came across the one-at-a-time idea in Laura's book. I
> > won't give
> > an example; I'll assume folks here know what I'm referring to.
> >
> > Any comments about this? Does anybody do it always? Just when
> > you think
> > you might need to? Never?
> >
> > ...RickM...
> > _______________________________________________
> > perforce-user mailing list  -  perforce-user at perforce.com
> > http://maillist.perforce.com/mailman/listinfo/perforce-user
> >
>
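
The in-order, stop-when-a-human-is-needed loop that Oren describes in
the quoted message could be sketched roughly like this in Python. This
is not his script. The Changelist fields are hypothetical flags standing
in for what a real script would have to learn from Perforce and from the
nightly build, and treating an untested changelist as a stopping point
is my assumption, made so that nothing is ever integrated out of order.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Changelist:
    number: int
    author: str
    tested_ok: bool = True              # passed nightly build and regression
    has_conflict: bool = False          # resolve would need a human
    partially_integrated: bool = False  # someone integrated part of it by hand
    has_deletion: bool = False          # a deletion is about to be integrated


def stop_reason(cl: Changelist) -> Optional[str]:
    """Return why a human must step in, or None if the CL can go through."""
    if not cl.tested_ok:
        return "is not yet built and tested"
    if cl.partially_integrated:
        return "is already partly integrated"
    if cl.has_deletion:
        return "contains a deletion"
    if cl.has_conflict:
        return "has conflicts"
    return None


def integrate_in_order(pending: List[Changelist]) -> Tuple[List[int], Optional[str]]:
    """Integrate CLs one at a time in numeric order, stopping at the first problem."""
    done: List[int] = []
    for cl in sorted(pending, key=lambda c: c.number):
        reason = stop_reason(cl)
        if reason is not None:
            # A real script would e-mail the author here instead.
            return done, f"CL {cl.number} {reason}; notify {cl.author}"
        done.append(cl.number)  # stand-in for the real integrate and submit
    return done, None
```

For example, with pending changelists 1001, 1002 (with conflicts), and
1003, the loop integrates 1001, stops at 1002, and never touches 1003
until 1002 has been dealt with.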


-- 
--
David Weintraub
qazwart at gmail.com

