[p4] Race condition with p4 counters

Rick Macdonald rickmacd at shaw.ca
Thu Apr 3 19:20:46 PDT 2008


Stephen -

I'll drag this out a bit more, even though my Job trick is much better. 
If there's a hole in this I'd like to know because I've done similar 
things with lock files on disk! Perhaps this is what you referred to as 
a "two-phase protocol".
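
For reference, the Job trick amounts to creating a throwaway job, reading the
job name out of the "Job job000007 saved." message, and deleting the job again.
A rough Python sketch follows, untested against a live server. It assumes p4 is
on the PATH and that the save message always has that form. The names
parse_job_name, job_number and next_job_number are made up for illustration
and are not part of any Perforce API.

```python
import re
import subprocess

# Placeholder text that appears in the blank form from "p4 job -o".
PLACEHOLDER = "<enter description here>"


def parse_job_name(message):
    """Pull the job name out of a message like 'Job job000007 saved.'"""
    match = re.search(r"^Job (\S+) saved\.", message, re.MULTILINE)
    if match is None:
        raise ValueError("unexpected p4 output: %r" % message)
    return match.group(1)


def job_number(job_name):
    """Return the trailing digits of a job name as an integer."""
    match = re.search(r"(\d+)$", job_name)
    if match is None:
        raise ValueError("no trailing number in %r" % job_name)
    return int(match.group(1))


def run_p4(args, stdin_text=None):
    """Run one p4 command and return its stdout (assumes p4 is on PATH)."""
    result = subprocess.run(["p4"] + list(args), input=stdin_text,
                            capture_output=True, text=True, check=True)
    return result.stdout


def next_job_number(
        description="Temporary job to get next Job counter and increment it."):
    """Create a temporary job, note its number, then delete the job."""
    spec = run_p4(["job", "-o"]).replace(PLACEHOLDER, description)
    name = parse_job_name(run_p4(["job", "-i"], stdin_text=spec))
    run_p4(["job", "-d", name])
    return job_number(name)
```

Only the two parsing helpers can be checked without a server. next_job_number
is the same three p4 executions as the shell version.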

I'm pretty sure my sequence will work, but I can see that it wasn't very 
clear.  I don't need an atomic set and test because everybody is 
creating a uniquely-named test counter as a lock, and I simply have to 
check that there is only one of these special lock counters. Once I own 
the only special lock counter, I'll have exclusive access to the actual 
counter that I want to increment.

Here is some pseudo-code. Say I want to get a new value of "foocount" 
without losing a race:

# I'm not showing error checking for any failed p4 commands!
nlocks=0
while [ "$nlocks" -ne 1 ]; do
   p4 counter "lock$USERID" 1
   # Count counters whose name starts with "lock".
   nlocks=$(p4 counters | grep -c '^lock')
   if [ "$nlocks" -gt 1 ]; then
       # My lock counter exists along with at least one more. I don't
       # know who got there first, but don't guess; bail out and keep trying.
       p4 counter -d "lock$USERID"
       sleep 1
   fi
done
# Now I'm the only one with a counter called "lock*", so nobody else
# can create a "lock$USERID" without finding there are two such locks.
# Even if somebody else creates their lock counter after I have found that
# nlocks=1 and I proceed, it doesn't matter. While I carry on, that person
# will see nlocks>1 and bail out and wait until I am done.
foocount=$(( $(p4 counter foocount) + 1 ))
p4 counter foocount "$foocount"
p4 counter -d "lock$USERID"
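
The same sequence can be sketched in Python, which makes it easier to try out
against a fake server. This is only a sketch under some assumptions: p4 is on
the PATH, "p4 counters" prints one "name = value" line per counter, and no
unrelated counter has a name starting with "lock". The function names and the
max_tries parameter are invented for illustration.

```python
import subprocess
import time


def run_p4(*args):
    """Run one p4 command and return its stdout (assumes p4 is on PATH)."""
    return subprocess.run(["p4"] + list(args), capture_output=True,
                          text=True, check=True).stdout


def count_locks(counters_output, prefix="lock"):
    """Count counters whose name starts with prefix.

    Assumes each line of 'p4 counters' output looks like 'name = value'.
    """
    return sum(1 for line in counters_output.splitlines()
               if line.startswith(prefix))


def next_value(counter, userid, p4=run_p4, sleep=time.sleep, max_tries=30):
    """Increment counter while holding the only lock counter; return new value."""
    my_lock = "lock" + userid
    for _ in range(max_tries):
        p4("counter", my_lock, "1")
        if count_locks(p4("counters")) == 1:
            break  # mine is the only lock counter, so I own the lock
        # At least one other lock counter exists: back off and try again.
        p4("counter", "-d", my_lock)
        sleep(1)
    else:
        raise RuntimeError("could not get the lock after %d tries" % max_tries)
    try:
        value = int(p4("counter", counter)) + 1
        p4("counter", counter, str(value))
        return value
    finally:
        # Always remove the lock counter, even if the increment failed.
        p4("counter", "-d", my_lock)
```

A call would look like next_value("foocount", "rickm"). The fixed one-second
sleep is kept from the pseudo-code. Two clients retrying in lockstep could keep
colliding, so a bit of random jitter may be worth adding. A lock left behind by
a killed process still has to be cleaned up by hand.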

Rick

Stephen Vance wrote:
> Whether you're dealing with the value or the presence of the counter, 
> you will have the same race condition because there isn't an atomic 
> test_set operation. I seem to remember that there are some ways to 
> deal with this, but it's been a while since I've had to create my own 
> synchronization protocols. Go back to your CS texts on distributed 
> systems and check the algorithms. I think there's a two-phase protocol 
> that may do the trick for you.
>
> Steve
>
> Rick Macdonald wrote:
>> That's the way I see it too.
>>
>> There is another way, but not as nice as my shower revelation this 
>> morning.
>>
>> 1) create a unique counter, such as "lockrickm" where "rickm" is the 
>> userid (unique).
>> 2) "p4 counters" to list all counters
>> 3) parse the list of counters to see if "lockrickm" is the only 
>> "lock" counter.
>>      - if so, you now have a lock against other people running the 
>> same locking code and it's safe to get and increment any (private) 
>> counter.
>>      - if not, delete "lockrickm", sleep for a second, and try again.
>> 4) delete the "lockrickm" when done.
>>
>> However, this is at best five p4 executions (the python code below is 
>> 8), and suffers from the hassles of cleaning up stale locks if 
>> interrupted.
>>
>> Rick
>>
>> Stephen Vance wrote:
>>> Rick --
>>>
>>> I'm pretty sure it hides the race condition it's trying to avoid 
>>> with another race condition on the guard lock. Practically, it's 
>>> probably a little safer, but will still fail under heavy load.
>>>
>>> Steve
>>>
>>> Rick Macdonald wrote:
>>>> Two years ago I submitted a suggestion to have an atomic 
>>>> "get-next-value" for counters. They replied that they are 
>>>> considering it for the future and added my name to the list of 
>>>> people who have asked for this. As far as I can see, it has not 
>>>> been done.
>>>>
>>>> In the shower this morning, I came up with this idea, which surely 
>>>> must be done atomically within the Perforce server:
>>>>
>>>> $ p4 job -o | sed -e 's/<enter description here>/Temporary job to get next Job counter and increment it./' | p4 job -i
>>>> Job job000007 saved.
>>>> (parse the message above to get the job number)
>>>> $ p4 job -d job000007
>>>>
>>>> I think this will work for me. I don't care about monotonically 
>>>> increasing numbers for anything that I need these counters for 
>>>> (including Jobs, because I'll use a different prefix anyway). It's 
>>>> three p4 command executions, but we'd only do this a few times a day.
>>>>
>>>> Has anybody found a better idea?
>>>>
>>>> Here is an unsupported script that Perforce sent me, but I think it 
>>>> still suffers from a race condition. Am I wrong?
>>>>
>>>> import os, time
>>>>
>>>> # wait for foolock counter to be set to zero
>>>> # if it times out, just go ahead and grab the lock
>>>> timeout = 4
>>>> while int(os.popen('p4 counter foolock').read()) != 0 and timeout > 0:
>>>>     print('attempting to get lock...')
>>>>     timeout = timeout - 1
>>>>     time.sleep(1)
>>>> if timeout == 0: print('breaking lock')
>>>>
>>>> # grab lock
>>>> os.popen('p4 counter foolock 1')
>>>> if not int(os.popen('p4 counter foolock').read()) == 1:
>>>>     raise RuntimeError('unable to obtain lock!')
>>>>
>>>> # get and increment foo counter
>>>> foo = int(os.popen('p4 counter foo').read()) + 1
>>>> os.popen('p4 counter foo %d' % foo)
>>>> if not int(os.popen('p4 counter foo').read()) == foo:
>>>>     raise RuntimeError('unable to set foo counter!')
>>>>
>>>> # release lock
>>>> os.popen('p4 counter -d foolock')
>>>> if not int(os.popen('p4 counter foolock').read()) == 0:
>>>>     raise RuntimeError('unable to release lock!')
>>>> _______________________________________________
>>>> perforce-user mailing list  -  perforce-user at perforce.com
>>>> http://maillist.perforce.com/mailman/listinfo/perforce-user