[p4] Race condition with p4 counters
Stephen Vance
steve at vance.com
Thu Apr 3 20:07:47 PDT 2008
Rick --
I think I missed the nuance of it the first time around. This looks like
it should work, as long as something guarantees that each user makes
only one attempt at a time.
Steve
Rick Macdonald wrote:
> Stephen -
>
> I'll drag this out a bit more, even though my Job trick is much
> better. If there's a hole in this I'd like to know because I've done
> similar things with lock files on disk! Perhaps this is what you
> referred to as a "two-phase protocol".
>
> I'm pretty sure my sequence will work, but I can see that it wasn't
> very clear. I don't need an atomic set and test because everybody is
> creating a uniquely-named test counter as a lock, and I simply have to
> check that there is only one of these special lock counters. Once I
> own the only special lock counter, I'll have exclusive access to the
> actual counter that I want to increment.
>
> Here is some pseudo-code. Say I want to get a new value of "foocount"
> without losing a race:
>
> # I'm not showing error checking for any failed p4 commands!
> nlocks=0
> while [ "$nlocks" -ne 1 ]; do
>     p4 counter "lock$USERID" 1
>     # Count the counters whose names start with "lock".
>     nlocks=`p4 counters | egrep -c '^lock'`
>     if [ "$nlocks" -gt 1 ]; then
>         # There exists my lock counter and at least one more. I don't
>         # know who got there first, but don't guess; bail out and keep trying.
>         p4 counter -d "lock$USERID"
>         sleep 1
>     fi
> done
> # Now, I'm the only one with a counter called "lock*", so nobody else
> # can create a "lock$USERID" without finding there are two such locks.
> # Even if somebody else creates a lock$USERID after I have found that
> # nlocks=1 and I proceed, it doesn't matter. While I carry on, that
> # person will see nlocks>1 and bail out and wait until I am done.
> foocount=`p4 counter foocount`
> foocount=`expr "$foocount" + 1`
> p4 counter foocount "$foocount"
> p4 counter -d "lock$USERID"
>
> Rick
>
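The pseudo-code above can be sketched in Python roughly as follows. This is
only a sketch under assumptions: run_p4 is a hypothetical helper that runs
p4 and returns its stdout, "p4 counters" is assumed to print one
"name = value" line per counter, and max_tries is an added safety limit
that is not part of the original sequence. The helper is passed in as a
parameter so the logic can be exercised without a server.

```python
import subprocess
import time


def run_p4(*args):
    """Hypothetical helper: run p4 with the given arguments, return stdout."""
    return subprocess.run(("p4",) + args, capture_output=True,
                          text=True, check=True).stdout


def count_locks(counters_output, prefix="lock"):
    """Count counters whose name starts with prefix in `p4 counters` output."""
    return sum(1 for line in counters_output.splitlines()
               if line.startswith(prefix))


def next_value(counter, user, p4=run_p4, delay=1.0, max_tries=30):
    """Increment `counter` while holding a uniquely named lock counter."""
    my_lock = "lock" + user
    for _ in range(max_tries):
        p4("counter", my_lock, "1")
        if count_locks(p4("counters")) == 1:
            break  # mine is the only lock counter, so proceed
        # Someone else is also trying; bail out and retry later.
        p4("counter", "-d", my_lock)
        time.sleep(delay)
    else:
        raise RuntimeError("could not acquire lock")
    try:
        value = int(p4("counter", counter)) + 1
        p4("counter", counter, str(value))
        return value
    finally:
        # Release the lock even if reading or writing the counter failed.
        p4("counter", "-d", my_lock)
```

Two users who collide will both back off for the same interval, so adding
some random jitter to the delay would probably reduce repeated collisions.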
> Stephen Vance wrote:
>> Whether you're dealing with the value or the presence of the counter,
>> you will have the same race condition because there isn't an atomic
>> test-and-set operation. I seem to remember that there are some ways to
>> deal with this, but it's been a while since I've had to create my own
>> synchronization protocols. Go back to your CS texts on distributed
>> systems and check the algorithms. I think there's a two-phase
>> protocol that may do the trick for you.
>>
>> Steve
>>
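To make the missing test-and-set concrete, here is a small simulation of
one bad interleaving. It is an illustration only: an in-memory dict stands
in for the server's counters, the counter names "foolock" and "foo" are
borrowed from the script further down the thread, and the interleaving of
clients A and B is written out by hand rather than produced by real
concurrency.

```python
def simulate_lost_update():
    """Replay one bad interleaving of two clients using a check-then-set lock."""
    counters = {"foolock": 0, "foo": 5}

    # Both clients test the lock before either one sets it.
    a_sees_free = counters["foolock"] == 0
    b_sees_free = counters["foolock"] == 0

    # Both now set the lock, and both re-read it as 1, so both proceed.
    counters["foolock"] = 1  # client A
    counters["foolock"] = 1  # client B
    a_holds = a_sees_free and counters["foolock"] == 1
    b_holds = b_sees_free and counters["foolock"] == 1

    # Each reads foo, then each writes its own increment.
    a_value = counters["foo"] + 1
    b_value = counters["foo"] + 1
    counters["foo"] = a_value
    counters["foo"] = b_value
    return a_holds, b_holds, a_value, b_value, counters["foo"]


print(simulate_lost_update())  # (True, True, 6, 6, 6)
```

Both clients believe they hold the lock and both hand out the value 6, so
one increment is lost.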
>> Rick Macdonald wrote:
>>> That's the way I see it too.
>>>
>>> There is another way, but not as nice as my shower revelation this
>>> morning.
>>>
>>> 1) create a unique counter, such as "lockrickm" where "rickm" is the
>>> userid (unique).
>>> 2) "p4 counters" to list all counters
>>> 3) parse the list of counters to see if "lockrickm" is the only
>>> "lock" counter.
>>> - if so, you now have a lock against other people running the
>>> same locking code and it's safe to get and increment any (private)
>>> counter.
>>> - if not, delete "lockrickm", sleep for a second, and try again.
>>> 4) delete the "lockrickm" when done.
>>>
>>> However, this is at best five p4 executions (the Python code below
>>> is 8), and suffers from the hassles of cleaning up stale locks if
>>> interrupted.
>>>
>>> Rick
>>>
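On the stale-lock hassle: one possible way to make step 4 harder to skip is
a context manager that deletes the lock counter in a finally clause. This
sketch covers only the create and delete steps, not the check in steps 2
and 3, and the p4 argument is a hypothetical callable that runs a p4
command. It cannot help if the process is killed outright, so stale locks
would still need occasional manual cleanup.

```python
from contextlib import contextmanager


@contextmanager
def lock_counter(p4, name):
    """Create counter `name` on entry and always delete it on exit."""
    p4("counter", name, "1")
    try:
        yield name
    finally:
        # Runs on normal exit, on exceptions, and on KeyboardInterrupt.
        p4("counter", "-d", name)


# Demonstration with a recording stand-in instead of a real p4 call.
calls = []
try:
    with lock_counter(lambda *args: calls.append(args), "lockrickm"):
        raise KeyboardInterrupt  # simulate Ctrl-C while holding the lock
except KeyboardInterrupt:
    pass
print(calls[-1])  # ('counter', '-d', 'lockrickm')
```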
>>> Stephen Vance wrote:
>>>> Rick --
>>>>
>>>> I'm pretty sure it hides the race condition it's trying to avoid
>>>> with another race condition on the guard lock. Practically, it's
>>>> probably a little safer, but will still fail under heavy load.
>>>>
>>>> Steve
>>>>
>>>> Rick Macdonald wrote:
>>>>> Two years ago I submitted a suggestion to have an atomic
>>>>> "get-next-value" for counters. They replied that they are
>>>>> considering it for the future and added my name to the list of
>>>>> people who have asked for this. As far as I can see, it has not
>>>>> been done.
>>>>>
>>>>> In the shower this morning, I came up with this idea, which surely
>>>>> must be done atomically within the Perforce server:
>>>>>
>>>>> $ p4 job -o | sed -e 's/<enter description here>/Temporary job to get next Job counter and increment it./' | p4 job -i
>>>>> Job job000007 saved.
>>>>> (parse the message above to get the job number)
>>>>> $ p4 job -d job000007
>>>>>
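Parsing the "Job job000007 saved." message could look roughly like this.
The regular expression assumes the message format shown above and a job
name made of the prefix "job" followed by digits; a site with different
job naming would need to adjust it. The function name is made up for
illustration.

```python
import re

# Matches a line such as "Job job000007 saved." anywhere in the output.
JOB_SAVED = re.compile(r"^Job (job(\d+)) saved\.$", re.MULTILINE)


def parse_job_saved(output):
    """Return (job_name, number) from `p4 job -i` output, or raise ValueError."""
    match = JOB_SAVED.search(output)
    if match is None:
        raise ValueError("unexpected p4 job output: %r" % output)
    return match.group(1), int(match.group(2))


print(parse_job_saved("Job job000007 saved.\n"))  # ('job000007', 7)
```

The job name is returned along with the number so the caller can pass it
straight to "p4 job -d" afterwards.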
>>>>> I think this will work for me. I don't care about monotonically
>>>>> increasing numbers for anything that I need these counters for
>>>>> (including Jobs, because I'll use a different prefix anyway). It's
>>>>> three p4 command executions, but we'd only do this a few times a day.
>>>>>
>>>>> Has anybody found a better idea?
>>>>>
>>>>> Here is an unsupported script that Perforce sent me, but I think
>>>>> it still suffers from a race condition. Am I wrong?
>>>>>
>>>>> import os, time
>>>>>
>>>>> # wait for foolock counter to be set to zero
>>>>> # if it times out, just go ahead and grab the lock
>>>>> timeout = 4
>>>>> while int(os.popen('p4 counter foolock').read()) != 0 and timeout > 0:
>>>>>     print('attempting to get lock...')
>>>>>     timeout = timeout - 1
>>>>>     time.sleep(1)
>>>>> if timeout == 0:
>>>>>     print('breaking lock')
>>>>>
>>>>> # grab lock (close() waits for the p4 command to finish)
>>>>> os.popen('p4 counter foolock 1').close()
>>>>> if not int(os.popen('p4 counter foolock').read()) == 1:
>>>>>     raise RuntimeError('unable to obtain lock!')
>>>>>
>>>>> # get and increment foo counter
>>>>> foo = int(os.popen('p4 counter foo').read()) + 1
>>>>> os.popen('p4 counter foo %d' % foo).close()
>>>>> if not int(os.popen('p4 counter foo').read()) == foo:
>>>>>     raise RuntimeError('unable to set foo counter!')
>>>>>
>>>>> # release lock
>>>>> os.popen('p4 counter -d foolock').close()
>>>>> if not int(os.popen('p4 counter foolock').read()) == 0:
>>>>>     raise RuntimeError('unable to release lock!')
>>>>> _______________________________________________
>>>>> perforce-user mailing list - perforce-user at perforce.com
>>>>> http://maillist.perforce.com/mailman/listinfo/perforce-user
--
Stephen Vance
www.vance.com