[p4] Perforce server lockup problem

Erik Johnson erik at valvesoftware.com
Tue Mar 15 13:27:11 PST 2005

We built a test case to reproduce the problem and set up Filemon to watch
the disk.
We created a brand new branch from scratch off our main codeline, and
then integrated it back in. This operation took about 10
minutes, and during that time all other clients on the network were
essentially locked out of Perforce.
Scanning through the filemon logs, approximately 7 minutes of time is
spent doing this:
314822 11:21:43 AM p4s.exe:3736 READ U:\p4root\db.integed SUCCESS Offset: 2195996672 Length: 8192
314823 11:21:43 AM p4s.exe:3736 READ U:\p4root\db.integed SUCCESS Offset: 1690583040 Length: 8192
314824 11:21:43 AM p4s.exe:3736 READ U:\p4root\db.integed SUCCESS Offset: 1690583040 Length: 8192
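
For anyone wanting to total up a log like this rather than scan it by eye, here is a minimal Python sketch. It assumes the saved log has exactly the fields visible in the excerpt above (sequence number, time, process:pid, operation, path, result, offset, length), that paths contain no spaces, and that a record may wrap across lines as it does in this mail. The function names are my own for illustration; nothing here comes from Filemon or Perforce.

```python
import re
from collections import Counter

# Field layout is inferred from the three log lines quoted above; a real
# Filemon export may use different columns or separators.
RECORD_RE = re.compile(
    r"(?P<seq>\d+)\s+(?P<time>\d{1,2}:\d{2}:\d{2}\s+[AP]M)\s+"
    r"(?P<proc>\S+):(?P<pid>\d+)\s+(?P<op>[A-Z_]+)\s+(?P<path>\S+)\s+"
    r"(?P<result>\S+)\s+Offset:\s*(?P<offset>\d+)\s+Length:\s*(?P<length>\d+)"
)

SAMPLE = r"""
314822 11:21:43 AM p4s.exe:3736 READ U:\p4root\db.integed SUCCESS
Offset: 2195996672 Length: 8192
314823 11:21:43 AM p4s.exe:3736 READ  U:\p4root\db.integed SUCCESS
Offset: 1690583040 Length: 8192
314824 11:21:43 AM p4s.exe:3736 READ U:\p4root\db.integed SUCCESS
Offset: 1690583040 Length: 8192
"""


def parse_records(text):
    """Return one dict per log record, tolerating records wrapped over lines."""
    # Collapse all whitespace so a wrapped record becomes one logical line.
    joined = " ".join(text.split())
    return [m.groupdict() for m in RECORD_RE.finditer(joined)]


def summarize_reads(records):
    """Count reads and bytes per file, and find offsets read more than once."""
    reads_per_file = Counter()
    bytes_per_file = Counter()
    reads_per_offset = Counter()
    for rec in records:
        if rec["op"] != "READ":
            continue
        reads_per_file[rec["path"]] += 1
        bytes_per_file[rec["path"]] += int(rec["length"])
        reads_per_offset[(rec["path"], int(rec["offset"]))] += 1
    repeated = {key: n for key, n in reads_per_offset.items() if n > 1}
    return reads_per_file, bytes_per_file, repeated


if __name__ == "__main__":
    reads, total_bytes, repeated = summarize_reads(parse_records(SAMPLE))
    for path, count in reads.most_common():
        print(path, count, "reads,", total_bytes[path], "bytes")
    for (path, offset), count in repeated.items():
        print("offset", offset, "in", path, "read", count, "times")
```

Run over the full capture, the per-offset counts would show whether the server is re-reading the same pages of db.integed or walking the whole file.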

From reading through the release notes, it looks like there were some
changes to how integrations are handled between our original server
version (2001.1) and the version we're currently running. I don't know
enough about the inner workings of the database to have a clear idea of
what the excessive reads of this file actually mean.
Any help much appreciated.


From: perforce-user-bounces at perforce.com
[mailto:perforce-user-bounces at perforce.com] On Behalf Of Erik Johnson
Sent: Tuesday, March 08, 2005 12:11 PM
To: Bruce McPeek; perforce-user at perforce.com
Subject: RE: [p4] Perforce server lockup problem

We've been able to reproduce this problem both with anti-virus running
and without it (we run Etrust).
I'm fairly sure it's not a bottleneck on the drive array, since the
read/write throughput we see under normal operation is much
higher than this. For what it's worth, they are four 250GB
Western Digital 7200RPM 8MB cache SATA drives, running in a RAID 10
array (mirrored and striped). We've benchmarked this setup for some of
our other business activities in identical hardware, and they don't
exhibit this behavior.
I'll take a look at Sysinternals Filemon and see what data it provides,
just to make sure it's Perforce creating the problem.
Sounds like it also makes sense to upgrade to the very latest server.
I'll let the list know what I find out.
Thanks for the info,


From: perforce-user-bounces at perforce.com
[mailto:perforce-user-bounces at perforce.com] On Behalf Of Bruce McPeek
Sent: Tuesday, March 08, 2005 9:04 AM
To: perforce-user at perforce.com
Subject: RE: [p4] Perforce server lockup problem

To me, this sounds more like a hardware issue under load. I am
especially suspicious of the SATA RAID 5.
Could you describe the hardware upgrades you mentioned? How is your SATA
RAID 5 configured? Hardware RAID or software RAID? How many drives of
what size? Even better which models. How are you doing your SATA? On
motherboard or add-on card? Are your SATA drivers native windows? Third
I need to look at how SATA interfaces with the rest of a system's I/O
again but I'm wondering if this may be your bottleneck.
I agree with the other posters about the anti-virus. If it is installed,
how is it configured with respect to what is scanned?
I just noticed you are at Valve Software. What are the typical sizes of
the files you are working with? Large binaries for games?


From: Erik Johnson [mailto:erik at valvesoftware.com] 
Sent: Monday, March 07, 2005 11:22 AM
To: perforce-user at perforce.com
Subject: [p4] Perforce server lockup problem

I'll try to give as much data on our setup as I can, along with the
problem we've been having. I haven't gotten any really crisp leads from
Perforce support on this problem. Maybe someone else has already solved it.
We're running our Perforce server on Windows Server 2003 on a machine
with 2GB RAM, a RAID 5 disk subsystem with 7200RPM SATA drives, and a
single 3.GHz HT processor. We're running server version
P4D/NTX86/2004.2/73359 (2004/12/27).
Our general workflow (and the one that tends to generate the problem) is
that we have a main branch that few people directly work on, with
personal branches off of it that individual developers integrate into.
There are roughly 20 developers integrating roughly 5,000 lines of code
a day. Unfortunately, we changed a couple of variables at once when the
problem started happening (upgraded server software, server hardware),
so I can't reasonably point to a specific root cause.
When a developer is merging from their personal branch into our main
codeline, we're seeing total database lockup, and constant reads on the
server (viewed via Windows perfmon). This means that all other users on
the system cannot sync, integrate, checkout, checkin, etc. The condition
generally takes around 15 minutes to clear, and then things all go back
to normal. CPU and RAM usage on the system are nominal, and while the
reads on the disk subsystem are constant, they are not within 1/3 of the
observed peak read throughput.
Has anyone else seen behavior like this? It appears that the database is
being read table by table for some reason.