[p4] Client-side file fragmentation on NTFS

Frank Compagner frank.compagner at guerrilla-games.com
Wed Dec 5 06:48:30 PST 2007


Hi,

We've been trying to optimize our Perforce performance, especially
for large syncs. We have a large amount of binary data in the
depot, and most people sync several gigabytes of data every day, which
can take a considerable amount of time. After some profiling, I was
surprised to find that on most of our clients, the performance of a
large sync is currently limited by the client-side disk. Further, the
sync performance decreases considerably on a disk with even moderate
levels of file fragmentation. And most disks over here are pretty
heavily fragmented, because the way that p4/p4v/etc. writes files to
disk on NTFS easily results in lots of fragmentation.

The Perforce client receives sync data from the server in 4KB blocks
and writes these to disk immediately. That sounds like a good idea, but
this way NTFS doesn't know how big the file is going to be, so it just
looks for a free space of at least 4KB and puts the block there. This
easily results in lots of file fragmentation, with two obvious
disadvantages: first, the sync itself is slower, and second, working
with the file once it's been synced is going to be slower than it
needs to be.

This isn't an issue if all you have in Perforce is source code, but we
have a very large amount of big binary files in our depot, and this is
noticeably slowing us down. I've done some measurements to determine the
performance of a sync of a representative sample of our binary files:

Unfragmented:        179 Mbit/sec
Defragmented:        153 Mbit/sec
Fragmented:          130 Mbit/sec
Heavily fragmented:   81 Mbit/sec

These numbers vary quite a bit from machine to machine, and even from
time to time, so these are averages of tests run multiple times on about
5 different machines. The categories are loosely described as:
"Unfragmented" means a brand new clean drive,
"Defragmented" is a heavily fragmented drive subsequently defragmented
(by mstDefrag),
"Fragmented" is a drive that has seen normal use (but not had Perforce
data on it),
"Heavily fragmented" is a good description of most of the workstations
in our office ;-). They've seen lots of Perforce traffic, and have been
filled almost to capacity on a number of occasions (average file size
~1 MB, average #fragments/file ~8).
I used my Perforce profiling script (available at:
http://public.perforce.com:8080/guest/frank_compagner/ ) to verify
that the client disk was indeed the bottleneck during all tests.
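
To put the four measurements above in relative terms, here is a small
Python sketch. The rates are the ones from the table; the 5 GB sync size
is only an assumed stand-in for "several gigabytes", and the helper names
are made up for illustration.

```python
# Sync throughput from the measurements above, in Mbit/sec.
RATES_MBIT = {
    "Unfragmented": 179,
    "Defragmented": 153,
    "Fragmented": 130,
    "Heavily fragmented": 81,
}


def slowdown_percent(rate, baseline):
    """Throughput lost relative to the baseline, in percent."""
    return 100.0 * (baseline - rate) / baseline


def sync_minutes(gigabytes, rate_mbit):
    """Minutes needed to sync `gigabytes` (decimal GB) at `rate_mbit`."""
    megabits = gigabytes * 1000 * 8  # 1 GB = 1000 MB = 8000 Mbit
    return megabits / rate_mbit / 60


if __name__ == "__main__":
    baseline = RATES_MBIT["Unfragmented"]
    for name, rate in RATES_MBIT.items():
        # 5 GB is an assumed daily sync size, not a measured figure.
        print(f"{name:<20} {slowdown_percent(rate, baseline):5.1f}% slower, "
              f"{sync_minutes(5, rate):4.1f} min for 5 GB")
```

By this arithmetic a heavily fragmented disk is about 55% slower than a
clean one, so the assumed 5 GB sync takes roughly 8.2 minutes instead of
3.7.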

Now, you might say that the obvious solution to this problem is to run
a defragmentation tool regularly, and while that certainly is a good
idea, it is not without its own problems. Besides, if the Perforce
client tools were to write data to disk in a different manner, there
would not be much need for this. We've talked to Perforce support about
this, and we've asked them to improve the behaviour by, for instance,
making the size of the disk write buffer configurable in the client
settings. I've also written a replacement for p4 (sync operations only)
using the p4api FileSys object, to see how much the performance
would improve with various write methods. It's called p4fs (for
p4 fast sync), and here's how it compares to the normal p4 client (this
was on a number of different machines, hence the somewhat different
numbers):

Averages for:         |  p4                p4fs
----------------------+----------------------------------
Transfer rate         |  91 Mbit/sec       141 Mbit/sec
Average #frags/file   |  5.86              1.01
Total # of fragments  |  14952             2588

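Expressed as ratios, the table above works out as follows. This is just
arithmetic on the numbers already given; the variable names are mine, and
the implied file count is a rough consistency check, not a measured value.

```python
# Averages from the p4 vs. p4fs comparison above.
p4 = {"rate_mbit": 91, "frags_per_file": 5.86, "total_frags": 14952}
p4fs = {"rate_mbit": 141, "frags_per_file": 1.01, "total_frags": 2588}

# How much faster p4fs transferred data than p4.
speedup = p4fs["rate_mbit"] / p4["rate_mbit"]

# Fraction of fragments that p4fs avoided.
frag_reduction = 1 - p4fs["total_frags"] / p4["total_frags"]

# Number of files implied by total fragments / fragments per file.
# Both tools should give about the same count if they synced the same set.
files_p4 = p4["total_frags"] / p4["frags_per_file"]
files_p4fs = p4fs["total_frags"] / p4fs["frags_per_file"]

if __name__ == "__main__":
    print(f"speed-up:            {speedup:.2f}x")
    print(f"fragments avoided:   {frag_reduction:.1%}")
    print(f"implied file count:  {files_p4:.0f} (p4), {files_p4fs:.0f} (p4fs)")
```

That is a speed-up of about 1.55x with roughly 83% fewer fragments, and
both rows imply a test set of around 2550 files.
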
As you can see, a nice improvement. Now, my question is: has anybody
else ever noticed this? If so, has anybody tried to measure the
performance impact? Have you spoken to support about this? Do you
agree that improving this behaviour is worthwhile? If so, you might
consider contacting support to register your interest in the subject,
which might get the enhancement request somewhat higher up the list.

Finally, some details on the p4fs tool: I've tried a number of
different write methods:
1 Use a large buffer to cache the incoming data and write that to
  disk in one go.
2 Do the same, but read and write in different threads simultaneously.
3 Pre-allocate the assumed disk size needed (possible since as of p4api
  2006.2 you get a stat message just before the data arrives that lists
  the file size), using the SetFileValidData() Windows API.
Method 2 results in the least fragmentation, but method 3 is slightly
faster. It might be possible to do even better using a different
combination, but the details of the NTFS allocation strategy remain
somewhat mysterious. The tool comes in two flavors: one that tries to
mimic the p4 command-line client as closely as possible (but only does
sync operations) and a dialog-based version that works well as a custom
tool in P4V/P4Win. The latter also has a progress bar, which makes
waiting for syncs a bit more tolerable. I still need to improve the
error reporting and do some more tests, but once that is done I'll put
it in the public depot.
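
For readers who want to experiment, the three write methods listed above
can be sketched in portable Python. This is not the p4fs code and does not
use the p4api: `blocks` stands for any iterable of small byte blocks as
they arrive, all function names and the 1 MB buffer size are assumptions,
and method 3 uses truncate() to set the file size up front as a stand-in
for SetFileValidData(). Whether the file system then reserves contiguous
space is up to the OS. Error handling is omitted.

```python
import queue
import threading

CHUNK_SIZE = 1024 * 1024  # assumed buffer size; the post does not give one


def coalesce(blocks, chunk_size=CHUNK_SIZE):
    """Group small incoming blocks into chunks of at least chunk_size bytes."""
    parts, size = [], 0
    for block in blocks:
        parts.append(block)
        size += len(block)
        if size >= chunk_size:
            yield b"".join(parts)
            parts, size = [], 0
    if parts:
        yield b"".join(parts)


def write_buffered(path, blocks):
    """Method 1: cache incoming data and write each full buffer in one call."""
    with open(path, "wb") as f:
        for chunk in coalesce(blocks):
            f.write(chunk)


def write_threaded(path, blocks, max_pending=4):
    """Method 2: same buffering, but receiving and writing overlap."""
    pending = queue.Queue(maxsize=max_pending)

    def writer():
        with open(path, "wb") as f:
            while (chunk := pending.get()) is not None:
                f.write(chunk)

    thread = threading.Thread(target=writer)
    thread.start()
    for chunk in coalesce(blocks):
        pending.put(chunk)  # blocks when the writer falls behind
    pending.put(None)       # sentinel: no more data
    thread.join()


def write_preallocated(path, blocks, expected_size):
    """Method 3: set the file to its expected size before writing any data."""
    with open(path, "wb") as f:
        f.truncate(expected_size)  # extend the file; position stays at 0
        written = 0
        for block in blocks:
            f.write(block)
            written += len(block)
        f.truncate(written)        # trim if the size hint was too large
```

All three take the same input, so they can be swapped in and out when
timing a sync-like workload, e.g. write_preallocated("out.bin", blocks,
size_from_stat), where size_from_stat is whatever file size the server
announced.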

----------------------------------------------------------------
Frank Compagner                                  Guerrilla Games


