[p4] What actually happens during checkpoint

Lee Marzke lee at marzke.net
Thu Mar 5 09:41:27 PST 2009


The journal writes transactions in the same order as they occurred,
just a little slower than
writes to the actual DB files.

The checkpoint is just a consistent consolidation of all the separate DB
files into a single file. A consistent
state of the DB is flushed out to the checkpoint, to ensure a
consistent backup.

The SAN shouldn't be a bottleneck if it only contains your archives.
As Perforce recommends, the
actual db.* files should be on a local disk, or a local SAN.

The two most likely causes are the performance of your SAN, or resources
for compression.
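
As a rough sanity check on the numbers in the original message (104G
database, 9 hour checkpoint, 10 hour restore), here is a minimal sketch of
the effective throughput. It assumes "G" means GiB and that the whole
database is read once during a checkpoint and written once during a
restore; both are assumptions, not something stated in the thread.

```python
def throughput_mib_per_s(size_gib: float, hours: float) -> float:
    """Average throughput in MiB/s for moving size_gib in the given time."""
    return size_gib * 1024 / (hours * 3600)


# Figures quoted in the original message (units assumed to be GiB).
checkpoint_rate = throughput_mib_per_s(104, 9)   # db read during checkpoint
restore_rate = throughput_mib_per_s(104, 10)     # db written during restore

print(f"checkpoint: {checkpoint_rate:.1f} MiB/s")
print(f"restore:    {restore_rate:.1f} MiB/s")
```

Both come out at roughly 3 MiB/s, well below what a single local disk
can sustain for sequential I/O, which would be consistent with the time
going into latency or random I/O rather than raw bandwidth.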

I'd try two things on a backup p4 server. (I assume with this large an
installation you have a
cold spare backup p4d server available, with a backup license copy from
Perforce.)

- If the db.* files are on the SAN, run a checkpoint on your backup p4d
  with both the db files and the checkpoint on local disk.
- Run a checkpoint with compression turned off.
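
The two experiments above could be timed with something like the
following. This is only a sketch: the helper names and any paths you pass
in are hypothetical, and it assumes `p4d -jd` (dump a checkpoint without
rotating the journal) and `-z` (compress) behave as described in the
Perforce documentation. Check the flags against your server version, and
run it on the spare server rather than production.

```python
import subprocess
import time
from typing import Dict, List


def checkpoint_argv(root: str, out_file: str, compress: bool) -> List[str]:
    """Build a p4d command line that dumps a checkpoint to out_file.

    Assumes -jd writes a checkpoint without rotating the journal and
    -z enables compression.
    """
    argv = ["p4d", "-r", root]
    if compress:
        argv.append("-z")
    argv += ["-jd", out_file]
    return argv


def time_command(argv: List[str]) -> float:
    """Run a command to completion and return elapsed wall-clock seconds."""
    start = time.monotonic()
    subprocess.run(argv, check=True)
    return time.monotonic() - start


def run_experiments(root: str, out_dir: str) -> Dict[str, float]:
    """Time one uncompressed and one compressed checkpoint of the same root."""
    results = {}
    for label, compress in (("uncompressed", False), ("compressed", True)):
        out_file = f"{out_dir}/test-{label}.ckp"
        results[label] = time_command(checkpoint_argv(root, out_file, compress))
    return results
```

Calling `run_experiments` once with the db.* files on the SAN and once
with them on local disk gives four timings to compare. If the compressed
and uncompressed runs take about the same time, compression is probably
not the bottleneck.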

There is a white paper from NetApp from a few years ago about putting
db.* files on a NetApp
and doing off-line checkpoints from a snapshot. That can reduce your
checkpoint to a few
minutes, but it's much more complicated.


Lee Marzke
4aero.com -  Perforce Certified Consulting Partner


Lee Marzke, lee at marzke.net   http://marzke.net/lee/
IT Consultant, VMware, SAN storage, infrastructure, Perforce consulting
+1 800-393-5217  office         +1 484-348-2230               fax
+1 610-564-4932  cell           sip://8003935217@4aero.com    VOIP

Jamie.Echlin at barclayscapital.com wrote:
> I should have mentioned we're using ext3, and it appears mounting it
> with data=writeback would help, and XFS help even more. But even so, I'm
> currently interested in why this process seems to have degraded so much
> over the last 6 months (checkpoint used to take about an hour), and for
> personal curiosity I'm interested in the mechanics of a checkpoint.
> Cheers, jamie 
>> -----Original Message-----
>> From: perforce-user-bounces at perforce.com 
>> [mailto:perforce-user-bounces at perforce.com] On Behalf Of 
>> Echlin, Jamie: IT (LDN)
>> Sent: 05 March 2009 14:26
>> To: perforce-user at perforce.com
>> Subject: [p4] What actually happens during checkpoint
>> Can anyone help me understand what happens during a checkpoint?
>> Background is, the checkpoint procedure has got incredibly 
>> slow, about 9 hours to produce a 4.5 G compressed checkpoint 
>> from a 104G database.
>> We have some problems with our SAN. The LUNS that make up the 
>> data volume are split over two different array controllers, 
>> one of them in a different data centre. However, this 
>> shouldn't cause problems in itself, it's just not "best 
>> practice", or so I've been told.
>> My theory about the performance then was that the checkpoint 
>> would write the transactions in the same order that they were 
>> originally done, causing reads over all the different tables, 
>> somehow exacerbating a disk problem. Also we don't have 
>> enough main memory, only 32G. However, reading more, 
>> checkpoints are not the same as journals, the transactions 
>> are not in the same order. But a checkpoint is also not just 
>> a compressed copy of all the tables, if that was true there 
>> was no way it could take so long. So what gives? ;-)
>> A restore of a checkpoint takes 10 hours, so not much 
>> difference between reads and writes. Just to rule out 
>> decompressing being the bottleneck, it's only using a couple 
>> per cent of CPU.
>> Cheers, jamie
