[p4] Managing diskspace usage of Perforce Proxy
Frank Compagner
frank.compagner at guerrilla-games.com
Fri May 25 00:53:12 PDT 2007
That sounds like a pretty good plan, simpler and more robust than my
ideas. It is probably a bit harder to do cross-platform (we now have
both Windows and Linux proxies), and I'm interested in seeing how long
the script will take to run. But it's not too hard to do: I have a
first version ready, so I'll give it a try shortly and report back here.
If it works well, I'll try to clean it up and make it generic enough
to put in the public depot.
----------------------------------------------------------------
Frank Compagner Guerrilla Games
JG> It occurred to me that one could potentially make a script like this quite
JG> robust - enough to share with all perforce users by putting it in the public
JG> depot (although it would have platform dependencies of course). I don't
JG> have the time to write the script now (I still have to get around to my
JG> p4tar portability improvements!), nor do I have a good environment to test
JG> it, but I can imagine it could operate something like this:
JG>   Check disk_usage against disk_threshold
JG>   If exceeded, gather the atime and size for all files in the proxy cache
JG>   Sort the files by atime
JG>   Calculate the size of files that you need to remove:
JG>     disk_remove = disk_usage - disk_removal_threshold
JG>     (disk_removal_threshold might be smaller than disk_threshold so the
JG>     next new revision cached doesn't immediately put you over your
JG>     threshold again)
JG>   While (disk_remove > 0)
JG>     disk_remove -= sizeof oldest file
JG>     remove oldest file
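
The first approach quoted above can be sketched in Python. This is only an
illustrative sketch, not the script discussed in this thread: the function
names, the dry_run flag, and the choice to measure disk_usage as the total
size of the files under the cache root (rather than asking the filesystem)
are all assumptions made for the example.

```python
import os


def collect_files(cache_root):
    """Return a list of (atime, size, path) for every file under cache_root."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(cache_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # the file vanished while we were scanning
            entries.append((st.st_atime, st.st_size, path))
    return entries


def evict_oldest(cache_root, disk_threshold, disk_removal_threshold, dry_run=True):
    """Remove least recently accessed files once usage exceeds disk_threshold.

    Files are removed oldest-atime first until the total size is at or below
    disk_removal_threshold. Sizes are in bytes. Returns the selected paths.
    """
    entries = collect_files(cache_root)
    # Assumption: "disk usage" is the summed size of the cached files.
    disk_usage = sum(size for _atime, size, _path in entries)
    if disk_usage <= disk_threshold:
        return []

    disk_remove = disk_usage - disk_removal_threshold
    removed = []
    for _atime, size, path in sorted(entries):  # oldest atime first
        if disk_remove <= 0:
            break
        if not dry_run:
            try:
                os.remove(path)
            except OSError:
                continue  # could not remove it; try the next file
        disk_remove -= size
        removed.append(path)
    return removed
```

With dry_run left at its default of True, the function only reports which
files it would remove, which is a cheap way to check the selection (and how
long the scan takes) before letting it delete anything.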
JG> If you wanted something simpler that accomplishes nearly the same goal,
JG> you could try this:
JG>   Set atime_threshold to something reasonable (7 days might be a good
JG>   starting point)
JG>   While disk_usage is above disk_threshold
JG>     Run a find -atime atime_threshold on your proxy data and remove
JG>     anything older
JG>     Reduce atime_threshold by some amount (subtract off a day, halve
JG>     it, whatever makes sense)
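
The simpler loop quoted above could look roughly like this in Python, which
avoids depending on find and so should behave the same on Windows and Linux.
Again this is a sketch under assumptions, not the script from this thread:
the helper names, the halving step, the one-hour floor that guarantees the
loop terminates, and the `now` parameter (there to make testing easier) are
all made up for the example. The age test compares st_atime against a cutoff
in seconds, which is only roughly what find -atime does.

```python
import os
import time

MIN_AGE_DAYS = 1.0 / 24  # assumed floor: never go below a one-hour threshold


def cache_size(cache_root):
    """Total size in bytes of all files under cache_root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(cache_root):
        for name in filenames:
            try:
                total += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                pass  # file vanished while scanning
    return total


def remove_older_than(cache_root, max_age_days, now=None):
    """Delete files not accessed within max_age_days; return how many."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = 0
    for dirpath, _dirnames, filenames in os.walk(cache_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.stat(path).st_atime < cutoff:
                    os.remove(path)
                    removed += 1
            except OSError:
                pass  # skip files we cannot stat or remove
    return removed


def trim_by_age(cache_root, disk_threshold, atime_threshold_days=7.0, now=None):
    """Halve the age threshold until the cache fits under disk_threshold."""
    removed = 0
    age = atime_threshold_days
    while cache_size(cache_root) > disk_threshold and age >= MIN_AGE_DAYS:
        removed += remove_older_than(cache_root, age, now)
        age /= 2
    return removed
```

Both helpers rescan the whole cache on every pass, so this trades some run
time for simplicity. It also relies on the filesystem actually recording
access times, as asked earlier in the thread.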
JG> These approaches might not help as much as one would like for files that are
JG> stored in delta format in a single file, because newly fetched revisions
JG> would keep growing the single archive and updating its atime. Usually,
JG> though, it's the large binary files that become the burden on a p4p server.
JG> It would be really nice if the Perforce server could cache all file
JG> revisions on the proxy using the standard binary method of a compressed file
JG> per revision, in which case this should work out beautifully for all files.
JG> That feature could also potentially speed up transfers to clients,
JG> because newer versions of Perforce will transfer the compressed file across
JG> the wire and decompress it on the client.
JG> j
JG> -----Original Message-----
JG> From: Jeff Grills [mailto:jgrills at drivensnow.org]
JG> Sent: Wednesday, May 23, 2007 8:29 AM
JG> To: 'Frank Compagner'; 'Perforce Users'
JG> Subject: RE: [p4] Managing diskspace usage of Perforce Proxy
JG> Are you running on top of a filesystem which records access time for files?
JG> If so, you could run a cron job which removes files that haven't been
JG> accessed in N days. If you're on a Unix style OS for your proxies, "find
JG> -atime" will be useful.
JG> j
JG> -----Original Message-----
JG> From: perforce-user-bounces at perforce.com
JG> [mailto:perforce-user-bounces at perforce.com] On Behalf Of Frank Compagner
JG> Sent: Monday, May 21, 2007 5:04 PM
JG> To: Perforce Users
JG> Subject: [p4] Managing diskspace usage of Perforce Proxy
JG> Hi,
JG> we've got a problem with the disk space usage of our proxy servers, and I'd like
JG> to know if anybody else has had to deal with this before, and if so, what
JG> they did about it. Let me first explain the problem:
JG> We have a central Perforce server, serving some 150 users and about 20 build
JG> machines. Depot size is about 2TB. To increase performance and server
JG> stability, we recently deployed 4 Perforce Proxies throughout the building,
JG> 3 serving some 50 users and the final one for the build machines. This has
JG> worked well: performance and stability have increased noticeably.
JG> However, the proxy servers have a disk capacity of about 200GB, which fills
JG> up in about a week. So far our admins are manually deleting the entire
JG> contents of the proxy servers at the end of each week, after which they start
JG> to fill up again. This is too messy and high-maintenance, and I'm looking
JG> for a better alternative. Here are some possible solutions I've come up with:
JG> - As the bulk of the data is in large binary compressed files, where most
JG> people only want the head revision, I could run a script that will walk
JG> through the proxy cache every night, and from every ,d directory delete all
JG> files but the most recent one.
JG> - Not all branches/projects are equal; some are very heavily used by many
JG> people, others are only used by a few. I could make a shortlist of
JG> "privileged" parts of the depot, and if disk space becomes low, zap
JG> everything in the proxy cache that comes from outside those parts.
JG> - If, after doing the above, disk space is still low, go through the entire
JG> proxy cache and randomly delete (say) 50% of the files.
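
As an illustration of the first idea in the list above, here is a minimal
Python sketch that walks the cache and keeps only one file in each ,d
directory. It is not the script from this thread. The function name and the
dry_run flag are invented for the example, and it assumes that "most recent"
can be judged by modification time, which may not match revision order in a
real proxy cache.

```python
import os


def prune_old_revisions(cache_root, dry_run=True):
    """In each directory whose name ends in ',d', keep only the newest file.

    "Newest" means largest mtime (an assumption, see above). Returns the
    paths selected for deletion; they are deleted only if dry_run is False.
    """
    doomed = []
    for dirpath, _dirnames, filenames in os.walk(cache_root):
        if not os.path.basename(dirpath).endswith(",d"):
            continue  # not a per-revision directory
        if len(filenames) < 2:
            continue  # nothing to prune
        paths = [os.path.join(dirpath, name) for name in filenames]
        newest = max(paths, key=os.path.getmtime)
        for path in paths:
            if path != newest:
                doomed.append(path)
                if not dry_run:
                    os.remove(path)
    return doomed
```

Running it first with dry_run=True gives a list of what would go, which can
be compared against the cache by hand before trusting the mtime assumption.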
JG> It won't be hard to put all of this in a python script and run it every
JG> night, but it all feels rather ad-hoc, so I was wondering if anybody had
JG> some better ideas.
JG> Let me know what you think,
JG> Frank.
JG> ----------------------------------------------------------------
JG> Frank Compagner Guerrilla Games
JG> _______________________________________________
JG> perforce-user mailing list - perforce-user at perforce.com
JG> http://maillist.perforce.com/mailman/listinfo/perforce-user