[p4] Managing diskspace usage of Perforce Proxy
Frank Compagner
frank.compagner at guerrilla-games.com
Fri May 25 01:55:35 PDT 2007
Thanks,
that looks like it does most of what I want. Installing Cygwin on the
Windows proxies is a bit of a hassle, though; I would prefer something
that works natively on both Linux and Windows. I also want to add some
more stuff (email reports, for instance), and as my Perl skills are
very limited, I'm afraid I'm still going to reinvent the wheel, in
Python this time. Still, the script is a good starting point, and
thanks for the bogus atime warning; I'll keep an eye out for that.
----------------------------------------------------------------
Frank Compagner Guerrilla Games
SS> You can find my perl script that does proxy cache cleanup the way
SS> suggested below at public.perforce.com:1666
SS> //guest/stanton_stevens/cache_clean.pl. I've used it for years, it's
SS> quite solid. It will work on Unix systems, and Windows if you have
SS> cygwin installed. Have a look at the top of the script, you may want to
SS> tweak some things, such as where it keeps information about what it
SS> does.
SS> The script is a little more complicated than I would like. This is due
SS> to a built-in workaround for a Perforce proxy bug that happens at least
SS> on Solaris. Proxy files are saved with no "last access" date, so a check
SS> of that value can give any time in the Unix epoch until someone accesses
SS> the file and resets it. The script keeps track of the last cutoff date
SS> before which it deleted files; if it finds files dated before that
SS> date/time on the next run, it sets their access date to the current
SS> date. It works fine if you don't have this problem, too.
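
The atime workaround described above could look roughly like this in Python. This is only a sketch based on the description in this message, not a port of cache_clean.pl; the function name repair_bogus_atimes and its parameters are made up for illustration, and it assumes the previous run's cutoff time has been stored somewhere and is passed in as last_cutoff (seconds since the epoch).

```python
import os
import time


def repair_bogus_atimes(paths, last_cutoff, now=None):
    """Reset the access time of files whose atime predates the last cutoff.

    Assumption: the previous cleanup run deleted everything accessed before
    last_cutoff, so any file still showing an older atime must carry a bogus
    value and is treated as freshly accessed.
    """
    now = time.time() if now is None else now
    repaired = []
    for path in paths:
        st = os.stat(path)
        if st.st_atime < last_cutoff:
            # Keep the modification time, replace only the access time.
            os.utime(path, (now, st.st_mtime))
            repaired.append(path)
    return repaired
```

Running this before the actual cleanup pass gives such files one full cutoff period before they become candidates for deletion again.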
SS> Stanton
SS> -----Original Message-----
SS> From: perforce-user-bounces at perforce.com
SS> [mailto:perforce-user-bounces at perforce.com] On Behalf Of Jeff Grills
SS> Sent: Wednesday, May 23, 2007 11:41 AM
SS> To: 'Jeff Grills'; 'Frank Compagner'; 'Perforce Users'
SS> Subject: Re: [p4] Managing diskspace usage of Perforce Proxy
SS> It occurred to me that one could potentially make a script like this
SS> quite robust - enough to share with all perforce users by putting it in
SS> the public depot (although it would have platform dependencies of
SS> course). I don't have the time to write the script now (I still have to
SS> get around to my p4tar portability improvements!), nor do I have a good
SS> environment to test it, but I can imagine it could operate something like
SS> this:
SS>   Check disk_usage against disk_threshold
SS>   If exceeded, gather the atime and size for all files in the proxy cache
SS>   Sort the files by atime
SS>   Calculate the size of files that you need to remove:
SS>     disk_remove = disk_usage - disk_removal_threshold
SS>     (disk_removal_threshold might be smaller than disk_threshold so the
SS>     next new revision cached doesn't immediately put you over your
SS>     threshold again)
SS>   While (disk_remove > 0)
SS>     disk_remove -= sizeof oldest file
SS>     remove oldest file
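
A minimal Python sketch of the outline above, for anyone who wants something that runs natively on both Linux and Windows. The function names (plan_eviction, scan_cache, clean_cache) are made up for illustration, and as a simplifying assumption disk_usage is taken to be the total size of the files under the cache root rather than the usage of the whole filesystem. Both thresholds are in bytes.

```python
import os


def plan_eviction(files, disk_usage, disk_removal_threshold):
    """Choose least-recently-accessed files until enough bytes are freed.

    files is an iterable of (path, atime, size) tuples. Returns the paths
    to delete, oldest atime first.
    """
    disk_remove = disk_usage - disk_removal_threshold
    victims = []
    for path, _atime, size in sorted(files, key=lambda f: f[1]):
        if disk_remove <= 0:
            break
        victims.append(path)
        disk_remove -= size
    return victims


def scan_cache(cache_root):
    """Collect (path, atime, size) for every file under cache_root."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(cache_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            entries.append((path, st.st_atime, st.st_size))
    return entries


def clean_cache(cache_root, disk_threshold, disk_removal_threshold):
    """Evict oldest-accessed files once the cache exceeds disk_threshold."""
    entries = scan_cache(cache_root)
    # Assumption: usage is measured as the sum of cached file sizes.
    disk_usage = sum(size for _path, _atime, size in entries)
    if disk_usage <= disk_threshold:
        return []
    victims = plan_eviction(entries, disk_usage, disk_removal_threshold)
    for path in victims:
        os.remove(path)
    return victims
```

Keeping plan_eviction separate from the deletion step makes it easy to do a dry run first and just print the list of files that would be removed.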
SS> If you wanted something simpler that accomplishes nearly the same
SS> goal, you could try this:
SS>   Set atime_threshold to something reasonable (7 days might be a good
SS>   starting point)
SS>   While disk_usage is above disk_threshold
SS>     Run a find -atime atime_threshold on your proxy data and remove
SS>     anything older
SS>     Reduce atime_threshold by some amount (subtract off a day, halve
SS>     it, whatever makes sense)
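
The simpler variant might be sketched in Python as follows, replacing the find -atime call with os.walk so it does not depend on Unix tools. All names here are hypothetical. The sketch halves the age threshold on each pass and, as an added assumption not in the outline above, stops at a floor of one hour (min_age_days) so the loop always terminates even if the cache cannot be brought under the threshold.

```python
import os
import time


def cache_size(cache_root):
    """Total size in bytes of all files under cache_root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(cache_root):
        for name in filenames:
            total += os.path.getsize(os.path.join(dirpath, name))
    return total


def remove_older_than(cache_root, max_age_days, now=None):
    """Delete files not accessed within max_age_days; return the count."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = 0
    for dirpath, _dirnames, filenames in os.walk(cache_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.stat(path).st_atime < cutoff:
                os.remove(path)
                removed += 1
    return removed


def shrink_cache(cache_root, disk_threshold,
                 atime_threshold_days=7.0, min_age_days=1 / 24):
    """Lower the age threshold until the cache fits under disk_threshold."""
    age = atime_threshold_days
    while cache_size(cache_root) > disk_threshold and age >= min_age_days:
        remove_older_than(cache_root, age)
        age /= 2  # halve the threshold for the next pass
    return cache_size(cache_root)
```

The return value is the cache size after the last pass, so a caller can tell whether the threshold was actually reached.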
SS> These approaches might not help as much as one would like for files that
SS> are stored in delta format in a single file, because newly fetched
SS> revisions would keep growing the single archive and updating its atime.
SS> Usually, though, it's the large binary files that become the burden on a
SS> p4p server.
SS> It would be really nice if the perforce server could cache all file
SS> revisions on the proxy using the standard binary method of a compressed
SS> file per revision, in which case this should work out beautifully for
SS> all files.
SS> That feature could potentially speed up transfer to clients as well,
SS> because newer versions of perforce will transfer the compressed file
SS> across the wire and decompress it on the client.
SS> j
SS> -----Original Message-----
SS> From: Jeff Grills [mailto:jgrills at drivensnow.org]
SS> Sent: Wednesday, May 23, 2007 8:29 AM
SS> To: 'Frank Compagner'; 'Perforce Users'
SS> Subject: RE: [p4] Managing diskspace usage of Perforce Proxy
SS> Are you running on top of a filesystem which records access time for
SS> files?
SS> If so, you could run a cron job which removes files that haven't been
SS> accessed in N days. If you're on a Unix-style OS for your proxies,
SS> "find -atime" will be useful.
SS> j
SS> -----Original Message-----
SS> From: perforce-user-bounces at perforce.com
SS> [mailto:perforce-user-bounces at perforce.com] On Behalf Of Frank Compagner
SS> Sent: Monday, May 21, 2007 5:04 PM
SS> To: Perforce Users
SS> Subject: [p4] Managing diskspace usage of Perforce Proxy
SS> Hi,
SS> we've got a problem with diskspace usage of our proxy servers, and I'd
SS> like to know if anybody else has had to deal with this before, and if
SS> so, what they did about it. Let me first explain the problem:
SS> We have a central Perforce server, serving some 150 users and about 20
SS> build machines. Depot size is about 2TB. To increase performance and
SS> server stability, we recently deployed 4 Perforce Proxies throughout the
SS> building, 3 serving some 50 users each and the final one for the build
SS> machines. This has worked well; performance and stability have increased
SS> noticeably.
SS> However, the proxy servers have a disk capacity of about 200GB, which
SS> fills up in about a week. So far our admins are manually deleting the
SS> entire contents of the proxy server at the end of each week, after which
SS> they start to fill up again. This is too messy and high-maintenance, and
SS> I'm looking for a better alternative. Here are some possible solutions
SS> I've come up with:
SS> - As the bulk of the data is in large binary compressed files, where
SS>   most people only want the head revision, I could run a script that
SS>   will walk through the proxy cache every night, and from every ,d
SS>   directory delete all files but the most recent one.
SS> - Not all branches/projects are equal; some are very heavily used by
SS>   many people, others are only used by a few. I could make a shortlist
SS>   of "privileged" parts of the depot, and if diskspace becomes low, zap
SS>   everything in the proxy cache that comes from outside those parts.
SS> - If, after doing the above, diskspace is still low, go through the
SS>   entire proxy cache and randomly delete (say) 50% of the files.
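
The first option in the list above (keep only the most recent file in every ,d directory) could be sketched like this. The function name is hypothetical, and "most recent" is assumed here to mean the file with the newest modification time; picking the highest revision number from the file name would be another reasonable reading.

```python
import os


def prune_old_revisions(cache_root):
    """In every ',d' directory keep only the most recently modified file."""
    removed = []
    for dirpath, _dirnames, filenames in os.walk(cache_root):
        # Only touch per-file revision directories, whose names end in ",d".
        if not dirpath.endswith(",d") or len(filenames) < 2:
            continue
        paths = [os.path.join(dirpath, name) for name in filenames]
        # Assumption: newest mtime stands in for "head revision".
        newest = max(paths, key=os.path.getmtime)
        for path in paths:
            if path != newest:
                os.remove(path)
                removed.append(path)
    return removed
```

Files outside ,d directories are left alone, so anything the proxy stores in another layout is not affected by this pass.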
SS> It won't be hard to put all of this in a python script and run it every
SS> night, but it all feels rather ad-hoc, so I was wondering if anybody had
SS> some better ideas.
SS> Let me know what you think,
SS> Frank.
SS> ----------------------------------------------------------------
SS> Frank Compagner Guerrilla Games
SS> _______________________________________________
SS> perforce-user mailing list - perforce-user at perforce.com
SS> http://maillist.perforce.com/mailman/listinfo/perforce-user