[p4] Determine if depots directory contains unnecessary files

Matt Janulewicz perforce-user-forum at forums.perforce.com
Mon Aug 15 10:30:01 PDT 2016


Posted on behalf of forum user 'Matt Janulewicz'.

I can think of two ways to attack this, but there are probably more.

First, if you have a reasonably new p4d, run 'p4 fstat -Oc' on all your
files, filtering on lbrFile:

$ p4 fstat -Oc -T lbrFile ywaves.err1
... lbrFile //depot/demo/majanu/ywaves.err1

This points at where your library file is, relative to your server's root.
In my case, this archive would be at
/p4/1/root/depot/demo/majanu/ywaves.err1[undetermined]

The "[undetermined]" tag in the above line is there because text files
(historically) were stored in an RCS file with a ',v' extension, while binary
files are stored in their own subdirectory with a ',d' extension. Complicating
matters, these days you can set the server to store everything as individually
compressed (gz) archives, so even text files might be in a ',d' directory. To
be safe, you'd parse out all those paths and rsync both the ',v' files and the
',d' directories.
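
As a rough illustration of that parsing step (a minimal sketch, not a tested
tool): the function below reads 'p4 fstat -Oc -T lbrFile' output and prints both
possible archive locations for each lbrFile. The name lbr_to_candidates and the
P4ROOT_DIR variable are made up for this example, and it assumes the depots live
directly under the server root, as in the /p4/1/root example above.

```shell
#!/bin/sh
# P4ROOT_DIR is a hypothetical variable: the directory the depot archives live under.
P4ROOT_DIR="${P4ROOT_DIR:-/p4/1/root}"

lbr_to_candidates() {
    # Input lines look like: "... lbrFile //depot/demo/majanu/ywaves.err1"
    # Keep only the lbrFile lines, strip the prefix, and drop duplicates
    # (several revisions or lazy copies can share one archive).
    sed -n 's|^\.\.\. lbrFile //||p' | sort -u | while IFS= read -r rel; do
        # We can't tell from lbrFile alone which form exists, so emit both.
        printf '%s/%s,v\n' "$P4ROOT_DIR" "$rel"
        printf '%s/%s,d\n' "$P4ROOT_DIR" "$rel"
    done
}

# Possible use against a live server (not run here):
#   p4 fstat -Oc -T lbrFile //... | lbr_to_candidates > candidates.txt
```

Only one of the two candidates will normally exist for a given file, so
whatever consumes the list has to tolerate missing paths.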

As it happens, I've written a script to do this. You'll also want the
script for shelves:

https://swarm.workshop.perforce.com/files/guest/mattyj2001/fun_scripts/find_archives.sh
https://swarm.workshop.perforce.com/files/guest/mattyj2001/fun_scripts/list_shelved_files.sh
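
Separately from those scripts, here is a hedged sketch of how a list of needed
archive paths could be compared against what is actually on disk. It is not
taken from find_archives.sh; the function name and both arguments are
hypothetical. It assumes a text file of needed paths (one absolute path per
line, covering depot files and shelves) and that archives are ',v' files or
',d' directories.

```shell
#!/bin/sh
# Sketch: print ',v' files and ',d' directories under a directory that do not
# appear in a "needed" list. $1 = directory to scan, $2 = file of needed paths.

find_unreferenced() {
    scan_dir=$1
    needed_list=$2
    tmp_disk=$(mktemp) || return 1
    tmp_need=$(mktemp) || { rm -f "$tmp_disk"; return 1; }
    # ',v' archives are plain files; ',d' archives are directories.
    find "$scan_dir" \( -type f -name '*,v' \) -o \( -type d -name '*,d' \) \
        | LC_ALL=C sort > "$tmp_disk"
    LC_ALL=C sort -u "$needed_list" > "$tmp_need"
    # comm -13 prints lines found only in the second input: on disk, not needed.
    LC_ALL=C comm -13 "$tmp_need" "$tmp_disk"
    rm -f "$tmp_disk" "$tmp_need"
}

# Possible use (paths are examples only):
#   find_unreferenced /p4/1/root/depot candidates.txt
```

Anything it prints is only a candidate for cruft. Shelves and unloaded files
also have archives, so they need to be in the needed list before you trust the
result.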

Personally, I think that way is a bit sloppy, but if you only have one master
server it's probably the only way to go. Our environment consists of a
commit server, a bunch of edges and numerous read-only instances. I found myself
building out new servers a lot this year and came across the same problem, to
the tune of around half a terabyte of cruft. So I wrote another script to build
out a replica using 'p4 verify -qt', which transfers the needed archive files
automatically. This is probably the best way to do it if you want a 100% clean
server with no cruft in it whatsoever. Depending on how big your set of
library/archive files is, the build-out might take a few days. Even though it
might be a hassle, if you have the hardware, it might be worth making a
temporary read-only replica of your server and using this script to populate the
library/archive files:

https://swarm.workshop.perforce.com/files/guest/mattyj2001/fun_scripts/seed_replica.sh

A fun little project is doing test runs of the transfer using smaller sets of
files, different numbers of pull threads, etc. to find out what the limits of
your system are. The comments at the beginning of that script describe where
our hardware's sweet spot was; yours may vary.
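
One way such a test run could be timed, as a sketch under assumptions rather
than anything from seed_replica.sh: schedule a small path with 'p4 verify -qt'
on the replica, then poll 'p4 pull -l -s' until nothing is pending. The
function names, the P4 and POLL_SECONDS variables, and the summary line format
being parsed ("File transfers: N active/M total, ...") are all assumptions
here; check the format against your own server's output first.

```shell
#!/bin/sh
P4="${P4:-p4}"                          # overridable so a stub can be used
POLL_SECONDS="${POLL_SECONDS:-30}"      # hypothetical polling interval

pending_transfers() {
    # Assumed summary format: "File transfers: 1 active/63 total, bytes: ..."
    # Prints the "total" count, or nothing if the line doesn't match.
    "$P4" pull -l -s | sed -n 's|^File transfers: [0-9]* active/\([0-9]*\) total.*|\1|p'
}

time_trial() {
    # $1: depot path to schedule, e.g. a small subdirectory for a test run
    path=$1
    start=$(date +%s)
    "$P4" verify -qt "$path"
    # Wait until the pull queue drains; an unparsed line counts as 0 pending.
    while pending=$(pending_transfers); [ "${pending:-0}" -gt 0 ]; do
        sleep "$POLL_SECONDS"
    done
    end=$(date +%s)
    echo "$path: $((end - start)) seconds"
}

# Possible use on the replica (path is an example only):
#   time_trial //depot/demo/...
```

If the summary line doesn't match the assumed format, the loop ends right away
and the reported time is meaningless, so treat the number with suspicion until
you've checked it by hand once.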

FINAL IMPORTANT NOTE:

Your library/archive files are not the only unique things about your server, as
far as those types of files go. You also need to worry about shelves and unload
depots (and probably archive depots, but we don't have any of those). No
matter what type of build-out you go with, when you're done you need to be
sure you've included shelves and unload. For scenario #1, the
list_shelved_files.sh script will help. With scenario #2 you can transfer
shelves programmatically:

$ p4 changes -s shelved | awk '{ print $2 }' | xargs -n1 -I {} p4 verify -qSt @={}

And unload:

$ p4 verify -qUt //unload/...

In either case, when you think you're done, don't decommission or
otherwise destroy your original server without first performing a complete
backup and then running a complete verify that reports no errors:

$ p4 verify -qz //...
$ p4 changes -s shelved | awk '{ print $2 }' | xargs -n1 -I {} p4 verify -qS @={}
$ p4 verify -qU //...
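
If you'd rather run those three checks as one step, a minimal sketch could look
like the following. It leans on 'p4 verify -q' printing only problems, so any
output at all is treated as a failure. That is an assumption and may be too
strict for your server (for example, if a command complains that there is
nothing to verify). The function name final_verify and the P4 variable are
invented for this example.

```shell
#!/bin/sh
P4="${P4:-p4}"   # overridable so the function can be tried with a stub

final_verify() {
    log=$(mktemp) || return 1
    {
        # Submitted files
        "$P4" verify -qz //...
        # Shelved files, one shelved changelist at a time
        "$P4" changes -s shelved | awk '{ print $2 }' | while read -r change; do
            "$P4" verify -qS "@=$change"
        done
        # Unloaded files
        "$P4" verify -qU //...
    } > "$log" 2>&1
    # Count non-empty lines; with -q, any line is assumed to be a problem.
    problems=$(grep -c . "$log" || true)
    cat "$log"
    rm -f "$log"
    [ "$problems" -eq 0 ]
}

# Possible use:
#   final_verify && echo "verify clean" || echo "verify reported problems"
```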



--
Please click here to see the post in its original format:
  http://forums.perforce.com/index.php?/topic/4906-determine-if-depots-directory-contains-unnecessary-files

