Part 2: Put the portage tree on a diet ;-)
After getting feedback about a not-quite-complete cleanup and a generator script that has to be rerun (and not forgotten!) every time the .skel file is modified, I (re)read man rsync, this time paying attention to the description of filtering rules.
I was given a hint that the script is actually not required: everything can be done using rsync filtering rules alone. The key feature (which I missed the first time I read it ;-) is:
If a pattern starts with a slash, it is matched against the root of the transfer; otherwise it is matched against the end of the current item's path...
Hence it is possible to remove a whole category, or a particular directory (package) within some category, together with its cached files from metadata/md5-cache/, with a single rule.
# rule to ignore a whole category
- cat/***
# rule to accept only one package from a category
+ cat/some-package**
- cat/**
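To try such rules on a throwaway tree rather than on the real one, a sketch like the following might help. The category and package names (app-foo, app-bar, keep-me, drop-me) are made up for illustration, and it assumes rsync and mktemp are installed; only the `--recursive` and `--exclude-from` options are used.

```shell
#!/bin/sh
# Toy check of unanchored category rules. All names below are made up.
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# A miniature "tree": two categories plus their metadata/md5-cache entries.
for f in \
    app-foo/pkg/pkg-1.0.ebuild \
    app-bar/keep-me/keep-me-1.0.ebuild \
    app-bar/drop-me/drop-me-1.0.ebuild \
    metadata/md5-cache/app-foo/pkg-1.0 \
    metadata/md5-cache/app-bar/keep-me-1.0 \
    metadata/md5-cache/app-bar/drop-me-1.0
do
    mkdir -p "$work/src/$(dirname "$f")"
    : > "$work/src/$f"
done

# No leading slash, so each rule can match at any depth.
cat > "$work/rules" <<'EOF'
- app-foo/***
+ app-bar/keep-me**
- app-bar/**
EOF

if command -v rsync >/dev/null 2>&1; then
    rsync --recursive --exclude-from="$work/rules" "$work/src/" "$work/dst/"
    # List what survived the filter.
    (cd "$work/dst" && find . -type f | sort)
else
    echo "rsync not found, nothing to demonstrate" >&2
fi
```

If the rules behave as described, only the two keep-me files should be listed: the ebuild under app-bar/ and its cache entry under metadata/md5-cache/app-bar/.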
One more thing to consider when writing filtering rules: rule order is important.
rsync stops at the first rule (include or exclude) that matches, so specific include rules have to come before the more generic exclude. So if you want to remove everything but amd64/ from /profiles/arch/:
# '***' matches the amd64/ dir itself as well as everything inside it
+ /profiles/arch/amd64/***
- /profiles/arch/**
Then, if you want to “clean” unused features of the selected arch (like x32 or no-multilib), you need to add these rules above the include:
- /profiles/arch/amd64/no-multilib/***
- /profiles/arch/amd64/x32/***
+ /profiles/arch/amd64/***
- /profiles/arch/**
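The effect of rule order can be seen on a toy tree. The sketch below is only an illustration: the file names are placeholders, rsync and mktemp are assumed to be available, and the rules are written with `***` so that each directory itself is matched along with its contents.

```shell
#!/bin/sh
# Toy check that rule order matters; the file names are placeholders.
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

for f in arch/amd64/make.defaults arch/amd64/x32/make.defaults arch/x86/make.defaults
do
    mkdir -p "$work/src/profiles/$(dirname "$f")"
    : > "$work/src/profiles/$f"
done

# Specific exclude first, then the include, then the generic exclude.
cat > "$work/good.rules" <<'EOF'
- /profiles/arch/amd64/x32/***
+ /profiles/arch/amd64/***
- /profiles/arch/**
EOF

# Same rules, but the include comes first and shadows the x32 exclude.
cat > "$work/bad.rules" <<'EOF'
+ /profiles/arch/amd64/***
- /profiles/arch/amd64/x32/***
- /profiles/arch/**
EOF

if command -v rsync >/dev/null 2>&1; then
    for name in good bad; do
        rsync --recursive --exclude-from="$work/$name.rules" "$work/src/" "$work/$name/"
        echo "== $name.rules =="
        (cd "$work/$name" && find . -type f | sort)
    done
else
    echo "rsync not found, nothing to demonstrate" >&2
fi
```

With good.rules only amd64/make.defaults should remain; with bad.rules the x32/ subtree should survive as well, because the include matches it before the exclude is ever consulted.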
Unfortunately, that approach does not scale well ;-( mainly because of the way rsync visits subdirectories: the shortest path is visited first… which is quite “normal” and expected when traversing directories.
But you can’t write something like this:
+ /profiles/default/linux/amd64/**
- /profiles/default/**
because before the nested amd64/ is visited, the /profiles/default/linux/ dir is visited, and the only rule it matches is the second one (kill everything inside /profiles/default/), so the first rule never gets a chance to match!
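The rsync man page notes that to include a deep path, none of its parent directories may be excluded. The sketch below compares the two-rule attempt with a variant that includes the parent directory explicitly. It runs on a made-up toy tree (the parent files are placeholders) and assumes rsync and mktemp are available.

```shell
#!/bin/sh
# Toy comparison: deep include alone vs. deep include plus its parent dir.
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

for d in default/linux/amd64 default/linux/x86 default/bsd
do
    mkdir -p "$work/src/profiles/$d"
    : > "$work/src/profiles/$d/parent"
done

# The attempt from the text: linux/ is excluded before amd64/ is ever visited.
cat > "$work/naive.rules" <<'EOF'
+ /profiles/default/linux/amd64/**
- /profiles/default/**
EOF

# Include the directory on the way down, then the target and its contents.
cat > "$work/parents.rules" <<'EOF'
+ /profiles/default/linux/
+ /profiles/default/linux/amd64/***
- /profiles/default/**
EOF

if command -v rsync >/dev/null 2>&1; then
    for name in naive parents; do
        rsync --recursive --exclude-from="$work/$name.rules" "$work/src/" "$work/$name/"
        echo "== $name.rules =="
        (cd "$work/$name" && find . -type f | sort)
    done
else
    echo "rsync not found, nothing to demonstrate" >&2
fi
```

The naive rules should leave no files at all, while the second set should keep only default/linux/amd64/parent. The price is one extra include per directory level, which is exactly the enumeration that gets tedious for a tree as deep as profiles.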
So to do a pedantic cleanup of /profiles one has to write a lot of rules explicitly enumerating things deep inside the profile-related dirs… like it was in my first version.
But there is another side of the coin: the way rsync visits directories and applies rules can actually help to describe the desired cleanup in a shorter, functional-like way! :-)
Let’s describe what we want to keep… Personally I use the latest Gentoo Linux release (13.0), native 64-bit amd64 with multilib, and KDE as a desktop. Everything else I don’t need! First of all we have to take care of all the files that make up a profile, anywhere inside the /profiles/ dir:
+ /profiles/**/eapi
+ /profiles/**/make*
+ /profiles/**/parent
+ /profiles/**/package*
+ /profiles/**/profile*
+ /profiles/**/use*
+ /profiles/**/ChangeLog
Note that the last rule is not part of a profile, but it keeps the latest ChangeLog files while killing the outdated ones (ChangeLog-2007, for example). Looking at the nested dirs inside /profiles, one may notice that the things we want to keep live in dirs named after an architecture or a feature, so to include them (prevent deletion) the following bunch of rules is enough:
+ /profiles/**/amd64
+ /profiles/**/linux
+ /profiles/**/13.0
+ /profiles/**/64bit-native
+ /profiles/**/multilib
+ /profiles/**/multilib/lib32
+ /profiles/**/desktop
+ /profiles/**/kde
And the final part: just kill everything else!!
+ /profiles/arch/base**
- /profiles/arch/**
- /profiles/default/**
- /profiles/embedded/***
- /profiles/features/**
- /profiles/hardened/***
- /profiles/prefix/**
- /profiles/releases/**
- /profiles/targets/**
- /profiles/uclibc/***
The /profiles/arch/base/ dir must be kept because it is referenced from the particular arch dir (arch/amd64/ in my case).
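To see how the name-based includes cooperate with the blanket excludes, here is a sketch that applies a trimmed-down subset of the rules above to a toy profiles tree. The tree layout and file contents are invented for illustration (gnome, x86 and bsd stand in for "everything else"), and rsync and mktemp are assumed to be available.

```shell
#!/bin/sh
# Toy run of a subset of the profile rules; the tree itself is made up.
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

for f in \
    default/linux/amd64/13.0/desktop/kde/parent \
    default/linux/amd64/13.0/desktop/gnome/parent \
    default/linux/x86/13.0/parent \
    default/bsd/parent \
    arch/base/make.defaults \
    arch/amd64/make.defaults \
    arch/x86/make.defaults
do
    mkdir -p "$work/src/profiles/$(dirname "$f")"
    : > "$work/src/profiles/$f"
done

# Files to keep, then dir names to keep, then "kill everything else".
cat > "$work/rules" <<'EOF'
+ /profiles/**/make*
+ /profiles/**/parent
+ /profiles/**/amd64
+ /profiles/**/linux
+ /profiles/**/13.0
+ /profiles/**/desktop
+ /profiles/**/kde
+ /profiles/arch/base**
- /profiles/arch/**
- /profiles/default/**
EOF

if command -v rsync >/dev/null 2>&1; then
    rsync --recursive --exclude-from="$work/rules" "$work/src/" "$work/dst/"
    (cd "$work/dst" && find . -type f | sort)
else
    echo "rsync not found, nothing to demonstrate" >&2
fi
```

Each directory on the way to kde/ is rescued by its name before the blanket exclude is reached, while gnome/, x86/ and bsd/ match no include and are dropped together with everything inside them.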
Measure the effect
I downloaded a portage snapshot for 2014-Jan-14 and unpacked it into /storage/tmp/p to play with.
Then I wrote a simple script there to get a filtered copy of portage/; it is just a shortcut for rsync with a bunch of options:
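#!/bin/sh
# Copy portage/ into portage.stripped/ through the filter rules,
# deleting from the destination anything the rules exclude.
rsync --exclude-from=./gentoo-portage-exclude.list \
    --delete-excluded \
    --delete-missing-args \
    --recursive \
    --delete-after \
    --verbose \
    portage/ portage.stripped/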
Now let’s count files+dirs before/after, and the total directory size I got with my final version of the filtering rules:
zaufi@gentop〉/storage/tmp/p〉 find portage | wc -l
178370
zaufi@gentop〉/storage/tmp/p〉 find portage.stripped | wc -l
119408
zaufi@gentop〉/storage/tmp/p〉 du -hs portage
739M portage
zaufi@gentop〉/storage/tmp/p〉 du -hs portage.stripped
503M portage.stripped
Results
- 236M freed
- 58,962 directory entries eliminated
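The two figures follow directly from the find and du output. A small sketch of the arithmetic, with the numbers copied from above (the variable names are just for illustration, and the percentages are rounded down by integer division):

```shell
#!/bin/sh
# Numbers copied from the find/du output above.
entries_before=178370
entries_after=119408
size_before_mb=739
size_after_mb=503

entries_removed=$((entries_before - entries_after))
size_freed_mb=$((size_before_mb - size_after_mb))

# Integer percentages (rounded down), good enough for a rough comparison.
entries_pct=$((entries_removed * 100 / entries_before))
size_pct=$((size_freed_mb * 100 / size_before_mb))

echo "entries removed: $entries_removed ($entries_pct%)"
echo "space freed: ${size_freed_mb}M ($size_pct%)"
```

So roughly a third of both the entries and the disk space goes away.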