After getting feedback about the not-quite-complete cleanup and about the generator script,
which has to be rerun (and not forgotten) every time the
.skel file is modified, I've reread the rsync documentation,
this time paying attention to the description of filtering rules.
I was given a hint that the script is actually not required: everything can be done using only
filtering rules. The key feature (I missed it the first time I read it ;-) is this:
If pattern starts w/ a slash it is matched to the root of transfer, otherwise it is matched against the end of current item path...
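Hence it is possible to remove a whole category, or a particular directory (package) within some
category, together with its cached files from
metadata/md5-cache/, with a single rule: a pattern without a leading slash matches at the end of any path, so it hits both the top-level directory and its copy under the cache.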
    # rule to ignore a whole category
    - cat/***
    # rule to accept only one package from a category
    + cat/some-package**
    - cat/**
One more thing to keep in mind when writing filtering rules: rule order is important.
You have to specify the include rules first, then the more generic excludes, because matching stops at the first rule
(include or exclude) that matches. So if you want to remove everything from /profiles/arch except amd64:
    + /profiles/arch/amd64/**
    - /profiles/arch**
Then, if you want to “clean” unused features of the selected arch (like
no-multilib), you need
to add those rules above:
    - /profiles/arch/amd64/no-multilib/**
    - /profiles/arch/amd64/x32/**
    + /profiles/arch/amd64/**
    - /profiles/arch**
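To make the "first match wins" behaviour tangible, here is a toy model in plain shell. It is only a sketch and not rsync's real matcher: the names `RULES` and `verdict` are made up for illustration, and the patterns are ordinary shell globs (where `*` also matches `/`), so `**` from the rules above is written as `*`.

```sh
#!/bin/sh
# Toy model of "first matching rule wins". NOT rsync's real matcher:
# patterns are plain shell globs, where * also matches "/".

# Rules in the order they would appear in the filter file: "<action> <glob>".
RULES='- /profiles/arch/amd64/no-multilib/*
- /profiles/arch/amd64/x32/*
+ /profiles/arch/amd64/*
- /profiles/arch*'

# verdict PATH: print the action (+ or -) of the first rule matching PATH,
# or "+" when no rule matches (unmatched paths are kept).
verdict() {
    while read -r action glob; do
        # $glob is deliberately unquoted so that it acts as a pattern
        case $1 in
            $glob) printf '%s\n' "$action"; return ;;
        esac
    done <<EOF
$RULES
EOF
    printf '+\n'
}

verdict /profiles/arch/amd64/make.defaults   # caught by the amd64 include
verdict /profiles/arch/amd64/x32/parent      # caught earlier by the x32 exclude
verdict /profiles/arch/x86/make.defaults     # falls through to the generic exclude
```

If the x32 exclude is moved below the amd64 include, the second call reports `+` instead of `-`, which is exactly the ordering trap described above.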
Unfortunately, that approach does not scale well ;-( mainly because of the way
rsync visits subdirectories:
the shortest path is visited first… which is quite “normal” and expected when traversing directories.
But you can’t write something like this:
    + /profiles/default/linux/amd64/**
    - /profiles/default/**
because the
/profiles/default/linux/ dir will be visited before anything nested in it, and the only
rule it matches is the second one: kill everything inside of
/profiles/default/. rsync never descends into an excluded directory, so the first rule will never match!
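The effect of the top-down visit order can be sketched with the same kind of toy model. Again, this is an illustration under assumptions, not rsync itself: `RULES`, `verdict` and `walk` are hypothetical names, and the patterns are shell globs in which `*` stands in for `**`.

```sh
#!/bin/sh
# Toy model of a top-down walk: every directory on the way to a file is
# tested first, and an excluded directory is never entered.
# NOT rsync's matcher: globs are shell patterns, where * also matches "/".

RULES='+ /profiles/default/linux/amd64/*
- /profiles/default/*'

# verdict PATH: action of the first rule whose glob matches PATH, default "+".
verdict() {
    while read -r action glob; do
        case $1 in
            $glob) printf '%s\n' "$action"; return ;;
        esac
    done <<EOF
$RULES
EOF
    printf '+\n'
}

# walk PATH: test each leading directory, then PATH itself, and report
# where the walk stops if something on the way is excluded.
walk() {
    prefix=
    rest=${1#/}
    while [ -n "$rest" ]; do
        prefix=$prefix/${rest%%/*}      # append the next path component
        case $rest in
            */*) rest=${rest#*/} ;;     # more components left
            *)   rest= ;;               # that was the last one
        esac
        if [ "$(verdict "$prefix")" = "-" ]; then
            printf 'excluded at %s\n' "$prefix"
            return
        fi
    done
    printf 'kept %s\n' "$1"
}

walk /profiles/default/linux/amd64/13.0/parent
```

The walk stops at `/profiles/default/linux`, two levels above anything the include rule could match. In this model the file is only kept once the intermediate directories get include rules of their own.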
So to do a pedantic cleanup of
/profiles one has to write a lot of rules to explicitly enumerate things
deep inside the profile-related dirs…
like it was in my first version.
But there is another side of the coin: the way
rsync visits directories and applies rules can actually help to describe the desired
cleanups in a shorter, functional-like way! :-)
Let’s describe what we want to have/keep… Personally I use the latest Gentoo Linux release (it is 13.0), native 64-bit
amd64 with multilib, and KDE as a desktop. Everything else I don’t need! First of all we have to take care
of all the files that make up a profile, anywhere:
    + /profiles/**/eapi
    + /profiles/**/make*
    + /profiles/**/parent
    + /profiles/**/package*
    + /profiles/**/profile*
    + /profiles/**/use*
    + /profiles/**/ChangeLog
Note that the last rule does not cover a part of the profile, but it will keep the latest
ChangeLog files while killing the
outdated ones (
ChangeLog-2007, for example). Looking at the nested dirs inside
/profiles, one may notice
that the things we want to keep are placed in dirs named after an architecture or a feature, so to include them
(prevent deletion) the following bunch of rules is enough:
    + /profiles/**/amd64
    + /profiles/**/linux
    + /profiles/**/13.0
    + /profiles/**/64bit-native
    + /profiles/**/multilib
    + /profiles/**/multilib/lib32
    + /profiles/**/desktop
    + /profiles/**/kde
And the final part: just kill everything else!!
    + /profiles/arch/base**
    - /profiles/arch/**
    - /profiles/default/**
    - /profiles/embedded/***
    - /profiles/features/**
    - /profiles/hardened/***
    - /profiles/prefix/**
    - /profiles/releases/**
    - /profiles/targets/**
    - /profiles/uclibc/***
The /profiles/arch/base/ dir must be kept because it is referenced from a particular arch dir (
arch/amd64/ in my case).
Measure the effect
I downloaded and unpacked a portage snapshot for 2014-Jan-14 into a scratch directory to play w/,
and wrote a simple script in it to get a filtered copy of
portage/. It is just a shortcut for
rsync w/ a bunch of options:
    #!/bin/sh
    rsync --exclude-from=./gentoo-portage-exclude.list \
          --delete-excluded \
          --delete-missing-args \
          --recursive \
          --delete-after \
          --verbose \
          portage/ portage.stripped/
Now let’s count files+dirs before/after, and the total directory size I’ve got with my final version of the filtering rules:
    zaufi@gentop〉/storage/tmp/p〉 find portage | wc -l
    178370
    zaufi@gentop〉/storage/tmp/p〉 find portage.stripped | wc -l
    119408
    zaufi@gentop〉/storage/tmp/p〉 du -hs portage
    739M    portage
    zaufi@gentop〉/storage/tmp/p〉 du -hs portage.stripped
    503M    portage.stripped
- 236M freed
- 58,962 directory entries eliminated
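The same comparison can be scripted instead of being done by hand. The helper names below (`entries`, `size_kb`, `report`) are my own and only a sketch; the script assumes it is started from the directory that holds both `portage/` and `portage.stripped/`.

```sh
#!/bin/sh
# Compare two trees: entry counts (files + dirs) and disk usage in KiB.

# entries DIR: number of files and directories under DIR, DIR included.
entries() {
    find "$1" | wc -l | tr -d ' '
}

# size_kb DIR: total size of DIR in 1024-byte units.
size_kb() {
    du -sk "$1" | cut -f1
}

# report BEFORE AFTER: print what the filtered copy saved.
report() {
    printf 'entries eliminated: %d\n' $(( $(entries "$1") - $(entries "$2") ))
    printf 'KiB freed: %d\n' $(( $(size_kb "$1") - $(size_kb "$2") ))
}

# The same arithmetic with the figures from the listing above:
echo $(( 178370 - 119408 ))   # 58962 entries
echo $(( 739 - 503 ))         # 236 (MiB, as rounded by du -h)

# Only run the comparison when both trees are actually present.
if [ -d portage ] && [ -d portage.stripped ]; then
    report portage portage.stripped
fi
```

Counting with `find | wc -l` assumes no file name contains a newline, which I'd expect to hold for a portage tree.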