Experiments with RPM payload

I apologize to everyone thinking WTF? at the following, but I have to blog
about it, otherwise I'll explode :)

This is the reference for our current payload:
-rw-r--r-- 1 coolo suse 36966400 13. Mär 09:42 coreutils-6.10.tar
8.61user 0.05system 0:08.66elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+11936outputs (0major+2007minor)pagefaults 0swaps
-rw-r--r-- 1 coolo suse 6110179 13. Mär 09:45 coreutils-6.10.tar.bz2.9

Then I did lzma -1 to lzma -9 and these are the numbers:

5.90user 0.04system 0:06.03elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+16456outputs (0major+734minor)pagefaults 0swaps
-rw-r--r-- 1 coolo suse 8423507 13. Mär 09:48 coreutils-6.10.tar.lzma.1

5.83user 0.04system 0:05.88elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+10720outputs (0major+3271minor)pagefaults 0swaps
-rw-r--r-- 1 coolo suse 5488129 13. Mär 09:48 coreutils-6.10.tar.lzma.2

27.11user 0.06system 0:27.23elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+9888outputs (0major+3118minor)pagefaults 0swaps
-rw-r--r-- 1 coolo suse 5061307 13. Mär 09:48 coreutils-6.10.tar.lzma.3

29.60user 0.09system 0:30.36elapsed 97%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+9168outputs (0major+4334minor)pagefaults 0swaps
-rw-r--r-- 1 coolo suse 4691437 13. Mär 09:49 coreutils-6.10.tar.lzma.4

42.07user 0.10system 0:42.27elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+8576outputs (0major+6768minor)pagefaults 0swaps
-rw-r--r-- 1 coolo suse 4385224 13. Mär 09:50 coreutils-6.10.tar.lzma.5

44.68user 0.22system 0:46.32elapsed 96%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+8256outputs (0major+11633minor)pagefaults 0swaps
-rw-r--r-- 1 coolo suse 4222761 13. Mär 09:50 coreutils-6.10.tar.lzma.6

47.24user 0.34system 0:49.82elapsed 95%CPU (0avgtext+0avgdata 0maxresident)k
8inputs+7776outputs (0major+21360minor)pagefaults 0swaps
-rw-r--r-- 1 coolo suse 3971013 13. Mär 09:51 coreutils-6.10.tar.lzma.7

63.44user 0.18system 1:04.61elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+7344outputs (0major+40818minor)pagefaults 0swaps
-rw-r--r-- 1 coolo suse 3753487 13. Mär 09:52 coreutils-6.10.tar.lzma.8

63.52user 0.32system 1:04.14elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+7328outputs (0major+76468minor)pagefaults 0swaps
-rw-r--r-- 1 coolo suse 3746901 13. Mär 09:53 coreutils-6.10.tar.lzma.9

As you can easily see, lzma -2 beats bzip2 -9 at both size and compression speed
(using slightly more memory). Above that it's no longer win-win: you win another
1MB if you go with -5 (as I read in a patch Mandriva is using), at the cost of
taking 5x the compression time bzip2 needs and twice as much memory.
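
To make the trade-off easier to eyeball, here is a small sketch that turns the
numbers above into ratios against the bzip2 -9 reference. The sizes and user
times are copied from the listings; the function names (relative_to_baseline,
win_win) are made up for this illustration, and using user CPU time as "the"
compression time is an assumption.

```python
# Measurements copied from the post: (label, compressed bytes, user CPU seconds).
BASELINE = ("bzip2 -9", 6110179, 8.61)
LZMA_RUNS = [
    ("lzma -1", 8423507, 5.90),
    ("lzma -2", 5488129, 5.83),
    ("lzma -3", 5061307, 27.11),
    ("lzma -4", 4691437, 29.60),
    ("lzma -5", 4385224, 42.07),
    ("lzma -6", 4222761, 44.68),
    ("lzma -7", 3971013, 47.24),
    ("lzma -8", 3753487, 63.44),
    ("lzma -9", 3746901, 63.52),
]


def relative_to_baseline(runs, baseline):
    """Return (label, size ratio, time ratio) for each run against the baseline."""
    _, base_size, base_time = baseline
    return [(label, size / base_size, secs / base_time) for label, size, secs in runs]


def win_win(rows):
    """Labels of runs that are both smaller and faster than the baseline."""
    return [label for label, size_ratio, time_ratio in rows
            if size_ratio < 1.0 and time_ratio < 1.0]


if __name__ == "__main__":
    rows = relative_to_baseline(LZMA_RUNS, BASELINE)
    for label, size_ratio, time_ratio in rows:
        print(f"{label}: {size_ratio:.2f}x size, {time_ratio:.2f}x time")
    print("smaller and faster than bzip2 -9:", win_win(rows))
```

A ratio below 1.00 means lzma did better than bzip2 -9 on that axis. Memory is
left out because the listings only show page fault counts, not resident size.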

So it's important to remember why we're thinking about lzma to begin with:

uncompressing the bzip.9:
1.81user 0.00system 0:01.82elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+1068minor)pagefaults 0swaps

uncompressing the lzma.2:
0.80user 0.00system 0:00.83elapsed 97%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+824minor)pagefaults 0swaps

uncompressing the lzma.5:
0.69user 0.01system 0:00.70elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+1080minor)pagefaults 0swaps
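
If you want to repeat the decompression comparison yourself, the following is a
rough sketch using Python's standard bz2 and lzma modules rather than the
command line tools used above. That is an assumption with consequences: the
presets of Python's lzma module are not guaranteed to match what the lzma tool
did for -2 and -5 here, so expect different absolute numbers. The helper names
(time_decompress, compare) and the sample input are hypothetical.

```python
import bz2
import lzma
import time


def time_decompress(decompress, blob, repeats=5):
    """Best wall-clock time in seconds over `repeats` calls to decompress(blob)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        decompress(blob)
        best = min(best, time.perf_counter() - start)
    return best


def compare(data):
    """Compress `data` three ways, then time decompression of each result."""
    blobs = {
        "bzip2 -9": (bz2.compress(data, 9), bz2.decompress),
        # FORMAT_ALONE is the legacy .lzma container.
        "lzma preset 2": (
            lzma.compress(data, format=lzma.FORMAT_ALONE, preset=2),
            lzma.decompress,
        ),
        "lzma preset 5": (
            lzma.compress(data, format=lzma.FORMAT_ALONE, preset=5),
            lzma.decompress,
        ),
    }
    results = {}
    for label, (blob, decompress) in blobs.items():
        assert decompress(blob) == data  # sanity check: lossless round trip
        results[label] = (len(blob), time_decompress(decompress, blob))
    return results


if __name__ == "__main__":
    # Placeholder input; read a real tarball's bytes here for meaningful numbers.
    sample = b"int main(void) { return 0; }\n" * 50_000
    for label, (size, secs) in compare(sample).items():
        print(f"{label}: {size} bytes, {secs * 1000:.2f} ms to decompress")
```

Taking the best of several runs is just one way to reduce noise; the figures in
this post are single runs measured with time.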

And decompressing is what our _users_ do. So what do rpm users want?
- Smaller downloads
- Faster installs of these downloads
- Reasonable memory usage
- (Not waiting one more day for a rebuild to be synced out)

So we're in a small dilemma, but we'll continue playing and something between 2
and 7 will be it.

BTW: remember that these are coreutils sources, the compression rates for binaries
won't be as impressive.


Glad to see that the Linux world is beginning to understand that bzip2 is a poor choice for distributions.

>BTW: remember that these are coreutils sources, the compression
>rates for binaries won't be as impressive.
That's not right. LZMA compresses binaries better as well (as I see from playing with 7zip, though it uses BCJ to improve executable compression; the improvement is about 1.15x). The only data type that bzip2 compresses well is text.

By fhekfl4rt at Mon, 03/31/2008 - 09:28


By shevegen at Wed, 07/23/2008 - 20:11

bah :P

now I am too lazy to modify my posts... sorry.

By shevegen at Wed, 07/23/2008 - 20:12

Yeah, you are attacking bzip2, mister "commenter", but you are not mentioning
that the biggest problem with gzip is that its output IS SO BIG.

And exactly this was the reason why bzip2 became more popular than gzip.

Size matters a lot once you have 10+ GB collections.

By shevegen at Wed, 07/23/2008 - 20:11

Damn, this was meant for the guy below me... does the reply function work?


By shevegen at Wed, 07/23/2008 - 20:11