4 seconds

Saturday, 3 December 2005  |  lubos lunak

[13:17:58] <Seli> boy, this machine sucks ... how am I supposed to benchmark anything if it fires up KDE in 3.7 seconds?

Yeah, right, it's kinda stupid to work on performance when the machine is so fast that even sysprof sometimes doesn't produce enough samples and the machine has no support for CPU throttling or anything like that :(. I'll need to transfer the build from this AMD 2800+ to this slow 900MHz laptop I used at aKademy for the performance talk. Note that while I cheated a bit at aKademy, this is normal KDE startup. Still with warm disk caches, but fc-list says 249 fonts, I have a splash (the SUSE one, that is, ksplash is not exactly fast), I have a wallpaper, and while it is a bare KDE it is fully usable. And adding Konqueror, KWrite, Konsole and some of the default systray apps to the startup still keeps it slightly below 5 seconds. I'm curious what the laptop will do.

Anyway, since I've already posted some details about this twice and I don't feel like doing it more than once more, let this be that one more time. So, for those who feel like asking, here is some copy&paste from IRC:

[14:11:51] <Seli> [13:54:49] <Seli> actually, I can do a measurement for estimating to maximum impact of -Bdirect
[14:12:05] <Seli> [13:59:01] <Seli> so, with konqueror showing /tmp, kwrite and konsole; only kxkb in systray : time spent in relocation processing is ~10%
[14:12:17] <Seli> (that's with prelink but relocation processing is still done for dlopen)
(the last line just means that while prelink currently doesn't work with dlopen it could, if it did some of the assumptions -Bdirect does, and I have a patch that can emulate the effect of using prelinking for dlopen - I didn't use it here though)
[14:12:34] <Seli> [13:59:14] <Seli> that's theoretical maximum limit, which -Bdirect can no way achieve
[14:12:34] <Seli> [13:59:53] <yakbos> out of curiosity: whats taking the other 90% ?
[14:12:34] <Seli> [14:00:10] <Seli> actually, it probably could get quite close
[14:12:43] <Seli> [14:00:37] <Seli> 1/3 "in kernel", whatever that means
[14:12:48] <Seli> [14:01:02] <Seli> 20% in X
[14:12:54] <Seli> [14:01:48] <Seli> ~15% fonts stuff
[14:13:01] <Seli> [14:02:12] <Seli> 10% KConfig parsing, apparently
[14:13:24] <Seli> EOF ...
[14:13:49] <Seli> this Seli guy also believes prelink for dlopen would be better than -Bdirect, but 10% is probably not worth the effort
[14:14:19] <Seli> maybe if I get bored during Christmas or something, could be interesting
[14:14:36] <lars> Seli: is all of OpenSuSE prelinked now?
[14:14:42] <tronical> so is this Seli guy more optimistic about the widespread use of prelink? :)
[14:15:18] <tronical> and what does the Seli guy think about the OOo startup time? (just curious :)
[14:15:20] <Seli> lars: no, coolo and matz ... er ... got tired of trying prelink
[14:15:31] <Seli> well, let's say that prelink has some issues
[14:15:58] <Seli> causes disk fragmentation (and kernel guys apparently still believe the old rubbish "Linux doesn't need defragmentation")
[14:16:06] <Seli> you need to run the tool
[14:18:19] <tronical> Seli: the other disadvantage of prelink is that it kills the checksums in your rpm database (unless one hacks around it, which is ugly ;( ...
[14:16:20] <lars> Seli: I know. So if we don't prelink (for whatever reason), how much does -Bdirect help?
[14:16:42] <Seli> don't know, let me try the startup without prelink
[14:18:15] <Seli> 38% with the same setup spent in relocation processing ...
[14:22:12] <Seli> hmm, let's say -Bdirect could take care of ... say somewhere between 10 and 30 out of those 38%
[14:23:00] <Seli> no way it can achieve as much as prelink
[14:23:07] <Seli> I'd have to get some numbers to make a better guess ...
[14:23:46] <Seli> and I hope you are aware of the fact that -Bdirect alters the normal symbol lookup?
[14:24:24] <lars> Seli: yes, but I would even see this as an advantage in most cases :)
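For readers who want the breakdown from the log in one place, here is a small Python sketch that tabulates the quoted shares. It assumes every figure is a fraction of the total prelinked, warm-cache startup time (the log does not say so explicitly), and the dictionary and the `unaccounted` helper are made up for this illustration.

```python
# Approximate shares of prelinked, warm-cache startup time quoted in the log.
# Assumption: every figure is a percentage of the same total.
shares = {
    "relocation processing (dlopen only)": 10.0,
    "in kernel": 100.0 / 3,   # quoted as "1/3"
    "X": 20.0,
    "fonts": 15.0,            # quoted as "~15%"
    "KConfig parsing": 10.0,
}


def unaccounted(shares):
    """Percentage of startup time not covered by the listed categories."""
    return 100.0 - sum(shares.values())


if __name__ == "__main__":
    # Print the categories from largest to smallest, then the remainder.
    for name, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
        print(f"{name:40s} {pct:5.1f}%")
    print(f"{'everything else':40s} {unaccounted(shares):5.1f}%")
```

Under that assumption the listed categories cover a bit under 90% of startup, so roughly a tenth of the time is not attributed to anything in the log.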

Actually, now I think 30 is the better guess, and while -Bdirect can in principle never achieve as much as prelink, it should be possible to get quite close. Close enough not to make a difference in practice. I guess my belief that prelink would be a better solution comes from finding prelink a "cleaner" solution ... which may be both wrong and naive ... and I can sometimes be naive when it comes to choosing practical solutions. Let's say that with -Bdirect we would spend somewhat less than 10% in relocation processing. Note that while this is an educated guess, it's still only a wild guess.
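The arithmetic behind that guess can be written out as a rough Python sketch. The 38% figure and the 10 to 30 range come from the log; the function names are invented here, and the calculation assumes the saved points simply disappear from startup while everything else stays the same.

```python
RELOC_NO_PRELINK = 38.0  # % of startup spent in relocation processing, no prelink


def remaining_relocation(saved_points, total=RELOC_NO_PRELINK):
    """Relocation cost left over, in points of the original startup time."""
    return total - saved_points


def share_of_new_total(saved_points, total=RELOC_NO_PRELINK):
    """The same leftover cost as a percentage of the shortened startup."""
    left = remaining_relocation(saved_points, total)
    return 100.0 * left / (100.0 - saved_points)


if __name__ == "__main__":
    # Pessimistic and optimistic ends of the guessed -Bdirect saving.
    for saved in (10.0, 30.0):
        print(f"saved {saved:.0f} points: "
              f"{remaining_relocation(saved):.0f} points left, "
              f"{share_of_new_total(saved):.1f}% of the new startup time")
```

With the optimistic 30 points saved, 8 points of the original startup time remain in relocation processing, which is where "less than 10%" comes from. Measured against the shortened startup the share is slightly higher, which fits the wild-guess caveat.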

Another advantage of prelink would be saving more memory because of fewer dirty pages caused by relocations, but a) because of conflicts prelink is far from causing no dirty pages at all and fixing those would be a non-trivial amount of work, b) kdeinit takes care of it anyway.

As long as we don't spend 1/3 of startup time in the dynamic linker I guess it doesn't really matter in practice how it gets fixed.

Note that all these numbers are for tests with warm disk caches. When KDE has to read everything from the disk, startup takes several times longer. Improving this would need kernel support and/or various preloading techniques.

PS: No, you can't achieve the same startup speed yet. You'd need the latest (unstable) fontconfig, a few patches and so on. But don't worry, that'll eventually get to you (as far as fontconfig/Qt/KDE are concerned, I can't say about the rest of the things KDE would need for better performance).

PPS: Coolo has a nice bootchart picture showing KDE startup.