On measuring memory usage

Oh boy. Gwenview uses 83% more memory than Kuickshow ( - the last comment as of now). BTW I especially like the "83%" part - it's just one case and the measurement is imprecise, but it can't be "about 80%" or "almost double". Reminds me of all those "hair 74% stronger" ads. Anyway.

It really annoys me that there's no good tool for measuring memory usage on Linux. There are tools, like 'top', but they often cause more harm than good - most people don't even know what the fields really mean and only a few people can interpret them correctly. Mind you, even I'm not sure I can, and in fact I sometimes doubt such a person even exists. The problem is, even interpreting the numbers may not give the answer. Measuring memory usage on Linux is voodoo magic.

Let's have a look at this Gwenview vs Kuickshow case. First let's measure their memory usage after startup and then the memory usage (increase) when showing a 3000x2000 image. Startup memory usage will be measured when showing an empty directory; memory usage when showing the image will be measured in a directory containing only that image (that's in order to prevent both Gwenview and Kuickshow from preloading the next image in the directory and skewing the results - there are so many ways of getting benchmarks wrong that people should be allowed to do that only with a special permission).

So, startup memory usage: 'top' gives these numbers (VIRT/RES/SHR) : 31.7M/18M/15M for Gwenview and 30.6M/17M/14M for Kuickshow. Big difference, huh? But these numbers alone mean nothing if you can't explain them. So let's try that.

VIRT is virtual memory usage, it can probably be best described as the app's used address space - every library the app uses, all the data it creates, everything is included here. If the app requests 100M of memory from the kernel but actually uses only 1M, VIRT will still increase by 100M.

RES is resident memory usage, i.e. what's actually in the memory. In a way it could probably be used for measuring real memory usage of the app - if the app requests 100M of memory from the kernel but actually uses only 1M, this should increase only by 1M. There are only two small problems: a) RES doesn't include memory that's swapped out (and no, the SWAP field in 'top' is not usable, it's completely bogus), b) some of that memory may be shared.

SHR is shared memory. Potentially shared memory. I.e. memory that may be used not only by this particular app but also by something else. And actually it seems to be the shared part of RES - SHR goes down if the app is swapped out, at least with recent kernels. I actually don't think it used to do that before, I used to measure unshared memory usage simply as VIRT-SHR and it seemed to give usable numbers. If it has always been like this then I guess I must have produced a couple of bogus benchmarks in the past. Oh well.

It seems the DATA field does the job of saying how much total unshared memory the app is using (if it's not visible it can be added using the 'f' key). That gives 2.6M for Gwenview vs 1.75M for Kuickshow. I'm not sure why it's so much; some of that is fontconfig and XIM, some of it may perhaps be relocations. The difference is 0.85M, out of which about 0.28M is some lameness in the XCF (Gimp image) loader. Who knows what the rest is and where it comes from, I don't feel like analysing that now. Moreover, from Valgrind's Massif output for Gwenview it looks like it should be only 2.2M and not 2.6M, which, together with leaving out the XCF thing, reduces the difference to about 0.2M, and I really don't feel like checking if it's only Gwenview using XMLGUI and Kuickshow not, or also something else. Exercise for the reader.
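For readers who would rather read the raw numbers than squint at 'top', here is a minimal sketch of pulling the same four fields out of /proc/<pid>/statm. It assumes a Linux system and assumes the field order documented in proc(5); the mapping of statm fields to top's VIRT/RES/SHR/DATA columns is my understanding of where top gets them, not something checked in the text above. The function names are made up for illustration.

```python
import os

# Field order of /proc/<pid>/statm as documented in proc(5); values are in pages.
STATM_FIELDS = ("size", "resident", "shared", "text", "lib", "data", "dt")


def parse_statm(line, page_size=4096):
    """Turn one statm line into byte counts keyed by top-style column names."""
    pages = dict(zip(STATM_FIELDS, (int(x) for x in line.split())))
    return {
        "VIRT": pages["size"] * page_size,      # total address space
        "RES": pages["resident"] * page_size,   # pages currently in RAM
        "SHR": pages["shared"] * page_size,     # resident pages that may be shared
        "DATA": pages["data"] * page_size,      # data + stack (virtual, not resident)
    }


def read_statm(pid="self"):
    """Read and parse statm for a process (Linux only)."""
    with open(f"/proc/{pid}/statm") as f:
        return parse_statm(f.read(), os.sysconf("SC_PAGE_SIZE"))
```

Calling read_statm(1234) for some hypothetical PID 1234 returns a dict of byte counts. It gives the same numbers with the same caveats as 'top', it just makes them scriptable.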

After viewing the image the numbers go up to (VIRT/RES/DATA) : 60M/44M/30M for Gwenview and 48.4M/35M/19M for Kuickshow. That's about 18M increase for Kuickshow, which is simply 3000x2000x3, i.e. image dimensions and 24bpp (RGB triplet), so it's just the image data. For Gwenview the increase is about 27M for VIRT and DATA, 26M for RES. Gwenview uses QImage for storing image data, which always stores truecolor images as 32bpp, that's 3000x2000x4=24M.
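The arithmetic above, spelled out as a small sketch. The only assumption is that the pixel buffer is a plain uncompressed width x height x bytes-per-pixel array with no row padding; the helper name is made up.

```python
def image_bytes(width, height, bytes_per_pixel):
    """Size of an uncompressed, unpadded pixel buffer in bytes."""
    return width * height * bytes_per_pixel


MIB = 1024 * 1024

rgb24 = image_bytes(3000, 2000, 3)  # 24bpp, RGB triplet
rgb32 = image_bytes(3000, 2000, 4)  # 32bpp, one padding/alpha byte per pixel

print(f"24bpp: {rgb24} bytes = {rgb24 / MIB:.1f} MiB")
print(f"32bpp: {rgb32} bytes = {rgb32 / MIB:.1f} MiB")
print(f"extra for 32bpp: {(rgb32 - rgb24) / MIB:.1f} MiB")
```

That is 18,000,000 vs 24,000,000 bytes, or roughly 17.2 vs 22.9 if the tool reports in units of 1024x1024 bytes.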

The 1M difference between VIRT/DATA and RES is actually caused by threads. Gwenview uses a thread to load the thumbnail, and this 1M is the stack space for the additional thread (although it's presented as data). Since almost none of this reserved space is actually used, RES doesn't grow by this 1M but VIRT does. It also shows that DATA grows too. That means the DATA field has its problems too and maybe the 2.6M vs 1.75M data comparison is a bit bogus too. BTW, I have no idea why this 1M is not freed when the thread finishes, maybe a bug, and what's even more interesting is that on another machine it's not 1M but 100M. Now that's something that makes Gwenview look memory hungry.
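The "reserved but not used" effect is easy to reproduce. The sketch below is not a thread stack; it uses an anonymous mmap as a stand-in for any reservation that is never touched, and it assumes Linux with /proc/self/statm available. The function names are made up for illustration.

```python
import mmap
import os


def read_virt_res():
    """Return (VIRT, RES) of the current process in bytes (Linux only)."""
    page = os.sysconf("SC_PAGE_SIZE")
    with open("/proc/self/statm") as f:
        size, resident = f.read().split()[:2]
    return int(size) * page, int(resident) * page


def reserve_without_touching(n_bytes):
    """Map n_bytes of anonymous memory, never write to it, report the growth."""
    virt_before, res_before = read_virt_res()
    region = mmap.mmap(-1, n_bytes)  # address space reserved, no page touched
    virt_after, res_after = read_virt_res()
    region.close()
    return virt_after - virt_before, res_after - res_before


if __name__ == "__main__":
    virt_growth, res_growth = reserve_without_touching(100 * 1024 * 1024)
    print(f"VIRT grew by {virt_growth} bytes, RES grew by {res_growth} bytes")
```

VIRT should jump by about the full reservation while RES barely moves, which is the same pattern as the unused thread stack.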

So, DATA is probably not really useful. Now onto VIRT. I don't use KIPI plugins (plugins for image applications). After installing the kipi-plugins package, the numbers for VIRT/RES/DATA change for Gwenview to 50M/30M/5M (that's right after startup, so compare this to 31.7M/18M/2.6M). Just to quickly explain the DATA change: about 1M of it is caused by libgphoto2 and nvidia libraries, no idea why, probably something similarly lame to the XCF loader. The rest of the difference is probably mainly initialization of the plugins (which is lame too, no idea why it's so much, not feeling like bothering to find out, but I guess Aurelien should reconsider my Gwenview patch for caching, wrapping and loading KIPI plugins on demand). Back to VIRT. The difference there is almost 20M. The gallery plugin needs libkhtml, which is about 4M. The slideshow plugin links to OpenGL libraries, which here means about 8M. The rest is just the code of the plugins.

Now, which numbers to use and how? DATA is unshared memory, so it's only memory that exclusively belongs to this instance of the application. All that memory is really only used by it. However, as the stack case shows, DATA is the virtual unshared data size, so it's not only really used memory. Moreover, memory is not just the unshared data. All shared data is actually just potentially shared. If there's only one Gwenview instance and it's the only application using KIPI plugins, then all the memory used by KIPI plugins, even the memory used for their code, no matter how theoretically shared, is its and only its. So should VIRT be the number used? That number also includes memory used by all KDE libraries, which are definitely shared and already loaded anyway. The same very likely applies to libkhtml. And also, again the stack case, VIRT includes even memory that's not really used. So maybe RES? If you avoid swap, this number should give about the amount of memory that's really needed and used at the moment. Except that part of it is again shared, and "memory used at the moment" feels a bit fishy - the kernel may e.g. discard memory used by a shared library because it can load it again from the disk anytime.

Totally confused by now? Good ... maybe after reading this you'll think twice before you decide to look at 'top' output and provide some "benchmarks" on how some application or even KDE in general is bloated and needs a lot of memory. Numbers mean nothing if you can't explain them. If somebody shows you a memory benchmark based on 'top' or 'ps' and can't explain the numbers, you can just as well ignore it.

PS: This doesn't mean it's impossible to get some at least somewhat useful numbers and work on improving memory usage. Just know what you're measuring and think twice (or even better, more than just twice). In fact using Valgrind's Massif or kdesdk/kmtrace is not that difficult and it can give good pointers. After all, they both just measure malloc() memory usage. That's just boring compared to all this.

PPS: Just in case you wonder how I know things like 8M shared memory is taken by OpenGL libraries, 1M unshared memory is allocated by libgphoto2 and nvidia libraries or 1M is taken by thread stack space, it's the 'pmap' tool. Quite handy. It wasn't just it though, my brain and crystal ball were involved too.
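For the scripting-inclined, a similar per-library breakdown can be sketched from /proc/<pid>/smaps instead of 'pmap' output. This is not what was used for the numbers above; it assumes a kernel new enough to provide smaps, and assumes the Shared_Clean/Shared_Dirty/Private_Clean/Private_Dirty field names documented in proc(5). The function name is made up.

```python
import re
from collections import defaultdict

# A mapping header looks like: "b7e00000-b7f00000 r-xp 00000000 03:02 1234  /usr/lib/libfoo.so"
HEADER = re.compile(r"^[0-9a-f]+-[0-9a-f]+\s")
SHARED_KEYS = {"Shared_Clean", "Shared_Dirty"}
PRIVATE_KEYS = {"Private_Clean", "Private_Dirty"}


def summarize_smaps(text):
    """Sum resident shared/private kB per mapped file from smaps text."""
    totals = defaultdict(lambda: {"shared_kb": 0, "private_kb": 0})
    current = None
    for line in text.splitlines():
        if HEADER.match(line):
            parts = line.split(None, 5)
            # Mappings without a backing file (heap, thread stacks, ...) have no path.
            current = parts[5].strip() if len(parts) == 6 else "[anon]"
            continue
        key, sep, rest = line.partition(":")
        if current is None or not sep:
            continue
        if key in SHARED_KEYS:
            totals[current]["shared_kb"] += int(rest.split()[0])
        elif key in PRIVATE_KEYS:
            totals[current]["private_kb"] += int(rest.split()[0])
    return dict(totals)
```

Feeding it open("/proc/self/smaps").read() gives one entry per library or file, plus an "[anon]" bucket. Note that these are resident sizes only, so an untouched thread stack will show up as nearly nothing here, unlike in VIRT.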

PPPS: If somebody actually knows a good way to measure memory usage properly and get some useful numbers, I'd like to know. This is the best I know.


Thanks for adding some useful detail to my (seemingly generally correct) knowledge of this issue.

Valgrind tutorials for new developers and perhaps automatic analysis tools based on it would be cool things to have.

By Cristian Tibirna at Thu, 09/15/2005 - 12:10

I found that simply using free and having no swap gives some useful data as well. So

swapoff -a
free # look at the '-/+ buffers/cache' line

and the difference is the memory allocated, and actually used if I'm not mistaken.

By koos vriezen at Fri, 09/16/2005 - 11:41
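The number from the '-/+ buffers/cache' line in the comment above can also be computed directly from /proc/meminfo, which makes before/after differences easy to script. A minimal sketch, assuming the MemTotal, MemFree, Buffers and Cached fields of /proc/meminfo and assuming that the 'free' line is total minus free minus buffers minus cached; the function names are made up.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo text into a dict of integer values (mostly kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields:
            values[key.strip()] = int(fields[0])
    return values


def used_without_cache_kb(info):
    """Memory in use, not counting buffers and page cache, in kB."""
    return info["MemTotal"] - info["MemFree"] - info["Buffers"] - info["Cached"]


def snapshot_kb():
    """Current value on a Linux system; call before and after starting the app."""
    with open("/proc/meminfo") as f:
        return used_without_cache_kb(parse_meminfo(f.read()))
```

Take one snapshot_kb() before starting the application and one after, with swap off, and subtract.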

Yes, that can give some useful data. Unfortunately sometimes it's about as useful as the others. It seems that 'cached' includes memory used by read-only mmap-ed files. If you create a directory with some huge file, make a copy of that directory and run 'Compare directories/Thorough' in Midnight Commander on the directories, it will mmap both the files and compare them. VIRT and RES of mc will be huge, but most of that memory will be included in 'cached'.
In KDE's case this means that the -/+ buffers/cache line in 'free' doesn't include e.g. ksycoca, which definitely is memory used by KDE.

By Lubos Lunak at Mon, 09/19/2005 - 07:52

Thanks for writing this, it was very useful and educational. I too have often wished for a decent memory analysis tool for linux/etc.

By mxcl at Sat, 09/17/2005 - 04:18