
(k)Ubuntu GNU awk messed up ? and KDE on yet another OS :-)

Wednesday, 25 February 2009  |  alexander neundorf

Hi,

today at work I noticed something strange. My box there has kUbuntu 7.10 (yes, I know, quite old, but does what it is supposed to do). I have an awk script which I want to use to process a text file consisting of 4.2 million lines, something like 600 MB.

Now, 7 years ago, computers were smaller and slower, and I can remember using awk back then for some heavy text processing. So the same should be possible today, just faster. Or so I thought.

Well, while awk was running, it consumed more and more memory. After 2 million lines it basically stopped making progress and was only swapping. I removed all unnecessary code from my script, and then I managed to run it over the complete file, but it took 20 minutes (on a 2.something GHz system with 1 GB RAM !!!). I had another really close look at my awk code, and I didn't see any place where I could be accumulating memory. Strange.

So I set out to find another awk to see whether my GNU awk maybe has some problem.

I found awka, which is an awk-to-C compiler, but which didn't seem too alive. Then I found mawk, which is another regular awk, and compiled this one. How to put it, this one was blazingly fast: it processed the 4 million lines in 10 seconds or so :-) ! Then I added the commented-out code back in, and now the full version, which my original GNU awk could not process to the end at all, was processed in 30 seconds by mawk. Is mawk just that much faster, or is GNU awk in (k)Ubuntu 7.10 in some way messed up? The only issue seems to be that mawk returns one character less for gsub(), which is ugly and requires checking the whole script (but at least it's consistent everywhere, so it shouldn't actually influence the result).
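For anybody who wants to check such a gsub() difference on their own box, here is a minimal sketch. It assumes the difference shows up in the value gsub() returns (the number of substitutions it made), which may not be exactly what happens in my script. The helper name gsub_count and the AWK environment variable are made up for this example; they are not part of any awk.

```shell
#!/bin/sh
# Hypothetical helper: print the value gsub() returns for one input line.
# AWK is an assumed environment variable that selects the interpreter
# (for example gawk or mawk); it falls back to plain "awk".
gsub_count() {
    # $1 = input line, $2 = regex, $3 = replacement text
    # Note: values passed with -v go through awk's escape processing,
    # so backslashes in the regex are avoided here.
    printf '%s\n' "$1" | "${AWK:-awk}" -v pat="$2" -v rep="$3" \
        '{ print gsub(pat, rep) }'
}

# Count how many dots get replaced in a small test string.
gsub_count "a.b.c" "[.]" "-"
```

Running it once with AWK=gawk and once with AWK=mawk on a few lines taken from the real input should show whether the two disagree, and by how much.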

Now to something completely different: you know KDE runs more or less everywhere, also on this evil OS from somewhere in the States, I forgot the exact name ;-) Well, there is also a free variant of it: ReactOS ! And, yes, KDE runs (almost) on it ! Here is the announcement email from the kde-windows list. But, as you can see, your help is needed to get KDE really running on this free OS ! Come and join the effort !

Alex