Say hello to KJS/Frostbyte -40.9° and Icemaker

Monday, 10 March 2008  |  sad eagle

If you've been scanning KDE commits lately, you may have wondered about heavy activity on an experimental KJS branch, KJS/Frostbyte. Well, while it's still not 100% done (there are a couple of bugs, and not all goals have been met yet), it's complete enough that I am comfortable blogging about it.

KJS/Frostbyte is a bytecode and performance-improvement version of KJS, which aims to provide both an immediate speedup and the infrastructure for further improvements. Right now, if I didn't screw up my notes, it's showing a ~1.4x speedup over the KDE 4.0.2 version of KJS on the SunSpider benchmark suite.

Now, you might wonder why "bytecode" and "performance improvement" are two separate bullets. Doesn't bytecode usually mean "faster"? Well, in this case, only somewhat; much of the speedup comes from other changes. KJS's traditional recursive nature matched very well with how ECMAScript is specified (ad hoc big-step operational semantics, basically), so it really didn't have much overhead.
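
To make that contrast concrete, here is a rough, hypothetical sketch of the recursive (tree-walking) style; it is not actual KJS code, and is simplified to plain doubles, but it shows how the interpreter's control flow mirrors the shape of the AST, much like the spec's big-step semantics:

// Hypothetical sketch of a recursive evaluator, not actual KJS code.
struct Node {
    virtual ~Node() {}
    virtual double evaluate() const = 0;
};

struct NumberNode : Node {
    double value;
    explicit NumberNode(double v) : value(v) {}
    double evaluate() const { return value; }
};

struct SubNode : Node {
    Node* lhs;
    Node* rhs;
    SubNode(Node* l, Node* r) : lhs(l), rhs(r) {}
    // Each operator node recurses into its children and combines the results.
    double evaluate() const { return lhs->evaluate() - rhs->evaluate(); }
};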

The value of bytecode is more:

  • It separates out the representation during parsing from the representation used during execution. This means they can both be kept as clean as possible. Previously, various optimizations would require things such as splicing of nodes (or the horrific, though clever, in-place replacement in JSC).
  • It provides a much simpler, flattened program form to operate on and reason about, opening the possibility of doing fancier things in the future. A lot of the subtler trickiness in the language is made far more explicit, too.
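
To make the second point concrete, here is a rough, hypothetical illustration of what "flattened" means (a made-up encoding, not the actual Frostbyte IR): an expression like (a - b) * c turns into a linear sequence of instructions writing into numbered registers, with no tree structure left to recurse over.

// Hypothetical, simplified encoding -- not the real KJS/Frostbyte format.
enum OpCode { Op_GetLocal, Op_Sub, Op_Mul };

struct Instruction {
    OpCode op;
    int dst;          // destination register
    int src1, src2;   // operand registers / local slot (-1 if unused)
};

// (a - b) * c, with a, b, c in local slots 0, 1, 2:
//   r0 = a; r1 = b; r2 = r0 - r1; r3 = c; r4 = r2 * r3
static const Instruction program[] = {
    { Op_GetLocal, 0, 0, -1 },
    { Op_GetLocal, 1, 1, -1 },
    { Op_Sub,      2, 0,  1 },
    { Op_GetLocal, 3, 2, -1 },
    { Op_Mul,      4, 2,  3 },
};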

To help with both, and to stay flexible, we do not hand-code the bytecode language. Instead, a description file with a list of IR types, instructions, etc., is fed to a special tool I wrote, Icemaker. Icemaker converts that into:

  1. The virtual machine main loop (a rough sketch of what this might look like follows the list)
  2. Instruction description tables, which with some enhancement could perhaps be used to drive a dataflow engine
  3. Instruction selection and conversion cost tables
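
As promised above, the main loop could plausibly come out as a big dispatch switch, with one case per generated instruction variant. The following is a hand-written approximation, not Icemaker's actual output; the opcode names and structures are made up for illustration, and real generated code also has to deal with full JS values, conversions, exceptions, and so on.

#include <cstddef>

// Hypothetical instruction set, simplified to plain doubles in registers.
enum OpCode { Op_Sub_NumNum, Op_Return };

struct Instruction {
    OpCode op;
    int dst, src1, src2;
};

double run(const Instruction* code, double* regs)
{
    for (std::size_t pc = 0; ; ++pc) {
        const Instruction& in = code[pc];
        switch (in.op) {
        case Op_Sub_NumNum:
            // Body taken almost verbatim from the description file:
            // impl number(number v1, number v2) [[ $$ = v1 - v2; ]]
            regs[in.dst] = regs[in.src1] - regs[in.src2];
            break;
        case Op_Return:
            return regs[in.src1];
        }
    }
}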

Here, too, we have a layer of indirection, keeping things cleaner. For a subtraction operation, we can describe its bytecode merely as:

operation Sub {
    impl number(number v1, number v2) [[
        $$ = v1 - v2;
    ]]
}

Then, when compiling JavaScript into bytecode, we merely ask for an Op_Sub with given arguments. Sounds trivial? Well, it's not quite that simple, since the arguments don't have to be numbers. Using the tables Icemaker built, the instruction selection engine can automatically put in conversions, if needed. It can also do better than that, and pick a more efficient variant, if more than one is described:

operation BracketGet {
    impl value (value v1, value v2) costs 50 [[
        // snip code
    ]]

    overload value (value base, int32 prop) [[
        $$ = base->getByIndex(exec, prop);
    ]]
}
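
To give an idea of how those tables might be used (again, a hand-written approximation, not Icemaker's actual output or API, and the cost numbers are invented), selection boils down to summing a variant's declared cost with the cost of converting each argument to the type that variant expects, then picking the cheapest combination:

#include <climits>

// Hypothetical types and tables, just to show the shape of cost-driven
// variant selection.
enum Type { TypeValue = 0, TypeInt32 = 1, TypeNumber = 2 };

struct Variant {
    int opcode;        // which generated instruction this variant maps to
    Type argTypes[2];  // parameter types from the description file
    int cost;          // declared cost (e.g. 'costs 50'), or a default
};

// Toy conversion cost table: conversionCost[have][want]; 0 on the diagonal.
static const int conversionCost[3][3] = {
    /* from value  */ { 0, 20, 10 },
    /* from int32  */ { 5,  0,  1 },
    /* from number */ { 5,  5,  0 },
};

// Pick the cheapest variant for the operand types the compiler actually has.
// For BracketGet above, a known-int32 index would make the int32 overload
// win over the generic value/value implementation.
const Variant* selectVariant(const Variant* variants, int count,
                             Type arg0, Type arg1)
{
    const Variant* best = 0;
    int bestCost = INT_MAX;
    for (int i = 0; i < count; ++i) {
        int c = variants[i].cost
              + conversionCost[arg0][variants[i].argTypes[0]]
              + conversionCost[arg1][variants[i].argTypes[1]];
        if (c < bestCost) {
            bestCost = c;
            best = &variants[i];
        }
    }
    return best;
}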

With this, little tweaks to the IR can be tried easily, and specializations can often be done without even touching the compiler proper...

Coming up at some point later: some benchmark numbers... After I figure out what's wrong with this ScopeChainNode allocation optimization, anyway.