Implementing C++ implicit type conversions on method arguments in Smoke based language bindings

Monday, 1 February 2010  |  Richard Dale

I'm sorry about the unwieldy title of this blog - I couldn't think of a shorter, snappier way of putting it, but I'll try to explain the tricky problem with 'C++ implicit type conversions' that I've managed to solve.

Although I've been living in Gran Canaria for over four years, until recently most of my books were still lying around my flat in cardboard boxes. A few months ago I went to the IKEA sale and picked up two of their 'Linnarp' bookshelves for a good price. I was excited about my 'Linnarps' until I read that building them involved hammering nails, and my DIY phobia kicked in. I looked at the size of the nails and thought the tiny hammer from my fret saw set was going to be too small, and I didn't have anything bigger. So I tried to buy a hammer in the Spanish shops and found that they don't seem to do small to medium-sized hammers, only what I would regard as quite big hammers. Maybe it is to do with the Latin 'macho' mindset. Personally, I would rather not use a hammer at all than use one that I was unhappy with. So my 'Linnarps' just sat there in their boxes for a few more months, until I went back to the UK for Christmas and was able to buy a nice small Stanley hammer.

After waiting patiently to pick up the right hammer, I returned to Gran Canaria, set to work on the 'Linnarps' and got them built. I could finally put all my treasured books in bookcases where I could actually find them. My computer books filled a whole 'Linnarp', and one of the ones I like most is 'Compilers - Principles, Techniques and Tools' by Aho, Sethi and Ullman, better known as the Dragon Book. I have the first edition with a red dragon on it (the dragon subsequently went from red to green to purple). The dragon on the cover is a metaphor for conquering complexity in compiler design - a mean looking beast, not like our friendly Konqi. If you want to be a knight who slays dragons and don't want to end up as toast, you had better be pretty well prepared and study things like LALR parsers and so on.

If you love solving very complex compiler implementation problems, then C++ is the language for you! It is arguably the most complex language of all with respect to implementing a compiler, and implementing a language binding for a dynamic language to the C++ API involves dealing with some of those same complexities. One of the big 'dragons' that has been bugging me since I first started working on QtRuby over six years ago is implicit type conversion on method arguments. You can pass an instance of one type to a C++ method that expects another type, and as long as the expected type has a constructor, not marked 'explicit', that takes a single argument of the type you passed, it will work. So if you pass an 'int' to a method expecting a QVariant, then because there is a QVariant constructor that takes an 'int', the C++ compiler will construct a QVariant from the 'int' for you, and pass that to the method.

Here is an example in JSmoke JavaScript of using the Qt 4.6 animation framework:


button = new QPushButton("Animated Button");
button.show();

animation = new QPropertyAnimation(button, "geometry");
animation.setDuration(10000);
animation.setStartValue(new QRect(0, 0, 100, 30));
animation.setEndValue(new QRect(250, 250, 100, 30));
animation.start();

QApplication.exec();

The code is pretty much the same as it would be in C++. Without implicit type conversions every converted argument would have to be wrapped by hand, which is just that bit clunkier:


button = new QPushButton("Animated Button");
button.show();

animation = new QPropertyAnimation(button, new QByteArray("geometry"));
animation.setDuration(10000);
animation.setStartValue(new QVariant(new QRect(0, 0, 100, 30)));
animation.setEndValue(new QVariant(new QRect(250, 250, 100, 30)));
animation.start();

QApplication.exec();

There are several possible ways I had thought of to solve the problem:

  • One solution would be to just special case all the instances in the API where this can occur, but there are quite a lot of places, and for QtRuby only some of the more important uses have been special cased.
  • Another way would be to generate extra methods in the Smoke libraries for all the combinations of extra argument types that are possible, but it would probably mean the libs would be significantly larger and the API would be more cluttered when you looked at it with introspection tools.
  • Instead of adding the extra methods at library code generation time, another approach could be to generate all the extra possible methods at startup after loading the Smoke libs, and add the extra methods to the statically generated lookup tables.

The problem seemed pretty intractable, and rather than solving it badly I preferred not to solve it at all, just like I had waited to get hold of exactly the right hammer before assembling my bookshelves. In the end I came up with the idea of looking up and matching possible constructors for argument type conversions from the existing introspection data, which is pretty elegant. As the complete API is described in the Smoke lookup tables, it is possible to find all the info with no changes needed to the Smoke libs. The method matching code first tries to match without looking for implicit type constructors, and only if that fails does it go into a second, more expensive phase. That way, the exact matching for the majority of methods that don't have any implicit conversions should run at much the same speed as before.
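To make the two phases a bit more concrete, here is a minimal sketch in plain JavaScript. It is not the JSmoke implementation: the 'constructors' and 'methods' tables, the type tags and every function name are made up for illustration, and the real Smoke lookup tables are laid out quite differently.

```javascript
// Hypothetical introspection data, standing in for the Smoke lookup tables.
const constructors = {
  QByteArray: [["string"]],          // stands in for QByteArray(const char*)
  QVariant: [["number"], ["QRect"]], // stands in for QVariant(int), QVariant(const QRect&)
};

const methods = [
  { name: "setDuration", params: ["number"] },
  { name: "setStartValue", params: ["QVariant"] },
];

// Describe a JavaScript value by a simple type tag (assumed convention:
// wrapped Qt instances carry a 'className' property).
function typeOf(value) {
  return typeof value === "object" && value !== null ? value.className : typeof value;
}

// Phase 1: every argument type must equal the parameter type.
function exactMatch(method, argTypes) {
  return method.params.length === argTypes.length &&
    method.params.every((p, i) => p === argTypes[i]);
}

// Phase 2: an argument also matches if the parameter type has a
// single-argument constructor that accepts the argument's type.
// Returns one entry per argument (null where no conversion is needed),
// or null if some argument cannot be matched at all.
function convertingMatch(method, argTypes) {
  if (method.params.length !== argTypes.length) return null;
  const conversions = [];
  for (let i = 0; i < argTypes.length; i++) {
    const param = method.params[i];
    if (param === argTypes[i]) { conversions.push(null); continue; }
    const ctor = (constructors[param] || []).find(
      (sig) => sig.length === 1 && sig[0] === argTypes[i]);
    if (!ctor) return null;
    conversions.push(`${param}(${ctor[0]})`);
  }
  return conversions;
}

// Try the cheap phase first and fall back to the expensive one.
function resolve(name, args) {
  const argTypes = args.map(typeOf);
  const candidates = methods.filter((m) => m.name === name);
  const exact = candidates.find((m) => exactMatch(m, argTypes));
  if (exact) return { method: exact, conversions: [], phase: 1 };
  for (const m of candidates) {
    const conversions = convertingMatch(m, argTypes);
    if (conversions) return { method: m, conversions, phase: 2 };
  }
  return null;
}
```

With these tables, resolve("setDuration", [10000]) is settled in phase 1, while resolve("setStartValue", [{ className: "QRect" }]) only succeeds in phase 2, via the stand-in for the QVariant constructor that takes a QRect.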

In JSmoke you can set a 'Qt.Debug.trace' global variable to add various sorts of debug tracing. One option is 'Qt.Debug.MethodMatches', which shows how methods were matched against their argument types. If we look at the output from matching the QByteArray argument in the QPropertyAnimation constructor, it looks like this:


Qt.Debug.trace = Qt.Debug.MethodMatches;
animation = new QPropertyAnimation(button, "geometry");
Qt.Debug.trace = Qt.Debug.None;
...

    Argument type conversion matches@animation.js:-1 for 
        QByteArray.QByteArray(geometry):
            QByteArray::QByteArray(const char*) 
            module: qtcore index: 530 matchDistance: 1
Method matches@animation.js:-1 for 
    QPropertyAnimation.QPropertyAnimation([object QPushButton:0x09307a78], geometry):
        QPropertyAnimation::QPropertyAnimation(QObject*, const QByteArray&) 
        module: qtcore index: 3922 matchDistance: 4

For each C++ type in the target method, the corresponding JavaScript argument is matched and given a score as to how close the match was. In the case above, an argument type of 'const char *' in the QByteArray constructor was given a value of 1, whereas a QString argument type would have been preferred and given a value of 0. The first argument to the QPropertyAnimation constructor is a 'QObject*' and we provided a QPushButton, which is three levels down in the inheritance hierarchy and so is given a 'matchDistance' of 3. That brings the total of the match scores, or 'matchDistance', for the QPropertyAnimation constructor to 4. You can see the class hierarchy by using the handy smokeapi command line tool:


$ smokeapi -r qtgui -p -c QPushButton
      QObject
      QPaintDevice
    QWidget
  QAbstractButton
QPushButton
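The scoring for the QPropertyAnimation constructor can be sketched as follows. This is only an illustration under stated assumptions: the 'parentOf' table and both function names are hypothetical, the sketch follows a single parent per class (so it ignores QPaintDevice), and it assumes the remaining 1 in the total comes from the converted second argument, as the trace suggests.

```javascript
// Hypothetical single-parent table built from the hierarchy shown above.
const parentOf = {
  QPushButton: "QAbstractButton",
  QAbstractButton: "QWidget",
  QWidget: "QObject",
  QObject: null,
};

// Number of inheritance steps from 'cls' up to 'target',
// or -1 if 'target' is not an ancestor of 'cls'.
function inheritanceDistance(cls, target) {
  let distance = 0;
  for (let c = cls; c; c = parentOf[c]) {
    if (c === target) return distance;
    distance++;
  }
  return -1;
}

// Sum of the per-argument scores, or -1 if any argument failed to match.
function matchDistance(scores) {
  return scores.some((s) => s < 0) ? -1 : scores.reduce((a, b) => a + b, 0);
}

const first = inheritanceDistance("QPushButton", "QObject"); // 3
const second = 1; // score taken from the trace for the string to QByteArray conversion
console.log(matchDistance([first, second])); // prints 4
```

A lower total means a closer match, so when several overloads match, the one with the smallest 'matchDistance' would be the natural choice.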

An extra wrinkle in the C++ complexity of implicit argument type conversions is that as well as matching constructors, the compiler will also look for type conversion operators such as 'QImage::operator QVariant()'. So if you pass a QImage to a method expecting an argument type of 'QVariant' it will use the operator method to construct the QVariant. The JSmoke overloaded method resolution code also handles this case as it can dynamically look up and call those operator methods to do the conversions.
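Looking for a conversion then means checking two places: the constructors of the target type and the conversion operators of the source type. Here is a rough sketch of that search; the 'classes' table and 'findConversion' are invented for illustration, and trying constructors before operators is an arbitrary choice in this sketch rather than something JSmoke is known to do.

```javascript
// Hypothetical introspection data; only the entries needed for the example.
const classes = {
  QVariant: { constructors: [["int"], ["QRect"]], operators: [] },
  QImage: { constructors: [], operators: ["QVariant"] },
  QRect: { constructors: [], operators: [] },
};

// Return a description of the method that converts 'fromType' to 'toType',
// or null if neither a constructor nor a conversion operator is found.
function findConversion(fromType, toType) {
  // A single-argument constructor on the target type.
  const target = classes[toType];
  if (target && target.constructors.some(
      (sig) => sig.length === 1 && sig[0] === fromType)) {
    return `${toType}::${toType}(${fromType})`;
  }
  // A conversion operator on the source type.
  const source = classes[fromType];
  if (source && source.operators.includes(toType)) {
    return `${fromType}::operator ${toType}()`;
  }
  return null;
}

console.log(findConversion("int", "QVariant"));    // QVariant::QVariant(int)
console.log(findConversion("QImage", "QVariant")); // QImage::operator QVariant()
```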

The next stage in implementing the JSmoke runtime will be to add a 'method selector cache', which will allow methods that have already been resolved against a certain set of JavaScript argument types to be retrieved given the JavaScript type signature. The approach I have taken to the implicit type conversions will work well with a cache, as the result of the lookup is encoded into a QVector with one entry for the main method, with optional extra entries for any type conversion methods that are needed. So a language binding based on a language-independent introspection library doesn't have to be slower than a conventional static binding, as long as the caching scheme works well.
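Since the cache doesn't exist yet, the following is only a guess at its general shape. The key format, the 'selectorCache' map and the 'resolveSlowly' stub are all assumptions, and a plain array stands in for the QVector of method entries.

```javascript
// Hypothetical cache from a JavaScript type signature to resolved entries.
const selectorCache = new Map();
let slowLookups = 0; // counts how often the expensive path runs

// Stand-in for the full two-phase method resolution.
function resolveSlowly(className, methodName, argTypes) {
  slowLookups++;
  // First entry: the main method. Further entries: conversion methods.
  const entries = [`${className}::${methodName}`];
  if (argTypes.includes("string")) {
    // Pretend a string argument needed a QByteArray conversion.
    entries.push("QByteArray::QByteArray(const char*)");
  }
  return entries;
}

// Look up the resolved entries by signature, resolving only on a miss.
function cachedResolve(className, methodName, args) {
  const argTypes = args.map((a) => typeof a);
  const key = `${className}.${methodName}(${argTypes.join(",")})`;
  let entries = selectorCache.get(key);
  if (entries === undefined) {
    entries = resolveSlowly(className, methodName, argTypes);
    selectorCache.set(key, entries);
  }
  return entries;
}
```

Calling cachedResolve twice with arguments of the same JavaScript types runs the slow resolution only once; the second call is a single map lookup that returns the main method together with any conversion methods.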