
Fun (?) with symbol visibility...

Friday, 19 January 2018  |  alexander neundorf

Over the last few days I had to deal with loading plugins via dlopen, and I have already learned a lot about symbols on Linux.

I expected that if I have an executable and load a plugin into it, the stuff inside the executable (i.e. object files and static libraries) won't be used to resolve missing symbols in the plugin. But that's wrong: by default, all symbols are visible, and so all the symbols in your executable are visible to a plugin. I couldn't find the relevant blog posts or mailing list entries from the KDE devs, from back in the KDE2 days when all the plugin infrastructure was added to kdelibs. But others had to find this out, too:

So, even if your executable links a library statically, its symbols are visible to plugins. This can be fixed by setting the visibility to hidden, either with -fvisibility=hidden or with the visibility attribute.
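To make the two options a bit more concrete, here is a minimal sketch of the visibility attribute. The macro and function names are made up for illustration and are not taken from my test project; the build line in the comment is just one way to combine the attribute with the compiler flag.

```cpp
// host_visibility.cpp: a sketch with hypothetical names.
// Assumed build line: g++ -fvisibility=hidden -c host_visibility.cpp

// Hypothetical helper macros around the GCC-style visibility attribute.
#define MYAPP_EXPORT __attribute__((visibility("default")))
#define MYAPP_HIDDEN __attribute__((visibility("hidden")))

// Hidden explicitly: kept out of the dynamic symbol table, so a plugin
// cannot bind to it, whatever the command-line default is.
MYAPP_HIDDEN int internal_helper(int x) { return x * 2; }

// Marked explicitly: keeps default visibility even under -fvisibility=hidden.
MYAPP_EXPORT int host_api_version() { return 1; }
```

With -fvisibility=hidden on the command line, only the things marked with the export macro keep default visibility, so the attribute is then needed just for the few symbols a plugin is supposed to see.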

One more thing I didn't expect: even if no shared library is involved, the symbols, i.e. the code in your executable, are still visible to a plugin. Assume your plugin defines a class with the same name and the same method names as a class in the executable, i.e. the same symbols. You create an instance of that class in your plugin and call some function on it. I didn't expect that at runtime the plugin might call the code from the executable instead of the class built into the plugin itself (i.e. not pulled in from a shared lib). Again, making symbols hidden helps, in general. Here's something related:

Today I ended up at one point in a state where all the correct functions from the plugin were called, e.g. the correct ctor, but later on, when a virtual function of that object was called, it was the virtual function from the executable and not the one from the plugin. Weird. How could that happen?

I added a little project on GitHub for playing around with such stuff:

My conclusion so far is that in general you probably always want to build executables and static libraries with visibility=hidden. Not sure why this is not the default...

Update: Different behaviour with different compilers

I played around more and added an example to reproduce really strange behaviour on github in the vtabletest subdir.

In that example, I have an executable which implements a class Base and a class Derived, which is derived from it. Base has virtual and non-virtual functions, and Derived reimplements both virtual functions. This executable dlopens a plugin/shared library, which happens to also implement the classes Base and Derived, both having exactly the same functions as in the executable. Then the executable calls a function in the plugin, and that function allocates an instance of Derived and calls all its functions, virtual and non-virtual. I tested this with g++, clang++ and the Intel icpc compiler (you can get a license from Intel if you qualify as a non-commercial Open Source developer).
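Roughly, the plugin side of such a setup could look like the sketch below. Only the class names Base and Derived are from the description above; the function names, the WHERE macro and the log string are assumptions made for illustration, and the code in the vtabletest subdir may well be organized differently (e.g. with member functions defined out of line, so the compiler cannot inline them).

```cpp
// plugin.cpp: sketch of the plugin side. Base and Derived are from the text;
// every other name here is hypothetical.
#include <string>

#ifndef WHERE
#define WHERE "plugin"   // the executable's copy would be built with "exe"
#endif

class Base {
public:
    Base() { log_ += WHERE ":Base() "; }
    virtual ~Base() = default;
    virtual void virt1() { log_ += WHERE ":Base::virt1 "; }
    virtual void virt2() { log_ += WHERE ":Base::virt2 "; }
    void nonVirt() { log_ += WHERE ":Base::nonVirt "; }
    const std::string& log() const { return log_; }
protected:
    std::string log_;   // records which implementation actually ran
};

class Derived : public Base {
public:
    Derived() { log_ += WHERE ":Derived() "; }
    void virt1() override { log_ += WHERE ":Derived::virt1 "; }
    void virt2() override { log_ += WHERE ":Derived::virt2 "; }
};

// Allocates a Derived and calls everything on it, like the plugin function.
std::string exercise() {
    Derived* d = new Derived;
    d->nonVirt();
    d->virt1();
    d->virt2();
    std::string result = d->log();
    delete d;
    return result;
}

// Entry point the executable would look up with dlsym() after dlopen().
extern "C" const char* plugin_entry() {
    static std::string last;
    last = exercise();
    return last.c_str();
}
```

The executable would contain the same two classes, built with a different WHERE, so the returned log shows for every call whether the plugin's or the executable's code was executed.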

What do you think happens? The Derived ctor from the plugin will be called, which will call the ctor of Base from the plugin, and calling the virtual functions will call the implementations in the plugin?

If symbols are not hidden, this did not happen with any of the three compilers. Instead, all three compilers produced different results.

With g++, basically nothing from the plugin was called: the Base ctor, the Derived ctor and the virtual and non-virtual functions were all executing the code (symbols) from the executable. This was the most consistent and least messed up result.

With Intel icpc, it was more interesting. When creating the objects in the plugin, the ctors from the plugin were called, and the non-virtual function calls also used the version from the plugin. IMO that's good. Now the weird thing: when calling any of the virtual functions, those from the executable were called. So in the plugin I basically had an object where everything came from the plugin, except its virtual functions, which were the wrong ones. IMO this was the result closest to what I would like to have, but due to the issue with the virtual functions it was completely broken.

clang offered yet another version. Here, only the ctor for Derived was called from the plugin, all other functions, the ctor for Base, the virtual and non-virtual functions, were all using the versions from the executable.

But there is an easy fix: giving the classes Base and Derived hidden visibility, either in the executable or in the plugin, makes everything work as expected for all three compilers.
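As a sketch of that fix, the attribute can be put directly on the class declarations. As far as I understand the GCC documentation, this then also covers the member functions, the vtable and the typeinfo of the class. The function name_via_base() is made up for this illustration.

```cpp
// Sketch: hide the clashing classes. Base and Derived are from the text;
// the function names are hypothetical.
#include <string>

class __attribute__((visibility("hidden"))) Base {
public:
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
};

class __attribute__((visibility("hidden"))) Derived : public Base {
public:
    std::string name() const override { return "Derived"; }
};

// With the classes hidden, the virtual call below should be resolved
// against this module's own vtables, not against another module's copy.
std::string name_via_base() {
    Derived d;
    const Base& b = d;
    return b.name();
}
```

Building the whole module with -fvisibility=hidden should have the same effect without touching the class declarations.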

I plan to have a closer look at the created executables and libraries, using nm and looking at the assembly code...