Jan 20, 2009

More metadata and a new year's resolution

Amazing how long it always takes for me to write a log entry. So many times in the last months I told myself I had to write the next entry... well, new year's resolution (a little late I know): more blogging about what I am up to (regarding KDE of course).

Well then, let's see: KDE 4.2 is around the corner and the Nepomuk features look pretty stable. Strigi is nicely integrated; it can be suspended and resumed (which it does automatically in battery mode or if the hard disk is full), and the folders to index can be configured, including subfolders. KRunner comes with a Nepomuk search plugin, which means you can simply run queries from there. The KIO slave, while not yet nicely integrated into the GUI, allows you to query stuff from Dolphin or the file open dialog (something I blogged about a long time ago). Almost everything is multi-threaded for your non-GUI-blocking pleasure, and tags can be reused in Gwenview. The only thing still imperfect is the storage backend based on Java (too much memory usage), although even that will be solved soon thanks to the nice guys from OpenLink. But that is a story for another day (remember: new year's resolution).

Thus, finally we have a good foundation to build new stuff upon and that is what this blog entry is actually about. So let's have a look.

Again I am using Dolphin as the example. But why not, it is our file manager and we are used to the file manager also handling a bit of meta-data. Anyway, in KDE 4.2 Dolphin does display a little bit of meta-data for each file. This includes the size, the type, and some fields directly extracted via Strigi. The latter include id3 tags and some 5 or 6 more fields. However, these are hardcoded in Dolphin. Thus, apart from id3 tags not much is displayed. For example, no exif properties. Well, all this information is stored in Nepomuk so why not use it? And that is what I have done. Take a look at the first screenshot, which shows Dolphin displaying meta-data from Nepomuk in a generic way. Meaning, nothing is hardcoded. The properties are read from the Nepomuk store, the labels are read from the Nepomuk/Xesam ontologies and everything is nicely extendable (as we will see later on).

As you can see, some properties are shown twice. That is because everything before "Source modified" comes from Dolphin's hardcoded properties while everything else comes from Nepomuk. Let us have a quick look at the code. After all this is a developer blog:

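// Wrap the selected file in a Nepomuk resource and fetch all its properties.
Nepomuk::Resource res( item.url() );
QHash<QUrl, Nepomuk::Variant> properties = res.properties();
for ( QHash<QUrl, Nepomuk::Variant>::const_iterator it = properties.constBegin();
      it != properties.constEnd(); ++it ) {
    // The key is the property URI; its label is read from the ontology.
    Nepomuk::Types::Property prop( it.key() );
    m_metaTextLabel->add( prop.label(), Nepomuk::formatValue( res, prop ) );
}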

Now this looks simple enough I think (although I shortened it a bit, the original code does a bit of filtering). Basically we create a Nepomuk::Resource for the selected file and read all its properties. This returns a map of property URIs (remember: everything in Nepomuk is defined via ontologies, hence URIs) to property values. Then the map is iterated and for each entry a line is added to the meta-data display. Now what about the Nepomuk::formatValue call? Well, the values can be literals such as strings or integers or doubles but they can also be other resources (other files, tags, persons or whatever). We do not want to display resource URIs to the user. The formatValue call triggers an experimental lib which uses formatting rules to convert resources into strings. An example: a person resource has firstName and lastName properties. The rule would then state that they are to be combined to build the label. Another simple example would be a file: the rule should state that the filename is to be used. Again we will see an example in action shortly (if you dare continue reading that is ;).
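To make the idea of formatting rules a bit more concrete, here is a minimal sketch in plain C++. It is not the experimental lib itself: the Resource struct, the type names "Person" and "File", the "uri" fallback and all function names are made up for illustration. It only shows the principle of picking one rule per resource type and letting that rule build the label.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Hypothetical stand-in for a resource: a type name plus string properties.
struct Resource {
    std::string type;
    std::map<std::string, std::string> properties;
};

// Look up a property, returning an empty string when it is missing.
std::string property(const Resource& res, const std::string& name) {
    auto it = res.properties.find(name);
    return it == res.properties.end() ? std::string() : it->second;
}

// A formatting rule turns a resource into a user-visible label.
using FormatRule = std::function<std::string(const Resource&)>;

// One rule per resource type (assumed type names, not taken from an ontology).
const std::map<std::string, FormatRule>& formatRules() {
    static const std::map<std::string, FormatRule> rules = {
        // A person is labelled by combining first and last name.
        {"Person", [](const Resource& r) {
            return property(r, "firstName") + " " + property(r, "lastName");
        }},
        // A file is labelled by its filename.
        {"File", [](const Resource& r) {
            return property(r, "fileName");
        }},
    };
    return rules;
}

// Apply the rule for the resource's type; fall back to the "uri"
// property when no rule is registered for that type.
std::string formatResource(const Resource& res) {
    const auto& rules = formatRules();
    auto it = rules.find(res.type);
    if (it == rules.end())
        return property(res, "uri");
    return it->second(res);
}
```

With this sketch, a Resource of type "Person" carrying firstName and lastName ends up as one combined label, while a type without a rule falls back to whatever is stored under "uri". The real rules presumably work on property URIs instead of plain strings.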

Ok then, now Dolphin displays our meta-data and I claim it does so generically. Then what about some new data: I want to remember the source of downloads, both web and IM downloads. The first one can actually be handled within KIO while the second one means patching Kopete. I did both. But first we need to know how to store this information. Neither the Nepomuk ontologies nor the Xesam ontology provides the necessary properties. Thus, the first step is to create our own ontology for downloads. I will only draft it here quickly, it is not big anyway. (Remember: in Nepomuk all data is stored as RDF, which means triples. If that is confusing, think of it as an object-oriented database where you can have classes and subclasses and class members, which are here called properties.) It all revolves around the Download class, which has subclasses like HTTPDownload or IMDownload. Then there are properties like sourceURL and one to relate local files to the download. (For everyone interested in the details: you can find the ontology in playground: NRDO)
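If the triple idea is still a bit abstract, the following sketch writes the drafted ontology and one example download down as plain subject/predicate/object statements. It is standard C++ without any Nepomuk or Soprano calls. The "ndo:" and "ex:" prefixes, the exact term spellings and the example URLs are assumptions made for illustration; the real definitions live in the ontology file.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One RDF statement: subject, predicate, object.
struct Triple {
    std::string subject;
    std::string predicate;
    std::string object;
};

using Graph = std::vector<Triple>;

// Class hierarchy of the download ontology as drafted in the text.
// The "ndo:" prefix and term spellings are assumptions.
Graph downloadOntology() {
    return {
        {"ndo:HttpDownload", "rdfs:subClassOf", "ndo:Download"},
        {"ndo:IMDownload",   "rdfs:subClassOf", "ndo:Download"},
    };
}

// Instance data for one downloaded file (all URIs are made up).
Graph exampleDownload() {
    return {
        {"ex:download1", "rdf:type", "ndo:HttpDownload"},
        {"ex:download1", "ndo:sourceUrl", "http://example.org/paper.pdf"},
        {"file:///home/me/paper.pdf", "ndo:download", "ex:download1"},
    };
}

// All objects of statements matching the given subject and predicate.
std::vector<std::string> objects(const Graph& graph,
                                 const std::string& subject,
                                 const std::string& predicate) {
    std::vector<std::string> result;
    for (const Triple& t : graph)
        if (t.subject == subject && t.predicate == predicate)
            result.push_back(t.object);
    return result;
}

// True if cls equals ancestor or reaches it via rdfs:subClassOf.
// Assumes the class hierarchy contains no cycles.
bool isSubClassOf(const Graph& graph,
                  const std::string& cls,
                  const std::string& ancestor) {
    if (cls == ancestor)
        return true;
    for (const std::string& parent : objects(graph, cls, "rdfs:subClassOf"))
        if (isSubClassOf(graph, parent, ancestor))
            return true;
    return false;
}
```

Asking objects() for the "ndo:download" of the local file yields the download resource, and asking that resource for "ndo:sourceUrl" yields where the file came from. That two-step walk through the graph is what a meta-data display has to do to show the source of a file.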

Ok then, let's integrate it into KIO somewhere in the file copy job:

Nepomuk::Resource fileRes( destinationUrl, Soprano::Vocabulary::Xesam::File() );
Nepomuk::Resource downloadRes( QUrl(), Nepomuk::Vocabulary::NDO::HttpDownload() );
downloadRes.setProperty( Nepomuk::Vocabulary::NDO::sourceUrl(), sourceUrl );
downloadRes.setProperty( Nepomuk::Vocabulary::NDO::startTime(), Nepomuk::Variant(startTime) );
downloadRes.setProperty( Nepomuk::Vocabulary::NDO::endTime(), Nepomuk::Variant(QDateTime::currentDateTime()) );
fileRes.setProperty( Nepomuk::Vocabulary::NDO::download(), downloadRes );

As we can see, I did not use the Nepomuk resource generator to generate C++ classes. Instead I went the other way and generated a vocabulary class using the onto2vocabularyclass tool provided by Soprano. Actually it is quite easy to integrate that into CMake.

Now what is happening here? We again create a Nepomuk resource for the local file which has been downloaded. Then we create the download resource, set some nice properties and then relate the file to the download. This combined with a little formatting rule for downloads gives us the following display in Dolphin:

Nice, isn't it? Well, this is the actual source URL. My plan is (and the ontology has a property for that) to also store the referrer web page which is more interesting in most cases. But I did not manage to make that work yet (tried to hand that information down through the KIO::Job metadata).

And the exact same thing can be done for Kopete. Only in this case we create an IMDownload and relate it to a person via their IM account instead of a source URL. The following code does work but also creates a new IMAccount resource for each download. The goal has to be to reuse the account resources that already exist (again a reason to push the Akonadi/Nepomuk integration).

First we create the IMAccount resource:

Contact* contact = d->info.contact();
Nepomuk::Resource imAccount( contact->nickName(), Nepomuk::Vocabulary::NCO::IMAccount() );
imAccount.setProperty( Nepomuk::Vocabulary::NCO::imNickname(), Nepomuk::Variant( contact->nickName() ) );
Nepomuk::Resource imContact( QUrl(), Nepomuk::Vocabulary::NCO::PersonContact() );
imContact.setProperty( Nepomuk::Vocabulary::NCO::hasIMAccount(), imAccount );
imContact.setProperty( Nepomuk::Vocabulary::NCO::fullname(), Nepomuk::Variant( contact->formattedName() ) );

After that we create the actual download resource which looks quite similar to the example from KIO:

Nepomuk::Resource downloadRes( QUrl(), Nepomuk::Vocabulary::NDO::IMDownload() );
downloadRes.setProperty( Nepomuk::Vocabulary::NDO::startTime(), Nepomuk::Variant(startTime) );
downloadRes.setProperty( Nepomuk::Vocabulary::NDO::endTime(), Nepomuk::Variant(QDateTime::currentDateTime()) );
downloadRes.setProperty( Nepomuk::Vocabulary::NDO::sendingContact(), imContact );
Nepomuk::Resource fileRes( destinationUrl, Soprano::Vocabulary::Xesam::File() );
fileRes.setProperty( Nepomuk::Vocabulary::NDO::download(), downloadRes );

And this is what the result looks like, again combined with a formatting rule:

Ok, that's it for today. I hope this will become stable soon so we can have some nice additional meta-data in 4.3. Also: I could use some help with this. Not only with integrating into KIO or Kopete or KTorrent but also with the ontology design and the formatting. Both are still rather experimental.

A little sidenote: I am still a bit disappointed that the blog system here changed. No more C++ code highlighting, no more fancy image handling with automatic thumbnails... or maybe it still works somehow but there is no documentation? I was not able to get an answer so far. So I am using html img tags to include my images which is no fun.

Comments

Nice to see you again blogging. Looking at those screenshots brought back an idea I had since I first heard about Nepomuk, namely that when I select a pdf of an academic article, my filebrowser should be at least smart enough to query for basic bibliographic data about that article, dumped (RDF) from Zotero and display it.

I have experimented with Soprano and Nepomuk using PyQt/PyKDE under Ubuntu Gnome (my current setup) to evaluate if something like that is possible, but ran into errors. Because I do not know if this is caused by my lack of PyQt/PyKDE knowledge, the Python-wrappers themselves, running in the Gnome environment, or the older versions of KDE in the repository, I'm trying to switch to KDE proper to eliminate the last two factors.

I now have KDE 4.2 running through KDE4Daily, updated to the latest version I think. Is this adequate to run the examples you post in your blog? If so, where do I find the code? Do you have suggestions to get started with Nepomuk/Soprano using PyQt/PyKDE?


By mhermans at Tue, 01/20/2009 - 22:11

The patches for Dolphin and Kopete are not committed anywhere yet. The formatting code is in playground like all experimental Nepomuk stuff.
Cannot help you with python though. Never used it. :(
As for Zotero: if the data is exported to Nepomuk, it should be displayed automatically. And with some smart rules you can even display information that is further away in the graph.


By Sebastian Trüg at Fri, 01/23/2009 - 13:58

i would like to use this opportunity to report a bug i got:
loading the OS and logging into kde 4.1.96 (debian sid) i get this error (dmesg)
nepomukservices[5408] general protection ip:7f8b851fa55e sp:7fff8d7e7d80 error:0 in libQtCore.so.4.4.3[7f8b85191000+230000]

"enable strigi desktop file indexer" is always disabled (even if i try to enable it)
i get no results showing in dolphin "nepomuksearch:/" every thing is empty

i got strigiclient working but i have no idea how to query it ?

any ideas ?
:-)


By nadavkav at Thu, 01/22/2009 - 15:17

> nepomukservices[5408] general protection ip:7f8b851fa55e sp:7fff8d7e7d80 error:0 in libQtCore.so.4.4.3[7f8b85191000+230000]

this looks like an installation issue AFAICT.


By Sebastian Trüg at Fri, 01/23/2009 - 13:59

(i am looking for an insight out of the mess ;-)

i got latest kde 42 from experimental and latest libqtcore4 from unstable (no experimental version)
it must be versions mismatch. how can i tell which version of libqtcore4 i need for nepomukservices ?

i have tried to get the playground modules but could not compile them.
do you have a script for that or some instructions ?

:-)


By nadavkav at Mon, 02/02/2009 - 16:23

:-)


By nadavkav at Mon, 02/02/2009 - 16:22

I'm considering two ideas to use digikam when it is a nepomuk ready app.

I have rated 5 images and try to use kslideshow.kss to show them in a screensaver. How can I use 'nepomuksearch : / numericRating : 10' inside kslideshow.kss or inside a plasmoid? So far I can use it in 'folder view' but I want to 'nepomuksearch' in other plasmoids.

I have a best_rated.desktop file, and I can open it with Dolphin, but i want it in my 'virtual folders' list, besides 'All music files', 'Recent files', etc. How can I create a custom 'virtual folder' to be shown in nepomuksearch?

And I think there is a bug in nepomuksearch. It shows files with an acute accent in their name (Spanish is my first language) as 'zero byte files', and Dolphin turns into a zombie when I try to search for tags with an acute accent, like 'bitácora'.

This technology is great! Thanks for your time and work!


By sebaxtian at Thu, 02/05/2009 - 15:10