Apr 25, 2006

The future is written in tags

It's important to have information organized, and everyone who has tried knows how difficult it is to organize things efficiently. Lately most applications are trying to help the user do exactly that, and in my opinion that will be the factor users rely on to choose one application over another: how well the application lets them organize the information they work with.

For example, amarok is doing an excellent job with that: it knows all the music files the user has and it keeps them very well organized. Users can search for an artist, for a song title, for the year the song was published, etc. without caring where the files are on the hard disk.

Other examples are kimdaba/kphotoalbum and digikam. Both allow the user to create tags and assign them to pictures, which together with the exif information allows for very useful searches to find specific pictures in the user's albums.

So we have tags in kimdaba and tags in digikam, and the two applications write their data in different places (kimdaba in an XML file, digikam in a sqlite database). That is a pity, and it forced me to write a conversion script to import my kimdaba tags (and the corresponding image associations I made over a year of using kimdaba) into digikam once I chose to use just digikam for downloading pictures as well as for organizing them.

That makes me think... why don't both applications store their tags in the same place? And going a step further, why doesn't every application store its tags in the same place? Maybe we could have a kind of tagging server for KDE 4, which applies tags to files. Having the tags and the file associations in a central place could allow for more interesting searches. Also, applications could register new tags automatically (for example, kaddressbook could create a person tag for each of the contacts, or korganizer could create a tag in an event subfolder for each of the events). Having a centralized tag list would allow digikam, kimdaba, basket, amarok, konqueror, etc. to unify the tags they use to tag things.
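To make the idea a bit more concrete, here is a minimal sketch of such a shared tag list in Python. Everything in it is an assumption made for illustration: the `TagRegistry` class, its method names, the `people/...` and `events/...` tag names and the file paths are all hypothetical, and a real KDE service would live in its own process rather than in a Python object.

```python
from collections import defaultdict


class TagRegistry:
    """Hypothetical central tag list shared by every application."""

    def __init__(self):
        # Maps each tag name to the set of file paths carrying it.
        self._files_by_tag = defaultdict(set)

    def register_tag(self, tag):
        # Applications can announce a tag before any file uses it;
        # touching the key creates an empty entry.
        self._files_by_tag[tag]

    def tag_file(self, path, tag):
        self._files_by_tag[tag].add(path)

    def tags(self):
        return sorted(self._files_by_tag)

    def search(self, *tags):
        # Return the files that carry every one of the given tags.
        sets = [self._files_by_tag.get(t, set()) for t in tags]
        return sorted(set.intersection(*sets)) if sets else []


registry = TagRegistry()
# An address book could register a person tag automatically.
registry.register_tag("people/Sarah")
# A photo manager and a music player then reuse the same tag list.
registry.tag_file("/home/me/photos/beach.jpg", "people/Sarah")
registry.tag_file("/home/me/photos/beach.jpg", "events/holiday")
registry.tag_file("/home/me/music/song.ogg", "events/holiday")

print(registry.search("people/Sarah", "events/holiday"))
```

Because every application writes to the same list, a single search can cross application boundaries, which is the kind of query separate per-application stores cannot answer.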

I guess kat, beagle and others already do something similar, like full text indexing and those things, but does anybody know whether they offer tag lists too?

Something to think about.

Btw, if someone wants the kimdaba -> digikam tag/caption conversion script, just email me but keep in mind that it's not yet meant to be used by "normal" users :) and currently you need to know python in order to use it.

Comments

There is no tag support like the one you describe planned or existing in KDE at the moment. The only similar thing I know of is leaftag by the Gnome guys, which already looks very promising.

The most important thing, in my opinion, would be to have a kind of tagging server (or even better: a tag-supporting file system) where this information is stored independently of the desktop in use.
If that became a standard, every program could be modified to use these tags and there would be no need to write any kind of conversion script, at least until you want to exchange files and tags with other operating systems or web applications (like photos with flickr).


By liquidat at Tue, 04/25/2006 - 01:06

It would be cool to go one step beyond mere tagging and use RDF or OWL for tagging, because this would create a very powerful environment that allows reasoning in the sense of the Semantic Web. This is an excellent example of how far you can go. They use standardized vocabularies such as Friend of a Friend, the Dublin Core, and Wordnet to describe pictures. I.e. a person or object can be described and identified with the same "Tag" by different applications.


By dominic.battre at Tue, 04/25/2006 - 06:29

Tagging as in Photo HasTag "Dominic" is so trivial it's not worth doing in RDF, which supports Subject-Property-Value triples.

You could definitely use RDF+OWL to make statements about the values in tags. Dominic IsFriendOf Sarah would let you query your photo database for every picture with friends of Sarah. I don't know of any tagging system that lets you do that.
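A rough sketch of that "friends of Sarah" query, using plain Python tuples as a stand-in for a real RDF store. The property names `hasTag` and `isFriendOf` and the photo file names are made up for illustration, and no actual RDF or OWL library is involved.

```python
# Each statement is a (subject, property, value) triple, as in RDF.
triples = {
    ("photo1.jpg", "hasTag", "Dominic"),
    ("photo2.jpg", "hasTag", "Antonio"),
    ("photo3.jpg", "hasTag", "Sarah"),
    ("Dominic", "isFriendOf", "Sarah"),
}


def subjects(prop, value):
    # All subjects for which the statement (subject, prop, value) holds.
    return {s for (s, p, v) in triples if p == prop and v == value}


def photos_with_friends_of(person):
    # First find who is a friend of `person`, then find the photos
    # tagged with any of those friends.
    friends = subjects("isFriendOf", person)
    return sorted(s for (s, p, v) in triples if p == "hasTag" and v in friends)


print(photos_with_friends_of("Sarah"))
```

The point is that the query joins two kinds of statements: a plain tag list only holds the `hasTag` ones, so it has nothing to join against.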

Tagging photos with semantic statements like Photo TakenBy Antonio, Photo Rating 4, Photo hasPrimarySubject Dominic, Photo LikedBy Sarah could be useful if you want to derive more advanced statements about photos, but adding a property to tagging makes it a lot more complicated both in UI and storage.

Regular MediaWiki supports categories, which are like hierarchical tags and a little more interesting. The Semantic MediaWiki project is adding typed links and attribute values to wiki pages, including a "shows" for pictures. Here's a picture in it; scroll down to the "Relations to other articles" infobox and click the magnifying glass icons. You can export these properties as RDF. I think putting pictures into wikis will become popular and subsume most of the features of Flickr and such.

Cool stuff.


By skierpage at Wed, 04/26/2006 - 04:45

... but the future of storage is written as binary.

why don't both applications store their tags in the same place? and going a step ahead, why doesn't every application store their tags in the same place? Maybe we could have a kind of tagging server for KDE 4, which applies tags for files.

There is already a working layer for this idea and it can be called KDE-DB or KDB, depending on our taste. For now it's KexiDB, soon to be reused in KOffice 2.0, then in KDE 4.x.

There's an idea that KDE4 may depend on a small and fast embedded database server, like SQLite or Firebird. At the Kexi Project we're evaluating the legal possibility of reusing Firebird, as it is more powerful. There are three facts:

  • In any case, KDE4 apps would not even know what the storage looks like, thanks to the high-level KexiDB abstraction. Plus: zero maintenance.
  • Tags, as you mentioned them, could be implemented as one or two one-to-many relationships (rather two, as you will want a namespace for your app).
  • XML could be used for interoperability with other desktop environments (fd.o), as I cannot believe they would be happy adopting our storage implementation.
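The second point can be sketched as follows. This is only an illustration written against Python's standard `sqlite3` module, not against KexiDB, and the table names, column names and helper functions are all assumptions: one relationship goes from a namespace to its tags, the other from a tag to the files it is attached to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE namespace (
        id   INTEGER PRIMARY KEY,
        name TEXT UNIQUE NOT NULL
    );
    -- First one-to-many relationship: a namespace owns many tags.
    CREATE TABLE tag (
        id           INTEGER PRIMARY KEY,
        namespace_id INTEGER NOT NULL REFERENCES namespace(id),
        name         TEXT NOT NULL,
        UNIQUE (namespace_id, name)
    );
    -- Second one-to-many relationship: a tag is attached to many files.
    CREATE TABLE tagged_file (
        tag_id INTEGER NOT NULL REFERENCES tag(id),
        path   TEXT NOT NULL,
        UNIQUE (tag_id, path)
    );
""")


def add_tag(namespace, name):
    """Create the namespace and tag if needed and return the tag id."""
    conn.execute("INSERT OR IGNORE INTO namespace (name) VALUES (?)", (namespace,))
    ns_id = conn.execute(
        "SELECT id FROM namespace WHERE name = ?", (namespace,)
    ).fetchone()[0]
    conn.execute(
        "INSERT OR IGNORE INTO tag (namespace_id, name) VALUES (?, ?)", (ns_id, name)
    )
    return conn.execute(
        "SELECT id FROM tag WHERE namespace_id = ? AND name = ?", (ns_id, name)
    ).fetchone()[0]


def tag_file(namespace, name, path):
    conn.execute(
        "INSERT OR IGNORE INTO tagged_file (tag_id, path) VALUES (?, ?)",
        (add_tag(namespace, name), path),
    )


def files_tagged(namespace, name):
    rows = conn.execute(
        """SELECT f.path FROM tagged_file f
           JOIN tag t ON t.id = f.tag_id
           JOIN namespace n ON n.id = t.namespace_id
           WHERE n.name = ? AND t.name = ?
           ORDER BY f.path""",
        (namespace, name),
    )
    return [row[0] for row in rows]


tag_file("digikam", "holiday", "/home/me/photos/beach.jpg")
tag_file("amarok", "holiday", "/home/me/music/song.ogg")
print(files_tagged("digikam", "holiday"))
```

With the namespace in place, two applications can use the same tag name without stepping on each other.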

See "Database Abstraction Layer" here and here for more info.


By Jarosław Staniek at Tue, 04/25/2006 - 07:58

hm, nice links. The page says evaluating; how far along is the evaluation? And what would the interoperability with other desktops look like? I mean, I would like to have the option to sometimes start a gnome/xfce environment just to check how they are doing, and I would like to keep the information I tagged inside KDE.
Have you tried to contact the gnome guys? Is there any coordination?

I mean, it would be sad if, in the time of portland, tango and freedesktop.org, we got a new split between the two desktops, making them even more incompatible :/

liquidat


By liquidat at Wed, 04/26/2006 - 02:07

I work mostly on the library implementing the DB abstraction layer, while the "tag server" could be another real-world application of the library.

And, I guess, such an application should provide a public interface that we can try to share with other desktops.
BTW: Most of us are afraid of a scenario where the same information is indexed more than once because KDE and GNOME are both installed (Beagle vs Kat!).

I hope specifying such higher-level interface is possible...
In the end, it all depends on whether the distro makers are sane or not in a technical sense.


By Jarosław Staniek at Wed, 04/26/2006 - 22:00

I made an attempt at implementing an application-independent tagging system, using... symlinks :-). It's called Lexeo. You can find it here:
http://www.kde-apps.org/content/show.php?content=24504

It consists of a PyQt app to assign tags to your files (any file) and an IO Slave to make queries like "lexeo:location/home and people/aurelien"
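The general idea of symlink-based tagging can be sketched like this. This is not Lexeo's actual code or on-disk layout; the directory structure, the function names and the tiny "and" query parser are assumptions made only to illustrate the technique.

```python
import os
import tempfile


def tag_dir_for(tag_root, tag):
    # Assumed layout: a tag like "location/home" maps to <tag_root>/location/home
    return os.path.join(tag_root, *tag.split("/"))


def tag_file(tag_root, tag, path):
    """Tag a file by placing a symlink to it inside the tag's directory."""
    tag_dir = tag_dir_for(tag_root, tag)
    os.makedirs(tag_dir, exist_ok=True)
    # Limitation of this sketch: two files with the same base name collide.
    link = os.path.join(tag_dir, os.path.basename(path))
    if not os.path.lexists(link):
        os.symlink(os.path.abspath(path), link)


def files_with_tag(tag_root, tag):
    tag_dir = tag_dir_for(tag_root, tag)
    if not os.path.isdir(tag_dir):
        return set()
    links = (os.path.join(tag_dir, name) for name in os.listdir(tag_dir))
    return {os.path.realpath(link) for link in links if os.path.islink(link)}


def query(tag_root, expression):
    """Evaluate a query such as "location/home and people/aurelien"."""
    tags = [t.strip() for t in expression.split(" and ")]
    return sorted(set.intersection(*(files_with_tag(tag_root, t) for t in tags)))


# Small demonstration in a throwaway directory.
root = tempfile.mkdtemp()
tag_root = os.path.join(root, "tags")
photo = os.path.join(root, "party.jpg")
notes = os.path.join(root, "notes.txt")
for p in (photo, notes):
    open(p, "w").close()

tag_file(tag_root, "location/home", photo)
tag_file(tag_root, "people/aurelien", photo)
tag_file(tag_root, "location/home", notes)

print(query(tag_root, "location/home and people/aurelien"))
```

The appeal of the approach is that the tags are visible to any program that can browse directories; the cost is that links go stale when the original files are moved.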

It's not really well maintained to say the least. The other night I was thinking about reimplementing it using the Digikam database as a backend...


By aurélien gâteau at Tue, 04/25/2006 - 08:25

I think that tags, as well as other metadata, should stay with the content they describe, that is, with the file. I think that the success of metadata in digital music is due to its being stored inside the sound files.
There is no real reason why this should not be the case with other files, such as pictures or scientific articles.

By the way, XMP may be interesting in this context:
http://www.adobe.com/xmp

It would be nice to have efficient and specialized applications (such as Juk or Amarok) that look for files carrying metadata and present them nicely organised to the user. This could be linked to the search engine (I'm of course thinking of tenor here).

Steph


By Stéphane Magnenat at Tue, 04/25/2006 - 15:09

this is partly what the nepomuk project in combination with mandriva's semantic desktop project will bring to kde4 =)


By Aaron J. Seigo at Fri, 04/28/2006 - 17:15

The nepomuk homepage seems to be... too empty. Is there really someone developing something there?
Anyway, the comments of dominic battre and skierpage about RDF and OWL are quite interesting, and they look to me like the way to go (but I haven't understood the idea behind nepomuk)


By Antonio Larrosa at Wed, 05/03/2006 - 02:06