I am not talking about His Steveness or that kind of job (congratulations to both involved parties!), I am talking about these.

Traditionally we have been writing programs in a very start-to-end fashion, where execution starts at an entry point and ends when the task is done.
At first there wasn't any reason not to do this, as there was no user interaction during a program's execution, or only at controlled points, e.g. a console program asking for a Y/N decision.

Then, with the switch to event-based main execution, that changed to having multiple entry points, but each usually still started an execution chain that finished a complete task in one go.

One of the reasons for this "complete task in one go" style is that it is a lot easier to code. All necessary data only has to live during this uninterrupted execution; nothing else can interfere and potentially invalidate some of that data, etc.

This is of course no problem for anything that can be done quickly, but long-running processing, e.g. network I/O with its latency and overall time requirements, made it necessary to come up with new styles.

One of them is staying with the start-to-end fashion but moving the execution to a parallel processing unit, e.g. multi-threading.

Another one is processing the task in smaller steps, using a technique similar to the one already used for separating tasks, i.e. event-based processing.

Given that Qt already has event driven processing, doing it for things like networking was a natural choice.
However, mainly because Qt's I/O classes are relatively low level and at that time weren't as unified as they are nowadays, KDE developers came up with an even nicer-to-use idea: jobs.

A job is basically a context for processing a single task (though such a task could of course be an aggregation of sub-tasks). It is started and ends at some point, it might report progress in between, and it might even be able to be suspended and resumed.

Implementing a task as a job is usually more challenging than the traditional uninterrupted execution path, but it is really not much more difficult to use: you create the job object, connect to its result signal and start it.
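To make the "create, connect, start" pattern concrete, here is a minimal, hypothetical sketch in plain C++. It is not the real KJob or Qt API: the signal/slot connection is modelled as a stored callback, and the class name `CopyJob` and its members are made up for illustration.

```cpp
#include <functional>
#include <string>
#include <utility>

// A job bundles all data the task needs, reports completion through a
// result "signal" (modelled here as a stored callback), and is started
// explicitly by the caller.
class CopyJob {
public:
    CopyJob(std::string src, std::string dest)
        : m_src(std::move(src)), m_dest(std::move(dest)) {}

    // The equivalent of connecting to the job's result signal.
    void onResult(std::function<void(const CopyJob&)> slot) {
        m_resultSlot = std::move(slot);
    }

    // A real event-driven job would only schedule its first step here and
    // finish asynchronously; this sketch runs the whole task at once.
    void start() {
        m_error = m_src.empty() ? 1 : 0;  // pretend to copy the file
        if (m_resultSlot) m_resultSlot(*this);
    }

    int error() const { return m_error; }
    const std::string& source() const { return m_src; }

private:
    std::string m_src, m_dest;
    int m_error = -1;  // -1: not finished yet
    std::function<void(const CopyJob&)> m_resultSlot;
};
```

Usage follows the three steps from the text: construct the job object, connect to its result signal, start it; the slot later inspects the job (e.g. its error code) when the result arrives.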

Sure, compared to a normal method call you don't get the result at the part of your code where you start the job; instead you need an additional method (the slot the result signal is connected to).
However, you also have advantages, like having all data necessary for the task stored in the job object and not having to keep it around elsewhere (while still having access to it from wherever you interact with the job).

Unfortunately jobs are not yet widely used outside of I/O bound tasks; quite a lot of other long-running processing is implemented by keeping the event loop running while not actually returning from the current code context, e.g. "blocking" user input by showing some kind of progress dialog and returning either when finished or when the user explicitly cancels.

At first this looks like just another good option for long-running task processing; however, it can lead to multiple execution contexts within applications that are not prepared for that (they know they are not using threads, so they don't expect their code to be entered twice).

This possibility of unexpected re-entrancy is often overlooked because the majority of such operations are triggered by user input, and thus taking user input out of the equation seems to make the approach "safe" again.

I put "safe" in quotes because it isn't. Events are not only caused by the application's user; they can also be caused by a timer reaching its timeout, or a socket becoming available for reading.

One could work around some of these by means of flags or similar status variables, but even for application code this becomes really messy at some point.
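The hazard and the flag workaround can be sketched in a few lines. The following is a simplified model, not Qt: the event loop is a plain queue of callbacks, and the nested `processPendingEvents()` call stands in for a modal progress dialog spinning the event loop. Without the `busy` guard, an event fired during that nested loop (say, a timer) would re-enter the handler.

```cpp
#include <deque>
#include <functional>

// A toy stand-in for the application's event queue and event loop.
std::deque<std::function<void()>> g_eventQueue;

void processPendingEvents() {
    while (!g_eventQueue.empty()) {
        auto ev = std::move(g_eventQueue.front());
        g_eventQueue.pop_front();
        ev();  // may call arbitrary handlers, including the one below
    }
}

int g_timesEntered = 0;

void longOperation() {
    static bool busy = false;  // the status-variable workaround
    if (busy) return;          // without this guard we would recurse
    busy = true;
    ++g_timesEntered;
    // "Blocking" with a progress dialog: the event loop keeps running
    // while this function has not yet returned.
    processPendingEvents();
    busy = false;
}
```

Every handler that might be reached from such a nested loop needs its own guard, and guards must be reset correctly on every exit path, which is exactly the messiness the text describes, and worse still inside library code that cannot know which handlers its callers run.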
It is almost an invitation for disaster for any form of library code.

In KDE PIM we got bitten by that a lot during the last couple of releases, due to libraries and applications assuming things like instant access to data. We effectively had to split a lot of our code into smaller parts that could deal with data arriving in chunks or not at all.

That was a lot of work, and it was mostly only possible because none of these libraries were public API, thus allowing us to change them without caring about even source compatibility.

So my advice to anyone adding new libraries that involve long-running processing: don't assume that all applications using the library will be fine with the library blocking user input as its sole method of avoiding re-entrancy.

Think about providing a job-based API and letting application developers hook it up to a progress dialog if that's how they want it handled.
Heck, if it is really so difficult to connect the job's result signal to an appropriate dialog slot, provide a convenience function that does that and extracts and returns the result from the job.
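Such a convenience function could look like the sketch below. Again this is a hypothetical model, not a real KDE API: `StatJob` and `statFileSize` are invented names, the result signal is a stored callback, and a real implementation would spin an event loop until the asynchronous result arrives instead of completing inline.

```cpp
#include <functional>
#include <string>
#include <utility>

// A tiny made-up job that "stats" a file and reports a size.
class StatJob {
public:
    explicit StatJob(std::string path) : m_path(std::move(path)) {}
    void onResult(std::function<void(long)> slot) { m_slot = std::move(slot); }
    void start() {
        // Pretend result: 10 bytes per character of the path.
        long size = static_cast<long>(m_path.size()) * 10;
        if (m_slot) m_slot(size);
    }
private:
    std::string m_path;
    std::function<void(long)> m_slot;
};

// The convenience wrapper: it does the connect-and-start dance for the
// caller, extracts the result from the job, and returns it.
long statFileSize(const std::string& path) {
    long result = -1;
    StatJob job(path);
    job.onResult([&](long size) { result = size; });
    job.start();  // a real job would need a (nested) event loop here
    return result;
}
```

Callers who want the simple blocking style get a single function call, while callers who care about re-entrancy can still use the job directly and connect its result signal to their own slot.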