Forgotten File Formats

Wednesday, 23 September 2009  |  jaroslaw staniek

While explaining the story behind his great ppttoxml tool, Jos also mentioned:

Since about a year, Microsoft has, after significant political pressure, put documentation for their file formats on-line.

That's fine, and it solved some issues. But the proprietary MS Access (MSA) file formats (mdb, accdb) remain secret. There are no plans to replace them with XML formats (which would be overkill for databases). I guess there was no pressure to open these formats, which looks like an oversight in the EU and the USA (correct me if there's another reason, like patents). If you google for it, it is hard to find even a single mention of file format specifications in the above sense, and even explanations from MS employees or backers show that they do not fully realize one thing: the MSA formats are not covered by the said "opening of the legacy formats".

MS Access formats are currently openly accessible only through the mdbtools library project or its descendants, such as Kexi's MDB plugin, which contains some improvements over mdbtools. By openly accessible I mean via 100% Free Software code, not via the MS ADO DLLs used, for instance, by the OpenOffice.org plugin, which excludes non-Windows systems. The mdbtools project was a huge effort to reverse-engineer the mdb formats, which are nasty direct binary dumps of initialized and uninitialized chunks of memory owned by MSA (more strictly, by the MS Jet database engine). The project is now stagnating, but it can still be extended unless MSA moves to an entirely different storage format.

So, unfortunately, the MSA file formats are not quite fair play in the small but important branch of desktop databases. While anti-competitive behaviour is a valuable corporate weapon, this neglected area contradicts the recent buzz about publishing various document format specifications. If you are still using the MSA file formats in 2009, you may have trouble not only with not owning the software you wrote with MSA (forms, reports, code...), but also with not owning your data. That is why I emphasize the importance of the issue, even if MSA has been in decline since we moved to the web era, and since MSA broke compatibility in the 2003 version and again recently in 2007 and 2010 (but sure, that makes for rather good consulting business, as does any mess in IT).

To settle the issue once and for all, I'd like to hear any feedback from MS.

PS: On the Kexi side, because we want to release only stable software, we're on the way to KOffice 2.2 (i.e. 2010), so there will be no versions 2.0 and 2.1. A lot of things have been ported to KDE 4, including forms; more database and file formats will be handled; and there are new features, e.g. reports.