Archives for April 2010
ODFDOM is an open source (Apache 2.0) Java library for reading, writing and modifying ODF documents. It runs standalone, not requiring OpenOffice.org or any other editor to be installed. It operates directly on the ODF document itself.
One of the things we’re focusing on in the next release of ODFDOM (the 0.9 release) is optimizing performance: getting ODFDOM to read and write ODF documents as fast as possible, with as low a memory footprint as possible. The aim is to make it optimal for concurrent use, say in a Java servlet.
Some of the things we’re finding as we profile ODFDOM are worth sharing, since they are not specific to this library. They are tips and techniques that are applicable more broadly, potentially to all applications that work with ODF documents. I’ll do a series of posts on these ideas. Hopefully you will find them useful, and maybe you can even share your own tricks as well.
The first thing I’ll point out concerns documents with many image resources, such as large presentation files with a lot of graphics. We found that writing these documents was rather slow. The problem was in how the images were stored in the ZIP archive. As you may know, ZIP allows a file to be compressed (most commonly using the DEFLATE algorithm). Most ZIP libraries will, by default, compress every file you add to the archive. However, for many common media types, like PNG and JPG images, the data has already been compressed, at the level of the image encoding. So if you have your ZIP library try to compress the images a second time, you will typically waste time with very little incremental savings in storage.
Most ZIP libraries have an alternative way to store files in their original, uncompressed form, a method called STORE. What we found in ODFDOM was that if we store images rather than compress them, the time needed to save our large presentation was reduced by 20%, while the size of the archive increased only 3%. So this was a good trade-off.
I think this technique would be applicable to other libraries and editors.
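To make the idea concrete, here is a minimal sketch using the standard java.util.zip classes. It is not ODFDOM's actual code: the class name, the helper methods and the extension-based check are all assumptions made up for illustration. Note that java.util.zip requires the size and CRC-32 of a STORED entry to be set before the entry is written.

```java
import java.io.IOException;
import java.util.Locale;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Illustrative sketch only; names here are hypothetical, not ODFDOM API.
final class StoreImagesSketch {

    // Assumption: the file extension is a good enough hint that the
    // payload is already compressed by its own image encoding.
    static boolean isAlreadyCompressed(String name) {
        String lower = name.toLowerCase(Locale.ROOT);
        return lower.endsWith(".png") || lower.endsWith(".jpg") || lower.endsWith(".jpeg");
    }

    // Adds one file to the archive, using STORE for images and DEFLATE otherwise.
    static void addEntry(ZipOutputStream zip, String name, byte[] data) throws IOException {
        ZipEntry entry = new ZipEntry(name);
        if (isAlreadyCompressed(name)) {
            // STORED entries need size and CRC-32 known up front.
            CRC32 crc = new CRC32();
            crc.update(data);
            entry.setMethod(ZipEntry.STORED);
            entry.setSize(data.length);
            entry.setCompressedSize(data.length);
            entry.setCrc(crc.getValue());
        } else {
            entry.setMethod(ZipEntry.DEFLATED);
        }
        zip.putNextEntry(entry);
        zip.write(data);
        zip.closeEntry();
    }
}
```

A caller would open a ZipOutputStream on the target file and call addEntry once per package member. Whether the trade-off pays off for a given document mix is something to measure, as we did with our presentation.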
We’ll be hitting a significant date next month. It was on May 1st, 2005 that Open Document Format (ODF) 1.0 was approved by OASIS.
I hope we can all take time to reflect on how far we’ve come, with the specification itself, with the quality and diversity of implementations and with world-wide adoption.
As we read that the other “standard”, after only 2 years, appears to be circling the drain, I hope we take a few moments on May 1st to ask ourselves why ODF did not suffer a similar fate. What worked well with ODF? And what can we teach the world about open standards?
Of course, not everything in ODF is perfect, but to be still so relevant after 5 years is an accomplishment worth bragging about. Not every standard makes it this far. We should celebrate.
It was cold and dreary in Stockholm for last week’s meeting of ISO/IEC JTC1/SC34. There is nothing surprising or particularly interesting to report. That is a sign of a successful meeting. No drama. These face-to-face meetings tend to be formulaic rituals. The real work is done in WG teleconferences and email discussions that occur in advance of the face-to-face meetings. If the WGs have done their work well, the physical meetings are boring. In Stockholm we were not disappointed.
However, I would like to report on the advancement of an ODF initiative that we’ve been working on in the OASIS ODF TC and in SC34/WG6 for a couple months now. The idea is to “upgrade” the ISO version of ODF (ISO/IEC 26300) so it aligns with OASIS ODF 1.1, rather than its current alignment with ODF 1.0. ODF 1.1 was standardized by OASIS back in 2007 and is widely implemented, including support in OpenOffice, Microsoft Office, Symphony, KOffice and so on.
Formally this alignment to ODF 1.1 would be done via an amendment to ISO/IEC 26300:2006, to add the enhancements from OASIS ODF 1.1 — primarily accessibility improvements. The process will look something like this:
- OASIS submits the full text of ODF 1.1 to JTC1 (done)
- The ODF Project Editor will work with SC34/WG6 to prepare the text of an amendment to ISO/IEC 26300. Think of it as a diff between ODF 1.0 and ODF 1.1 (in progress)
- A ballot of SC34 NBs in what is called an FPDAM (Final Proposed Draft Amendment)
- A ballot of JTC1 NBs in what is called an FDAM (you guessed it — a Final Draft Amendment)
The full process will take 9-12 months, so we can expect the amendment to be published sometime in 2011.
So you may be thinking, what does this mean for ODF 1.2? The answer is: nothing. I was not willing to support this amendment unless it could be done in a way that would not divert the ODF TC from its current work completing ODF 1.2. After quite a bit of discussion in OASIS and with WG6 we found a way to process the amendment that did not affect the ODF 1.2 schedule. However, the result is that it is likely that OASIS ODF 1.2 will be approved before the ODF 1.1 amendment is approved in JTC1. This may look silly, but it is not a serious problem. The important thing to remember is that the latest and greatest ODF work will be found at OASIS, and that this work will slowly but steadily progress through JTC1, and will eventually be published with an ISO/IEC coversheet, 12-16 months later. The ISO version of ODF 1.1 should not matter to implementors, since most are already supporting ODF 1.1 and now moving on to ODF 1.2, but it may be of interest to some adopters.