No, this has nothing to do with getting discounted parking if you use ODF, though that is an intriguing idea…
Daniel Carrera (OpenDocument Fellowship and the OASIS ODF TC) has a new blog and with it comes news of a new ODF tool, an ODF Validator Service, written as part of the Fellowship’s ODF Tools project by Alex Hudson.
It is in the spirit of the W3C’s Markup Validation Service: upload a document and get an instant report of whether or not it is valid ODF, and if not, what problems were found. I tried a few documents and it seems to work well.
It would be interesting to see if something like this could be made into a flexible framework for scanning ODF documents at various levels. Think of a SAX-like callback parser, but one that works at multiple levels of detail. The framework would know how to fully parse an ODF document and identify features at the Zip and XML levels, and plugins to the framework could subscribe to various parse events. So, maybe a ZipListener interface that simply has the methods onFile() and onDirectory(). Then a ManifestListener interface that lets you subscribe to notifications of the data in the manifest. Then within a document you could have listeners at the structural and content level: onWorksheet() and onCell() in a spreadsheet, or onTable(), onImage(), etc. in a word processor document.
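To make the idea concrete, here is a minimal sketch of the two lowest levels in Python, using only the standard library. The class names follow the ones floated above, but everything else is an assumption made for illustration, including `scan_odf`, the snake_case method names, and the `on_manifest_entry` callback. No existing tool is known to expose this API.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by META-INF/manifest.xml in ODF packages.
MANIFEST_NS = "urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"


class ZipListener:
    """Hypothetical plugin interface for package-level events."""

    def on_file(self, name, size):
        pass

    def on_directory(self, name):
        pass


class ManifestListener:
    """Hypothetical plugin interface for manifest entries."""

    def on_manifest_entry(self, full_path, media_type):
        pass


def scan_odf(source, listeners):
    """Walk an ODF package once and dispatch events to the listeners.

    `source` is a path or a binary file-like object. This is a sketch:
    it performs no validation and ignores encrypted entries.
    """
    zip_listeners = [l for l in listeners if isinstance(l, ZipListener)]
    manifest_listeners = [l for l in listeners if isinstance(l, ManifestListener)]

    with zipfile.ZipFile(source) as zf:
        # Level 1: the Zip structure.
        for info in zf.infolist():
            for listener in zip_listeners:
                if info.is_dir():
                    listener.on_directory(info.filename)
                else:
                    listener.on_file(info.filename, info.file_size)

        # Level 2: the manifest, if the package has one.
        if "META-INF/manifest.xml" in zf.namelist():
            root = ET.fromstring(zf.read("META-INF/manifest.xml"))
            for entry in root.iter(f"{{{MANIFEST_NS}}}file-entry"):
                full_path = entry.get(f"{{{MANIFEST_NS}}}full-path")
                media_type = entry.get(f"{{{MANIFEST_NS}}}media-type")
                for listener in manifest_listeners:
                    listener.on_manifest_entry(full_path, media_type)
```

A plugin would subclass whichever listeners it cares about, override only the callbacks it needs, and be passed to `scan_odf` along with any others. Content-level listeners such as onCell() or onTable() would hang off the same dispatch loop once content.xml is parsed.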
A framework like this could allow you to make a range of applications that need to scan an ODF document and take some action on it.
- A validation service would operate at several levels, validating the structure of the Zip and the manifest as well as each of the XML files.
- You could also do a cross-platform checker, looking at embedded images and other media, OLE links, etc., and reporting on whether any of these have platform dependencies.
- An accessibility scanner would be able to fit into this framework as well.
- A full text indexer could work here.
- Any number of content scraping applications could work well here.
- With some kind of query language interface, this could be useful for test generation. Given a large collection of ODF documents, a developer working on a feature could instantly bring up a set of test documents to exercise the code that was just changed. “Give me a list of word processor documents that have Arabic Bidi text and also have tables.” “Give me a list of spreadsheets that use pie charts with more than 10 slices.”
- With the metadata framework coming in ODF 1.2, there will be even more interesting uses of such a framework.
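The query idea in the list above could be sketched roughly as follows, again in Python with only the standard library. The function names `extract_features` and `query` are hypothetical, and the feature set (tables, cells, images) is just a small stand-in for whatever a real framework would collect. The counts are raw element counts, so repeated cells declared with a repeat attribute are counted once.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used in ODF content.xml for tables and drawing objects.
TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"
DRAW_NS = "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"


def extract_features(source):
    """Count a few structural features in the content.xml of one package.

    `source` is a path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as zf:
        root = ET.fromstring(zf.read("content.xml"))
    return {
        "tables": sum(1 for _ in root.iter(f"{{{TABLE_NS}}}table")),
        "cells": sum(1 for _ in root.iter(f"{{{TABLE_NS}}}table-cell")),
        "images": sum(1 for _ in root.iter(f"{{{DRAW_NS}}}image")),
    }


def query(documents, predicate):
    """Return the names of the documents whose features satisfy `predicate`.

    `documents` maps a name to a path or file-like object.
    """
    return [
        name
        for name, source in documents.items()
        if predicate(extract_features(source))
    ]
```

A call such as `query(corpus, lambda f: f["tables"] > 0 and f["images"] > 0)` would then stand in for "give me the documents that have both tables and images". A real query language would sit on top of a much richer, pre-built index instead of rescanning every file.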
The benefit of the framework is the reduction in code required to get directly to the info you want in an ODF document, without having to master the ODF specification or write a lot of parsing code. Think of it as a framework for easy multi-level information extraction from ODF documents.