In the last installment I looked at the way the ODF Add-in for Word 2007 integrates into the Word UI. Now let’s drill down into an actual conversion and see what fidelity we get.
I downloaded the code from SourceForge and installed it on a machine running Office 2007 beta 2. The Add-in requires the .NET 2.0 runtime, an additional 22MB download. The current version only supports reading ODF documents, not writing them, and only handles the word processor ODF format.
Now for fidelity. Since you may not all have Office 2007 beta 2 installed, I’m going to show you the fidelity via PDF exports. In all cases I manually verified that the PDF output was identical to what I saw on the screen. Every error is real; nothing was introduced by the PDF export process.
First up is a document I call “the sampler”. It has a little bit of all the basic word processor formatting: fonts, alignment, nested tables, graphics, other character sets, headers/footers, images, captions, etc. It is not intended to be a particularly hard test of document conversion, just a basic test of core functionality.
So, here is the sampler, in the original ODF format, as well as the PDF rendering of it in OpenOffice 2.0.3, where it was originally created.
I then exported that file from OpenOffice to Word format. This demonstrates the quality of conversion users already get when running OpenOffice. Here it is in DOC, and as a PDF exported after loading the DOC file in Word 2007 beta 2.
Good, but not perfect. Some differences:
- the bullet point size is larger in Word than in OpenOffice
- the nested table is collapsed into the main table in Word
- the above table problem causes the table to take up more vertical space, pushing the graphic onto a second page
Again, that is the OpenOffice to Word conversion we all have available for free today in open source code. Since DOC is a proprietary binary format with inadequate publicly-available documentation, this level of fidelity is impressive. So moving from ISO ODF to Draft Office Open XML should be that much easier, especially since the target format is voluminously documented (4,000 pages and growing), and the writers of the translator are receiving technical assistance from Microsoft.
Let’s take a look. From within Word 2007 (beta 2) I use the ODF Add-in to load the sampler ODF file, and get something that looks like this PDF.
I won’t characterize it except to say it fared less well than I expected. Problems include:
- headers/footers dropped (data loss)
- bullet list indentation ignored
- number list indentation ignored
- table dimensions messed up
- caption for the graphics sized and positioned incorrectly
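For anyone who wants to confirm that a dropped header or footer really was present in the source file, an ODF text document is a zip package, and header and footer content is stored in its styles.xml as style:header and style:footer elements. Here is a minimal sketch of such a check. The function name and the file name in the usage note are my own placeholders, not part of the Add-in or of OpenOffice.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace that ODF uses for style elements such as style:header.
STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"


def count_headers_footers(odt_path):
    """Return (headers, footers) found in styles.xml of an ODF text file.

    Hypothetical helper for illustration only. It counts style:header and
    style:footer elements and ignores variants such as style:header-left.
    """
    with zipfile.ZipFile(odt_path) as package:
        root = ET.fromstring(package.read("styles.xml"))
    headers = sum(1 for _ in root.iter("{%s}header" % STYLE_NS))
    footers = sum(1 for _ in root.iter("{%s}footer" % STYLE_NS))
    return headers, footers
```

Calling `count_headers_footers("sampler.odt")` (a placeholder file name) and getting non-zero counts would show that the headers and footers exist in the ODF source, so their absence after conversion is a loss in the translator rather than a gap in the test document.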
Whether these are all bugs or merely functional limitations is an interesting question. There is a Functional Specification document available on SourceForge for the Add-in which lists these requirements:
2.1.1.1. Basic Formatting
Here is the list of formatting items that the Add-in and command line translator would keep intact. The first 10 in the list are must haves and the last 4 (number 11 to 14) are good to have items of formatting.
1. Bold
2. Italics
3. Underline
4. Bulleting
5. Numbering
6. Indentation
7. Alignment (Left, Center, Right)
8. Font size
9. Font face
10. Tabs
11. Tables
12. Font color
13. Highlights
14. Background colors
Tables are “nice to have”? I’d hope so! This does not give me the impression that full fidelity is in their plans. Forget about scripts and macros. They are not even planning on tables or font colors. I hope I am wrong or misinterpreting their plans here, but that is the requirements document they have posted.