When I was a child I stumbled upon the dark secret that all the adults were hiding. A simple mathematical calculation revealed their conspiracy. I was 10 years old, and my mother was 30. So she was 3x my age. I observed that in 10 years' time I would be 20, and my mother would be 40. She would then be only twice my age. A few more calculations and the ominous truth was clear: At some point I would surely catch up, and perhaps even surpass her age!
Well, to be fair, I haven’t quite caught up yet.
But I am reminded of this when I hear Microsoft’s claims about “legacy document compatibility”. At first they used the term “legacy documents” to refer to the masses of existing binary documents, these “exobytes” of documents in Office binary formats. The argument seemed to be that since Microsoft Word 95 had a bug, Apple iWork 08 must also have this same bug when using the OOXML format. This form of argument is used to defend all manner of defects in OOXML.
But in recent weeks, the argument has morphed. The legacy era is catching up with us. Microsoft’s unwillingness to fix errors in OOXML is now being defended because the fixes (Microsoft claims) would break compatibility with Ecma-376. In other words, Office 2007 files are now part of this large legacy that must be preserved. I can only call this “legacy inflation”.
First, note that Microsoft shipped Office 2007 with support for OOXML as the default, and this was entirely their choice. Beta versions of Office 2007 did not have OOXML as the default. If Microsoft had left the binary formats as the default, it would have been far easier for their customers. They could have waited for Mac Office, Mobile Office, developer tools, etc., to support OOXML, and then had a coordinated rollout of the new format, rather than dump it on an unprepared world. They could have also waited for standards approval for OOXML, and for the standard to stop changing, before forcing it on their customers. But they didn’t do that. They took the approach that caused maximum disruption for their customers. And now that Office 2007 is in use, Microsoft wants ISO to bail them out, and not make any changes that would result in even a single attribute in OOXML differing from Ecma-376.
We see similar brinkmanship in wireless networking protocols where chip manufacturers rush to be the first to ship support for “draft” standards like 802.11n, build up an inventory of chips, and then lobby to ensure that the draft does not change, so they can cement their first mover advantage. This does not benefit the consumer, this does not benefit the standard, this does not benefit interoperability. It is all about maneuvering for market advantage. We should not be encouraging or supporting this.
It is interesting to note that in the wifi world, any company that plays this game with draft standards takes a big risk. They may win, or they may lose. It is a gamble. Only a monopolist would assume that they can play this game risk free. Microsoft does not face the same market risks that others would face for making a bad decision.
In any case, the argument that DIS 29500 must remain identical to Ecma-376 is technically deficient. Consider: Ecma-376 is not identical to the binary formats, but Microsoft Office can still read both. That is because Office can tell these files apart and call different code to parse the two different formats. Similarly, if OOXML diverges from Ecma-376, Microsoft can tell, with 100% certainty, which documents were created in Ecma-376 format versus which ones were made according to the ISO version of the standard.
The key is that all OOXML documents describe the application that created them, as well as a detailed version number. These are described in DIS 29500, Part 4:
7.2.2.1 Application (Application Name)
This element specifies the name of the application that created this document.
7.2.2.2 AppVersion (Application Version)
to differentiate between different versions of the same producer
If we look at three Office 2007 documents, we see the following in app.xml:
<Application>Microsoft Office Word</Application>
<AppVersion>12.0000</AppVersion>
<Application>Microsoft Excel</Application>
<AppVersion>12.0000</AppVersion>
<Application>Microsoft Office PowerPoint</Application>
<AppVersion>12.0000</AppVersion>
So the way to ensure compatibility in the face of the standard changing through the approval process is clear. If the version is “12.0000” then interpret the document as Ecma-376. But when Office is updated to support an approved DIS 29500 (if this ever occurs) then Microsoft can simply update the version number in the files. That way Microsoft Office and every other application can tell them apart and process them correctly.
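To make the idea concrete, here is a minimal sketch in Python of that kind of version check. It is an illustration under stated assumptions, not how Office or any other product actually does it. It assumes the extended properties part sits at docProps/app.xml inside the package (where Office 2007 writes it; a careful reader would locate it through the package relationships instead) and that it uses the Ecma-376 extended-properties namespace. The function names, the table of known producers, and the dialect labels "ecma-376" and "iso-29500" are all hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the extended-properties part in Ecma-376 packages.
EP_NS = "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties"

# Hypothetical table: producer/version pairs known to write Ecma-376.
# The three entries are the values seen in Office 2007 files.
ECMA_376_PRODUCERS = {
    ("Microsoft Office Word", "12.0000"),
    ("Microsoft Excel", "12.0000"),
    ("Microsoft Office PowerPoint", "12.0000"),
}


def read_producer(package):
    """Return (Application, AppVersion) from an OOXML package.

    `package` is a path or a file-like object. Assumes the part is
    stored at docProps/app.xml; missing elements come back as "".
    """
    with zipfile.ZipFile(package) as zf:
        root = ET.fromstring(zf.read("docProps/app.xml"))
    app = root.findtext(f"{{{EP_NS}}}Application", default="")
    version = root.findtext(f"{{{EP_NS}}}AppVersion", default="")
    return app.strip(), version.strip()


def choose_dialect(app, version):
    """Pick which rules to parse with, based on who wrote the file.

    Known Ecma-376 producers get the Ecma-376 rules; everything else
    is assumed (for this sketch) to follow the ISO text.
    """
    if (app, version) in ECMA_376_PRODUCERS:
        return "ecma-376"
    return "iso-29500"
```

A reader would call read_producer on the file, pass the result to choose_dialect, and hand the document to the matching parser. The lookup table is the whole trick: old files keep their old meaning, and new files are free to follow the corrected standard.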
So let’s reject Microsoft’s push for legacy inflation. Otherwise we will soon find that the next version of OOXML is also unchangeable, since Office 14 will be out before the next version of OOXML is standardized. Will we then be unable to change anything in OOXML 1.1 because Office 14 is already in beta? Where does this end?
This doesn’t mean we should be capricious with changes in DIS 29500, but where something is clearly wrong, let’s fix it. The assumption should be that the future is bigger than the past, that no matter how many documents existed before, there will soon be many more created in the future. We should be optimizing for that future.