
Legacy Inflation

When I was a child I stumbled upon the dark secret that all the adults were hiding. A simple mathematical calculation revealed their conspiracy. I was 10 years old, and my mother was 30. So she was 3x my age. I observed that in 10 years' time I would be 20, and my mother would be 40. She would then be only twice my age. A few more calculations and the ominous truth was clear: At some point I would surely catch up, and perhaps even surpass her age!

Well, to be fair, I haven’t quite caught up yet.

But I am reminded of this when I hear Microsoft’s claims about “legacy document compatibility”. At first they used the term “legacy documents” to refer to the masses of existing binary documents, these “exobytes” of documents in Office binary formats. The argument seemed to be that since Microsoft Word 95 had a bug, Apple iWork 08 must also have the same bug when using the OOXML format. This form of argument is used to defend all manner of defects in OOXML.

But in recent weeks, the argument has morphed. The legacy era is catching up with us. Microsoft’s unwillingness to fix errors in OOXML is now being defended because the fixes (Microsoft claims) would break compatibility with Ecma-376. In other words, Office 2007 files are now part of this large legacy that must be preserved. I can only call this “legacy inflation”.

First, note that Microsoft shipped Office 2007 with support for OOXML as the default, and this was entirely their choice. Beta versions of Office 2007 did not have OOXML as the default. If Microsoft had left the binary formats as the default, it would have been far easier for their customers. They could have waited for Mac Office, Mobile Office, developer tools, etc., to support OOXML, and then had a coordinated rollout of the new format, rather than dumping it on an unprepared world. They could also have waited for standards approval for OOXML, and for the standard to stop changing, before forcing it on their customers. But they didn’t do that. They took the approach that caused maximum disruption for their customers. And now that Office 2007 is in use, Microsoft wants ISO to bail them out, and not make any changes that would result in even a single attribute in OOXML differing from Ecma-376.

We see similar brinkmanship in wireless networking protocols where chip manufacturers rush to be the first to ship support for “draft” standards like 802.11n, build up an inventory of chips, and then lobby to ensure that the draft does not change, so they can cement their first mover advantage. This does not benefit the consumer, this does not benefit the standard, this does not benefit interoperability. It is all about maneuvering for market advantage. We should not be encouraging or supporting this.

It is interesting to note that in the wifi world, any company that plays this game with draft standards takes a big risk. They may win, or they may lose. It is a gamble. Only a monopolist would assume that they can play this game risk free. Microsoft does not face the same market risks that others would face for making a bad decision.

In any case, the argument that DIS 29500 must remain identical to Ecma-376 is technically deficient. Consider: Ecma-376 is not identical to the binary formats, but Microsoft Office can still read both. That is because Office can tell these files apart and call different code to parse the two different formats. Similarly, if OOXML diverges from Ecma-376, Microsoft can tell, with 100% certainty, which documents were created in Ecma-376 format versus which ones were made according to the ISO version of the standard.
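To make the "tell these files apart" step concrete, here is a minimal sketch in Python of how a reader might distinguish the two container types from the first few bytes of a file. The byte signatures are the well-known ones for OLE2 compound files (the container used by the Office binary formats) and for ZIP archives (the container used by OOXML). The function name `sniff_container` is hypothetical, and this is an illustration, not a description of how Office actually does it. Note that a ZIP signature only says the file is a ZIP package, not that it is OOXML.

```python
# Signature at the start of an OLE2 compound file (legacy .doc/.xls/.ppt).
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")
# Local file header signature at the start of an ordinary ZIP archive.
ZIP_MAGIC = b"PK\x03\x04"


def sniff_container(header: bytes) -> str:
    """Classify a file by its leading bytes (hypothetical helper)."""
    if header.startswith(OLE2_MAGIC):
        return "ole2-binary"
    if header.startswith(ZIP_MAGIC):
        return "zip-package"
    return "unknown"


if __name__ == "__main__":
    print(sniff_container(OLE2_MAGIC + b"\x00" * 8))
    print(sniff_container(ZIP_MAGIC + b"\x14\x00"))
```

Once the container is known, a reader can hand the file to the matching parser, which is all the argument above requires.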

The key is that all OOXML documents describe the application that created them, as well as a detailed version number. These are described in DIS 29500, Part 4:

7.2.2.1 Application (Application Name)

This element specifies the name of the application that created this document.

7.2.2.2 AppVersion (Application Version)

to differentiate between different versions of the same producer

If we look at three Office 2007 documents, we see the following in app.xml:

<Application>Microsoft Office Word</Application>
<AppVersion>12.0000</AppVersion>

<Application>Microsoft Excel</Application>
<AppVersion>12.0000</AppVersion>

<Application>Microsoft Office PowerPoint</Application>
<AppVersion>12.0000</AppVersion>

So the way to ensure compatibility in the face of the standard changing through the approval process is clear. If the version is “12.0000” then interpret the document as Ecma-376. But when Office is updated to support an approved DIS 29500 (if this ever occurs), Microsoft can simply update the version number in the files. That way Microsoft Office and every other application can tell them apart and process them correctly.
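A rough sketch of that version check, in Python, might look like the following. It assumes the extended-properties part sits at its conventional location, `docProps/app.xml` (a strict reader would find it through the package relationships), and it uses the extended-properties namespace from Ecma-376. The function names and the rule "anything other than 12.0000 is the revised format" are hypothetical, chosen only to illustrate the idea.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the extended-properties part in Ecma-376 packages.
EP_NS = "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties"

# Assumption: the part lives at its conventional path inside the package.
APP_PART = "docProps/app.xml"


def read_producer(package):
    """Return (Application, AppVersion) from a package given as a path or file object."""
    with zipfile.ZipFile(package) as zf:
        root = ET.fromstring(zf.read(APP_PART))
    return (
        root.findtext(f"{{{EP_NS}}}Application"),
        root.findtext(f"{{{EP_NS}}}AppVersion"),
    )


def choose_dialect(app_version):
    """Hypothetical policy: '12.0000' means Ecma-376, any other version the revised text."""
    if app_version is None:
        return "unknown"
    if app_version == "12.0000":
        return "Ecma-376"
    return "DIS 29500"


def build_sample_package(application, app_version):
    """Build a tiny in-memory package holding only app.xml, for demonstration."""
    xml = (
        '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
        f'<Properties xmlns="{EP_NS}">'
        f"<Application>{application}</Application>"
        f"<AppVersion>{app_version}</AppVersion>"
        "</Properties>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(APP_PART, xml)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    app, version = read_producer(build_sample_package("Microsoft Office Word", "12.0000"))
    print(app, version, choose_dialect(version))
```

A real implementation would likely also look at the Application value, since version numbers are only meaningful per producer.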

So let’s reject Microsoft’s push for legacy inflation. Otherwise we will soon find that the next version of OOXML is also unchangeable, since Office 14 will be out before the next version of OOXML is standardized. Will we then be unable to change anything in OOXML 1.1 because Office 14 is already in beta? Where does this end?

This doesn’t mean we should be capricious with changes in DIS 29500, but where something is clearly wrong, let’s fix it. The assumption should be that the future is bigger than the past, that no matter how many documents existed before, there will soon be many more created in the future. We should be optimizing for that future.

Creative Commons License
This work, unless otherwise expressly stated, is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.

{ 8 comments }

  • fenilsen 2008/02/21, 09:40

    Rob Weir wrote:

    “Microsoft’s unwillingness to fix errors in OOXML is now being defended because the fixes (Microsoft claims) would break compatibility with Ecma-376.”

    I’ve been reading most of the MS blogs for the last few weeks and I have seen no such claims. Do you have any links to such claims?

    “Microsoft wants ISO to bail them out, and not make any changes that would result in even a single attribute in OOXML differing from Ecma-376.”

    I’ve seen no sign of this either. Where can I find information about this?

    “In any case, the argument that DIS 29500 must remain identical to Ecma-376 is technically deficient.”

    I have not seen this argument being used by anyone from Microsoft. Any links?

  • Anonymous 2008/02/21, 10:28

    I don’t think they will use the AppVersion metadata (formerly stored as OLE SummaryInformation stream). The reason why is because they never did in the past, and I see it more natural to have a different root namespace, i.e. http://schemas.openxmlformats.org/spreadsheetml/2006/main

    I would not be surprised to see

    http://schemas.openxmlformats.org/spreadsheetml/2009/main

    in Office 2009 files.

    In Excel BIFF, the records for disambiguating versions were BOF (0809) and RECALCID (01C1). RECALCID was undocumented until last week. It’s a euphemism to say that Microsoft wanted to keep it to themselves.

    -Stephane Rodriguez

  • Rob 2008/02/21, 10:35

    The point is that they are capable of telling the different file versions apart, and the claim that they are unable to change the schema of DIS 29500 because of legacy compatibility issues is false.

  • Rob 2008/02/21, 10:41

    Fredrik, if your sole source of information on Microsoft’s position is Microsoft bloggers, then you are missing most of the debate. Join an NB and you’ll hear many a curious tale.

  • funnybroad 2008/02/21, 15:31

    I have an addition to your 5th paragraph:

    They could have waited until they were able to get the 3 primary applications (Word, PowerPoint and Excel) to deal with their own OOXML formats in a CONSISTENT and ACCURATE manner. They could have also been more forthcoming regarding these issues, but instead, customers are left to get burned on their own, and have to pay Microsoft $$$ to file bug reports if they have any hopes of getting at the very least an acknowledgement that Microsoft is aware of them.

    But I’m not bitter…

  • funnybroad 2008/02/21, 17:31

    From the Word Team at Microsoft: An absolutely perfect example of the confusion and pain they’ve caused their customers by introducing this new file format too soon, and how blind they are to it: http://blogs.msdn.com/microsoft_office_word/archive/2007/08/14/a-compatibility-guide-for-the-end-user.aspx

  • Anonymous 2008/02/22, 02:38

    fenilsen:
    “I’ve been reading most of the MS blogs for the last few weeks and I have seen no such claims. Do you have any links to such claims?”

    Go to Rick Jelliffe’s blog post “ODF Alliance now loves me!”:
    “Now that OOXML has a year’s worth of documents out, Ecma apparantly thinks it is important that (critical parts of) existing valid documents don’t become invalid, as far as I can see.”
    http://www.oreillynet.com/xml/blog/2008/01/odf_alliance_now_loves_me.html

    If you want to know MS’ position on MS OOXML, you cannot do without Rick’s opinion. And he is in the Australian NB.

    Winter

  • Yoon Kit 2008/02/22, 13:22

    fenilsen,

    I’ve heard it from a MS rep here in Malaysia too. I nearly fell off my chair laughing.

    http://www.openmalaysiablog.com/2008/02/existing-corpus.html

    yk
