It is the type of response that was crafted to end all debate and justify all sins: “Backward compatibility with billions of documents produced over decades”. Variations of this occur everywhere. Rather than cite them all, I will note that a simple Google query brings up a representative sample.
Let’s take a deeper look at this argument.
There is a game called Zendo, where a player, called the “Master”, forms in his mind a secret rule which governs the selection and arrangement of objects (often small colored blocks). Arrangements which conform to the secret rule are said to have “Buddha nature”. The other players take turns selecting and arranging their own blocks to conform to what they think the secret rule is, to which the Master will acknowledge success or failure. The winner is the one who first guesses the secret rule, which might be something like “an odd number of blocks, at least one of which must be red”.
Microsoft is playing Zendo with the OOXML specification. The Master has formed a secret rule. He calls it, “backwards compatibility with billions of office documents”. But since the file format documentation for the proprietary legacy binary formats has not been made public, the rule might as well just have been called “Buddha nature”. It is just as opaque. We have no way of judging whether any specific requirement of OOXML is there to support backwards compatibility, or whether it is just there for the convenience of the Office development team. Or in fact whether it is there to raise barriers to non-Microsoft implementers. How could we know, since the solitary constraint on the creation of OOXML is dependent on information that isn’t public? Does Ecma TC45 itself even have access to the binary format specifications? How are they able to properly judge what is done in the name of compatibility? Do we all just take Microsoft’s word for it?
The key point (in my opinion) is that legacy compatibility may be a constraining factor, but it need not be the sole determining factor. There are many, perhaps an infinite number, of possible markups which would be compatible with the legacy formats, meaning the legacy documents can be unambiguously transformed into the new XML format. The constraint should be that they are mappable, not that they must be identical. Among the set of such possible XML formats, some will be elegant, some sloppy, some bloated, some sparse, some easy for others to implement, some designed to minimize conversion work for just one vendor, etc. In other words, this can be done well, or it can be done poorly. The constraint of compatibility does not justify everything. Compatibility is one requirement, but it is not the only requirement.
An example may make things clear. Word has a feature called Art Page Borders. If you are like me, you’ve gone 15 years without seeing or using this feature. But it is there, under the Format/Borders and Shading menu, on the Page Border tab.
The markup needed to define these borders is covered in section 2.18.4 “ST_Border (Border Styles)” of the OOXML specification. Here we see descriptions and images of 200 or so Art Page Borders. The images are heavily weighted to Western European, even Anglo-American celebratory icons, things like gingerbread men for Christmas, pumpkins for Halloween, or images of Cupid for St. Valentine’s Day, or globes which are neatly centered on the United States. I think it is a legitimate concern that a document format with such obvious cultural biases is moving forward toward an international standard.
Further, I am concerned that the specification includes what can only be considered a clipart collection. What legal rights does the implementer have to reproduce this clipart? Keep in mind that Microsoft’s “Covenant Not to Sue” covers patents, not copyrights. I haven’t seen anything that would grant implementers of OOXML the rights to reproduce this clipart in their application. Is the specification hard-coded to use clipart which we cannot copy?
All of these problems (spec bloat, cultural bias, non-extensibility, copyright concerns) can be solved by one simple mechanism. Instead of having ST_Border be a fixed enumerated set of values, have it include only a small number of trivial values like the basic line styles, and have everything else (all of the Art Borders) be stored as a separate image file in the document archive.
So, if you load a Word XP document that uses the “candyCorn” Page Border, then when you write it out to OOXML, you would include a single frame of that art in the zip file and have the XML document reference that image for the border, tiling as necessary. This solution has several advantages:
- It removes some bloat from the spec. There is no need to document hundreds of page border clip art images.
- It lowers the barrier to implementation. No one is required to implement hundreds of border styles. They are all generated on the fly from images stored in the document.
- Copyright concerns are eliminated.
- It is an extensible approach. An implementation can include different or additional border styles according to its business and cultural requirements.
- It is compatible with legacy documents. Any existing Word binary or XML document can be unambiguously mapped into this scheme.
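To make the proposal concrete, here is a minimal sketch in Python of what "store the border as an image in the archive and reference it from the XML" could look like. Everything named here is an assumption for illustration only: the `pageBorder` element, its `style`, `src` and `tile` attributes, and the part names `document.xml` and `media/candyCorn.png` are hypothetical and are not taken from the OOXML specification.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Hypothetical part name; not taken from the OOXML specification.
BORDER_PART = "media/candyCorn.png"


def write_package(border_image: bytes) -> bytes:
    """Build a zip package holding the document XML and one border frame."""
    root = ET.Element("document")
    # Hypothetical markup: the border is a reference to an image part,
    # not a value from a fixed enumeration.
    ET.SubElement(
        root,
        "pageBorder",
        {"style": "image", "src": BORDER_PART, "tile": "repeat"},
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as pkg:
        pkg.writestr("document.xml", ET.tostring(root, encoding="unicode"))
        pkg.writestr(BORDER_PART, border_image)
    return buf.getvalue()


def read_border(package: bytes) -> bytes:
    """Follow the reference in the XML and return the border image bytes."""
    with zipfile.ZipFile(io.BytesIO(package)) as pkg:
        root = ET.fromstring(pkg.read("document.xml"))
        border = root.find("pageBorder")
        return pkg.read(border.get("src"))
```

A consuming application written along these lines needs no built-in catalog of border art: it reads whatever image the document carries and tiles it around the page, so a legacy "candyCorn" border and a brand-new one are handled by the same code path.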
Of course, this approach would require some minimal code changes in Microsoft Word to support this extensible mechanism. But remaining backwards compatible with the Microsoft Word product was never a stated constraint on OOXML. No one ever said that the goal of Ecma OOXML was to reduce the cost for Microsoft to implement it. It is all about the legacy documents, right?
So there it is, one example to illustrate a point that can be repeated over and over again. Among the potential universe of compatible XML formats for Office are those which are flexible, easy to use, easy to implement, as well as those which simply perpetuate the status quo and vendor lock-in.
I had an interesting experience with Microsoft Works last year. Or rather, with trying to help someone who had Microsoft Works: she had written and saved her CV/resume in MS Works Word and, needless to say, couldn’t get it working in MS Office Word, because the file formats were incompatible.
I wonder if Microsoft’s Open XML format is going to support MS Works file formats, the way it is supposed to be supporting MS Office file formats?
I strongly suspect MS Works users will forever remain second-class citizens in the Microsoft office productivity software scene.
Wesley Parish
Your ODF work has been picked up by Groklaw – more please!
One minor nit-pick: gingerbread men for Christmas are German-American. Doesn’t change the fact that they have no place in a technical standard.
Bravo!
I just had the rather disconcerting thought, rereading this, that perhaps some member of the ECMA with a presence in the Muslim world should try to get “Allahu Akbar” – “God is Great” in ornate Arabic lettering approved as an Art Page Border. It is, after all, one of the greatest works of abstract art in the world, when seen on a Middle Eastern/Northern African Masjid, being totally divorced from any naturalistic representation.
And thus well worthy of incorporation in any “International Standard” claiming to include “art”, however defined.
Of course, once we start incorporating the art traditions of the world, there is then no rational reason why the erotic carvings of the dancers in the Indian and Indochinese temples should not be incorporated as a valid Art Page Border.
And that is not even mentioning the magnificent Papua New Guinean carvings, or the Australian Aboriginal, or the Cook Islands or the New Zealand Maori abstract art traditions …
This is something that definitely needs to be looked at in closer detail. Some of us do have some basic respect for art.
Wesley Parish
This artwork seems like a far fetched argument to be against the format. Reaching for straws to declare the format not open.
Of course you can just take the artwork from the specification as the specification will be provided by a standards body. The publication and the rights to the work that has copyrights will be in the hands of the standards organisation when they publish a standard.
As long as you use the ECMA specifications (or maybe later even the ISO specifications) you do not have to worry.
I’ve found that when dealing with intellectual property issues, “Of course you can just take the artwork from the specification” is an insufficient argument. The copyright is owned by the creator of the work who may then grant rights to others. But unless this permission is granted explicitly, one must assume nothing. I have not seen anything that grants users the rights to create derived works based on this clipart. If you know of any such statement, please point me to it.
Remember, Microsoft’s “Covenant Not to Sue” only grants protection from patents, not copyrights. The Baker & McKenzie analysis recently posted merely rehashes that information.
Although Microsoft did grant royalty-free permissions to reproduce the Office 2003 Reference Schema Specification, this is insufficient because 1) That version of the specification did not include the Art Borders, and 2) That licence specifically says, “No right to create modifications or derivatives of this Specification is granted herein” (http://msdn.microsoft.com/library/default.asp?url=/library/en-us/odcXMLRef/html/odcXMLRefLegalNotice.asp)
In any case, my main argument against that Art Borders feature is that it is poorly engineered. If TC45 fixes that aspect of it, then the artistic, cultural, extensibility and legal problems go away.
So draw your own candy corn. Start up inkscape and just draw a candy corn. Save it as SVG. There, you have candy corn.
Speculation 1: There were quite a number of parties involved in the standardisation process. Perhaps some of these parties might have been using this process merely to get comprehensive documentation of the existing binary formats, instead of relying on the past practice of reverse engineering?
Speculation 2: Each time a new file format is introduced that is incompatible with previous formats, and that new format is the default, it creates a “shock wave” of pressure for parties to change their software to remain compatible. Perhaps Microsoft is hoping that playing along with this open standardisation process, and then adopting the file format in its next Office package release, can help it sell this release?
– b
“This artwork seems like a far fetched argument to be against the format. Reaching for straws to declare the format not open.”
Even if this was not an impediment to openness, it sure is an impediment to basic sanity in standardization.
“The publication and the rights to the work that has copyrights will be in the hands of the standards organisation when they publish a standard.”
Did your crystal ball tell you that? So far, there are no indications that Microsoft would take such a step.
“So draw your own candy corn. Start up inkscape and just draw a candy corn. Save it as SVG. There, you have candy corn.”
Wow, what a brillant (sic) argument. The entire point is that in this area 100% compatibility cannot be achieved. Although there are many worse problems in Office “Open” XML (such as descriptions that merely say “Behaves like in Word95”), this is a nice example anyway.
“So draw your own candy corn. Start up inkscape and just draw a candy corn. Save it as SVG. There, you have candy corn.”
How is that useful? Not only is that not a faithful representation of the data (the only reason given for the existence of OOXML in the first place) but OOXML doesn’t let you use SVG, remember? You’ve got to use VML instead. Which means you’ll first need to make a VML application, just to draw some candy corn, which will still be different to the original document so you may as well have imported it into OpenOffice along with the slight inconsistencies that causes, since you’ll save yourself a hell of a lot of effort for the same result.
Plus, the ‘like this matters’ tone you use is a double-edged sword. If it really doesn’t matter that much then why go to the effort of documenting so many of the things inside the specification? Maybe because OOXML is just a modified dump of Microsoft Office’s internal workings rather than an actual standard specification?