I wish to discuss a recent blog post, a vigorous defense of Microsoft’s Office Open XML and XAML from Novell’s Miguel de Icaza. His post is so wrong, on so many levels, that I am somewhat at a loss for words. Miguel is not stupid, and I find it hard to believe that he is a Microsoft shill, so I must assume that he was imperfectly informed on this issue. “Everyone is entitled to their own opinions but they are not entitled to their own facts,” as Pat Moynihan was fond of saying. I’ll try hard not to make this personal, but there are so many errors in his post that he may very well feel the sting of correction in my words, and for that I apologize in advance.
I suggest you read through Miguel’s post in its entirety, and then return here for my response.
After an attack against lawyers, we come to some technical comments:
Unlike the XML Schema vs Relax NG discussion where the advantages of one system over the other are very clear, the quality differences between the OOXML and ODF markup are hard to articulate.
The high-level comparisons so far have focused on tiny details (encoding, model used for the XML). There is nothing fundamentally better or worse in those standards like there is between XML Schema and Relax NG.
ODF grew out of OpenOffice.org and is influenced by its internal design. OOXML grew out of Microsoft Office and it is influenced by its internal design. No real surprises there.
Maybe I can be of some assistance here, helping to articulate the difference in quality between ODF and OOXML. ODF, starting from its roots in the OpenOffice.org specification, spent a further 2 1/2 years being improved and reviewed in OASIS, then underwent further work preparing for submission to ISO, then spent a further year in ISO, receiving more comments and corrections, before it was published as an ISO standard. So this is a combined 4 years in technical committees being refined by standards bodies. During this time ODF has been implemented in dozens of applications, including full suites like OpenOffice.org, KOffice and Lotus Workplace, as well as individual applications like AbiWord, Gnumeric and Google Docs and Spreadsheets.
In comparison, OOXML went from a proprietary Microsoft specification to an Ecma standard in record time. If you make something 8 times lengthier than ODF, and do it 4 times faster than ODF, then you are going to have a quality problem. The list of problems on Groklaw is one list of known problems in OOXML. Note that particular list was generated in only 3 or 4 days by volunteers. I recently did a sampled survey of OOXML specification quality and predicted that it contains thousands of errors.
And where are the OOXML implementations? OOXML was approved by Ecma and submitted to ISO without a single available implementation. Certainly, Office 2007 later shipped with support, but is that it? A single implementation? Until you have at least two independent implementations of a standard you will have a very imperfect understanding of the standard’s quality.
So the question to ask is this: Why should JTC1 NB volunteers deal with the mess that Microsoft dropped on their lap by their overhasty review of OOXML in Ecma? Why should they spend the next 6 months reviewing this specification when even a cursory review shows it is defective in so many ways? And considering the observed low level of quality, why should it be reviewed and approved via a Fast Track process, and all in one big chunk of 6,000 pages? Isn’t this the last thing you want to do, following up a rushed review in Ecma by a rushed review in ISO? Instead this should go back to Ecma to let them do a proper review, one they can be proud of.
Miguel correctly points out that OOXML derives from Microsoft Office’s formats, and ODF derives from OpenOffice.org’s formats. But then he leaps to an assertion that they both reflect their parent application’s internals. This is not true. Only a poorly designed file format reflects the internals of the application. Maybe that is how we did it back in the 1980s, but best practices for portable file formats have been known for years now. That is why we have data formats like XML, so the format can be independent of the application internals. ODF was designed, even in the OpenOffice days, from the ground up to be an application- and platform-neutral document format. While it was further developed in OASIS, it continued to take on such good qualities as reuse of existing relevant W3C standards such as XForms and MathML and SVG. So certainly, the platform-independence and open nature of OpenOffice.org rubbed off on ODF, but isn’t that an extremely good thing?
OOXML, on the other hand, matches to an inane degree the internals of a single vendor’s legacy application, with no concessions to platform-neutrality. For example, OOXML encodes data in non-XML formats such as binary blobs, bitmasks and other encodings that defy XML schema validation or processing by XML tools. As I’ve said before, this is not a specification, this is a DNA sequence.
Does that help articulate the difference?
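To make the bitmask point concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the `cell` element, the `flags`, `bold`, `italic` and `underline` attributes, and the bit assignments are assumptions of mine and are not taken from either specification. It only shows the general shape of the problem: generic XML tooling can read explicit attributes directly, while a packed bitmask stays an opaque string until you apply decoding rules that live outside the XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup, NOT taken from OOXML or ODF.
BITMASK_DOC = '<cell flags="05"/>'
EXPLICIT_DOC = '<cell bold="true" italic="false" underline="true"/>'

# Hypothetical bit assignments. A reader of BITMASK_DOC has to learn these
# from prose documentation; nothing in the XML itself reveals them.
FLAG_BITS = {"bold": 0x01, "italic": 0x02, "underline": 0x04}


def read_explicit(xml_text):
    """Read formatting flags that are spelled out as ordinary attributes."""
    cell = ET.fromstring(xml_text)
    return {name: cell.get(name) == "true" for name in FLAG_BITS}


def read_bitmask(xml_text):
    """Read the same flags when they are packed into one hex string."""
    cell = ET.fromstring(xml_text)
    mask = int(cell.get("flags"), 16)  # the XML parser only sees a string
    return {name: bool(mask & bit) for name, bit in FLAG_BITS.items()}


if __name__ == "__main__":
    print(read_explicit(EXPLICIT_DOC))
    print(read_bitmask(BITMASK_DOC))
```

Both functions recover the same three flags, but only the second needs the `FLAG_BITS` table. A schema can say that each explicit attribute is a boolean; for the packed form it can at best say that the value is a string of hex digits, not what the individual bits mean.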
Miguel then takes on the size question:
A common objection to OOXML is that the specification is “too big”, that 6,000 pages is a bit too much for a specification and that this would prevent third parties from implementing support for the standard.
Considering that for years we, the open source community, have been trying to extract as much information about protocols and file formats from Microsoft, this is actually a good thing.
This is a good thing, I agree, that Microsoft has produced this specification. I’d like even more for them to make the specification for the Office binary formats public, since that is the format that the billions of legacy documents are actually in. I hope you’ll join with me in calling for Microsoft to release the specification for these formats under their Open Specification Promise, so that users will truly be able to choose which format they want to remain in or move to.
However, merely because it is useful from a disclosure perspective, does not necessarily mean it will make a good standard. Simply because it is better than nothing does not mean it is sufficient for an ISO standard. There is an important difference between a descriptive specification and a prescriptive standard. Writing down file formats is a small virtue, and one that other companies have done for years. Do they all deserve to be ISO standards?
For example, many years ago, when I was working on Gnumeric, one of the issues that we ran into was that the actual descriptions for functions and formulas in Excel was not entirely accurate from the public books you could buy.
OOXML devotes 324 pages of the standard to document the formulas and functions.
….
Depending on how you count, ODF has 4 to 10 pages devoted to it. There is no way you could build a spreadsheet software based on this specification.
This is a rather bold misstatement, considering that implementations such as OpenOffice.org, KSpread, Gnumeric, Google Spreadsheets, Lotus Workplace, etc., already in fact exist. Go back even earlier, we had 1-2-3, Quattro Pro and OpenOffice all supporting Excel’s formulas even though there was no formal specification for it. Sure having a good specification helps, but the extreme rhetoric that says that this is unimplementable is patently absurd. Just look around.
Some folks have been using a Wiki to keep track of the issues with OOXML. The motivation for tracking these issues seems to be politically inclined, but it manages to pack some important technical issues.
Hmm… The open source community helps test a purported open standard, reports the defects it finds, and this is called “politically inclined”? Isn’t this what open source is all about, “given sufficient eyeballs, all bugs are shallow”? Shouldn’t open standards be subject to scrutiny? As I said in my blog, I am so impressed by the quality and productivity of this type of wiki-enabled public review that I am going to investigate how we can do this to solicit public comments on ODF 1.2. This isn’t for political reasons. This is because it works.
Some of the objections over OOXML are based around the fact that it does not use existing ISO standards for some of the bits in it. They list 7 ISO standards that OOXML does not use: 8601 dates and times; 639 names and languages; 8632 computer graphics and metafiles; 10118-3 cryptography as well as a handful of W3C standards.
By comparison, ODF only references three ISO standards: Relax NG (OOXML also references this one), 639 (language codes) and 3166 (country codes).
Not only it is demanded that OOXML abide by more standards than ISO’s own ODF does, but also that the format used for metafiles from 1999 be used. It seems like it would prevent some nice features developed in the last 8 years for no other reason than “there was a standard for it”.
Miguel has inexplicably omitted all of the W3C standards that ODF uses, such as XForms, MathML, SVG, XLink, SMIL, XSLT, CSS2 as well as IETF standards such as RFC 2045, RFC 2048, RFC 2616, RFC 2898, RFC 3066, RFC 3987. To imply that OOXML follows more standards than ODF is a foolish statement, unsupported by facts.
On the WMF, Miguel has it all wrong. What is a Windows Metafile? It is simply a recording of the graphical function calls made by Windows as it renders a drawing. It maps 1-to-1 into Windows API calls. It maps so closely to Windows that when the WMF format was found to be vulnerable to a security flaw, even the Wine Windows compatibility layer for Linux was susceptible to the same security hole. WMF (and VML, another legacy format in OOXML with a history of security problems) are flawed formats. One security vendor said: “Turns out this is not really a bug, it’s just bad design. Design from another era.” and “The WMF vulnerability probably affects more computers than any other security vulnerability, ever.”
Although Miguel is pleased to note that the proposed cross-platform ISO standard, Computer Graphics Metafile (CGM), dates to 1999, he fails to mention that WMF is even older, dating back to Windows 3.0 (1990).
So which one should be preferred in an ISO standard? The Windows Metafile format which is not documented in an open standard, is tied to the graphical layer of a single vendor, and has design flaws with serious security implications? Is this what we really want? Or do we want an open standard, one designed to be platform neutral, that has been in use for eight years, that has had a community continuing development and promotion of it such as CGM Open and WebCGM? Where is the WMF community? A Google search for WMF comes up with security problems; a search of CGM comes up with communities, initiatives and test suites.
There is an important-sounding “Ecma 376 relies on undisclosed information” section, but it is a weak case: The case is that Windows Metafiles are not specified.
It is weak because the complaint is that Windows Metafiles are not specified. It is certainly not in the standard, but the information is publicly available and is hardly “undisclosed information”. I would vote to add the information to the standard.
Did you really read the Groklaw issues list? WMF is not the only, or even the most troublesome of the undisclosed information in OOXML. Start here, then go back and read the Groklaw list of issues, and let me know if it makes more sense then. I am not that good at explaining these things, so please ask questions and I will try harder.
I have obviously not read the entire specification, and am biased towards what I have seen in the spreadsheet angle. But considering that it is impossible to implement a spreadsheet program based on ODF, am convinced that the analysis done by those opposing OOXML is incredibly shallow, the burden is on them to prove that ODF is “enough” to implement from scratch alternative applications.
There is that claim, that it is impossible to implement an ODF spreadsheet. Miguel, surely you are aware of OpenOffice, KSpread, Lotus Workplace, Gnumeric, Google Docs? How can you persist in such obvious error? How could you actually write the above when you know, I know, and everyone reading it knows that it is patently false? Please tell me it was just a typographical error.
Here’s a challenge: Give me a list of four spreadsheet applications from four different vendors that today are as interoperable with OOXML as the four leading ODF spreadsheets are with ODF.
There is a good case to be made for OOXML to be further fine-tuned before it becomes an ISO standard. But considering that Office 2007 has shipped, I doubt that any significant changes to the file format would be implemented in the short or medium term.
The best possible outcome in delaying the stamp of approval for OOXML would be to get further clarifications on the standard. Delaying it on the grounds of technical limitations is not going to help much.
This is quite a revealing statement. Why should the shipment of Office 2007 factor in the appropriateness and the quality of a proposed International Standard? Should standards of quality be relaxed for Microsoft’s convenience? Do technical limitations not matter because Microsoft has sales targets to meet? Is this what ISO is for? If so, I suggest their hard-working volunteers be given Microsoft salaries and stock options, since clearly they would be working only for Microsoft’s benefit at this point.
Miguel has a good point at the end:
To make ODF successful, we need to make OpenOffice.org a better product, and we need to keep improving it. It is very easy to nitpick a standard, specially one that is as big as OOXML. But it is a lot harder to actually improve OpenOffice.org.
If everyone complaining about OOXML was actually hacking on improving OpenOffice.org to make it a technically superior product in every sense we would not have to resort, as a community, to play a political case on weak grounds.
OpenOffice.org is one, but not the only application of ODF. It is the most prominent one in the traditional heavy-weight office suite model, but I’m not certain that this is the only way forward. We need good implementations, several of them, since one size does not fit all.
In any case I’d say in return that if Microsoft and Microsoft boosters spent some of their time investigating exactly how easy it would be to encode Office’s legacy features on top of the extensible ODF specification, and worked together with the ODF community to address their common concerns, then we could easily have a single interoperable format that we all could use. The resulting standard of OOXML on top of ODF would be smaller, simpler, higher quality and more interoperable than the mess that we’ll end up with by having OOXML as a standard, in addition to ODF.
Change Log:
2/1/2007 — fixed spelling errors reported by a reader via email
2/2/2007 — another spelling error
Dude,
I actually *wrote* a spreadsheet and I actually *know* how the formula implementations came to be.
There was a lot of guessing and reverse engineering to get those in place.
My point, which you sophistically tried to ignore is that ODF lack of fundamental pieces of the standard would not allow for an implementation based on the spec without resorting to third parties that have reverse engineered the code.
Other than trumpeting ODF, have you actually contributed *ANY* code to OpenOffice, or you are just another armchair general?
Miguel.
Spot on, there. Sometimes it can feel like you’re arguing against MS opening their formats at all, which is I think why there’s so much misunderstanding of your points. This post, though, is pretty damning and has just the right sentiment.
Thanks Rob. Once again you’ve served up the kind of factual thrashing that would summarily silence the Microsoft opportunists, shills, boot-licking lackeys and pay-seeking critics of ODF now swarming the blogosphere – if only there was some shred of honesty amongst them.
Your frustration is starting to show though. Not that this will brace your spirit for the next round and the next round and the next, but keep in mind that this is not about technology, or open standards, or competing visions of global interoperability or the universal portable document model. No, this is simple. It’s all about how Microsoft can extend their desktop monopoly to control of servers, devices and Internet systems.
And that makes it truly a worthwhile fight.
There are two comments i’d like to add.
The first ODF reference implementation, OpenOffice, is an open source project. So is the second implementation, KOffice.
This open source at the application layer has had a great impact on the development of the ODF specification.
Because OOo is open source, all participating vendors and enterprise system providers could study and share discussions about the actual “implementation” of ODF – implementation at the application level.
There were no secrets or business strategies to be guarded, and vendors could openly and freely share in a higher level technology discussion regarding application implementation and interoperability.
We’ll never see that with MSOffice 2007 – the OOXML reference implementation.
The second thing is that i was really surprised at Miguel’s spreadsheet formula comments. He takes for Novell a great deal of the credit for OOXML’s formulas. Yet, Jody Goldberg, who apparently did the OOXML formula XML encoding fix and normalization, was working on the ODF Open Formula Project months before he came to work for Novell.
In October of 2004 David A. Wheeler approached the OASIS ODF TC with his Open Formula Proposal. David began the work immediately and rather quickly started gathering a team of contributors. Jody was part of this early team.
As i recall, the Open Formula Project began with the encyclopedic gathering and in depth analysis of how various applications have implemented formulas over the years. Of course, the greatest problem area to overcome was the closed and closely held MS Excel formulas.
At any time Microsoft could have released those formulas to the public, greatly cutting down the volumes of work needed to be completed by the Open Formula Project for inclusion in ODF.
Instead of a public release that would have greatly helped ODF, Microsoft held these critically important formulas back, burying their XML expressions in the darkly exclusive and private Ecma discussions.
Now we find out that one of the key contributors to the XML encoding and normalization of the Excel formulas is none other than ODF Open Formula participant Jody Goldberg.
So are we to believe that all those months Jody worked with the David A. Wheeler team on ODF Open Formula, normalizing and perfecting the encoding of formulas in XML, working their way across a broad spectrum of legacy and emerging applications, that somehow he came into the mother lode of Excel formulas moving to XML with a clean slate?
No doubt Jody Goldberg does great work and has been a major contributor to ODF Open Formulas. That he could also contribute to OOXML is no doubt going to be of enormous benefit to everyone using XML based spreadsheets for generations to come.
But for Miguel to champion OOXML and slam ODF without even mentioning the exhaustive work done by the ODF Open Formula team, or the role Jody Goldberg played in working both groups, is theft of open source.
The least he could have done was give some credit where credit is due.
Is it just me? Or is this another round of exactly what Novell did to Linux and OpenOffice with their Microsoft deal?
They take open source work, repackage it precisely to take exclusionary advantage of Microsoft promised marketshare and kickback, all at the expense of the unleashing of anti competitive plots designed to destroy the very same open source communities responsible for enabling Novell to get into a tight position with Microsoft – a position of advantage that screws everyone else.
After all that’s happened, with Scovell jokes and sellout insinuations run wild, one would think Miguel would show a bit of sensitivity when pillaging open source efforts. The ODF Open Formula Project is a worthy cause, and does not deserve to be treated with such disrespect. Even when ruthlessly pillaged.
Besides, does anyone really think Microsoft would have opened up the Excel formulas if not for the ODF Open Formula Project? Or even decided on opening their proprietary XML file formats if not for ODF?
As Brian Jones will agree, time lines are proof of motive, intent and provocation. And ODF has pressed the great monopolist at every turn, towards doing what they have never done before. However poorly done it turns out to be.
Hey, now i’m as frustrated and saddened as you Rob :)
~ge~
You write “The resulting standard of OOXML on top of OOXML would be smaller…”. Do you mean “OOXML on top of ODF”, perhaps? That makes more sense at least. Also, I would like you to address Miguel’s critique of ODF’s spreadsheet spec, which he says is insufficiently specified. I know there are interoperable implementations, but do they interoperate based only on the text in the specification, or because of non-normative test suites, OOo source code or something else? If they interoperate only because of things external to the specification, the specification is not good and complete enough. Is this or is this not correct?
I fully agree with Miguel and Rob when they say that the publication of the OOXML specification was a good thing. I think it was also a good thing to make it an ECMA standard, as this promises (but does not ensure) a better stability of the specification.
However, it should be noted that nothing obliges Microsoft to be compliant with the published standard, nor prevents it from using undocumented proprietary extensions. In both cases, the remaining benefit is that the documentation has been published and is available, but the “standardisation” benefit would be defeated. We could even reach a situation where nobody, not even Microsoft, fully and correctly implements the published OOXML standard…
This being said, it is clear to me that the main reason for Microsoft to publish the OOXML specification is politics. They obviously want to avoid having an “externally developed” ISO standard imposed on them by their users (e.g. government bodies). So, the best way for Microsoft to avoid this is to propose their own “internally developed” ISO standard. In this way, they are back on level terms on “political” grounds.
I’m not fully in line with Rob when he asserts that there are no political intents in the ODF versus OOXML fight. I rather think that there MUST be a political side in it: the battle must (also) take place on the battlefield chosen by Microsoft if you want to be able to win.
On the technical grounds, I think that the battle is already won by ODF. In its current state, ECMA 376 is clearly of poor quality, as very well explained by Rob. The best thing for ECMA 376 now is to go back to ECMA for detailed public revision and improvement. And from a technical point of view, getting the ISO stamp is of no help to OOXML.
Sorry to say this, but Miguel is pretty much a shill for Microsoft. He develops and promotes Mono, which contains Microsoft patented code and methods.
He even applied for a job with Microsoft, as you can see from his wikipedia page:
http://en.wikipedia.org/wiki/Miguel_de_Icaza
and was only “turned down” because he was not a US citizen.
He is also in favor of the recent Microsoft/Novell deal, which most in the community recognise as a bad thing for Linux.
That he praises XAML is not surprising to me. It’s in keeping with his past actions.
MdI: “But considering that Office 2007 has shipped, I doubt that any significant changes to the file format would be implemented in the short or medium term”.
Rob: “This is quite a revealing statement. Why should the shipment of Office 2007 factor in the appropriateness and the quality of a proposed International Standard?”
Which echoes my comment to “Defining Deviancy Down” – what if o07 and OOXML disagree?
Will those government agencies still have to use the “open, standard”, Office 2007 which won’t interoperate with anything else based on OOXML?
And at his comment area, there’s a post from someone who generates spreadsheets listing more problems with o07 v.s. OOXML.
But here is a key difference. I have many ODF spreadsheet programs with GPL code which I can see and document (or incorporate if my product is GPL). They really should submit a spec and get that fast-tracked, but it hasn’t seemed to slow down anyone – I don’t hear the proprietary vendors complaining they can’t figure it out. I can (legally!) instrument their implementations or run my program against them and watch how things are handled.
OOXML has been around for a while – So can Miguel or anyone produce an interoperable spreadsheet or not?
I have one closed source o07 spreadsheet implementation which is partially documented over a thousand pages. Where I can’t (legally, without moving) reverse engineer anything – and I’ve implemented specs correctly as written but have misunderstood something so still got it completely wrong.
Will anyone dare to call Office 2007 non-compliant with OOXML even if something far worse is found? And then demand it not be used where things need to be based on open standards?
If not, what is the point? Microsoft can be open if it wants to, but just like when Excel was first written – they told developers that the playing field was level, but Lotus was really slow because Excel used undocumented APIs.
This seems to be a repeat. OOXML is them being “open”, but then Office 20xx won’t interoperate with anything not produced by Microsoft. They want it both ways – to say things are standard and open – while they keep things proprietary and ensure lock-in.
Very good critique. The only weak part is about the formulas, about which Miguel is correct, although he conveniently ignores the active work which is going on. Still, the fact is that the formula specification is incomplete in ODF thus far. That doesn’t make OOXML any more suitable as a standard, but I think it is important to not be blinded by belief in one standard over another. ODF will be more complete with the formulas fully defined, even if it is obviously possible to create an implementation without them. The problem is, it is equally possible to build a completely incompatible implementation without them, so they should be better specified.
On the issue of how political the ODF vs. ooXML fight is …
The ultimate issues for users/consumers/developers are having choices and having control over their IT decisions without being bound to any one platform, technology or vendor from the start. And later being able to replace legacy systems/apps with newer, better systems/apps in a seamless way that leaves everything still working as intended.
If that is “political,” then yes this debate (and Rob’s excellent post) are political.
“Dude,
I actually *wrote* a spreadsheet and I actually *know* how the formula implementations came to be.
There was a lot of guessing and reverse engineering to get those in place.
My point, which you sophistically tried to ignore is that ODF lack of fundamental pieces of the standard would not allow for an implementation based on the spec without resorting to third parties that have reverse engineered the code.
Other than trumpeting ODF, have you actually contributed *ANY* code to OpenOffice, or you are just another armchair general?
Miguel.”
Appreciate everything you did for open source and I wish you luck in your future but one thing I have realized is open source is bigger than any one programmer and that it will survive even without Linus.
ODF is a simple standard to create products from – as devices get smaller I think this will become even more important – small portable open standards to write products from.
I think ODF is a better standard than microsoft’s. I have read both and if I was going to implement a product from one it would be ODF.
If you have any pull with Microsoft I would like for you to suggest to them that they support ODF in their product. With all that money and since it is such an inferior, simpler, and shorter standard I am sure they would have no problem implementing it.
“It seems like it would prevent some nice features developed in the last 8 years for no other reason than “there was a standard for it”.”
that is a good enough reason for me. If missing those “innovations” means not locking me into one company’s product then you can have those “innovations” I value choice of products more than fancy graphics.
Here is an example for you – I was working on documentation for a Solaris upgrade project at work and I did it in ODF exported to a PDF file – had to work on it at home and luckily my daughter’s iMac has Ubuntu Linux on it and KOffice but not OpenOffice because it is somewhat limited in resources. Anyway, I was able to work on it through KOffice because of ODF. This is what I want from a standard. And being in IT I think this is what we should provide to our customers. Not lock-in just for some innovation that they don’t care about and have to pay more money for just so someone can buy another yacht.
Two points:
1. “exactly how easy it would be to encode Office’s legacy features on top of the extensible ODF specification”?
This could be extremely laborious and, in many cases, impossible. Consider the compatibility options in docx files. Many of them, such as WordPerfect 6.x justification, do not represent simple behaviors but complex algorithms. The only way to convey these in ODF would be to extend ODF–and then you have the Microsoft-Java situation all over again, “extend-embrace-extinguish.” Is that what ODF supporters want?
2. “I’d like even more for them to make the specification for the Office binary formats public”
That is not going to happen. Microsoft cannot publish these formats because a) they reveal too much about their programs’ inner workings (they’re akin to memory dumps) and b) they are encumbered by all sorts of third party licensing problems.
Besides, do we really want to keep these formats alive? They’re a mess. Let them die out!
“Other than trumpeting ODF, have you actually contributed *ANY* code to OpenOffice, or you are just another armchair general?”
miguel – I am an armchair general.
when it comes to MY DATA I AM the general. and you people that write the software better realize that before you go out of business.
“There is a good case to be made for OOXML to be further fine-tuned before it becomes an ISO standard. But considering that Office 2007 has shipped, I doubt that any significant changes to the file format would be implemented in the short or medium term.”
thanks for letting us know – seems like you have the inside track on where office is going to go. this sounds like a steve ballmer threat.
good luck with that microsoft affiliation and DRM.
Miguel, I know that all has hit the fan recently with Novell/Mono but there is no need to spout off ridiculous comments about XAML and OOXML. The thing is you guys are going to have to *promote* XAML if you want to get Mono working with .NET so I can see why you are laying the groundwork now!?
You say that you are an open source advocate so why do you think promoting XAML/OOXML (which is getting enough underhanded marketing at the moment) is actually going to improve open source and open standards? In the words of a true English developer…stop talking bollocks!
After reading De Icaza’s blog I got a distinct feeling that Novell is just a temporary stop for him on the way to Redmond. I can explain the half truths, inconsistencies and, above all, Microsoftie marketspeak (see the “well Office 2007 is already out, so why change a bit?” part) in no other way.
Thanks Rob for your detailed rebuttal, and Miguel: the earlier you jump fences the better for all of us.
R.
When Miguel wrote, “The motivation for tracking these issues seems to be politically inclined,” Rob, you seem to have read it as “the motivation for tracking these issues on a wiki seems to be politically inclined,” when of course the motivation for using a wiki was the purely technical point that it allows more eyeballs to be involved. But the motivation for tracking the issues on Grokdoc was overtly political, in that it was openly intended to influence the political process of JTC1. What I don’t understand is why Miguel thinks this is a bad thing.
We think OOXML would be a disaster as an ISO standard, on many levels, and so political involvement is unavoidable. Indeed, the whole concept of labelling something a “standard” is a purely political and social matter, and has nothing per se to do with how something is technically implemented (except insofar as ISO imposes quality controls). And the consequence of labelling something a “standard” is also political: social impetus for developers to follow one specification as opposed to others. Nor is Miguel’s post any less “political.”
Miguel seems to have fallen for the common propaganda that “political” (i.e., trying to persuade others to cooperate towards common goals) unavoidably means “foolish” or “dishonest” or “at odds with technical goals” or “irrational”. That we should all be silent worker bees (as if choosing to be silent were not also a “political” statement about what one values).
Concerning spreadsheets, even if the ODF standard is seen as ‘deficient’ in how it describes spreadsheet formulas, wouldn’t the fact that there are several, independent implementations of them that one can refer to mitigate the problem? I mean, I can just read the Gnumeric source if I want to know exactly how it does XYZ. Now, I can’t copy that source, but it should definitely help me understand the problem. That said, it would be great if you help ‘nitpick’ ODF 1.2–the better the standard can be made, the better it is for all of us.
As for OOXML, even if it provides us with all kinds of information–even information that will help us interoperate with Microsoft–that doesn’t make it a good standard. I see it as an unnecessary duplication which is irrationally tied to one vendor’s internals, while ODF has already grown beyond that.
There’s no need for senseless duplication. OOXML just isn’t ready to be any kind of standard.
Miguel:
Dude, with all that experience and wonderful OOXML documentation why don’t you write ooxmlmeric tonight? Prove you can build a *COMPLETELY* compatible Excel/ooxml program based just on the ooxml “specs”. FULLY COMPATIBLE or don’t bother. Put it out as open source under GPL and offer to fully indemnify anybody who uses your program or touches the code or uses the code in another project, without any financial limit and including all their legal fees. Your employer is big on indemnity – ask them to do it. Then your comments will have some credibility. Right now, with the many compatible implementations of ODF spreadsheets (Huh? with that flimsy spec that you refer to?), you are just not credible. Here’s your chance to show everybody wrong TECHNICALLY AND BUSINESS-WISE.
Are you up to the challenge? Or are you an armchair shill?
“Rick Jeliffe has also added a new column on the so called contradictions found by Rob and Weir and Groklaw and just simply buries them as not being contradictions in ISO terminologie whatsoever.”
I think you need some reading comprehension lessons. I looked up this blog:
http://www.oreillynet.com/xml/blog/2007/01/what_is_contradiction_of_an_is.html
Here’s his conclusion:
“(I should stress that even if you do accept that “contradiction” has the kind of very specific meaning that I describe here, and even if under this meaning the supposed contradictions detailed by Groklaw fall down on inspection, and even if it means that OOXML can pass beyond the Feb5 adminstration periood into the next phase unmolested, it says nothing about whether OOXL should be defeated (or accepted) at the final ISO vote by member bodies.”
If you bother to actually read his comments and compare the original list, you’ll find that a number of things on there (like, oh, the leap-year problem) DO completely match his strict definition of contradiction.
Microsoft has plenty of money. You’d think they’d be able to hire better astroturfers.
If OOXML is supposed to be whatever emerges from the open orifice in 2007, then it isn’t even a DNA sequence, it is a data log (much like WMF is a sequence of API calls).
I would also note that Office 2007 has only been around in release form since 11/30/06, so my thoughts about it and OOXML having dissimilar behaviors and definitions should not be dismissed.
Has anyone audited Office 2007 against the OOXML draft? Fed valid OOXML to Office 2007 to see if it crashed? Or checked the output files to see if they resemble what is specified in OOXML?
When OpenOffice and KOffice did something different, it was either a bug in one or the other or far more importantly, an ambiguity in the specification. (and I suspect the spreadsheet formula handling is also being done with a Null against reference approach). Without a second implementation, the ambiguities stay just that, fast-tracked into approval. And unlike legal contracts, ambiguities are not decided against the author.
(I was involved in RFC2440 and wrote a very lightweight but 100% comprehensive implementation and regression test and found a small bug in one of the others – the lesson is more implementations, fewer problems).
Microsoft could document every behavior of Internet Explorer in detail, but that would not make it W3C compliant. But they could take that list and submit it to ECMA and ISO as a competing standard for web page authoring. Don’t we need another? And it would be open too! IECSS2 and IEHTML anyone?
So is Internet Explorer 7 fixed so that it properly renders a page which passes all the validation tests and looks right in Firefox, Opera and the rest?
Those standards have been around longer than OOXML. And they are clearer. And isn’t Microsoft on the committees?
How is MSN coming along?
I suppose it will be interesting to see if Microsoft can follow a standard of their own writing better than one they just sit and watch or seem to purposely ignore.
And then it would get interesting if a second implementation (preferably GPL) did something correctly but Microsoft refused to fix Office 2007+ to obey their own specification (and if it broke documents). Would governments and others then be forced to switch to such OOGPL?
I don’t have Office 2007, or I’d start looking now. But either way I think it would be interesting – Microsoft correctly documented something or they aren’t OOXML compatible.
I did forget – Microsoft will be releasing in a year or so Office whatever for the Macintosh. I think. Assuming they don’t do a cut-paste job on everything, it should be interesting to see how well they do. But can’t ISO wait until there is both a Mac and Win version of Office Open?
But is the Office for Mac team working from the OOXML spec, or are they reverse-engineering or looking at code from the windows version?
Confusion
There seems to be some confusion with regard to file formats and applications. Either I’m confused, or others writing about the subject are confused.
Simple non-confused statement: ODF is a file format specification.
The confusion seems to be around building applications against the specification as opposed to using the specification.
For example:
Prove you can build a *COMPLETELY* compatible Excel/ooxml program based just on the ooxml “specs”.
A file format specification identifies such things as:
a) What bits are user data.
b) What bits are meta-data about user data. In other words, is that value a name, or a date, how about a birthdate.
c) What bits are meta-data concerning the software itself. As an example, you may have such things as how the user-data is to be formatted.
One could build a straight program that uses its own file format. Or that same program could be made to read only the user-data from that particular format.
I’m a developer. I write code to handle reading and writing to files, along with their formats, all the time. The most common format I use is flat ASCII text. To the non-programmer, an example:
User-Data
Name: John Doe
Birthdate: 01-Jan-2000
Name: Jack Smith
Birthdate: 01-Feb-2000
Flat ASCII format
John Doe    20000101
Jack Smith  20000201
The description of that file layout would be:
Name        Birthdate
xxxxxxxxxxx yyyymmdd
The above shows that the list of x’s is how long the name can be, along with what value fits that “field”. The second “field” is identified as the birthdate and it should be in the format of a 4 digit year followed by a numeric 2 digit month followed by a 2 digit day.
With the above example, what my program decides to do with the user information stored in that file, the file itself doesn’t care.
Let’s say I wanted to produce a mailing list, I’d want the user-information that defines the individuals along with their addresses. My program, using the file specification, knows where to find that information.
Let’s say I wanted to tally up any outstanding debts. My program, again using the file specification, knows where to find the financial information.
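To make the point concrete, the flat-file layout described above can be sketched in a few lines of Python. This is only an illustration: the field widths (an 11-character name, one separator space, an 8-digit yyyymmdd birthdate) are my reading of the layout line, and the function names are made up for the example.

```python
from datetime import date

# Assumed layout, read off the "xxxxxxxxxxx yyyymmdd" description:
# 11 characters of name, one separator space, 8 digits of birthdate.
NAME_WIDTH = 11
DATE_START = NAME_WIDTH + 1
DATE_WIDTH = 8


def parse_record(line):
    """Read one fixed-width record into a dict (hypothetical helper)."""
    name = line[:NAME_WIDTH].rstrip()
    raw = line[DATE_START:DATE_START + DATE_WIDTH]
    birthdate = date(int(raw[0:4]), int(raw[4:6]), int(raw[6:8]))
    return {"name": name, "birthdate": birthdate}


def format_record(name, birthdate):
    """Write one record back out in the same layout (hypothetical helper)."""
    if len(name) > NAME_WIDTH:
        raise ValueError("name does not fit the field")
    return f"{name:<{NAME_WIDTH}} {birthdate:%Y%m%d}"


lines = ["John Doe    20000101", "Jack Smith  20000201"]
records = [parse_record(line) for line in lines]

# Two different "applications" using the same file specification:
mailing_names = [r["name"] for r in records]
february_birthdays = [r["name"] for r in records if r["birthdate"].month == 2]
```

Neither the mailing list nor the birthday filter is described by the file layout, and neither needs to be. Both only rely on knowing where each field lives.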
A file format is NOT the application/program. An application/program should NOT be used to define a file specification.
Now, any developer that has to send information from their system to a third party where the two have agreed on a file layout is very familiar with what I’ve outlined above.
So…. Either I’m really really really confused about what file formats are supposed to be for, or MS has succeeded in convincing people that a file format specification describes the application. If MS actually follows that practice, that’s a very bad practice to have.
RAS
Miguel didn’t write the spreadsheet – at least not in a form that you would recognise today. He certainly started Gnumeric, but lost interest fairly quickly and others finished the job (Goldberg, Meeks, and Lahey, among others I forget) – and it wasn’t the last such project where he did the ‘fun’ parts and left the meat to others.
If anything, his experience with Gnumeric should have taught him that interoperating with MS products is a lose-lose situation. A constant up-hill battle of catch-up which merely burns through engineers without ever quite being good enough to compete. The 6 000 pages of ECMA 376 should clearly demonstrate this continues to be an unachievable goal.
MSOOXML is clearly indefensible as an open document interchange standard on any technical ground, and it is disappointing that he is taking this highly political position. Coming as it does at a time when there is a limited, yet real opportunity to take advantage of the current momentum in the free software and open standards realm, and the unattractive cost (and retraining) of moving to Vista and Office 2007 – it is also irresponsible given his position as a free software personality.
FWLIW, Rob and Miguel, I appear to have had a minor effect on the MS OO XML specification. I don’t know if that makes me an armchair general, or even an armchewer, but I criticized the inclusion of ActiveX controls in the spec. Unlike the Microsoft developers, I have had actual experience in maintaining Microsoft products under Internet-borne malware assault. And ActiveX is bad news.
I complained about it not only to Brian Jones, but also to Port 25. I was surprised to see that they had taken my complaints on board; I had suggested either they provide the full specifications for ActiveX, or abstract the functionality so that I could get the same sort of effect using QT or GTK. They chose the second option.
Now all the hard work by the Groklawyers is further proof that Microsoft hasn’t put nearly as much thought into their Office Open XML file format as they fondly hoped.
My experience is proof that Microsoft suffers from a Not Invented Here syndrome, and while it is possible to get them to change, it takes a lot of work.
Microsoft has already launched their MS Office 2k7 product, with MS OO XML aka ECMA 376 as one of its default file formats. And there are a whole lot of other equally serious issues with it that Microsoft shows little or no interest in rectifying.
Which is why, when I get back on to my (stalled) Office Miniatures project, I’ll be using ODF as the default file format. I can always read the KOffice and OpenOffice.org code to see alternative ways of handling major problems.
Miguel, miguel, miguel…Please explain why we should all roll over and accept XAML/OOXML!? XAML for starters…we have HTML/CSS that everyone is familiar with and completely understands…we are also moving towards XForms through the W3C so why do we need to go with one vendor?
OOXML (Nonsense and useless)…we have ODF which has been developed by multiple vendors WHICH IS OPEN and ready for everyone to start using…Why do you insist that we should work with OOXML? ODF is here to end this madness! OOXML is there ONLY because ODF was gathering momentum…it’s a bit like how IE7 only came around because Firefox had started to take market share.
The real problem that I see is that you are going to need to support XAML with Mono (at some point!) explain how you are going to do that and not touch on any patents? I mean…I keep reading that Mono is not patent encumbered (are you sure about this??) I can only say that it is your lack of wisdom when it comes to dealing with Microsoft…you need to be stung in the a$$ so to speak to realize who you are holding hands with?!?
To Miguel’s question, on whether I have actually done any coding in this area, or whether I am “just another armchair general”. I prefer to let my words and logic stand for themselves. A resume is a poor substitute for a sound argument.
But if you think it makes a bit of difference, I joined IBM as part of their purchase of Lotus, and over the past 17 years I’ve coded on SmartSuite, Lotus Components, eSuite, K-Station, Portal and Workplace, among some of the more notable projects.
Along the way I’ve done a good deal of file format work, with SmartSuite formats, with the legacy binary Office formats, with ODF and with OOXML. I also was part of the team that made the Xalan XSLT engine and contributed that to Apache. So I have some basis for my opinions, based on practical experience as well as clear thinking. You are free to accept or reject either or both.
To the point of interoperability, it is a more complicated issue than you make it out to be. The presence of a specification does not guarantee it, and the absence of a specification does not prevent it. It is a matter of degree based on the transparency of the technology. A binary format, for example, is very difficult to reverse engineer. But a spreadsheet function is rather trivial to reverse engineer. The concept, familiar in patent law, of “undue experimentation” is appropriate here. Can you figure out what AVERAGE() means without undue experimentation? I think so.
Also you might want to take a look at the OOXML formula documentation to see if it does much better. It doesn’t say whether SIN or COS take radians or degrees as input parameters. So obviously even there interoperability will require the use of information that is beyond the normative text of the specification.
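To illustrate how little experimentation this takes, here is a minimal Python sketch. The function name `guess_angle_unit` is made up for the example, and `sin_fn` merely stands in for whatever spreadsheet implementation you are probing. A single call with the input 90 is enough to tell the two conventions apart.

```python
import math


def guess_angle_unit(sin_fn):
    """Probe a black-box SIN implementation with one input (hypothetical helper).

    sin(90 degrees) is exactly 1, while sin(90 radians) is roughly 0.894,
    so the two conventions give clearly different answers.
    """
    result = sin_fn(90)
    if math.isclose(result, 1.0, abs_tol=1e-9):
        return "degrees"
    if math.isclose(result, math.sin(90), abs_tol=1e-9):
        return "radians"
    return "unknown"


# Two stand-in implementations, one per convention:
radians_sin = math.sin
degrees_sin = lambda x: math.sin(math.radians(x))
```

The experiment settles the question, but the answer comes from the implementation and not from the normative text.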
Thanks for proving my point;
“Also some commenter here suggest that the OOXML spreadsheer formula’s were based on the work of the OpenFormula’s group. However he seems to completly overlook that fact that the OOXML spreadsheet formula’s were based on SpreadsheetML, a predesssor of OOXML which was created in 2001/2002, even before the conpcept of OpenFormula ever existed.”
# posted by Anonymous
No matter when SpreadsheetML was released, the fact remains that the XML encoding of Excel formulas was not included.
It’s an interesting fact of history that Microsoft did not disclose their Excel formulas until after IBM had contributed the complete Lotus 123 formula and spreadsheet conversion documentation to the ODF OpenFormula Project.
This is hardly what one would characterize as the straw that broke the camel’s back. It’s more like a ton of bricks falling on Microsoft’s head. They had little choice but to finally give into public outrage and release the secret formulas.
Note that if Microsoft truly believed in the application interoperability they now preach, they could have released the Excel formula documentation at any time – including the 2003 MSOffice release of SpreadsheetML. But they didn’t.
OBTW, now that the MS Translator plugin has been released, suspiciously coinciding with the final days of the Ecma 376 contradiction review, does anyone think Novell will release the Translator plugin for OpenOffice.org on Monday, the final day of MS Ecma 376 review – when fast track ballots must be cast?
~ge~
Gary,
We should also remember how hard Microsoft was pushing the Office 2003 Reference Schemas in Massachusetts even though the spreadsheet formulas were totally unspecified.
Of course, hypocrisy is not a crime.
What I find most distressing is that Microsoft does not even implement the standard it proposes. Word 2007 has dropped justification like WordPerfect 6 (wpJustification) and will only justify by adding spaces between words. This is increasing my document length by about 20% and making my documents look much less professional.
Interoperate with a M$’s crappy “standard” again?! No, thanks…
I will not pretend that M$ honestly wants to be “open”, and I just can’t believe there are people who believe it’s going to be. Is it a good thing that M$ opened its specification? Yes, it is, of course. But please don’t blindly believe that this alone makes it necessarily good without considering the motivations of the company.
Thanks Rob, you have some damn good arguments!