Wow. My previous post seems to have attracted some attention. When I woke up on Monday morning, made my coffee and logged into my email, I found out that my geeky little analysis of Office 2007 SP2’s ODF support had sparked some interest. I did not intend it to be more than an update for the handful of the “usual suspects” who regularly follow ODF issues via various blogs, many of which you see listed to your right. If I had any foreknowledge or expectation that this post would end up being on SlashDot, GrokLaw, ZDnet, IDG, Reuters, CNet, etc., I would have done a better job spell checking, and maybe toned down the rhetoric a little (just a little).
But this widespread interest in the topic tells me one thing: ODF is important. People care about it. People want it to succeed, and when this success is threatened, whether for deliberate or accidental reasons, they are upset. Although Office 2007 SP2 also added PDF and XPS support, you don’t see many stories on that at all.
I’ve been trying to respond to the many comments by anonymous FUDsters and Fanboys on various web sites where my post is being discussed. However, it is getting rather laborious swatting all the gnats. They obviously breed in stagnant waters, and there is an awful lot of that on the web. Since all links lead back here anyways, it will be much simpler to do a recap here and address some of the more widespread errors.
The talking points from Redmond seem to be consistent, along the lines of:
We did a 100% perfect and conforming implementation of ODF 1.1 to the letter of the standard. If it is not interoperable, then it is the fault of the standard or the other applications or some guy we saw sneaking around back on the night of the fire. In any case, it is not our fault. We just design, write, test and sell software to users, businesses, governments and educational institutions. We have no influence over whether our products are interoperable or not. What effect SP2 has on users or the market — that’s not our concern. Come back in 50 years when you have a 100% perfect standard and maybe we’ll talk.
In other words, all of those Interoperability Directors and Interoperability Architects at Microsoft seem to have (hopefully temporarily) switched into Minimal Conformance Directors and Minimal Conformance Architects, and are gazing at their navels. I hope they did not suffer a reduction in salary commensurate with the reduction in their claimed responsibilities.
In any case, their argument might be challenged on several grounds. First up is the question of whether the ODF documents written by Excel 2007 SP2 indeed conform to the ODF 1.1 standard. This is not a hard question to answer, but please excuse this short technical diversion.
Let’s see what the ODF 1.1 standard says in section 8.1.3 (Table Cell):
Addresses of cells that contain numbers. The addresses can be relative or absolute, see section 8.3.1. Addresses in formulas start with a “[“ and end with a “]”. See sections 8.3.1 and 8.3.1 for information about how to address a cell or cell range.
And the referenced section 8.3.1 further says:
To reference table cells so called cell addresses are used. The structure of a cell address is as follows:
- The name of the table.
- A dot (.)
- An alphabetic value representing the column. The letter A represents column 1, B represents column 2, and so on. AA represents column 27, AB represents column 28, and so on.
- A numeric value representing the row. The number 1 represents the first row, the number 2 represents the second row, and so on.
This means that A1 represents the cell in column 1 and row 1. B1 represents the cell in column 2 and row 1. A2 represents the cell in column 1 and row 2.
For example, in a table with the name SampleTable the cell in column 34 and row 16 is referenced by the cell address SampleTable.AH16. In some cases it is not necessary to provide the name of the table. However, the dot must be present. When the table name is not required, the address in the previous example is .AH16
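The addressing rules quoted above are mechanical enough to sketch in a few lines of Python. This is only an illustration of the column-lettering scheme as the standard describes it; the function names `column_letters` and `cell_address` are my own and appear nowhere in ODF 1.1.

```python
def column_letters(column: int) -> str:
    """Convert a 1-based column number to letters (1 -> A, 27 -> AA)."""
    if column < 1:
        raise ValueError("column numbers start at 1")
    letters = ""
    while column > 0:
        # Shift to 0-based before dividing, because there is no "zero" letter.
        column, remainder = divmod(column - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def cell_address(column: int, row: int, table: str = "") -> str:
    """Build a cell address: table name, a dot, column letters, row number.

    The dot is always emitted, even when the table name is left out.
    """
    if row < 1:
        raise ValueError("row numbers start at 1")
    return f"{table}.{column_letters(column)}{row}"


print(cell_address(34, 16, "SampleTable"))  # SampleTable.AH16
print(cell_address(34, 16))                 # .AH16
```

The two printed addresses are the ones used in the standard's own example for column 34, row 16.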
So, going back to my test spreadsheets from all of the various ODF applications, how do these applications encode formulas with cell addresses:
- Symphony 1.3: =[.E12]+[.C13]-[.D13]
- Microsoft/CleverAge 3.0: =[.E12]+[.C13]-[.D13]
- KSpread 1.6.3: =[.E12]+[.C13]-[.D13]
- Google Spreadsheets: =[.E12]+[.C13]-[.D13]
- OpenOffice 3.01: =[.E12]+[.C13]-[.D13]
- Sun Plugin 3.0: [.E12]+[.C13]-[.D13]
- Excel 2007 SP2: =E12+C13-D13
I’ll leave it as an exercise to the reader to determine which one of these seven is wrong and does not conform to the ODF 1.1 standard.
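For those who would rather let a script do the exercise, here is a rough sketch. It assumes a deliberately simple heuristic: strip out everything in square brackets, then look for any A1-style reference left over. The regular expressions and the name `has_bare_references` are mine, not part of any ODF tool, and the check ignores complications such as string literals inside formulas.

```python
import re

# Anything between "[" and "]" counts as a bracketed reference.
BRACKETED = re.compile(r"\[[^\[\]]*\]")
# An A1-style reference that is not part of a longer name or a function call.
BARE = re.compile(r"(?<![A-Za-z0-9_.])\$?[A-Za-z]{1,3}\$?[0-9]+(?![A-Za-z0-9_(])")


def has_bare_references(formula: str) -> bool:
    """Return True if the formula contains a cell reference outside brackets."""
    stripped = BRACKETED.sub("", formula)
    return BARE.search(stripped) is not None


# The formulas exactly as listed above.
samples = {
    "Symphony 1.3": "=[.E12]+[.C13]-[.D13]",
    "Microsoft/CleverAge 3.0": "=[.E12]+[.C13]-[.D13]",
    "KSpread 1.6.3": "=[.E12]+[.C13]-[.D13]",
    "Google Spreadsheets": "=[.E12]+[.C13]-[.D13]",
    "OpenOffice 3.01": "=[.E12]+[.C13]-[.D13]",
    "Sun Plugin 3.0": "[.E12]+[.C13]-[.D13]",
    "Excel 2007 SP2": "=E12+C13-D13",
}

# Names of the applications whose formula has unbracketed cell references.
flagged = [name for name, formula in samples.items() if has_bare_references(formula)]
print(flagged)
```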
Next is the question of the relationship between interoperability and conformance. So we are not building skyscrapers in the air, let’s start with a working definition of interoperability, say that given by ISO/IEC 2382-01, “Information Technology Vocabulary, Fundamental Terms”:
The capability to communicate, execute programs, or transfer data among various functional units in a manner that requires the user to have little or no knowledge of the unique characteristics of those units
I think we probably have a better sense of what conformance is. Something conforms when it meets the requirements defined by a standard.
So let’s explore the relationship between conformance to a standard and interoperability.
First, does interoperability require a standard? No. There have been interoperable systems without formal standards. For example, there is a degree of interoperability among spreadsheet vendors on the basis of the legacy Excel binary file format (XLS), even though the binary format was never standardized and never defines spreadsheet formulas. Another example is the SAX XML parsing API. Widely implemented, but never standardized. We may call them informal or de facto standards.
Additionally, many standards start out as informal technical agreements and specifications that achieve interoperability among a small group of users, who then move it forward to standardization so that a broader audience can benefit. But the interoperability came first and the formal standard came second. See the history of the Atom syndication format for a good example.
Second, is interoperability possible in the presence of non-conformance? Yes. For example, it is well known that the vast majority of web pages (93% by one estimate) on the web today do not conform to the HTML standard. But there is a not unsubstantial degree of interoperability on the web today in spite of this lack of conformance. Generally, interoperability does not require perfection. It requires good faith and hard work. If perfection were required, nothing would work in this world, would it?
Third, if a standard does not define something (like spreadsheet formulas) then I am allowed to do whatever I want, right? This is true. But further, even if ODF 1.1 did define spreadsheet formulas you would still be allowed to do whatever you want. Remember, these are voluntary standards. We can’t force you to do anything, whether we define it or not.
So what then is the precise relationship between conformance and interoperability? I’d state it as:
- In general, conformance is neither necessary nor sufficient to achieve interoperability.
- But interoperability is most efficiently achieved by conformance to an open standard where the standard clearly states those requirements which must be met to achieve interoperability.
In other words, the relationship is due to the efficiency of this configuration to those who wish to interoperate. Conformance is neither necessary nor sufficient to achieve interoperability in general, but interoperability is most efficiently achieved when conformance guarantees interoperability. When I talk about “standards-based interoperability” I’m talking about the situation when you are in the neighborhood of that optimal point.
The inefficiency of other orientations is seen with HTML and Web browsers. Because of the historically low level of HTML conformance by authoring tools and users who hand-edit HTML, browsers today are much more complex than they would otherwise need to be. They need to handle all sorts of malformed HTML documents. This complexity extends to any tool that needs to process HTML. Sure, we have a pretty good grip on this now, with tools like HTML Tidy and other robust parsers, but this has come at a cost. Complexity eats up resources, both human resources (coders and testers) and runtime resources (memory and processing cycles). More complex code is harder to maintain and secure and tends to have more bugs. Greater conformance would have led to a more efficient relationship between conformance and interoperability.
Similarly, the many years of non-conformance in browsers, most notably Internet Explorer, to the CSS2 standard have resulted in an inefficiency there. From the perspective of web designers, tool authors and competing browser vendors, the lack of conformance to the standards has increased the cost needed to achieve interoperability, a cost transferred from a dominant vendor who chose not to conform to the standards, to other vendors who did conform.
The efficiency of conformance to open standards in particular is the clarity and freedom it provides around access to the standard and the contingent IP rights needed to implement the standard.
So back to ODF 1.1. What is the relationship between conformance and interoperability there? Clearly, it is not yet at that optimal point (which few standards ever achieve) where interoperability is most-efficiently achieved. We’re working on it. ODF 1.2 will be better in that regard than ODF 1.1, and the next version will improve on that, and so on.
Does this mean that you cannot create interoperable solutions with ODF? No, it just means that, like most standards in IT today, you need to do some interoperability testing with other vendors’ products to make sure your product interoperates, and make conformant adjustments to your product in order to achieve real-world interoperability. Most vendors who don’t have a monopoly would do this naturally and in fact have done this, as my chart indicated. Complaining about this is like complaining about gravity or friction or entropy. Sure, it sucks. Deal with it. Although it may not pay as much as being a professional mourner, work as a programmer is more regular. And giving value to customers will always bring more satisfaction than standing there weeping about how code is hard.
In any case, this comes down to why you implement a standard. What are your goals? If your goal is to be interoperable, then you perform interoperability testing and make those adjustments to your product necessary to make it both conformant and interoperable. But if your goal is to simply fulfill a checkbox requirement without actually providing any tangible customer benefit, then you will do as little as needed. However, if your goal is to destroy a standard, then you will create a non-conformant, non-interoperable implementation, automatically download it to millions of users and sow confusion in the marketplace by flooding it with millions of incompatible documents. It all depends on your goals. Voluntary standards do not force, or prevent, one approach or another.
To wrap this up, I stand by the table of interoperability results in the previous post. SP2 has reduced the level of interoperability among ODF spreadsheets, by failing to produce conforming ODF documents, and failing to take note of the spreadsheet formula conventions that had been adopted by all of the other vendors and which are working their way through OASIS as a standard.
If we note the arguments used by Microsoft in the recent past, they have argued that OOXML must be exactly what it is (flaws and all) in order to be compatible with legacy binary Office documents. Then they argued that OOXML cannot be changed in ISO, because that would create incompatibility with the “new legacy” documents in Office 2007 XML format. But when it comes to ODF, they have disregarded all legacy ODF documents created by all other ODF vendors and take an aloof stance that looks with disdain on interoperability with other vendors’ documents, or even documents produced by their own ODF Add-in. The sacrosanctness of legacy compatibility appears to be reserved, for strategic reasons, for some formats but not others. We’ll redefine the Gregorian calendar in ISO to be interoperable with one format if we need to, but we won’t deign, won’t stoop, won’t dirty ourselves to use the code we already have from the ODF Add-in for Microsoft Office, to make SP2 formulas interoperable with the other vendors’ products, to benefit our own users who are asking for ODF support in Office. As I said before, this ain’t right.
Rob, the sheer amount of time you have available for writing about our product is impressive.
If you could devote some time to explaining Symphony 1.3 in some detail, that would be helpful too. Since your testing is based on that version of your company’s products, don’t you think your readers deserve to know exactly what software you’re testing, and what its default settings are and so on? As it stands, after not including the SP2 beta in your first round of tests because it was in beta, you’re now publishing and repeatedly referring to tests you’ve done with the beta version of your own product, which the general public does not have access to.
Regarding your claims about the conformance of our approach to formulas below, I’ll be responding in detail very soon in a blog post. Unfortunately, I probably can’t get it out today, as our team is involved in many other activities, such as writing comprehensive implementer notes for our current and future implementations of document-format standards.
I hope they did not suffer a reduction in salary commensurate with the reduction in their claimed responsibilities.:-)))))
“My name is Doug Mahugh and I’m Lead Standards Professional on the Office Interoperability team at Microsoft. “
Doug,
please tell us which products spreadsheets saved in Microsoft Excel 2007 SP2 ODF files are interoperable with.
Samples would also be appreciated so that we can verify your claims.
Thanks
Doug, I’m cursed with a fluent pen. Writing long blog posts is easy for me. Writing short ones is much harder.
My Symphony tests were straightforward. I downloaded the beta, installed it on a Windows XP machine, accepted all of the defaults. Ditto for the other tests, except for KOffice which was done on Ubuntu. In all cases I used the application’s default settings.
Note also, I did not leave out SP2 in the first round of testing because it was beta. I left it out because I did not have access to it. If you have a more recent version of SP2 than the one I tested over the weekend, then please, send it along, and I’ll update my table with it.
Mistakes happen and bugs are fixed. I believe that Microsoft would get more credit and restored goodwill for fixing this bug and slipstreaming it into SP2 downloads than I’ll ever get for having pointed out the bug in the first place. So best to just say, “Whoops, we screwed up. Here’s the fix.” and move on.
“The present letter is a very long one, simply because I had no leisure to make it shorter.”
Blaise Pascal
Just as a reminder to Mr Microsoft there (Doug, correct ?).
It will not be the first time that Microsoft launches a hotfix to a newly released service pack.
I remember that you did it a couple of times on NT days, and the most classical one was the “Windows NT Service Pack 6a”.
At that time, there was also a mystery (or a curse) in the industry, because only the odd service packs for NT (1, 3 and 5) were ok. The even ones always had troubles (btw, this was a secondary reason to explain why you released “6a” instead of 7… to defeat the curse).
So, your team have a good excuse now: Blame the NT’s Service Pack curse (and launch an odd “SP3” as soon as possible).
The analysis of section 8.3.1 would be far more compelling if the section 8.1.3 Table Cell subsection on the table:formula attribute wasn’t so explicit that
(1) “Every formula should begin with a namespace prefix specifying the syntax and semantics used within the formula.”
and
(2) Typically, the formula itself begins with an equal (=) sign and can include the following components: … where 8.3.1 is reference.
The example there is with table:formula=”=sum([.A1:.A5])” which follows the “typically” and neglects the “should.”
Whether or not it was a great idea, it seems to me that the provision of a namespace prefix that refers to a specific syntax and semantics is sufficient. It strikes me that the OO.o 2.x… use of the same device is also sufficient. That OO.o is the archetypical “typically” was helpful to them but certainly not to Microsoft Excel users who are accustomed to a quite different syntax.
So it happens that OO.o formulae don’t fit the typically of Excel formulae and there is a problem. They can’t just ignore the prefix and drop it into Excel hoping it might work.
Whether it would have been effective to attempt rewriting in both directions and hope that result is something that could be anything but a support calamity (not sure the current approach isn’t of course), I am in no position to judge and, if I were you, I would hesitate to predict.
I do think the greatest thing that can be done here, something in our respective powers to contribute to, is get OpenFormula out the door and see what we can do to encourage confirmed, consistent implementations in all ODF-supporting spreadsheet products.
@Orcmid, that is simply the form for expressing an optional requirement. Remember, a mandatory requirement is a requirement on a feature where the feature must be supported for conformance. But an optional requirement is for a feature that is not required, but if it is present must adhere to the stated requirements.
The introduction to that clause makes it clear that bullet list is stating what features are optional. However, the contents of the last bullet, as well as the contents of 8.3.1 make it clear that the stated constraints on addressing cells are normative where cell addressing is used.
Also, as you know, the syntax that is most familiar to users is irrelevant in the context of the file format. The application can use whatever UI they want for entering formulas, 1-2-3 style, Excel style, a visual paradigm, etc. The ODF standard only specifies how such formulas should be stored in the XML. It is up to the application to translate between the UI format for formulas and the storage format.
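To make the split between UI syntax and storage syntax concrete, here is a minimal sketch of such a translation layer. It assumes the simplest possible case: formulas containing only single-cell, same-sheet references such as E12. Ranges, sheet names, absolute references and string literals are not handled, and the names `ui_to_storage` and `storage_to_ui` are hypothetical, not taken from any product.

```python
import re

# A bare A1-style reference: not already bracketed, not part of a longer
# identifier, and not a function name followed by "(".
A1_REF = re.compile(r"(?<![A-Za-z0-9_.\[])([A-Za-z]{1,3}[0-9]+)(?![A-Za-z0-9_(\]])")
# A bracketed single-cell reference as stored in the file, e.g. [.E12].
STORED_REF = re.compile(r"\[\.([A-Za-z]{1,3}[0-9]+)\]")


def ui_to_storage(formula: str) -> str:
    """Wrap each bare cell reference as [.REF] for writing to the file."""
    return A1_REF.sub(lambda match: f"[.{match.group(1).upper()}]", formula)


def storage_to_ui(formula: str) -> str:
    """Unwrap [.REF] back to REF for display in the formula bar."""
    return STORED_REF.sub(r"\1", formula)


print(ui_to_storage("=E12+C13-D13"))           # =[.E12]+[.C13]-[.D13]
print(storage_to_ui("=[.E12]+[.C13]-[.D13]"))  # =E12+C13-D13
```

A real implementation would need a proper formula parser rather than regular expressions, but the point stands: what the user types and what lands in the XML are separate concerns.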
Rob, I don’t read the statement about the table:formula in-attribute prefix as placing any constraint on the choice of semantics and syntax. As far as I am concerned, the appeal to 8.3.1 is covered by the “typically” (which, considering when it was written, wasn’t exactly true at the time, but I’ll let that pass).
In any case, Excel 2007 SP2 is what it is. I think the best way forward is having OpenFormula to rally around.
Also, is OpenFormula as drafted entirely strict about cell references and such, or will that be a condition of the ODF 1.2 incorporation of it? I thought I saw some loose language around the notation when I looked around the first of this year.
I decided I should go look to see if OpenFormula is as loose as I remembered. It is not. The notations for references (which include cell references) seem pretty tight. I don’t know if it matches ODF 1.1 8.3.1 or not, since it is more elaborate and I didn’t look that closely just now.
@orcmid,
“Typically” is an adverb and must be read to modify the nearest verb or verb phrase, or in this case “begins” and “can include”. If there were a comma after “sign” then “typically” would only modify the first verb. But however you slice it, the verb phrase ends with the colon, well before there are any stated requirements on putting cell addresses in square brackets. Also, the requirements on function parameter separators occur in an entirely different paragraph, and those on table cell addressing in an entirely different section.
In any case, I’m not so dismal as to the prospect of “SP2 is what it is”. The automatic download does not start for another 80 days or so. There is still ample opportunity for Microsoft to fix this and the other reported issues.
I am not so certain how clear ODF authors are about the proper use of “:” (nor am I for myself), but I will stick with typically as covering the “can include.”
Probably more constructive, or not, would be to avoid making it our business to second-guess how much time and effort it would take for Microsoft to issue a service-pack that would remedy this situation to your satisfaction. I gave up giving and listening to advice like that when I was younger than you are now; I do believe you are smarter than I was then.
@orcmid:
The formula example in section 6.5.3 does not have an “=” sign at the beginning of the formula. This would indicate that the exclusion of a comma after the words “an equal (=) sign” in section 8.1.3 was most likely a typographical error.
However, even if we accept your conclusion that “typically” modifies “include”, it does not logically follow that the description of the cell addresses is typical. Similarly, a mere reference to 8.3.1 in this section does not render 8.3.1 “typical” as well. You’re essentially creating an elaborate grammatical daisy chain to eliminate whole sections of the specification based on the absence of a comma.
rob, an errata in your post:
where says:
“Let’s see what the ODF 1.1 standard says in section 8.3.1:”
should say:
“Let’s see what the ODF 1.1 standard says in section 8.1.3 (Table Cell)”
@Orlando,
Got it — thanks.
-Rob
Doug:
The sheer amount of time you and Mr. Knowlton have for pushing out criticism of the messenger instead of the message is mind-boggling.
Your customers want interoperability. They will define what it means, not your wasteful DII initiatives.
The public blog commentary benefits your customers. If your company doesn’t start addressing the substantive issues being raised instead of wasting everyone’s time with OOXML and personal attacks on bloggers, they won’t be your customers for long.
“We did a 100% perfect and conforming implementation of ODF 1.1 to the letter of the standard. If it is not interoperable, then it is the fault of the standard or the other applications or some guy we saw sneaking around back on the night of the fire. In any case, it is not our fault. We just design, write, test and sell software to users, businesses, governments and educational institutions. We have no influence over whether our products are interoperable or not. What effect SP2 has on users or the market — that’s not our concern. Come back in 50 years when you have a 100% perfect standard and maybe we’ll talk.”
Can you provide a link to the source of this comment? (Is this actually from someone from Microsoft?)
Rob, THANK YOU for your analysis !
I’ve already seen comments on various websites; people all over the world have learned about your tests (http://www.noooxml.org/forum/t-154383/microsoft-now-attempts-to-sabotage-odf)
As we all expected, Microsoft is doing their best to sabotage ODF. The single wise thing we may do is to have a strategy for avoiding this and raise more and more public awareness about their misbehaviour.
As an European citizen, I think there are a few gentlemen in Bruxelles that are very anxious to learn about the results of your tests ! In this time of deep financial crisis, supplementing the EU’s budget through fines would be welcomed by any EU citizen ! ;-)
Back to practical facts, you’ve concentrated your analysis on the “spreadsheet bug” in SP2. For the sake of the whole community, could you please add a few words about other type of files? Is support for ODF in Word and PowerPoint any better ?
Thanks again,
Răzvan
@foo
You missed the “along the lines of” part.
It’s not an actual quote, but more of a generalization.
-Peder
@foo, that was also an attempt at humor, which I realize may not translate well. For the record, Microsoft has never actually blamed SP2 on “some guy they saw sneaking around back on the night of the fire”. That appears to me the only thing they have not blamed it on. (Again, an attempt at humor. Clear now?)
I noticed Jomar Silva’s observation that password protection of a document is not supported in SP2, which seems to me to be in clear violation of section 17 – oops. “Optionally”.
But it’s interesting that the encryption/password part is the only part of section 17 not implemented in SP2. I guess we can’t have too much interoperability!
KOffice, Open Office, Gnome Office (Gnumeric) all managed to get this right. So what happened to Microsoft?
1) Lack of programming talent
2) Mis-reading of the standard
3) Deliberate attempt to sabotage ODF
Doug can cry that the standard is not sufficiently clear, however the ability of other office suites to produce documents that passed Doug’s test while Microsoft could not, indicates that the problem isn’t the standard, it’s Microsoft. Considering how much money that Microsoft makes, I wouldn’t think it would be lack of talent. When implementing a standard, it is usual to have more than one person read the document, so it can’t be a mis-reading of the standard.
Until Doug provides evidence to the contrary, Item 3, a deliberate attempt to sabotage ODF is the only logical explanation.
Doug, I’m waiting to hear your response.
In one of his own comments on his “Rethinking ODF leadership” post, Gray writes: “I think there is an axiom out there somewhere about what to do when it seems like everybody else is the problem.” I hope that he has found said axiom by now, and that it has taught him what to do about it. (Let’s hope, for his sake, that it wasn’t published in an ODF document, though, or his MSOffice may not be able to open it successfully…)
I tried the formula-thing in Gnumeric creating and saving a simple spreadsheet. The result: Gnumeric saves formulas in the interoperable way ( contrary to what the Microsoft product does ):
“oooc:=[.B3]+[.C3]+[.D3]”
~ $ gnumeric --version
gnumeric version '1.8.2'
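Anyone who wants to repeat this check on their own files can read the stored formulas directly, since an .ods file is a ZIP archive whose content.xml holds the cells. The following is a minimal sketch using only the Python standard library; the function name `list_formulas` and the file name in the usage note are made up for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# The ODF table namespace, which owns both table-cell and the formula attribute.
TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"


def list_formulas(ods_file):
    """Return the table:formula value of every cell that carries one.

    ods_file may be a path or a file-like object holding an .ods archive.
    """
    with zipfile.ZipFile(ods_file) as archive:
        root = ET.fromstring(archive.read("content.xml"))
    cell_tag = f"{{{TABLE_NS}}}table-cell"
    formula_attr = f"{{{TABLE_NS}}}formula"
    return [
        cell.get(formula_attr)
        for cell in root.iter(cell_tag)
        if cell.get(formula_attr) is not None
    ]
```

Calling `list_formulas("test.ods")` on a spreadsheet saved by the application under test returns the formula strings exactly as written to disk, namespace prefix included.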
I wonder whether this is an attempt to create a “standard within a standard” by Microsoft, thereby hijacking the standard, claiming conformance, while forcing others who want interoperability with Microsoft to follow ITS “standard”.
Suppose Microsoft is conforming. In the worst-case scenario, could Microsoft effectively create a “standard within a standard”, then turn around and argue for the sake of interoperability, backwards compatibility should be maintained in future ODF standards, thereby forcing its own “standard within the standard” into future implementations of ODF?
Not a comment on the MS implementation of ODF relating to ODS – but a work around.
You can by-pass the SP2 breaking of formulae in ODS documents by firstly ensuring that the Sun ODF Plugin is installed, then by doing this:
In Excel 2007, click on the Add-In tab. Click on the “Import ODF Document” icon in the Custom Toolbars section. Navigate to your ods document.
I’ve just tested this and it preserves the formulae.
Any other way of opening the ods document still strips out the formula, even with the Sun plug-in installed.
I believe we need the equivalent of the Acid test for Web standards and the Sun’s TCK for Java.
If we had this for ODF 1.0/1.1/1.2 then there would be no doubt as to whether an office suite is conformant.
Come on OASIS get this done quick. MS will have nowhere to go then.
Useful work around.
I have another one, though:
Download and install OpenOffice. Then, never look at the blasted ribbon interface again.
It’s common knowledge that Microsoft has been actively subverting open standards for decades. From file systems to Java to file formats. The problems with SP2 are just the latest outrage in a long history of anti-competitive practices. Even Mahugh’s comment here addresses only conformance, not interoperability.
I like Excel, even despite the abominable Ribbon. It is in nearly every way a better spreadsheet than others. I understand that MS doesn’t want to facilitate its already waning relevance by creating proper support for ODF. But as MS customer, I want and need ODF spreadsheets compatibility. Not conformance! This “damn the customer, full speed ahead” mentality is deplorable.
We have seen this so many times before. When is MS going to grow up and start behaving like a responsible member of the community?
In this myriad of facts and comments, some important things tend to go unnoticed.
Here I am again, with the question I respectfully ask Rob to answer.
Has anyone carefully tested OpenDocument Text (.odt) and Presentations (.odp) interoperability (and also conformance) ?
Can we safely save this type of ODF documents from Word and PowerPoint ?
Many users I know are tempted to put OpenDocument as the DEFAULT saving format. In Excel, this would be a disaster, but could we expect the same in Word or PowerPoint ?
Rob, please, can you share this information with us ?
OOo now gives Microsoft one less reason to avoid supporting the default namespace – Issue 5658 has been fixed, which means that string cells will be used as numbers in expressions.