“Is there any point to which you would wish to draw my attention?”
“To the curious incident of the dog in the night-time.”
“The dog did nothing in the night-time.”
“That was the curious incident,” remarked Sherlock Holmes.
— Silver Blaze by Sir Arthur Conan Doyle
A curious blog post from Brian Jones, looking at spreadsheet interoperability between Gnumeric and Apple’s new Numbers spreadsheet, using OOXML. Take a read there and come back and we can compare notes.
Did anything strike you as odd? What raised my eyebrows was the utterly trivial nature of the spreadsheet document that was tested. Typically an interoperability demonstration will be a little flashy, showing as much functionality as possible. But this one has no text attributes, only a single, default numeric style, no charts, no use of spreadsheet functions, nothing. Why bother? There is nothing in this spreadsheet that one could not easily have created in VisiCalc 25 years ago. So why is this simplistic document being used to demonstrate interoperability with OOXML? This seems very odd. Interoperability with a more substantial document would have been far more persuasive. So why didn’t they do that? Hmmm….
So I decided that I would give it a try, on my Windows XP laptop running Office 2007, OpenOffice 2.1 Novell Edition (giving it a test drive this week) and Gnumeric 1.7.10. Let’s see what really works.
First, let’s start with a more substantial spreadsheet document. I created the following in Office 2007, illustrating a variety of everyday features:
- numeric format
- simple text styles
- cell background fills
- cell alignment
- spreadsheet functions
- charts
- row widths
- worksheet password protection
- cell validation
- hyperlinks
- word art and shapes
- OLE embedding
Nothing fancy here, nothing that hasn’t been around in Office since Office 2000 or earlier. You could have done most of this in 1-2-3 version 2.3 a decade before that. I saved the document both in XLSX (OOXML) and XLS (legacy binary) formats. Both appeared identical as shown here:
Next, I tried opening the XLS file in OpenOffice. We see that it handled the file well:
The colors in the chart are clearly different, but I didn’t set any particular colors in the original, opting for the default. So this may just be an indication that the charts in OpenOffice have different default colors. As you can see from the above picture, everything else looks fine. I did verify that worksheet protections, cell validation and the hyperlink worked correctly. However, although the OLE embedding seems to be there, I was not able to activate it.
Next, I fired up Gnumeric to see how it would fare. I first tried the same XLS file which loaded and displayed like this:
What do we notice?
- Cell A7 did not format properly. It should be in long date format, but it is displaying in time format.
- Chart colors differ, but this is probably just a difference in defaults.
- Chart text is clipped in several instances.
- The OLE embedding failed to come through with correct metafile for display.
- Workbook protections and cell validation worked as expected.
- Hyperlink worked correctly.
- Arrow shape was dropped.
So, that is Gnumeric with a basic binary Excel worksheet. Since Gnumeric obviously supports this level of functionality, what would we expect to see when it loads the same document, but in OOXML format? See the image below for what I saw:
Hmm…OK… I think we hear the dog barking now. The OOXML import into Gnumeric is not really usable yet. In addition to the problems indicated above with the XLS import, we can add the following:
- None of the charts converted
- Worksheet password protection was lost
- The hyperlink is broken
- The OLE embedding is missing
- Cell validation is broken
- The word art is missing
Now a few points I wish to make concerning this. First I don’t want to have this taken as an attack on Gnumeric or those who work on it. Gnumeric is a fine spreadsheet with an impressive range of analytic features. The developers have done an outstanding job on it.
However, Microsoft points to Gnumeric as proof that OOXML can be implemented by other vendors. I suggest the jury is still out on this. 1-2-3 release 1.0a (1984) supported more functionality than Gnumeric does via OOXML. Note that even though there is no complete public documentation on the legacy binary formats, Gnumeric does a far better job at supporting them than it does with “standard” Ecma-376 OOXML and its 6,000 pages of documentation.
Now certainly, with much time and much effort, I’m sure Gnumeric will reach the point where it can read an OOXML document as well as it can read an XLS document. It might take another two or three years, but that day will come. But what benefit is that? All that effort will be spent writing code and testing to achieve practical results that Gnumeric already has achieved with the binary formats. This effort comes at the expense of other development activities such as adding features or fixing bugs. I hardly think that Jody wakes up in the morning overjoyed by the prospect of adding OOXML support to an application that is already compatible with billions of legacy Microsoft documents.
Similarly I have to scratch my head at OpenOffice and their announcement that they are adding OOXML support. As shown earlier, their support of the binary formats is already excellent. I guess that is why Microsoft is so eager to change their default formats. When a product like OpenOffice is able to effectively exchange documents with Office, it is too much of a threat to their Office revenue. So OpenOffice must now spend several person-years recreating this same level of interoperability, and the net result is that they will end up with the same capability they had before, but at the expense of forgoing work on new features. I wonder what Microsoft will do when OpenOffice catches up again in a few years? Hmmm…
(It would be interesting to examine some of the other products that are said to support OOXML. Of the ones that support OOXML as well as the binary formats, how many of them also have OOXML support that is far worse than their binary format support? Is any editor vendor able to stand up and say that OOXML is a blessing to them because it allows higher fidelity interchange with Office than they were able to achieve with the binary formats?)
Everyone is in the same boat with this: KOffice, Corel, Google, IBM, anyone who has applications that work with Microsoft documents. We’re all faced with the prospect of significant expenses to rewrite our file format support with no net benefit to our customers. This is the toll we all must pay to Microsoft just for the ability to fight for the scraps their monopoly may leave behind. If Microsoft jerks their format around, we all must run and chase after it, reallocating resources away from feature work, becoming in the process less competitive in the marketplace, while Microsoft forges ahead with new features. They can easily repeat this game every few years, just to keep competitors busy. This is what a death spiral looks like.
Giving absolute control of a standard document format to a monopolist that is notorious for abusing their control of file formats in the past is insanity. It doesn’t take a Sherlock Holmes to figure that out.
> Giving absolute control of a standard document format to a monopolist that is notorious for abusing their control of file formats in the past is insanity.
Isn’t that the whole point of ISO standardization for the format, taking control AWAY from Microsoft? If we don’t trust them, shouldn’t we want their format controlled by ISO instead of the people in Redmond?
I suppose Microsoft will also tell the world that Novell, Linspire, and Xandros happily support OOXML (because Microsoft paid them to do so. SHEESH!). If so many companies $upport OOXML, then it must be getting a wonderful reception, no?
[Rob]: I wonder what Microsoft will do when OpenOffice catches up again in a few years?
__________________
Don’t ya know? They’ll take a hammer to their DLLs. I don’t see the need for MS-OOXML conversion, since neither my company nor myself at home will ever use it. We already have an ODF policy that excludes any MS-OOXML document under the guise of licensing and patents. We just don’t want anything to do with MS-OOXML or its future.
As Sutor says over and over, “ODF is about the future.” We’ve got the past covered with binary formats. Giving in to MS-OOXML is like paying twice the price at the gas pump — for what reason?
Ed said “Isn’t that the whole point of ISO standardization for the format, taking control AWAY from Microsoft? If we don’t trust them, shouldn’t we want their format controlled by ISO instead of the people in Redmond?”
Except that ECMA 376 does not reflect Office 2007’s actual implementation. I’ll give just one example: VML parts. It’s important because VML is all over the place in ECMA 376. ECMA 376 says it’s deprecated. Office 2007, however, creates VML parts for NEW Word, Excel or PowerPoint documents (below is one example).
If you think that’s nitpicking, think again. Implementers will have to write a VML library (a lot of work, no spec except a cursory description in ECMA 376) if they want to read Office 2007 files without fidelity loss.
We know where this goes. It’s a replay of the binary blobs used in binary formats. By the way, the binary blobs are still here (see my article on Codeproject).
Also, any implementer will quickly figure out that ECMA 376 was written after Office 2007 was done, and that by no stretch of the imagination can it be called a faithful dump of what implementers need to do to be on equal footing with the Office 2007 team (provided they have the time to implement all that it entails in the first place).
It’s typical Microsoft fire and motion.
So Microsoft can put ECMA or ISO in control of the destiny of ECMA 376. It just does not matter, since it’s a theoretical paper that does not reflect anything that exists.
Example for creating VML parts:
- create a new Excel 2007 spreadsheet
- right-click and choose Insert comment
- type a comment.
- save, close.
- unzip the XLSX file.
-Stephane Rodriguez
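The unzip step in the comment above can be sketched in Python, since an XLSX file is a ZIP package. This is only an illustrative sketch: the function name `find_vml_parts` is hypothetical, and identifying VML parts by a `.vml` file extension is an assumption of the sketch rather than something stated in the comment or taken from the specification.

```python
import sys
import zipfile


def find_vml_parts(xlsx):
    """Return the names of parts inside an OOXML package that look like VML.

    `xlsx` may be a file path or a file-like object. Matching on the
    `.vml` extension is an assumption of this sketch, not a rule
    taken from Ecma-376.
    """
    with zipfile.ZipFile(xlsx) as package:
        return [name for name in package.namelist()
                if name.lower().endswith(".vml")]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python find_vml.py Book1.xlsx  (both file names are hypothetical)
    for part in find_vml_parts(sys.argv[1]):
        print(part)
```

Run against a workbook saved after inserting a comment, a non-empty result would be consistent with the claim that Office 2007 still writes VML into new documents.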
Ed, as for the control question, I suggest you look at other Microsoft/Ecma Fast Tracks into JTC1, and ask how much control Microsoft has given up to ISO. Take C# for example. Would you consider the development of C# to be out of Microsoft’s absolute control? I don’t.
Does Ecma own Ecma 376? Yes, in a formal sense. But that ownership consists of ownership of a stack of papers. But the management of this standard is delegated to a committee chaired by Microsoft with a charter that prevents them from making changes that Microsoft does not want. Microsoft is not obliged to have TC45 meet to develop future versions of the standard collaboratively. Microsoft can do all of the work internally and then forward the completed revision to Ecma for rubber stamping. Microsoft doesn’t really give up any control.
Remember the previous post where I showed the Ecma slide claiming that one of their benefits is to “minimize risk of changes to input specs”? This can only happen when the submitter maintains sole practical control over the standard.
As for ISO ownership, Ecma has already sent JTC1 a proposal which would designate Ecma TC45 as the maintainer of OOXML. So maintenance releases of OOXML 1.0 would be owned by Ecma and proceed under their rules, which pretty much reverts to the above, where the work happens by Microsoft and is merely rubber stamped by Ecma.
I wonder… with OOXML, even if the spec is good enough to implement without much effort (which it obviously isn’t), would Microsoft really have to follow it? If Microsoft deviates (intentionally or unintentionally) from the spec, then refuses to correct the problem, would competing products not HAVE to emulate Microsoft’s deviations, because of the ubiquity of Microsoft Office products?
Because Microsoft controls the spec, Microsoft Office is the “Golden standard” of how OOXML must be implemented. Even if Microsoft implemented it poorly, it is still in other vendors’ interest to follow Microsoft’s implementation.
Another point is market lead. With OOXML, Microsoft will *ALWAYS* be the first to market supporting the latest and greatest of anything to do with OOXML, and as a result Microsoft’s Office products will always be seen as the de facto standard and the market leader.
Rob said “I have to scratch my head at OpenOffice and their announcement that they are adding OOXML support.”
Here are my two cents on the topic.
I just place myself in the shoes of a CIO who has to choose between ODF and OOXML. Suppose for the sake of discussion (this is just a scenario, not reality) that ODF developers don’t include good OOXML support in their software and, conversely, OOXML developers don’t support ODF well either. Then the CIO is faced with this alternative:
1- If the CIO picks ODF, he cannot easily exchange documents with partners and customers that prefer OOXML.
2- If the CIO picks OOXML, he cannot easily exchange documents with those that prefer ODF.
What is a CIO to do in such a situation? He will ask “where is the bulk of the market going?” If he gets the wrong answer, exchanging documents with external parties on a daily basis could be a living hell. The CIO will then want to reverse his decision and join the rest of the crowd. There is also the legacy of thousands of documents in the wrong format he will have built. There will be no easy way to migrate them to the other format.
The CIO will see this alternative as a pair of lobster cages. Once he crawls into one of these, there is no easy way to get out. Therefore the CIO will want to pick the right one from the beginning.
This means CIOs have a strong incentive to stall their decision and remain with the current binary formats until the market winner is clear. This is the dominant selection criterion, above cost, technical merit and openness. Those that move ahead can only use the binary formats as a lingua franca to bridge the ODF and OOXML users. This defeats the point of having a standard. Neither ODF nor OOXML can benefit from this situation.
But what if the CIO faces a forced upgrade? What if the Office suite support expires and there will be no security patches anymore? The CIO will see any forced file format decision as a bet on which format is most likely to become the dominant one. It is tough to go against an established monopoly when the decision is framed this way.
The above discussion was about a hypothetical scenario where ODF software doesn’t work with OOXML and vice versa. Let’s go back to reality.
A CIO looking at OpenOffice vs MS Office 2007 will see this alternative:
1- If he picks OpenOffice, he will be able to share ODF and receive OOXML documents without problems.
2- If he picks MS Office 2007, he will be able to share OOXML and receive some ODF documents thanks to the CleverAge converter.
There are no lobster cages. Conversion from one format to another will eventually be possible, even if the tools are rocky for now.
A- OOXML to ODF will be supported by OpenOffice.
B- ODF to OOXML will be resolved. Who expects Microsoft to keep stalling on the Office 2007 ODF support if the OOXML to ODF conversion works in OpenOffice? Users that choose ODF would have no way to go back to Microsoft, but the reverse is not true. The OpenOffice user base can grow at Microsoft’s expense but the reverse is not possible. Microsoft cannot tolerate this.
Market share is no longer the dominant criterion. The CIO can now afford to look at other considerations. OpenOffice has a strong cost advantage and allows for better document exchanges with users of both formats. ODF is more open. Much of the incentive to stick with the monopoly is gone.
> Giving absolute control of a standard document format to a monopolist that is notorious for abusing their control of file formats in the past is insanity. It doesn’t take a Sherlock Holmes to figure that out.
But it DOES take someone who doesn’t have a financial incentive to believe otherwise in the face of clear evidence. At least, that’s been my observation.
Frankly, people do NOT need new word processors / spreadsheets every few years. I couldn’t use 90% of the features in Word if I wanted to without spending hours researching just what they do, but I can still produce anything I need to just fine by using sensible formatting.
It’s funny, my aunt just retired and needed to type something up so she asked if I had Word (I haven’t built her a computer just yet). I told her no, but I had Open Office. She was fearful: “I don’t want to spend hours learning a new program!” But then I showed it to her, and just how familiar it looks.
She sat in front of it and didn’t need to ask me how to work it. She was producing a document with formatting, too. Nothing fancy: just the normal things people use when they’re not trying to create garish documents, so it just used a little bold, underlines, centering, tabs, etc.
And she’s not a computer person.
It’s no wonder Microsoft hates open formats: it would KILL Word’s revenue if the 95% of people and businesses who don’t need anything fancy didn’t have to keep upgrading it to read the files other people send them.
Rob wrote: “If Microsoft jerks their format around, we all must run and chase after it, reallocating resources away from feature work, becoming in the process less competitive in the marketplace, while Microsoft forges ahead with new features. They can easily repeat this game every few years, just to keep competitors busy.”
We all think of the desktop software here. But let’s not forget about the back-end. With e-discovery in the US and many governments subject to access to information laws, the document management systems are critical. Several large organizations will see these systems as key to achieving legal compliance.
Non-compliance is not an option. Document management users will not use a format unless it is supported by the back-end.
Microsoft is trying very hard to own the back-end market with Sharepoint. What will these vendors do? Will they cozy up to Microsoft to make sure they have access to the information they need to integrate the newer Office file formats and keep their systems current? Or will they seek independence with ODF?
In any event they will be subject to the run and chase game while Microsoft is after their market. Microsoft, not the partner, has the power to change the rules of the game at the drop of a hat.
Nice article once more! Keep up the good work!
One sad note from yesterday, the German DIN has accepted[1] OOXML :( – I really can’t understand that decision.
[1] http://www.din.de/cmd?cmsrubid=56731&menurubricid=56731&level=tpl-artikel&menuid=49589&bcrumblevel=1&contextid=din&cmstextid=65004&cmsareaid=49589&languageid=en
I think you are wrong about A7. The format in that cell is [$F800]blah which means “use the system date format”, not to be confused with long date. In other words, different Excels would display different results there. (“blah” is to be ignored.)
Recent versions of Gnumeric will give you 08/23/2007 in en_US locale, but the format to use really comes from the C library.
–Morten
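The point in the comment above, that a “system date format” is resolved by the C library at display time rather than stored in the file, can be illustrated with a small Python sketch. It is an analogy only and makes no claim about how Gnumeric or Excel implement `[$F800]`. The function name `system_date` is hypothetical; `strftime("%x")` asks the C library for the current locale’s preferred date representation.

```python
import datetime
import locale


def system_date(d, locale_name="C"):
    """Format a date using the C library's date format for a given locale.

    The same stored date can render differently depending on the locale
    in effect, which is the behaviour described for a "system date" format.
    """
    previous = locale.setlocale(locale.LC_TIME)  # query the current setting
    try:
        locale.setlocale(locale.LC_TIME, locale_name)
        return d.strftime("%x")
    finally:
        locale.setlocale(locale.LC_TIME, previous)  # restore it afterwards
```

Calling `system_date(datetime.date(2007, 8, 23), "C")` uses the portable C locale. Passing another locale name (for example `"en_US.UTF-8"`, assuming that locale is installed on the machine) may produce a different string for the same date, and an uninstalled locale name raises `locale.Error`.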
I think this comparison contains a bit of sophistry. You include in the original Excel file features that neither iWork nor Gnumeric support. For example, you include an OLE object, when neither iWork nor Gnumeric support OLE. Whether a particular app supports a feature that a file format supports is orthogonal to whether support for that file format can be understood/implemented by programmers in general.
This is nothing new. Let’s say WordProcessorA supports charts but WordProcessorB does not. Let’s say a user uses WordProcessorA to create a document containing textual data and a corresponding chart. Now another user uses WordProcessorB to edit that file. Even though WordProcessorB understands the file format, it doesn’t support charts as a feature. So what should it do? Many would say that it should just show the text, keeping the chart data intact. But if it did this, someone like you would use that as evidence that the file format itself is not implementable by parties other than WordProcessorA. Another, more practical, problem is that if the user opened the file in WordProcessorB which displayed the textual data but kept the chart data intact, then the user edited the textual data and saved it, causing WordProcessorB to save the new textual data and the original chart data, then the document is left in an inconsistent state. When a user used WordProcessorA to open the file, he’d see that the textual data no longer matches the chart data (if he’s lucky; if he’s unlucky he wouldn’t notice the inconsistency).
ODF is not immune to this sort of thing.
The Windows versions of StarOffice and OpenOffice support OLE. What happens if you open an ODF file containing an OLE object in some other ODF app on Linux? How does the app handle the OLE data? Does it display it correctly? Does it do anything to make sure that it remains consistent with changes the user might make to the document?
Putting OLE aside, how about the hypothetical example that I provided? What happens if you open an ODF file containing charts in an app that understands ODF but doesn’t support charts or graphics? How would it “display” the chart? How would the app make sure that changes made to the document remain consistent with the chart data?
If you want to do tests like you have done, you need to make sure that the apps you’re using to open a file support the features that are in that file in the first place. If they don’t support the feature, then they can’t reliably handle the data, even if they understand the file format.
Sir, your confusion is not caused by sophistry, but by your inattentive read of the post. Please give it another try. You’ll find that the test has already incorporated the kind of controls that you mention. And where did I ever mention iWork? You wrote a good comment. Try reading the post this time.
Note that I loaded the same document in XLS format into Gnumeric and it rendered quite completely. So I have established that Gnumeric supports that feature set. Then I loaded the same document in XLSX format and it rendered very poorly. So this demonstrates that the OOXML support is the problem, not the application support for the underlying features.
Rob, actually you are even lucky that Gnumeric shows formatted styles when you open the XLSX file. The reason: there is currently no implementation for writing styles back and, as one Gnumeric contributor put it in the source code, “TODO : just about everything”.
The point the other commenter tried to make about OLE is deceptive as well. If a competing product does not support ONE important feature of said Microsoft product despite all the best effort and many years of work, then this proves that the competing product cannot compete on equal footing. If Microsoft had not chosen to submit this stuff to ECMA and ISO claiming “it can be supported”, maybe (just maybe) we would not be up in arms.
At that point, I feel safe to say that someone supporting those obviously flawed Microsoft claims is either ignorant or paid by Microsoft.
-Stephane Rodriguez
Isn’t “competing products” an oxymoron? Anyone who uses that term to describe the competition does not understand the pervasiveness of MS Office.
I think it’s a healthy monopoly, like the W3C’s toehold with HTML. People need a standard app for doing office documents. Subsidizing the losers in the market is a waste of time. The winners should decide the future, not the minority.