
An Antic Disposition


ODF

An Invitation: ODF Interoperability Workshop

2007/08/02 By Rob Leave a Comment

The OASIS ODF Adoption TC is organizing an ODF Camp to be held on September 20th in Barcelona, Spain. Facilities for this event are graciously provided by OpenOffice.org, which will be holding its annual conference concurrently.

The hope is that this will be the first of several such events to bring ODF vendors together to explore ways of greater technical coordination, especially in the area of interoperability. I’ve written about and presented on this topic before. Now is the time for action, and I’m extremely pleased that so many vendors will be attending.

On other occasions I’ve called interoperability “the price of success” because a standard implemented by only a single vendor and a single application need not worry about it. Only successful standards with many implementations need to rent a hall to bring the implementors together to review and perfect interoperability.

(It is like capital gains taxes. I grumble when I pay them, but take some solace in the fact that my investments were profitable. Those who make a losing investment don’t pay capital gains taxes on it.)

The focus of this first interoperability event will be on the ODF word processor format. Follow-up events will look at spreadsheets and presentations.

Please have a look at the detailed agenda for the camp and consider joining us in Barcelona.

Filed Under: Interoperability, ODF

My comments on the ETRM 4.0 draft

2007/07/29 By Rob 12 Comments

This was my response to the call for public comments on the Information Technology Division’s (ITD) Enterprise Technical Reference Model (ETRM) 4.0 draft.


I’d like to write to you as a long-time Massachusetts resident and taxpayer. My employer (IBM) will likely submit their own comments, but I’d like to offer you my own personal views on the ETRM 4.0 draft.

I am proud of the Commonwealth’s tradition of openness in government, enshrined in our Public Records Law and Open Meeting Law. As James Madison wrote, “A popular government, without popular information, or the means of acquiring it, is but a prologue to a farce or a tragedy. A people who mean to be their own governors must arm themselves with the power which knowledge gives them.” So access to government documents, now and for posterity, is critical for public oversight and participation in government, as well as for preserving our heritage. Now that we’ve moved into the digital age, access to government documents requires that these documents be made available in a format that all Commonwealth residents can read. So the move toward open document formats, as called for in the ETRM, is laudable. A citizen must never be dependent on any single vendor for the software needed to read their government’s documents.

However, I am concerned at the proposed addition of Ecma Office Open XML (OOXML) to the list of acceptable document formats. As you may have heard, OOXML is currently undergoing review by ISO/IEC JTC1 for possible approval as an ISO standard. As part of this review, technical committees in standards bodies around the world are reviewing OOXML and appraising its suitability as an International Standard. As a participant in the US committee reviewing OOXML, INCITS V1, I had the opportunity to review the text of the OOXML specification and to discuss it with others. I am sorry to report that I found the OOXML specification to be full of errors and omissions. Of course, no technical document is perfect. But this one, in particular, is of far greater length (more than 6,000 pages) and of far lower quality than any I have seen before. If it has advanced this far in the ISO process it is because of vendor pressure, not because of technical merit.

What is the problem with a buggy standard? Interoperability suffers. That is the problem. There is no doubt that if everyone in the Commonwealth used Microsoft Office 2007 on Windows Vista, their interoperability would be good. But as soon as we admit choice in applications and operating systems, then interoperability will only occur when all sides follow a common standard. So the technical quality of a standard (accuracy, comprehensiveness, level of detail, consistency, etc.) is directly proportional to the level of interoperability achievable and the cost to achieve it.

The ISO ballot on OOXML will not end until September 2nd, after which a resolution process to fix defects in the text of the standard will take at least an additional 6-18 months. That is, of course, if OOXML gains ISO approval, something which is not certain at this point. So I would recommend a cautious approach: wait for the ISO process to conclude, or conduct your own independent technical evaluation of the OOXML specification to confirm its technical quality before adding OOXML to your list. Ask other vendors: Is this something you can implement? Ask yourself: Will this truly give the Commonwealth the interoperability and choice that you desire? These are important questions to ask.

Finally, I’d note that the ETRM also calls out OpenDocument Format (ODF) as an acceptable format. ODF was approved by ISO last year. So why do we need OOXML? I personally think that the complexity of document exchange and translation in a multi-format world would take us back to the confusion and frustration of the early 1990s when we all juggled WordStar, WordPerfect, Word and WordPro files, and could collaborate only poorly. Better to push for a single unified/harmonized standard document format for personal productivity applications, much as we have a single standard (HTML) for web pages.

I’ll leave you with a quote from Tim Berners-Lee, the inventor of the web, from an interview he gave to David Berlind of ZDNet when Berners-Lee was recently in Boston receiving a Lifetime Achievement Award from the Massachusetts Innovation & Technology Exchange.

Berners-Lee said:

It was the standardization around HTML that allowed the web to take off. It was not only the fact that it is standard, but the fact that it’s open and the fact that it is royalty-free.

So what we saw on top of the web was a huge diversity and different business which are built on top of the web given that it is an open platform.

If HTML had not been free, if it had been proprietary technology, then there would have been the business of actually selling HTML and the competing JTML, LTML, MTML products. Because we wouldn’t have had the open platform, we would have had competition for these various different browser platforms, but we wouldn’t have had the web. We wouldn’t have had everything growing on top of it.

So I think it very important that as we move on to new spaces … we must keep the same openness that we had before. We must keep an open internet platform, keep the standards for the presentation languages common and royalty free. So that means, yes, we need standards, because the money, the excitement is not competing over the technology at that level. The excitement is in the businesses and the applications that you built on top of the web platform.

I believe we want to ensure the same qualities in document formats. We want competition and choice among vendors, applications and services, but not among standards. If we compete on standards, then no one wins.

Filed Under: ODF, OOXML, Standards

The Formula for Failure

2007/07/09 By Rob 47 Comments

It has been a boast for around 6 months now. Microsoft’s OOXML fully defines spreadsheet formulas, and ODF doesn’t. The Microsoft boosters have been parroting the party line for quite some time.

Miguel de Icaza gleefully noted back in January:

OOXML devotes 324 pages of the standard to document the formulas and functions.

The original submission to the ECMA TC45 working group did not have any of this information. Jody Goldberg and Michael Meeks that represented Novell at the TC45 requested the information and it eventually made it into the standards. I consider this a win, and I consider those 324 extra pages a win for everyone (almost half the size of the ODF standard).

And Microsoft’s Jean Paoli, quoted in May in InfoWorld:

As far as those 6,000 pages of specs is concerned, there are 350 pages in the OpenXML spec alone — half of the entire ODF spec — just to describe spreadsheet capabilities, which ODF doesn’t have, Paoli says. For example, ODF can’t describe or calculate a formula in a spreadsheet.

“It may sound amazing. They are working on it now. But the current standard doesn’t have it,” Paoli tells me.

There are many other examples, if you care to seek them out. But what you will not find is an examination of what OOXML actually specifies for spreadsheet formulas, or confirmation that it was done sufficiently. Maybe the assumption is that this would be a trivial task, documenting Excel’s behavior? What could possibly go wrong?

Let’s find out.

First, let’s take the trigonometric functions, SIN (Part 4, Section 3.17.7.287), COS (Part 4, Section 3.17.7.50) and TAN (Part 4, Section 3.17.7.313). Hard to mess these up, right? Well, what if you fail to state whether their arguments are angles expressed in radians or degrees? Whoops. Same problem for the return values of the inverse functions, ASIN (Part 4, Section 3.17.7.12), ACOS (Part 4, Section 3.17.7.4), ATAN (Part 4, Section 3.17.7.14), and ATAN2 (Part 4, Section 3.17.7.15). It is hard to have interoperable versions of these functions if the units are not specified. What kind of review in Ecma would miss something so simple?
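To make the ambiguity concrete, here is a minimal Python sketch of two implementations that both "follow" a spec which never names the unit. The helper names `sin_radians` and `sin_degrees` are made up for illustration; the only library fact relied on is that Python's `math.sin` takes radians.

```python
import math

def sin_radians(x):
    """SIN, reading the unspecified argument as radians."""
    return math.sin(x)

def sin_degrees(x):
    """SIN, reading the unspecified argument as degrees."""
    return math.sin(math.radians(x))

# The same cell formula, =SIN(90), under the two readings.
print(sin_radians(90))  # roughly 0.894
print(sin_degrees(90))  # 1.0
```

Two applications could each pick one reading and disagree on every trigonometric cell in a shared spreadsheet.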

The AVEDEV function (Part 4, Section 3.17.7.17) should return the average deviation of a list of values. However, the formula given for this function is actually for calculating the number of combinations of n things taken k at a time. Nice formula, though. Jakob Bernoulli would be proud. But anyone using an OOXML spreadsheet application that follows this standard will be perplexed at the values returned by their AVEDEV function. Did these formulas get any expert review in Ecma?
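For contrast, here is a minimal Python sketch of the two calculations mentioned above: the average (mean absolute) deviation of a list of values, and the number of combinations of n things taken k at a time. The function names and sample data are illustrative only and are not taken from either standard.

```python
from math import comb

def avedev(values):
    """Average of the absolute deviations of the values from their mean."""
    mean = sum(values) / len(values)
    return sum(abs(v - mean) for v in values) / len(values)

def combinations(n, k):
    """Number of ways to choose k things from n, i.e. n! / (k! (n-k)!)."""
    return comb(n, k)

data = [2, 4, 4, 4, 5, 5, 7, 9]
print(avedev(data))         # 1.5
print(combinations(8, 2))   # 28
```

The two formulas do not even take the same kind of input, so an implementer cannot reconcile the description with the formula printed beside it.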

It is hard to have confidence in the CONFIDENCE function (Part 4, Section 3.17.7.47). It is said to return the confidence interval around a sample mean given an alpha value, a standard deviation and a sample size. The trouble is that this problem is under-defined. One must make an assumption, not stated here, as to the shape of the data distribution. Is it normally distributed data? Exponentially distributed? Weibull distribution? The standard does not define the meaning of this function sufficiently for one to implement it.
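As a sketch of what a fully stated definition might look like, the Python below computes the half-width of a confidence interval under one explicit assumption: that the data are normally distributed with a known standard deviation. That assumption is mine for the purpose of illustration; it is exactly the piece the paragraph above says the standard leaves out. The function name `confidence_normal` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def confidence_normal(alpha, standard_dev, size):
    """Half-width of the (1 - alpha) confidence interval for a mean,
    ASSUMING normally distributed data with known standard deviation."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    return z * standard_dev / sqrt(size)

print(confidence_normal(0.05, 2.5, 50))  # roughly 0.693
```

A different distributional assumption would call for a different critical value, and therefore a different answer from the same three inputs.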

The CONVERT function (Part 4, Section 3.17.7.48) converts from one unit to another. Some conversions explicitly allowed include liquid measure conversions such as from liters to cups or tablespoons. But whose cup and whose tablespoon? Traditional liquid measures vary from country to country. In the US, a cup is 8oz, except for FDA labeling purposes when a cup is 240ml. But in Australia a cup is 250ml and in the UK it is 285ml. Similarly a tablespoon has various definitions. OOXML is silent on what assumptions an application should make. I guess I won’t be using OOXML to store my recipes, and certainly not to calculate medical doses!
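A small Python sketch shows how far apart the answers can land. The cup sizes are the ones quoted above, with the US customary cup of 8 US fluid ounces taken as roughly 236.588 ml; the table and function names are illustrative and not part of any standard.

```python
# Millilitres per cup under each convention mentioned in the text.
CUP_ML = {
    "US customary": 236.588,  # 8 US fluid ounces, approximate
    "US FDA label": 240.0,
    "Australia": 250.0,
    "UK": 285.0,
}

def liters_to_cups(liters, convention):
    """Convert liters to cups under a named cup convention."""
    return liters * 1000.0 / CUP_ML[convention]

for name in CUP_ML:
    print(name, round(liters_to_cups(1.0, name), 3))
```

One liter comes out as anywhere from about 3.5 to about 4.2 cups depending on which cup the application silently assumes.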

Almost every one of the financial functions in OOXML depends on a “day count basis” flag, such as US (NASD) 30/360, Actual/Actual, Actual/360, Actual/365, European 30/360. These represent various conventions for how days and months are counted. The problem is that the OOXML standard does not define these conventions, nor does it point to an authority for their definition. There are subtle behaviors here, especially when dealing with leap years and Excel’s deviant treatment of dates in the year 1900. So the lack of detailed definitions in this area makes it impossible for anyone to rely on identical financial calculations from different OOXML implementations. This, in a field where being off by a penny can cause problems.
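To illustrate why the convention has to be pinned down, here is a simplified Python sketch of two day counts over the same period. The European 30/360 rule used here is the common textbook form (both day-of-month values capped at 30); whether that is the rule OOXML intends is precisely what the standard does not say, so treat both functions as assumptions.

```python
from datetime import date

def days_30e_360(start, end):
    """European 30/360 day count, textbook form: cap each day at 30."""
    d1 = min(start.day, 30)
    d2 = min(end.day, 30)
    return (360 * (end.year - start.year)
            + 30 * (end.month - start.month)
            + (d2 - d1))

def days_actual(start, end):
    """Actual number of calendar days between the two dates."""
    return (end - start).days

start, end = date(2007, 1, 31), date(2007, 3, 31)
print(days_30e_360(start, end))  # 60
print(days_actual(start, end))   # 59
```

Divided by 360 to get a year fraction, the one-day difference flows straight into any accrued interest figure computed from it.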

Almost 30 spreadsheet functions are broken in this way.

(What do you call a scientist whose calculations are off by 50%? A cosmologist. What do you call an accountant whose calculations are off by 1%? A crook.)

The NETWORKDAYS function (Part 4, Section 3.17.7.344) seems simple enough. It returns the number of workdays (non-weekend days) between two dates. Simple enough. Unless you live in the Middle East. The problem is that this function doesn’t provide a facility for distinguishing the different weekend conventions. I may have a weekend on Saturday & Sunday, but a colleague in Tel Aviv might have off Friday and Saturday, while in Cairo it might be Thursday and Friday. This function lacks the adaptability to deal with this important cultural difference. Saying that the definition of the weekend is implementation- or locale-dependent won’t work either. I may be a French company in Paris dealing with contractors in Algeria. I need to have a French spreadsheet calculate schedules for workers at various locations and be able to exchange it with other offices using other OOXML applications and expect that they will get the same answer. Lacking cultural adaptability, OOXML fails approximately a billion people here.
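A minimal Python sketch shows how the count changes once the weekend is made an explicit input. The `weekend` parameter and its encoding (a tuple of `date.weekday()` numbers) are hypothetical choices for this example and do not reproduce the parameter of any actual standard; holidays are ignored.

```python
from datetime import date, timedelta

def networkdays(start, end, weekend=(5, 6)):
    """Count the days from start to end, inclusive, that are not weekend days.

    weekend holds date.weekday() numbers: Monday is 0, Sunday is 6.
    """
    count = 0
    day = start
    while day <= end:
        if day.weekday() not in weekend:
            count += 1
        day += timedelta(days=1)
    return count

start, end = date(2007, 7, 7), date(2007, 7, 15)  # a Saturday to the next Sunday
print(networkdays(start, end))                  # Saturday/Sunday weekend: 5
print(networkdays(start, end, weekend=(4, 5)))  # Friday/Saturday weekend: 6
print(networkdays(start, end, weekend=(3, 4)))  # Thursday/Friday weekend: 7
```

Because the convention travels with the formula rather than with the reader's locale, every application that opens the file can arrive at the same number.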

Another example. Several of the statistical functions in OOXML are defined incorrectly. Take, for example, the ZTEST function (Part 4, Section 3.17.7.352). The key error follows the formula, where it says, “where x is the sample mean.” The problem is that x-bar is the sample mean, not x. Someone who implements according to the text will give their users the wrong answer. A similar error is repeated in 8 other statistical functions. Certainly this is a typographical error, but this error changes the answer. Remember, this is an approved Ecma Standard and a proposed ISO Standard, not a 4th grade school essay. Denmark and Massachusetts have already said they will adopt OOXML for official business. Spelling counts. Providing the right formula and the right description counts. Copy and paste errors should have been taken care of back during the Ecma review.
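Here is a hedged Python sketch of how much rides on that one symbol. The first function uses the usual one-tailed z-test with the sample mean; the second shows one possible misreading, in which a single data point stands in for the mean. Both function names, the sample data and the choice of misreading are assumptions made for illustration, not a reconstruction of the OOXML text.

```python
from math import sqrt
from statistics import NormalDist, mean

def ztest_upper_tail(sample, mu0, sigma):
    """One-tailed p-value using the sample mean (x-bar)."""
    z = (mean(sample) - mu0) / (sigma / sqrt(len(sample)))
    return 1 - NormalDist().cdf(z)

def ztest_misread(sample, mu0, sigma):
    """Same formula, but with a single data point x in place of x-bar."""
    z = (sample[0] - mu0) / (sigma / sqrt(len(sample)))
    return 1 - NormalDist().cdf(z)

sample = [3, 6, 7, 8, 6, 5, 4, 2, 1, 9]
print(ztest_upper_tail(sample, 4, 2.0))
print(ztest_misread(sample, 4, 2.0))
```

On this data the first result falls below 0.05 and the second lands above 0.9, so the two readings would lead a user to opposite conclusions.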

I’ve submitted these spreadsheet formula issues, and many others, to INCITS V1, for consideration in determining the US position on the OOXML ISO ballot, but we never got to them during our two-day meeting in DC a couple of weeks ago, and may not get to them at all. There are simply too many other issues to read through and discuss. But I thought it was important to bring up these formula issues in particular, since Microsoft seems especially proud of their work in this area, delusions of adequacy which on reflection must now seem unwarranted. I’m especially concerned with the financial functions, since they are outside my area of expertise and may have additional errors that I missed.

So what is ODF doing about formulas? We’re continuing to work on them. Rather than rush, we’re doing careful, methodical work. We’re documenting the functions in great detail. Where we have the choice between the common naive formula for a function and one that is numerically stable, we’re documenting the stable function. For the NETWORKDAYS function, we created an optional extra parameter, so a user can pass in a flag that tells what their weekend conventions are. We have a professor of statistics reviewing our statistics functions for completeness and accuracy. We’re verifying our assumptions about financial functions by referring to core specifications from groups like the ISDA and the NASD. We’re creating a huge number of test cases and checking them with Excel and other applications.
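The difference between a naive formula and a numerically stable one can be sketched in Python with sample variance. Variance is my own choice of example, since the text does not say which functions were affected; the stable version uses Welford's well-known one-pass update.

```python
def naive_variance(xs):
    """Textbook shortcut: (sum of squares - (sum)^2 / n) / (n - 1)."""
    n = len(xs)
    total = sum(xs)
    total_sq = sum(x * x for x in xs)
    return (total_sq - total * total / n) / (n - 1)

def welford_variance(xs):
    """Welford's one-pass algorithm, which avoids subtracting huge numbers."""
    count, mean, m2 = 0, 0.0, 0.0
    for x in xs:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    return m2 / (count - 1)

# Large values with a small spread; the true sample variance is 30.
data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]
print(naive_variance(data))    # not 30: the subtraction cancels catastrophically
print(welford_variance(data))  # 30.0
```

Both functions are algebraically the same formula, which is why a standard that documents only the algebra leaves implementers free to return very different numbers.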

Under Sarbanes-Oxley, a CEO or CFO puts himself at personal risk if he signs off on financial numbers derived from processes and tools that he knows to give erroneous results. So we utterly reject a rushed process that has led to an Ecma Standard which incompletely and incorrectly defines spreadsheet functions. Some things are worth taking the time to do right.

As I’ve shown, in the rush to write a 6,000 page standard in less than a year, Ecma dropped the ball. OOXML’s spreadsheet formula specification is worse than missing. It has incorrect formulas that, if implemented according to this standard, would raise important health, safety and environmental concerns, aside from the obvious financial risks of a spreadsheet that calculates incorrect results. This standard is seriously messed up. Shame on all those who praised and continue to praise the OOXML formula specification without actually reading it.

Filed Under: ODF, OOXML

An ODF/OOXML File Format Timeline

2007/06/24 By Rob 31 Comments

I suppose the downside of a blog post containing only a picture is that there is nothing for anyone to quote. So here are a few themes that struck me while putting this chart together:

  1. Microsoft once made file format information on the binary formats readily available, in fact encouraged programmers to use the binary formats. But then around 1999 they reversed course, and eliminated such documentation. At the time, working at Lotus, I had no idea what motivated this change. It was only years later, when Microsoft internal memos were released in cases like Comes v. Microsoft, that the full picture emerged. The file format was viewed by Microsoft as a strategic tool, used to support the overall Microsoft platform, not the user. The format was designed to preserve their vendor lock-in. The availability of the file format documentation to competitors was limited, as a matter of corporate policy.

     So this reminds us that just because something is documented and available today does not prevent Microsoft from changing their mind at a later point and removing the documentation, failing to update it with new releases, or making it available only under a more restrictive license. Since Ecma owns the OOXML specification, as well as the future maintenance of it, any belief in the long-term openness of this format depends on your trust of Microsoft’s future behavior in this area.
  2. Like any durable goods monopoly (and few things are as durable as software) Microsoft’s largest competitor is their own install base. Microsoft has made many attempts at moving beyond the binary formats in the past, with Office 2000, Office XP and Office 2003. But in each case it failed. These were all false starts and abandoned attempts. So we should look for signs that OOXML is actually Microsoft’s real direction and not another false start or dead end. My guess is that OOXML is merely a transitional format, much like Windows ME was in the OS space, a temporary hybrid used to ease the transition from 16-bit to the 32-bit platform that would eventually come (Windows 2000). Microsoft doesn’t want to support all of the quirks of their legacy formats forever. That just leads to bloated, fragile code, more expensive development and support costs. They would rather have clean, structured markup, like ODF. But the question is, how do you get there? The answer is straightforward: First, eliminate the competition. Second, move users in small steps, promising the comfort of continuity and safety. Third, once you have eliminated competition and have the users on the OOXML format that no one but Microsoft fully understands, then you may have your will of them. For example, introduce a new format that drops support for legacy formats and force everyone to upgrade. They are pretty much doing this already on the Mac by dropping support for VBA in the next version of the Mac Office.

     Even a cursory look at OOXML shows that it was not designed for long-term use, even by Microsoft. So the question I have is, what is the real format that they are going toward?
  3. Microsoft, after pretty much ignoring document standards for over a decade, suddenly got religion in late 2005 and rushed whatever they had on hand into Ecma. Remember, just months earlier they had recommended the Office 2003 Reference Schemas to Massachusetts for official use. I’m certainly glad Massachusetts did not fall for that by putting their resources on another dead format in the Microsoft format graveyard. OOXML was not designed to be a standard. It is just a proprietary specification that Microsoft has dumped, at the last minute, into ISO’s lap, in an attempt to translate their market domination into a standards imprimatur in order to further cement their market domination. It is a win-win situation for them. Either they have an effective monopoly in office applications and an ISO standard, or they have an effective monopoly in office applications. Nice situation for them either way.

Filed Under: ODF, OOXML, Popular Posts

Hemidemisemiquavers

2007/06/11 By Rob 9 Comments

Some “short notes” to share with you:

From a GrokLaw news pick we hear that ZDNet’s David Berlind recently interviewed Tim Berners-Lee in Boston, where Sir Tim received the Massachusetts Innovation and Technology Exchange’s Lifetime Achievement Award. Watch the whole interview if you have 12 minutes, though I will transcribe one passage which highlights the importance of agreeing on a single open standard for a problem domain and fostering competition among the applications built upon that standard:

It was the standardization around HTML that allowed the web to take off. It was not only the fact that it is standard, but the fact that it’s open and the fact that it is royalty-free.

So what we saw on top of the web was a huge diversity and different business which are built on top of the web given that it is an open platform.

If HTML had not been free, if it had been proprietary technology, then there would have been the business of actually selling HTML and the competing JTML, LTML, MTML products. Because we wouldn’t have had the open platform, we would have had competition for these various different browser platforms, but we wouldn’t have had the web. We wouldn’t have had everything growing on top of it.

So I think it very important that as we move on to new spaces … we must keep the same openness that we had before. We must keep an open internet platform, keep the standards for the presentation languages common and royalty free. So that means, yes, we need standards, because the money, the excitement is not competing over the technology at that level. The excitement is in the businesses and the applications that you built on top of the web platform.

Well said. I tried to make a similar point, but with pictures, back in February.

I recently ordered some podcasting equipment. It should arrive tomorrow. I will be looking for people to interview soon. So hide while you can, don’t answer the phone, and if it looks like I’m carrying a microphone, then run for the exit.

An interesting article in the American Surveyor, by Joel Leininger, on the importance of file format standards. Although it is a different application domain, the concerns are very similar (via OpenMalaysia).

Anyone know Romanian? Something gives me the impression that this guy from Microsoft Romania is not complimenting me. I wonder what subtle hint gives me that impression…

The OOXML ballot marches on in national standards committees around the world. September 2nd is the deadline, though many committees have earlier deadlines for developing their recommendations. In the US the committee looking at OOXML is called INCITS V1, and we have until July 13th. V1 has had a few meetings so far and we’re just starting to get into the technical comments. Since we have a consensus process, all it takes is a small minority of members to bring everything to a halt, which is pretty much what is happening. For example, we spent 2 1/2 hours today and discussed only two comments. So we risk having a perfunctory technical review of OOXML. When I compare this to the BSI’s excellent work developing detailed comments on a publicly-readable wiki, I think we in the US should be ashamed at the stonewalling going on in V1.

I’ll be hosting a V1 face-to-face meeting in a couple weeks in Washington, DC. Hopefully we’ll make some more substantial progress there. If you really want to follow our work closely, you can read through our mailing list archives which Sun’s Jon Bosak was kind enough to set up for us.

Although no formal call for public comments has gone out, we’ve received a number of unsolicited pro-OOXML letters which you can read here. As you can see, they are pretty much identical form letters, all ending with the artless phrase, “Furthermore, Open XML in no way contradicts any other international document standard.” Remind anyone of the Manchurian Candidate’s, “Raymond Shaw is the kindest, bravest, warmest, most wonderful human being I’ve ever known in my life”?

In any case, if you want to provide input into this process, feel free to send in your thoughts as well. Having read many of these letters myself, I’d offer the following advice:

  1. Don’t send in a form letter. It hurts your cause more than helps it, since it makes it look like you couldn’t get real support if you tried.
  2. Use your real name and email address and postal address, so we know you are a real person and not a robot.
  3. Be polite. Remember you are trying to persuade.
  4. Give a succinct, reasoned opinion. Keep it to a page if you can.
  5. Ask for a specific action. Don’t expect the reader to draw a conclusion. Draw it yourself.

Of course, since V1 is developing the US position on OOXML, comments from US companies and citizens are especially welcome. Also, if you have specific technical comments about OOXML, you can submit them through me and, if I agree with your points, I will raise them directly with the committee. (I do this as a personal favor to you, my readers, not as an official INCITS V1 solicitation.) Assume the committee is already familiar with the GrokLaw items. But OOXML is a big standard, and there are certainly dark corners where I have not ventured. So if you’ve found something new, certainly let me know.

Canada continues to solicit comments on OOXML. And the UK is soliciting comments as well, through June 30th. Again, be succinct, and give your name and address. Otherwise you risk having a committee member reject your comment outright since it cannot be ascertained whether you are actually a resident of that country.

A blog I’d like to recommend to my readers is Lodahl’s blog. Leif Lodahl has been giving some great coverage of ODF happenings in Denmark, including analysis of the parliamentary debate on the question of whether Denmark should have one or two standards. Also a good catch of Microsoft dancing all over the place, trying to avoid giving a straight answer on why Word does not provide integrated ODF capabilities. If you can spare 45 minutes this is a great clip to listen to.

Filed Under: ODF, OOXML, Standards


Copyright © 2006-2026 Rob Weir · Site Policies