Tuesday, October 07, 2008
Where's Rob?
[A] question being asked along the committee corridors by perplexed NB members is whether IBM has withdrawn its staff from participation SC 34. I have no idea, but IBM people are certainly conspicuous here by their total absence.
Well, I'm truly touched, and by way of reciprocation maybe I can help Alex and any other similarly perplexed attendees understand the situation better.
First, it will help if we start by taking a look at recent SC34 meetings and what the attendance record (publicly accessible) says:
| Date | Location | Total Attendance | Size of US Delegation | # of IBM/OASIS Participants | # of Microsoft/ECMA Participants |
|---|---|---|---|---|---|
| Nov 2004 Plenary | Washington DC | 25 | 6 | 0 | 0 |
| May 2005 Plenary | Amsterdam | 28 | 4 | 0 | 0 |
| Nov 2005 Plenary | Atlanta | 22 | 4 | 1 | 0 |
| May 2006 Plenary | Seoul | 30 | 4 | 2 | 2 |
| Mar 2007 Plenary | Oslo | 37 | 6 | 0 | 5 |
| Dec 2007 Plenary | Kyoto | 52 | 3 | 2 | 12 |
| Apr 2008 Plenary | Oslo | 37 | 3 | 1 | 8 |
| July 2008 Ad Hoc 1 | London | 20 | 1 | 1 | 10 |
| Oct 2008 Plenary | Jeju Island | 35 est. | 2 est. | 0 est. | 12 est. (estimates from Alex Brown, since no official attendance has been published) |
To put it in perspective, the US SC34 shadow committee currently has around 20 members. Before Microsoft stuffed it we had around 7. Regardless, the US SC34 mirror committee typically sends a delegation of 2 or 3 people to international meetings. IBM attendance at these meetings has varied from 0 to 2. It really depends on where the meeting is being held. If it is being hosted by an NB where an IBM employee is a member, then he will typically attend. If something is on the agenda that I find interesting, then I'll typically attend regardless of location.
Now what is really interesting is how Microsoft has increased its attendance over the years, something Alex does not mention and presumably does not find fault with. I remember introducing myself to the first Microsoft attendee at an SC34 Plenary back in 2006. He was an attorney, from Microsoft's anti anti-trust department. An odd person to send to a technical standards committee meeting, don't you think?
Since then, Microsoft's representation has swelled so it now comprises 20-50% of any given meeting. And that does not count those additional "independent" companies and contractors that are employed by Microsoft to create OOXML converters or to consult with on OOXML matters. I'm only counting those people who explicitly list "Microsoft" or "Ecma" as their corporate affiliations.
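For anyone who wants to check the 20-50% figure against the table, here is a small Python sketch. The numbers are copied from the table above (the Jeju Island row is Alex Brown's estimate, not an official record); the variable and function names are only illustrative.

```python
# (meeting, total attendance, Microsoft/Ecma participants), copied from the table.
# The last row uses estimates, since no official attendance list was published.
meetings = [
    ("Nov 2004 Washington DC", 25, 0),
    ("May 2005 Amsterdam", 28, 0),
    ("Nov 2005 Atlanta", 22, 0),
    ("May 2006 Seoul", 30, 2),
    ("Mar 2007 Oslo", 37, 5),
    ("Dec 2007 Kyoto", 52, 12),
    ("Apr 2008 Oslo", 37, 8),
    ("Jul 2008 London (ad hoc)", 20, 10),
    ("Oct 2008 Jeju Island (est.)", 35, 12),
]


def share(total, count):
    """Percentage of attendees listing Microsoft or Ecma as affiliation."""
    return 100.0 * count / total


shares = {name: share(total, ms) for name, total, ms in meetings}

for name, pct in shares.items():
    print(f"{name:30s} {pct:5.1f}%")
```

On these figures the four most recent meetings all come out at 20% or more, with the London ad hoc meeting at 50%.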
I think you'll find no other case in SC34 attendance records of a single company sending more than a single representative. Everyone else in the world sends one person. IBM once sent two people. Microsoft sends ten or a dozen.
Despite Microsoft's successful attempt to stuff SC34, as they did NB's around the world, participation from IBM remains in the range of 0 to 2 participants. I'd be hard pressed to justify the expense of any greater attendance. The real work on ODF goes on in OASIS. That's where we put our people, where they can be most effective on the technical topics related to ODF.
Alex, of course, misses all this. Sitting in a room full of non-technical Microsoft employees, the only unusual thing worth mentioning is my unclaimed badge. Good job as always, Inspector Clouseau!

In any case, the greatest concern should be given to that last row in the table, giving the attendance of the recent Jeju Island Plenary. Although the resolutions of this meeting have been posted and discussed, they lack any record of the actual attendance of this meeting. It has been the constant practice of JTC1/SC34, for many years, to record the attendance of their meetings and to post this document to the SC34 document repository and to make it publicly accessible. But in this case, the attendance record is missing entirely. It isn't even available to SC34 members.
What are they afraid to reveal? Exactly how many Microsoft employees were at this meeting? The trend certainly has been upward. But this information is not available. Is Alex, the Convenor of WG1, only going to publicize my absence, but then fail to report who actually attended his own WG meeting? Is Alex going to express pleasure in saying "In the event this went extremely smoothly: all resolutions passed with unanimous consensus" without mentioning who exactly was there to vote for these resolutions?
I hope this is not yet a further sign that JTC1/SC34 has taken a descent into vendor domination and reduced transparency.
Oh, and where was I? I was on vacation. (Yes, I am allowed vacation). I was in Colorado, spending some time above the timberline and among the rocks.

Sunday, August 17, 2008
Giving the Finger to the DIS 29500 Appellants
First, let's put this in perspective. We're talking about members of an organization, in this case four members of ISO/IEC JTC1, raising an appeal under the rules of that organization, alleging that the organization failed to follow its own rules. Almost every organization has a provision for dispute resolution, including the rights of members to appeal the decisions of elected officers or staff. This is a basic part of governance.
It is a worthwhile exercise to see how this "right to appeal" is handled by other SDO's. Let's take a few examples from other organizations that deal with tech standards.
First, let's look at OASIS, a consortium that creates XML standards, like ODF. Any three OASIS members may lodge an appeal if they believe that OASIS procedures have been violated. Resolution is first attempted via correspondence, but if that fails to satisfy the appellants, they then may request an in-person hearing at the next OASIS Board of Directors meeting, where they can present their complaint. This request cannot be denied. It is a right of the members.
INCITS, the US NB in JTC1, has a different approach to appeals, detailed in section 5.8 of their RD-2 [pdf] Procedures guide. Appeals in INCITS are based on the following principles:
- Appeals shall be addressed promptly and a decision made expeditiously.
- The right of the involved parties to present their cases shall not be denied.
- These procedures shall provide for participation by all parties concerned without imposing an undue burden on them.
- Consideration of appeals shall be fair and unbiased and shall fully address the concerns expressed.
- Records of appeals shall be kept and made available upon request. The INCITS Secretariat may levy a nominal charge to cover the cost of reproduction, handling and distribution for requests received from other than the involved parties.
Any INCITS member may lodge an appeal, and if an informal attempt at resolution with the INCITS Secretariat fails, an appeals panel is formed to hear the appeal. The impartiality and balance of the appeals panel is explicitly considered:
The appeals panel shall consist of three individuals who have not been directly involved in the matter in dispute. At least two members shall be acceptable to the appellant and at least two shall be acceptable to the INCITS Secretariat.
From large consortia, to NB's, let's poke around further and look at an industry group, AIIM, with a standards program focused on enterprise content management (ECM) technologies. Section 7.0 of their Policies and Procedures [pdf] manual defines their appeals process.
Persons who have directly and materially affected interests and who believe they have been or will be adversely affected by any procedural action or inaction by AIIM as a standards developer with regard to the development of a proposed American National Standard or the revision, reaffirmation, or withdrawal of an existing American National Standard, have the right to appeal.
The appeal is heard by a three-member panel, selected as in INCITS to be impartial and balanced:
The appeals panel shall consist of three members selected from the AIIM membership in addition to the Chairperson. The Chairperson of the panel shall be the Standards Board Chairperson, and shall not have a vote in the decision of the panel. The voting members of the panel shall not have been directly involved in the matter in dispute, and not be currently involved in the development of the standard(s) in question, and shall not represent or be an employee of an interest that can be made directly or materially affected by any decision made by or to be made in the dispute. The voting members of the appeals panel shall be agreed to by both the appellant and the respondent.
Perhaps readers can post other summaries of SDO appeals procedures, to give a broader sense of what the common features are. From what I can tell, the best practices are:
Members have a right to appeal decisions of the organization, and to have their appeal heard and considered, in person, by a panel chosen to be impartial and balanced. Although the appellants are not guaranteed that their views will prevail, the rules do not allow the organization to repress the appeal and not let it be heard.
So with that as background, it is interesting to observe how ISO/IEC JTC1's antiquated cold war era rules in effect serve to stifle criticism, repress dissent, and prevent even a hearing on the merits of an appeal. As I'll show, even with this strong organizational bias against appeals, the current DIS 29500 appeals were only dismissed with assistance from a poorly written ballot question, NB confusion resulting in contradictory votes, and an unwillingness of committee chairs to attempt to reach consensus. Organizational failures, in the end, are usually leadership failures.
For an appeal in JTC1 to be heard, two different committees, ISO/TMB and IEC/SMB must first agree to allow the appeal to be heard. The reader should note the increased difficulty of getting two different committees to agree on the same decision, and consider the following mathematical diversion.
---------
Suppose you are pushing for a proposal that has, on average, 50% support within a given organization. It is put to a vote in a subcommittee drawn randomly from that population. What is the probability that the proposal will pass a vote in that subcommittee?
50%. I think most of us have an intuitive sense of that.
But what if there are two subcommittees drawn randomly from that organization, and the proposal must win a vote in each one of the subcommittees, what is the chance it will pass?
Is it still 50%? No. I hope most of us have the intuitive sense that passing two committees is harder than passing a single committee. In fact, your chances of approval could be as low as 25% (50% times 50%) if the two committees make independent decisions, and somewhere between 25% and 50% if there are factors that cause their votes to be partially correlated.
But the general rule is: the more stages of approval required, the less your chances of success.
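For those who like to see the arithmetic, here is a minimal Python sketch. The `rho` knob is a toy model of my own for illustration, not anything taken from the Directives: with probability `rho` the second committee simply repeats the first committee's outcome, and otherwise it decides independently.

```python
def pass_both(p, rho=0.0):
    """Chance of passing two committees that each pass with probability p.

    rho is a hypothetical correlation knob: with probability rho the second
    committee copies the first committee's outcome, otherwise it decides
    independently. rho=0 is full independence, rho=1 is lockstep.
    """
    if not (0.0 <= p <= 1.0 and 0.0 <= rho <= 1.0):
        raise ValueError("p and rho must both lie in [0, 1]")
    return p * (rho + (1.0 - rho) * p)


def pass_all_stages(p, stages):
    """Chance of passing several independent stages, each with probability p."""
    return p ** stages


print(pass_both(0.5))            # independent committees
print(pass_both(0.5, rho=0.5))   # partially correlated
print(pass_both(0.5, rho=1.0))   # perfectly correlated
print(pass_all_stages(0.5, 3))   # three independent stages
```

Under this toy model the two-committee chance runs from 25% (independent) up to 50% (lockstep), and every extra independent stage halves it again.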
---------
In the particular case of the DIS 29500 appeals, the IEC/SMB requires a 2/3 super majority to approve a ballot. ISO/TMB presumably requires only simple majority. (Like most of JTC1 Directives, this is not explicitly defined).
The astute reader will note that the odds are against the appellants even getting their appeal considered by a panel. In fact, these committee odds match what is required to impeach a U.S. President (50% in House) and remove him from office (2/3 in the Senate).
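To get a rough feel for what those two thresholds do to the odds, here is a hedged sketch. It assumes, purely for illustration, that each committee member votes yes independently with probability 0.5, that one committee has 12 voting members needing a simple majority, and that the other has 15 members needing a 2/3 supermajority. Real committees do not vote by coin flip, so treat the output as a back-of-the-envelope figure only.

```python
from math import comb


def prob_at_least(members, needed, p=0.5):
    """Binomial tail: chance that at least `needed` of `members` vote yes,
    assuming each votes yes independently with probability p."""
    return sum(
        comb(members, k) * p**k * (1.0 - p) ** (members - k)
        for k in range(needed, members + 1)
    )


def simple_majority(members):
    """Smallest yes count strictly greater than half (a tie does not pass)."""
    return members // 2 + 1


def two_thirds(members):
    """Smallest yes count that is at least 2/3 of the members."""
    return (2 * members + 2) // 3


# Illustrative committee sizes (assumptions, see the lead-in above).
p_majority = prob_at_least(12, simple_majority(12))
p_supermajority = prob_at_least(15, two_thirds(15))

print(f"simple majority of 12:  {p_majority:.3f}")
print(f"2/3 supermajority of 15: {p_supermajority:.3f}")
print(f"both, if independent:    {p_majority * p_supermajority:.3f}")
```

Under these coin-flip assumptions the simple majority succeeds a bit under 40% of the time, the supermajority about 15% of the time, and both together only about 6% of the time.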
Given these odds, how did the appellants fare? In ISO/TMB the first irony comes with the title of the ballot:
Remember, a core matter of the appeals is the mistreatment of the contradiction phase of the DIS process, and one of the core matters of the contradiction arguments raised was that the official name of the DIS, "Office Open XML" bore a close and confusing resemblance to the submitter's main competitor, Open Office, and that this would lead to confusion in the name of the standard. Well here we are, and in denying this appeal ISO/TMB commits that same error, giving the incorrect name of the standard!
Looking at the actual results, for each of the 4 appeals, ISO/TMB tied on two of them, 6-6, and voted not to pursue two others by 7-5 and 8-3 votes. So, it was very close. In fact I am a bit surprised they simply dropped further consideration of the appeals when they had tied votes like that.
A basic rule that applies to voting is JTC1 Directives, 9.1.3:
The Chairman has no vote and questions on which the vote is equally divided shall be subject to further discussion.
In this case, the Chair (Denmark) did not vote. That is correct. But why did the procedure end with two appeals showing a 6-6 "equally divided" vote? According to the rules this should be leading to further discussion. An equally divided vote is as far from consensus as one can get. Is this how they want to leave it, just hanging like this?
In IEC/SMB, the voting results are even more bizarre. I don't know quite what to make of them. So just a few quick observations.
First, a motion should be carefully worded so it is clear what will happen if the motion passes, and what will happen if the motion fails. A Chair should insist on this, and indeed that is one of their primary duties as Chair, to ensure that questions put to their committee are clear. However, in the case of the DIS 29500 appeals, the ballot questions, as dictated by the Secretaries General, were muddled. I remarked on this in a previous blog post, and other readers observed this as well. Whether done by malice or incompetence, the ballot questions were destined to cause confusion.
The reported results indeed were muddled, as you can see here:

Of the 15 SMB members only two (China and the Netherlands) followed the explicit instructions and voted either the questions in Part A or in Part B (but not both). Both China and the Netherlands voted in one part, and abstained in the other part.
Most members voted in both sections, while expressing a consistent intent, e.g., voting No on not processing Brazil's appeal further, and Yes on processing Brazil's appeal further.
However, it appears that three other NB's gave inconsistent, contradictory votes. In fact one NB (Canada) gave exactly the same votes on section B as in section A, essentially canceling out their vote on every single question. This was from an NB whose written comments stated that they strongly supported hearing the appeals further. Similarly, the votes from Korea are partially contradictory.
We can attempt to reconstruct what a less-confusing ballot would have yielded. For example, take the questions in part A, whether "not to process the appeal any further", where the recorded results were 8-4-3 for the Brazilian appeal, yielding a 2/3 super majority (ignoring the 3 abstentions). But note then that two of the three abstaining NB's (China and the Netherlands) in fact voted in the affirmative for the question on whether to process Brazil's appeal in part B. It looks like the only reason why they abstained in Part A is that they actually followed the ballot instructions and cast a vote in Section A or Section B. If we apply their clear intent consistently to the Part A question, then the results become 8-6-1 and the motion to "not process the appeal any further" would have failed for lack of a 2/3 majority.
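The recount is simple enough to check by hand, but here it is as a short Python sketch. It assumes, as the recorded result implies, that abstentions are excluded from the count and that exactly 2/3 of the votes cast is enough to carry the motion; the function name is just illustrative.

```python
from fractions import Fraction


def carries(yes, no, threshold=Fraction(2, 3)):
    """True if yes votes are at least `threshold` of the votes cast.

    Abstentions are not passed in at all, which is the same as ignoring them.
    Fractions avoid any floating-point fuzz right at the 2/3 boundary.
    """
    cast = yes + no
    return cast > 0 and Fraction(yes, cast) >= threshold


# Part A, Brazilian appeal, as recorded: 8 yes, 4 no, 3 abstentions.
recorded = carries(8, 4)

# Same question with China and the Netherlands counted as No rather than
# abstaining, matching the intent they expressed in Part B: 8 yes, 6 no.
adjusted = carries(8, 6)

print("recorded 8-4-3 carries:", recorded)
print("adjusted 8-6-1 carries:", adjusted)
```

As recorded, 8 of 12 votes cast is exactly 2/3, so the motion squeaks through. With the two abstentions counted as No, 8 of 14 is about 57% and it falls short.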
I cannot make sense of Korea's votes. Although they seem to have supported two appeals, while not supporting two other appeals, their inconsistent votes make it impossible to tell which ones they supported and which ones they did not.
Needless to say, a ballot that yields results where it is impossible to tell what the voters wanted is a hallmark of a seriously flawed, useless ballot. The SMB results are tainted by a poorly written ballot question, given to them by the Secretaries General, which has clearly caused confusion among the SMB voters, and which had a material effect on the results. My analysis of IEC/SMB shows that, like ISO/TMB's vote, the results are nearly equally divided, and IEC/SMB should hang their head in shame if they persist in denying a hearing to these four appeals because of ambiguous results from a poorly written, botched ballot.
This is why ballot results should be released publicly and subject to scrutiny. I do not believe we can trust ISO/IEC to perform quality control on their own processes. The rot is too deep.
Tuesday, July 22, 2008
Sed quis custodiet ipsos custodes?

We are coming down to the last week for JTC1 to decide on whether to hear the four NB appeals concerning various claimed errors in the processing of DIS 29500 (OOXML), or whether summarily to dismiss these appeals without hearing them. The decision lies with two committees, the Technical Management Board (TMB) in ISO and the Standards Management Board (SMB) in the IEC.
Back on July 4th, the Secretaries General of ISO and the IEC referred the four NB appeals, with their comments, to the TMB/SMB. Groklaw has the text of these comments, in PDF format, as well as an HTML transcription.
The comments of the Secretaries General are accompanied by a ballot, asking the question:
ACTION
The members of the Technical Management Board are invited to indicate, by replying yes, no or
abstention on EITHER a) OR b) for each of the four appeals (see item 14 in annex A):
a) not to process the appeal any further:
Item 1 ABNT
Item 2 BIS
Item 3 FONDONORMA
Item 4 SABS
OR
b) to process one or more of the appeals, which would require setting up of a conciliation panel
Item 5 ABNT
Item 6 BIS
Item 7 FONDONORMA
Item 8 SABS
by no later than 4 August 2008.
This is quite a strange animal to see. Why are we having a ballot at all, and only a 30-day one? This is questionable from several perspectives.
First, why are the Secretaries General the ones calling for a ballot? The Directives do not call for them to do so. In fact the Secretaries General are not even called upon to make a recommendation. They are only asked for comments. The Directives say:
11.3.3 The Secretaries-General shall, following whatever consultations they deem appropriate, refer the appeal together with their comments to the TMB/SMB within one month after receipt of the appeal.
11.3.4 The TMB/SMB shall decide whether an appeal shall be further processed or not. If the decision is in favour of proceeding, the Chairmen of the TMB/SMB shall form a conciliation panel (see 9.2).
But deciding is not the same as voting. One of the cardinal principles of JTC1 is to discuss and seek consensus, not rush to a vote. Indeed, this is one of the matters under appeal, the rush to voting at the OOXML BRM. JTC1 Directives, section 1.2 says (my emphasis):
These Directives are inspired by the principle that the objective in the development of International Standards should be the achievement of consensus between those concerned rather than a decision based on counting votes.
But here we are, with a vote pushed on the TMB/SMB.
The sense of the vote is wrong as well. The Directives call for a decision on "whether an appeal shall be further processed or not." Note the wording. It did not call for a decision on "whether to accept the recommendation of the Secretaries General". But somehow, we skip discussion, skip over consensus and get a ballot question which asks the opposite question first "not to process the appeal any further". In an environment where many parties automatically vote Yes to the ballot question, changing the sense of the question in this way is prejudicial to the appellants.
So it is clear from the start that the powers that be do not want to give these four NB's the opportunity to make their case or be heard. In any case, let's take a deeper look at some of the subjects under appeal and see if we can detect what it is exactly that cannot bear the scrutiny of a duly processed appeal.
First up is the alleged mishandling of the contradiction period last year. The Secretaries General dismiss this complaint, saying that it was a matter of judgment:
The Directives give the JTC 1 Secretariat and ITTF latitude to use judgement as to whether a meeting should be organized to address alleged contradictions. Considering that other issues could potentially be identified during the DIS ballot, the JTC 1 secretariat and ITTF concluded that it was preferable to initiate the ballot and to allow all issues to be addressed by the BRM. The NBs were fully informed of all the claimed contradictions and Ecma's responses to them.
This argument doesn't hold water. Although the JTC1 Secretariat and ITTF are allowed judgment, this is not an absolute license which cannot be questioned. The Secretariat and ITTF also have defined duties, and their actions or inactions with respect to these duties can be questioned and are subject to appeal. Specifically, an NB may appeal the issue of an inaction of JTC1, according to JTC1 Directives, 11.3. So for the Secretaries General to suggest that this inaction cannot be appealed because it is a matter of judgment is nonsense. Judgment and duty are the proper matters for an appeal.
So what is the duty in this case? As stated in JTC1 Directives, 13.4:
If a contradiction is alleged, the JTC 1 Secretariat and ITTF shall make a best effort to resolve the matter in no more than a three month period, consulting with the proposer of the fast-track document, the NB(s) raising the claim of contradiction and others, as they deem necessary. A meeting of these parties, open to all NBs, may be convened by the JTC 1 Secretariat, if required.
If the resolution requires a change to the document submitted for fast-track processing, the initial document submitted will be considered withdrawn. The proposer may submit a revised document, to be processed as a new proposal.
If the resolution results in no change to the document or if a resolution cannot be reached, the five month fast-track ballot commences immediately after such a determination is made.
The Directives call for the JTC1 Secretariat to make a best effort to resolve the matter (JTC1 Directives, 13.4). The JTC1 Secretariat is not given latitude to do nothing, or allowed discretion to immediately defer this question to the ballot period, without making a best effort to resolve the matter.
When a new 6,000 page DIS is submitted to JTC1 only one month after the publication of another standard (ODF) in the exact same space (XML document formats for office applications) and 19 NB's submit contradiction statements, and the JTC1 Secretariat's "best effort" is to hold no consultations with the NB's claiming contradictions, to hold no meeting, to make no attempt to resolve the question, then I believe that any NB would have legitimate grounds for appeal on the inaction of JTC1 with regards to contradictions. There is no evidence that a "best effort" was made here to resolve the contradictions. Doing nothing is clearly incompatible with the required “best effort”.
It should be noted that JTC1 has had challenges in the past getting ITTF to carry out their responsibilities with respect to contradictions, which led to this resolution adopted unanimously at the 2000 JTC1 Plenary:
Resolution 27 - Consistency of JTC 1 Products
JTC 1 stresses the strong need for consistency of its products (ISs and TRs) irrespective of the route through which they were developed. Any inconsistency will confuse users of JTC 1 standards and, hence, jeopardize JTC 1's reputation. Therefore, referring to clauses 13.2 (Fast Track) and 18.4.3.2 (PAS) of its Directives, JTC 1 reminds ITTF of its obligation to ascertain that a proposed DIS contains no evident contradiction with other ISO/IEC standards. JTC 1 offers any help to ITTF in such undertaking. However, should an inconsistency be detected at any point in the ratification process, JTC 1 together with ITTF will take immediate action to cure the problem.
Perhaps it is time to give ITTF another reminder of their obligations in this regard?
Further, the determination claimed to have been made by the JTC1 Secretariat and ITTF was not communicated to JTC1 NB's. Instead, the JTC1 Secretariat merely forwarded Ecma's responses to the contradiction submissions along with a notification that the DIS ballot should then commence. No statement was made as to whether the ballot was commencing because the contradictions had in fact been resolved, or because a resolution could not be made, which are the only two outcomes allowed by the Directives in 13.4. Not to notify NB's of the actual state of the resolution of the contradictions submissions is incompatible with the JTC1 Secretariat's duty to make a best effort to resolve the matter.
This failure by JTC1 materially affected the ensuing ballot, since Microsoft was then able to take advantage of this procedural nonperformance and repeatedly represent to NB's that the contradictions had been rejected as invalid and could not be considered in the DIS ballot. In fact, this led to several NB's issuing explicit, but erroneous instructions to their members that the contradictions had been resolved and thus could not be raised again as a criterion for determining their national position, e.g., in Australia.
Further, although the Secretaries General claim that “the JTC 1 secretariat and ITTF concluded that it was preferable to initiate the ballot and to allow all issues to be addressed by the BRM” the documented fact is that the BRM Convenor explicitly disallowed any discussion of contradictions at the BRM.
Another subject of appeal was the irregular voting procedures used at the DIS 29500 BRM in February. This is the P-member versus O-member question. The Secretaries General dismiss this appeal in this way:
2e. Correct but inapplicable. The BRM was neither a meeting of JTC 1 nor of SC 34 but was open to all 87 national bodies which submitted a vote (including abstentions) on the DIS. Applying 9.1.4 would have disenfranchised the voting NBs present at the BRM which were not P-members. The fact that any votes in the BRM would be open to all national delegations present was communicated over three months prior to the BRM.
The argument presented is flawed, and amounts to saying, “The voting was done by P- and O-members because the meeting was attended by delegations from P- and O-members”. Who attended the meeting is immaterial. Liaisons such as Ecma also attended the BRM. Should they have been able to vote merely because they attended? No, of course not. Voting rights are defined in JTC1 Directives, and this must not be set aside in favor of an ad-hoc rule made without NB consultation or approval.
Asserting that applying 9.1.4 would disenfranchise NB's is an example of circular reasoning. One can only be disenfranchised if one first has the right to vote. So the statement by the Secretaries General is arguing a conclusion (O-members are permitted to vote at BRM's) by assuming the very thing it tries to prove.
JTC1 Directives 14.4.3.9, which defines the parallel BRM process for the Publicly Available Specification (PAS) transposition process, reads: “At the ballot resolution group meeting, decisions should be reached preferably by consensus. If a vote is unavoidable, the approval criteria in the subclause 9.1.4 is applied.” So here we see 9.1.4 explicitly called for. By the argument put forth by the Secretaries General, all PAS BRM's which follow the Directives are also flawed because they “disenfranchise” those NB's who are not P-members of JTC1. I believe this is a tortured reading of the Directives. The voting rules of 9.1.4 are explicitly and unambiguously called for in PAS BRM's, so one cannot dismiss their application to Fast Track on general principles that would apply equally to PAS. When Fast Track rules say that the BRM vote shall ("if a vote is unavoidable") "be taken according to normal JTC 1 procedures" then we are faced with two alternatives:
- Use the voting rules of 9.1.4, which declares itself to be the normal voting procedures ("In a meeting, except as otherwise specified in these directives, questions are decided by a majority of the votes cast at the meeting by P-members expressing either approval or disapproval.")
- Or use a voting rule which is not to be found anywhere within the Directives.
Finally, neither the BRM Convenor, Alex Brown, nor ITTF, nor indeed the assembled delegations at the BRM were competent or had the mandate to make or change voting rules for a DIS BRM. The rules are set in JTC1 Directives, and must be followed. “These Directives shall be complied with in all respects and no deviations can be made without the consent of the Secretaries-General.” (JTC1 Directives 1.2).
Notifications made by the BRM Convenor in advance of the BRM have no weight on matters which exceed his mandate and authority. The communication referred to by the Secretaries General, which was given in advance by the BRM Convenor, was from this FAQ:
6.8 If votes are taken during the BRM, who votes?
Those present.
This in fact was not the rule applied at the BRM. For example, Liaison representatives could not vote, though they were undoubtedly present at the BRM and participated fully in other ways. Also individual participants could not vote; only delegations, via their HoD, could vote. So the Convenor's glib communication should not be taken as notification of a novel voting procedure.
Additionally, the BRM Convenor was unambiguous in his communications on his blog where he clearly stated that the voting rules of 9.1.4 would be applied:
...Now, paper balloting follows normal JTC 1 in-meeting rules: In a meeting, except as otherwise specified in these directives, questions are decided by a majority of the votes cast at the meeting by P-members expressing either approval or disapproval. (9.1.4)
(After the BRM the Convenor dutifully went back and “corrected” his earlier blog post to reflect how the BRM actually operated.)
The Secretaries General further dismiss the concerns regarding BRM voting procedure, saying:
4e. Not correct. Decisions on the comments not discussed during the BRM and proposed dispositions were taken by a process agreed by the BRM itself (29 votes in favour, none against and 2 abstentions).
On the contrary, the BRM was not competent and had not the mandate to set its own voting rules or to negate the provisions for consensus stated in JTC1 Directives 1.2:
These Directives are inspired by the principle that the objective in the development of International Standards should be the achievement of consensus between those concerned rather than a decision based on counting votes.
[Note: Consensus is defined as general agreement, characterised by the absence of sustained opposition to substantial issues by any important part of the concerned interests and by a process that involves seeking to take into account the views of all parties concerned and to reconcile any conflicting arguments. Consensus need not imply unanimity. (ISO/IEC Guide 2:1996)]
The Directives specify the rules. If NB's do not like the rules, then NB's may work with SWG-Directives to define new rules and then vote on them using the defined process. But if the rules are not applied correctly, then the proper course is for NB's to appeal against the actions or inactions of those with a duty to carry out the rules. This is the essential governance model of JTC1. NB's rule, but they rule through the rules. We may not merely decide by majority vote to ignore rules for this DIS, or to institute new rules for that DIS, or to substitute different rules for another DIS, in an ad-hoc fashion, based on a BRM vote.
Using the logic given by the Secretaries General, what in principle would prevent a BRM from voting itself an Augur in addition to a Convenor for the purpose of observing the flights of birds to decide whether a given change to the DIS text was auspicious or not? Is there any voting procedure that would not be permitted them once we say that a BRM, by majority vote, can institute their own voting rules? Are TMB/SMB certain that this is the principle that they want affirmed by their rejection of the NB's appeals?
Further, NB's were not duly notified that their BRM delegations would be determining their own voting rules, so few if any of them had NB instructions on that matter. An agreement among BRM HoD's to set aside cardinal principles of JTC1, in the absence of NB consultations, should not be allowed to stand.
Finally, the existence of a vote at the BRM is not incompatible with the assertion that the BRM was “too short, arbitrarily short, or otherwise incorrectly conducted”. When given the choice between several bad alternatives, the delegations made a choice. That does not legitimatize the flawed application of JTC1 process that incorrectly gave them only bad choices and forced upon them a vote which they did not have the mandate to hold.
I could go on and on, but I'll spare you all more of the same. I am sorry to report that I find the response by the Secretaries General to be perfunctory, poorly reasoned and self-serving. It does not serve to resolve the issues, including important issues where clarification is needed. Majority rule, within the rules, should be encouraged. But to dismiss legitimate complaints by arguing that the majority agreed to not follow the rules, this is to substitute mob rule (or orchestrated monopoly rule) for the rule of law. We know where that leads to -- curtailed rights for those with minority opinions. And that should concern everyone.
The Secretary General of ISO, Alan Bryden, retires at the end of the year. August vacation is approaching, and before you know it there will be a retirement party with the cake and gifts, maybe a wall plaque or pewter paperweight. I am sure he does not need or desire to spend more time being reminded of the OOXML disaster that occurred during his last year at ISO. TMB/SMB members all want vacation as well. So do I. But out of respect for Mr. Bryden's eventual successor, and our shared mission in JTC1, shouldn't we urge TMB/SMB to do their job and not leave this all unresolved for the next guy to deal with? Dismissing an appeal with so many open unresolved issues is not expediency. It is merely creating more dissent, more distrust and more trouble that we'll all need to deal with next time around. It is better, I think, to hear the appeals, get to the bottom of this, seek resolution, consensus and closure, and then to move on. Ignoring mistakes will not make them go away.
Labels: OOXML
Thursday, July 17, 2008
What is Rick smoking?
If you like unintentional humor, you will enjoy reading Rick's over-the-top post.
Rick suggests that organizationally JTC1/SC34 is a more participatory environment for developing standards than OASIS.
JTC1's process, based on National Body voting is both effective ... and more genuinely open, because it is impossible to stack either directly or indirecty.
Let's test that proposition. Let's compare OASIS and JTC1/SC34.
Who can participate? In OASIS, anyone can participate, from any company, organization, government agency, non-profit corporation in the world. Or you can join as an unaffiliated individual, as many have. You don't need your government's permission to join. You just do it. Most join with a nominal membership fee ($300 for individuals) but membership grants are available in some cases, when the fee would be a burden for active individual contributors.
What about participation in JTC1/SC34? First, you must be a member of your NB. How do you become a member of your NB? In the US the price is $1,200 and you must be representing a company or organization. Individuals? Sorry, you are not allowed to participate. In other countries the rules vary. In some cases membership is not available at all at any price. You are essentially wait-listed until an opening becomes available. (Sorry, we don't have enough seats, we heard in Portugal). In some countries, like China, membership is forbidden to native citizens who are employees of foreign subsidiaries in China. In other countries you can't join at all. It is entirely a government decision. So, good luck joining the NB of Syria, where the constitution has been suspended under emergency rule since 1963. (But somehow they managed to make time to vote on the OOXML ballot. Zimbabwe as well, that paragon of open participation.)
Now, it is entirely possible for a standards organization to appear open, but in practice to be inaccessible. So we must look at the complete cost of participation, not just the initial membership fees.
The OASIS ODF TC does its work entirely on an email list, a wiki, and via weekly phone calls, which are toll-free calls for most participants. I don't recall there ever being a face-to-face meeting, certainly not so long as I've been a member. This use of technology lowers the barrier to participation, so anyone can be effective on the TC if they wish. In particular it makes it easier for those who have day jobs and can only contribute to the mailing list during non-work hours.
What about JTC1/SC34? To participate effectively requires attendance at several international meetings each year (Plenary's, WG's, Ad-hocs, BRM's, etc.), as well as participation at NB meetings. Since many of the participants are representatives of large corporations or government agencies, a junket mentality prevails and the meetings are often held in some of the most expensive places in the world: Geneva, Granada, London, Kyoto, Jeju Island, etc.
JTC1 does not allow meeting participation by telephone. Since important votes are held at these meetings, and no provision is made for remote participation, one cannot effectively participate in JTC1/SC34 without a substantial budget for international travel. Attendance at a single meeting, the DIS 29500 BRM, cost me $3687.52, and I flew coach and ate cheap. How many standards meetings like that can you as an individual or your small company afford per year?
Further, note the nature of your membership — what can you actually do? Can you vote? In OASIS, it is one person/one vote. In the TC, your vote as an individual with a $300 membership fee is counted exactly the same as my vote representing an OASIS Foundational Sponsor. At the organizational level, it is one company/one vote, and the smallest OASIS member organization has exactly the same vote as the largest.
In JTC1/SC34 however, you typically can't vote at all. NB's vote, not individuals, not companies. So your opinion and your wishes are subject to the will of your NB. If your opinion varies from your NB's, you may not be accredited to attend an international meeting, and even if you are able to attend you may not be allowed to speak your opinions. This extra level of indirection and censorship means that you, as an individual, can do little. And to the extent your NB's committee is stacked by a single vendor and their partner community, or your NB decides to overrule or ignore its technical committee, or Microsoft calls your head of state to change the NB's vote, or any of the dozens of other documented shenanigans that recently occurred, your entire membership fee and participation will be an entire waste of time, money and effort.
Membership in OASIS is far more open and inclusive. You join. You discuss. You vote. Period. In JTC1/SC34, you are mired in layers of bureaucracy at the national and international level, in a system crafted by and for the big boys to cut back room deals and manipulate the process to the benefit of large corporations.
(Now that isn't to say that there are not some individual consultants out there who thrive in the JTC1 environment by mastering its dark, dusty, demon-haunted hallways. Even the largest corporations occasionally have need of this expertise, as Rick and others are quite aware. If JTC1/SC34 were truly open and transparent, such skills would not be needed. You certainly don't see anyone selling their services to help companies navigate OASIS, do you?)
What about transparency? As Rick demonstrates, OASIS meeting minutes and agenda are all posted and public. So is our mailing list. So are all of our drafts. So are our member and public comments.
But in JTC1/SC34, most of the documents are private, only accessible to SC34 members by password. And then occasionally JTC1 will step in to prevent SC34 from releasing their own work, suppressing documents even from their own SC members. There are no public comments to speak of, and member comments on draft standards are secret.
So when you are back from your "trip", Rick, please let us know again, who wins on openness, participation and transparency?
And for the record, a couple of outright deceptions in Rick's post:
- Rick says that there are 80 NB's, and thousands of people participating in JTC1, but only 13 people participating on the ODF TC. This is a particularly inept comparison. Why is he comparing all of JTC1 to a single OASIS TC? If you look at OASIS overall, you will see that OASIS has more than 5,000 participants, representing over 600 organizations and individual members in 100 countries. The ODF TC itself has 53 members, including 7 members of JTC1/SC34.
- Rick picks a "random" ODF TC minutes post from a year ago to attempt to suggest domination by a single company. Not so random a choice, methinks. It was a rare joint meeting of the ODF TC and the Metadata subcommittee, which brought in a far greater number of Sun employees than typically participate in a call.
Wednesday, July 16, 2008
Toy Soldiers
One example is the proposal in SC34 to create a new project to create a Technical Report on translating between ODF 1.0 and OOXML 1.0. This might have made sense at some point in the past. But this proposal seems out of place now.
Consider:
- Few applications today support exclusively ODF 1.0 and only ODF 1.0. Most of the major vendors also support ODF 1.1, and one (OpenOffice 3.x) now supports draft ODF 1.2 as well.
- No one supports OOXML 1.0 today, not even Microsoft.
- No one supports interoperability via translation, not Sun in their Plugin, not Novell in their OOXML support, and not Microsoft in their announced ODF support in Office 2007 SP2.
Excuse me if my enthusiasm is muted.
And yes, the proposers want accelerated processing for this proposal. But the idea was already obsolete the day it was proposed to SC34. Events have overtaken it, though the clockwork motions continue, and SC34 is currently having a ballot for this proposal, ending on 29 July. I'm not in favor of it. Perhaps it would be worth considering if resubmitted in one year's time, and was targeted to consider ODF 1.2 and OOXML 1.1 (or whatever their next version is). But is it really a priority for SC34 now?
Another example of working on autopilot is the ad-hoc working group in SC34 looking at OOXML maintenance. Although it was heralded with much pomp "SC takes control of OOXML", the fact is SC34 currently can't even look at OOXML, let alone maintain it. They are entirely impotent. But still they will go through the motions and meet next week in London to advise Alex Brown, who will then take all this advice and later formulate and write up his OOXML maintenance plan for SC34 to vote on.
All the best to them. They voted on OOXML without seeing it. Now they'll determine how to maintain it without seeing it. Maybe ISO should stand for Invisible Standards Organization? Maybe one of the participants can let me know where I can submit my invisible defect report?
In any case, since Microsoft has effective voting control of SC34, after almost two years of packing the committee, my bet is that OOXML will effectively be handed over to Ecma for maintenance. That is what JTC1 has done for every other Ecma Fast Track that has been approved. They might call it a "maintenance group" and allow token participation from SC34 liaisons in a non-voting capacity, but in all important ways it will remain a Microsoft/Ecma standard. In the end, this makes some sense. Who is better positioned to clarify exactly how Excel financial functions work, the Microsoft engineer who has access to the Excel source code, or an SC34 representative from Kazakhstan?
Given the leisure to do the job right, my bet is on Microsoft. Everyone knows it for what it is now. There is no longer need for elaborate attempts to disguise the fact that OOXML is and will remain a Microsoft-only standard. Why continue the charade? If Microsoft put OOXML on MSDN, at least we would all have access to it and would know where to send our defect reports to, which is more than we can say about ISO OOXML. A real open standard is preferred, of course. But given a choice of a fake ISO standard and a real MSDN specification, I'll take the real MSDN specification any day.
Labels: OOXML
Monday, May 19, 2008
Fractured YEARFRAC and Discounted DISC
So it was with much lament last year that I reported that OOXML, then under ballot in ISO/IEC JTC1, had many egregious errors in its spreadsheet formula definitions. In addition to enumerating these errors in my blog, I submitted them for consideration by INCITS V1, the US SC34 mirror committee, and these became part of the bundle of comments with which the US accompanied its ballot.
Although the ballot wasn't due until September 2nd, the extravagant "sit and do nothing" provisions in INCITS led to our technical review being cut off in July. Because of the lack of review time, I was not able to do an in-depth review of all the spreadsheet functions, but only a cursory review. But the existence of such errors as I did identify, in the already-approved Ecma Standard, was disquieting. It should have led Ecma TC45 to conduct a more thorough review of the spreadsheet functions. But this did not appear to happen.
When we received the response from Ecma to the NB comments, in January, INCITS V1 was asked to go through all the responses (over 1,000 of them) to determine whether they were acceptable. This review period was again insufficient.
As for the BRM in February, this was a travesty, as I and others have noted.
I have a theory concerning committees. A committee may have different states, like water has gas, liquid or solid phases, depending on temperature and pressure. The same committee, depending on external circumstances of time and pressure, will enter well-defined states that determine its effectiveness. If a committee works in a deliberate mode, where issues are freely discussed, objections heard, and consensus is sought, then the committee will make slow progress, but the decisions of the committee will collectively be smarter than its smartest member. However, if a committee refuses to deliberate and instead merely votes on things without discussion, then it will be as dumb as its dumbest members. Voting dulls the edge of expertise. But discussion among experts socializes that expertise. This should be obvious. If you put a bunch of smart people in a room and don't let them think or talk, then don't expect smart things to happen as if the mere exhalation of their breath brings forth improvements to the standard.
So the BRM ended a different committee than it started, and the mode of operation it was led into caused it to act like a very stupid committee indeed. I don't say this to be accusatory. I'm just making an observation about crowd behavior. When a committee of experts ceases to be a deliberate committee, then you will achieve subpar results.
One of the ways the BRM was stupid is that it approved changes to OOXML that have totally broken SpreadsheetML's financial calculations, rendering the resulting calculations both mathematically incorrect as well as inconsistent with what Excel actually calculates. More about this later.
Here be dragons.
One of the persistent problems with OOXML has been in the area of "day count conventions" as used in SpreadsheetML.
Why should day counting be complicated? Just count how many days between the two dates and you are done, right? Indeed, everywhere but in finance it is simple. Some of the complications are for historical reasons, to try to make hand calculations easier in the pre-computer era. Also, these conventions made the calendar more regular, so financial instruments were less distorted by calendar irregularities like leap years or variable length months. If you assume a year is exactly 360 days in 12 months with 30 days each, then some things in life are simpler. Of course, this makes other things in life more complicated, including defining spreadsheet functions.
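To make the "360 days in 12 months of 30 days" idea concrete, here is a minimal sketch comparing an actual day count with a deliberately naive 30/360 count. The function names are hypothetical, and the naive version omits the end-of-month adjustments that distinguish the real US and European 30/360 conventions, so it illustrates the idea rather than any particular rule.

```python
from datetime import date

def actual_days(start, end):
    """Calendar days between two dates."""
    return (end - start).days

def naive_30_360_days(start, end):
    """Day count that treats every month as 30 days and every year as 360.

    Assumption: no end-of-month adjustments are applied, which is where
    the real 30/360 variants differ from one another.
    """
    return (360 * (end.year - start.year)
            + 30 * (end.month - start.month)
            + (end.day - start.day))

start, end = date(2007, 1, 25), date(2007, 6, 15)
print(actual_days(start, end), naive_30_360_days(start, end))
```

Even in this simple case the two counts disagree by a day, which is the kind of difference the conventions exist to pin down.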
The devil is in the details. You must get these count conventions right. Being wrong by just one day in a year may only be 0.3%, but in a million dollar transaction, that is $3000. Not many people can afford to routinely ignore a $3000 error.
Different financial organizations have developed their own different day count conventions. We have the Banker's Rule, the LIBOR rule, the NASD rule, the ISDA rule, ICMA rule, SIFMA rule, etc.
In Excel, these conventions are defined by a function parameter called the "basis" and this basis is used by many financial spreadsheet functions. In fact, none of the following functions are adequately defined unless the day count basis values are adequately defined:
- ACCRINT()
- ACCRINTM()
- AMORDEGRC()
- AMORLINC()
- COUPDAYBS()
- COUPDAYS()
- COUPDAYSNC()
- COUPNCD()
- COUPNUM()
- COUPPCD()
- DISC()
- DURATION()
- INTRATE()
- MDURATION()
- ODDFPRICE()
- ODDFYIELD()
- ODDLPRICE()
- ODDLYIELD()
- PRICE()
- PRICEDISC()
- PRICEMAT()
- RECEIVED()
- YEARFRAC()
- YIELD()
- YIELDDISC()
- YIELDMAT()
There are five basis conventions defined in OOXML, with values 0-4:
- Basis 0 = US 30/360
- Basis 1 = Actual/Actual
- Basis 2 = Actual/360
- Basis 3 = Actual/365
- Basis 4 = European 30/360
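For anyone implementing these functions, the five values map naturally onto an enumeration. The sketch below is one way to write that down; the class and member names are hypothetical, and the year lengths in the lookup table are only the denominators implied by the convention names, not text from the standard. Basis 1 is left out of the table on purpose, since its year length is exactly what is in dispute.

```python
from enum import IntEnum

class DayCountBasis(IntEnum):
    """The five 'basis' values listed above (member names are illustrative)."""
    US_30_360 = 0
    ACTUAL_ACTUAL = 1
    ACTUAL_360 = 2
    ACTUAL_365 = 3
    EUROPEAN_30_360 = 4

# Assumption: the denominator suggested by each convention's name.
# Actual/Actual has no fixed year length, so it is not listed here.
NOMINAL_YEAR_DAYS = {
    DayCountBasis.US_30_360: 360,
    DayCountBasis.ACTUAL_360: 360,
    DayCountBasis.ACTUAL_365: 365,
    DayCountBasis.EUROPEAN_30_360: 360,
}

print(DayCountBasis(3).name, NOMINAL_YEAR_DAYS[DayCountBasis(3)])
```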
As I reported last July, the definitions provided by OOXML did not sufficiently define the behavior of the conventions that lie behind most of the financial functions in Excel. Unfortunately, Microsoft/Ecma have failed to fix this problem in their proposed Resolutions, and in fact have made it worse. Further, the DIS 29500 BRM, in its negligent bulk approval of Ecma's responses merely advanced these serious errors into the text which was then approved as an International Standard. So essentially, my work in pointing out errors in the spreadsheet language was for naught. Microsoft just shoved it all through anyways.
This puts me in a delicate situation. On the one hand, the ODF TC really would like to finish its work on ODF 1.2, and part of that is completing the OpenFormula work. A key remaining part of OpenFormula is to ensure that our financial functions synch up with how Excel works. What OOXML says is irrelevant, except to the extent that it accurately tells how Excel defines these functions.
However, inquiries to Ecma on these day count conventions, inquiries made months ago, have received no response. Also, the final DIS text of OOXML has not been made available, not even to SC34 members and liaisons. And there is no mechanism in place yet in SC34 for collecting defect reports on OOXML. So we're stuck.
Or maybe not.
David Wheeler, Chair of the OpenFormula subcommittee of the OASIS ODF TC, has been trying to nail down the behavior of Excel's spreadsheet functions for over a year now. One of the last remaining pieces is to nail down the day count conventions. After waiting and waiting for this to be clarified in OOXML, David took matters into his own hands and decided to solve the problem by brute force, enumerating millions of test cases, indeed a comprehensive set of date pairs over a 6 year period, to try to determine exactly how the date bases in Excel work. You can read David's conclusions on his blog.
What strikes me in David's report is that not only are the OOXML definitions incomplete and inconsistent, but they do not accurately reflect what Excel actually calculates. So either Excel is wrong, or the OOXML standard is wrong when calculating almost every financial spreadsheet function. This is quite an embarrassment for an ISO standard, and an unnecessary one, since I have been talking about how poorly defined these functions are for almost a year now.
Let's work through some of David Wheeler's test cases by hand, to get a better feel for how OOXML is broken.
Let's take the YEARFRAC() function as the simplest example. YEARFRAC() takes two dates and a basis value as inputs, and returns the fraction of a year that is between those two dates. So logically, the calculation is like:
YEARFRAC = (interval between start-date and end-date) / length of year.
That is the logical definition. The only complication is the 5 date bases, and what exactly the year length is. This last point is something that stuck out in my mind when I first reviewed the draft of OOXML last summer. You might think the year length is 365 days. But what about leap years? And what about date ranges that straddle normal years and leap years? This is the key fact that this function requires in its definition. This is the problem I reported on my blog last July. This was the problem INCITS V1 submitted with our ballot comments last September. This is the problem that Ecma responded to in a severely flawed way in January. This is the problem that the BRM refused to discuss and merely agreed with Ecma's flawed changes. And this is the problem that is now in the final DIS version of OOXML.
Take a look for yourself in this brief extract from the final DIS version [pdf] of OOXML, provided for purposes of critical review and commentary.
(Yes, I now have a complete copy of the final DIS version of OOXML. If you think that this is unfair -- and I would agree with you on that -- then maybe you should ask ITTF why I was able to get a copy of the final DIS, but no one else in SC34 was.)
Let's take a look at the ISO OOXML definitions and try some test calculations to reproduce some of David's findings.
First, let's take date basis 1, Actual/Actual, since that is the easiest. But we immediately run into a problem. The standard says two different things. In the "description" table it says:
Actual/actual. The actual number of days between the two dates are counted. If the date range includes the date 29 February, the year is 366 days; otherwise it is 365 days.
However, later, when defining the return value, the standard says this:
If the Actual/actual basis is used, the year length used is the average length of the years that the range crosses, regardless of where start-date and end-date fall in their respective years.
There is absolutely no way in which these can both be correct. This would have been easily fixable at the BRM, if the BRM had been allowed to do its job. But I wasn't even allowed to open my mouth and point out this problem. So now this fatal ambiguity sits in the text of OOXML, as authorized by the BRM experts and approved by JTC1 NB's. Gotta love it.
But let's forge on and assume that we really have two algorithms, the first (from the description) which we will call basis 1, and the second (from the return value section) which we will call basis 1' ("one-prime"). We'll calculate it both ways.
Let's calculate YEARFRAC(DATE(2000,1,1),DATE(2001,1,1),1)
With basis 1, this is simple. 2000 was a leap year (since it was a century divisible by 400) so the interval between the two dates is 366 days. Similarly, since the date range includes a February 29th, the length of the year used in the calculation is 366. So the value returned by YEARFRAC should be 366/366 or 1.0.
With basis 1', this is also simple. The date interval is still 366. But the length of the year is now the average of the year lengths crossed by the date range. So the average of 366 (for 2000) and 365 (for 2001), or an average of 365.5. So using basis 1', YEARFRAC should return 366/365.5 or 1.00137.
Let's try another, with basis 1. What is YEARFRAC(DATE(2000,1,1),DATE(2002,1,1),1) ?
In basis 1, the interval is 731 days (366 days in 2000 plus 365 days in 2001). The year length is 366, since the interval includes a February 29th. So YEARFRAC should return 731/366 = 1.997.
With basis 1', the interval is also 731 days, but the year length is the average of 366 (for 2000), 365 (for 2001) and 365 (for 2002), or 365.33. So YEARFRAC should return 731/365.33 or 2.0009.
Just to make sure we have this down, let's try another example:
What is YEARFRAC(DATE(2000,1,1),DATE(2000,1,2),1) ?
A one day interval? Yes, please humor me.
OK. With basis 1, the interval is one day. Since it does not cross February 29th, the year length is 365 days. So it will return 1/365 or 0.00274.
With basis 1' the interval is also one day, and the year length is 366. So it will return 1/366 or 0.00273.
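The three hand calculations so far can be reproduced with a short sketch. This is only one reading of the two conflicting passages, applied the same way as in the examples above: the actual day count divided by a single year length. The function names are hypothetical, and nothing here claims to be how Excel itself computes the result.

```python
from datetime import date

def is_leap(year):
    """Gregorian leap-year rule."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

def includes_feb29(start, end):
    """True if some 29 February falls within [start, end]."""
    return any(is_leap(y) and start <= date(y, 2, 29) <= end
               for y in range(start.year, end.year + 1))

def yearfrac_basis1(start, end):
    """Reading of the 'description' text: year is 366 if the range has a Feb 29."""
    year_len = 366 if includes_feb29(start, end) else 365
    return (end - start).days / year_len

def yearfrac_basis1_prime(start, end):
    """Reading of the 'return value' text: average length of the years crossed."""
    lengths = [366 if is_leap(y) else 365
               for y in range(start.year, end.year + 1)]
    return (end - start).days / (sum(lengths) / len(lengths))

for end in (date(2001, 1, 1), date(2002, 1, 1), date(2000, 1, 2)):
    start = date(2000, 1, 1)
    print(end, yearfrac_basis1(start, end), yearfrac_basis1_prime(start, end))
```

Keeping the two readings as separate functions makes it easy to run the same date pair through both and see where they part ways.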
OK. That was too easy. One more example, to make sure that you have it down.
What is YEARFRAC(DATE(2000,1,1),DATE(2004,1,31),1) ?
Hmm.... this one will require more thought. I might have to take off my shoes and count using my toes as well.
With basis 1, the interval is 1491 days = 366 + 365 + 365 +365 +30.
The year lengths are:
2000=366 (since the range crosses February 29th)
2001=365
2002=365
2003=365
2004=365 (since the date range does not cross February 29th)
So we have 4 full years plus 30 days of a 365 day year. YEARFRAC should return 4 + 30/365 = 4.0822.
With basis 1' we treat the 2004 as having 366 days and average the years in the interval, so average year length = (366+365+365+365+366)/5 = 365.4. So YEARFRAC should return 4 + 30/365.4 = 4.0821.
Now that we're done with the examples, we can throw them into a table, and compare them to what Excel 2007 calculates for these same parameters:
| start-date | end-date | basis | ISO Value | Excel's Value | Excel Correct? |
|---|---|---|---|---|---|
| 2000-01-01 | 2001-01-01 | 1 | 1.0000 | 1.0000 | Yes |
| 2000-01-01 | 2001-01-01 | 1' | 1.00137 | 1.0000 | No |
| 2000-01-01 | 2002-01-01 | 1 | 1.9972 | 2.0009 | No |
| 2000-01-01 | 2002-01-01 | 1' | 2.0009 | 2.0009 | Yes |
| 2000-01-01 | 2000-01-02 | 1 | 0.00274 | 0.00273 | No |
| 2000-01-01 | 2000-01-02 | 1' | 0.00273 | 0.00273 | Yes |
| 2000-01-01 | 2004-01-31 | 1 | 4.0822 | 4.0805 | No |
| 2000-01-01 | 2004-01-31 | 1' | 4.0821 | 4.0805 | No |
Let your mind linger on these results for a bit. Let it sink in. Look at this table until you recognize its significance and cringe in disgust. In some cases, Excel seems to be using the first definition of date basis 1. In other cases it is using the second definition of date basis 1. And in one case it is using neither definition. In other words, this is a lot more screwed up than at first it appears. This is not just a simple ambiguity. The OOXML definition of date basis 1 is totally wrong.
That is just the Actual/Actual date basis. The other 4 conventions are more complicated. I encourage you to read David's write up in full to see how 3 of 5 basis conventions defined in OOXML differ from what Excel actually calculates. David also shows how he believes Excel really calculates these day count conventions, based on his extensive tests.
Now if this was just a matter of one function in Excel, just YEARFRAC(), then this would not be a big deal. But this flaw is inherited into most financial functions in OOXML. Let's take an example at random, DISC(), which calculates the discount rate for a security, given settlement and maturity dates, as well as par and redemption values. You can read the definition of DISC() from the final DIS text here [pdf].
You don't need to be a Wall Street quantitative analyst to see some obvious problems here. First, in the formula given, the 2nd term is divided by "SM". What is SM? There is no "SM" defined here. There is a "DSM" defined, however. Is that what is meant? Let's assume so.
We can try a test calculation, using the example given in the text of OOXML:
What is the value of: DISC(DATE(2007,1,25),DATE(2007,6,15),97.975,100,1) ?
- B = 365, since the date range does not include February 29th. (Note that DISC has a single definition for year length in basis 1, not the two conflicting definitions we saw in YEARFRAC)
- DSM = 141 days = 6 in Jan+28 in Feb+31 in March+30 in April+31 in May+15 in June
- par = 97.975, which was our input parameter
- redemption = 100, which is another one of our input parameters
So Excel is off by 2% or so. Do we really care? It's just money, right?
The problem is that the function in OOXML is defined incorrectly, from the financial perspective. The discount rate is the discount from the redemption value, not the discount from the purchase price. So the first term in the formula should be (redemption-par)/redemption, not (redemption-par)/par. If you make this change, then the calculated value matches the value Excel gives.
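The size of that discrepancy can be checked with a few lines of code. The sketch assumes the formula has the shape described above, a discount term multiplied by B/DSM, with "SM" read as "DSM"; the function names are hypothetical, and B is passed in directly as 365 rather than derived from a day count rule.

```python
from datetime import date

def disc_as_written(settlement, maturity, par, redemption, year_len):
    """Discount taken relative to par, as the text's formula is described above."""
    dsm = (maturity - settlement).days  # days from settlement to maturity
    return (redemption - par) / par * year_len / dsm

def disc_from_redemption(settlement, maturity, par, redemption, year_len):
    """Discount taken relative to the redemption value (the corrected form)."""
    dsm = (maturity - settlement).days
    return (redemption - par) / redemption * year_len / dsm

settlement, maturity = date(2007, 1, 25), date(2007, 6, 15)
as_written = disc_as_written(settlement, maturity, 97.975, 100, 365)
corrected = disc_from_redemption(settlement, maturity, 97.975, 100, 365)
print(f"as written: {as_written:.4f}")
print(f"corrected:  {corrected:.4f}")
print(f"ratio:      {as_written / corrected:.4f}")
```

The ratio between the two results is simply redemption/par, or 100/97.975, which is where the "2% or so" comes from.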
Does anything strike you as odd in the above? Do you have a chill running down your spine? Do you have a renewed sense of dread? You should, because I just illustrated another grave problem with the OOXML standard: The spreadsheet examples in OOXML are a fraud.
You might have mistakenly been reassured by these numerous examples in the spreadsheet formula, that these actually had some relationship to the standard, that they were examples of how the calculations should be done, that they were evidence of some sort of quality assurance, that they may even be of assistance to an implementor to see whether they implemented the function correctly.
But they aren't.
What would be normal practice would be to take the definitions, as given in the OOXML text, and to calculate the values according to the definition provided in the text, and then to compare the resulting values with what Excel returns. That would provide a useful double-check of the definitions in the text. But OOXML doesn't do that. The examples here are mere fluff.
The discrepancy here also indicates that no one has actually reviewed these formulas for accuracy. Errors like this are immediately evident, but only if you look. The fact that things like this have escaped the notice of Microsoft, Ecma TC45, their Project Editors, 80 NB reviews, the BRM experts, and the eagle eyes of ITTF, should make one have considerable concerns over the sufficiency of the Fast Track review and approval process.
Let's try one more example before we wrap this up.
What is the value of: DISC(DATE(2000,1,1),DATE(2004,1,31),97.975,100,1) ?
- B = 366, since the date range includes February 29th
- DSM = 1491 days = 366 + 365 + 365 +365 +30.
- par = 97.975, which was our input parameter
- redemption = 100, which is another one of our input parameters
So with the DISC() function we found:
- The given text provided a formula that referred to a nonexistent "SM" variable. This appears to be a cut & paste error.
- After accounting for that, we found that the formula was incorrect according to recognized financial standards. Securities are discounted from their redemption values, not from their purchase prices.
- Even correcting for the formula errors, we find that the given definition of DISC() does not match what Excel returns, due to errors and ambiguities in the day count conventions, errors that David Wheeler delves into more deeply in his report.
- The examples given in the standard are bogus. They are not actually examples of the defined function.
- Excel does not implement OOXML.
Labels: OOXML
Tuesday, May 13, 2008
Spreadsheet file format performance
But first, a few details of my setup. All timings, done by stopwatch, were from Office 2003 and OpenOffice 2.4.0 running on Windows XP, with all current service packs and patches. The machine is a Lenovo T60p, dual-core Intel 2.16 GHz and 2 GB of RAM. I took all the standard precautions -- disk was defragmented, and test files were confirmed as defragmented using contig. No other applications were running and background tasks were all shut down.
For test files, I went back to an old favorite, George Ou's (at the time with ZDNet) monster 50MB XLS file from his series of tests back in 2005. This file, although very large, is very simple. There are no formulas, indeed no formatting or styles. It is just text and numbers, treating a spreadsheet like a giant data table. So tests of this file will emphasize the raw throughput of the applications. Real world spreadsheets will typically be worse than this due to additional overhead from processing styles, formulas, etc.
A test of a single file is not really that interesting. We want to see trends, see patterns. So I made a set of variations on George's original file, converting it into ODF, XLS and OOXML formats, as well as making scaled down versions of it. In total I made 12 different sized subsets of the original file, ranging down to a 437KB version, and created each file in all three formats. I then tested how long it took to load each file in each of the applications. In the case of MS Office, I installed the current versions of the translators for those formats, the Compatibility Pack for OOXML, and the ODF Add-in for the ODF support.
I find it convenient to report numbers per 100,000 spreadsheet cells. You could equally well use the original XLS spreadsheet size, or the number of rows of data, or any other correlated variable as the ordinate, but values per 100K cells is simple for anyone to understand.
I'll spare you all the pretty pictures. If you want to make some, here is the raw data (CSV format). But I will give some summary observations.
For document sizes, the results are as follows:
- Binary XLS format = 1,503 KB per 100K cells
- OOXML format = 491 KB per 100K cells
- ODF format = 117 KB per 100K cells
Any ideas?
For load time, the times for processing the binary XLS files were:
- Microsoft Office 2003 = 0.03 seconds per 100K cells
- OpenOffice 2.4.0 = 0.4 seconds per 100K cells
So what about the new XML formats? There has been recent talk about the "Angle Bracket Tax" for XML formats. How bad is it?
- Microsoft Office 2003 with OOXML = 1.5 seconds per 100K cells
- OpenOffice 2.4.0 with ODF = 2.7 seconds per 100K cells
OK. So what are we missing? Ah, yes, ODF format in MS Office, using their ODF Add-in.
- Microsoft Office 2003 with ODF, using the ODF Add-in = 74.6 seconds per 100K cells
- Microsoft Office 2003 in XLS format = 0.75 seconds
- OpenOffice 2.4.0 in XLS format = 3.03 seconds
- Microsoft Office 2003 in OOXML format = 8.28 seconds
- OpenOffice 2.4.0 in ODF format = 14.09 seconds
- Microsoft Office 2003 in ODF format = 515.60 seconds
(I was not able to test files larger than this using the ODF Add-in since they all crashed.)
(Update: Since it is the question everyone wants to know, the beta version of OpenOffice 3.0 opens the OOXML version of that file in 49.4 seconds and Sun's ODF Plugin for Microsoft Office loads this file in 30.03 seconds.)
This is one reason why I think file format translation is a poor engineering approach to interoperability. When OpenOffice wants to read a legacy XLS file, it does not approach the problem by translating the XLS into an ODF document and then loading the ODF file. Instead it simply loads the XLS file, via a file filter, into the internal memory model of OpenOffice.
What is a file filter? It is like 1/2 of a translator. Instead of translating from one disk format to another disk format, it simply loads the disk format and maps it into an application-specific memory model that the application logic can operate directly on. This is far more efficient than translation. This is the untold truth that the layperson does not know. But this is how everyone does it. That is how we support formats in SmartSuite. That is how OpenOffice does it. And that is how MS Office does it for the file formats they care about. In fact, that is the way that Novell is doing it now, since they discovered that the Microsoft approach is doomed to performance hell.
So it is with some amusement that I watch Microsoft and others propose translation as a solution to interoperability, creating reports about translation, even a proposal for a new work item in JTC1/SC34 concerning file format translation, when the single concrete attempt at translation is such an abysmal failure. It may look great on paper, but it is an engineering disaster. What customers need is direct, internal support for ODF in MS Office, via native code, in a file filter, not a translator that takes 10 minutes to load a file.
The astute engineer will agree with the above, but will also feel some discomfort at the numbers. There is more here than can be explained simply by the use of translators versus import filters. That choice might explain a 2x difference in performance. A particularly poor implementation might explain a 5x difference. But none of this explains why MS Office is almost 40x slower in processing ODF files. Being that much slower is hard to do accidentally. Other forces must be at play.
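The "almost 40x" figure can be checked directly against the load times listed earlier. The following is only a quick sketch: the dictionary keys are labels made up here for readability, and the numbers are the five totals reported above.

```python
# Load times in seconds, copied from the list of totals above.
# The key names are illustrative labels, not anything from the test harness.
load_times = {
    "Office 2003 / XLS": 0.75,
    "OpenOffice 2.4.0 / XLS": 3.03,
    "Office 2003 / OOXML": 8.28,
    "OpenOffice 2.4.0 / ODF": 14.09,
    "Office 2003 / ODF Add-in": 515.60,
}

# How much slower is MS Office with the ODF Add-in than OpenOffice
# loading the same ODF file natively?
slowdown = load_times["Office 2003 / ODF Add-in"] / load_times["OpenOffice 2.4.0 / ODF"]
minutes = load_times["Office 2003 / ODF Add-in"] / 60

print(f"ODF Add-in vs. native ODF: {slowdown:.1f}x slower")
print(f"ODF Add-in load time: {minutes:.1f} minutes")
```

The same dictionary can be used to compare any other pair of format and application combinations.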
Any ideas?
Labels: ODF, OOXML, Performance
Sunday, May 04, 2008
Release the OOXML final DIS text now !
13.12 The time period for post ballot activities by the respective responsible parties shall be as follows:
.
.
.
- In not more than one month after the ballot resolution group meeting the SC Secretariat shall distribute the final report of the meeting and final DIS text in case of acceptance.
The OOXML BRM ended on February 29th. One month after February 29th, if my course work in scientific computing does not fail me, is... let's see, carry the 3, multiply, convert to sidereal time, account for proper nutation of the solar mean, subtract the perihelion distance at first point of Aries, OK. Got it. Simple. One month later is approximately March 29th +/- 3 days.
So the SC34 Secretariat should have distributed the "final DIS text" by March 29th, or at the very least, when the final ballot results on OOXML were known a few days later.
But that didn't happen. Nothing. Silence. What is the hang up? I note that when NB's said that the Fast Track schedule did not give sufficient time to review OOXML, the response from ISO/IEC was "There is nothing we can do. The Directives only permit 5 months". And when NB's protested at the arbitrary 5 day length of the OOXML BRM, the response was similarly dismissive. But when Microsoft needs more time to edit OOXML, well that appears to be something entirely different. "Directives, Schmerectives. You don't worry yourself about no stinkin' Directives. Take whatever time you need, Sir."
It makes you wonder who the ISO/IEC bureaucracy is working for. The rights and prerogatives of NB's? Or of large corporations? Almost every decision they made in the OOXML processing was to the detriment of NB prerogatives.
This delay has practical implications as well. Consider the following:
- We are currently approaching a two month period where NB's can lodge an appeal against OOXML. Ordinarily, one of the grounds for appeal would be if the Project Editor did not faithfully carry out the editing instructions approved at the BRM. For example, if he failed to make approved changes, made changes that were not authorized, or introduced new errors when applying the approved changes. But with no final DIS text, the NB's are unable to make any appeals on those grounds. By delaying the release of the final DIS text, JTC1 is preventing NB's from exercising their rights.
- Lawsuits, such as the recent one in the UK, are alleging process irregularities, including (if I read it correctly) that BSI approved OOXML without seeing the final text. I imagine that having the final DIS text in hand and being able to point to particular flaws in that text that should have justified disapproval would bolster their case. But if JTC1 withholds the text, then they cannot make that point as effectively.
- There are obvious anti-competitive effects at play here. Microsoft has the final DIS version of the ISO/IEC 29500:2008 standard, and by JTC1 delaying release to NB's, Microsoft is able to have 2+ extra months, free of competition, to produce a fix pack to bring their products in line with the final standard, while other competitors like Sun or Corel are left behind. So much for transparency. So much for open standards. How can this be considered open if some competitors are given a significant time and access advantage?
Note that I'm not talking about the publication of the IS here. I'm talking about the requirements of 13.12 and the release of the final DIS text. Obviously ITTF will have a lot of work to do prepping OOXML for publication. For ODF it took 6 months. For OOXML I would expect it to take at least that long. But that does not prevent adherence to the Directives, in particular the requirement to distribute the final DIS text.
JTC1/SC34, noticing the delay in the release of this text, adopted the following Resolution at their Plenary in early April:
Resolution 8: Distribution of Final text of DIS 29500
SC 34 requests the ITTF and the SC34 secretariat to distribute the already received final text of DIS 29500 to the SC 34 members in accordance with JTC 1 directives section 13.12 as soon as possible, but not later than May 1st 2008. Access to this document is important for the success of various ISO/IEC 29500 maintenance activities.
This indicates that the final DIS text had already been received by SC34 (but not distributed) as of that date (April 9th).
Well, here we are, May 4th, over two months since the BRM ended and more than a month since the final DIS text was due, and past the date requested by the SC34 Plenary (who by the way have no authority to extend the deadline required by JTC1 Directives, but that is another story). We have nothing.
So, I'll make my own personal appeal. JTC1 has the text. The Directives are clear. The delay is unnecessary and harmful in the ways I outlined above. Release the final DIS text now. Not next month. Not next week. Release it now.
Labels: OOXML
Friday, April 18, 2008
Sinclair's Syndrome
The number of pages of a document is not a criterion cited in the JTC 1 Directives for refusal. It should be noted that it is not unusual for IT standards to run to several hundred, or even several thousand pages.
Now certainly there are standards that are several thousand pages long. For example, Microsoft likes to bring up the example of ISO 14496, MPEG 4, at over 4,000 pages in length. But that wasn't a Fast Track. And as Arnaud Lehors reminded us earlier, MPEG 4 was standardized in 17 parts over 6 years.
So any answer in the FAQ which attempts to consider what is usual and what is unusual must take account of past practice for JTC1 Fast Track submissions. That, after all, was the question the FAQ purports to address.
Ecma claims (PowerPoint presentation here) that there have been around 300 Fast Tracked standards since 1987 and Ecma has done around 80% of them. So looking at Ecma Fast Tracks is a reasonable sample. Luckily Ecma has posted all of their standards, from 1991 at least, in a nice table that allows us to examine this question more closely. Since we're only concerned with JTC1 Fast Tracks, not ISO Fast Tracks or standards that received no approval beyond Ecma, we should look at only those which have ISO/IEC designations. "ISO/IEC" indicates that the standard was approved by JTC1.
So where did things stand on the eve of Microsoft's submission of OOXML to Ecma?
At that point there had been 187 JTC1 Fast Tracks from Ecma since 1991, with basic descriptive statistics as follows:
- mean = 103 pages
- median = 82 pages
- min = 12 pages
- max = 767 pages
- standard deviation = 102 pages
A histogram of the page lengths looks like this:

So the ISO statement that "it is not unusual for IT standards to run to several hundred, or even several thousand pages" does not seem to ring true in the case of JTC1 Fast Tracks. A good question to ask anyone who says otherwise is, "In the time since JTC1 was founded, how many JTC1 Fast Tracks have been submitted greater than 1,000 pages in length?" Let me know if you get a straight answer.
Let's look at one more chart. This shows the length of Ecma Fast Tracks over time, from the 28-page Ecma-6 in 1991 to the 6,045 page Ecma-376 in 2006.

Let's consider the question of usual and unusual again, the question that ISO is trying to inform the public on. Do you see anything unusual in the above chart? Take a few minutes. It is a little tricky to spot at first, but with some study you will see that one of the standards plotted in the above chart is atypical. Keep looking for it. Focus on the center of the chart, let your eyes relax, clear your mind of extraneous thoughts.
If you don't see it after 10 minutes or so, don't feel bad. Some people and even whole companies are not capable of seeing this anomaly. As best as I can tell it is a novel cognitive disorder caused by taking money from Microsoft. I call it "Sinclair's Syndrome" after Upton Sinclair who gave an early description of the condition, writing in 1935: "It is difficult to get a man to understand something when his salary depends upon his not understanding it."
To put it in more approachable terms, observe that Ecma-376, OOXML, at 6,045 pages in length, was 58 standard deviations above the mean for Ecma Fast Tracks. Consider also that the average adult American male is 5′ 9″ (175 cm) tall, with a standard deviation of 3″ (8 cm). For a man to be as tall, relative to the average height, as OOXML is to the average Fast Track, he would need to be 20′ 3″ (6.2 m) tall !
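The arithmetic behind that comparison can be reproduced in a few lines. This is a sketch using only the summary statistics quoted above (mean 103 pages, standard deviation 102 pages); the variable names are made up for illustration.

```python
# Summary statistics for Ecma JTC1 Fast Tracks, as quoted above.
MEAN_PAGES = 103
SD_PAGES = 102
OOXML_PAGES = 6045

# Number of standard deviations OOXML sits above the mean page count.
z = (OOXML_PAGES - MEAN_PAGES) / SD_PAGES

# Height figures quoted above: mean 5'9" (69 inches), standard deviation 3 inches.
MEAN_HEIGHT_IN = 69
SD_HEIGHT_IN = 3

# A man the same number of standard deviations above average height.
height_in = MEAN_HEIGHT_IN + z * SD_HEIGHT_IN
feet, inches = divmod(height_in, 12)
height_m = height_in * 0.0254

print(f"z = {z:.1f} standard deviations")
print(f"equivalent height: {int(feet)} ft {inches:.0f} in ({height_m:.1f} m)")
```

Rounding z down to 58 before converting gives the 20′ 3″ figure in the text; carrying the fraction through adds less than an inch.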
For ISO, in a public relations pitch, to blithely suggest that several thousand page Fast Tracks are "not unusual" shows an audacious disregard for the truth and a lack of respect for a public that is looking for ISO to correct its errors, not blow smoke at them in a revisionist attempt to portray the DIS 29500 approval process as normal, acceptable or even legitimate. We should expect better from ISO and we should express disappointment in them when they let us down in our reasonable expectations of honesty. We don't expect this from Ecma. We don't expect this from Microsoft. But we should expect this from ISO.
Monday, March 24, 2008
OOXML's (Out of) Control Characters
The value space for an XML data item comprises the set of all allowed values. So the value space for the “float” data type would be all floating point numbers, such as 12.34 or 43.21. The lexical space comprises all ways of expressing these values in the character stream of an XML document. So lexical representations of the value 12.34 include “12.34”, “12.340” and “1.234E1”. For ease of illustration I will indicate value space items in bold, and lexical space items in quotes. In general there are multiple lexical representations that may represent the same value.
Character data in XML also permits more than one lexical representation of the same value. For example, “A” and “&#65;” both represent the value A. The “numerical character reference” approach allows an XML author to easily encode the occasional Unicode character which is not part of the author's native editing environment, e.g., adding the copyright character or occasional foreign character. The value space allowed by XML includes most of Unicode, including all of the major writing systems of the world, current and historical.
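The distinction between lexical space and value space can be seen with an ordinary XML parser. The sketch below uses Python's standard library; note that Python's float is a 64-bit double rather than the 32-bit xsd:float, so it only approximates the schema type, and the element name `t` is arbitrary.

```python
import xml.etree.ElementTree as ET

# Three lexical forms that should all map to the same value, 12.34.
lexical_forms = ["12.34", "12.340", "1.234E1"]
values = {float(s) for s in lexical_forms}
print(len(values))  # one distinct value for three lexical forms

# A literal character and a numeric character reference for the same code point.
plain = ET.fromstring("<t>A</t>").text
referenced = ET.fromstring("<t>&#65;</t>").text
print(plain == referenced)  # the parser resolves the reference to the same value
```

In both cases the application sees one value, regardless of which lexical form the author chose.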
The concern I have with DIS 29500 concerns Ecma's introduction of a ST_Xstring (Escaped String) datatype. This new type is defined via the following XML Schema definition:
<simpleType name="ST_Xstring">
<restriction base="xsd:string"/>
</simpleType>
This uses the “derivation by restriction” facility of XML Schema to define a new type, derived from the standard xsd:string schema type. The xsd:string type is defined to allow only character values that are also allowed in the XML standard.
The use of derivation by restriction implies a clear relationship between the ST_Xstring type and the base type xsd:string. This is stated in XML Schema Part 1, clause 2.2.1.1:
A type definition whose declarations or facets are in a one-to-one relation with those of another specified type definition, with each in turn restricting the possibilities of the one it corresponds to, is said to be a restriction.
The specific restrictions might include narrowed ranges or reduced alternatives. Members of a type, A, whose definition is a restriction of the definition of another type, B, are always members of type B as well.
The last sentence can be taken as a restatement of the Liskov Substitution Principle, a fundamental principle of interface design, that a subtype should be usable (substitutable) wherever a base type is usable. It is this principle that ensures interoperability. A type derived by restriction limits, restricts, constrains, reduces the permitted value space of its base type, but it cannot increase the value space beyond that permitted by its base type.
So, with that background, let's now look at how OOXML defines the semantics of its ST_Xstring type:
ST_Xstring (Escaped String)
String of characters with support for escaped invalid-XML characters.
For all characters which cannot be represented in XML as defined by the XML 1.0 specification, the characters are escaped using the Unicode numerical character representation escape character format _xHHHH_, where H represents a hexadecimal character in the character's value. [Example: The Unicode character 8 is invalid in an XML 1.0 document, so it shall be escaped as _x0008_. end example]
This simple type's contents are a restriction of the XML Schema string datatype.
In other words, although ST_Xstring is declared to be a restriction of xsd:string it is, via a proprietary escape notation, in fact expanding the semantics of xsd:string to create a value space that includes additional characters, including characters that are invalid in XML.
Let's review some of the problems it introduces.
First, the semantics of XML strings that contain invalid XML-characters is undefined by this or any other standard. For example, OOXML uses ST_Xstring in Part 4, Clause 3.3.1.30 to store the error message which should be displayed when a data validation formula fails. But what should an OOXML-supporting application do when given a display string which contains control characters from the C0 control range, characters forbidden in XML 1.0?
- U+0004 END OF TRANSMISSION
- U+0006 ACKNOWLEDGE
- U+0007 BELL
- U+0008 BACKSPACE
- U+0016 SYNCHRONOUS IDLE
There is a reason XML excludes these dumb terminal control codes. They are neither desired nor necessary in XML.
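To see the prohibition in practice, here is a small sketch that feeds three documents to Python's standard XML parser (which is built on expat). The helper name `is_well_formed` and the element name `t` are made up for this illustration; other parsers should behave the same way for XML 1.0, but that is not verified here.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts the document, False if it rejects it."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# A reference to an ordinary character (U+0041) is accepted.
print(is_well_formed("<t>&#65;</t>"))

# A reference to BACKSPACE (U+0008) is rejected: the character is not allowed in XML 1.0.
print(is_well_formed("<t>&#8;</t>"))

# The OOXML-style escape is accepted, but only because the parser sees it as
# eight ordinary characters of text, not as a backspace.
print(is_well_formed("<t>_x0008_</t>"))
```

The third case is the crux: the escape slips past the parser precisely because XML does not know it is an escape.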
Elliotte Rusty Harold explains the rationale for this prohibition in his book Effective XML:
The first 32 Unicode characters with code points 0 to 31 are known as the C0 controls. They were originally defined in ASCII to control teletypes and other monospace dumb terminals. Aside from the tab, carriage return, and line feed they have no obvious meaning in text. Since XML is text, it does not include binary characters such as NULL (#x00), BEL (#x07), DC1 (#x11) through DC4 (#x14), and so forth. These noncharacters are historic relics. XML 1.0 does not allow them.
This is a good thing. Although dumb terminals and binary-hostile gateways are far less common today than they were twenty years ago, they are still used, and passing these characters through equipment that expects to see plain text can have nasty consequences, including disabling the screen.
Further, since these characters are undefined in XML, they are unlikely to work well with existing accessibility interfaces and devices. At best these characters will be ignored and introduce subtle errors. For example, what does “$10,[BS]000” become if one system processes the backspace and another does not? Worst case, the accessibility interface expecting a certain range of characters as defined by the xsd:string type will crash when presented with values beyond the expected range.
Interfaces with existing programming languages are also harmed by ST_Xstring. How does a C or C++ XML parser deal with XML that now can allow a U+0000 (NULL) character in the middle of a string, something which is illegal in those programming languages?
What about XML database interfaces that take XML data and store it in relational tables? If they are schema-aware and see that ST_Xstring is merely a restriction of xsd:string, they will assume the normal range of characters can be stored wherever an xsd:string can be stored. But since the value space is expanded, there is no guarantee that this will still be true. These characters may cause validation errors in the database.
By now, the observant reader may be accusing me of pulling a fast one. "But Rob, none of the above is a problem if the application simply leaves the ST_Xstring encoded and does not try to decode or interpret the non-XML character," you might say.
OK. Fair enough. Let's follow that approach and see where it leads us.
Let's look at interoperability with other XML-based standards. Imagine you do a DOM parse of an OOXML document that contains “strings” of type ST_Xstring. Either your parser/application is OOXML-aware, or it isn't. In other words, either it is able to interpret the non-standard _xHHHH_ instructions, or it isn't.
If it doesn't understand them, then any other code that operates on the DOM nodes with ST_Xstring data is at risk of returning the wrong answer. For example, what is the length of the string “ABC”? Three-characters, of course. But what is the length of the string “_x0041_BC” ? These two strings both have the same values according to OOXML. But an XML application might return 9 or return 3, depending on whether it is OOXML-aware or not. Since most (all) XML parsers are unaware of the non-standard escape mechanism proposed by OOXML, they will typically calculate things such as string lengths, string comparisons, string sorting, etc., incorrectly.
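The length discrepancy is easy to demonstrate. The decoder below is a minimal sketch that assumes only the _xHHHH_ pattern quoted from the specification above; the function name `decode_xstring` is hypothetical, and any additional rules the specification may have (for example, how a literal "_x" sequence is itself escaped) are not handled.

```python
import re

# Matches the _xHHHH_ escape form, where H is a hexadecimal digit.
_ESCAPE = re.compile(r"_x([0-9A-Fa-f]{4})_")

def decode_xstring(s: str) -> str:
    """Replace each _xHHHH_ escape with the character it names (sketch only)."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), s)

raw = "_x0041_BC"

# What an OOXML-unaware application sees: nine ordinary characters.
print(len(raw))

# What an OOXML-aware application sees after decoding: "ABC", three characters.
decoded = decode_xstring(raw)
print(decoded, len(decoded))
```

Any code that compares, sorts or measures these strings gets a different answer depending on which side of that decoding step it runs.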
But suppose the parser/application is OOXML-aware and correctly decodes these character references into the correct Unicode values, then what? Assuming the host language doesn't crash from the existence of these control characters, we then are presented with problems at the interface with any other code that operates on the DOM. Suppose we try to transform the DOM via XSLT to XHTML. Will the XSLT engine properly handle the existence of these forbidden character values? The XSLT engine may just crash. But suppose it doesn't. How does it write out these control characters into XHTML? It can't. These values are not permitted in XHTML. Dead end. What about DocBook? DITA? OpenDocument Format? Not possible. Since these characters are not permitted in XML 1.0 at all, they will be forbidden in all other markup languages that are based on XML 1.0, or even XML 1.1 for that matter (XML 1.1 allows some but not all of these characters, in particular the NULL character is excluded).
Note further that with XML pipelining and with mashups, the application that writes XML output typically does not have direct knowledge of the application that originally produced the XML values. This decoupling of producers and consumers is an essential aspect of modern systems integration, including Web Services. By corrupting XML string values in the way that it does, DIS 29500 breaks the ability to have loosely coupled systems. Once the value space is polluted by these aberrant control characters, every application, every process that touches this data must be aware of their non-standard idiosyncrasies lest they crash or return incorrect answers. In this way, one standard perverts the entire XML universe, forcing them all to contend with the poor hygiene of a single vendor.
The reader might think that I exaggerate the importance of this, that surely ST_Xstring is only used in OOXML in edge cases, in rare, compatibility modes. We wish that this were true. However, a look at the DIS 29500 shows that ST_Xstring is pervasive, and in fact is the predominant data type in SpreadsheetML, used to express the vast majority of spreadsheet content, including cell contents, headers, footers, display strings, error strings, tooltip help, range names, etc. Any application that operates on an OOXML spreadsheet will need to deal with this mess.
For example, here are some uses of ST_Xstring in DIS 29500, Part 4:
- Clause 3.2.3 for the name of a custom view in a spreadsheet
- Clause 3.2.5 for the name of a spreadsheet named range, for the descriptive comment, for the name description, for the help topic, the keyboard shortcut, the status bar text and for the menu item text
- Clause 3.2.14 for the name of a spreadsheet function group
- Clause 3.2.19 for the name of a sheet in a workbook
- Clause 3.2.22 for the name of a smart tag as well as for the URL of a smart tag.
- Clause 3.2.25 for the destination file name and title when publishing spreadsheet to the web.
- Clause 3.3.1.10 for the value of a conditional formatting object, e.g., a gradient
- Clause 3.3.1.20 for the name of a custom property
- Clause 3.3.1.28 for sheet and range names
- Clause 3.3.1.30 for error message string, error message title, prompt string and prompt title in a spreadsheet data validation definition.
- Clause 3.3.1.35 for the value of a footer for even numbered pages.
- Clause 3.3.1.36 for the value of a header for even numbered pages.
- Clause 3.3.1.38 for the content of the first page footer
- Clause 3.3.1.39 for the content of the first page header
- Clause 3.3.1.44 for the display string for a hyperlink, the tooltip help for the link, also the anchor target if the hyperlink is to an HTML page
- Clause 3.3.1.49 for values of input cells in a scenario
- Clause 3.3.1.50 for cell inline text values
- Clause 3.3.1.55 for the value of a footer for odd numbered pages.
- Clause 3.3.1.56 for the value of a header for odd numbered pages.
- Clause 3.3.1.73, in scenarios for the comment text, the scenario name and the name of the person who last changed the scenario.
- Clause 3.3.1.88 when defining sort condition, for the values of the custom sort list
- Clause 3.3.1.93 for the value contained within a cell
- Clause 3.3.1.94 for information associated with items published to the web, including the destination file and the title of the output HTML file
- Clause 3.3.2.2 for expressing the criteria values in a filter
- Clause 3.3.15 for the key/values for smart tag properties
- Clause 3.4.4 for expressing the contents of a rich text run
- Clause 3.4.5 for expressing the name of a font
- Clause 3.4.6 for expressing the text of a phonetic hint for East Asian text
- Clause 3.4.8 for expressing a text item in the shared string table
- Clause 3.4.12 for the text content shown as part of a string
- Clause 3.5.1.2 for a table, expressing a textual comment, a display name as well as style names.
- Clause 3.5.1.3 for a table column, expressing cell and row style names, column name
- Clause 3.5.1.7 for column properties created from an XML mapping, for expressing the associated XPath.
- Clause 3.5.2.4 for the XPath associated with column properties for XML tables
- Clause 3.7.1-3.7.6 for specifying content of tracked comments, including the text of the comments as well as the authors of the comments
- Clause 3.8.29 expressing the name of a font
The reader might further argue that, although the type allows characters that are forbidden by XML, the actual occurrence of these values in real legacy documents is likely to be rare. This might be true, but this is cause for even greater concern. If every document contained these control characters, then we would immediately be aware of any interoperability problems when integrating OOXML data with other systems. But if these characters are permitted, but occur rarely and randomly, then the integration errors will also occur rarely and randomly, allowing data corruption and other problems to occur and propagate further before detection.
In summary, we are concerned that the ST_Xstring type in OOXML opens us up to problems such as:
- Introducing accessibility problems
- Breaking unaware C/C++ XML parsers
- Breaking XML databases
- Breaking interoperability with other XML languages
- Breaking application logic related to string searching, sorting, comparisons, etc.
- Introducing errors that will be hard to detect and resolve
- Use xsd:string uniformly instead of ST_Xstring, with no use of forbidden XML characters. This would require that applications that read legacy binary documents containing such characters eliminate them at this point, perhaps replacing them with licit characters or with whitespace. No application will be more able to divine the original meaning and intent of these characters than the original vendor. So they should be responsible for cleaning up these strings to make them XML-ready.
- Use a non-string type such as the binary xsd:hexBinary or xsd:base64Binary to represent these data items.
- Use a mixed content encoding, where the licit characters are represented by xsd:string data, and the forbidden characters are denoted by specially-defined elements. So “A_x0008_BC” would become: <text>A<backspace/>BC</text>. In this case the semantics of the <backspace> element would need to be documented in the DIS 29500 specification, including its effect on searching, sorting, length calculations, etc.
Labels: OOXML
Five (Bad) Reasons to Approve OOXML
- If you don't approve OOXML, Microsoft will walk away, and you'll never hear from them again. Forget the fact that OOXML is already an Ecma standard (Ecma-376), and cannot be taken away. Forget the fact that Microsoft has other formats lined up for ISO approval in the near future, like XPS or HD Photo. Microsoft wants you to think that if you don't give them exactly what they want, now, they will walk away from ISO and you will be the worse for it. We need to encourage Microsoft for their abuse of the standardization process, in hopes that their participation will evolve in line with our hopes, and not our fears, that they will improve on the standardization side, while curbing the abuse side. Of course, the encouragement could be misinterpreted to mean the opposite, and we could get more abuse, and even lower quality standards. I guess that's the risk we'll just need to take. By similar abuses of logic small children hold their breath until their faces turn blue, thinking they can scare adults into giving them what they want. It doesn't work there either.
- If you approve OOXML, you can have the privilege of spending the next 5 years in the glorious work of fixing thousands of defects in the text. You can get a seat at the table, fixing bugs that should have been fixed in Ecma before OOXML was even submitted to JTC1. Forget the fact that maintenance in JTC1 is a ponderous, time consuming activity, where individual defects are enumerated, changes proposed, discussed, voted on, etc. Forget the fact that the recent BRM showed that you can't really get through more than 60 defects in a week-long meeting. Forget the fact that fixing defects in Ecma, not JTC1, would be far faster and easier due to the lighter-weight process Ecma imposes on their TC's. Forget that Fast Track is intended for mature, adopted standards not for ones that will require a "Perpetual BRM". Forget all that. You want a seat at the bug fixing table? You got it.
- Billions and Billions of legacy documents. Well, actually these legacy documents are not in OOXML format; they are in the legacy binary format. And no mapping has been provided from the legacy formats to OOXML. But there are billions and billions of these legacy documents. That must be important. So vote Yes for OOXML because there are billions and billions of documents in some other format that is nebulously related to it.
- More standards are better. More standards means more choice, means more decisions, means more consultants, means more money paid to XML experts. You'll sooner find the American Dairy Council recommending less milk consumption than a standards professional calling for fewer standards. So ignore quality, maturity and need. More standards are a good thing. Like Blu-ray and HD DVD.
- ODF will be better if OOXML is approved. In OASIS we're too stupid to look up legacy features or Excel spreadsheet formulas in Ecma-376. We would have never thought of that. We believe the only way to make ODF better is to make it more like OOXML. That is why we would like to encourage nice little JTC1 countries like Kazakhstan to vote YES for OOXML. As soon as OOXML is approved, then magically, it becomes useful to us. But the exact same text, not approved by Kazakhstan and JTC1, is not useful to us at all. It is all or nothing. There is nothing in the middle. Rather than taking a useful, high quality text, and approving it on its merits, we are asked to approve a specification with thousands of defects, and by our approval we transform it into something useful to ODF.
Labels: OOXML
Tuesday, March 18, 2008
How many defects remain in OOXML?
So what was the initial quality of OOXML, coming into JTC1? One measure is the defect density, which we can say is at least one defect for every 6045/1027 = 5.9 pages. I say "at least" because this is a lower bound. If we believed that the 5-month review represented a complete review of the text of DIS 29500, by those with relevant subject matter expertise, then we would have some confidence that all, or at least most, defects were detected, reported and repaired. But I don't know anyone who really thinks the 5-month review was sufficient for a technical review of 6,045 pages. Further, we know that Microsoft worked actively to suppress the reporting of defects by NB's. So the actual defect density is potentially quite a bit higher than the reported defect density.
But how much higher? This is the important question. It doesn't matter how many defects were fixed. What matters is how many remain.
There are several approaches to answering this question. One approach is to look at defect "find rates", the number of defects found per unit of time spent reviewing, and fit that to a model, typically an S-curve (sigmoid), and use that model to predict the number of defects remaining. However, we have no time/effort data for the DIS 29500 review, so we don't have enough data to create that model. Another approach is to randomly sample the post-BRM text and statistically estimate the defect density from this sample.
Are there any other good approaches?
Here is the plan. I will use the second approach. Since I do not actually have the post-BRM text, I need to make some adjustments. I'll start with the original text, in particular Part 4, the XML reference section, at 5,220 pages, where the meat of the standard is. I'll then create a spreadsheet and generate 200 random page numbers between 1 and 5,220. For each random page I will review the clause associated with that page and note the technical and editorial errors I find. I will then check these errors to see if any of them were addressed by BRM resolutions.
Based on the above, I will be able to estimate two numbers:
- The defect density of the text, both pre and post BRM
- The fraction of defects which were detected by the Fast Track review.
Clear enough? Microsoft is claiming something like 99% of all issues were resolved at the BRM. So let's see if we get anything close.
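For anyone who wants to reproduce the sampling step without a spreadsheet, here is a minimal sketch in Python. The page count and sample size come from the plan above. The function name `draw_sample` and the seed value are my own illustrative choices, and I am assuming the draw is made with replacement, as a spreadsheet random-number formula would do.

```python
import random

PART4_PAGES = 5220   # page count of Part 4, from the text
SAMPLE_SIZE = 200    # number of random pages to review, from the text


def draw_sample(seed=None):
    """Draw SAMPLE_SIZE page numbers uniformly from 1..PART4_PAGES.

    Assumption: sampling is with replacement, so a page can repeat.
    """
    rng = random.Random(seed)
    return [rng.randint(1, PART4_PAGES) for _ in range(SAMPLE_SIZE)]


# The seed is arbitrary (hypothetical); it is fixed only so the run is repeatable.
pages = draw_sample(seed=2008)
print(pages[:10])
```

Fixing the page list before the review begins matters more than the particular generator used: it keeps the reviewer from steering toward clauses already known to be weak.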
I'm not done with this study yet. I'm finding so many defects that recording them is taking more time than finding them. But since this is topical, I will report what I have found so far, based on the first 25 random pages, or 1/8th completion of my target 200. I've found 64 technical flaws. None of the 64 flaws were addressed by the BRM. Among the defects are some rather serious ones such as:
- Storage of plain text passwords in database connection strings
- Undefined mappings between CSS and DrawingML
- Errors in XML Schema definitions
- Dependencies on proprietary Microsoft Internet Explorer features
- Spreadsheet functions that break with non-Latin characters
- Dependencies on Microsoft OLE method calls
- Numerous undefined terms and features
- Page 692, Section 2.7.3.13 — no errors found
- Page 1457, Section 2.15.3.45 — This is a compatibility setting which creates needless complexity for implementers who now must deal with two different ways of handling a page break, one in which a page break ends the current paragraph, and another where it does not. This is not a general need and expresses only a single vendor’s legacy setting.
- Page 490, Section 2.4.72 — This defines the ST_TblWidth type, used to express the width of a table column, cell spacing, margins, etc. The allowed values of this type express the measurement units to be used: Auto, Twentieths of a point, Nil (no width), Fiftieths of a percent. I find these choices to be capricious and not based on any sound engineering principle. It also mixes units with width values (Nil) and modes (auto). This should be changed to allow measurements in natural units, such as allowed in XSL-FO or CSS2, such as mm, inches, points, pica. Also, do not mix units, values and modes in the same attribute. Nil is best represented by the value 0 and Auto should be its own Boolean attribute.
- Page 328, Section 2.4.17 — The frame attribute description says it “Specifies whether the specified border should be modified to create a frame effect by reversing the border's appearance from the edge nearest the text to the edge furthest from the text.” This is not clear. What does it mean to reverse a border’s appearance? Are we doing color inversions? Flipping along the Y-axis? What exactly? Also a typographical error: “For the right and top borders, this is accomplished by moving the order down and to the right of its original location.” Should be “moving the border down…” Also, it is not stated how far the border should be moved.
- Page 1073, Section 2.14.8 — This feature is described as: “This element specifies the connection string used to reconnect to an external data source. The string within this element's val attribute shall contain the connection string that the hosting application shall pass to a external data source access application to enable the WordprocessingML document to be reconnected to the specified external data source.” Since connection to external data typically requires a user ID and a password, the lack of any security mechanism on this feature is alarming. The example given in the text itself hardcodes a plain-text password in the connection string.
- Page 4387, Section 6.1.2.3 — For the “class” attribute it says “Specifies a reference to the definition of a CSS style.” The example implies that some sort of mapping will occur between CSS attributes and DrawingML. But no such mapping is defined in OOXML. The "doubleclicknotify" attribute implies some sort of event model that is undefined in OOXML. How do you send a message for doubleclicknotify? Why do we describe organization chart layouts here when it is not applicable to a bezier curve? What happens if this shape is declared to be a horizontal rule or bullet or OLE object? The text allows you to label it as one of these, but assigns no meaning or behavior to this. Why do we have an spid as well as an id attribute? The "target" attribute refers to Microsoft-specific I.E. features such as "_media". Although the text says that control points have default values, the schema fragment does not show this.
- Page 3164, Section 4.6.88 — This and the following two elements are all called "To" but this seems to be a naming error. 4.6.89 is essentially undefined. What does "The element specifies the certain attribute of a time node after an animation effect" mean? It doesn't seem to really signify anything. Ditto for 4.6.90.
- Page 5098, Section 7.1.2.124 — The example does not illustrate what the text claims it does. The example doesn't even use the element defined by this clause.
- Page 4492, Section 6.1.2.11 — The "althref" attribute is described as "Defines an alternate reference for an image in Macintosh PICT format". Why is this necessary for only Mac PICT files? Why would "bilevel" necessarily lead to 8 colors? We're well beyond 8-bit color these days. "blacklevel" attribute is defined as "Specifies the image brightness. Default is 0." What is the scale here? This needs to be defined. Is it 0-1.0, 0-255 or what? And what is "image brightness" in terms of the art? Is this luminosity? Opacity? Is this setting the level of the black point? For "cropleft", etc. -- what units are allowed? (implies %) How does "detectmouseclick" work when no event model is defined? "emboss effect" is not defined. "gain" has the same problem as "blacklevel" -- no scale is defined. This element has two different id attributes in two different namespaces, with two different types. "movie" attribute is described as "Specifies a pointer to a movie image. This is a data block that contains a pointer to a pointer to movie data". Excuse me? "A pointer to a pointer to movie data"? This is useless. The "recolortarget" example appears to contradict the description. It shows blue recolored to red, not black. The "src" attribute is said to be a URL, yet is typed to xsd:string. This should be xsd:anyURI.
- Page 1431, Section 2.15.3.30 — no errors noted
- Page 3405, Section 5.1.5.2.7 — The conflict resolution algorithm should be normative, not merely in a note.
- Page 875, Section 2.11.21 — Instead of saying that the footnote "pos" element should be ignored if present at the section level, the schema should be defined so as to not allow it at the section level. In other words, this should be expressed as a syntax constraint.
- Page 1955, Section 3.3.1.20 — This facility for adding "arbitrary" binary data to spreadsheets is said to be for "legacy third-party document components". No documentation or mapping for such legacy components has been provided, so interoperability with this legacy data cannot be achieved. Why isn't this expressed using the extension mechanisms of Part 5 of the DIS?
- Page 4526, Section 6.1.2.13 — The "allowoverlap" attribute is not sufficiently defined. In particular, what determines whether the object shifts to the right or left? ST_BWMode is not adequately defined. For example, one option is "Use light shades of gray only". How light? And what is the difference between "hide" and "undrawn"? Also, the concept of "wrapping polygon" is not sufficiently defined. For example, what is the wrapping polygon for an oval? The purpose of "dgmlayoutmru" is obscure. Wouldn't the most-recently-used layout option be the one which is actually in use, "dgmlayout"? The "dgmnodekind" attribute is undefined, said to be "application-specific". Is interoperability not allowed? The text seems to imply that applications must use application-specific values. The "href" attribute is given a string schema type. Shouldn't this be xsd:anyURI? The "id" attribute is said to be a "unique identifier". Unique in what domain? Among shapes of this type? Among all shapes? All shapes on this page? Among all ID's in the document? The "preferrelative" attribute is not sufficiently defined. Where is the original size stored? After what reformatting? This appears to be a specification for runtime behavior, not a storage artifact. But it is not clear what is required. For the "regroupid", where is the list of these possible id's? The Hyperlink targets _media and _search are Internet Explorer proprietary features.
- Page 1193, Section 2.15.1.39 — no errors noted
- Page 1459, Section 2.15.3.46 — no errors noted
- Page 2671, Section 3.17.7.150 — no errors noted
- Page 2347, Section 3.10.1.69 — An "AutoShow" filter is not defined in this standard, though it is called for in several places of this section. "Average" aggregation function is not defined. In fact, none of these aggregation functions are defined. Although some have common mathematical definitions, in a spreadsheet context it is critical to make an explicit statement on treatment of strings, blanks, empty cells, etc. For dataSourceSort, what type of sort is required? Lexical or locale-sensitive? This element seems to mix field-specific settings, like dragToCol with pivotTable-wide settings like hiddenLevel. This will result in large data redundancy as settings like hiddenLevel are stored multiple times, once for each pivotField. "Inclusive Mode" is not defined. "Measure based filter" is not defined. "AutoSort" mode is not defined. The resolution of pivot table versus cell styles is ambiguous. "If the two formats differ, the cell-level formatting takes precedence." Is this negotiation done at the level of the entire text style? Style ID? Or at the attribute level? "Outline form" is not defined. "server-based page field" is not defined. (what is a page field?) "member caption" is undefined.
- Page 2885, Section 3.18.51 — The values of the given type (ST_OleUpdate) are explicitly tied to the Microsoft Windows OLE2 technology via the two method calls IOleObject::Update or IOleLink::Update.
- Page 3951, Section 5.5.3.4 — The base values "margin" and "edge" are ambiguous. Is it specifying positioning from the left or right page edge?
- Page 2710, Section 3.17.7.200 — The description of "lookup-vector" is insufficient. It seems to be saying that the range should be sorted. Is this really correct? Spreadsheet functions typically do not have side effects. Also, the sorting procedure is explicitly defined only for the Latin alphabet. What about the rest of the allowed Unicode characters, including the C0 control characters which are allowed in SpreadsheetML cell contents? Where are they sorted?
- Page 934, Section 2.13.5.5 — The "id" attribute is required to be unique, but it is not specified over what domain it must be unique.
- Page 607, Section 2.6.2 — What does "reversing the border's appearance" mean? How much offset is required for a shadow?
- Page 201, Section 2.3.2.19 — This feature allows the suppressing of both spell and grammar checking for a text run. These should be two different settings, one for spelling and one for grammar proofing. There are many cases where it is important to check one, but not the other, such as in content composed of sentence fragments, which are not grammatically complete, but where correct spelling is desired.
- Page 1240, Section 2.15.1.74 — This setting specifies that the document should be saved into an undefined invalid XML format. But it is not stated how an XSLT transform can be applied to an OOXML document, since OOXML is a Zip file containing many XML documents. So what exactly is the specified XSLT applied to?
That's as far as I've gone. But this doesn't look good, does it? Not only am I finding numerous errors, these errors appear to be new ones, ones not detected by the NB 5-month review, and as such were not addressed in Geneva. Since I have not come across any error that actually was fixed at the BRM, the current estimate of the defect removal effectiveness of the Fast Track process is < 1/64, or about 1.6%. That is the upper bound. (Confidence interval? I'll need to check on this, but I'm thinking this would be based on the standard error of a proportion, where SE = sqrt(p*(1-p)/N), making our confidence interval 1.6% ± 3%.) Of course, this value will need to be adjusted as my study continues. However, it is starting to look like the Fast Track review was very shallow and that it detected only a small percentage of the errors in the DIS.
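The back-of-the-envelope interval can be checked with a few lines of Python. This is only a sketch under stated assumptions: it plugs in p = 1/64 as the bound (the observed count of fixed defects is actually zero), uses the normal approximation, and assumes a 95% level with the usual 1.96 multiplier. None of those choices is fixed by the study itself, and the normal approximation is known to be rough when p is this close to zero.

```python
import math


def standard_error(p, n):
    """Standard error of a sample proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1.0 - p) / n)


flaws_found = 64   # technical flaws recorded so far, from the text
fixed_bound = 1    # none were fixed at the BRM; 1 stands in for the "< 1/64" bound

p = fixed_bound / flaws_found
se = standard_error(p, flaws_found)
half_width = 1.96 * se   # assumption: normal approximation at the 95% level

print(f"p = {p:.4f}, SE = {se:.4f}, half-width = {half_width:.4f}")
```

The half-width comes out at roughly three percentage points, in line with the ± 3% quoted above. Since the lower end of that interval falls below zero, an exact binomial interval would be the more careful choice once the full sample is in.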
[20 March Update]
As one commenter noted, the page numbers I'm using above are PDF page numbers, not the page numbers at the bottom of each page. If I used the printed pages then I would need to deal with all the Roman numeral front matter pages as an exception. Simpler to just use the one large domain of PDF page numbers.
PDF Page Number = Printed Page Number + 7
I will continue to report new defects, according to the original random number list I generated. I'll update the statistics every 25.
Here's some more for today:
- Page 4192, Section 5.8.2.20 — "fPublished" attribute is defined as "Specifies whether the shape shall be published with the worksheet when sent to the spreadsheet server. This is for use when interfacing with a document server." What worksheet? This section is in the DrawingML reference material. Charts could appear in presentations as well. This should not be limited to worksheets. Also what is a "spreadsheet server"? No such technology has been defined in this standard. Also no protocol has been defined for publishing to a spreadsheet server. Is this some proprietary hook for SharePoint? The "macro" attribute allows the storage of application-defined scripts. We are told that the macro "should be ignored if not understood." However there is no mechanism for determining what language the script is in. How do we know if we understand the macro? Content sniffing? Attempt to execute it and see if we get a runtime error? But by that time, once we find out that we do not understand it, it is too late to ignore the macro. We may have already triggered runtime side effects. What we really need here is some way to declare what scripting language is being used, via a namespace or an additional attribute like "lang".
- Page 3526, Section 5.1.5.4.21 — The "algn" attribute specifies the text alignment. Allowed values include left, right, center, justified, etc. However, what is lacking is "start" and "end" alignment, which are sensitive to writing direction and are part of internationalization best practices, for example, XSL-FO. When translating a document between RTL and LTR systems, the approach used by OOXML will be harder to deal with and more expensive to translate, since the translator will need to manually play with styles and not just perform a semi-automated translation.
[End Update]
I'll continue to review the remaining 173 pages of my random sample and update the numbers and the defect list as I go. If you want to play along at home, the upcoming random page numbers will be:
- 1039
- 4933
- 3334
- 1993
- 1632
- 4787
- 460
- 481
- 4497
- 310
- 282
- 2383
- 1793
- 2451
- 3310
- 3716
- 1261
- 1077
- 2219
- 4236
- 285
- 3090
- 737
- 2370
- 741
- 164
- 5044
- 364
- 2272
- 1377
- 4512
- 1410
- 964
- 5079
- 5030
- 4110
- 3620
- 3588
- 2301
- 3222
- 4485
- 5082
- 193
- 3632
- 985
- 1593
- 5155
- 1054
- 3371
- 3717
- 5015
- 1071
- 2965
- 2294
- 1809
- 161
- 4922
- 5219
- 1719
- 1040
- 4259
- 3134
- 1195
- 4232
- 4444
- 3931
- 2302
- 2788
- 3584
- 8
- 5092
- 2580
- 1080
- 1239
- 1415
- 1170
- 1501
- 151
- 148
- 4754
- 1350
- 3714
- 1895
- 3926
- 4833
- 2886
- 2983
- 1439
- 3622
- 4960
- 2000
- 2555
- 671
- 2388
- 352
- 222
- 1630
- 3033
- 4994
- 3346
- 531
- 2393
- 482
- 207
- 2252
- 4074
- 3302
- 2459
- 751
- 1891
- 1635
- 3120
- 2226
- 1119
- 810
- 1728
- 837
- 4570
- 4474
- 1072
- 3901
- 300
- 4895
- 1764
- 2332
- 619
- 4392
- 2112
- 1653
- 4339
- 2384
- 4566
- 4085
- 1171
- 2238
- 5144
- 1399
- 4157
- 1352
- 27
- 4118
- 4167
- 5046
- 4460
- 4053
- 1258
- 4252
- 922
- 3748
- 1742
- 458
- 4448
- 963
- 2227
- 1404
- 593
- 4140
- 1739
- 1102
- 1611
- 3016
- 2646
- 3083
- 5105
- 747
- 1142
- 2596
- 845
- 626
- 4047
- 1415
- 5143
- 3997
Labels: OOXML
Friday, March 14, 2008
The Disharmony of OOXML
An easy counter-example is HTML. Does HTML reflect the internals of NCSA Mosaic? Does it represent the internals of Netscape Navigator? Firefox? Opera? Safari? Are any faults in HTML properly justified by what a single browser does internally? Applications should follow standards, not the other way around.
The question we should be asking is not whether a standard is similar to an application's internal representation. That in itself is not necessarily a fault. We need to go a step further, and ask if this encoding represents reasonable engineering decisions, not just for that one application, but for general use? Or in ISO terms, does it represent the "consolidated results of science, technology and experience"? If it is a good, reasonable engineering choice with general applicability, and the original application already found that solution, then this is a good thing. We should be encouraging standards to encode the best practices of industry.
Take colors for example. There are only so many ways one can represent colors in markup. You can have an RGB value encoded as triplets where red=(255,0,0). Or you can have a hexadecimal integer encoded as RRGGBB, where red=FF0000. You can do it W3C style, like CSS or XSL-FO with a hash mark in front, like #FF0000. There are variations on this, adding an alpha channel, using a different color model, etc. These are all reasonable engineering choices, and no one would fault a standard for choosing any one of them, even if the choice happens to match what a particular application does. They are all reasonable choices.
The final arbiter is engineering judgment. Making a capricious choice, merely because a particular application made that same choice, in spite of contrary engineering judgment, this would be a bad thing.
With this in mind, let's take a look at how OOXML and ODF represent a staple of document formats: text color and alignment. I created six documents: word processor, spreadsheet and presentation graphics, in OOXML and ODF formats. In each case I entered one simple string "This is red text". In each case I made the word "red" red, and right aligned the entire string. The following table shows the representation of this formatting instruction in OOXML and ODF, for each of the three application types:
| Format | Text Color | Text Alignment |
|---|---|---|
| OOXML Text | <w:color w:val="FF0000"/> | <w:jc w:val="right"/> |
| OOXML Sheet | <color rgb="FFFF0000"/> | <alignment horizontal="right"/> |
| OOXML Presentation | <a:srgbClr val="FF0000"/> | <a:pPr algn="r"/> |
| ODF Text | <style:text-properties fo:color="#FF0000"/> | <style:paragraph-properties fo:text-align="end" /> |
| ODF Sheet | <style:text-properties fo:color="#FF0000"/> | <style:paragraph-properties fo:text-align="end"/> |
| ODF Presentation | <style:text-properties fo:color="#FF0000"/> | <style:paragraph-properties fo:text-align="end"/> |
The results speak for themselves.
What is the engineering justification for this horror? I have no doubt that this accurately reflects the internals of Microsoft Office, and shows how these three applications have been developed by three different, isolated teams. But is this a suitable foundation for an International Standard? Does this represent a reasonable engineering judgment? ODF uses the W3C's XSL-FO vocabulary for text styling, and uses this vocabulary consistently. OOXML's representation, on the other hand, appears incompatible with any deliberate design methodology.
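To make the cost to an implementer concrete, here is a small Python sketch that reads the text color out of each of the six fragments in the table. The fragments are copied from the table. The rule list, the function name `parse_color`, and the regular-expression matching are illustrative choices of mine, not anything taken from either specification. I am also assuming that the leading "FF" in the spreadsheet value is an alpha byte, which the table itself does not say.

```python
import re

# Markup fragments copied from the Text Color column of the table.
FRAGMENTS = {
    "OOXML Text": '<w:color w:val="FF0000"/>',
    "OOXML Sheet": '<color rgb="FFFF0000"/>',
    "OOXML Presentation": '<a:srgbClr val="FF0000"/>',
    "ODF Text": '<style:text-properties fo:color="#FF0000"/>',
    "ODF Sheet": '<style:text-properties fo:color="#FF0000"/>',
    "ODF Presentation": '<style:text-properties fo:color="#FF0000"/>',
}

# One (element, attribute) rule per distinct way of spelling a text color.
COLOR_RULES = [
    ("w:color", "w:val"),
    ("color", "rgb"),
    ("a:srgbClr", "val"),
    ("style:text-properties", "fo:color"),
]


def parse_color(fragment):
    """Return (rule, RRGGBB) for a fragment, trying each rule in turn."""
    for element, attribute in COLOR_RULES:
        pattern = r'<{}\s+{}="([^"]*)"\s*/>'.format(
            re.escape(element), re.escape(attribute))
        match = re.fullmatch(pattern, fragment)
        if match:
            digits = match.group(1).lstrip("#")
            # Assumption: an 8-digit value is AARRGGBB, so keep the last 6 digits.
            return (element, attribute), digits[-6:].upper()
    raise ValueError("no color rule matches: " + fragment)


rules_used = {}
colors = {}
for name, fragment in FRAGMENTS.items():
    rule, rgb = parse_color(fragment)
    family = name.split()[0]              # "OOXML" or "ODF"
    rules_used.setdefault(family, set()).add(rule)
    colors[name] = rgb

for family, rules in rules_used.items():
    print(family, "needs", len(rules), "rule(s)")
```

All six fragments reduce to the same red, but the OOXML side needs three separate rules to get there and the ODF side needs one. The alignment column would need the same treatment again.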
I fear that before we can tackle harmonization of ODF and OOXML, we will first need to harmonize OOXML with itself!
Labels: OOXML
Tuesday, March 11, 2008
Implementation-defined (Not really)
An easy way to find these extension points is to search the OOXML specification for "application-defined" or "implementation-defined". You will find dozens of them, such as:
- In general, scripting
- In general, macros
- In general, DRM
- Part 1 — "Application-Defined File Properties Part" which is totally undefined, but is referenced 13 times for specific fields in Part 4.
- Section 2.16.4.1 — implementation-defined date/time formatting
- Section 2.16.5.34 — implementation-defined document filters
- Section 3.17.2.6 — implementation-defined string-->number conversions in a spreadsheet
- Section 2.8.2.2 — character sets supported by a font
- Section 2.9.6 — the interpretation of the mysterious hex "template code" in numbered list overrides — "The method by which this value is interpreted shall be application-defined."
- Section 2.14.27 — application-defined storage of exclusion data for a mail merge
- Section 2.15.1.28 — application-defined cryptographic hash algorithms
- Section 2.15.1.76 — "Specifies a string identifier which may be used to locate the XSL transform to be applied. The semantics of this attribute are not defined by this Office Open XML Standard - applications may use this information in any application-defined manner to resolve the location of the XSL transform to apply."
- Section 5.6.2.12 — application-defined macro string reference for connection shape
- Section 5.6.2.15 — application-defined macro string reference for graphic frame
- Section 5.6.2.24 — application-defined macro string reference for a picture object
- Section 5.6.2.28 — application-defined macro string reference for a shape
- Section 5.8.2.9 — application-defined macro string reference for a connection shape
- Section 5.8.2.12 — application-defined macro string reference for a graphic frame
- Section 6.2.2.14 — "This element specifies the presence of an ink object. An ink object is a VML object which allows applications to store data for ink annotations in an application-defined format."
- Section 7.6.2.60 — implementation-defined bibliographic citation formats
- And many, many more.
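The search itself is easy to automate. Below is a minimal sketch, assuming you already have the specification as plain text, one string per line; how you extract that text from the PDF is up to you. The function name `find_extension_points` is hypothetical, and the three sample lines are sentences quoted elsewhere in this blog, not a real extract of the specification.

```python
import re

# Matches "application-defined" or "implementation-defined", in any letter case.
TERMS = re.compile(r"\b(application|implementation)-defined\b", re.IGNORECASE)


def find_extension_points(lines):
    """Return (line_number, term, text) for every occurrence of either term."""
    hits = []
    for number, text in enumerate(lines, start=1):
        for match in TERMS.finditer(text):
            hits.append((number, match.group(0).lower(), text.strip()))
    return hits


# Sample input: sentences quoted in this blog, used here only for illustration.
sample = [
    "The method by which this value is interpreted shall be application-defined.",
    "This element specifies the connection string used to reconnect to an external data source.",
    "An ink object is a VML object which allows applications to store data "
    "for ink annotations in an application-defined format.",
]

hits = find_extension_points(sample)
for number, term, text in hits:
    print(number, term)
```

A count of hits alone understates the problem, since one "application-defined" part can be referenced many times from elsewhere in the text.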
behavior, implementation-defined — Unspecified behavior where each implementation documents that behavior, thereby promoting predictability and reproducibility within any given implementation. (This term is sometimes called “application-specific behavior”.)
behavior, locale-specific — Behavior that depends on local conventions of nationality, culture, and language.
behavior, unspecified —Behavior where this Standard imposes no requirements. [Note: To add an extension, an implementer must use the extensibility mechanisms described by this Standard rather than trying to do so by giving meaning to otherwise unspecified behavior. end note]
Note that this is not an entirely novel definition. Anyone who has spent time reading over the C and C++ Programming Language standards, in ANSI or in ISO, will recall a similar set of definitions. For example, these from ISO/IEC 9899:1999 C Programming Language:
implementation-defined behavior
unspecified behavior where each implementation documents how the choice is made
locale-specific behavior
behavior that depends on local conventions of nationality, culture, and language that each implementation documents
unspecified behavior
behavior where this International Standard provides two or more possibilities and imposes no further requirements on which is chosen in any instance
So, you can see that OOXML pretty much copies these definitions. However, ISO standards like ISO/IEC 9899:1999 go one step further and make an additional statement in their conformance clause, something that is distinctly missing from OOXML:
"An implementation shall be accompanied by a document that defines all implementation-defined and locale-specific characteristics and all extensions."
If you poke around you will see that all conformant C compilers indeed do come with a document that defines how their implementation-defined features were implemented. For example, GNU's gcc compiler comes with this document.
So, by failing to include this in their conformance clause, OOXML's use of the term "implementation-defined" is toothless. It just means "We don't want to tell you this information" or "We don't want to interoperate". Conformant applications are not required to actually document how they extend the standard. You can look at Microsoft Office 2007 as a prime example. Where is this documentation that explains how Office 2007 implements these "implementation-defined" features? How is interoperability promoted without this?
(This item not discussed at the BRM for lack of time.)
Labels: OOXML
Contra Durusau, Part 1
From the start Patrick has remained publicly silent on the topic of OOXML. No blog posts, no press, nothing. If you asked, he would say that this was his policy. Privately, you would get an earful (all negative), but as befits the unbiased chair of the committee which is responsible for the technical recommendation for the US NB, he kept his personal opinions out of the public arena.
This public orientation changed recently. As best I can figure it, on returning from a conference in Seattle in late January, Patrick was a changed man. Patrick is now an enthusiastic OOXML supporter and is eager to inform the world of his delight in OOXML at every opportunity. He posts his "open letters" on his web site, which are linked to, often within minutes, by the various Microsoft bloggers, and then sent around by Microsoft employees to the press and the various JTC1 NB's.
Patrick is entitled to his own opinions. Free speech (and free enterprise for that matter) are things which all red-blooded Americans believe in, among other things. So long as Patrick makes it clear that he is speaking for himself, I have no problem with this.
Of course, Microsoft will not be so careful to distinguish Patrick's personal opinions from his professional affiliations. So a post from Patrick's personal web site is retold on a Microsoft blog as "The ODF Editor says....", and then the next day is sent in an email to NB's with a larger set of "endorsements":
Chair, V1 - US TAG to JTC 1/SC 34
Convener, JTC 1/SC 34/WG 3 (Topic Maps)
Editor, OpenDocument Format TC (OASIS), Project Editor ISO/IEC 26300
Co-Editor, ISO/IEC 13250-1, 13250-5 (Topic Maps)
By the time it is actually discussed at the NB committee level, I wouldn't be surprised if it morphs into an assertion that JTC1/SC34, INCITS, the ODF TC and the City Council of Covington, Georgia have all approved OOXML. It is dangerous to wear many hats when dealing with Microsoft. They are not ones for fine distinctions.
But now on to the substance of Patrick's letters.
In his first note, the "OpenXML Poster Child", Patrick says:
OpenXML has progressed from being developed in a closed environment to being handed over to approximately 70% of the world's population for future development so I am missing the "not open" aspect of OpenXML. If anything, the improvements made to OpenXML during that process make it a poster child for the open standards development process.
.
.
.
I understand that SC 34 will be taking on the maintenance and future development of OpenXML (with the participation of Ecma). That will mean that approximately 70% of the world's population will have a say (through their respective national bodies) on how OpenXML continues to develop. I can't speak for anyone other than myself but that sounds pretty open to me. (That presumes approval of OpenXML as an ISO standard, which must be decided by every national body for itself.)
We've covered this before. Let's go down the list again. Where are the minutes from Ecma TC45 teleconferences? Where are the public archives of their mailing list? Where is the list of individuals participating in the TC? Where is the list of voting members? Where are the public comments they have received on OOXML? You call this open?
For ODF, all of this information is easily available to the public, here, here and here.
And don't give me the canard about how moving to SC34 results in greater representation.
In the US who represents our population? The 7 members of V1 before the DIS 29500 process began? Or the 26 members after Microsoft stuffed V1 (the committee that you chair) with business partners last summer? Or V1 after several of them were kicked out for not paying their dues? Or the V1 after the DIS 29500 procedure completes and the warm bodies fade away? In your opinion, which one do you believe truly represents our US population?
Similarly, SC34 was stuffed with new P-members and swelled from 9 P-members in 2006 to 40 today, most of which voted in favor of OOXML and then failed to participate in any other SC activities. Are you seriously suggesting that SC34 was increasing the world's influence over Microsoft's decisions? That sounds quite naive. To me this looks much more like Microsoft is increasing their influence over the world, and JTC1 NB's in particular.
The long list of shenanigans recorded, from Sweden to Portugal, from Poland to Switzerland is further evidence that the second interpretation is the accurate one. Is offering Microsoft partners rewards for joining a committee a way of increasing openness? Is joining JTC1 three days before the Sept. 2nd vote, then voting Yes without comments the way in which the world is able to gain a seat at the table?
Moving on.
Patrick's next post is "Co-Evolution". This, plus Microsoft's recent interoperability announcements (yes, yet more announcements) give the impression that they believe it is better to talk about interoperability than to do something about it. Interoperability is something we only talk about now, but accomplish sometime in the nebulous future, like weight loss or reducing the national debt. Create studies, write reports, open labs, make test cases, write more reports. But when given the opportunity to do something now which would actually improve interoperability, like adding missing features to OOXML to accommodate the richer text model in ODF, then just say "No". You can always do a study on this later, and write another report, and make a test case.
But if announcements alone could improve interoperability, then Microsoft would have solved this problem long ago and many times over.
The perspective that is missing in Patrick's analysis is that of the vast part of the world's population that does not benefit, and in fact is distinctly disadvantaged by having multiple incompatible document standards. We've been here before, in the 1980's and 1990's. It was not fun. We should not be seeking ways to repeat that failure.
Much of the world is also disadvantaged by the monopolist's rent paid on Microsoft products and the associated lack of choice in today's software monoculture. I'd rather help the world free itself of this oppression than appease the oppressor in hopes that he'll wield a more lenient whip.
Last September, the NB's of Great Britain, Brazil, Chile, Colombia, New Zealand, and the United States all requested that specific features be added to OOXML in order to improve interoperability with ISO ODF, in total 40 features such as the ability to have background images in tables or to have font weights beyond “normal” and “bold”. These were the exact features that Microsoft's translator project on SourceForge identified as needed to improve interoperability with ODF. Ecma rejected all of these requests. They did not reject them because the features were unreasonable. They were rejected purely because they were ODF features.
So given the chance to do more than just write reports and have panel discussions, Ecma refused to move interoperability forward even one inch. If this is them on their best behavior (they desperately need NB approval votes), then why would we expect greater consideration from them if OOXML were approved?
In his next letter, "Confusion", Patrick responds to Andy Updegrove, but not having followed that debate, I'm the one who is now confused. Patrick seems to be arguing that it doesn't matter whether OOXML is "good" or not (in fact he seems to argue that there is no "good" or "bad" when it comes to XML) but that it will be better if OOXML were someplace where we could talk at it more.
I don't know whether I'd choose to use moral terms when describing engineering artifacts either, but I would note that if the basic protocols and formats of the web were as poorly designed as OOXML, the web would never have thrived to become the glory it is today.
In "On the Importance of Being Heard" Patrick generously gives us his opinion of the DIS 29500 BRM he did not attend. The argument formally comes down to this:
- Based on published and unpublished reports from the BRM, it appears that "everyone at the table was heard" and "Microsoft was listening to everyone" in a "public and international" forum.
- If we now reject OOXML, we "all lose a seat at the table where the next version of the Office standard is being written".
- If we approve OOXML, even though "rough" then this "gives all of us a seat at the table for the next Office standard".
- Therefore, Patrick recommends approval of DIS 29500.
This argument has several critical flaws.
First, it is inaccurate to call the BRM proceedings "public". Neither the public nor the press was allowed to attend. Security guards were posted at the door to enforce this mandate. JTC1 is a private, Swiss-headquartered NGO, answerable to no one, with no statutory responsibility to the public. Patrick talks about "ordinary users, governments, smaller interests" having a seat at the table. This is a fantasy. I did not see any such representation at the table in Geneva. One in five BRM attendees were Microsoft employees. Over 25% of the 114 people in attendance were either Microsoft or Ecma TC45 members. I fear that Patrick underestimates the extent to which NB's have been stacked over the past two years and that he preserves some illusion of SC34 NB's comprised of "ordinary users, governments, smaller interests". Maybe that was true a few years ago, but the neighborhood has changed.
Was everyone at the table heard? Formally, it is true that every delegation had the opportunity to raise a single issue during the week. Some (those earlier in the alphabet) had the opportunity to raise two issues. But I think it is disingenuous to cast that as "everyone at the table was heard". For many delegations it was true that for every issue they were able to raise, they had 10 or 20 more that they wanted to raise, based on their analysis of Ecma's proposed dispositions, but were unable to because of insufficient time.
Was Microsoft listening? Yes. Everyone in the room was listening. Formally only the BRM itself could authorize changes to the standard at this point, regardless of Microsoft's or Ecma's opinion. So it is moot as to whether Microsoft was attentive. Whether they listened or not has zero impact on the ability of the BRM to make changes.
Patrick also appears to be impressed that this discussion all takes place "at a table where a standard for a future product was being debated by non-Microsoft groups?" What future product? The future product is Office 14 (Office 2009). Microsoft has informed neither JTC1 nor Ecma of what the changes to OOXML will be for Office 14, due out later this year in beta form.
And then we come to the main point of Patrick's argument. Vote "Yes" so we all have a seat at the table. Before we buy into that logic, I suggest we examine other Microsoft/Ecma standards and see how their approval has or has not led to increased participation.
Microsoft has two primary ways to negate broader participation in a standard's maintenance. The first is standards abandonment. Take for example Ecma-234 "Application Programming Interface for Windows". A contemporary observer might have been just as enthusiastic as Patrick is now. Wow! Isn't this great? They are finally opening up and listening to the world! We finally have a seat at the table! I have a feeling that things are going to be better from now on!
Unfortunately, this standard was approved in December 1995 and covers the Windows 3.1 API only. Since Windows 95 shipped in August 1995, this Ecma standard was obsolete on the day it was approved. No revision of the standard was ever issued. Microsoft abandoned it.
Now certainly, there was nothing in principle that prevented the non-Microsoft Ecma members from continuing to maintain Ecma-234, creating errata documents, polishing up the language of the clauses, etc. But they had no effective way of actually evolving the standard when Microsoft withdrew from the process. That is the danger when you approve a single-vendor standard on the false assumption that this leads to openness.
The other way to negate broader participation in standards development is to create technical revisions at a rapid pace, and to create them within Microsoft with little outside participation. Note that this is how OOXML was created in the first place. And this is how Microsoft/Ecma maintains standards like the C# Programming Language. Ask your friends in JTC1/SC22 whether "70% of the world's population" has a "seat at the table" in evolving that standard. Let me know what you hear. I believe you'll hear that there has been negligible WG activity around C# maintenance, and that new revisions are promulgated by Microsoft, rubber stamped by Ecma, and sent on to SC22, canceling the previous standard and replacing it with the new one.
This trick can be very effective whenever the underlying Microsoft product has an update every 2-3 years. If your product revisions are more frequent than the required JTC1 maintenance checkpoints, then you can effectively ignore JTC1. That's how Microsoft/Ecma has played the game in the past.
Note that Office 2007 has been out since late 2006. Office 14 (Office 2009) is due out in beta form this year, with expected release next year. Any bets on whether the file format will require a technical revision to accommodate Office 2009? There is absolutely nothing that prevents Microsoft from submitting a revised file format specification to Ecma, getting a rubber stamp approval and then Fast Tracking it back into JTC1. Since that is how they have treated other Microsoft/Ecma standards, the burden is on those who argue the contrary to support their optimism.
So consider the facts:
- Microsoft has not supported the JTC1 maintenance process with their other Ecma Fast Tracks. There is no broader "seat at the table", no power sharing, no ownership by "70% of the world's population". It is 100% Microsoft.
- Microsoft's current charter in Ecma TC45 explicitly calls for Ecma to own maintenance of OOXML if approved, not SC34.
- Ecma in fact has submitted a proposal [PDF] to SC34 asking for control of OOXML to be handed back to them.
- With their "rejuvenation" of SC34 (from 9 to 40 P-members in 2 years) Microsoft clearly has the votes it would need to force any maintenance regime they desire.
- No one at Microsoft has made an official statement in writing confirming Patrick's vision of future maintenance. In fact their only official statement, the Ecma proposal to SC34 cited above, contradicts what Patrick is suggesting. So why are only 3rd parties speaking so glowingly about the future control of OOXML? Plausible deniability, anyone?
- Ecma changes their TC45 charter to explicitly call for all maintenance activities (corrigenda as well as technical revisions) to be performed in an SC34 WG.
- Ecma explicitly withdraws their submission on DIS 29500 maintenance from the agenda of the Oslo SC34 Plenary and instead submits a proposal asking for future OOXML work to be done in a new WG in SC34, with a non-Microsoft chair.
- Microsoft publicly states that they will hand operational control of OOXML to SC34, not only for maintenance of OOXML 1.0, but also for technical revisions, and that they will support this being done under JTC1 IPR rules, and using the JTC1 process, and that they will implement whatever revisions SC34 develops within 1 year of approval.
In his most recent post, "Russian Peasant" Patrick suggests that the only reason one would vote against OOXML is spite, and that any problems could be fixed in maintenance.
Let's try another analogy. You are shopping for a new TV and you go to your local consumer electronics store and look at the array of television sets lined up. Most come with a warranty. Any defects detected within the maintenance period will be fixed at the manufacturer's expense. This is generally a good thing, having a maintenance period to fix problems that were not evident at purchase time.
So you find the model TV you want, the salesperson rolls out the box and just before you hand over your credit card, you notice a big gash on the side of the box, where a forklift had pierced it. You say, "I can't accept this TV, it has been smashed!". The salesperson says, "Don't worry. No TV is perfect. We can fix this in maintenance. You're fully covered."
Do you hand over your credit card? Of course not. Maintenance periods, with TV's as with standards, are for defects detected after the fact. It is not a replacement for proper inspection, review and approval processes. You expect a TV to work properly at the start.
No standard is perfect. We all know that. But at the time of approval, NB's should be confident that their technical review was sufficient to find all of the important issues, and that these issues have all been fixed in the standard. OOXML should not be approved unless it is suitable now. The maintainers of OOXML will be busy enough fixing other problems that will be found later. We should not willingly approve a defective standard and set up a future maintenance group for failure by front-loading their agenda with defects that we already know about.
Consider: If we do that, then on what grounds can we reject another Fast Track proposal ever again? This slippery argument, "we can fix that in maintenance", can be used for every single proposal that ever comes along. Why even have JTC1 at this point? It would be easier for everyone involved to just hand the "International Standard" stamp over to Ecma and allow them to rubber stamp their own International Standards. This will save the time and expense of engaging hundreds of representatives from 87 JTC1 NB's for a year for a sham review.
My advice is this. Let's turn this train wreck around. Vote No on DIS 29500 and send a clear message that 6,000 page immature standards are not appropriate for JTC1 Fast Track. It showed poor judgment and great disrespect toward JTC1 NB's for Microsoft to send this mess via Fast Track in the first place.
Microsoft has every right to feel that they are late to the game, and risk being left behind for their lack of an open document standard. But they should not expect that they can simply throw money around and remedy their long neglect overnight. And certainly they should not expect JTC1 NB's to do the work for them. Microsoft should work on their specification at the consortium level and get it right first. Once they have something mature, then they should send it along, preferably in smaller parts submitted sequentially. If they are unwilling or unable to fix the specification in Ecma then they could propose it as a new work item in SC34, where they may find some assistance. But if they insist on the standard remaining a single vendor standard, unilaterally controlled to benefit that single vendor, then I wouldn't expect a warm reception in SC34 either.
Labels: OOXML
Thursday, March 06, 2008
JTC1 Improv Comedy Theater
The latest "let's invent a new rule" came at the BRM in Geneva, where a novel approach to tallying meeting votes was surreptitiously foisted on delegations, one which is clearly against the plain text of JTC1 Directives.
The question is how votes should be counted at a Fast Track BRM, where consensus cannot be reached, in this case for lack of time. Specifically, in that final batch-vote on 1027 comments, how should votes be counted? I believe the rules call for positions to be established by the majority of P-members. The leadership of the meeting instead counted both P-members and O-members. In the balance lies the fate of over 100 Ecma proposals which may or may not be included in the final text of the DIS, depending on how this question is resolved.
Let's review the rules, from the current JTC1 Directives (5th Edition, Version 3.0):
First let's start with the overriding rule from section 1.2 "General Provisions":
These Directives shall be complied with in all respects and no deviations can be made without the consent of the Secretaries-General.
Or in plain English — "These are the rules, you can't just make stuff up".
So what is a P-member and an O-member? This is covered in chapter 3 "Membership Categories and Obligations". P-members are defined as:
P-members within JTC 1 shall be NBs that are Member Bodies of ISO or National Committees of IEC, or both. Only one NB per country is eligible for membership in JTC 1. P-members have power of vote and defined duties.
and O-members are defined as:
Any NB that is a Member Body of ISO or National Committee of IEC, or both, may elect to be an O-member within JTC 1. Correspondent members of ISO are also eligible to be O-members of JTC 1. O-members have no power of vote, but have options to attend meetings, make contributions and receive documents.
So clear enough? O-members can attend meetings and contribute, but cannot vote. P-members can vote at meetings.
Section 9 deals with the voting rules, and 9.1.4 speaks about meeting votes in particular:
In a meeting, except as otherwise specified in these directives, questions are decided by a majority of the votes cast at the meeting by P-members expressing either approval or disapproval.
So, in a meeting, only P-members vote and they vote by majority. "Except as otherwise specified in these directives" means that this rule can be overridden in specific cases. But the override must be "specified", i.e., actually written down that it is an override of the normal meeting voting rules.
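The 9.1.4 rule quoted above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `tally_9_1_4`, the delegation names and the votes are all hypothetical and are not actual BRM results, and treating a tie as "not carried" is my reading rather than something the quoted text spells out.

```python
from collections import Counter


def tally_9_1_4(votes, p_members):
    """Count a meeting vote as 9.1.4 describes it.

    Only votes cast by P-members that express approval or disapproval
    are counted; abstentions and O-member votes are ignored.
    Returns (approve, disapprove, carried).
    """
    counts = Counter(
        vote
        for nb, vote in votes.items()
        if nb in p_members and vote in ("approve", "disapprove")
    )
    approve, disapprove = counts["approve"], counts["disapprove"]
    # Assumption: a tie is not a majority, so the question is not carried.
    return approve, disapprove, approve > disapprove


# Hypothetical delegations A-F; E and F are O-members.
p_members = {"A", "B", "C", "D"}
votes = {
    "A": "approve",
    "B": "disapprove",
    "C": "disapprove",
    "D": "abstain",
    "E": "approve",
    "F": "approve",
}

print(tally_9_1_4(votes, p_members))      # P-members only: (1, 2, False)
print(tally_9_1_4(votes, set(votes)))     # everyone counted: (3, 2, True)
```

With these made-up votes the outcome flips depending on whether O-members are counted, which is exactly why the choice of rule matters.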
So drilling down a level deeper we come to the Fast Track rules themselves in chapter 13, where 13.8 covers meeting votes at a Fast Track BRM:
At the ballot resolution group meeting, decisions should be reached preferably by consensus. If a vote is unavoidable the vote of the NBs will be taken according to normal JTC 1 procedures.
So on the surface this seems to be a vague statement. What are "normal JTC 1 procedures"? However, a moment's reflection on 9.1.4 above shows that the Directives have already declared this as the normal procedure for meeting votes by saying that this is the rule that holds unless specified otherwise.
One can easily seek confirmation of this by looking at the parallel rules for PAS process BRM votes, given in 14.4.3.9. Here it is more explicit:
At the ballot resolution group meeting, decisions should be reached preferably by consensus. If a vote is unavoidable, the approval criteria in the subclause 9.1.4 is applied.
So despite the clear and plain text of the Directives, the JTC1 leadership decided to improvise a new rule, or more precisely the application of a different rule in the wrong context. The argument appears to be that section 9.5 applies to BRM votes. Section 9.5 "Combined Voting Procedure" is introduced as:
The voting procedure which uses simultaneous voting (one vote per country) by the P members fo [sic] JTC 1 and by all ISO member bodies and IEC national committees on a letter ballot is called the combined voting procedure. This procedure shall be used on FDISs, DISs, FDAMs, DAMs and FDISPs.
This is absurd. JTC1 Directives are not a menu. You can't just pick what voting procedure you want to use from the list. The Directives tell you what procedure to use. First, the combined voting procedure is for letter ballots given to an NB, not for a BRM meeting vote by a delegation. Second, the BRM was not voting on an FDIS, DIS, FDAM, DAM or FDISP. We were voting on whether to include changes into a set of meeting resolutions. We were told repeatedly that the BRM could not take a position on the DIS. Finally, if combined voting procedures are read as applying to Fast Track, then they would also, by that same logic, need to apply equally to PAS, since both PAS and Fast Track are DIS's. But as shown earlier, the PAS process explicitly calls for P-member majority voting according to 9.1.4.
One does not arrive at the voting rules of 9.5 by any straightforward or natural reading of the Directives.
So again, repeating from JTC1 Directives 1.2:
These Directives shall be complied with in all respects and no deviations can be made without the consent of the Secretaries-General.
I wasn't in favor of having any batch ballot, because it violates the spirit of the consensus process, as defined in JTC1 Directives 1.2:
These Directives are inspired by the principle that the objective in the development of International Standards should be the achievement of consensus between those concerned rather than a decision based on counting votes.
[Note: Consensus is defined as general agreement, characterised by the absence of sustained opposition to substantial issues by any important part of the concerned interests and by a process that involves seeking to take into account the views of all parties concerned and to reconcile any conflicting arguments. Consensus need not imply unanimity. (ISO/IEC Guide 2:1996)]
To resort to "counting votes" on the vast majority of the technical issues of DIS 29500, without discussion or opportunity for objection, is a failure of the JTC1 process. But if we are to have a vote at all, then let it be done in accordance with the rules.
So, let's stop the nonsense. Let's quit the tortuous post facto reinterpretation of the rules. Let's recount and republish the results of the BRM counted according to the Directives and move on with the process. If JTC1 cannot consistently adhere to its own rules, then it should consider another line of business.
Tuesday, March 04, 2008
OOXML, Macros and Security
Such scripting capabilities are essential for the creation of high-value scripted documents. These features are essential in modern applications. Almost every word processor or spreadsheet today has automation capabilities. Even open source applications like OpenOffice have macro features. So, considering the popularity and value of scripting in a productivity application, it is much lamented that DIS 29500 does not define how scripts or macros are to work. This lack will cause serious interoperability concerns, as each vendor, lacking standards guidance, will implement these features in incompatible ways.
Specifically, in order to have any interoperability among scripted documents, it is necessary to define:
- How and where a script is stored and located within the Open Packaging Convention (OPC) container file.
- How is the script bound to the document. In other words, how does the document content associate itself with the macro?
- What is the runtime language of the script?
- What are the core and extension API's available to the script?
- What is the security model?
Note that there is ample precedent for a markup standard answering these questions in a flexible and interoperable manner. For example the common web paradigm would be:
- Script is located via URL specified in a "src" attribute of a script element, or is given inline
- The script is invoked by a function call at a particular point in the document, or triggered from a standard event such as onLoad().
- Multiple runtime languages are supported, often EcmaScript
- The API's allowed are defined by the W3C's DOM API
- There is a defined security model to deal with hazards such as cross-frame scripting, etc.
Note however that scripting is not without its problems. We all remember the Word Macro Viruses of several years ago, such as Melissa. Portable code has well-known risks, and these risks have well-known counter-measures. For example, it is common for anti-virus software to scan Word documents for viruses. It is also common for mail servers to scan incoming emails for attachments with viruses, and even remove the macros or block documents with macros, according to admin policy. So there is a need to enable 3rd party applications that can locate, retrieve, scan and delete scripting elements from documents. However, since OOXML does not define even where the scripts are stored, or how they can be located, such 3rd party applications cannot be written in general for a document described by this specification. The standard provides an insufficient foundation for implementing a reasonable security policy around OOXML documents.
For example, take Ecma Response 101, approved in Geneva in a 9-4 vote as part of a large batch of 1027 changes, without discussion or opportunity for dissent. Four NB's, in their ballot comments from last September, pointed out that Section 2.16.5.41 of DIS 29500's Part 4 defines a "MACROBUTTON" field that allows the definition of a button in the document that will trigger a macro. But nothing is said about how the macro is stored, bound, what API's are available, what the security model is, etc.
The request from one NB was to "Describe this feature to a level where cross-platform, cross-application interoperability is possible." However, what Ecma provided in their draft Disposition of Comments report, approved in batch by the BRM without discussion or opportunity for objection, was something quite different. They merely added the following text:
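To make the contrast concrete, here is a minimal sketch of the kind of 3rd party scanner that the web paradigm makes possible, precisely because HTML defines where scripts live. It uses Python's standard `html.parser` module; the class name `ScriptLocator`, the helper `locate_scripts` and the sample markup are all made up for illustration. It only looks at script elements and deliberately ignores event-handler attributes such as onload, so it is nowhere near a complete security tool.

```python
from html.parser import HTMLParser


class ScriptLocator(HTMLParser):
    """Collect external script URLs and inline script bodies from HTML."""

    def __init__(self):
        super().__init__()
        self.external = []       # values of src attributes on script elements
        self.inline = []         # text found inside script elements
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src")
            if src is not None:
                self.external.append(src)
            self._in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Only keep text that appears inside a script element.
        if self._in_script and data.strip():
            self.inline.append(data.strip())


def locate_scripts(html_text):
    """Return (external_urls, inline_bodies) for the given HTML text."""
    parser = ScriptLocator()
    parser.feed(html_text)
    parser.close()
    return parser.external, parser.inline


# Hypothetical sample document.
sample = (
    '<html><head><script src="lib.js"></script></head>'
    '<body onload="init()"><script>function init() {}</script>'
    "<p>Hello</p></body></html>"
)

external, inline = locate_scripts(sample)
print("external:", external)
print("inline:", inline)
```

No comparable scanner can be written for a MACROBUTTON macro, because the specification does not say where the macro lives.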
The mechanism by which the command specified by text in field-argument-1 is located and/or executed by an application is implementation-defined
So not only is it impossible to have cross-platform interoperability of this feature, it is not even possible to implement a reasonable security policy to detect, scan or block macros. Even the location of the macro is outside the scope of the standard. It could be just another file in the Zip. It could be a binary blob with an obscure content type that varies from application to application. It could be Base64-encoded in the XML. Or it could be steganographically encoded in low-order bits of an image file. The OOXML standard is singularly unhelpful in telling us how to deal with the risks of this macro function.
Finally, note that this lack of information on how to locate macros within a document makes it impossible for anyone to programmatically combine or divide OOXML documents which may contain macros. For example, imagine a 2-page spreadsheet, with a macro on sheet one only. How can it be split into two one-page documents, if there is no defined way to locate the script associated with page one? This is the type of automated composition and document manipulation that OOXML should be enabling. Similarly, how can one combine two single documents containing macros into one document, if there are no defined rules for locating and naming macros? Many basic types of applications, such as merging slide shows, etc., will break in the presence of macros.
The above topic was of interest to several NB's in Geneva, but could not be discussed for lack of time at the BRM.
Labels: OOXML
The Carolino Effect
Pedro Carolino wanted to write and publish a Portuguese/English phrase book.
"Another time there was plenty some black beasts and thin game, but the poachers have killed almost all."
But one small problem — Carolino did not know English.
"Look a hare who run! let do him to pursue for the hounds! it go one's self in the plonghed land."
Undeterred, Carolino hatched a clever plan.
"Here that it rouse. let aim it! let make fire him!"
He had a copy of a Portuguese/French phrasebook, O Novo guia da conversação em francês e português by José da Fonseca. And he had a French/English dictionary.
"I have put down killed."
With these two resources, writing his phrase book would be easy. Or so he thought.
"Me, i have failed it; my gun have miss fixe."
Starting from the French half of the text in da Fonseca's book, Carolino dutifully used his dictionary to translate, word-for-word, the French into English.
The result, O Novo Guia da Conversação, em Português e Inglês, em Duas Partes was published in Paris in 1855, and is now considered to be a classic of unintentional humor.
"Here certainly a very good hunting."
A similar problem occurs in DIS 29500 "Office Open XML". The scope of OOXML, as amended by the BRM is stated as:
This International Standard defines a set of XML vocabularies for representing word-processing documents, spreadsheets and presentations. The goal of this standard is, on the one hand, to represent faithfully the existing corpus of word-processing documents, spreadsheets and presentations that have been produced by Microsoft Office applications (from Microsoft Office 97 to Microsoft Office 2008 inclusive). It also specifies requirements for Office Open XML consumers and producers , and on the other hand, to facilitate extensibility and interoperability by enabling implementations by multiple vendors and on multiple platforms.
Faithful representation of Microsoft Office 97-2008. I've learned it is rarely polite to ask a man what he means by "faithful", but let me make an exception here. We have now the binary Office format specifications, not part of the standard, but posted by Microsoft. And we have the OOXML specification. In what way does the OOXML "represent faithfully" the "existing corpus" of legacy documents?
Does OOXML tell you how to translate a binary document into OOXML? No. Does it tell you how to map the features of legacy documents in OOXML? No. Does it give an implementor any guidance whatsoever on how to "represent faithfully" legacy documents? No. So it is both odd and unsatisfactory that the primary goal of the OOXML standard is so tenuously supported by its text.
Now, certainly, someone using the binary formats specifications, and using the OOXML specification, could string them together and attempt a translation, but the results will not be consistent or satisfactory. It is the Carolino Effect. Knowing the two endpoints is not the same as knowing how to correctly map between them. A faithful mapping requires knowledge not only of the two vocabularies, but also the interactions.
Also, having the two specifications does not help with the 77 features in OOXML which are declared to be "implementation-defined" or "application-defined". How are these translated from the binary formats?
Note that DIS 29500 bears the obvious marks of its legacy roots, from the use of VML and non-hierarchical run structures in WordProcessingML, to bit fields and idiosyncratic leap year calculations in SpreadsheetML. This suggests the likelihood that the authors of this standard did not just sit down and design the standard from scratch, but that they in fact had access to the binary format specification and mapped it into XML as a preparatory step. It is difficult to explain the presence of elements such as "lineWrapLikeWord6" without positing the presence of such a mapping.
Microsoft should simply publish this mapping. Without such a mapping, conversions will be inconsistent, interoperability will suffer and a primary goal of the standard will not be met. Given the same binary document, Microsoft Office, Apple iWork, OpenOffice.org, etc., will all produce different OOXML documents. How is this "faithfully representing" existing documents? What is needed is a canonical mapping.
Note that the initiation of an open source project to develop a converter between the binary formats and OOXML is insufficient. What is required is a canonical mapping. Otherwise we are faced with the reality that the true goal of OOXML is more accurately stated as:
To allow Microsoft the ability to represent their legacy documents in XML and pretend that it is a capability that other vendors can practice as well.
Though this issue was of great interest to several NB's, it was not able to be raised at the BRM for lack of time.
Labels: OOXML
Sunday, March 02, 2008
The Art of Being Mugged
I ordered a chicken stir-fry sub with American cheese, large, to go, and headed off home. As I walked down one deserted street, I heard footsteps rustling in the leaves. Turning my head, I saw a figure 50 yards away walking in my direction. Nothing to worry about. The city is full of people. Then a few seconds later, I heard the person running in my direction. Nothing to worry about. I was near a bus stop, and no doubt he was rushing to catch his bus.
But there was no bus. Before I realized what was happening, he had come upon me from behind, knocked me to the wall, and with his left arm around my neck, pressed a blade against my throat with his right hand.
"Don't turn around or I will plunge this knife into your chest cavity," he said.
Many things enter one's mind in such a situation. I'd like to say that my thoughts went immediately to self-preservation, or clever plans for escape, but in all honesty, my first thought was on how awkwardly his demand was phrased, how clichéd it was. "Plunge into my chest cavity"? I wanted a rewrite.
But a look down at the blade reminded me that this was not a prop, and that however inelegant his words, this gentleman and I had a business transaction to complete.
"Gimme your wallet," was his request, as I had expected.
"I don't have a wallet," I replied.
And thus began a protracted negotiation session, where I offered my $3 and change, and even my sandwich, and this was repeatedly refused as being inadequate. We reviewed the alternatives. He suggested an ATM. I said the nearest one was over a mile away. He said I could drive him. I said I didn't have a car.
And so we went on, trying to find a mutually acceptable solution. When he finally demanded that we go to my apartment and I write him a check, I knew I had to take charge. For the first time that evening I was really scared. This needed to end now, on the street, while the mugger was still relatively calm. I was willing to fight and risk the blade before I would let him into my apartment, where I knew my odds would be greatly diminished. But was there any other solution?
Then it hit me. What if I put him in a context he was familiar with? If he is going to give me clichéd lines, why don't I give him the same? We'll work this out according to the book.
"This ends here. I'm not moving an inch more. I don't have a wallet, and if you don't believe me, then search me!," I said in a louder voice that startled him a bit.
I threw my sandwich on the ground, faced the building, dramatically placed my hands spread above my head on the wall, and spread my legs, just like in the movies. He quickly got the idea, patted me down and satisfied himself that I was not carrying a wallet.
His final words were: "Don't try to follow me." He then tossed $2 of my $3 back to me and ran off, saying "I only need money for the bus".
I then walked, a bit faster than usual, back to my apartment, locked and bolted the door, called the police and started eating my sandwich, with cold, shaking hands.
I learned two important lessons from that night. First, I now always carry enough money to satisfy a mugger, $40-$50. Carrying too little money is as risky as carrying too much.
The second lesson... Well, I'll get to that later, after a few words about the DIS 29500 BRM.

I'm just back from Geneva, where delegations from 32 National Bodies (NB's), plus Ecma, met for five days at the CICG. Present were 104 delegates in a large, double room with tables four deep, in two sections arrayed in a chevron. A microphone was placed between each two delegates. The delegations were generally arranged in alphabetical order by the English names of their countries.
At the front of the room was a table with the meeting officials, including the SC34 Chair and Secretariat, ITTF representatives (one of their responsibilities is to supervise "the application of the ISO and IEC Statutes and Rules of Procedure"), Ecma's Project Editor and his assistant, and in the center (or centre) was the BRM Convenor, Alex Brown.
The agenda was essentially improvised based on NB interests. Each NB, called in alphabetical order, was invited to raise an item for discussion, from one of the 3,522 NB comments from the failed Sept 2nd ballot, or from one of the 1,027 Ecma responses. A quick calculation shows that a 5-day meeting in session 6.5 hours a day (9-5, with one hour for lunch and 15-minute breaks mid-morning and mid-afternoon) will have at most 1,950 minutes of meeting time, or less than 2 minutes per Ecma response. This should have raised warning signals. More than warning signals, it should have triggered action. But not in JTC1. Fast Tracking a 6,045 page Ecma standard. No problem. Processing 1,027 Ecma proposals in one week. No problem.
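The time budget above can be checked with a few lines of Python. This is only a sketch of the arithmetic in the paragraph: the constant and function names are made up for illustration, and the schedule (9 to 5, one hour for lunch, two 15-minute breaks) is taken directly from the figures quoted above.

```python
# Rough time budget for the BRM, using only the figures quoted in the post.
# All names here are illustrative, not part of any official procedure.
SESSION_START_HOUR = 9      # 9am
SESSION_END_HOUR = 17       # 5pm
LUNCH_MINUTES = 60
BREAK_MINUTES = 15
BREAKS_PER_DAY = 2          # mid-morning and mid-afternoon
MEETING_DAYS = 5
ECMA_RESPONSES = 1027


def meeting_minutes(days: int = MEETING_DAYS) -> int:
    """Total minutes in session over the whole meeting."""
    per_day = (SESSION_END_HOUR - SESSION_START_HOUR) * 60
    per_day -= LUNCH_MINUTES + BREAKS_PER_DAY * BREAK_MINUTES
    return days * per_day


def minutes_per_response(responses: int = ECMA_RESPONSES) -> float:
    """Average minutes available per Ecma response, with no overhead at all."""
    return meeting_minutes() / responses


if __name__ == "__main__":
    print(f"{meeting_minutes()} minutes in session")
    print(f"{minutes_per_response():.2f} minutes per Ecma response")
```

Even this figure is an upper bound, since it assumes every minute in session is spent on Ecma responses and none on procedure, reports, or voting.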
"Forward, the Light Brigade!"
Was there a man dismay'd?
Not tho' the soldier knew
Someone had blunder'd:
Their's not to make reply,
Their's not to reason why,
Their's but to do and die:
Into the valley of Death
Rode the six hundred.
— Tennyson
So with a bit of derring-do, we plunged into the work on Monday morning. We had a delightful mix of personalities. We had world-renowned experts in XML technologies (I won't mention any by name for fear of giving offense by leaving one out), experts in JTC1 process, experts in accessibility, RTL writing conventions, computer security, etc. We also had many people who had never attended an ISO meeting before. Individual delegates came from a mix of backgrounds, some government bureaucrats, some standards professionals, independent XML consultants, academics and employees of small and large corporations. The presence of open source and open standards advocates was notable, and I'll write more on the significance of that another day.
Monday went well. There was a little commotion and several objections when it was announced that detailed minutes would not be recorded. And one delegate did protest that the Head of Delegation for his country was improperly determined. But we plowed ahead and started hearing from delegations by 9:30am, starting in alphabetical order with Australia.
One NB, in lieu of a technical comment, raised the objection that the DIS was too long and inappropriate for Fast Track. The response from ITTF was that we should do a "best effort" in the time available this week. If this is not sufficient, NB's should change their vote to No, or maintain their No vote.
By lunch time we had made it to India. And what had we resolved? Ten substantive technical issues were raised by NB's. One was resolved (add Ecma's OOXML accessibility report as an informative annex) and 9 issues were taken off-line for further discussion.
Six more issues were brought up in the afternoon, as we made it through Malaysia. These items were discussed, and all taken off-line for further discussion. So net for Monday is 15 issues raised, and one resolved. Many of us had homework to do.
Tuesday and Wednesday proceeded much the same, but with time given in the morning and upon returning from lunch to update the BRM on the progress of the "off-line" discussions. But we were still clearly in the accumulation stage. Issues were piling up faster than they were being resolved.
By Tuesday at 12:11 we had made it once through the alphabet, so every delegation had the opportunity to raise, though not necessarily resolve, a single issue of importance to them. We started again with Australia and ended the day with Ireland. The BRM never completed the second pass through the delegations. As the days progressed, more and more time was given to reports from the off-line discussions and trying to gain consensus on those issues. Less time was given to raising new issues.
Given the constraints of the meeting, this was the appropriate thing to do. But the net result was that New Zealand had the last opportunity to raise new issues, on Thursday at 4:34pm. The US, being last in the alphabet, was able to raise only a single issue during the week.
As we reached mid week, we were presented with a set of ballot choices, to deal with the responses that could not be discussed during the meeting, for lack of time, which would at the current rate of processing amount to 800-900 of the 1,027.
As any real estate agent will tell you after a few drinks, the trick to selling a house is to make the buyer think they have made a wise decision. To do that, first show them a few overpriced, dilapidated houses, and then show them the house you want them to buy. A similar approach was used on the BRM.
The four options presented were:
- Option 1: Submitter's responses (Ecma's) are all automatically approved.
- Option 2: Anything not discussed is not approved.
- Option 3: Neutral third-party (ITTF) decides which Ecma responses are accepted.
- Option 4: Voting (approve + disapprove) must be at least 9 votes. Abstentions not counted.
We break for lunch.
After lunch and after more discussion, the meeting adopted a variation of option 4, by removing the vote minimum. I believe in this vote the BRM and ITTF exceeded their authority and violated the consensus principles described in the JTC1 Directives.
Consider 1.2 of the Directives, the General Provisions:
These Directives are inspired by the principle that the objective in the development of International Standards should be the achievement of consensus between those concerned rather than a decision based on counting votes.
[Note: Consensus is defined as general agreement, characterised by the absence of sustained opposition to substantial issues by any important part of the concerned interests and by a process that involves seeking to take into account the views of all parties concerned and to reconcile any conflicting arguments. Consensus need not imply unanimity. (ISO/IEC Guide 2:1996)]
However, 80%+ of the resolutions of the BRM were resolved by a ballot, without discussion, without taking into account any dissenting views, without reconciling any arguments. Indeed, there was not any opportunity to even raise an objection to an issue decided by the ballot. Many of the issues were decided in 6-5 or 7-6 split votes, with no discussion. How can that be said to be a consensus? This is an utter failure to follow the cardinal principles of JTC1 process.
Note that votes are explicitly allowed at a BRM, though they are discouraged. Section 13.8 of the Directives states:
At the ballot resolution group meeting, decisions should be reached preferably by consensus. If a vote is unavoidable the vote of the NBs will be taken according to normal JTC 1 procedures.
What are normal procedures? Section 9.1.4 says:
In a meeting, except as otherwise specified in these directives, questions are decided by a majority of the votes cast at the meeting by P-members expressing either approval or disapproval.
(Somehow this got confused and O-member votes were included in the initial reported totals, but I assume this will get fixed and the results restated.)
The key observation is that there are a number of solitary experts at the BRM, those who bring a perspective or expertise that cannot be matched by anyone else at the BRM. For example, the Israeli delegation speaks with authority on matters of Hebrew writing, counting and calendaring systems. In fact, they made a number of valuable contributions to the meeting in these areas. Similarly, other delegations or specific delegates bring their own unique expertise to the table. If an issue is brought up, and the question is asked whether Ecma's resolution is satisfactory, if Israel objects, their objection should be heard. Period. It doesn't mean we necessarily will agree with, or adopt, their suggested change. But they should be heard. This is what is meant by "seeking to take into account the views of all parties concerned and to reconcile any conflicting arguments".
The majority has the right to have its will enacted. That is the easy part. The responsibility of the Convenor, however, is to ensure that the minority has the right to be heard, and the equal opportunity to make their case. That is the price of consensus. This is the price of open standards.
This point was demonstrated when one set of comments was being approved. The list of response numbers was projected on the screen. The Convenor asked if there were any objections to approving this batch of comments, which were said to be all related. Three hands went up, objecting, from three delegations, including the US. "So resolved" was the response, without asking the nature of the objections. Certainly, if the only thing that mattered was majority rule and efficiency, and if JTC1 did not have explicit consensus goals, then this might be the correct thing to do. But how, in practice, can one determine the "absence of sustained opposition to substantial issues by any important part of the concerned interests" without hearing what the objections are?
In this case the US Head of Delegation continued to gesticulate and was eventually recognized to speak. We stated that we observed that the list of response numbers in this resolution was in error and included a response number that should not have been part of this bunch. Once pointed out, the proposer of the resolution agreed, and the US change was adopted without dissent.
This is the kind of thing you lose with an electronic ballot. There is no way to object, no way to discuss, no way to amend or to substitute language. It is not a consensus process. It is just a Hobson's Choice -- take it or leave it, and the BRM suffers for it, as the value of solitary expertise in the room is neutralized by the NB's who voted blanket approval for all of Ecma's proposals.
As the meeting progressed into Thursday, the tension mounted. As new issues were identified, they were taken off-line and we were told they could be brought up "Friday morning". But no one really believed that. It was clear that there was not enough "Friday morning" to go around.
Thursday 9:20am, a delegation objects that they were told only to review Ecma's responses to their own comments, and that there was never sufficient time to review all 1,000 Ecma responses since January 14th. ITTF's response: "Nothing we can do about it in the rules -- Nothing we could have done in our judgment".
2:18pm the Convenor announces "This is zero hour".
There is clearly not even enough time to fully discuss in the meeting the resolution of items that were taken off-line for further discussion. The US is not allowed to present our multi-part proposals to the meeting. We are told to get consensus outside of the meeting first, so it can be brought up for quick approval.
Into Friday the BRM spirals further downwards. The issue is not now that NB's cannot raise new issues. The problem is now that NB's who have been diligently working on issues off-line with other delegations, meeting over lunch, or early in the morning or into the evening, may not be able to have their proposals heard and acted on.
There simply is not enough time. The anxiety-driven, frantic delegates push even harder. More resolutions are approved with 2 or 3 delegations trying to raise objections, but without being recognized. Tempers grow short. One highly respected Head of Delegation, of unimpeachable reputation and experience started to voice an objection "I am extremely disgusted by the way procedures have been..." before being called out of order by the Convenor, saying that discussion of procedural issues will not be allowed. Another delegation tries to raise a new issue, as they had for the last two days without luck. "We're using the public money from NNN to come here to speak on our issue. Can we speak on our issue?" Convenor – "We have run out of time."
And so the BRM came to an end, with the announcement of the results of the paper ballot. Five delegations gave default approval to the Ecma comments (Chile, Cote D'Ivoire, Czech Republic, Finland, Norway) and four gave default disapproval positions on the undiscussed Ecma responses (India, Malaysia, South Africa, United States). Most delegations gave a default abstain position, or registered no position. A delegation could additionally override their default position on any particular issue. The net is that, although the discussions on Monday and Tuesday demonstrated that the quality of the Ecma responses was such that almost every one required substantial off-line work to make it acceptable, we gradually lowered our standards, so that by week's end, we approved 800+ comments without any discussion, even in the presence of clear objections.
I want to make it clear that I in no way wish to criticize the Convenor. I think Alex did a remarkable job in trying to carry out his duties and be fair in this no-win situation. He was given an impossible task and had to find out how to fail in the least offensive way. There is an art to crash-landing an airplane and we must acknowledge that.
There is an art to being mugged successfully as well and I have learned that lesson. But when someone tells you, "Your money or your life", although one is better off having that choice than not having it, one can still complain vigorously at the injustice of having a choice forced onto you. It is an artificial constraint, determined entirely without your consultation, and without considering your welfare, that limits you to those two choices. Similarly, the choices given at the BRM were arbitrary, artificial, and not to the benefit of JTC1, NB's or to users and implementors of DIS 29500.
As the meeting concluded, ITTF requested that we not call the vote a "default" vote. "These were your choices, voted according to the rules you adopted," we were told. I reject this revisionist portrayal of the events. This was not my choice. This was merely the least bad of several bad choices that the ITTF deigned to allow us at the end of a grueling week trying to resolve 3,522 issues in a bloated, technically immature proposal that has been mismanaged from the start.
"Your money or your life". Not a day goes by that I am not appreciative I was given that choice. But that decision was made under duress, and does not diminish my revulsion of the person who forced that choice upon me. The question is, why do NB's put up with this OOXML Fast Track nonsense? Who is holding a knife to our throats?
Labels: OOXML
Thursday, February 21, 2008
Legacy Inflation
Well, to be fair, I haven't quite caught up yet.
But I am reminded of this when I hear Microsoft's claims about "legacy document compatibility". At first they used the term "legacy documents" to refer to the masses of existing binary documents, these "exobytes" of documents in Office binary formats. The argument seemed to be, that since Microsoft Word 95 had a bug, therefore Apple iWork 08 must also have this same bug when using OOXML format. This form of argument is used to defend all manner of defects in OOXML.
But in recent weeks, the argument has morphed. The legacy era is catching up with us. Microsoft's unwillingness to fix errors in OOXML is now being defended because the fixes (Microsoft claims) would break compatibility with Ecma-376. In other words, Office 2007 files are now part of this large legacy that must be preserved. I can only call this "legacy inflation".
First, note that Microsoft shipped Office 2007 with support for OOXML as the default, and this was entirely their choice. Beta versions of Office 2007 did not have OOXML as the default. If Microsoft had left the binary formats as the default, it would have been far easier for their customers. They could have waited for the Mac Office to support OOXML, Mobile Office, developer tools, etc., and then have a coordinated rollout of the new format, rather than dump it on an unprepared world. They could have also waited for standards approval for OOXML, and waited for the standard to stop changing before forcing it on their customers. But they didn't do that. They took the approach that caused maximum disruption for their customers. And now that Office 2007 is in use, Microsoft wants ISO to bail them out, and not make any changes that would result in even a single attribute in OOXML differing from Ecma-376.
We see similar brinkmanship in wireless networking protocols where chip manufacturers rush to be the first to ship support for "draft" standards like 802.11n, build up an inventory of chips, and then lobby to ensure that the draft does not change, so they can cement their first mover advantage. This does not benefit the consumer, this does not benefit the standard, this does not benefit interoperability. It is all about maneuvering for market advantage. We should not be encouraging or supporting this.
It is interesting to note that in the wifi world, any company that plays this game with draft standards takes a big risk. They may win, or they may lose. It is a gamble. Only a monopolist would assume that they can play this game risk free. Microsoft does not face the same market risks that others would face for making a bad decision.
In any case, the argument that DIS 29500 must remain identical to Ecma-376 is technically deficient. Consider: Ecma-376 is not identical to the binary formats, but Microsoft Office can still read both. That is because Office can tell these files apart and call different code to parse the two different formats. Similarly, if OOXML diverges from Ecma-376, Microsoft can tell, with 100% certainty, which documents were created in Ecma-376 format versus which ones were made according to the ISO version of the standard.
The key is that all OOXML documents describe the application that created them, as well as a detailed version number. These are described in DIS 29500, Part 4:
7.2.2.1 Application (Application Name)
This element specifies the name of the application that created this document.
7.2.2.2 AppVersion (Application Version)
to differentiate between different versions of the same producer
If we look at three Office 2007 documents, we see the following in app.xml:
<Application>Microsoft Office Word</Application>
<AppVersion>12.0000</AppVersion>
<Application>Microsoft Excel</Application>
<AppVersion>12.0000</AppVersion>
<Application>Microsoft Office PowerPoint</Application>
<AppVersion>12.0000</AppVersion>
So the way to ensure compatibility in the face of the standard changing through the approval process is clear. If the version is "12.0000" then interpret as Ecma-376. But when Office is updated to support an approved DIS 29500 (if this ever occurs) then they can simply update the version number in the files. That way Microsoft Office and every other application can tell them apart and process them correctly.
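As a rough illustration of that dispatch, here is a minimal Python sketch. It assumes the app.xml part uses the extended-properties namespace shown in the sample, that "12.0000" is the only version string that should be read as Ecma-376, and that anything else would be read as the ISO version. The function names and the "12.1000" value in the usage note are hypothetical, invented only for this example; no real Office release is implied.

```python
import xml.etree.ElementTree as ET
from typing import Tuple

# Assumption from the post: Office 2007 files carry AppVersion "12.0000".
ECMA_376_VERSIONS = {"12.0000"}

# Sample app.xml content, reduced to the two elements discussed above.
SAMPLE_APP_XML = """<Properties xmlns="http://schemas.openxmlformats.org/officeDocument/2006/extended-properties">
  <Application>Microsoft Office Word</Application>
  <AppVersion>12.0000</AppVersion>
</Properties>"""


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def read_producer(app_xml: str) -> Tuple[str, str]:
    """Return (Application, AppVersion) from an app.xml string."""
    root = ET.fromstring(app_xml)
    fields = {local_name(child.tag): (child.text or "").strip() for child in root}
    return fields.get("Application", ""), fields.get("AppVersion", "")


def choose_dialect(app_xml: str) -> str:
    """Pick which version of the format a reader should assume (hypothetical policy)."""
    _, version = read_producer(app_xml)
    return "Ecma-376" if version in ECMA_376_VERSIONS else "ISO/IEC 29500"


if __name__ == "__main__":
    print(read_producer(SAMPLE_APP_XML))
    print(choose_dialect(SAMPLE_APP_XML))
```

Calling `choose_dialect(SAMPLE_APP_XML.replace("12.0000", "12.1000"))` would select the ISO branch under this made-up policy, which is all the argument needs: a reader can branch on the producer's version string before parsing the rest of the document.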
So let's reject Microsoft's push for legacy inflation. Otherwise we will soon find that the next version of OOXML is also unchangeable, since Office 14 will be out before the next version of OOXML is standardized. Will we then be unable to change anything in OOXML 1.1 because Office 14 is already in beta? Where does this end?
This doesn't mean we should be capricious with changes in DIS 29500, but where something is clearly wrong, let's fix it. The assumption should be that the future is bigger than the past, that no matter how many documents existed before, there will soon be many more created in the future. We should be optimizing for that future.
Labels: OOXML
Saturday, February 16, 2008
Fast Track versus PAS
So when I hear people lump Fast Track and PAS process in JTC1 together, I roll my eyes and think... If only they knew how different they really are.
Let's give it a try, starting with PAS.
PAS stands for "Publicly Available Specification" and the PAS process in JTC1 allows an existing standard from outside of JTC1 to be submitted, reviewed and approved in an accelerated review cycle. An organization that wishes to make a PAS submission (typically a standards consortium) must first seek recognition as a PAS Submitter. This requires that they submit to JTC1 for approval a list of standards they wish to submit, as well as documentation that explains their organizational qualifications. Once this documentation is provided, a three-month JTC1 ballot is held on the question of whether to approve the applicant as a Recognized PAS Submitter. If approved, this status lasts for 2 years, but may be renewed by reapplying with updated organizational documentation. Renewals must also be approved by a 3-month letter ballot.
The long list of organizational acceptance criteria is outlined in JTC1 Directives, Annex M:
M7.3 Organisation Acceptance Criteria
M7.3.1 Co-operative Stance (M)
There should be evidence of a co-operative attitude toward open dialogue, and a stated objective of pursuing standardisation in the JTC 1 arena. The JTC 1 community will reciprocate in similar ways, and in addition, will recognise the organisation's contribution to international standards.
It is JTC 1's intention to avoid any divergence between the JTC 1 revision of a transposed PAS and a version published by the originator. Therefore, JTC 1 invites the submitter to work closely with JTC 1 in revising or amending a transposed PAS.
There should be acceptable proposals covering the following categories and topics.
M.7.3.1.1 Commitment to Working Agreement(s)
- What working agreements have been provided, how comprehensive are they?
- How manageable are the proposed working agreements (e.g. understandable, simple, direct, devoid of legalistic language except where necessary)?
- What is the attitude toward creating and using working agreements?
M.7.3.1.2 Ongoing Maintenance
- What is the willingness and resource availability to conduct ongoing maintenance, interpretation, and 5 year revision cycles following JTC 1 approval (see also M6.1.5)?
- What level of willingness and resources are available to facilitate specification progression during the transposition process (e.g. technical clarification and normal document editing)?
M.7.3.1.3 Changes during transposition
- What are the expectations of the proposer toward technical and editorial changes to the specification during the transposition process?
- How flexible is the proposing organisation toward using only portions of the proposed specification or adding supplemental material to it?
M.7.3.1.4 Future Plans
- What are the intentions of the proposing organisation toward future additions, extensions, deletions or modifications to the specification? Under what conditions? When? Rationale?
- What willingness exists to work with JTC 1 on future versions in order to avoid divergence? Note that the answer to this question is particularly relevant in cases where doubts may exist about the openness of the submitter organisation.
- What is the scope of the organisation activities relative to specifications similar to but beyond that being proposed?
M7.3.2 Characteristics of the Organisation (M)
The PAS should have originated in a stable body that uses reasonable processes for achieving broad consensus among many parties. The PAS owner should demonstrate the openness and non-discrimination of the process which is used to establish consensus, and it should declare any ongoing commercial interest in the specification either as an organisation in its own right or by supporting organisations such as revenue from sales or royalties.
M.7.3.2.1 Process and Consensus:
- What processes and procedures are used to achieve consensus, by small groups and by the organisation in its entirety?
- How easy or difficult is it for interested parties, e.g. business entities, individuals, or government representatives to participate?
- What criteria are used to determine "voting" rights in the process of achieving consensus?
M.7.3.2.2 Credibility and Longevity:
- What is the extent of and support from (technical commitment) active members of the organisation?
- How well is the organisation recognised by the interested/affected industry?
- How long has the organisation been functional (beyond the initial establishment period) and what are the future expectations for continued existence?
- What sort of legal business entity is the organisation operating under?
M7.3.3 Intellectual Property Rights: (M)
The organisation is requested to make known its position on the items listed below. In particular, there shall be a written statement of willingness of the organisation and its members, if applicable, to comply with the ISO/IEC patent policy in reference to the PAS under consideration.
Note: Each JTC 1 National Body should investigate and report the legal implications of this section.
M.7.3.3.1 Patents:
- How willing are the organisation and its members to meet the ISO/IEC policy on these matters?
- What patent rights, covering any item of the proposal, is the PAS owner aware of?
M.7.3.3.2 Copyrights:
- What copyrights have been granted relevant to the subject specification(s)?
- What copyrights, including those on implementable code in the specification, is the PAS originator willing to grant?
- What conditions, if any, apply (e.g. copyright statements, electronic labels, logos)?
M.7.3.3.3 Distribution Rights:
- What distribution rights exist and what are the terms of use?
- What degree of flexibility exists relative to modifying distribution rights; before the transposition process is complete, after transposition completion?
- Is dual/multiple publication and/or distribution envisaged, and if so, by whom?
M.7.3.3.4 Trademark Rights:
- What trademarks apply to the subject specification?
- What are the conditions for use and are they to be transferred to ISO/IEC in part or in their entirety?
M.7.3.3.5 Original Contributions:
- What original contributions (outside the above IPR categories) (e.g. documents, plans, research papers, tests, proposals) need consideration in terms of ownership and recognition?
- What financial considerations are there?
- What legal considerations are there?
Once an organization has Recognized PAS Submitter status, it may now propose a PAS submission. Such a submission must be within scope of the Submitter's original application, and must be accompanied by an Explanatory Report that speaks to JTC1's strategic interests in Interoperability, Cultural and Linguistic Adaptability, as well as the following document-related acceptance criteria:
M7.4 Document Related Criteria
M7.4.1 Quality
Within its scope the specification shall completely describe the functionality (in terms of interfaces, protocols, formats, etc) necessary for an implementation of the PAS. If it is based on a product, it shall include all the functionality necessary to achieve the stated level of compatibility or interoperability in a product independent manner.
M.7.4.1.1 Completeness (M):
- How well are all interfaces specified?
- How easily can implementation take place without need of additional descriptions?
- What proof exists for successful implementations (e.g. availability of test results for media standards)?
M.7.4.1.2 Clarity:
- What means are used to provide definitive descriptions beyond straight text?
- What tables, figures, and reference materials are used to remove ambiguity?
- What contextual material is provided to educate the reader?
M.7.4.1.3 Testability (M)
The extent, use and availability of conformance/interoperability tests or means of implementation verification (e.g. availability of reference material for magnetic media) shall be described, as well as the provisions the specification has for testability.
The specification shall have had sufficient review over an extended time period to characterise it as being stable.
M.7.4.1.4 Stability (M):
- How long has the specification existed, unchanged, since some form of verification (e.g. prototype testing, paper analysis, full interoperability tests) has been achieved?
- To what extent and for how long have products been implemented using the specification?
- What mechanisms are in place to track versions, fixes, and addenda?
M.7.4.1.5 Availability (M):
- Where is the specification available (e.g. one source, multinational locations, what types of distributors)?
- How long has the specification been available?
- Has the distribution been widespread or restricted? (describe situation)
- What are the costs associated with specification availability?
M7.4.2 Consensus (M)
The accompanying report shall describe the extent of (inter)national consensus that the document has already achieved.
M.7.4.2.1 Development Consensus:
- Describe the process by which the specification was developed.
- Describe the process by which the specification was approved.
- What "levels" of approval have been obtained?
M.7.4.2.2 Response to User Requirements:
- How and when were user requirements considered and utilised?
- To what extent have users demonstrated satisfaction?
M.7.4.2.3 Market Acceptance:
- How widespread is the market acceptance today? Anticipated?
- What evidence is there of market acceptance in the literature?
M.7.4.2.4 Credibility:
- What is the extent and use of conformance tests or means of implementation verification?
- What provisions does the specification have for testability?
M7.4.3 Alignment
The specification should be aligned with existing JTC 1 standards or ongoing work and thus complement existing standards, architectures and style guides. Any conflicts with existing standards, architectures and style guides should be made clear and justified.
M.7.4.3.1 Relationship to Existing Standards:
- What international standards are closely related to the specification and how?
- To what international standards is the proposed specification a natural extension?
- How is the specification related to emerging and ongoing JTC 1 projects?
M.7.4.3.2 Adaptability and Migration:
- What adaptations (migrations) of either the specification or international standards would improve the relationship between the specification and international standards?
- How much flexibility do the proponents of the specification have?
- What are the longer-range plans for new/evolving specifications?
M.7.4.3.3 Substitution and Replacement:
- What needs exist, if any, to replace an existing international standard? Rationale?
- What is the need and feasibility of using only a portion of the specification as an international standard?
- What portions, if any, of the specification do not belong in an international standard (e.g. too implementation specific)?
M.7.4.3.4 Document Format and Style
- What plans, if any, exist to conform to JTC 1 document styles?
The Explanatory Report also sets the maintenance regime for the submission, if approved.
The proposed standard, along with the Explanatory Report, is then distributed to JTC1 NB's for a 6-month ballot. The approval criteria are 2/3 approval of voting P-members, and no more than 25% disapproval in total. At the end of the ballot a Ballot Resolution Meeting may be held if needed.
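As a rough illustration of those two criteria, here is a minimal sketch in Python. The function name and the vote counts are hypothetical, and I am assuming that abstentions are left out of both counts and that the 25% test is taken over all votes cast, by P-members and others alike; check the JTC1 Directives for the exact wording.

```python
from fractions import Fraction

def ballot_passes(p_yes, p_no, other_yes=0, other_no=0):
    """Hypothetical helper: apply the two approval criteria described above.

    p_yes / p_no         -- approve / disapprove votes cast by P-members
    other_yes / other_no -- votes cast by non-P-members (assumed to count
                            only toward the 25% disapproval test)
    Abstentions are assumed to be excluded before calling this function.
    """
    p_cast = p_yes + p_no
    total_cast = p_cast + other_yes + other_no
    if p_cast == 0 or total_cast == 0:
        return False  # nothing to count
    # Criterion 1: at least 2/3 of voting P-members approve.
    p_approval = Fraction(p_yes, p_cast)
    # Criterion 2: no more than 25% of all votes cast disapprove.
    disapproval = Fraction(p_no + other_no, total_cast)
    return p_approval >= Fraction(2, 3) and disapproval <= Fraction(1, 4)

# Made-up tallies, not the results of any real ballot.
print(ballot_passes(24, 8, other_yes=10, other_no=2))
print(ballot_passes(20, 10))
```

The second made-up tally meets the 2/3 test exactly but still fails, because a third of all votes cast are negative. Both criteria must hold.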
So, that is the PAS process, in brief. The PAS process is how ODF was approved back in 2006, with OASIS as the Recognized PAS Submitter.
The Fast Track process is almost the same from the time the ballot is issued. The six-month period is split into a 30-day "contradiction period" and a 5-month ballot. (That is an odd difference, with no clear reason.) But the voting criteria, the BRM process, etc., this is all the same between the two. What is different (and there are critical differences) is everything that happens before the ballot.
Who can submit a Fast Track? Any JTC1 P-member, or any Class A Liaison can propose a Fast Track.
We all know about P-members. They are NB's, typically the highest standardization committee in any country. A P-member used to also mean that you had a broad interest in many or most JTC1 matters. But now it may mean merely that Microsoft asked you to join as a P-member.
Class A Liaisons are "Organisations which make an effective contribution to and participate actively in the work of JTC 1 or its SCs for most of the questions dealt with by the committee". Any organization can apply to be a Class A Liaison and be voted in via a letter ballot or at a meeting. There are no formal organization qualifications, no requirement to state an interest in eventually making Fast Tracks, or to answer any of the types of questions that PAS Submitters must answer.
Further, once approved as a Class A Liaison, the status lasts forever. There is no requirement to renew or reapply. In fact JTC1 Directives even lack a documented procedure for removing a Class A Liaison.
So what about the proposals for Fast Track submission? What is required of them? No Explanatory Report is required. No checklist of document-related criteria must be answered. JTC1 Directives say merely "The criteria for proposing an existing standard for the fast-track procedure is a matter for each proposer to decide." That's it. It is at the sole discretion of the Class A Liaison.
So you can see what great power Ecma has over JTC1 -- they can submit any standard they want for Fast Track, and no one in JTC1 can stop them, or even remove their right to submit more Fast Tracks.
This may explain why Ecma is able to command such high membership fees. A full voting membership in OASIS, which would allow a company to help produce an OASIS Standard for later submission to JTC1 under the arduous PAS process, this costs $1,100 for a small company. To join the US NB and be able to lobby for a Fast Track submission from the US, this will cost you $9,500. But to join Ecma as a voting member (what they call an "Ordinary Member") this will cost you 70,000 Swiss Francs, or $64,000. That is what no-questions-asked Fast Track service is worth. I think that, from Microsoft's perspective, the extra $62,900 is money well spent. But what about from JTC1's perspective? They don't get this extra money. So what's their excuse for having these permissive Fast Track procedures that give Ecma so much control?
In any case, that is why I roll my eyes when people lump PAS and Fast Track together, and say that they are essentially the same process. They clearly aren't. PAS Submitters like OASIS are given intense scrutiny, and are required to document in great detail how their organization and their proposals meet JTC1 criteria. The scrutiny never ends, as a new Explanatory Report is required for every submission, and their status as Recognized PAS Submitter only lasts for a few years before requiring re-approval.
Fast Track submitters, as Class A Liaisons, on the other hand, are the monarchs of JTC1. They serve for life and are answerable to no one. They can submit a Fast Track on any subject they want, at any time. So a standards consortium like Ecma, with primary expertise in optical disk standards, but never having produced an XML standard before, can rubber stamp the world's largest XML standard and submit it for Fast Track processing to JTC1. And no one can do a thing about it.
A Pre-BRM Miscellany
Our delegation has been warned that there will be a dangerous group of agitators at the event and we may need to walk past them to get to the meeting room, and we should not lend our support in any fashion to this event, which includes such known disruptive elements as Vint Cerf, Håkon Wium Lie, Bob Sutor, and Andy Updegrove. Eyes front, do not look to the left, do not look to the right.
I'm certainly impressed that JTC1 is taking the BRM process so seriously, and everyone is so concerned with the integrity of the process. But I must wonder where all this attention was when NB's were reporting to JTC1 that OOXML was too large to review under Fast Track procedures? Where was the concern when NB's were objecting that the proposal contradicted numerous international standards? Where were the precautions when committees were being stuffed, and new NB's were joining JTC1 only days before the ballot ended? Who was watching out for the integrity of the process then? Why is an OFE panel discussion on "Standards and the Future of the Internet" by international experts on the subject a threat to the international standards system, but no one in JTC1 even blinked when Côte-d'Ivoire joined JTC1 as a P-member three days before the end of a 6-month standardization process and voted "Yes" without comments on a 6,000 page proposal?
In other news, Martin Bekkelund has a look at some of the much vaunted "support" for OOXML on the Mac. Despite the claims, the support is quite underwhelming. As Gertrude Stein said, "There is no there there". (That probably won't translate well, so for my non-native English-speaking friends, trust me, that was hilarious.)
From ZDNet Australia and Brett Winterford comes a summary of some analysis by IP law practitioners and academics of OOXML and IPR. "Can Microsoft be trusted on OOXML covenants?" My summary: individuals, small companies and open source projects are roadkill.
Google searches for "ODF" and "OpenDocument" or even "noooxml" are now returning sponsored links with phrases like "Learn the truth about the standard for interoperability" that lead you to a pro-OOXML petition on Microsoft's faux OOXML community site. For example, try this query.
Let's see if I understand how this pay-per-click system works. Every time I click these sponsored links, money gets transferred out of some pro-OOXML supporter's bank account and is sent to Google? This seems an expensive route to go, but there is some logic to it. A look at Google Trends shows that Google queries for "ODF" far outnumber queries for "OOXML".
On the other side, at the real <NO>OOXML petition, the count stands at 82,422 signatures. Apparently they did not need to trick people into visiting their web site.
Three ODF applications in the news this week.
IBM Lotus Symphony takes Datamation's "Product of the Year" award in the "Office Productivity Software" category, beating out Microsoft Office. Congratulations to the Symphony team!
Also, CNet TV puts OpenOffice.org in the #1 slot in their "Top 5 Best downloads of 2007".
As reported by <NO>OOXML, the OpenDoc Society has an interesting tease in their February newsletter about a proposal "under investigation" by an "ODF standards group" within Microsoft to add better support to MS Office for ODF. Interesting.
Get it while you can. Microsoft is making their legacy binary format documentation available for download. This timely disclosure comes a few months after they silently disabled access to many of their legacy formats in Office 2003 SP3.
My advice -- download these binary formats, burn them to CD and store them in a safe place. Over the years Microsoft has made these formats available for download (ca 1996), put them on MSDN CD's (ca 1998), then added restrictive terms that specifically forbade use by competitors (ca 1999), removed the documentation entirely from the web and MSDN CD's (ca 2000), made the formats available under commercial license only, made a RF license available only after filling out an intrusive questionnaire and only when the use was "complementary to Office" (ca 2005), to the present download terms. So get them now, since there are no guarantees on how long they will remain available this time.
In any case, it is good to have this material available once again. We now have a file format specification, controlled exclusively by Microsoft, with all sorts of quirks and bugs necessary to be an accurate and compatible description of the billions of existing Microsoft Office documents, available for anyone to download and implement under terms granted by Microsoft's Open Specification Promise. In fact, the observant reader will note that the same could be said about OOXML. But why should either be an ISO Standard? They both remain a description of the anomalous quirks of a single vendor's proprietary products, with no generality or applicability to other uses.
In fact, if a claim is made for needing ISO standardization, the better claim should be given to the legacy binary formats, since they indeed are widely implemented, and are used for billions of legacy documents. Microsoft would not even need to pad their résumé with toy implementations. The binary formats are implemented in everything from MS Office, to OpenOffice, to Lotus SmartSuite, to Lotus Symphony, to Corel WordPerfect Office, to KOffice, Google Docs & Spreadsheets, to Apple iWork, to MindJet. In fact, for every partial implementation of OOXML that Microsoft claims, we could point to dozens of fuller implementations of the legacy binary formats.
So why the rush to make an ISO standard for OOXML? I wonder if instead we should be taking Adobe's example and standardizing the existing binary formats, as insurance for long-term access to the legacy base of MS Office documents. Then moving forward, MS Office could use a clear, modern format like ODF, enhanced with Microsoft's participation in the ODF TC, to ensure that it includes all of the capabilities that they require for moving forward in the office productivity market. Do we really want to drag deprecated VML and incorrect leap year calculations into the 21st Century?
Labels: OOXML
Tuesday, February 12, 2008
Punct Contrapunct
We’ve made the overview available for free (I must admit I'm not sure for how long), as we believe this topic warrants expanded industry debate before a February, 2008 ISO ballot on OOXML, and we want to help catalyze and advance the debate.
The degree of expanded debate achieved may be estimated by noting that Microsoft is sending this report to every JTC1 national body involved in the OOXML ballot, from Pakistan to Ecuador, and has invited Peter O'Kelly to speak on this paper both at the recent OOXML press event in Washington as well as this week's Office Developers Conference.
Much could be said of this report, but I'll limit myself to commenting on a single passage:
[S]everal vendors interviewed for this overview indicated that it's essentially impossible to get ODF proposals approved if they're not also supported in OpenOffice.org, and further noted that Sun closely controls OpenOffice.org (much as it also holds control over Java).
It should be noted that, before making this statement, the authors contacted neither OASIS nor the OASIS ODF TC in order to check their facts.
The ODF Alliance published a rebuttal of this report, and in particular took umbrage at that passage, saying:
This is demonstrably false, and the use of unnamed “vendors” as sources does not eliminate the need for doing basic fact checking on such claims. Rumors and innuendo do not objective analysis make.
First, on the control aspect, note that ODF 1.0, the standard, is owned and controlled by OASIS, a standards consortium of over 600 member organizations. Sun is just one company among many members. Indeed, for most of the development of ODF, Microsoft was on the Board of Directors of OASIS.
Second, OASIS is a corporation. It is legally bound to its Bylaws. There is no arbitrary control by member corporations.
The ODF TC is co-chaired by an IBM employee and a Sun employee, and is regulated by the OASIS TC Process document, which is publicly readable by all and has clear rules of procedure and appeal.
The ODF TC has three subcommittees. The Accessibility SC is co-chaired by IBM and Sun, while the Formula Subcommittee and the Metadata Subcommittee are each chaired by individual members of OASIS who are not affiliated with any large corporations.
Voting rights in the ODF TC, for accepting or rejecting features, is currently as follows:
- Sun – 3 voting members
- IBM – 4 voting members
- Individuals – 3 voting members
This can easily be verified at the OASIS ODF TC website.
Is sharing the chair position on the TC and on 1 of 3 subcommittees considered “closely controlling”? Is having 30% of the votes considered “closely controlling”?
As for proposals being accepted into ODF, we note that all three major features for ODF 1.2, RDF metadata, OpenFormula, and enhanced accessibility, are new proposals which have not been yet implemented in OpenOffice. Moreover, the ODF TC is currently processing a set of features requested by the KOffice open source project. So the assertion that it is “essentially impossible” to get new features into ODF if they are not already supported by OpenOffice is not true. This error is unfortunate and needs correcting through rigorous fact checking, as do the others, in our opinion.
Oddly enough, this particular error occurs in several places. A search of the report for the word “control” shows it used six times, once in reference to “Chinese communists” and five times in reference to Sun Microsystems. Note, however, that no mention is ever made of the strong direct control Microsoft asserts over OOXML, its having sole chairmanship of the Ecma TC45, and its having secured a committee charter that prevents any changes to OOXML that are not compatible with Microsoft Office.
Again, we're puzzled by the inaccuracy on one hand and the lack of balance on the other.
Now, back to the Burton Group, where Guy Creese responds on the Burton Group blog:
We were not expecting to be told that Sun had significant sway over the standard, but several people told us that (spread across more than one ODF-oriented vendor), which is why we noted it in the report. As the ODF Alliance notes, IBM and Sun—two of Microsoft’s most powerful productivity application archrivals today (as well as partners to Microsoft in myriad other domains, e.g., Web services-related standards initiatives)—collectively control 70% of the votes in the ODF TC which determines if proposals will be accepted or rejected. This suggests there is ample opportunity for conflicts of interest.
Guy, excuse me, did you say "conflicts of interest"? Please explain. Or maybe when Peter O'Kelly comes back from speaking at Microsoft's Office Developers Conference he can explain it for us?
In any case, the factual errors in your report with respect to the control of ODF have been clearly demonstrated, but instead of simply admitting and correcting the error, you hide behind anonymous sources and further impugn OASIS by charging some sort of "conflict of interest".
To follow your logic further demonstrates the absurdity of it. If you believe that the fact that IBM and Sun "collectively control 70% of the votes in the ODF TC" lends weight to your argument, then what is shown by the equally true mathematical fact that IBM plus independent members also control 70% of the votes? Why is this equally true fact not mentioned? This is the nature of plurality, that there are many different combinations of votes that could make a majority position. Further, note that these groups in practice do not always vote as a bloc. We've had votes where the independent members split their vote, and we even had a vote where the IBM members did not all vote alike. So much for your simplistic control theory.
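The arithmetic is easy to check. Here is a minimal sketch in Python, using the voting-member counts quoted above from the ODF Alliance rebuttal (Sun 3, IBM 4, individuals 3). The function name is mine, purely illustrative. It lists every combination of groups short of the whole committee, along with its share of the vote.

```python
from itertools import combinations

# Voting members per group, as listed in the ODF Alliance rebuttal quoted above.
VOTES = {"Sun": 3, "IBM": 4, "Individuals": 3}

def bloc_shares(votes):
    """Hypothetical helper: map each combination of groups (excluding the
    full committee) to its percentage of the total votes."""
    total = sum(votes.values())
    shares = {}
    for size in range(1, len(votes)):
        for bloc in combinations(sorted(votes), size):
            shares[bloc] = 100 * sum(votes[g] for g in bloc) // total
    return shares

for bloc, pct in bloc_shares(VOTES).items():
    print(" + ".join(bloc), f"{pct}%")
```

With these counts no single group holds a majority, and every pairing of two groups does. Singling out one pairing as proof of control tells you nothing that the other pairings do not.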
I will not question whether your anonymous sources indeed misled you. For the sake of argument, I will accept unquestioningly that you indeed had sources and that they said exactly what you claim they said. However, having sources does not excuse you, as an analyst, from doing basic fact checking. The rules of OASIS and the voting composition of the ODF TC are facts, not opinions, and the correct information was sitting there, on public web sites, for you to check. It is not your fault that you were misled by sources, but it is your fault that you did not verify their claims. Publishing controversial statements based on anonymous sources without fact checking is not something that represents the Burton Group's finest work.
The Burton Group has denigrated the work and the members of the OASIS Open Document Format Technical Committee (of which I am Co-Chair) with published statements that have been shown to be false. The Burton Group owes us an apology and an immediate retraction.
Waiting until after February, after the DIS 29500 process concludes, to make corrections is unacceptable. Since your stated purpose in making this report public was to "advance the debate" in the current OOXML ISO process, withholding factual corrections until after that process concludes would imply that you and the Burton Group see no problems with knowingly persisting in influencing an ISO ballot with false information published under the Burton Group name. I don't believe that is the image that the Burton Group would want to project. So I urge that a correction is in order now.
Wednesday, February 06, 2008
By Metes and Bounds
But why were so many people involved? Why so much complexity?
Each, aside from having a professional specialty, looks out for a specific interest and has a duty to a particular participant in the transaction: to the buyer, to the seller, to the lender or to the tax collector.
In the end, I have my house, my land, and a little piece of paper called the Quitclaim Deed, which conveyed the seller's interest in the property to me. The deed specifies the parties to the transaction, the amount paid, and references a legal description of the property, which reads in part like:
N 20-07-00W 94.41 feet to a lead pipe, N 21-06-30W 372.04 feet to a stone bound, N 63-17-40E 291.05 feet to a stone bound, 52-29-20E 360.60 feet to a stone bound, etc.
This style of land description is called Metes and Bounds and is particular to New England, and has been in use here since the colonial period.
What is interesting, in terms of the real estate transaction, is the division of responsibilities. The attorney looks at the paperwork, but the surveyor verifies the property description. The attorney ensures that the form of the deed is in accordance with local law and customs, but he is not going to be able to tell you that your garage has been built half on your neighbor's plot. That is the job of the surveyor.
In the end, I am happy only if I have successfully purchased the property I intended to purchase. If the formal paperwork is executed properly, and if the survey matches the bounds of the property I believe I am purchasing, then I am happy. If either of these fails, I will not be pleased with the transaction, even if the other criterion is met.
I am reminded of the above mechanisms when thinking about IP issues in Microsoft's OOXML. The analogy works like this:
You have full rights to implement OOXML only if you are satisfied with:
- The Conveyance = The formal language of Microsoft's Open Specification Promise is bullet-proof
- The Survey = The technical detail of the OOXML text is complete and accurate and matches what you need to implement in your software
- The Title = Microsoft owns (and continues to own) all of the patents required to implement the portion of OOXML you wish to implement in your software
I guess I'm just not as easily impressed. Who there is looking out for your interests? Who has a fiduciary duty to you? Not Microsoft. Not Ecma. Not ISO. So you better watch out for your own interests.
So let's ask, what is Microsoft actually promising? The Open Specification Promise says in part:
Microsoft irrevocably promises not to assert any Microsoft Necessary Claims against you for making, using, selling, offering for sale, importing or distributing any implementation to the extent it conforms to a Covered Specification...
...“Microsoft Necessary Claims” are those claims of Microsoft-owned or Microsoft-controlled patents that are necessary to implement only the required portions of the Covered Specification that are described in detail and not merely referenced in such Specification.
Certain rights are given to you, and the bounds of these rights are circumscribed by the stated restrictions. In this case, instead of enumerating exactly what patents are made available, they are described implicitly by these criteria:
- Microsoft owns or controls the patents
- They are necessary to implement "only" the required portions of the specification
- That portion of the specification must be described in detail and not merely referenced.
I think of the above as the "Metes and Bounds" of Microsoft's OSP. Finer minds than mine have tried to make sense of this, parsing the language, thinking about how every word could be interpreted, etc. But you can pore over this all you want, and not determine what your rights are. Edward Coke, Lycurgus of Sparta and Moses could all declare this to be the finest legal document since the Code of Hammurabi, but that would mean little.
The key point is that because of the way the OSP is crafted, your rights are intrinsically tied to the quality of the underlying OOXML specification. If that text is vague, and lacking in detail in places, then your rights are reduced. Your rights are secured only to the extent the text is detailed. Low quality, low level of detail or missing detail equates to lack of protection for the implementor.
Since no one pontificating on how there are no IP issues has actually read the standard, I think they should be a little less effusive in their praise. But then again, they have no duty to watch out for your best interests, have they?
Let's look at an example, to make this concrete. DIS 29500 contains the following passage which we've discussed before:
2.15.3.26 footnoteLayoutLikeWW8 (Emulate Word 6.x/95/97 Footnote Placement)
This element specifies that applications shall emulate the behavior of a previously existing word processing application (Microsoft Word 6.x/95/97) when determining the placement of the contents of footnotes relative to the page on which the footnote reference occurs. This emulation typically involves some and/or all of the footnote being inappropriately placed on the page following the footnote reference.
[Guidance: To faithfully replicate this behavior, applications must imitate the behavior of that application, which involves many possible behaviors and cannot be faithfully placed into narrative for this Office Open XML Standard. If applications wish to match this behavior, they must utilize and duplicate the output of those applications. It is recommended that applications not intentionally replicate this behavior as it was deprecated due to issues with its output, and is maintained only for compatibility with existing documents from that application. end guidance]
I don't think there is anyone out there who would argue that this part of the specification is "described in detail". In fact it disclaims all detail. Further, it would be hard to argue that this portion is required, when the text itself so clearly recommends that applications do not try to implement it. So, I would question whether the Open Specification Promise would apply to any patents necessary to implement this feature.
Would anyone assert that a plain reading of the OSP suggests otherwise?
But you might say, "Please Rob, you can't be serious. Who would try to get a patent on laying out a footnote? That just doesn't happen in the real world."
But consider Microsoft's patent application "Method and computer readable medium for laying out footnotes" (US20060156225A1). I'm not saying that application matches the above feature in the standard, but if it did, is there anyone who will argue that the Open Specification Promise would not apply in this case?
This is the essential risk with OOXML, that its rushed preparation has led to a 6,000 page standard full of errors, omissions and ambiguities, and your rights as an implementor are reduced by this low level of quality. This is not necessarily from malice, but from neglect.
Now what if I told you that Ecma, as part of its proposed Disposition of Comments report, has added a detailed functional description to the footnoteLayoutLikeWW8 element? This is a good thing, right? Not only do you have more technical detail, but you also have more IP rights because of that detail.
So this should give us pause. An aggressive review of OOXML (thank you) has resulted in implementors having more rights under the Open Specification Promise. But along with that joy should come concerns that the OOXML standard has been very sparsely reviewed. Many NB's complained about not being able to review it thoroughly. I know I have not read more than 20% of it. So the full set of features which are unimplementable due to vague or missing detail is unknown. This same set of features may require Microsoft patents which are not eligible under the Open Specification Promise for lack of such detail.
So this is yet another reason why a rushed review is so harmful. Not only does it leave OOXML full of technical errors, including portability, accessibility, and interoperability flaws, but the resulting pervasive lack of detail means that implementors have far less IP protection than they may think they have.
You are buying a house. The attorney has blessed the paperwork, the title insurance is in hand. The surveyor shows up at the closing and says, "I'm sorry. Your property was so big and complicated, that we were able to only survey 20% of your property line in time for the closing."
The lawyer grins and says, "All the paperwork is in order. All you need to do is sign."
What do you do?
Labels: OOXML
Thursday, January 31, 2008
The Case for Harmonization
First note that many JTC1 NB's raised the issue of harmonization in their DIS 29500 ballot comments last September. Some merely requested harmonization, such as Korea, South Africa, Belgium, Peru, Switzerland, or the Czech Republic, while others in addition outlined ways to achieve harmonization. For example, AFNOR, the French NB stated:
After 5 months of extensive discussions between stakeholders in the field of revisable document formats, AFNOR, in the aim to obtain a single standard for XML office document formats within 3 years, makes the following proposal:
- Split the current ECMA 376 standard in 2 parts in order to differentiate the essential OOXML core functions necessary for easy implementation from those functionalities that are needed for the exchange of legacy office file formats;
- Incorporate the technical comments below and those in the attached comment table submitted to the Fast Track;
- Attribute the status of Technical Specification to both parts;
- Establish a process of convergence between ODF (already standardized as ISO/IEC 26300) and the above mentioned OOXML core. ISO/IEC shall invite parties involved to commit themselves to initiate simultaneously the revisions of the existing ODF v1.0 and the OOXML core in order to obtain at the end of the revision process a standard as universal as possible.
(Note that a Technical Specification, in ISO process, is for proposals which lack sufficient support for approval as an International Standard, but for which publication is still desired. This may be appropriate for OOXML.)
New Zealand's proposal was similar:
- OOXML should be considered by JTC 1 for publication as a Type 2 Technical Report.
- Seek to harmonize with the existing ODF standard to reduce the cost of interoperability, cost of having two standards, and cost of support/maintenance.
- to have more than 63 columns in a table
- to have background images in tables
- to have font weights beyond “normal” and “bold”.
Ecma rejected every single one of these requests. They did not argue that the requested features were unreasonable. They did not argue that the requested feature was not needed. Their argument was that harmonization of the formats was not necessary because there exist tools that will translate between OOXML and ODF. In other words, they rejected these requests merely because they were pro-harmonization, regardless of the underlying merit or need of the feature. Ironically, Microsoft's conversion tools are restricted in their fidelity because of the lack of these very features.
On the question of harmonization, we are either moving toward it, or we are moving away. There is no time better than the present to harmonize. Waiting will only make matters worse, as we will then need to consider legacy OOXML documents as well as legacy binary and legacy ODF documents. The Ecma response does not move us toward harmonization, but starts down the road toward further divergence, a long and costly divergence.
Tim Bray made the critical observation back in 2005, “The world does not need two ways to say 'This paragraph is in 12-point Arial with 1.2em leading and ragged-right justification'.”
Microsoft likes to claim that harmonization is impossible, that slapping together the features of both standards would lead to a tangled, impenetrable mess. Of course, but only an idiot would suggest that as an approach to harmonization. So why do they always bring that up as their strawman?
A look at OpenOffice and Microsoft Office shows a huge degree of functional overlap. Harmonization starts from looking at this functional overlap – and there is a significant, perhaps 90%+ area where they do overlap – and expresses the functional overlap identically, using the same XML schema. In other words, harmonization identifies the commonalities at the functional level and finds a common representation for that commonality.
It would also be expected that the common functionality between ODF and OOXML would also include a common extensibility mechanism, a way for a vendor to express application-specific features that are outside of the harmonized standard.
The remaining 10% of the functionality would be the focus of the harmonization work, the area that requires the most attention. Some portion of that 10% will represent general-purpose features that we can imagine multiple applications supporting. We take those features and add them to ODF. The remaining portion of the 10%, which only serves one vendor's needs, such as flags for deprecated legacy formatting options, could be represented using the common extensibility mechanism.
Does this sound impossible? That's not what Microsoft says. Gray Knowlton, Group Product Manager for Microsoft Office, was candid to PC World a couple of weeks ago:
Also, if individual governments mandate the use of ODF instead of Open XML, Microsoft would adapt, Knowlton said. The company would then implement the missing functionality that ODF doesn't support. However, those extensions would be custom-designed and outside of the standard, which is counter to the idea of an open document standard, Knowlton said.
So we've agreed that this approach is technically feasible. We're also agreed that extending ODF outside of the standards process is not a good idea. So the obvious solution is to extend ODF within the standards process. So, let's do it! What are we waiting for?
There is no reason why, by a harmonization process, all of the functionality of Microsoft Office cannot be represented on a base of ISO 26300 OpenDocument Format. I personally, as Co-Chair of the OASIS ODF TC, stand ready and willing to sponsor such a harmonization effort in OASIS. So let's start harmonization now, and avoid further divergence.
My read of NB comments indicates that there is a sizable bloc, perhaps even a decisive bloc, of NB's who are in favor of harmonization. Let's push on this and articulate a roadmap, along the lines of the proposals by France and New Zealand, that accomplishes this.
Friday, January 25, 2008
What every engineer knows
Let us start by imagining that a new bridge is being built in your area. The company that is building the bridge is very eager to have it open by a particular date. In fact, their contract calls for monetary penalties for every day the opening is delayed beyond that date. However, before it can be opened to traffic, it must be inspected to ensure that the welds conform to the applicable standard. For sake of argument let's say the standard is the AASHTO/AWS D1.5M/D1.5:2002 Bridge Welding Code.
The inspectors may inspect all of the welds and find that they are all acceptable. What do you think of this, as someone who will soon ride over that bridge? Is this good news? Yes, if you trust the expertise and independence of the inspectors, and their testing process and equipment. If the inspectors do their job properly, and they find no defects, then this indeed is cause for celebration.
But what if the inspectors found a handful of defects, perhaps some welds that failed fatigue testing? If indeed the defects are few, and are localized, then they can be fixed and retested and we can still open the bridge on time. But it is critical that the changes are localized, that there are no far-reaching changes. A bridge is not just a collection of independent pieces of metal. They all work together, and as a whole have static and dynamic mechanical properties that relate to load capacity, stresses, thermal characteristics, resonance, etc. Although some fixes may be only localized in their impact, meaning only the area changed needs to be retested, other fixes may have a larger impact and require that everything be retested.
In any complex system, some defects are expected. A sign of good engineering process is that larger, structural defects are detected or prevented at the earliest possible moment, when they are easiest and least expensive to fix. Where this is not accomplished, large design defects may be first detected at final inspection time, and costly and pervasive rework and retesting may be required, or in the extreme, the bridge may need to be torn down.
The engineering maxim is "fail early". Now this may seem like an odd thing to say. Shouldn't we always try to prevent failure or at least delay it as long as possible? Certainly, if you can prevent failure, then do so. But it is rarely the case where all defects can be prevented. But as engineers, we can design systems, and testing procedures so that flaws become evident as early in the process as possible, when they can be fixed in architecture and design documents rather than in built structures, or at least be found as early in the construction process as possible. This is a frequent source of stress between those who build and those who sell. The important thing for all to understand is that failing early is actually a form of risk reduction. The sooner you fail, the sooner you can fix the defect and start again.
Back to the analogy.
Let's build another bridge. Along comes MegaCorp, who wants to build a bigger bridge, a much bigger bridge than any attempted previously, a MegaBridge. There is nothing wrong with that per se. The history of engineering is the history of making bigger pyramids, wider vaulted ceilings, taller skyscrapers and longer bridges.
Of course, the fact that MegaBridge is right down the street from the new bridge that just opened last week is a bit odd. But MegaCorp tells us that is OK. We're not required to use their bridge if we don't want to.
Further suppose MegaCorp also wants to construct this MegaBridge in record time, faster than others have constructed bridges even a fraction of its size. This is certainly ambitious, but there is no law against ambition. Progress is made by those who are ambitious. We learn from their successes as well as their failures. The important thing is that an ambitious MegaBridge is held to the same standards as any other bridge, that proper inspections are carried out and that quality criteria are satisfied.
Months later and the construction of MegaBridge is complete. Time for inspection. But one problem -- the MegaBridge is so large that it is impossible to carry out an inspection in the scheduled time. There are simply not enough inspectors available to carry out the task and complete it by the targeted opening time.
What should we do?
It is useful at this time to consider another engineering maxim, "fail safe". If a system is overloaded, or detects an error condition, it should fail to a safe state, a state least likely to cause damage. We see this applied in many of the systems we use every day. Traffic lights fail safe to flashing red, GFCI circuits fail safe by switching off current if a ground fault is detected, and train air brakes fail safe by applying the brakes if air pressure is lost.
The concept of a "fail safe" applies to processes as well as mechanical systems. A committee, by having a quorum requirement, ensures that it fails to a harmless, inactive state if a snowstorm prevents a representative portion of the committee from attending a meeting. A criminal trial, by presuming innocence and requiring a unanimous verdict to convict, ensures that in case of deadlock, the defendant is let free. Similarly, a bridge quality inspection protocol should include a fail safe provision, that if the inspection cannot be completed, the bridge should not be certified as fit for use. The inspection process should fail safe to non-certification. Ordinarily, engineering practice would be to take whatever time is necessary to inspect the bridge fully, or fail the inspection.
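For readers who think in code, here is a minimal sketch of an inspection protocol that fails safe to non-certification. The function name, its parameters and the two-state verdict are all assumptions made for illustration; no real inspection code works from three integers.

```python
from enum import Enum


class Verdict(Enum):
    CERTIFIED = "certified"
    NOT_CERTIFIED = "not certified"


def certify(welds_total, welds_inspected, defects_found):
    """Hypothetical protocol: certification is the exception, never the default."""
    # An incomplete inspection fails safe, no matter how clean the tested welds were.
    if welds_inspected < welds_total:
        return Verdict.NOT_CERTIFIED
    # A complete inspection with open defects also fails safe.
    if defects_found > 0:
        return Verdict.NOT_CERTIFIED
    return Verdict.CERTIFIED


print(certify(welds_total=100, welds_inspected=60, defects_found=0))
print(certify(welds_total=100, welds_inspected=100, defects_found=0))
```

The design choice is that running out of time can never produce a certificate: the only path to CERTIFIED requires that every check was actually performed.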
(Here our tale diverges from standard engineering practice and starts to relay, by analogy, the increasingly bizarre tale of OOXML's exploits in and of ISO.)
But MegaCorp wants the MegaBridge to open on time. They force the inspection to continue, even though the inspectors claim there is not enough time. In order to "help" the inspection and despite the obvious conflict of interest, MegaCorp instructs a large number of its own employees, qualified or unqualified, to volunteer as bridge inspectors. They further recruit employees from subsidiaries and suppliers to become inspectors as well. In at least one case, MegaCorp tells a supplier, newly-minted as an inspector, "Don't worry if you know nothing about bridges. We'll tell you what to say. All you need to do is say that the bridge is safe. You'll be rewarded later for helping us here."
So the bridge inspectors go out, old and new, qualified and unqualified and come back with their individual preliminary reports. The older, more experienced inspectors are critical in their evaluation:
The bridge is full of defects. Although, as we mentioned earlier, the mandated schedule did not permit us to test all of the critical welds, of the ones we did test, we found numerous defects. In fact, the number of defects we report is artificially low, since it was limited by our available inspection time. If we had been able to complete a full inspection, we would have detected and reported many more problems.
We further found pervasive structural problems. This bridge is unsound. We can not certify it. We further question why it is necessary to open up a new toll bridge at all, when we just opened up a new free bridge down the street.
The newly-minted inspectors, who for the most part are economically dependent on MegaCorp, were more supportive:
Although some minor problems were indicated, we believe these can all be fixed during routine maintenance. We are not concerned about the time permitted for inspection. We did what the process required. And when you count all the new inspectors that MegaCorp has brought to the process, no bridge has been more inspected. Considering the number of defects reported, this is the most-inspected bridge in history. We recommend that MegaBridge be certified and opened as scheduled.
Of course, from a quality control perspective, this is seriously flawed. The checks and balances between those who build, those who test and those who sell have been eliminated. Although it would not be unusual for some MegaCorp inspectors to be involved in the inspection process, the late arrival of so many unqualified, newly-minted inspectors, and the shift of balance to MegaCorp's hand-picked inspectors, calls into question the independence and technical sufficiency of the entire inspection process.
The inspectors are polled to see whether the bridge can be certified. The vote is close, but the answer is no, the MegaBridge cannot be certified in its current condition. The inspectors, mainly the older, more experienced ones, record a report of 3,522 specific defects in the MegaBridge, far more defects than have ever been found in any other bridge.
MegaCorp is irate. They blast the experienced inspectors in the press, while simultaneously reassuring their stockholders that this setback is just the next step forward to success. They give their engineers the inspection report and demand a quick response. "We must open the bridge on time!" they yell. The MegaCorp engineers work day and night, over weekends, over the holidays even, in order to develop written proposals to address each of the reported flaws in the bridge.
The inspectors are given the proposals and asked whether they believe the proposals are sufficient to allow the MegaBridge to be certified. Although the newly-minted inspectors are quick to affirm the adequacy of the proposal, the old-timers just shake their heads in disbelief, with one stating to the press:
You could fix every last defect in that report and the MegaBridge would still not be sound. Since we never inspected all of the critical welds in the first place, fixing only the defects we reported is insufficient. It is not enough for us to merely retest the ones we reported as defective. We need to test all of them.
Also, because you are making pervasive changes to the road surface, the suspension materials and the pillar diameters, far-reaching design changes which were clearly rushed and have not gone through normal review procedures, I'm afraid that all of our previous tests are now invalidated as well.
Additionally, many of your proposals either avoid addressing the flaws, paper around the flaws, or even introduce new flaws. We need to re-certify the new design before we can even think about retesting the bridge.
However, considering the huge number of defects reported, the even larger number of defects undetected because of lack of inspection time, the questionable competency of the newly-minted inspectors, and overt corruption of the process by MegaCorp, my recommendation would be to tear this thing down before it falls over and hurts someone.
Thus ends the tale of what every engineer knows.
Labels: Engineering, OOXML, Standards
Comedy tonight!
First up is Tiffany Maleshefski in her eWeek Desktop Confidential column, and her critique of the Burton Group Report, written in the form of a love letter to Microsoft.
Next is Ecma's Magic 8 Ball, the source of responses to your NB's ballot comments, contributed by a reader who wishes to remain anonymous.
Finally, on YouTube, a droll faux-endorsement of OOXML by FFII President Pieter Hintjens called "Six Reasons to Migrate to OOXML".
Labels: OOXML
Monday, January 21, 2008
The Standards Trolls
Mary Jo Foley at least poses it as a question:
What’s your take? Are IBM and Google talking out of both sides of their mouths, when it comes to their “OOXML is evil” claims? Or is Microsoft increasingly grasping at straws, as the late February ISO vote on OOXML standardization inches closer?
Eric Lai, not to be outdone, comes up with the sensational headline "Lotusphere: Whoops! IBM products support Microsoft's Open XML doc format", a headline made all the more remarkable by the fact that the article has absolutely nothing whatsoever to do with Lotusphere.
Well, let's look at the "evidence" Microsoft points to and try to make some sense of it.
First up we have the Lotus Symphony support forums, where in response to a question from a user about Office 2007 support, "supportadmin3" replied "Great idea Robert!!! I will submit your feature request to the proper team...Thanks!"
I don't know what to say about that. It is a sign of fanciful desperation to turn the courtesy of "supportadmin3" into a statement of IBM support for OOXML. It is bizarre that anyone would portray this as more than courtesy.
What else do you have?
The thundering herd also pointed out this article from developerWorks last August on how to use ODF and OOXML with DB2 9 pureXML. Does pureXML support OOXML? It sure does!!! In fact it supports any well-formed XML document or fragment, OOXML, ODF, BerniesOldTimeMedicineShowAndJamboreeML, whatever you have. DB2 pureXML can handle it all, and let you query it via SQL or XQuery. Yes, OOXML and every other XML format known to man is supported!!!
[24 January It has come to my attention that the above paragraph has caused confusion among readers who are not familiar with the term "well-formed". This is a technical term, defined by the XML standard. It essentially means that the XML will parse, that it follows the underlying syntax of XML. It is the minimum qualification for something to be called "XML". It does not imply any fitness for a particular purpose, or any level of quality. It certainly does not mean the XML is "easy to process by standard XML tools".
By analogy, I could say that a novel is poorly written, boring, in bad taste and artistically without merit, but at least the author spell-checked.]
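To make the term concrete, here is a minimal sketch using Python's standard-library parser. The helper name is my own; the point is only that well-formedness is a yes/no question about syntax, and a made-up vocabulary passes just as easily as a real one.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the text parses as XML; says nothing about its quality."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Properly nested tags: well-formed.
print(is_well_formed("<doc><p>Hello</p></doc>"))

# The <p> element is never closed: not well-formed.
print(is_well_formed("<doc><p>Hello</doc>"))

# An invented vocabulary is still well-formed, because only syntax is checked.
print(is_well_formed("<BerniesOldTimeMedicineShow><jamboree/></BerniesOldTimeMedicineShow>"))
```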
What else do you have?
Ah, yes, there is the claim that DB2 Content Manager supports OOXML. Sorry guys, that page is just noting how to add MIME types for OOXML to Content Manager. Wow, you're moving now. With apologies to Steve Martin in The Jerk:
application/vnd.openxmlformats-officedocument.wordprocessingml.document! I'm somebody now! Millions of people look at this registry every day! This is the kind of spontaneous publicity, your name in print, that makes people. I'm in print! Things are going to start happening to me now.
What else do you have?
Oh yes, Document Conversion Services, in WebSphere Portal. The claim is that DCS supports OOXML. Yes, indeed it does!!! DCS, via 3rd party code, supports 100's of file formats. So OOXML joins the elite company of supported formats, along with XYWrite (1985), VolksWriter (1982) and Lotus Manuscript (1986).
I'm elated that Microsoft thought this DCS support was worth 6 blog posts. In fact in the 5 years since I designed and wrote the DCS framework (yes, ironically, I was the architect of that component) it has never gained such notice.
But here is something you might not know about DCS. Its main purpose is to handle the graveyard formats, the formats that you might rarely find in a document repository, but don't want to bother installing editors for on your clients. So instead of locating and installing dozens of legacy word processors, you simply have DCS run on the server and convert, on-demand, these legacy documents into a quick & dirty HTML rendering for viewing. So, DCS is great for handling those rare times when you run across an OOXML file.
So welcome, OOXML, to the exclusive company of "Every Document Format Known to Man". I'm glad that you are so excited. But your desperation in trying to dredge up examples of support for OOXML, any examples, is so pitiful that I feel I must offer some assistance.
Let's start with the conformance clause for OOXML (Part 1, Section 2.5), the standards language that defines what an OOXML supporting application must be able to do, in order to claim it supports OOXML:
2.5 Application Conformance
Application conformance is purely syntactic; it also involves only Items 1 and 2 in §2.3 above.
- A conforming consumer shall not reject any conforming documents of the document type (§4) expected by that application.
- A conforming producer shall be able to produce conforming documents.
Or, in plain English, in order to be able to claim conformance with OOXML, an application must not crash when presented with a valid OOXML document, or must be capable of producing at least a single valid OOXML document. This is not exactly a high threshold.
Some examples of applications that support OOXML by that definition:
- The DOS "copy" command is a conforming OOXML consumer and producer, since it can accept and produce valid OOXML documents.
- The DOS "del" command is a valid OOXML consumer, and one which I heartily recommend.
- WinZip, PkWare and every other Zip application out there are also conforming producers and consumers of OOXML.
- Every text editor and every XML editor and every other application that can consume text or XML also supports OOXML.
- Every web server and every application server, and every file system in the world can support OOXML if it can store an OOXML document without crashing or restore a valid OOXML document.
- A USB memory stick is also a conformant OOXML application, since it can consume and restore OOXML documents.
- An MP3 player is a conforming OOXML application, if it has a disk mode that allows files to be stored and restored.
Feel free to suggest your own nonsense.
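Here is a sketch of the "copy" example, to show how little the clause demands under the reading given above. Everything in it is an assumption for illustration: the function name is mine, and the input is a stand-in byte string rather than a real OOXML package, which is rather the point, since the program never looks inside the file.

```python
import shutil
import tempfile
from pathlib import Path

# Stand-in bytes; a real test would use an actual conforming document.
STAND_IN_DOCUMENT = b"PK\x03\x04 pretend this is a conforming document"


def copy_consumer_producer(src, dst):
    """'Consume' a file by not rejecting it, and 'produce' a byte-identical copy."""
    shutil.copyfile(src, dst)


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "in.docx"
    dst = Path(tmp) / "out.docx"
    src.write_bytes(STAND_IN_DOCUMENT)
    copy_consumer_producer(src, dst)
    # If the input conformed, the output conforms too, since the bytes are the same.
    identical = src.read_bytes() == dst.read_bytes()

print(identical)
```

Nothing here parses XML, understands a schema, or renders a page, yet by the plain-English reading of the clause it never rejects a conforming document and it can emit one.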
By analogy to patent trolls, what we're seeing here is the behavior of a standards troll -- defining a conformance clause so vague that everything in the world is considered to support it, and then searching through competitor's web sites in hopes of finding some place where they stumbled into supporting it, and then trying to extract some advantage from it.
The point should be to look for examples of where OOXML is supported to the highest degree, to point out the best examples of high-fidelity interchange that your standard allowed. You would think that with so many people at Microsoft with "interoperability" in their job titles, that this would be obvious. I guess not. But don't be sad. You can always count on "supportadmin3" to cheer you up!!!
Labels: OOXML
Sunday, January 13, 2008
You are Here
What has happened so far?
OASIS OpenDocument Format (ODF) is the current ISO standard (IS 26300:2006) for XML-based word processing, spreadsheet and presentation documents. By using an open standard format like ODF, consumers avoid vendor lock-in and are able to have a choice of suppliers. ODF is widely supported by vendors, in both commercially-available and open-source software, and is seeing strong adoption world wide.
In early 2007 the European Computer Manufacturers Association (Ecma), after a superficial review clothed in secrecy, submitted the Microsoft-authored document format specification, Office Open XML (OOXML), to ISO/IEC JTC1 for approval as an International Standard. This provocative submission occurred only three months after JTC1 published OpenDocument Format (ODF) as a unanimously approved International Standard.
OOXML has been widely criticized as a flawed standard, having been designed with only a single vendor's objectives in mind and designed to work fully only with that vendor's products. Also, in its rush to catch up with ODF, OOXML was submitted to JTC1 in an immature state, hastily written and insufficiently reviewed. At the time it was approved by Ecma, there were zero commercially available implementations of OOXML. The only support was in the beta version of Office 2007.
In a preliminary 30-day "contradiction period", JTC1 member bodies were invited to raise objections if they believed that the OOXML submission contradicted existing ISO or IEC standards. Twenty countries responded in this period, most of them raising concerns over OOXML. Several NB's raised objections to the extreme length of the proposal (over 6,000 pages) and raised IP concerns. JTC1 administrators effectively ignored all of these objections and proceeded to a 5-month ballot.
On September 2nd, 2007, after a 5-month review period by ISO/IEC JTC1 national bodies (NB's), the ballot to approve DIS 29500 Office Open XML (OOXML) failed, not reaching the required 2/3 approval by JTC1 P-members. This ballot was tainted by many documented irregularities. Over 3,500 comments were submitted by NB's with this ballot, documenting specific errors, ambiguities and omissions in the OOXML proposal.
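The two-thirds test mentioned above is simple arithmetic, sketched here with made-up tallies. Only the two-thirds threshold comes from the text; leaving abstentions out of the denominator and treating exactly two-thirds as a pass are assumptions of this sketch, and any other approval criteria the procedures may impose are not modeled.

```python
def meets_two_thirds(approve, disapprove):
    """Check the two-thirds approval threshold among P-members casting a yes/no vote.

    Assumption: abstentions are excluded, so they never appear in the arguments.
    Integer arithmetic avoids floating-point rounding right at the boundary.
    """
    return 3 * approve >= 2 * (approve + disapprove)


# Hypothetical tallies, not the actual ballot numbers.
print(meets_two_thirds(approve=20, disapprove=10))  # exactly two-thirds
print(meets_two_thirds(approve=17, disapprove=15))  # a bare majority is not enough
```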
What is next for OOXML?
JTC1 procedures allow a proposer of a failed standard the opportunity to respond to the ballot comments submitted, in hopes of persuading members to change their vote from disapproval to approval.
This procedure occurs in several steps. In the first step, the formal proposer of the standard, Ecma in this case, writes a Proposed Disposition of Comments report, in which they recommend their proposed resolutions for each comment from the September 2nd ballot. This is the document that is due on January 14th.
We've seen draft versions of these proposed resolutions, but they were provided in a rough form, impossible to review, as 3,000+ separate PDF files, amounting to over 5,000 pages, ordered alphabetically by the country that made the underlying technical comment. This is not exactly a convenient arrangement for seeing, for example, all comments related to spreadsheet date serial numbers, or for doing any other topical review. So it will be good to finally have Ecma's full Proposed Disposition of Comments report, which presumably will be in a more usable format.
Note that the Ecma submission on January 14th will be non-binding, merely a set of proposals. Ecma does not have the power to change a single line in OOXML, since the proposed standard is under JTC1 control. Ecma can propose solutions to comments, as can JTC1 members themselves, as they did in great numbers in the proposals that accompanied ballot comments on September 2nd. No changes are actually made until approved by the Ballot Resolution Meeting (BRM). This fact should be kept in mind as Ecma reviews with some NB's a draft of their Proposed Disposition of Comments Report. This is just a proposal at this stage.
What about the BRM?
The BRM, or Ballot Resolution Meeting, will occur February 25th-29th in Geneva. All NB's who voted on the September 2nd ballot are able to attend, and approximately 35 NB's are planning on sending delegates, with attendance expected to fill the hall to capacity, 120 people. Ecma can attend, but they cannot vote.
The BRM, preferably by consensus, though formal votes are also possible, will agree on a set of changes to the text of OOXML. Proposals for changes may come from Ecma's Proposed Disposition of Comments report, as well as from NB ballot comments. Resolutions may be debated, amended, substituted, approved, rejected, etc., according to a vote of the meeting. Or at least that's my understanding. The actual documented BRM process in JTC1 Directives is entirely inadequate, with a lack of detail that is better suited to the by-laws of a Ladies Over-60 Bowling League than it is to ISO. The Convenor of the BRM, Alex Brown, has the unenviable task of consulting bird entrails or performing whatever other divinations are required to turn JTC1's vague scratchings into a working meeting. We wish him luck!
At the adjournment of the BRM we will have an agreed-upon set of editing instructions for the Ecma Project Editor to apply to OOXML. Only changes approved by the BRM may be made to the standard. Note that the BRM does not indicate approval or disapproval of the OOXML standard itself. Its purpose, its technical role, is merely to make changes to the text of the standard.
What occurs after the BRM?
After the BRM adjourns (February 29th) there will be a 30-day "reconsideration" period in which those NB's who voted on the September 2nd ballot will be able to change their vote. They can change their vote in any direction, from approval to disapproval, approval to abstention, abstention to approval, abstention to disapproval, disapproval to approval, or disapproval to abstention.
Note that the criterion for the vote is the same as on September 2nd – Should DIS 29500 Office Open XML be approved as an International Standard? It is not a vote on the BRM, nor is it a vote on Ecma's Proposed Disposition of Comments. The question on February 29th is the same question as Sept 2nd -- Is the DIS 29500 proposal acceptable?
Labels: OOXML
Thursday, December 20, 2007
Those who forget Santayana...
What is interesting to me, and why this "old news" is worth talking about, is the analysis Novell made in their complaint of Microsoft's treatment of document format standards. The concerns of 2004 (or 1995 even) are very similar to the concerns of 2007. Let's go through Novell's argument and see where it leads us.
91. As Microsoft knew, a truly standard file format that was open to all ISVs would have enhanced competition in the market for word processing applications, because such a standard allows the exchange of text files between different word processing applications used by different customers. A user wishing to exchange a text file with a second user running a different word processing application could simply convert his file to the standard format, and the second user could convert the file from the standard format into his own word processor's format. Thus, a law firm, for instance, could continue to use WordPerfect (which was the favorite word processor of the legal profession), so long as it could convert and edit client documents created in Microsoft Word, if that is what clients happened to use...
This is a good statement of the benefits of an open document standard. Note that Novell is not arguing that the benefit of a standard is to get information in or out of a single vendor's product, like Microsoft Office. The benefit is that a standard provides for interchange between any pair of word processors.
...Microsoft knew that if it controlled the convertibility of documents through its control of the RTF standard, then Microsoft would be able to exclude competing word processing applications from the market and force customers to adopt Microsoft Word, as it soon did.
Note also that Novell is not complaining here about Microsoft's control of the binary DOC format (and its many variations). Instead, what Novell complains about is Microsoft's control over the document exchange format RTF, or Rich Text Format, used in those days to exchange data between word processors. He who controls RTF, controls document exchange, controls vendor lock-in and has the sole means of improving the fidelity of document exchanges.
In fact, Microsoft claimed that RTF addressed this very concern -- document exchange in a cross-platform, cross-application fashion, as stated in the introduction to version 1.0 of their self-styled "standard":
The RTF standard provides a format for text and graphics interchange that can be used with different output devices, operating environments, and operating systems. RTF uses the ANSI, PC-8, Macintosh, or IBM PC character set to control the representation and formatting of a document, both on the screen and in print. With the RTF standard, you can transfer documents created under different operating systems and with different software applications among those operating systems and applications
It should have been obvious at the time that vesting exclusive control of an interoperability interface in a single company was a bad idea. But I guess the world didn't realize what dealing with Microsoft meant. But we know better now. So why are we making the same mistakes in 2007?
Those who control the exchange format can control interoperability and turn it on or off like a water faucet to meet their business objectives. I don't know how many people noticed the language in Microsoft's press release announcing their sponsored interoperability track at XML 2007 a few weeks ago:
In its approach, Microsoft strives to bring technologies to market in a way that balances competitive innovation with the real interoperability needs of customers and partners.
Let that sink in for a minute. Microsoft is saying that they need to balance interoperability and profit. (Their profit, not yours) They can't maximize for both simultaneously. They need to trade one off for the other.
Continuing with Novell's 2004 complaint:
92. The specifications for RTF were readily available to Microsoft's applications developers, because RTF was the format they themselves developed for Microsoft's office productivity applications. Microsoft withheld the RTF specifications from Novell, however, forcing Novell to engage in a perpetual, costly effort to comply with a critical "industry standard" that was, in reality, nothing more than the preference of its chief competitor, Word. Indeed, whenever Word changed its own file format, Microsoft unilaterally and identically changed the RTF standard for Windows, forcing Novell and other ISVs constantly to redevelop their applications. In this manner, Microsoft gave Word a permanent, insurmountable lead in time-to-market and made document conversions difficult for users otherwise interested in running non-Microsoft applications. Many WordPerfect users were thus forced to switch to Microsoft Word, which predictably monopolized the word processing market....
So, the RTF standard was just a dump of Word's features, done when and how Microsoft felt like doing it. As one wag quipped, "RTF is defined as whatever Word saves when you ask it to save as RTF."
This should sound familiar. OOXML is nothing more than the preferences of Microsoft Office. Whenever Word changes, OOXML will change. And if you are a user or competitor of Word, you will be the last one to hear about these changes. ISO does not own OOXML. Ecma does not own OOXML. OOXML, in practice, is controlled and determined solely by the Office product teams at Microsoft. No one else matters.
Consider that Microsoft has recently proposed over 1,700 changes to the OOXML specification, including fixes that presumably will be made into a future Office 2007 fixpack. Microsoft knows what these fixes will be. The Office developer teams know what these fixes will be. But if you are a competitor of Microsoft's in this space, do you know what these changes are? No. Microsoft has decided to keep them a secret, claiming that the ISO process allows them to withhold interoperability information from competitors in what they maintain is an "open standard".
Further, the coding of Office 14 a.k.a. Office 2009 is well underway. Beta releases are expected in early 2008. But are file format changes needed to accommodate the new features being discussed in Ecma? No. Are they being discussed in ISO? No. Are they being discussed anywhere publicly? No.
Is this how an open standard is developed?
My prediction is that the first time anyone hears about what is in the next version of OOXML will be when Office 14 Beta 1 is announced. Other vendors will not hear a word about the format changes until after the Beta 1 is already released. Not even Ecma will hear about the changes until after then.
DIS 29500 is already obsolete, has already been embraced and extended. You just don't know about it yet. You weren't meant to know. In fact, pretend you don't know. Give Microsoft a big head start. They need it.
Further from the Novell complaint:
93. ...As in the case of RTF, Microsoft forced Novell to delay its time-to-market while redeveloping its applications to an inferior standard. Because these standards were lifted directly from Microsoft's own applications, those applications were always "compatible" with the standards.
And that is the key, isn't it? By owning the "standard" and developing it in secret, without participation from other vendors, in an Ecma rubber-stamp process, Microsoft rigs the system so they can author an ISO standard with which they are effortlessly compatible, while at the same time ensuring that their products maintain an insurmountable head start in implementing these same standards. There is no balance of interests in OOXML. It is entirely dictated by Microsoft, and voted on, in many cases, by their handpicked committees in Ecma and ISO.
So much for Novell's complaint from 2004. I'm told that this case is suspended as of November 2007, as the two parties pursue mediation. A status report on that mediation is due to Judge Motz by January 11th, 2008. Maybe we'll hear more then.
Looking at this long history of standards abuse by Microsoft, in the file format arena and elsewhere, I'm drawn to take a broader view of this controversy. It is not really a battle between ODF and OOXML. It isn't even really a battle between OOXML and ISO. It is, in the end, a battle between having document standards and not having them. Microsoft is trying to dumb down the concept of standards and interoperability to a point where these concepts are meaningless and ineffective. This is not because they want to support standards more easily in their products. No, it is because they do not want standards at all.
Remember, standards bring interoperability, the ability to try out new tools and techniques, the ability to migrate, the ability to choose among alternatives, the ability even to run non-Microsoft products. If standards are meaningless and ineffective, then the incumbent's vendor lock-in will win every time. At that point, isn't it convenient for them to have a monopoly in operating systems and productivity applications? This, in my opinion, is the essence of Novell's 2004 complaint, Opera's present complaint, and the ongoing file format debate. Microsoft's monopoly power and the resulting network effects have led to a relationship with standards where they win by winning, by drawing, or even by cheating so much that they discredit the system.
Thursday, December 06, 2007
Bait and Switch
Let's review the record.
We start with the Ecma whitepaper, "Office Open XML Overview" [pdf] which was included in their submission to ISO:
Standardizing the format specification and maintaining it over time ensure that multiple parties can safely rely on it, confident that further evolution will enjoy the checks and balances afforded by an open standards process.
OK. So we were told that if OOXML is standardized its future evolution will be in an open standards process, with checks and balances.
Brian Jones, from a mid 2006 blog post:
There has also been talk though of taking the formats to ISO once they have been approved by Ecma, which would mean that if ISO chooses to adopt the Open XML formats the stewardship of the formats would be theirs. We've had a number of governments indicate that they would like the formats to be given to ISO, and it's likely that after the Ecma approval that will be the next step.
Again, saying that approval by ISO is tantamount to transferring custody of the format to ISO.
Six months later, Brian wrote:
Some feedback that we got primarily from governments was that they wanted to see these formats not just fully documented, but that the stewardship and maintenance of that documentation should be handed over to an international standards body.
.
.
.
Obviously, a great way to guarantee the long term availability of OpenXML, and the confidence that it won't change is for an organization like ISO to take ownership of the spec.
OK. Not exactly a signed-in-blood promise, but still a clear, leading indication that the feedback they received from customers was for stewardship and maintenance and even ownership of OOXML to be handed over to ISO.
As the OOXML (DIS 29500) ballot drew nearer to a close, these vague intimations became outright promises. We heard over and over again that we should approve OOXML because that was the only way to ensure that the format would remain open. The first version might be a mess, but if we approve it just this once, all future versions will be developed in openness and transparency.
For example, John Scholes writes of a Microsoft promise made at a National Computing Centre (NCC) file format debate held in London on July 4th:
Would the maintenance of the standard be carried out by Ecma (assuming OpenXML became an ISO/IEC standard) or would it be carried out by JTC1? No question, JTC1. But would the detail be delegated to Ecma? No, it would all be beyond MS’ control in JTC1. Well at this point there was apparently some sotto voce discussion between Stephen and Stijn, followed by a little backtracking, but it came across loud and clear in subsequent discussions in the margins that Stephen and Jerry believed this was for real. MS was handing over control of OpenXML to JTC1 (or trying to).
I participated in this debate as well, and I can confirm that it occurred exactly as John relates. I even asked a follow-up question to make sure that I hadn't misunderstood what Microsoft was saying. They were adamant. ISO would control OOXML.
Jerry Fishenden, Microsoft's lead spokesman in the UK, wrote two weeks later:
There's an easy question to consider here: would you prefer the Microsoft file formats to continue to be proprietary and under Microsoft's exclusive control? Or would you prefer them to be under the control and maintenance of an independent, open standards organisation? I think for most users, customers and partners that's a pretty easy question to answer: they'd prefer control and maintenance to be independent of Microsoft. And the good news is that the Open XML file formats are already precisely that: currently under the control of Ecma International (as Ecma-376) and, if the current voting process is positive, eventually under the control of ISO/IEC. Many major and significant UK organisations have already made clear that they support this move for Open XML to become an ISO/IEC standard.
.
.
.
The United States vote is one step in the direction to put Open XML under the control of the ISO/IEC standards body.
So Jerry is stating in no uncertain terms that approval of OOXML puts it under ISO control. This statement was repeated on an August 24th update on Microsoft's "Open XML Community" web site.
(I've heard many second-hand reports of additional repetitions of this promise made at NB meetings around the world, in the run up to the Sept. 2nd ballot. If anyone participated in such a meeting and heard such assurances first hand, feel free to add the details as a comment.)
So much for the promises. What makes this story worthy of a blog post is that we now know that, even as these promises were being made to NB's, at that same time Ecma was planning something that contradicted their public assurances. Ecma's "Proposal for a Joint Maintenance Plan" [pdf] outlines quite a different vision for how OOXML will be maintained.
A summary of the proposed terms:
- OOXML remains under Ecma (Microsoft) control under Ecma IPR policy.
- Ecma TC45 will accept a liaison from JTC1/SC34 who can participate on maintenance activities and only maintenance activities.
- Similarly, Ecma TC45 documents and email archives will be made available to the liaison (and through him a set of technical experts), but only the documents and emails related to maintenance.
- No mention of voting rights for the liaison or the experts, so I must assume that normal Ecma rules apply -- only Ecma members can vote, not liaisons.
- Future revisions of OOXML advance immediately to "Stage 4" of the ISO process, essentially enshrining the idea that future versions will be given fast-track treatment
So what Ecma is offering SC34 is nothing close to what was promised. Ecma is really seeking to transfer to SC34 the responsibility of spending the next 3 years fixing errors in OOXML 1.0, while future versions of OOXML ("technical revisions") are controlled by Microsoft, in Ecma, in a process without transparency, and as should now be obvious to all, without sufficient quality controls.
This maintenance proposal is on the agenda for the JTC1/SC34 Plenary meeting, in Kyoto on December 8th. I think this one-sided proposal should be firmly opposed.
Consider JTC1 Directives [pdf], 13.13:
If the proposed standard is accepted and published, its maintenance will be handled by JTC 1 and/or a JTC 1 designated maintenance group in accordance with the JTC 1 rules.
JTC1's practice in such matters is to delegate to the relevant subcommittee, so read "SC34" for "JTC 1" above. So it is within the procedures for SC34 to make this decision. In fact, ownership by the SC is the norm. The clause "and/or a JTC 1 designated maintenance group" is a new addition to the Directives which was added right before the OOXML procedure in ISO began. (Curiously this was the same revision of the Directives that added the escape clause to the Contradiction phase that allowed OOXML to continue despite the numerous unresolved contradictions with existing ISO standards.)
So what does a counter-proposal look like?
First, I think we should defer decision on this until the next SC34 Plenary, presumably in Spring 2008. It is not clear whether or not OOXML will ultimately be approved as an ISO standard, and even if it does, maintenance does not need to be completed for 3 years. So I don't think we should rush into anything.
The UK has made a proposal to create a new working group (WG) in SC34 dedicated to "Office Information Languages":
SC34/WG4 would be responsible for languages and resources for the description and processing of digital office documents. The set of such documents includes (but is not limited to) documents describing memoranda, letters, invoices, charts, spreadsheets, presentations, forms and reports.
WG4 would be expected to work on the maintenance of, for example:
- ISO/IEC 26300:2006
- ISO/IEC 29500 (should it exist)
and be responsible for reviewing any future office document formats.
I think this deserves serious consideration. This may be the type of neutral venue -- not Ecma and not OASIS -- that would be conducive to getting the technical experts together to refactor OOXML and harmonize it with ODF. Even in the likelihood that OOXML ultimately fails in its bid as an ISO standard, the draft could still be referred to a new WG4 for further work. This would also be a way for Microsoft to fulfill their promise to transfer stewardship, control and ownership of OOXML over to ISO, a promise they made publicly and repeatedly.
Labels: OOXML
Sunday, December 02, 2007
662 resolutions, but only if you can find them
They claim to be transparent and acting so that NB's can easily review their progress in addressing their comments.
Well, let's take a closer look.
First, Microsoft has managed to get JTC1 to clamp down on information. What was a transparent process is now mired in multiple levels of security leading to delay, denial of information to some NB participants and total opaqueness to the public.
Let's review how things worked with ODF.
- OASIS ODF TC mailing list archives are public for anyone to read
- OASIS ODF TC public comment list archives are public for anyone to read
- OASIS ODF TC meeting minutes, for every one of our weekly teleconferences going back to 2002, are all public for anyone to read.
- The results of ODF's ballot in ISO are public, including all of the NB comments
- The comments on ODF from SC34 members are also public
- The ISO Disposition of Comments report for ODF is also public for anyone to read
Short of allowing the public to read my mind, there is not much more we can do in OASIS to make the process more transparent. (And if you read this blog regularly you already have a good idea of what I'm thinking.)
But what about the OOXML process? Every single one of the above items is unavailable to the public, and in many cases is not available even to the JTC1 NB's who are deciding OOXML's fate.
In fact, OOXML is moving in reverse. Documents that were once public, such as the Sept 2nd. ballot results and NB comments, have been taken down and replaced with password-protected versions (Look for the DIS 29500 documents here. They all used to be available for all to download.) How do you get access to the password? The password is made available to NB points of contact "on request". But so far few NB's have requested the password. You can see here which ones have requested the password and which have not. As of today, only 18 of 51 NB's have requested the password. Only 35% of SC34 NB's have access to the same information they had back in September. Indeed, we're moving backwards.
In the particular cases of these "662 responses", Ecma is hosting them on their web site, on a different password protected page. (Yes, the comments and the resolutions to the comments are on two different web sites with two different passwords.) I'm hearing as well that few NB's actually have the password, and some who do are not passing it on to their own committee members. I've heard from a few NB members who explicitly requested access to these documents but were denied. Others are simply unaware that these comment resolutions are available. What was once an open process is now closing up.
(12/04/2007 Update: Brian Jones claims that these 662 resolutions are protected by JTC1 rules. But JTC1 rules apply to documents submitted into the JTC1 process, hosted by JTC1, assigned JTC1 "N" numbers, and archived by JTC1, as required by JTC1 process. But these 662 resolutions are not called for by the JTC1 process, are not hosted by JTC1, are not assigned JTC1 "N" numbers and are not archived by JTC1. They are Ecma documents, hosted by Ecma, assigned ID's by Ecma, and controlled by Ecma passwords. These documents were never submitted to JTC1. Ecma is in total control over whether or not the public has access to them.
Brian highlights some rules that apply to the Disposition of Comments report, but that is not what we have before us. We won't have the Disposition of Comments report until after the Ballot Resolution Meeting. At that point, it will be an official JTC1 document, assigned an "N" number, hosted by JTC1 and accessible via their password.
Note also that Microsoft continues to dodge how closed the Ecma TC45 process has been and remains. Why not open up the TC45 mailing list archives, Brian? Are the ISO meanies stopping you? I know that Ecma is not forcing you. Their policy is to let each TC decide for themselves. I'm sure if Microsoft took a leadership position in favor of openness that you could convince the other members of TC45 to increase their transparency. What do you say?)
(12/06/2007 Update: The former Ecma Secretary General weighs in on the topic in a blog post, confirming that the responses are not controlled by ISO access rules, though the original NB comments are:
Consequently, Ecma is not constrained in posting its interim responses on a publicly available page as long as they are not tied to specific NB comments. In other words, Ecma would have to do some work to separate the proposed responses from the specific NB comments, but then Ecma may make its work publicly visible. If there is so much interest outside the NB circuit, then Ecma will surely do something here.
.
.
.
Indeed, seen from Ecma there is nothing that forbids Ecma to distribute its proposals. But it should also be clear, in the light of the longstanding relationship, that it is not a MUST for Ecma to do this. Good habits and rules have a value, like in any great game, such as football. And also there the rules and habits don’t change overnight because somebody has another, maybe even brilliant idea
But suppose you get through your local NB politics and actually lay your hands on the password to the Ecma web site, what do you get then? You then have the privilege of navigating 50 or so different pages, scrolling through them and clicking on 662 links to download 662 separate PDF files, all from a painfully slow server. Ughh... It hardly seems worth it. It is almost like someone wants to discourage NB's from actually reading this stuff.
Aiming to lessen the pain a little, I downloaded all 662 comments, and made a single PDF file that contains all of the comment responses. I also included the original NB comments, and cross-linked everything, so I can navigate from comment to response, and slice and dice it by similar comments, or by NB. It is full text indexed, so I can search for things like "VML" and see all comments or responses relevant to that topic. Since it is liberated from the Ecma website, I can even use it off-line.
Doesn't my method sound easier to use than downloading 662 PDF files? If you agree, then I'll make you an offer. If you are a JTC1 or SC34 NB member, and would like access to this consolidated document, let me know via email. (You can find my email address here.) Note that my compilation is not a formal JTC1 document, and that this is not an offer from the US NB. This is a personal offer from me to other individuals who are also JTC1 or SC34 NB members. (Of course, if Ecma wants a copy of this as well to make available for all NB's to download, then that is even better. They know where to find me.)
So, now that I've read through these 662 responses, let me fill you in what we have here. First, I'd like to define some terms, so we're all on the same page and understand the status of these 662 proposals.
At the BRM, barring any breakdown from lack of consensus, there will be issued an official "Resolution of Comments" document. This is the set of textual changes that JTC1 NB's authorize the Project Editor (Microsoft's contractor in Ecma) to make to the DIS 29500 specification. Only the BRM can authorize these changes.
By January 14th, JTC1 NB's will receive from the Project Editor a "Proposed Resolution of Comments" document. This will be Ecma's proposals for how they would like to see the Sept. 2nd ballot comments resolved. The BRM is not limited to considering Ecma's proposals. Their own NB comments from Sept 2nd may also be in play, since those often came with their own proposed resolutions which differ from the ones that Ecma will propose.
So what do we have now, in this recent drop of 662 documents from Ecma? I call these by the verbose name: "Ecma's Draft Proposed Resolution of Comments". They are not the final Resolution of Comments, and they are not even the final Proposed Resolution of Comments. They are a draft of proposed resolutions to 662 of the 3,522 comments submitted by JTC1 NB's on Sept 2nd.
So the time line is:
- From now until late January we receive updates from Ecma in the form of Draft Proposed Resolution of Comments. If they continue to be posted in a user-unfriendly form, I will continue to produce updates to my consolidated report.
- By January 14th, Ecma submits their final Proposed Resolution of Comments
- At the adjournment of the BRM we have the approved Resolution of Comments
- The Project Editor then has 30 days to apply the Resolution of Comments to produce the new text of DIS 29500
- It is the above revised text that NB's will consider whether to approve or not. Note that since the NB only has 30 days to reconsider their Sept. 2nd vote, and the revised text is not due until 30 days after the BRM, it is likely that NB's will need to use their imagination and decide based on the approved Resolution of Comments document (perhaps 4,000+ pages in length), not having seen the actual revised text of the DIS.
This initial set of responses is almost entirely minor, dealing with corrections to examples, spelling errors, punctuation errors, cleanup of broken links, fixing illegible formulas, adding missing units on quantities, etc. There are also many, many duplicates in this area. In particular, the issue regarding spreadsheet functions missing units on some functions (not specifying radians or degrees) was picked up by 12 NB's. Since there are multiple instances of that defect in the OOXML specification, each one repeated by several NB's, this single observation results in 48 proposed resolutions. Ecma appears to have concentrated on comments like this, easy to fix and duplicated, in this batch. So although there are 662 resolutions on paper, this maps to perhaps only 80 or so unique issues.
The breakdown of proposed resolutions by NB is in the table below. These numbers are a bit tricky to interpret with the duplicate comments, since one NB's comments might have been addressed in passing while fixing another NB's issues. So I doubt Microsoft is spending a lot of time on Colombia, since they voted yes. But there may be a significant duplication between Colombia's comments and those of another NB which Microsoft is trying to please. But by looking at unique comments, those submitted by only one NB, we can get a good sense of which NB's Ecma is trying to please most. And no, I'm not going to tell you which ones they are.
| Member | Comments Submitted | Ecma Responses | % Responded to |
|---|---|---|---|
| UK | 635 | 218 | 34% |
| Ecma | 76 | 23 | 30% |
| Colombia | 237 | 71 | 30% |
| Philippines | 7 | 2 | 29% |
| USA | 288 | 69 | 24% |
| Chile | 217 | 44 | 20% |
| Malta | 5 | 1 | 20% |
| Japan | 82 | 16 | 20% |
| Canada | 79 | 15 | 19% |
| Czech Republic | 75 | 13 | 17% |
| Uruguay | 18 | 3 | 17% |
| Ireland | 12 | 2 | 17% |
| France | 592 | 97 | 16% |
| Australia | 30 | 4 | 13% |
| Germany | 162 | 20 | 12% |
| Portugal | 118 | 14 | 12% |
| Brazil | 64 | 7 | 11% |
| Greece | 113 | 11 | 10% |
| Denmark | 168 | 15 | 9% |
| Kenya | 81 | 7 | 9% |
| Ghana | 12 | 1 | 8% |
| India | 82 | 5 | 6% |
| Israel | 33 | 1 | 3% |
| Venezuela | 73 | 2 | 3% |
| Iran | 58 | 1 | 2% |
| Turkey | 1 | 0 | 0% |
| Jordan | 1 | 0 | 0% |
| Ecuador | 1 | 0 | 0% |
| Thailand | 1 | 0 | 0% |
| Spain | 1 | 0 | 0% |
| Belgium | 1 | 0 | 0% |
| Austria | 1 | 0 | 0% |
| Argentina | 1 | 0 | 0% |
| China | 1 | 0 | 0% |
| Singapore | 2 | 0 | 0% |
| Italy | 2 | 0 | 0% |
| Tunisia | 3 | 0 | 0% |
| Bulgaria | 3 | 0 | 0% |
| Poland | 4 | 0 | 0% |
| Mexico | 7 | 0 | 0% |
| Peru | 10 | 0 | 0% |
| Norway | 12 | 0 | 0% |
| Finland | 15 | 0 | 0% |
| South Africa | 17 | 0 | 0% |
| Switzerland | 19 | 0 | 0% |
| Malaysia | 23 | 0 | 0% |
| Korea, Republic of | 25 | 0 | 0% |
| New Zealand | 54 | 0 | 0% |
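The "% Responded to" column is simply responses divided by comments submitted, rounded to a whole percent. A minimal Python sketch of that arithmetic follows, using a handful of rows copied from the table and the 662-of-3,522 totals quoted earlier in the post. The function name `percent_responded` is a hypothetical helper chosen for illustration, not anything from Ecma or JTC1.

```python
# Rows copied from the table above: (member, comments submitted, Ecma responses).
rows = [
    ("UK", 635, 218),
    ("USA", 288, 69),
    ("France", 592, 97),
    ("New Zealand", 54, 0),
]


def percent_responded(comments, responses):
    """Share of comments that received a draft response, as a whole percent."""
    return round(100 * responses / comments)


for member, comments, responses in rows:
    print(f"{member}: {percent_responded(comments, responses)}%")

# Overall coverage of this batch: 662 draft responses against 3,522 ballot comments.
overall = percent_responded(3522, 662)
print(f"Overall: {overall}%")
```

Running the same function over every row is a quick way to spot which NB's have had a larger or smaller share of their comments answered so far.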
To be fair, not every resolution in this batch was editorial. There was some technical detail added. For example, the following points were clarified:
- The SpreadsheetML AND/OR functions do not short circuit, so all parameters must be evaluated.
- The CHAR() function converts an integer into a character. But no character set was defined in the DIS to govern this conversion. Microsoft clarified this, saying that the function uses the "Macintosh character set" on the Mac and ANSI on all other platforms.
- Spreadsheet functions that do searches or string compares (EXACT, FIND, FINDB, SEARCH, SEARCHB, etc.) do so with lexical character comparisons, not collation-based operations.
- Part names in an OPC package can be IRI's, not just URI's. So this allows Unicode characters, with some restrictions in item names.
However, the 662 comments carefully tip-toed around the controversial issues. I guess we'll read proposals for those in a future update. So NB members, take the opportunity now to get access to this portal. Ask your NB head for access if you haven't already been given the password. And if you want a copy of my consolidated PDF file, let me know.
Labels: OOXML
Friday, November 30, 2007
The Myth Of OOXML Adoption
Anyone else want to recognize reality? Maybe I can help.
Two questions to consider: 1) What is the actual state of OOXML adoption? and 2) What influence should market adoption of a technology have on its standardization?
On the first question, we should note that the 400 million users figure quoted by vdBeld in no way concerns OOXML. That figure is merely Microsoft's estimate of the total number of Microsoft Office users, of all versions, world wide. Only a small percentage of them are using OOXML.
Let's see if we can estimate the number.
How are Office 2007 sales? One (leaked) estimate (in September) was 70 million. But a follow-up statement makes it clear this is total Office licenses sold, of all versions. This is probably on the high end, not indicating installations, or even real end sales, since Microsoft typically reports sales into the channel. So that number must be reduced by some factor to account for real installations.
What percentage of Office users are running Office 2007? Joe Wilcox quotes Gartner, saying "Our Symposium survey showed Office at greater than 10 percent installed base..."
And not every Office 2007 installation will use the default OOXML formats. I've heard that corporate installations are often choosing to change their configuration to default to Compatibility Mode, so that Office 2007 saves in the legacy binary formats, for the increased interoperability this offers.
How does this net out? Something more than 40 million and less than 70 million seems the right neighborhood.
Let's look for some more data points.
Take the example of OpenOffice, which has seen over 100 million downloads, not including copies which are included already with Linux distributions. So I believe there are far more OpenOffice users than Office 2007 users. Of course, not all OpenOffice users save in ODF format. Some will change the defaults to use the legacy Microsoft binary formats.
Let's take a look at an updated version of a chart I made back in May, with data now current through 11/27/2007.

The data here shows the number of documents reported by Google over time for ODF and OOXML documents. Hollow circles are ODF data points; solid circles are OOXML data points. (Yes, I need to figure out how to do scatterplot legends in R.) The X-axis does not show the date. That would not be fair, since ODF had a significant head start in standardization and adoption. So in order to have a fair comparison, both formats are shown against the number of "days since standardization", which is counted from May 1st, 2005 for ODF, and December 7th, 2006 for OOXML, the days the formats were approved by OASIS and Ecma respectively.
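The "days since standardization" axis is just a date subtraction per format. The original chart was made in R; what follows is only a minimal Python sketch of the same normalization, using the two approval dates and the 11/27/2007 cutoff given in the text. The names `STANDARDIZED` and `days_since_standardization` are hypothetical, chosen for illustration.

```python
from datetime import date

# Approval dates given in the text (OASIS for ODF, Ecma for OOXML).
STANDARDIZED = {
    "ODF": date(2005, 5, 1),
    "OOXML": date(2006, 12, 7),
}


def days_since_standardization(fmt, observed):
    """Map a calendar date onto the per-format x-axis used in the chart."""
    return (observed - STANDARDIZED[fmt]).days


# The most recent sample mentioned in the text.
last_sample = date(2007, 11, 27)
odf_days = days_since_standardization("ODF", last_sample)
ooxml_days = days_since_standardization("OOXML", last_sample)
print(odf_days, ooxml_days)
```

Plotting document counts against these values, instead of against the calendar date, lines the two formats up at day zero so the head start drops out of the comparison.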
Next week is the one year anniversary of Ecma's approval of OOXML as an Ecma Standard. The news is not good. There are fewer than 2,000 OOXML documents on the entire internet (as indexed by Google at least) and the trend is flat.
What about ODF? Almost 160,000 and growing strongly.
Now we shouldn't be so careless as to say that there are only 2,000 OOXML documents in existence, or for that matter only 160,000 ODF documents. Not all documents are posted on the web. In fact, most of them are sitting on hard drives, in mail files, behind corporate firewalls, etc. The documents that Google sees are only a sampling of real-world documents. But this is true of both ODF and OOXML. My hard drive is loaded with ODF documents that are not included in the above sampling. But however you spin it, the minuscule number of OOXML documents and their pathetic growth rate should be a cause of concern and distress for Microsoft.
Where are all the OOXML documents? What governments have adopted OOXML? What agencies? What major companies? If there was an adoption bigger than a Cub Scout pack we would have heard it trumpeted all over the headlines. Listen. Do you hear anything? No. The silence speaks volumes.
But for sake of argument, what if the numbers were different? What if there were millions of documents on the web in OOXML format? Would that have any relevance to the JTC1 standardization process? The answer is a clear "No". Market share, or even market domination, is not a criterion. In the US NB, INCITS, we are required to make our decision based on "objective technical factors". Making a decision to favor a proposed standard because of the proposer's market share would bring antitrust risks.
Consider this: In JTC1 we vote. One country one vote. We do not vote based on a nation's GDP. Jamaica and Japan are equal in ISO. We have engineers review the standards. We do not bring in accountants to review financial statements and verify inventories. If we want to make decisions based on market share then we should scrap JTC1 altogether and hand standardization over to revenue department authorities to administer.
But that would then perpetuate a technological neo-colonialism where the developed world controls the patents, the capital and the standards, and the rest of the world licenses, pays and obeys. There's the rub. Where standards are open, consensually developed in a transparent process and made available to all to freely implement, there we lower barriers to implementation, level the playing field and allow all nations of the world to compete based on their native genius. But where standards are bought we end up with bad standards and a worse world for it.
Labels: OOXML
Sunday, September 09, 2007
Office 2007's Confusion Mode
For document exchange between different versions of MS Office, on the surface it looks a little bit better. Office 2007 provides a "compatibility mode" for users of Office 2007 who wish to create or edit documents that will remain compatible with earlier versions of Office.
That's the theory at least.
In practice, things are rather messy. I recently received an email from Julie Watson, a project manager who has been doing enterprise deployments & migrations for 15 years. She has spent the last few months working on a plan to migrate 18,000+ workstations, trying to find a way to have a gradual rollout while still maintaining round-trip collaboration between her Office 2003 and Office 2007 users. Julie has put together a nice report showing what works and what doesn't. Ignore the official documentation and ignore intuition, since neither will serve you well here. Take a gawk at the seedy side of reality in "[Compatibility Mode] Confusion in Office 2007."
Labels: Office 2007, OOXML
Tuesday, September 04, 2007
How to Hack ISO
The short of it is that DIS 29500 has failed in its attempt to be approved as an International Standard. The Microsoft spinmeisters are trying to make defeat sound like it is a good thing, that this is just the next step in the approval process.
Jason Matusow claims that "The next 6 months will be where the rubber really meets the road for the work on Open XML." This is nonsense. The work should have been done back in Ecma, before submission to ISO. Fast Track is not a standards development process. It is intended for standards that are already completed and for which there is already industry consensus, to quickly transpose them into International Standards. Fast Track starts at the last stage, the Approval stage, of ISO's 5 steps. By this point it is assumed that the text is complete, accurate, and has already been thoroughly reviewed. Since JTC1 NB's have registered hundreds of technical flaws in OOXML, it is clear now that it never should have been put on Fast Track in the first place. The types of errors that are being reported now should have been found and fixed back at the committee draft stage or earlier, in Ecma. This defeat is an indictment of Ecma's shoddy review. It is an abuse of ISO process for Microsoft to try to ram it through Fast Track in this state. They deserve the rebuke they have been given for this poor judgment.
Let's drill into the numbers a bit and see what this all means.
First, recall JTC1's two approval criteria for Fast Track submissions:
- 2/3 of JTC1 P-members must approve
- No more than 25% of total votes may be negative
So, the ballot results for OOXML appear to be:
P-members: Approval: 17, Abstain: 9, Disapprove: 15. With only 53% Approval of P-members, DIS 29500 fails by the first criterion.
Overall vote: Approval: 51, Abstain: 18, No: 18. With 26% overall Disapproval, DIS 29500 also fails by the second criterion.
Microsoft highlights this second number (74%) in their press release, but does not even mention the P-member number. This is deceptive since even if they raised that number to 76% OOXML would have still failed. Only P-members can cause a Fast Track to be approved.
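Both tests can be checked mechanically from the tallies above. A minimal Python sketch follows, assuming (as the 53% and 26% figures imply) that abstentions are left out of both denominators. The function name `fast_track_result` and its dictionary keys are hypothetical, made up for this illustration.

```python
from fractions import Fraction


def fast_track_result(p_yes, p_no, all_yes, all_no):
    """Apply the two Fast Track criteria to a ballot.

    Assumption: abstentions are excluded from both denominators,
    which is what reproduces the 53% and 26% figures in the text.
    """
    p_approval = Fraction(p_yes, p_yes + p_no)
    disapproval = Fraction(all_no, all_yes + all_no)
    return {
        "p_member_approval": p_approval,
        "overall_disapproval": disapproval,
        # Criterion 1: at least 2/3 of P-members voting must approve.
        "passes_p_member_test": p_approval >= Fraction(2, 3),
        # Criterion 2: no more than 25% of votes cast may be negative.
        "passes_negative_vote_test": disapproval <= Fraction(1, 4),
    }


# Tallies from the text: P-members 17 yes / 15 no; overall 51 yes / 18 no.
result = fast_track_result(17, 15, 51, 18)
print(f"P-member approval: {float(result['p_member_approval']):.0%}")
print(f"Overall disapproval: {float(result['overall_disapproval']):.0%}")
print("Approved:", result["passes_p_member_test"] and result["passes_negative_vote_test"])
```

Because the two tests are independent, raising the overall approval figure alone cannot rescue a ballot that falls short among P-members.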
What is interesting is the large number of NB's who participated in this process that have never participated in JTC1 before (at least to my knowledge). In fact, a number of them have such a strong interest in JTC1's activities that they have joined as P-members -- the highest level of participation -- in some cases only in the last week or so. This is, I assume, what Tom Robertson, Microsoft's GM of Interoperability and Standards, means when he talks of "rejuvenating" standards bodies:
Robertson dismissed criticism of Microsoft’s efforts to encourage its partners to join standards bodies. Most standards bodies are filled with "an old guard" membership that needs rejuvenation, he said. He also likened Microsoft's recruitment efforts to a voter registration drive. "Have we been speaking to our community of companies about this issue? Yes, we have," he said. "They needed to know. They, in many cases, decided to participate. [But] there is no basis to allegations that we are gerrymandering the process."
"Old Guard" NB's appear to be those like Canada, France, New Zealand, Japan, Korea, Ireland, China or Norway that voted against OOXML. The new blood presumably are countries like Cote d'Ivoire, Syria, Kazakhstan and Tanzania that are participating in JTC1 for the first time, and voting in favor of OOXML.
JTC1 has historically had a rather stable membership of NB's active in its technical agenda. There has been only a slow increase in membership: 1 NB joining in 2001, none in 2002, 4 joining in 2003, 1 joining in 2004, none in 2005, 4 joining in 2006. But in 2007 JTC1 has been blessed with 12 new P-members, many of them joining in only the last week. There is a very clear trend in how these new P-members have voted:
| | Approval | Abstain | Disapproval |
|---|---|---|---|
| Old Guard | 7 | 8 | 14 |
| New NB's | 10 | 1 | 1 |
| Total | 17 | 9 | 15 |
In that table I'm defining "old guard" as those NB's who were P-members of JTC1 before the OOXML process started. As you can see, the "old guard" voted overwhelmingly against OOXML by 2-to-1 margins. But the new P-members have almost all voted in favor of OOXML.
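For anyone who wants to check the table, here is a minimal sketch that recomputes the column totals and the approval share within each group. Leaving abstentions out of the share is my own choice for the illustration, and the dictionary layout is hypothetical.

```python
# P-member votes from the table above.
votes = {
    "Old Guard": {"approve": 7, "abstain": 8, "disapprove": 14},
    "New NB's": {"approve": 10, "abstain": 1, "disapprove": 1},
}

def approval_share(row):
    """Share of Approval among votes actually cast (abstentions excluded, by assumption)."""
    cast = row["approve"] + row["disapprove"]
    return row["approve"] / cast

# Column totals should match the "Total" row of the table.
totals = {
    key: sum(row[key] for row in votes.values())
    for key in ("approve", "abstain", "disapprove")
}
shares = {group: approval_share(row) for group, row in votes.items()}

print(totals)
for group, share in shares.items():
    print(f"{group}: {share:.0%} approval among votes cast")
```

Among the old guard, 14 Disapprovals against 7 Approvals is the 2-to-1 margin mentioned above.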
We can look at this graphically as well, showing the P-member composition of JTC1 over time and how they ultimately voted. As you see, JTC1 was overwhelmingly against OOXML until the blip at the very end, when Kazakhstan, etc. joined.

Another difference between the "old guard" and the "rejuvenated" membership is the level of public input and industry participation in their national committees. The old guard members had public forums, invited all sides to come in and speak, had all their stakeholders participate, reviewed technical comments and tried to come to a consensus. With openness like this, no wonder Microsoft believes they need rejuvenation! The new members, well... let's just say the transparency of their decision making process is not uniformly great.
(6 Sept 2007 Update. I've removed the previous mention of CPI correlations to avoid confusion. The "transparency" of a national standards body's process does not bear any necessary relationship with a country's overall business climate. We've certainly seen in-depth, thorough technical reviews in the developing world, and we've seen suspect political dealings even in the United States. With the specific instances so damning to Microsoft, there is no need to make generalizations.)
I suppose that no one should be surprised that Microsoft, which has been stuffing committees at the national level throughout this ballot, would also attempt the same at the JTC1 level. From what I have been able to determine, NB's that had never sat in a single JTC1 meeting and never joined a single JTC1 technical committee were able to join as P-members, in the last hours of the OOXML ballot, simply by sending an email to ISO.
Although this attempt to juice their results by signing up new P-members did not help Microsoft win approval for OOXML, it remains to be seen what adverse effect this will have on other JTC1 activities. We need to remember that a participation rate of 50% of JTC1 P-members is required to transact most JTC1 business. So this "rejuvenation" may very well paralyze JTC1 entirely unless the new members are earnest and participate in ballots beyond OOXML.
Labels: OOXML
Wednesday, August 29, 2007
Pseudorandom Thoughts
So, the Q&A section rolls around, I asked some questions and an attempt was made by the MS reps to paint me as ill-informed and obtaining all my information from blogs on the internet run by anti-Microsoft fundamentalists. Oh, and of course IBM was mentioned as the prime company lobbying everyone and providing them with groundless reasons to vote against OOXML. Then came the best tactic of the day. Dismissing my questions as ‘too academic’ and ‘concerned with the needs of other nations, not Ghana’. After I stopped being annoyed at the attempt to shut me down, I was highly amused.
From Africa News is a report "African civil society warns Microsoft":
(FOSS) Foundation for Africa (FOSSFA), Ms Nnenna Nwakanma, told HANA that Nigeria like any other African country stands to gain by properly investigating the issue on the ground, stressing that Microsoft lobbyists have not been able to convince stakeholders how the OOXML document formats would benefit the public except for those who have Office 2007, which is a proprietary software .
"Only those using Office 2007 can benefit from it. If you use any Office apart from 2007, you first have to upgrade. I cannot understand why norms cannot be used unless certain proprietary changes had to be made," she said.
On the implication of voting 'No' to OOXML being proposed by Microsoft to Africa, especially in relation to e-School initiative, she said, already some African countries are warming up to embrace Open Document Formats (ODF), as an alternative file format.
But back to Sweden. My, my, what a mess. I suspect the same has happened elsewhere, including the US. But no one has been so careless as to leak a memo over here. We feel left out! So, if anyone has a similar "smoking gun" letter sent by Microsoft to line up MS Partners in the US to join INCITS V1 at the last minute, and doesn't know what to do with it, you might consider letting me know. I'll trade an original copy of the Utica Saturday Globe of Sept 21st, 1901, the President McKinley memorial issue, with full coverage of his funeral and burial, including a still brilliant page one color portrait (over the fold) of McKinley with Lady Liberty on the side, weeping, draped in flag with shield. Suitable for framing. A true collector's item for any McKinley fan.

(Trivia: Ever wonder why there are so many McKinley High Schools in the US? Because so many schools were built soon after his death.)
So what is wrong with stacking a committee? Isn't it just an expression of our freedom to associate? An interesting perspective from the Supreme Court, in a case that no one is talking about, but everyone should know: ALLIED TUBE & CONDUIT CORP. v. INDIAN HEAD, INC., 486 U.S. 492 (1988). This appears to be the highest profile case involving stuffing a standards committee:
Petitioner...can, with full antitrust immunity, engage in concerted efforts to influence those governments through direct lobbying, publicity campaigns, and other traditional avenues of political expression. To the extent state and local governments are more difficult to persuade through these other avenues, that no doubt reflects their preference for and confidence in the nonpartisan consensus process that petitioner has undermined. Petitioner remains free to take advantage of the forum provided by the standard-setting process by presenting and vigorously arguing accurate scientific evidence before a nonpartisan private standard-setting body. And petitioner can avoid the strictures of the private standard-setting process by attempting to influence legislatures through other forums.
What petitioner may not do (without exposing itself to possible antitrust liability for direct injuries) is bias the process by, as in this case, stacking the private standard-setting body with decisionmakers sharing their economic interest in restraining competition.
(Over on Slashdot one reader says of the above, "And I don't think normal people go around reading and quoting 20 year old anti-trust cases for fun." You don't know me very well, do you? I read legal analysis for fun. I have my own copy of Tribe's "American Constitutional Law", a facsimile edition of Blackstone's "Commentaries on the Laws of England", a three volume set of the writings of Edward Coke, and Fergus Kelly's "A Guide to Early Irish Law". Never confuse me with normal. But never confuse me for a lawyer either. I don't generalize well.)
In the "When Your Mom is the Beauty Pageant Judge" department comes news that the most influential "products, applications or technologies of the past 25 years", according to a super duper scientific poll by CompTIA, is Internet Explorer. Second place is Microsoft Word. Third place is Microsoft Excel. And tied for Fourth Place is Windows 95.
Joe Wilcox over at Microsoft Watch takes a pin to the Microsoft-sponsored puff piece IDC did on OOXML called "Adoption of Document Standards." And if the data is not rosy enough, Microsoft can make it look even better by cutting off the y-axis labels to make a more impressive bar chart. "This one goes to eleven." You could spend hours exposing the flaws in that paper, but why bother? Life is too short.
Wait... this just in. In a survey of most dumb-ass Microsoft-sponsored surveys of August, first place goes to CompTIA's "Microsoft, Creator of Civilization, Inventor of Fire & Universal Benefactor of Mankind" and second place goes to IDC's "4% Looks More Important in a Bar Chart if the Maximum is set to 5%."
This one brought a smile to my face. Software Engineer job postings at Red Hat in Pune. Resumes must be submitted in ODF format.
Freecode in Norway has a link to an essay [pdf] by Sun's XML Architect, Jon Bosak, entitled "Why OOXML Is Not Ready for Prime Time". Although I may disagree with Jon on the suitability of this single-vendor format for international standardization (his position is more along the lines of "Not yet" while mine is more like "Hell no"), I must admit he makes some excellent points.
Also, the Linux Foundation's Desktop Architects have a statement just out: OOXML - vote "No, with comments"
And speaking of "No, with comments", now that the Microsoft checks have presumably cleared, self-proclaimed "standards activist" Rick Jelliffe is recommending that Australia vote "No, with comments." This after a summer of speaking in favor of OOXML in India, Thailand (twice), Australia, New Zealand and who knows where else. How unfortunate for us all that his sage advice comes only after Standards Australia and most other countries have already finished their deliberations. I can only respond with the words of Lord Byron, from his "Ode to Napoleon":
And she, proud Austria's mournful flower,
Thy still imperial bride;
How bears her breast the torturing hour?
Still clings she to thy side?
Must she too bend, must she too share
Thy late repentance
In the following post Rick reinterprets time, gives sophistry a bad name and takes such liberties with geometry that would make M.C. Escher blush, all in an attempt to show that OOXML is really not 6,000 pages long, and it really wasn't created in less than a year. You can read his attempt to seek a logical basis for redefining reality to fit his preconceptions here, or just consult my one-slide summary below.
(Oh, Rick. One more thing. My last name is shared by Australia's most famous film director, Peter Weir. I manage to spell your name right. Maybe this mnemonic will help you spell mine right.)

Labels: OOXML
Monday, August 27, 2007
The OOXML BRM
I hear that IBM is still telling national bodies that a BRM isn't guaranteed. I am unsure how IBM reached that conclusion but this seems to be concrete evidence to the contrary.
Well, let me help refresh Mr. McGibbon's seemingly repressed memories.
First, scheduling a BRM does not guarantee it will be held. For example, have you heard of DIS 26926 "C++/CLI"? It was another Microsoft/Ecma Fast Track, just last year. The BRM meeting announcement went out on 25 October 2006, saying the BRM would be held 13-15 April 2007 in Oxford, England. Stephen, do you recall that BRM by any chance? Of course not, because it was canceled in February 2007 with the following message from the SC22 Secretariat:
We have been advised that the comments accompanying the Fast Track ballot for DIS 26926 are not resolvable and that holding a Ballot Resolution Meeting (BRM) would not be productive or result in a document that would be acceptable to the JTC 1 National Bodies. Therefore, our proposal is to not hold the BRM and to cancel the project.
So there is one example of a BRM that was scheduled and then canceled.
Want another? Sure, I can do that.
Take the case of DIS 26300 "Open Document Format." A Ballot Resolution Meeting was scheduled for May 29 to June 1, 2006 in Seoul, Korea, concurrently with the JTC1/SC34 Plenary. But was the BRM actually held? No. It was canceled by the Plenary:
Following the advice of the JTC 1 Secretariat, JTC 1/SC 34 cancels the previously-scheduled ISO/IEC 26300 Ballot Resolution Meeting and the SC34 Secretariat will forward the revised DIS text and accompanying disposition to SC34 national bodies for a 30-day default ballot when ready.
Why? Because ODF received no Disapproval votes. Although 8 of the 23 NB's submitted comments with their ballot, these were all "Approval, with comments" votes rather than "Disapproval, with comments." So a BRM was not deemed necessary. Only comments that accompany Disapproval votes must be addressed at a BRM.
So there you go, two examples of BRM's that were scheduled, but then canceled. The SC Secretariat has some discretion here. JTC1 Directives, Section 13.5 says, "In some cases the establishment of a ballot resolution group is unnecessary and the SC Secretariat can assign the task directly to the Project Editor." The two examples given show that if a ballot passes by large margins, or fails by large margins, a BRM may not be necessary.
How about another example from the recent past, the Fast Track DIS 29361 "Information technology – Basic profile." Its ballot closed on June 18th and passed with 17 of 20 P-members voting in favor. All Disapproval votes were accompanied by comments, as was one of the Approval votes. Since there were Disapproval votes surely there must have been a BRM, right? No, that's not how it worked. The JTC1 Secretariat decided a BRM was not necessary and the comments could be forwarded directly to the Submitter of the Fast Track for them to "review and respond". So even having Disapproval votes does not guarantee a BRM will be held.
Does this make more sense now?
Of course, Microsoft already knows all this, and no doubt that is why they are working so hard to urge NB's to vote "Approval, with comments" with promises that their comments will be addressed at the BRM, a BRM that might not even occur. In fact, if everyone listened to Microsoft and followed their advice then that would almost guarantee that no BRM would be held and no NB's comments would be adopted.
Labels: OOXML
Sunday, August 26, 2007
Disenfranchisement
There is also the less savory kind of disenfranchisement, the kind that borders on electoral fraud. For example, in the 2004 presidential election there were reports of activities like:
- In Ohio, some registered voters received anonymous phone calls telling them that they were not properly registered to vote and that if they tried to vote they would be arrested.
- In Florida, registered members of one political party received phone calls reminding them to vote on November 3rd. Too bad the election was really on November 2nd.
- In Wisconsin, fliers were handed out falsely stating "If you already voted in any election this year, you can’t vote in the Presidential Election."
I just received an email from someone in a national standards committee considering the OOXML ballot, concerning false information given to his committee which suggested the Sept. 2nd ballot deadline was not real, that they actually had 30 more days to decide. I'm not going to name names in this post, but I will say that this isn't the first note I've received regarding such tactics. Some of the other ploys I've heard of include:
- In the 30-day contradiction period, one NB was told that the stated deadline from ISO had been extended and that they actually had two more weeks to debate before sending in their response. If they had listened to this advice, this NB would have missed the deadline and their comments would have been disregarded.
- Another NB was told that they were not allowed to vote in the 5-month ballot because they had not participated in the contradiction period. This is totally false and has no basis in JTC1 Directives or past practice. Luckily this NB decided to check the facts for themselves.
- Several NB's were told that JTC1 had resolved all contradiction concerns with OOXML and that these issues therefore cannot be raised again in the 5-month ballot. This is utterly false. No one at JTC1 has made such a determination.
- Several NB's have been asked not to submit comments to JTC1 at all, but to send them directly to Ecma. (Yeah, right. Just sign your absentee ballot and give it to me. I'll make sure it gets in the mail)
- Many NB's are being asked to throw away their right to a conditional approval position by voting Approval on a specification that they believe is full of defects that must be fixed, even though the JTC1 Directives clearly state that "Conditional approval should be submitted as a disapproval vote."
- Many NB's are being persuaded to vote Approval with the promise that all of their comments will be "addressed at the BRM" without explaining that "addressing a comment" may entail little more than entering it in a Disposition of Comments Report with the remark "No action taken".
ISO works on a voting principle of one country/one vote. Don't let confusion over proper voting procedures deprive your country of their vote.
Labels: OOXML
Friday, August 24, 2007
Defective by Design
Enjoy!
27 August Update: Slashdot has further coverage of Stephane's article. Some good comments and perspectives can be read there.
Thursday, August 23, 2007
Is it safe?
I had a trip to the dentist on Monday. Whenever I have to go to the dentist I have images in my mind from the 1976 film Marathon Man, namely Laurence Olivier as Dr. Szell, the elderly Nazi war criminal, torturing Dustin Hoffman with various unorthodox dental procedures. I figure that if I mentally prepare for the worst, the real dentist will be gentler in comparison. I sometimes mention this movie to the dentist, but they all deny ever having seen the movie. Very odd. I think they are hiding something. Surely this classic must be a staple of dental school film societies everywhere. That and the Tim Conway dentist skit from The Carol Burnett Show. What else is there in terms of great moments in dental cinema?
In any case, a story is told, perhaps apocryphal, that Hoffman prepared rigorously for his role in this movie by depriving himself of sleep for two days, so his character would appear worn and haggard. Olivier, seeing Hoffman that morning, and hearing of his co-star's preparation, is said to have quipped, "Dear boy, next time why not try acting?"
I'm reminded of this line when I witness Microsoft's machinations in JTC1, as they attempt to get OOXML approved. They are mounting an enormous offensive and expending great sums of money to convince ISO members that this rubbish heap of a format is acceptable as an ISO standard. Someone needs to ask, "Dear boy, next time why not try engineering?" Instead of trying to force this ill-considered mess through JTC1 (causing a great deal of collateral damage in the process), why not take your great base of engineering talent and produce a good standard and have that sail through JTC1 with thanks and praise?
We're also seeing a shell game at play with the technical comments. Many of the technical flaws were uncovered and discussed on this blog back last summer and fall, before OOXML was even completed by Ecma. Microsoft didn't fix them then. In the 30-day contradiction period, in February 2007, many NB's raised these same issues. Microsoft didn't fix them then, saying that they should be raised in the 5-month ballot. Now these same comments are being raised for the third time, in the 5-month ballot, and Microsoft is suggesting that they can be fixed at the ballot resolution meeting (BRM) in February 2008. At the BRM I predict that Microsoft will suggest that the issues should be fixed during maintenance of the standard. That would fit their plans well since they have already petitioned JTC1 to have the maintenance of OOXML handed over to Ecma TC45, closing the circle. Microsoft will never need to fix any problems in OOXML at this rate.
Another curious ploy is the way Microsoft is trying to convince JTC1 members that "Yes" means "No", that if they have serious issues with OOXML a NB should still vote Approval. Let's look into what the voting rules really are.
First a simple question to warm up. If you see a tunnel with a sign that says "exit" do you think that you can enter it as well? If you answer "No," then congratulations, you are smarter than Microsoft thinks you are. Microsoft is essentially arguing around the globe that unless the tunnel has a sign that says "do not enter", then you are welcome to enter the tunnel regardless of the "exit" sign. They are arguing that a NB can do anything they want unless the JTC1 Directives explicitly forbid it.
The counter argument is actually quite simple. You just need to consult Section 9.8 of the JTC1 Directives, 5th Edition, Version 3.0, which I've extracted below:

As it says, an Approval vote is approval of the technical content as presented. It is not approval pending the addressing of comments, or contingent on future work being performed. It is not approval of the importance of the proposal or approval of the market importance of the technology or approval of the company or organization making the proposal. It is explicitly approval of the technical content as presented. Although comments may be appended, the approval is clearly not contingent on anything at all happening to those comments, since the language clearly says the approval of the DIS as presented. Nowhere in the Directives does it suggest that NB's may substitute their own criteria or procedures for evaluating a Fast Track DIS. The criterion is clearly stated, Approval of the technical content of the DIS as presented. In fact JTC1 Directives, Section 1.2 says "These Directives shall be complied with in all respects and no deviations can be made without the consent of the Secretaries-General." So any NB that substitutes their own evaluation criteria for the language of section 9.8 is violating the Directives.
Now, for a Disapproval vote, the Directives say that disapproval is made for specifically stated technical reasons, accompanied with proposals that would make the DIS acceptable, and that if these changes are made, the NB has the opportunity then to change their vote to Approval. Note that it is giving a clear ordering. The NB first votes Disapproval, listing the reasons why along with their proposals to fix the problem, then if the changes are accepted, the NB has the opportunity to change their vote to Approval.
This mechanism is called out again a few lines later when it speaks of "conditional approval" and that it should be registered as a Disapproval vote.
Note that under JTC1 Directives, neither Microsoft nor Ecma has the power to accept an NB proposal. They do not own DIS 29500. They are not NB's. Ecma's ownership of the proposal ended when the 5-month ballot began. The only entity that can formally address NB technical comments is the assembled NB's at the Ballot Resolution Meeting. Certainly Ecma can offer an opinion, but it is no longer theirs to accept or deny changes at this point. If Microsoft is promising resolutions to NB's, then it is promising something which is not theirs to give. (Before you buy a used car from someone, it may be wise to first verify that they actually own it.)
In summary, when Microsoft says that an NB should vote Approval, with comments, and that they promise that all comments will be addressed, this is defective analysis for several reasons:
- The Directives clearly state that Approval indicates that the NB accepts the technical content as presented. Certainly, if the NB has only small editorial comments but otherwise accepts the technical content, then an Approval vote is entirely appropriate. But if technical content is not acceptable as presented, then they must vote Disapproval or else they ignore the plainly stated language of the Directives.
- Voting "Approval, with comments" on the strength of a private promise from Microsoft that your comments will be addressed at the BRM anyway contradicts the clear statement that "conditional approval should be submitted as a disapproval vote."
- Neither Microsoft nor Ecma is competent to provide any assurance as to what the BRM will or will not do. They do not run the BRM and they do not control what comments are addressed. The BRM is an NB meeting.
Labels: OOXML
Tuesday, August 21, 2007
The dog that didn't bark
“Is there any point to which you would wish to draw my attention?”
“To the curious incident of the dog in the night-time.”
“The dog did nothing in the night-time.”
“That was the curious incident,” remarked Sherlock Holmes.
-- Silver Blaze by Sir Arthur Conan Doyle
A curious blog post from Brian Jones, looking at spreadsheet interoperability between Gnumeric and Apple's new Numbers spreadsheet, using OOXML. Take a read there and come back and we can compare notes.
Did anything strike you as odd? What raised my eyebrows was the utterly trivial nature of the spreadsheet document that was tested. Typically an interoperability demonstration will be a little flashy, showing as much functionality as possible. But this one has no text attributes, only a single, default numeric style, no charts, no use of spreadsheet functions, nothing. Why bother? There is nothing in this spreadsheet that one could not easily have created in VisiCalc 25 years ago. So why is this simplistic document being used to demonstrate interoperability with OOXML? This seems very odd. Interoperability with a more substantial document would have been far more persuasive. So why didn't they do that? Hmmm....
So I decided that I would give it a try, on my Windows XP laptop running Office 2007, OpenOffice 2.1 Novell Edition (giving it a test drive this week) and Gnumeric 1.7.10. Let's see what really works.
First, let's start with a more substantial spreadsheet document. I created the following in Office 2007, illustrating a variety of everyday features:
- numeric format
- simple text styles
- cell background fills
- cell alignment
- spreadsheet functions
- charts
- row widths
- worksheet password protection
- cell validation
- hyperlinks
- word art and shapes
- OLE embedding

Next, I tried opening the XLS file in OpenOffice. We see that it handled the file well:

The colors in the chart are clearly different, but I didn't set any particular colors in the original, opting for the default. So this may just be an indication that the charts in OpenOffice have different default colors. As you can see from the above picture, everything else looks fine. I did verify that worksheet protections, cell validation and the hyperlink worked correctly. However, although the OLE embedding seems to be there, I was not able to activate it.
Next, I fired up Gnumeric to see how it would fare. I first tried the same XLS file which loaded and displayed like this:

What do we notice?
- Cell A7 did not format properly. It should be in long date format, but it is displaying in time format.
- Chart colors differ, but this is probably just a difference in defaults.
- Chart text is clipped in several instances.
- The OLE embedding failed to come through with correct metafile for display.
- Workbook protections and cell validation worked as expected.
- Hyperlink worked correctly.
- Arrow shape was dropped.

Hmm...OK... I think we hear the dog barking now. The OOXML import into Gnumeric is not really usable yet. In addition to the problems indicated above with the XLS import, we can add the following:
- None of the charts converted
- Worksheet password protection was lost
- The hyperlink is broken
- The OLE embedding is missing
- Cell validation is broken
- The word art is missing
However, Microsoft points to Gnumeric as proof that OOXML can be implemented by other vendors. I suggest the jury is still out on this. 1-2-3 release 1.0a (1984) supported more functionality than Gnumeric does via OOXML. Note that even though there is no complete public documentation on the legacy binary formats, Gnumeric does a far better job at supporting them than it does with "standard" Ecma-376 OOXML and its 6,000 pages of documentation.
Now certainly, with much time and much effort, I'm sure Gnumeric will reach the point where it can read an OOXML document as well as it can read an XLS document. It might take another two or three years, but that day will come. But what benefit is that? All that effort will be spent writing code and testing to achieve practical results that Gnumeric already has achieved with the binary formats. This effort comes at the expense of other development activities such as adding features or fixing bugs. I hardly think that Jody wakes up in the morning overjoyed by the prospect of adding OOXML support to an application that is already compatible with billions of legacy Microsoft documents.
Similarly I have to scratch my head at OpenOffice and their announcement that they are adding OOXML support. As shown earlier, their support of the binary formats is already excellent. I guess that is why Microsoft is so eager to change their default formats. When a product like OpenOffice is able to effectively exchange documents with Office, it is too much of a threat to their Office revenue. So OpenOffice must now spend several person-years recreating this same level of interoperability, and the net result is that they will end up with the same capability they had before, but at the expense of forgoing work on new features. I wonder what Microsoft will do when OpenOffice catches up again in a few years? Hmmm...
(It would be interesting to examine some of the other products that are said to support OOXML. Of the ones that support OOXML as well as the binary formats, how many of them also have OOXML support that is far worse than their binary format support? Is any editor vendor able to stand up and say that OOXML is a blessing to them because it allows higher fidelity interchange with Office than they were able to achieve with the binary formats?)
Everyone is in the same boat with this: KOffice, Corel, Google, IBM, anyone who has applications that work with Microsoft documents. We're all faced with the prospect of significant expenses to rewrite our file format support with no net benefit to our customers. This is the toll we all must pay to Microsoft just for the ability to fight for the scraps their monopoly may leave behind. If Microsoft jerks their format around, we all must run and chase after it, reallocating resources away from feature work, becoming in the process less competitive in the marketplace, while Microsoft forges ahead with new features. They can easily repeat this game every few years, just to keep competitors busy. This is what a death spiral looks like.
Giving absolute control of a standard document format to a monopolist that is notorious for abusing their control of file formats in the past is insanity. It doesn't take a Sherlock Holmes to figure that out.
Labels: OOXML
Sunday, August 12, 2007
e to the power of hype
What especially caught my eye was this claim:
Global support for Open XML is growing exponentially. Thousands of organizations have joined OpenXMLCommunity.org, hundreds of ISVs are developing solutions on Open XML, and more and more governments are opting for Choice in standards policies. Additionally, more than 10 million compatibility packs that allow users of earlier versions of Microsoft Office to work with Open XML have been downloaded around the world. The momentum is growing, the adoption is real.
Exponential growth is quite a claim. But what is the evidence? Microsoft provides this chart further down on the page, showing the growth in their "community":

Years ago, when I was a student, we had a technical term for curves like this. We called them "lines" and referred to this type of growth as "linear." We did not call it "exponential growth."
Let's take a look at the growth in document usage, instead of community membership. Here's an update of a chart I showed a couple of months ago:

In this chart you see two series, one for ODF (blue) and one for OOXML (red). The horizontal axis shows the number of days since each standard was published, namely May 2005 for ODF and December 2006 for OOXML. The vertical axis shows the number of documents in that format on the web, according to Google, by doing "filetype" searches. For example, a query of "filetype:ods" gives you all of the ODS (ODF spreadsheet) documents on the web.
(Ben Langhinrichs also has some updated numbers and analysis on this topic.)
Is this what you would call exponential growth? Eight months after Office 2007 shipped, and despite the claim of "10 million compatibility packs" downloaded, the OOXML line is rising only slowly and linearly (R-squared=0.943). ODF remains 100 times more prevalent on the web today and is growing 20 times faster than OOXML.
So "Global support for Open XML is growing exponentially"? Uh. I don't think so. Maybe something is growing exponentially, like the hype. But the users, the documents and the "community" — these appear to be only slowly and linearly growing.
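For anyone who wants to run the same sanity check on their own counts, here is a minimal sketch of the idea. If growth is linear, the raw counts fall on a straight line against time; if it is exponential, the logarithm of the counts does. The function names and the two series below are made up for illustration; they are not the Google numbers behind my chart.

```python
import math


def r_squared(xs, ys):
    """R-squared of the least-squares straight line fitted through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def looks_exponential(xs, ys):
    """True when log(y) is straighter against x than y itself is."""
    log_ys = [math.log(y) for y in ys]
    return r_squared(xs, log_ys) > r_squared(xs, ys)


# Hypothetical series (assumptions for illustration only).
days = [30, 60, 90, 120, 150, 180, 210, 240]
steady = [100 + 5 * d for d in days]            # adds 5 documents a day
doubling = [100 * 2 ** (d / 30) for d in days]  # doubles every 30 days

print("steady looks exponential:", looks_exponential(days, steady))
print("doubling looks exponential:", looks_exponential(days, doubling))
```

This is a rough comparison, not a rigorous statistical test. With only a handful of points the two fits can come out close, so it is best treated as a quick check on a claim rather than a proof.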
But lest you leave without some dramatic growth to think about, let me share some with you. If you recall, back in April I brought your attention to the fact that two scientific journals, Science and Nature, were both rejecting submissions from authors in OOXML format. I've been looking around and found an embarrassingly large number of additional journals which explicitly disallow OOXML.
The Optical Society of America's journal, Optics Letters, will not accept Word 2007 format. The American Phytopathological Society's Plant Disease warns in bright red print [pdf], "This journal does not accept Microsoft Word 2007 documents at this time." The American Institute of Physics tells its authors, "Word 2007 and the new Word docx format should not be used. Docx files will currently cause problems for reviewers and complicate many existing preproduction and production routines." Vadose Zone Journal warns submitters that they cannot use the new equation editor in Word 2007 and should use MathML instead. "Word 2007 .docx format is not accepted" according to The Journal of Nutrition.
But wait, there's more!
Wiley InterScience tells authors for almost 200 of its journals that "This journal does not accept Microsoft WORD 2007 documents at this time," ruling out OOXML for authors of these journals:
- Journal of the History of the Behavioral Sciences
- International Journal of Quantum Chemistry
- Software Process: Improvement and Practice
- Pediatric Blood & Cancer
- Lasers in Surgery and Medicine
- Medicinal Research Reviews
- American Journal of Physical Anthropology
- Journal of Mass Spectrometry
- Journal of Polymer Science Part B: Polymer Physics
- Developmental Dynamics
- Journal of Applied Polymer Science
- Magnetic Resonance in Medicine
- Synapse
- Genes, Chromosomes and Cancer
- Journal of Medical Virology
- Flavour and Fragrance Journal
- Biofuels, Bioproducts and Biorefining
- Clinical Anatomy
- Hepatology
- Advances in Polymer Technology
- Journal of Orthopaedic Research
- Molecular Carcinogenesis
- Environmental Progress
- Infant Mental Health Journal
- Annals of Neurology
- International Journal of Imaging Systems and Technology
- Developmental Neurobiology
- AIChE Journal
- Journal of Traumatic Stress
- genesis
- Meteorological Applications
- Process Safety Progress
- Atmospheric Science Letters
- Systems Research and Behavioral Science
- Journal of Community Psychology
- Diagnostic Cytopathology
- Birth Defects Research Part B
- Journal of Software Maintenance and Evolution
- International Journal of Climatology
- The Chemical Record
- Wireless Communications and Mobile Computing
- International Journal of Intelligent Systems
- Computer Animation and Virtual Worlds
- Statistics in Medicine
- Concurrency and Computation: Practice and Experience
- Developmental Psychobiology
- Applied Stochastic Models in Business and Industry
- The Prostate
- Journal of Computational Chemistry
- X-Ray Spectrometry
- Pediatric Blood & Cancer
- Random Structures and Algorithms
- Microwave and Optical Technology Letters
- Lasers in Surgery and Medicine
- Rapid Communications in Mass Spectrometry
- Weather
- Mental Retardation and Developmental Disabilities Research Reviews
- International Journal of Finance & Economics
- Psycho-Oncology
- Chirality
- Applied Cognitive Psychology
- American Journal of Medical Genetics Part B:
- Medicinal Research Reviews
- Biopharmaceutics & Drug Disposition
- Zoo Biology
- Catheterization and Cardiovascular Interventions
- Plus 103 more journals!
- Bioinformatics
- Journal of Antimicrobial Chemotherapy
- American Journal of Epidemiology
- PEDS
- Briefings in Functional Genomics & Proteomics
- The Computer Journal
- Health Policy and Planning
- Journal of Environmental Law
- Review of English Studies
- Behavioral Ecology
- ELT Journal
- Molecular Biology and Evolution
- CESifo Economic Studies
- Journal of Pediatric Psychology
- Cerebral Cortex
- Literary and Linguistic Computing
- Molecular Human Reproduction
- Enterprise & Society
- Age and Ageing
- European Journal of Public Health
- Publius
- Integrative and Comparative Biology
- Nephrology Dialysis Transplantation
- Rheumatology
- Glycobiology
- And 35 more journals!
Blackwell Publishing, publisher of over 800 journals, rejects OOXML submissions, telling authors, "Will authors please note that Word 2007 is not yet compatible with journal production systems." This adds to our list of journals where OOXML cannot be used:
- Psychophysiology
- Acta Anaesthesiologica Scandinavica
- Transfusion Alternatives in Transfusion Medicine
- Acta Neuropsychiatrica
- Nursing Forum: An Independent Voice for Nursing
- Experimental Techniques: A Publication for the Practicing Engineer
- Cytopathology
- Asian Journal of Social Psychology
- Journal of Anatomy
- Annals of Applied Biology
- Lethaia: An International Journal of Palaeontology
- Journal of the American Water Resources Association
- Clinical Physiology and Functional Imaging
- Ibis: The International Journal of Avian Science
- Basin Research
- Digestive Endoscopy
- Journal of Empirical Legal Studies
- European Journal of Neurology
- Surgical Practice: Formerly Annals of the College of Surgeons
- FEMS Yeast Research
- FEMS Microbiology Reviews
- FEMS Microbiology Ecology
- FEMS Microbiology Letters
- Regulation & Governance
- FEMS Immunology & Medical Microbiology
- Clinical and Experimental Optometry
- Journal of Food Process Engineering
- The Journal of Cardiovascular Electrophysiology
- Medical Education
- European Journal of Clinical Investigation
- Diseases of the Esophagus
- Sleep and Biological Rhythms
- International Migration Review
- Computational Intelligence
- Asia Pacific Viewpoint
- Seminars in Dialysis
- Peace & Change: A Journal of Peace Research
- Journal of Applied Social Psychology
- Basic & Clinical Pharmacology & Toxicology
- Dermatologic Therapy
- WorkingUSA: The Journal of Labor and Society
- Journal of Travel Medicine
- Singapore Journal of Tropical Geography
- Australasian Radiology
- Genes to Cells
- The Clinical Respiratory Journal
- Echocardiography
- The American Journal of Gastroenterology
- Histopathology
- Personal Relationships
- Clinical and Experimental Dermatology
- Alcoholism: Clinical and Experimental Research
- Experimental Dermatology
- Journal of Social Philosophy
- The Journal of Popular Culture
- Pathology International
- Pain Practice
- The Journal of American Culture
- Clinical & Experimental Immunology
- Religious Studies Review
- Entomological Science
- Plus 107 more journals!
Labels: OOXML
Thursday, August 02, 2007
Two Feet, No Feathers
But even among professionals, the attack/defense of language continues. One party writes the tax code, and another party tries to find the loopholes. Iteration of this process leads to more complex tax codes and more complex tax shelters. The extreme verbosity (to a layperson) of legislation, patent claims or insurance policies results from centuries of cumulative knowledge which has taught the drafters of these instruments the importance of writing defensively. The language of your insurance policy is not there for your understanding. Its purpose is to be unassailable.
This "war of the words" has been going on for thousands of years. Plato, teaching in the Akademia grove, defined Man as "a biped, without feathers." This was answered by the original smart-ass, Diogenes of Sinope, aka Diogenes the Cynic, who showed up shortly after with a plucked chicken, saying, "Here is Plato's Man." Plato's definition was soon updated to include an additional restriction, "with broad, flat nails." That is how the game is played.
In a similar way Microsoft has handed us all a plucked chicken in the form of OOXML, saying, "Here is your open standard." We can, like Plato, all have a good laugh at what they gave us, but we should also make sure that we iterate on the definition of "open standard" to preserve the concept and the benefits that we intend. A plucked chicken does not magically become a man simply because it passes a loose definition. We do not need to accept it as such. It is still a plucked chicken.
(This reminds me of the story told of Abraham Lincoln, when asked, "How many legs does a dog have if you call the tail a leg?" Lincoln responded, "Four. Calling a tail a leg does not make it a leg.")
With the recent announcement here in Massachusetts that the ETRM 4.0 reference architecture will include OOXML as an "open standard" we have another opportunity to look at the loopholes that current definitions allow, and ask ourselves whether these make sense.
The process for recommending a standard in ETRM 4.0 is defined by the following flowchart:
So, let's go through the first four questions that presumably have already been asked and answered affirmatively in Massachusetts, to see if they conform to the facts as we know them.
- Is the standard fully documented and publicly available? Can we really say that the standard is "fully documented" when the ISO review in the US and in other countries is turning up hundreds of problems showing that the standard is incomplete, inconsistent and even incorrect? We should not confuse length with information content. Just as a child can be overweight and malnourished at the same time, a standard can be 6,000 pages long and still not be "fully documented." Of course, we could just say, "A standard fully documents the provisions that it documents" and leave it at that. But such a tautological interpretation benefits no one in Massachusetts. We should consider the concept of enablement as we do when prosecuting patent applications. If a standard does not define a feature such that a "person having ordinary skill in the art" (PHOSITA) can "make and use" the technology described by the standard without "undue experimentation" then we cannot say that it is "fully documented." By this definition, OOXML has huge gaps.
- Is the standard developed and maintained in a process that is open, transparent and collaborative? We're talking about Ecma here. How can their process be called transparent when they do not publicly list the names of their members or attendance at their meetings, do not have public archives of their meeting minutes, their discussion list or document archive, do not make publicly available their own spreadsheet of known flaws in the OOXML specification nor of the public comments they received during their public review period? How is this, by any definition, considered "transparent"? We can also question whether the process was open. When the charter constrains the committee from making changes that would be adverse to a single vendor's interests, it really doesn't matter what the composition of the committee is. The committee's hands are already tied and should not be considered "open." If I were writing a definition of an open, transparent process, I'd be sure to patch those two loopholes.
- Is the standard developed, approved and maintained by a Standards Body? Without further qualifying "Standards Body" this is a toothless statement. As should be apparent right now, not all SDOs are created equal. Some are the standards equivalent of diploma mills. Accreditation is the way we usually solve this kind of problem. Ecma's Class A Liaison status with JTC1 is not an accreditation since their liaison status has no formal requirements other than expressing interest in the technical agenda of JTC1. In comparison, OASIS needed to satisfy a detailed list of organizational, process, IPR and quality criteria before their acceptance as a PAS Submitter to JTC1/SC34. Why bother having a requirement for a Standards Body unless you have language that ensures that it is not a puppet without quality control?
- Is there existing or growing industry support around the use of the standard? Again, very vague. A look at Google hits for OOXML documents shows that there are very few actually in use. My numbers show that only 1 in 10,000 new office documents are in OOXML format. But I guess that is more than the 0 in 10,000 that existed last year. But is this really evidence for "growing industry support"? I'd change the language to require that there be several independent, substantially full implementations.
There are two additional questions which I won't presume to answer since they rely more on integration with internal ITD processes.
We learn lessons and move on to the next battle. Just as GPLv2 required GPLv3 to patch perceived vulnerabilities, we'll all have much work to do cleaning up after OOXML. Certainly JTC1 Directives around Fast Tracks will need to be gutted and rewritten. Also, the vague and contradictory ballot rules in JTC1, and the non-existent Ballot Resolution Meeting procedures will need to be addressed. I suggest that ITD take another look at their flowchart as well, and try to figure out how they can avoid getting another plucked chicken in the future.
Sunday, July 29, 2007
My comments on the ETRM 4.0 draft
I’d like to write to you as a long-time Massachusetts resident and taxpayer. My employer (IBM) will likely submit their own comments, but I’d like to offer you my own personal views on the ETRM 4.0 draft.
I am proud of the Commonwealth’s tradition of openness in government, enshrined in our Public Records Law and Open Meeting Law. As James Madison wrote, “A popular government, without popular information, or the means of acquiring it, is but a prologue to a farce or a tragedy. A people who mean to be their own governors must arm themselves with the power which knowledge gives them.” So access to government documents, now and for posterity, is critical for public oversight and participation in government, as well as for preserving our heritage. Now that we’ve moved into the digital age, access to government documents requires that these documents be made available in a format that all Commonwealth residents can read. So the move toward open documents formats, as called for in the ETRM, is laudable. A citizen must never be dependent on any single vendor for the software needed to read their government’s documents.
However, I am concerned at the proposed addition of Ecma Office Open XML (OOXML) to the list of acceptable document formats. As you may have heard, OOXML is currently undergoing review by ISO/IEC JTC1 for possible approval as an ISO standard. As part of this review, technical committees in standards bodies around the world are reviewing OOXML and appraising its suitability as an International Standard. As a participant in the US committee reviewing OOXML, INCITS V1, I had the opportunity to review the text of the OOXML specification and to discuss it with others. I am sorry to report that I found the OOXML specification to be full of errors and omissions. Of course, no technical document is perfect. But this one, in particular, is of far greater length (more than 6,000 pages) and of far lower quality than any I have seen before. If it has advanced this far in the ISO process it is because of vendor pressure, not because of technical merit.
What is the problem with a buggy standard? Interoperability suffers. That is the problem. There is no doubt that if everyone in the Commonwealth used Microsoft Office 2007 on Windows Vista, their interoperability would be good. But as soon as we admit choice in applications and operating systems, then interoperability will only occur when all sides follow a common standard. So the technical quality of a standard (accuracy, comprehensiveness, level of detail, consistency, etc.) is directly proportional to the level of interoperability achievable and the cost to achieve it.
The ISO ballot on OOXML will not end until September 2nd, after which a resolution process to fix defects in the text of the standard will take at least an additional 6-18 months. That is, of course, if OOXML gains ISO approval, something which is not certain at this point. So I would recommend a cautious approach, and wait for the ISO process to conclude, or conduct your own independent technical evaluation of the OOXML specification to confirm its technical quality before adding OOXML to your list. Ask other vendors: Is this something you can implement? Ask yourself: Will this truly give the Commonwealth the interoperability and choice that you desire? These are important questions to ask.
Finally, I’d note that the ETRM also calls out OpenDocument Format (ODF) as an acceptable format. ODF was approved by ISO last year. So why do we need OOXML? I personally think that the complexity of document exchange and translation in a multi-format world would take us back to the confusion and frustration of the early 1990s, when we all juggled WordStar, WordPerfect, Word and WordPro files, and could collaborate only poorly. Better to push for a single unified/harmonized standard document format for personal productivity applications, much as we have a single standard (HTML) for web pages.
I’ll leave you with a quote from Tim Berners-Lee, the inventor of the web, from an interview he gave with David Berlind from ZDNet when Berners-Lee was recently in Boston receiving a Lifetime Achievement Award from the Massachusetts Innovation & Technology Exchange.
Berners-Lee said:
It was the standardization around HTML that allowed the web to take off. It was not only the fact that it is standard, but the fact that it’s open and the fact that it is royalty-free.
So what we saw on top of the web was a huge diversity and different business which are built on top of the web given that it is an open platform.
If HTML had not been free, if it had been proprietary technology, then there would have been the business of actually selling HTML and the competing JTML, LTML, MTML products. Because we wouldn’t have had the open platform, we would have had competition for these various different browser platforms, but we wouldn't have had the web. We wouldn't have had everything growing on top of it