An Antic Disposition

Archives for March 2008

Implementation-defined (Not really)

2008/03/11 By Rob 16 Comments

Here begins the lesson on Embrace, Extend and Extinguish (EEE). Classically, this technique is used to perpetuate vendor lock-in by introducing small incompatibilities into a standard interface, in order to prevent effective interoperability, or (shudder) even substitutability of competing products based on that interface. This EEE strategy has worked well so far for Microsoft, with the web browser, with Java, with Kerberos, etc. It is interesting to note that this technique can work equally well with Microsoft’s own standards, like OOXML.

An easy way to find these extension points is to search the OOXML specification for “application-defined” or “implementation-defined”. You will find dozens of them, such as:

  1. In general, scripting
  2. In general, macros
  3. In general, DRM
  4. Part 1 — “Application-Defined File Properties Part” which is totally undefined, but is referenced 13 times for specific fields in Part 4.
  5. Section 2.16.4.1 — implementation-defined date/time formatting
  6. Section 2.16.5.34 — implementation-defined document filters
  7. Section 3.17.2.6 — implementation-defined string-to-number conversions in a spreadsheet
  8. Section 2.8.2.2 — character sets supported by a font
  9. Section 2.9.6 — the interpretation of the mysterious hex “template code” in numbered list overrides — “The method by which this value is interpreted shall be application-defined.”
  10. Section 2.14.27 — application-defined storage of exclusion data for a mail merge
  11. Section 2.15.1.28 — application-defined cryptographic hash algorithms
  12. Section 2.15.1.76 — “Specifies a string identifier which may be used to locate the XSL transform to be applied. The semantics of this attribute are not defined by this Office Open XML Standard – applications may use this information in any application-defined manner to resolve the location of the XSL transform to apply.”
  13. Section 5.6.2.12 — application-defined macro string reference for connection shape
  14. Section 5.6.2.15 — application-defined macro string reference for graphic frame
  15. Section 5.6.2.24 — application-defined macro string reference for a picture object
  16. Section 5.6.2.28 — application-defined macro string reference for a shape
  17. Section 5.8.2.9 — application-defined macro string reference for a connection shape
  18. Section 5.8.2.12 — application-defined macro string reference for a graphic frame
  19. Section 6.2.2.14 — “This element specifies the presence of an ink object. An ink object is a VML object which allows applications to store data for ink annotations in an application-defined format.”
  20. Section 7.6.2.60 — implementation-defined bibliographic citation formats
  21. And many, many more.
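
For anyone who wants to repeat this search, here is a minimal sketch in Python. It assumes the specification text has already been extracted to a plain string; the function name and the sample sentences are my own illustration, not part of any OOXML tooling. It counts the two phrases case-insensitively.

```python
import re
from collections import Counter

# The two phrases named above. Matching is case-insensitive and accepts
# either a hyphen or whitespace between the two words.
PATTERN = re.compile(r"\b(application|implementation)[-\s]defined\b", re.IGNORECASE)

def count_extension_points(text: str) -> Counter:
    """Count occurrences of 'application-defined' and 'implementation-defined'."""
    counts = Counter()
    for match in PATTERN.finditer(text):
        counts[match.group(1).lower() + "-defined"] += 1
    return counts

# Illustrative sample only; a real run would pass in the extracted spec text.
sample = (
    "The method by which this value is interpreted shall be application-defined. "
    "Date formatting is implementation-defined. "
    "Ink data is stored in an Application-Defined format."
)
print(count_extension_points(sample))
```

A count like this only finds the places where the text admits to an extension point. It says nothing about behavior that is left undefined without being labeled.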

So, one might ask, what exactly does “implementation-defined” mean? Here is how OOXML defines it and related terms:

behavior, implementation-defined — Unspecified behavior where each implementation documents that behavior, thereby promoting predictability and reproducibility within any given implementation. (This term is sometimes called “application-specific behavior”.)

behavior, locale-specific — Behavior that depends on local conventions of nationality, culture, and language.

behavior, unspecified — Behavior where this Standard imposes no requirements. [Note: To add an extension, an implementer must use the extensibility mechanisms described by this Standard rather than trying to do so by giving meaning to otherwise unspecified behavior. end note]

Note that this is not an entirely novel definition. Anyone who has spent time reading over the C and C++ Programming Language standards, in ANSI or in ISO, will recall a similar set of definitions. For example, these from ISO/IEC 9899:1999, the C programming language standard:

implementation-defined behavior
unspecified behavior where each implementation documents how the choice is made

locale-specific behavior
behavior that depends on local conventions of nationality, culture, and language that each implementation documents

unspecified behavior
behavior where this International Standard provides two or more possibilities and
imposes no further requirements on which is chosen in any instance

So, you can see that OOXML pretty much copies these definitions. However, ISO standards like ISO/IEC 9899:1999 go one step further and make an additional statement in their conformance clause, something that is distinctly missing from OOXML:

“An implementation shall be accompanied by a document that defines all implementation-defined and locale-specific characteristics and all extensions.”

If you poke around you will see that all conformant C compilers indeed do come with a document that defines how their implementation-defined features were implemented. For example, GNU’s gcc compiler comes with this document.

So, by failing to include this in their conformance clause, OOXML’s use of the term “implementation-defined” is toothless. It just means “We don’t want to tell you this information” or “We don’t want to interoperate”. Conformant applications are not required to actually document how they extend the standard. You can look at Microsoft Office 2007 as a prime example. Where is this documentation that explains how Office 2007 implements these “implementation-defined” features? How is interoperability promoted without this?

(This item not discussed at the BRM for lack of time.)

Filed Under: OOXML

Contra Durusau, Part 1

2008/03/11 By Rob 22 Comments

I have a lot of respect for Patrick Durusau. He has taught me much about how ISO standards work in practice, and I have benefited from his thoughts on that subject. I hope I can repay my debt to Patrick even in part, by teaching him something about how Microsoft works, in practice, a subject where I have expertise he lacks.

From the start Patrick has remained publicly silent on the topic of OOXML. No blog posts, no press, nothing. If you asked, he would say that this was his policy. Privately, you would get an earful (all negative), but as befits the unbiased chair of the committee which is responsible for the technical recommendation for the US NB, he kept his personal opinions out of the public arena.

This public orientation changed recently. As best I can figure it, on returning from a conference in Seattle in late January, Patrick was a changed man. Patrick is now an enthusiastic OOXML supporter and is eager to inform the world of his delight in OOXML at every opportunity. He posts his “open letters” on his web site, which are linked to, often within minutes, by the various Microsoft bloggers, and then sent around by Microsoft employees to the press and the various JTC1 NB’s.

Patrick is entitled to his own opinions. Free speech (and free enterprise for that matter) are things which all red-blooded Americans believe in, among other things. So long as Patrick makes it clear that he is speaking for himself, I have no problem with this.

Of course, Microsoft will not be so careful to distinguish Patrick’s personal opinions from his professional affiliations. So a post from Patrick’s personal web site is retold on a Microsoft blog as “The ODF Editor says….”, and then the next day is sent in an email to NB’s with a larger set of “endorsements”:

Chair, V1 – US TAG to JTC 1/SC 34
Convener, JTC 1/SC 34/WG 3 (Topic Maps)
Editor, OpenDocument Format TC (OASIS), Project Editor ISO/IEC 26300
Co-Editor, ISO/IEC 13250-1, 13250-5 (Topic Maps)

By the time it is actually discussed at the NB committee level, I wouldn’t be surprised if it morphs into an assertion that JTC1/SC34, INCITS, the ODF TC and the City Council of Covington, Georgia have all approved OOXML. It is dangerous to wear many hats when dealing with Microsoft. They are not ones for fine distinctions.

But now on to the substance of Patrick’s letters.

In his first note, the “OpenXML Poster Child”, Patrick says:

OpenXML has progressed from being developed in a closed environment to being handed over to approximately 70% of the world’s population for future development so I am missing the “not open” aspect of OpenXML. If anything, the improvements made to OpenXML during that process make it a poster child for the open standards development process.
.
.
.
I understand that SC 34 will be taking on the maintenance and future development of OpenXML (with the participation of Ecma). That will mean that approximately 70% of the world’s population will have a say (through their respective national bodies) on how OpenXML continues to develop. I can’t speak for anyone other than myself but that sounds pretty open to me. (That presumes approval of OpenXML as an ISO standard, which must be decided by every national body for itself.)

We’ve covered this before. Let’s go down the list again. Where are the minutes from Ecma TC45 teleconferences? Where are the public archives of their mailing list? Where is the list of individuals participating in the TC? Where is the list of voting members? Where are the public comments they have received on OOXML? You call this open?

For ODF, all of this information is easily available to the public, here, here and here.

And don’t give me the canard about how moving to SC34 results in greater representation.

In the US who represents our population? The 7 members of V1 before the DIS 29500 process began? Or the 26 members after Microsoft stuffed V1 (the committee that you chair) with business partners last summer? Or V1 after several of them were kicked out for not paying their dues? Or the V1 after the DIS 29500 procedure completes and the warm bodies fade away? In your opinion, which one do you believe truly represents our US population?

Similarly, SC34 was stuffed with new P-members and swelled from 9 P-members in 2006 to 40 today, most of which voted in favor of OOXML and then failed to participate in any other SC activities. Are you seriously suggesting that SC34 was increasing the world’s influence over Microsoft’s decisions? That sounds quite naive. To me this looks much more like Microsoft is increasing their influence over the world, and JTC1 NB’s in particular.

The long list of shenanigans recorded, from Sweden to Portugal, from Poland to Switzerland is further evidence that the second interpretation is the accurate one. Is offering Microsoft partners rewards for joining a committee a way of increasing openness? Is joining JTC1 three days before the Sept. 2nd vote, then voting Yes without comments the way in which the world is able to gain a seat at the table?

Moving on.

Patrick’s next post is “Co-Evolution”. This, plus Microsoft’s recent interoperability announcements (yes, yet more announcements), gives the impression that they believe it is better to talk about interoperability than to do something about it. Interoperability is something we only talk about now, but accomplish sometime in the nebulous future, like weight loss or reducing the national debt. Create studies, write reports, open labs, make test cases, write more reports. But when given the opportunity to do something now which would actually improve interoperability, like adding missing features to OOXML to accommodate the richer text model in ODF, then just say “No”. You can always do a study on this later, and write another report, and make a test case.

But if announcements alone could improve interoperability, then Microsoft would have solved this problem long ago and many times over.

The perspective that is missing in Patrick’s analysis is that of the vast part of the world’s population that does not benefit, and in fact is distinctly disadvantaged by having multiple incompatible document standards. We’ve been here before, in the 1980’s and 1990’s. It was not fun. We should not be seeking ways to repeat that failure.

Much of the world is also disadvantaged by the monopolist’s rent paid on Microsoft products and the associated lack of choice in today’s software monoculture. I’d rather help the world free itself of this oppression than appease the oppressor in hopes that he’ll wield a more lenient whip.

Last September, the NB’s of Great Britain, Brazil, Chile, Colombia, New Zealand, and the United States all requested that specific features be added to OOXML in order to improve interoperability with ISO ODF, in total 40 features such as the ability to have background images in tables or to have font weights beyond “normal” and “bold”. These were the exact features that Microsoft’s translator project on SourceForge identified as needed to improve interoperability with ODF. Ecma rejected all of these requests. They did not reject them because the features were unreasonable. They were rejected purely because they were ODF features.

So given the chance to do more than just write reports and have panel discussions, Ecma refused to move interoperability forward even one inch. If this is them on their best behavior (they desperately need NB approval votes), then why would we expect greater consideration from them if OOXML were approved?

In his next letter, “Confusion”, Patrick responds to Andy Updegrove, but not having followed that debate, I’m the one who is now confused. Patrick seems to be arguing that it doesn’t matter whether OOXML is “good” or not (in fact he seems to argue that there is no “good” or “bad” when it comes to XML) but that it will be better if OOXML was someplace where we could talk at it more.

I don’t know whether I’d choose to use moral terms when describing engineering artifacts either, but I would note that if the basic protocols and formats of the web were as poorly designed as OOXML, the web would never have thrived to become the glory it is today.

In “On the Importance of Being Heard” Patrick generously gives us his opinion of the DIS 29500 BRM he did not attend. The argument formally comes down to this:

  1. Based on published and unpublished reports from the BRM, it appears that “everyone at the table was heard” and “Microsoft was listening to everyone” in a “public and international” forum.
  2. If we now reject OOXML, we “all lose a seat at the table where the next version of the Office standard is being written”.
  3. If we approve OOXML, even though “rough” then this “gives all of us a seat at the table for the next Office standard”.
  4. Therefore, Patrick recommends approval of DIS 29500.

This argument has several critical flaws.

First, it is inaccurate to call the BRM proceedings “public”. Neither the public nor the press was allowed to attend. Security guards were posted at the door to enforce this mandate. JTC1 is a private, Swiss-headquartered NGO, answerable to no one, with no statutory responsibility to the public. Patrick talks about “ordinary users, governments, smaller interests” having a seat at the table. This is a fantasy. I did not see any such representation at the table in Geneva. One in five BRM attendees were Microsoft employees. Over 25% of the 114 people in attendance were either Microsoft or Ecma TC45 members. I fear that Patrick underestimates the extent to which NB’s have been stacked over the past two years and that he preserves some illusion of SC34 NB’s comprised of “ordinary users, governments, smaller interests”. Maybe that was true a few years ago, but the neighborhood has changed.

Was everyone at the table heard? Formally, it is true that every delegation had the opportunity to raise a single issue during the week. Some (those earlier in the alphabet) had the opportunity to raise two issues. But I think it is disingenuous to cast that as “everyone at the table was heard”. For many delegations it was true that for every issue they were able to raise, they had 10 or 20 more that they wanted to raise, based on their analysis of Ecma’s proposed dispositions, but were unable to because of insufficient time.

Was Microsoft listening? Yes. Everyone in the room was listening. Formally only the BRM itself could authorize changes to the standard at this point, regardless of Microsoft’s or Ecma’s opinion. So it is moot as to whether Microsoft was attentive. Whether they listened or not has zero impact on the ability of the BRM to make changes.

Patrick also appears to be impressed that this discussion all takes place “at a table where a standard for a future product was being debated by non-Microsoft groups?” What future product? The future product is Office 14 (Office 2009). Microsoft has not informed JTC1 nor Ecma on what the changes to OOXML will be for Office 14, due out later this year in beta form.

And then we come to the main point of Patrick’s argument. Vote “Yes” so we all have a seat at the table. Before we buy into that logic, I suggest we examine other Microsoft/Ecma standards and see how their approval has or has not led to increased participation.

Microsoft has two primary ways to negate broader participation in a standard’s maintenance. The first is standards abandonment. Take for example Ecma-234 “Application Programming Interface for Windows”. A contemporary observer might have been just as enthusiastic as Patrick is now. Wow! Isn’t this great? They are finally opening up and listening to the world! We finally have a seat at the table! I have a feeling that things are going to be better from now on!

Unfortunately, this standard was approved in December 1995 and covers the Windows 3.1 API only. Since Windows 95 shipped in August 1995, this Ecma standard was obsolete on the day it was approved. No revision of the standard was ever issued. Microsoft abandoned it.

Now certainly, there was nothing in principle that prevented the non-Microsoft Ecma members from continuing to maintain Ecma-234, creating errata documents, polishing up the language of the clauses, etc. But they had no effective way of actually evolving the standard when Microsoft withdrew from the process. That is the danger when you approve a single-vendor standard on the false assumption that this leads to openness.

The other way to negate broader participation in standards development is to create technical revisions at a rapid pace, and to create them within Microsoft with little outside participation. Note that this is how OOXML was created in the first place. And this is how Microsoft/Ecma maintains standards like the C# Programming Language. Ask your friends in JTC1/SC22 whether “70% of the world’s population” has a “seat at the table” in evolving that standard. Let me know what you hear. I believe you’ll hear that there has been negligible WG activity around C# maintenance, and that new revisions are promulgated by Microsoft, rubber stamped by Ecma, and sent on to SC22, canceling the previous standard and replacing it with the new one.

This trick can be very effective whenever the underlying Microsoft product has an update every 2-3 years. If your product revisions are more frequent than the required JTC1 maintenance checkpoints, then you can effectively ignore JTC1. That’s how Microsoft/Ecma has played the game in the past.

Note that Office 2007 has been out since late 2006. Office 14 (Office 2009) is due out in beta form this year, with expected release next year. Any bets on whether the file format will require a technical revision to accommodate Office 2009? There is absolutely nothing that prevents Microsoft from submitting a revised file format specification to Ecma, getting a rubber stamp approval and then Fast Tracking it back into JTC1. Since that is how they have treated other Microsoft/Ecma standards, the burden is on those who argue the contrary to support their optimism.

So consider the facts:

  1. Microsoft has not supported the JTC1 maintenance process with their other Ecma Fast Tracks. There is no broader “seat at the table”, no power sharing, no ownership by “70% of the world’s population”. It is 100% Microsoft.
  2. Microsoft’s current charter in Ecma TC45 explicitly calls for Ecma to own maintenance of OOXML if approved, not SC34.
  3. Ecma in fact has submitted a proposal [PDF] to SC34 asking for control of OOXML to be handed back to them.
  4. With their “rejuvenation” of SC34 (from 9 to 40 P-members in 2 years) Microsoft clearly has the votes it would need to force any maintenance regime they desire.
  5. No one at Microsoft has made an official statement in writing confirming Patrick’s vision of future maintenance. In fact their only official statement, the Ecma proposal to SC34 cited above, contradicts what Patrick is suggesting. So why are only 3rd parties speaking so glowingly about the future control of OOXML? Plausible deniability, anyone?

Until the following occur I’d advise a bit more skepticism, considering that we’re dealing with a company with a clear record of abusing, subverting, abandoning, embracing and extending etc., standards:

  1. Ecma changes their TC45 charter to explicitly call for all maintenance activities (corrigenda as well as technical revisions) to be performed in an SC34 WG.
  2. Ecma explicitly withdraws their submission on DIS 29500 maintenance from the agenda of the Oslo SC34 Plenary and instead submits a proposal asking for future OOXML work to be done in a new WG in SC34, with a non-Microsoft chair.
  3. Microsoft publicly states that they will hand operational control of OOXML to SC34, not only for maintenance of OOXML 1.0, but also for technical revisions, and that they will support this being done under JTC1 IPR rules, and using the JTC1 process, and that they will implement whatever revisions SC34 develops within 1 year of approval.

Until you have that, you have nothing. Get that, and then you can start talking about having a “seat at the table”.

In his most recent post, “Russian Peasant”, Patrick suggests that the only reason one would vote against OOXML is spite, and that any problems could be fixed in maintenance.

Let’s try another analogy. You are shopping for a new TV and you go to your local consumer electronics store and look at the array of television sets lined up. Most come with a warranty. Any defects detected within the maintenance period will be fixed at the manufacturer’s expense. This is generally a good thing, having a maintenance period to fix problems that were not evident at purchase time.

So you find the model TV you want, the salesperson rolls out the box and just before you hand over your credit card, you notice a big gash on the side of the box, where a forklift had pierced it. You say, “I can’t accept this TV, it has been smashed!”. The salesperson says, “Don’t worry. No TV is perfect. We can fix this in maintenance. You’re fully covered.”

Do you hand over your credit card? Of course not. Maintenance periods, with TV’s as with standards, are for defects detected after the fact. They are not a replacement for proper inspection, review and approval processes. You expect a TV to work properly at the start.

No standard is perfect. We all know that. But at the time of approval, NB’s should be confident that their technical review was sufficient to find all of the important issues, and that these issues have all been fixed in the standard. OOXML should not be approved unless it is suitable now. The maintainers of OOXML will be busy enough fixing other problems that will be found later. We should not willingly approve a defective standard and set up a future maintenance group for failure by front-loading their agenda with defects that we already know about.

Consider: If we do that, then on what grounds can we reject another Fast Track proposal ever again? This slippery argument, “we can fix that in maintenance”, can be used for every single proposal that ever comes along. Why even have JTC1 at this point? It would be easier for everyone involved to just hand the “International Standard” stamp over to Ecma and allow them to rubber stamp their own International Standards. This will save the time and expense of engaging hundreds of representatives from 87 JTC1 NB’s for a year for a sham review.

My advice is this. Let’s turn this train wreck around. Vote No on DIS 29500 and send a clear message that 6,000 page immature standards are not appropriate for JTC1 Fast Track. It showed poor judgment and great disrespect toward JTC1 NB’s for Microsoft to send this mess via Fast Track in the first place.

Microsoft has every right to feel that they are late to the game, and risk being left behind for their lack of an open document standard. But they should not expect that they can simply throw money around and remedy their long neglect overnight. And certainly they should not expect JTC1 NB’s to do the work for them. Microsoft should work on their specification at the consortium level and get it right first. Once they have something mature, then they should send it along, preferably in smaller parts submitted sequentially. If they are unwilling or incapable of fixing the specification in Ecma then they could propose it as a new work item in SC34, where they may find some assistance. But if they insist on the standard remaining a single vendor standard, unilaterally controlled to benefit that single vendor, then I wouldn’t expect a warm reception in SC34 either.

Filed Under: OOXML

JTC1 Improv Comedy Theater

2008/03/06 By Rob 32 Comments

JTC1 has been improvising its Fast Track processing from the start of the DIS 29500 procedure.

The latest “let’s invent a new rule” came at the BRM in Geneva, where a novel approach to tallying meeting votes was surreptitiously foisted on delegations, one which is clearly against the plain text of JTC1 Directives.

The question is how votes should be counted at a Fast Track BRM, where consensus cannot be reached, in this case for lack of time. Specifically, in that final batch-vote on 1027 comments, how should votes be counted? I believe the rules call for positions to be established by the majority of P-members. The leadership of the meeting instead counted both P-members and O-members. In the balance lies the fate of over 100 Ecma proposals which may or may not be included in the final text of the DIS, depending on how this question is resolved.

Let’s review the rules, from the current JTC1 Directives (5th Edition, Version 3.0).

First let’s start with the overriding rule from section 1.2 “General Provisions”:

These Directives shall be complied with in all respects and no deviations can be made without the consent of the Secretaries-General.

Or in plain English — “These are the rules, you can’t just make stuff up”.

So what is a P-member and an O-member? This is covered in chapter 3 “Membership Categories and Obligations”. P-members are defined as:

P-members within JTC 1 shall be NBs that are Member Bodies of ISO or National Committees of IEC, or both. Only one NB per country is eligible for membership in JTC 1. P-members have power of vote and defined duties.

and O-members are defined as:

Any NB that is a Member Body of ISO or National Committee of IEC, or both, may elect to be an O-member within JTC 1. Correspondent members of ISO are also eligible to be O-members of JTC 1. O-members have no power of vote, but have options to attend meetings, make contributions and receive documents.

So clear enough? O-members can attend meetings and contribute, but cannot vote. P-members can vote at meetings.

Section 9 deals with the voting rules, and 9.1.4 speaks about meeting votes in particular:

In a meeting, except as otherwise specified in these directives, questions are decided by a majority of the votes cast at the meeting by P-members expressing either approval or disapproval.

So, in a meeting, only P-members vote and they vote by majority. “Except as otherwise specified in these directives” means that this rule can be overridden in specific cases. But the override must be “specified”, i.e., actually written down that it is an override of the normal meeting voting rules.
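
To make the rule concrete, here is a small Python sketch of a meeting tally under 9.1.4 as I read it: only P-member votes count, abstentions are set aside, and the question carries on a majority of the votes expressing approval or disapproval. The delegation labels, the vote values and the function name are all invented for illustration. They are not the actual BRM results.

```python
from dataclasses import dataclass

@dataclass
class Vote:
    delegation: str   # hypothetical delegation label
    member_type: str  # "P" or "O"
    choice: str       # "approve", "disapprove" or "abstain"

def tally(votes, count_o_members=False):
    """Return (approve, disapprove, carried).

    With count_o_members=False this follows the reading of 9.1.4 quoted above:
    only P-members are counted, and only votes expressing approval or
    disapproval. Setting count_o_members=True also counts O-member votes.
    """
    eligible = [v for v in votes if v.member_type == "P" or count_o_members]
    approve = sum(v.choice == "approve" for v in eligible)
    disapprove = sum(v.choice == "disapprove" for v in eligible)
    return approve, disapprove, approve > disapprove

# Made-up ballots, chosen so that the two counting rules disagree.
votes = [
    Vote("A", "P", "approve"),
    Vote("B", "P", "disapprove"),
    Vote("C", "P", "disapprove"),
    Vote("D", "P", "abstain"),
    Vote("E", "O", "approve"),
    Vote("F", "O", "approve"),
]

print(tally(votes))                        # P-members only
print(tally(votes, count_o_members=True))  # P- and O-members together
```

With this made-up data the question fails when only P-members are counted and carries when O-members are added, which is why the choice of counting rule is not a technicality.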

So drilling down a level deeper we come to the Fast Track rules themselves in chapter 13, where 13.8 covers meeting votes at a Fast Track BRM:

At the ballot resolution group meeting, decisions should be reached preferably by consensus. If a vote is unavoidable the vote of the NBs will be taken according to normal JTC 1 procedures.

So on the surface this seems to be a vague statement. What are “normal JTC 1 procedures”? However, a moment’s reflection on 9.1.4 above shows that the Directives have already declared this as the normal procedure for meeting votes by saying that this is the rule that holds unless specified otherwise.

One can easily seek confirmation of this by looking at the parallel rules for PAS process BRM votes, given in 14.4.3.9. Here it is more explicit:

At the ballot resolution group meeting, decisions should be reached preferably by consensus. If a vote is unavoidable, the approval criteria in the subclause 9.1.4 is applied.

So despite the clear and plain text of the Directives, the JTC1 leadership decided to improvise a new rule, or more precisely the application of a different rule in the wrong context. The argument appears to be that section 9.5 applies to BRM votes. Section 9.5 “Combined Voting Procedure” is introduced as:

The voting procedure which uses simultaneous voting (one vote per country) by the P members fo [sic] JTC 1 and by all ISO member bodies and IEC national committees on a letter ballot is called the combined voting procedure. This procedure shall be used on FDISs, DISs, FDAMs, DAMs and FDISPs.

This is absurd. JTC1 Directives are not a menu. You can’t just pick what voting procedure you want to use from the list. The Directives tell you what procedure to use. First, the combined voting procedure is for letter ballots given to an NB, not for a BRM meeting vote by a delegation. Second, the BRM was not voting on an FDIS, DIS, FDAM, DAM or FDISP. We were voting on whether to include changes into a set of meeting resolutions. We were told repeatedly that the BRM could not take a position on the DIS. Finally, if combined voting procedures are read as applying to Fast Track, then they would also, by that same logic, need to apply equally to PAS, since both PAS and Fast Track are DIS’s. But as shown earlier, the PAS process explicitly calls for P-member majority voting according to 9.1.4.

One does not arrive at the voting rules of 9.5 by any straightforward or natural reading of the Directives.

So again, repeating from JTC1 Directives 1.2:

These Directives shall be complied with in all respects and no deviations can be made without the consent of the Secretaries-General.

I wasn’t in favor of having any batch ballot, because it violates the spirit of the consensus process, as defined in JTC1 Directives 1.2:

These Directives are inspired by the principle that the objective in the development of International Standards should be the achievement of consensus between those concerned rather than a decision based on counting votes.

[Note: Consensus is defined as general agreement, characterised by the absence of sustained opposition to substantial issues by any important part of the concerned interests and by a process that involves seeking to take into account the views of all parties concerned and to reconcile any conflicting arguments. Consensus need not imply unanimity. (ISO/IEC Guide 2:1996)]

To resort to “counting votes” on the vast majority of the technical issues of DIS 29500, without discussion or opportunity for objection, is a failure of the JTC1 process. But if we are to have a vote at all, then let it be done in accordance with the rules.

So, let’s stop the nonsense. Let’s quit the tortuous post facto reinterpretation of the rules. Let’s recount and republish the results of the BRM counted according to the Directives and move on with the process. If JTC1 cannot consistently adhere to its own rules, then it should consider another line of business.

Filed Under: OOXML

OOXML, Macros and Security

2008/03/04 By Rob 9 Comments

As we all know, rich desktop editors, such as those provided in Microsoft Office, offer a range of end-user programming options, such as Visual Basic macros. These can be used to automate repetitive clerical tasks, such as a mail merge, or to add a custom user interface over a data entry form. These capabilities have existed in personal productivity applications since the late 1980’s, so for 20 years now. This is not a cutting-edge feature.

Such scripting capabilities are essential for the creation of high-value scripted documents. These features are essential in modern applications. Almost every word processor or spreadsheet today has automation capabilities. Even open source applications like OpenOffice have macro features. So, considering the popularity and value of scripting in a productivity application, it is much lamented that DIS 29500 does not define how scripts or macros are to work. This lack will cause serious interoperability concerns, as each vendor, lacking standards guidance, will implement these features in incompatible ways.

Specifically, in order to have any interoperability among scripted documents, it is necessary to define:

  • How and where a script is stored and located within the Open Packaging Convention (OPC) container file.
  • How the script is bound to the document. In other words, how does the document content associate itself with the macro?
  • What the runtime language of the script is.
  • What core and extension API’s are available to the script.
  • What the security model is.

OOXML defines none of these. So how can it meet its goal to “represent faithfully the existing corpus of word-processing documents, spreadsheets and presentations that have been produced by Microsoft Office applications (from Microsoft Office 97 to Microsoft Office 2008 inclusive)”? How can it do that and ignore the macros that have been around for decades?

Note that there is ample precedent for a markup standard answering these questions in a flexible and interoperable manner. For example the common web paradigm would be:

  • Script is located via URL specified in a “src” attribute of a script element, or is given inline
  • The script is invoked by a function call at a particular point in the document, or triggered from a standard event such as onLoad().
  • Multiple runtime languages are supported, often EcmaScript
  • The API’s allowed are defined by the W3C’s DOM API
  • There is a defined security model to deal with hazards such as cross-frame scripting, etc.
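
Because the web paradigm defines where scripts live, a third-party tool can enumerate them without knowing anything about the browser that will run them. The following is a minimal sketch of that idea using Python's standard `html.parser` module. The class name `ScriptLocator`, the function `locate_scripts` and the sample page are hypothetical names chosen for illustration only.

```python
from html.parser import HTMLParser


class ScriptLocator(HTMLParser):
    """Collects external script URLs and inline script bodies from HTML."""

    def __init__(self):
        super().__init__()
        self.external = []   # values of src attributes on script elements
        self.inline = []     # text of script elements that have no src
        self._in_inline_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src")
            if src:
                self.external.append(src)
            else:
                # Start collecting the body of an inline script.
                self._in_inline_script = True
                self.inline.append("")

    def handle_data(self, data):
        # Script text may arrive in several chunks, so accumulate it.
        if self._in_inline_script:
            self.inline[-1] += data

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_inline_script = False


def locate_scripts(html_text):
    """Return (external_urls, inline_bodies) for the given HTML text."""
    locator = ScriptLocator()
    locator.feed(html_text)
    locator.close()
    return locator.external, [body.strip() for body in locator.inline]
```

A scanner, a mail gateway or a document splitter can all build on something this simple, precisely because the location of the script is part of the markup standard rather than an implementation detail.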

OOXML provides none of this, so interoperability of these high value documents is not possible. Note again that scripting is widespread and has been around for 20 years. So it is especially unfortunate that a newly proposed standard lacks this capability.

Note however that scripting is not without its problems. We all remember the Word Macro Viruses of several years ago, such as Melissa. Portable code has well-known risks, and these risks have well-known counter-measures. For example, it is common for anti-virus software to scan Word documents for viruses. It is also common for mail servers to scan incoming emails for attachments with viruses, and even remove the macros or block documents with macros, according to admin policy. So there is a need to enable 3rd party applications that can locate, retrieve, scan and delete scripting elements from documents. However, since OOXML does not define even where the scripts are stored, or how they can be located, such 3rd party applications cannot be written in general for a document described by this specification. The standard provides an insufficient foundation for implementing a reasonable security policy around OOXML documents.

For example, take Ecma Response 101, approved in Geneva in a 9-4 vote as part of a large batch of 1027 changes, without discussion or opportunity for dissent. Four NB’s, in their ballot comments from last September, pointed out that Section 2.16.5.41 of DIS 29500’s Part 4 defines a “MACROBUTTON” field that allows the definition of a button in the document that will trigger a macro. But nothing is said about how the macro is stored, bound, what API’s are available, what the security model is, etc.

The request from one NB was to “Describe this feature to a level where cross-platform, cross-application interoperability is possible.” However, what Ecma provided in their draft Disposition of Comments report, approved in batch by the BRM without discussion or opportunity for objection, was something quite different. They merely added the following text:

The mechanism by which the command specified by text in field-argument-1 is located and/or executed by an application is implementation-defined

So not only is it impossible to have cross-platform interoperability of this feature, it is not even possible to implement a reasonable security policy to detect, scan or block macros. Even the location of the macro is outside the scope of the standard. It could be just another file in the Zip. It could be a binary blob with an obscure content type that varies from application to application. It could be base64Encoded in the XML. Or it could be steganographically encoded in low-order bits of an image file. The OOXML standard is singularly unhelpful in telling us how to deal with the risks of this macro function.
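
To make the problem concrete, here is a minimal sketch of what a macro scanner is reduced to when the standard is silent: guessing from part names inside the Zip. The hint list is an assumption, not something taken from DIS 29500. It reflects the fact that Microsoft Office, in practice, writes a part named `vbaProject.bin`, which is a habit of one implementation rather than a rule of the standard. The function names `find_suspect_parts` and `build_demo_package` are hypothetical.

```python
import io
import zipfile

# Assumption: name fragments that one implementation happens to use.
# Nothing in the standard requires a conforming producer to use them.
SUSPECT_NAME_HINTS = ("vbaproject", "macro")


def find_suspect_parts(package_bytes, hints=SUSPECT_NAME_HINTS):
    """Return the sorted names of Zip entries whose names contain a hint."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        return sorted(
            name
            for name in package.namelist()
            if any(hint in name.lower() for hint in hints)
        )


def build_demo_package(entries):
    """Build an in-memory Zip from a {part name: bytes} mapping (demo only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        for name, data in entries.items():
            package.writestr(name, data)
    return buffer.getvalue()
```

The heuristic finds a macro part only when the producer chose a familiar name. A producer that stores the same code in a part called `word/blob7.dat`, or inside an image, passes straight through, and nothing in the specification says it may not do so.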

Finally, note that this lack of information on how to locate macros within a document makes it impossible for anyone to programmatically combine or divide OOXML documents which may contain macros. For example, imagine a 2-page spreadsheet, with a macro on sheet one only. How can it be split into two one-page documents, if there is no defined way to locate the script associated with page one? This is the type of automated composition and document manipulation that OOXML should be enabling. Similarly, how can one combine two single documents containing macros into one document, if there are no defined rules for locating and naming macros? Many basic types of applications, such as merging slide shows, etc., will break in the presence of macros.

The above topic was of interest to several NB’s in Geneva, but could not be discussed for lack of time at the BRM.

Filed Under: OOXML

The Carolino Effect

2008/03/04 By Rob 11 Comments

“There is it some game in this wood?”

Pedro Carolino wanted to write and publish a Portuguese/English phrase book.

“Another time there was plenty some black beasts and thin game, but the poachers have killed almost all.”

But one small problem — Carolino did not know English.

“Look a hare who run! let do him to pursue for the hounds! it go one’s self in the plonghed land.”

Undeterred, Carolino hatched a clever plan.

“Here that it rouse. let aim it! let make fire him!”

He had a copy of a Portuguese/French phrasebook, O Novo guia da conversação em francês e português by José da Fonseca. And he had a French/English dictionary.

“I have put down killed.”

With these two resources, writing his phrase book would be easy. Or so he thought.

“Me, i have failed it; my gun have miss fixe.”

Starting from the French half of the text in da Fonseca’s book, Carolino dutifully used his dictionary to translate, word-for-word, the French into English.

The result, O Novo Guia da Conversação, em Português e Inglês, em Duas Partes was published in Paris in 1855, and is now considered to be a classic of unintentional humor.

“Here certainly a very good hunting.”

A similar problem occurs in DIS 29500 “Office Open XML”. The scope of OOXML, as amended by the BRM is stated as:

This International Standard defines a set of XML vocabularies for representing word-processing documents, spreadsheets and presentations. The goal of this standard is, on the one hand, to represent faithfully the existing corpus of word-processing documents, spreadsheets and presentations that have been produced by Microsoft Office applications (from Microsoft Office 97 to Microsoft Office 2008 inclusive). It also specifies requirements for Office Open XML consumers and producers , and on the other hand, to facilitate extensibility and interoperability by enabling implementations by multiple vendors and on multiple platforms.

Faithful representation of Microsoft Office 97-2008. I’ve learned it is rarely polite to ask a man what he means by “faithful”, but let me make an exception here. We now have the binary Office format specifications, not part of the standard, but posted by Microsoft. And we have the OOXML specification. In what way does OOXML “represent faithfully” the “existing corpus” of legacy documents?

Does OOXML tell you how to translate a binary document into OOXML? No. Does it tell you how to map the features of legacy documents in OOXML? No. Does it give an implementor any guidance whatsoever on how to “represent faithfully” legacy documents? No. So it is both odd and unsatisfactory that the primary goal of the OOXML standard is so tenuously supported by its text.

Now, certainly, someone using the binary formats specifications, and using the OOXML specification, could string them together and attempt a translation, but the results will not be consistent or satisfactory. It is the Carolino Effect. Knowing the two endpoints is not the same as knowing how to correctly map between them. A faithful mapping requires knowledge not only of the two vocabularies, but also the interactions.

Also, having the two specifications does not help with the 77 features in OOXML which are declared to be “implementation-defined” or “application-defined”. How are these translated from the binary formats?

Note that DIS 29500 bears the obvious marks of its legacy roots, from the use of VML and non-hierarchical run structures in WordProcessingML, to bit fields and idiosyncratic leap year calculations in SpreadsheetML. This suggests the likelihood that the authors of this standard did not just sit down and design the standard from scratch, but that they in fact had access to the binary format specification and mapped it into XML as a preparatory step. It is difficult to explain the presence of elements such as “lineWrapLikeWord6” without positing the presence of such a mapping.
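
As a small illustration of one such legacy mark, here is a sketch that contrasts the Gregorian leap year rule with a legacy spreadsheet rule. It assumes that the idiosyncrasy in question is the long-standing spreadsheet habit of treating 1900 as a leap year. The function names are hypothetical and the code is not taken from the specification.

```python
import calendar


def is_leap_gregorian(year):
    """Standard Gregorian rule, as implemented by the Python library."""
    return calendar.isleap(year)


def is_leap_legacy_1900_system(year):
    """Assumed legacy rule: same as Gregorian, except 1900 counts as leap."""
    return year == 1900 or calendar.isleap(year)


def days_in_february(year, leap_rule):
    """Number of days in February under the given leap year rule."""
    return 29 if leap_rule(year) else 28
```

Under these assumptions the two rules agree for every year except 1900, where the legacy rule yields a 29 February that never existed. A quirk like this is hard to explain as a fresh design decision, and easy to explain as a mapping from an older format.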

Microsoft should simply publish this mapping. Without such a mapping, conversions will be inconsistent, interoperability will suffer and a primary goal of the standard will not be met. Given the same binary document, Microsoft Office, Apple iWork, OpenOffice.org, etc., will all produce different OOXML documents. How is this “faithfully representing” existing documents? What is needed is a canonical mapping.

Note that the initiation of an open source project to develop a converter between the binary formats and OOXML is insufficient. What is required is a canonical mapping. Otherwise we are faced with the reality that the true goal of OOXML is more accurately stated as:

To allow Microsoft the ability to represent their legacy documents in XML and pretend that it is a capability that other vendors can practice as well.

Though this issue was of great interest to several NB’s, it could not be raised at the BRM for lack of time.

Filed Under: OOXML

Copyright © 2006-2023 Rob Weir · Site Policies