
An Antic Disposition


Archives for January 2007

More Matter with Less Art

2007/01/31 By Rob 29 Comments

I wish to discuss a recent blog post, a vigorous defense of Microsoft’s Office Open XML and XAML from Novell’s Miguel de Icaza. His post is so wrong, on so many levels, that I am somewhat at a loss for words. Miguel is not stupid, and I find it hard to believe that he is a Microsoft shill, so I must assume that he was imperfectly informed on this issue. “Everyone is entitled to their own opinions but they are not entitled to their own facts,” as Pat Moynihan was fond of saying. I’ll try hard not to make this personal, but there are so many errors in his post that he may very well feel the sting of correction in my words, and for that I apologize in advance.

I suggest you read through Miguel’s post in its entirety, and then return here for my response.

After an attack against lawyers, we come to some technical comments:

Unlike the XML Schema vs Relax NG discussion where the advantages of one system over the other are very clear, the quality differences between the OOXML and ODF markup are hard to articulate.

The high-level comparisons so far have focused on tiny details (encoding, model used for the XML). There is nothing fundamentally better or worse in those standards like there is between XML Schema and Relax NG.

ODF grew out of OpenOffice.org and is influenced by its internal design. OOXML grew out of Microsoft Office and it is influenced by its internal design. No real surprises there.

Maybe I can be of some assistance here, helping to articulate the difference in quality between ODF and OOXML. ODF, starting from its roots in OpenOffice.org specification, spent a further 2 1/2 years being improved and reviewed in OASIS, then further work preparing for submission to ISO, then a further year in ISO, receiving more comments and corrections, before it was published as an ISO standard. So this is a combined 4 years in technical committees being refined by standards bodies. During this time ODF has been implemented in dozens of applications, including full suites like OpenOffice.org, KOffice and Lotus Workplace, as well as individual applications like AbiWord, Gnumeric and Google Docs and Spreadsheets.

In comparison, OOXML went from a proprietary Microsoft specification to an Ecma standard in record time. If you make something 8 times lengthier than ODF, and do it 4 times faster than ODF, then you are going to have a quality problem. The list of problems on GrokLaw is one list of known problems in OOXML. Note that particular list was generated in only 3 or 4 days by volunteers. I recently did a sampled survey of OOXML specification quality and predicted that it contains thousands of errors.

And where are the OOXML implementations? OOXML was approved by Ecma and submitted to ISO without a single available implementation. Certainly, Office 2007 later shipped with support, but is that it? A single implementation? Until you have at least two independent implementations of a standard you will have a very imperfect understanding of the standard’s quality.

So the question to ask is this: Why should JTC1 NB volunteers deal with the mess that Microsoft dropped on their lap by their overhasty review of OOXML in Ecma? Why should they spend the next 6 months reviewing this specification when even a cursory review shows it is defective in so many ways? And considering the observed low level of quality, why should it be reviewed and approved via a Fast Track process, and all in one big chunk of 6,000 pages? Isn’t this the last thing you want to do, following up a rushed review in Ecma by a rushed review in ISO? Instead this should go back to Ecma to let them do a proper review, one they can be proud of.

Miguel correctly points out that OOXML derives from Microsoft Office’s formats, and ODF derives from OpenOffice.org’s formats. But then he leaps to an assertion that they both reflect their parent application’s internals. This is not true. Only a poorly-designed file format reflects the internals of the application. Maybe that is how we did it back in the 1980’s, but best-practices for portable file formats have been known for years now. That is why we have data formats like XML, so the format can be independent of the application internals. ODF was designed, even in the OpenOffice days, from the ground up to be an application- and platform-neutral document format. While it was further developed in OASIS, it continued to take on such good qualities as reuse of existing relevant W3C standards such as XForms and MathML and SVG. So certainly, the platform-independence and open nature of OpenOffice.org rubbed off on ODF, but isn’t that an extremely good thing?

OOXML, on the other hand, matches to an inane degree the internals of a single vendor’s legacy application, with no concessions to platform-neutrality. For example, OOXML encodes data in non-XML formats such as binary blobs, bitmasks and other encodings that defy XML schema validation or processing by XML tools. As I’ve said before, this is not a specification, this is a DNA sequence.

Does that help articulate the difference?

Miguel then takes on the size question:

A common objection to OOXML is that the specification is “too big”, that 6,000 pages is a bit too much for a specification and that this would prevent third parties from implementing support for the standard.

Considering that for years we, the open source community, have been trying to extract as much information about protocols and file formats from Microsoft, this is actually a good thing.

This is a good thing, I agree, that Microsoft has produced this specification. I’d like even more for them to make the specification for the Office binary formats public, since that is the format that the billions of legacy documents are actually in. I hope you’ll join with me in calling for Microsoft to release the specification for these formats under their Open Specification Promise, so that users will truly be able to choose which format they want to remain in or move to.

However, the fact that it is useful from a disclosure perspective does not necessarily mean it will make a good standard. Being better than nothing does not make it sufficient for an ISO standard. There is an important difference between a descriptive specification and a prescriptive standard. Writing down file formats is a small virtue, and one that other companies have done for years. Do they all deserve to be ISO standards?

For example, many years ago, when I was working on Gnumeric, one of the issues that we ran into was that the actual descriptions for functions and formulas in Excel was not entirely accurate from the public books you could buy.

OOXML devotes 324 pages of the standard to document the formulas and functions.

….

Depending on how you count, ODF has 4 to 10 pages devoted to it. There is no way you could build a spreadsheet software based on this specification.

This is a rather bold misstatement, considering that implementations such as OpenOffice.org, KSpread, Gnumeric, Google Spreadsheets, Lotus Workplace, etc., already in fact exist. Going back even earlier, we had 1-2-3, Quattro Pro and OpenOffice all supporting Excel’s formulas even though there was no formal specification for it. Sure, having a good specification helps, but the extreme rhetoric that says that this is unimplementable is patently absurd. Just look around.

Some folks have been using a Wiki to keep track of the issues with OOXML. The motivation for tracking these issues seems to be politically inclined, but it manages to pack some important technical issues.

Hmm… The open source community helps test a purported open standard, reports the defects it finds, and this is called “politically inclined”? Isn’t this what open source is all about, “given sufficient eyeballs, all bugs are shallow”? Shouldn’t open standards be subject to scrutiny? As I said in my blog, I am so impressed by the quality and productivity of this type of wiki-enabled public review that I am going to investigate how we can do this to solicit public comments on ODF 1.2. This isn’t for political reasons. This is because it works.

Some of the objections over OOXML are based around the fact that it does not use existing ISO standards for some of the bits in it. They list 7 ISO standards that OOXML does not use: 8601 dates and times; 639 names and languages; 8632 computer graphics and metafiles; 10118-3 cryptography as well as a handful of W3C standards.

By comparison, ODF only references three ISO standards: Relax NG (OOXML also references this one), 639 (language codes) and 3166 (country codes).

Not only it is demanded that OOXML abide by more standards than ISO’s own ODF does, but also that the format used for metafiles from 1999 be used. It seems like it would prevent some nice features developed in the last 8 years for no other reason than “there was a standard for it”.

Miguel has inexplicably omitted all of the W3C standards that ODF uses, such as XForms, MathML, SVG, XLink, SMIL, XSLT, CSS2 as well as IETF standards such as RFC 2045, RFC 2048, RFC 2616, RFC 2898, RFC 3066, RFC 3987. To imply that OOXML follows more standards than ODF is a foolish statement, unsupported by facts.

On the WMF, Miguel has it all wrong. What is a Windows Metafile? It is simply a recording of the graphical function calls made by Windows as it renders a drawing. It maps 1-to-1 into Windows API calls. It maps so closely to Windows that when the WMF format was found to be vulnerable to a security flaw, even the Wine Windows compatibility layer for Linux was susceptible to the same security hole. WMF (and VML, another legacy format in OOXML with a history of security problems) are flawed formats. One security vendor said: “Turns out this is not really a bug, it’s just bad design. Design from another era.” and “The WMF vulnerability probably affects more computers than any other security vulnerability, ever.”

Although Miguel is pleased to note that the proposed cross-platform ISO standard, Computer Graphics Metafile (CGM) dates to 1999, he fails to mention that WMF is even older, dating back to Windows 3.0 (1990).

So which one should be preferred in an ISO standard? The Windows Metafile format which is not documented in an open standard, is tied to the graphical layer of a single vendor, and has design flaws with serious security implications? Is this what we really want? Or do we want an open standard, one designed to be platform neutral, that has been in use for eight years, that has had a community continuing development and promotion of it such as CGM Open and WebCGM? Where is the WMF community? A Google search for WMF comes up with security problems; a search of CGM comes up with communities, initiatives and test suites.

There is an important-sounding “Ecma 376 relies on undisclosed information” section, but it is a weak case: The case is that Windows Metafiles are not specified.

It is weak because the complaint is that Windows Metafiles are not specified. It is certainly not in the standard, but the information is publicly available and is hardly “undisclosed information”. I would vote to add the information to the standard.

Did you really read the Groklaw issues list? WMF is not the only, or even the most troublesome of the undisclosed information in OOXML. Start here, then go back and read the Groklaw list of issues, and let me know if it makes more sense then. I am not that good at explaining these things, so please ask questions and I will try harder.

I have obviously not read the entire specification, and am biased towards what I have seen in the spreadsheet angle. But considering that it is impossible to implement a spreadsheet program based on ODF, am convinced that the analysis done by those opposing OOXML is incredibly shallow, the burden is on them to prove that ODF is “enough” to implement from scratch alternative applications.

There is that claim again, that it is impossible to implement an ODF spreadsheet. Miguel, surely you are aware of OpenOffice, KSpread, Lotus Workplace, Gnumeric, Google Docs? How can you persist in such obvious error? How could you actually write the above when you know, I know, and everyone reading it knows that it is patently false? Please tell me it was just a typographical error.

Here’s a challenge: Give me a list of four spreadsheet applications from four different vendors that today are as interoperable with OOXML as the four leading ODF spreadsheets are with ODF.

There is a good case to be made for OOXML to be further fine-tuned before it becomes an ISO standard. But considering that Office 2007 has shipped, I doubt that any significant changes to the file format would be implemented in the short or medium term.

The best possible outcome in delaying the stamp of approval for OOXML would be to get further clarifications on the standard. Delaying it on the grounds of technical limitations is not going to help much.

This is quite a revealing statement. Why should the shipment of Office 2007 factor in the appropriateness and the quality of a proposed International Standard? Should standards of quality be relaxed for Microsoft’s convenience? Do technical limitations not matter because Microsoft has sales targets to meet? Is this what ISO is for? If so, I suggest their hard-working volunteers be given Microsoft salaries and stock options, since clearly they would be working only for Microsoft’s benefit at this point.

Miguel has a good point at the end:

To make ODF successful, we need to make OpenOffice.org a better product, and we need to keep improving it. It is very easy to nitpick a standard, specially one that is as big as OOXML. But it is a lot harder to actually improve OpenOffice.org.

If everyone complaining about OOXML was actually hacking on improving OpenOffice.org to make it a technically superior product in every sense we would not have to resort, as a community, to play a political case on weak grounds.

OpenOffice.org is one, but not the only application of ODF. It is the most prominent one in the traditional heavy-weight office suite model, but I’m not certain that this is the only way forward. We need good implementations, several of them, since one size does not fit all.

In any case I’d say in return that if Microsoft and Microsoft boosters spent some of their time investigating exactly how easy it would be to encode Office’s legacy features on top of the extensible ODF specification, and worked together with the ODF community to address their common concerns, then we could easily have a single interoperable format that we all could use. The resulting standard of OOXML on top of ODF would be smaller, simpler, higher quality and more interoperable than the mess that we’ll end up with by having OOXML as a standard, in addition to ODF.


Change Log:

2/1/2007 — fixed spelling errors reported by a reader via email
2/2/2007 — another spelling error


Filed Under: ODF, OOXML

Defining Deviancy Down

2007/01/30 By Rob 10 Comments

Kai Erikson, in his classic study of deviant behavior in early New England, Wayward Puritans, made the important observation that:

…the amount of deviation a community encounters is apt to remain fairly constant over time. To start at the beginning, it is a simple logistic fact that the number of deviancies which come to a community’s attention are limited by the kinds of equipment it uses to detect and handle them, and to that extent the rate of deviation found in a community is at least in part a function of the size and complexity of its social control apparatus. A community’s capacity for handling deviance, let us say, can be roughly estimated by counting its prison cells and hospital beds, its policemen and psychiatrists, its courts and clinics.

In other words, a community’s perception of social deviation is conditioned and limited by their capacity for controlling it. With equal number of punishment cells, equal-sized communities of cloistered monks and bloodthirsty pirates would perceive the same rate of deviancy. Of course the actual deviations would be different: Brother Maynard isn’t praying earnestly enough versus Greybeard slit a crewmate’s throat in the night, without warning the bunkmate below.

The late Senator from New York, Daniel Patrick Moynihan, took this idea and applied it to the social ills that America has increasingly faced since the 1960’s: mental illness, illegitimacy and violent crime. How does society react when the level of deviancy rises unexpectedly and rapidly above accepted norms? He observed, in an essay entitled, “Defining Deviancy Down”:

[…T]he amount of deviant behavior in American society has increased beyond the levels the community can “afford to recognize” and that, accordingly, we have been re-defining deviancy so as to exempt much conduct previously stigmatized, and also quietly raising the “normal” level in categories where behavior is now abnormal by any earlier standard.

I look at the current situation with Office Open XML (OOXML) in a similar way. There is a clearly defined community (JTC1 member National Bodies) with the responsibility for reviewing submitted standards. However, their capacity for exercising control is finite. The JTC1 Directives allow them a fixed period of time to review any submission. They also have a fixed number of volunteers to perform the review, and a fixed (or at least highly constrained) number of meetings to discuss and agree on review comments. So, when presented with a specification of unprecedented length (over 6,000 pages), and rather low quality, what are they to do? Spend hundreds of hours reading the specification? Write up and report thousands of errors? No, the capacity in JTC1 to deal with this level of deviancy does not exist, so the natural way for the community to cope is to define deviancy down.

How deviant is OOXML? The 6,000+ page length is one aspect. Another is the rate at which it raced through its Ecma review, 20-times the speed of comparable specifications. Certainly, a longer specification will tend to have more problems than a shorter one, and a rushed review will find fewer problems than a thorough one. But that is speaking in generalities. Is there anything we can say for OOXML defect rates?

The Groklaw review, which occurred over a few days, found a large number of serious problems. But I think we can quantify this a bit more. I tried an experiment. I used a random-number generator to generate a sample of 20 page numbers in the OOXML specification. I then read each of these pages, looking for technical errors, platform dependencies, lack of extensibility, drafting errors, etc. I did not bother noting spelling, grammatical or usage errors. I recorded how many reportable errors I found on each page. Some pages had zero problems, others had 1, 2 or even 3 problems. I even found one particularly bad error that could send OOXML back to Ecma once reported (more on that another day), but the average errors per page was 1.0. So, projecting out to a 6,039-page specification, this leads to a prediction of 6,000 +/- 1,000 errors. Reviewing a larger number of pages would reduce the error bars on that prediction, but we seem to be dealing with defects numbering in the thousands.
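For readers who want to see how a projection like this can be computed, here is a minimal sketch in Python. It is not the calculation actually used for the figures above: the per-page counts are not published here, so the `hypothetical_counts` list is invented purely for illustration (chosen only so that it averages 1.0 over 20 pages), and the sketch assumes a simple random sample with the error bar taken as one standard error of the mean. The function names `pick_pages` and `project_total` are likewise made up for this example.

```python
import math
import random
import statistics

TOTAL_PAGES = 6039  # page count given in the post
SAMPLE_SIZE = 20    # number of pages sampled in the post


def pick_pages(total_pages, sample_size, seed=None):
    """Draw distinct page numbers uniformly at random, without replacement."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(1, total_pages + 1), sample_size))


def project_total(counts, total_pages):
    """Scale the sample mean up to the whole document.

    Returns (estimate, standard_error), where the standard error is the
    standard error of the sample mean multiplied by the page count.
    """
    mean = statistics.mean(counts)
    std_err_mean = statistics.stdev(counts) / math.sqrt(len(counts))
    return mean * total_pages, std_err_mean * total_pages


# HYPOTHETICAL per-page error counts, invented for this sketch only.
# They are NOT the counts recorded in the experiment described above;
# they merely share its sample size (20) and its average (1.0 per page).
hypothetical_counts = [0] * 5 + [1] * 11 + [2] * 3 + [3] * 1

pages = pick_pages(TOTAL_PAGES, SAMPLE_SIZE, seed=2007)
estimate, std_err = project_total(hypothetical_counts, TOTAL_PAGES)
print(f"sampled pages: {pages}")
print(f"projected errors: {estimate:.0f} +/- {std_err:.0f}")
```

Because the standard error shrinks with the square root of the sample size, quadrupling the number of pages read would roughly halve the error bar, which is the sense in which reviewing more pages tightens the prediction.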

Are NB’s able to deal with a level of deviancy this great? Do they possibly have the resources to detect and report this number of errors and then verify that they are addressed? If not, the natural reaction is to define deviancy down.

For example, OOXML is currently in a 30-day review period where “contradictions” with existing ISO or IEC standards can be alleged by National Bodies (NB’s). Although the word “contradiction” is not defined in JTC1 Directives, its meaning can be seen from a resolution unanimously adopted at a JTC1 Plenary in 2000:

Resolution 27 – Consistency of JTC 1 Products

JTC 1 stresses the strong need for consistency of its products (ISs and TRs) irrespective of the route through which they were developed. Any inconsistency will confuse users of JTC 1 standards and, hence, jeopardize JTC 1’s reputation. Therefore, referring to clauses 13.2 (Fast Track) and 18.4.3.2 (PAS) of its Directives, JTC 1 reminds ITTF of its obligation to ascertain that a proposed DIS contains no evident contradiction with other ISO/IEC standards. JTC 1 offers any help to ITTF in such undertaking. However, should an inconsistency be detected at any point in the ratification process, JTC 1 together with ITTF will take immediate action to cure the problem.

The clear meaning of this is that contradictions are to be avoided, and that some of the defining characteristics of standards with contradictions are that they are not consistent, that they confuse users, and that they jeopardize JTC1’s reputation.

Further, we have precedents of other contradictions raised within JTC1, such as just last year, when the NB’s of the UK and Germany both alleged contradictions against Microsoft’s C++/CLI specification, then submitted for Fast Track processing from Ecma. The contradiction raised by the German NB (DIN) in that case said in part:

On a technical level, there are some rather different approaches between C++ and C++/CLI which can easily cause considerable confusion when both languages are considered to be “C++” or add unnecessary overhead when trying to write C++ code usable with C++ and C++/CLI. Below are a few example although if there were sufficient time to to thorough analysis of the C++/CLI document more could probably be found.

This is simple, easy to understand, and well within the spirit of the JTC1 Resolution quoted earlier.

But in a notable case of defining deviancy down, we’re starting to see the word “contradiction” defined very narrowly. For example, Microsoft’s Brian Jones suggests contradictions should be looked at this way:

[T]his is where you want to make sure that the approval of this ISO spec won’t cause another ISO standard to break. In the case of OpenXML, there really can’t be a contradiction because it’s always possible to implement OpenXML alongside other technologies. For instance, OpenOffice will soon have support for ODF and OpenXML.

An example of a contradiction would be if there was a standard for wireless technology that required the use of a certain frequency. If by using that frequency you would interfere with folks using another standard that also leverages that frequency, then there may be a contradiction.

To be quite fair, the Chinese WAPI defeat in ISO is also a precedent, but when searching for a definition of “contradiction” all precedents should be considered, not just one. Arguing exclusively from a wireless protocol standard precedent when dealing with the case of an XML markup standard is dubious when contradictions just last year were alleged to a programming language, a technology much closer to OOXML than a wireless protocol is. Surely, since C++/CLI is Microsoft’s technology they would be aware of this precedent? But still they didn’t mention it.

I ask you to consider the impact of taking Microsoft’s definition of “contradiction” and applying it to virtual technologies, like document formats, image formats, presentation formats, programming languages, operating system interfaces, API’s, security protocols, anything in the realm of software rather than hardware. None of these can ever conflict by Microsoft’s definition. Never. Therefore there are never grounds for a contradiction, and the contradiction clause, which JTC1 adopted into its own Directives only a few years ago, is a procedural nullity, a no-op, meaningless, a waste of time for a large part of the technologies JTC1 has standards authority for. This is a clear example of defining deviancy down.

Let’s go back in time, 750 years ago to Thomas Aquinas and his Summa Theologica, the 13th century’s God: The Missing Manual. Aquinas had some apt words on contradictions, when discussing whether the powers of God were infinite and omnipotent (Question 25, Article 3):

Therefore, everything that does not imply a contradiction in terms, is numbered amongst those possible things, in respect of which God is called omnipotent: whereas whatever implies contradiction does not come within the scope of divine omnipotence, because it cannot have the aspect of possibility… For whatever implies a contradiction cannot be a word, because no intellect can possibly conceive such a thing.

Aquinas here allows that God can do all things that are possible, but cannot do something which is a contradiction in terms. Going back to Microsoft’s proposed definition of a contradiction, it seems that they are only willing to acknowledge a contradiction if it amounts to a co-existence problem so severe that even God could not resolve it. This seems to be a rather high hurdle to reach, and is clearly not what JTC1 intended. This is defining deviancy down, way down.

This is the essential problem JTC1 has with the OOXML submission. It is too large and has too many problems with it for the control mechanisms available to JTC1 (in particular review time and volunteers) for handling the presented level of deviancy. The only recourse available to them is to define deviancy down to the level where they can handle a much smaller number of problems. Of course, this will lead to a much lower-quality ISO Standard than we are accustomed to, but what other choice is there?

This lesson has clear ramifications for Microsoft. The bigger the specification, the less thoroughly it will be reviewed. If you make it large enough it will barely be reviewed at all. The plan for 2007 should be to combine the .NET, OPC, XPS, JScript, J#, C#, XAML, WPF, HD Photo and whatever other specifications you have handy, put them all into one 50,000 page document, call it the “Open Microsoft Specification”, rush it through Ecma and then Fast Track it into ISO. No one can really stop you. JTC1 Fast Track is broken.


Filed Under: OOXML, Standards

Microsoft on Standards

2007/01/29 By Rob 11 Comments

There are many delicious morsels in the many exhibits in the Iowa Comes v. Microsoft case. Maybe that is why the official website containing the exhibits was taken down within hours of the case being settled? Luckily websites like Slated Antitrust filled the void and host backup copies of these candid insights into Microsoft’s internal strategies.

Let’s take a look inside.

First, here is the opening “Evangelism is War” section of a report called Effective Evangelism.

Our mission is to establish Microsoft’s platforms as the de facto standards throughout the computer industry. Our enemies are the vendors of platforms that compete with ours: Netscape, Sun, IBM, Oracle, Lotus, etc. The field of battle is the software industry. Success is measured in shipping applications. Every line of code that is written to our standards is a small victory; every line of code that is written to any other standard, is a small defeat. Total victory, for DRG [Developer Relations Group], is the universal adoption of our standards by developers, as this is an important step towards total victory for Microsoft itself: ‘A computer on every desk and in every home, running Microsoft software.’

Then we have this email from Bill Gates:

One thing we have got to change is our strategy — allowing Office documents to be rendered very well by other peoples browsers is one of the most destructive things we could do to the company.

We have to stop putting any effort into this and make sure that Office documents very well depends on PROPRIETARY IE capabilities.

Anything else is suicide for our platform. This is a case where Office has to avoid doing something to destroy Windows.

And here is an excerpt from an email from then Microsoft GM Aaron Contorer to Bill Gates:

Switching Costs

In economics there is a well-understood concept called switching costs – how much it costs for a trading partner to change partners. Our philosophy on switching costs is very clear: we want low swiching costs for customers who want to start using our platform, and we want to provide so much unique value that there are in effect high costs of deciding to move to a different platform. There is a name for this: it is called Embrace and Extend.

Embrace means we are compatible with what’s out there, so you can switch to our platform without a lot of obstacles and rework. You can switch from someone else’s Java compiler to ours; from someone else’s web server to ours; etc. Customers love when we do this (as long as we don’t spend our energy embracing extra standards no one really cares about); our competitors are not sure they like it because they prefer us to screw up.

Extend means we provide tremendous value that nobody else does, so (A) you really want to switch to our software, and (B) once you try our software you would never want to go back to some inferior junk from our competitors. Customers usually like when we do this, since by definition it’s only an extension if it adds value. Competitors hate when we do this, because by adding new value we make our products much harder to clone – this is the difference between innovation and being just a commodity like corn where suppliers compete on price alone. Nobody builds or sustains a business as successful as Microsoft by producing trivial products that are easy to clone – that would be a strategy for failure.

If we fail to embrace, we can lose because there are big barriers to buying our products. But if we fail to extend, or do only humble work that is easy to clone or to surpass, we automatically lose because our competitors will spend literally billions of dollars to clone our work and replace us.

Patrick Ferell, at the time head of MSN tools and applications, worried about the internet’s open standards and protocols:

Looking out from the inside the current MSN strategy some things that concern me about the Internet and the Web are:

1) The Internet is about as open as it gets. This means that an ISV can go and buy a C compiler and a server, rent a wire and create a new service or create an extension to an existing one. The tools are still a little crude but there are very few bottlenecks in this process.

2) The Internet defines formats and architectures that MS has no control over and very little say in. MIME and the WWW helper architectures are crude but quite extensible.

Are there any other good Microsoft quotes out there regarding formats or standards? Post as a comment and I’ll add the best ones to the main post.


Change Log:

02/11/2007 — added Embrace & Extend quote sent in from reader
02/14/2007 — note on the links to the exhibits being broken
02/03/2008 — added MSN strategy quote


Filed Under: Microsoft, Standards

Adobe to Standardize PDF

2007/01/29 By Rob 3 Comments

According to the press release, it sounds like Adobe will submit their PDF 1.7 specification to AIIM, where it will be reviewed and refined before submission to ISO, likely to TC 171. AIIM, if the name isn’t familiar to you, is the Association for Information and Image Management. They have been around since 1943, and they have a thriving ANSI-accredited standards program.

Note that this is not PDF’s first trip to ISO. Subsets of PDF have been standardized for particular problem domains, such as:

  • PDF/A for archiving as ISO 19005-1:2005
  • PDF/X for digital prepress exchange as ISO 15930
  • PDF/E for engineering workflows, currently under review as ISO DIS 24517

But now we’ll be getting the full PDF functionality as an International Standard. This is good news. I’m pleased to see Adobe’s continuing leadership in this area. For more information on this topic, now and in the future, I recommend adding Adobe’s Duane Nickull to your regular cycle of blog reading.

Filed Under: Standards

A Review of the Wikipedia Article on ODF

2007/01/27 By Rob 5 Comments

As I had done last week with the Wikipedia article on Office Open XML (OOXML), I have taken a read through the article on OpenDocument Format (ODF). My aim was to do some fact checking and make some suggestions on some additional references that might be included. In some cases I’ve made additional usage or phrasing suggestions, but I have not endeavored to do a full edit of the article.

In accordance with Wikipedia’s Conflict of Interest guidelines, I will put a link to this blog entry on the ODF article’s Talk page. These points are offered for the volunteers editing the article to consider and use as they see fit. I’ll probably repeat this review on a quarterly basis.

Since the article is changing at a rather rapid rate, you should note that I looked at the revision of 27 January at 16:19 which you can retrieve here.

  1. Opening paragraph. “…is a document file format used for exchanging electronic documents”. I’d say instead, “…for describing electronic documents”. Documents are exchanged via protocols like SMTP, WebDAV or HTTP, etc. ODF is only describing the documents.
  2. Strictly speaking, ODF was developed by a technical committee (TC) working within the OASIS consortium. The point is OASIS as a whole approved ODF, but it was developed within a TC.
  3. Last sentence of first paragraph is awkward. I’d keep the details and dates in the Standardization section and just state the current status here: “OpenDocument is an OASIS Standard as well as an International Standard published as ISO/IEC 26300:2006”
  4. The next sentence is weak. I’d rephrase as something like “ODF meets the common definitions of an [Open Standard], meaning the specification is freely available and may be implemented freely”. Since Wikipedia already has a nice article on open standards, why not just link to that?
  5. The claim that ODF was “intended” to avoid vendor lock-in should be substantiated. That indeed may be one of its effects. But the charter of the TC did not mention that as an explicit goal. I think this is just loose language. Whenever you see a passive sentence, ask yourself, “Who or what did this”? Who intended ODF to be such and such? If you can provide a reference for that question, then you have something.
  6. Next sentence is awkward. How about, “OpenDocument is the first widely adopted International Standard for editable office documents.”?
  7. Under Specifications, in addition to the listed compression advantage of using the approach with the ZIP archive, it also has the benefit of separating the content, styles, metadata and application settings into four separate XML files. This is a good example of the architectural principle of [Separation of Concerns].
  8. I suggest we add here: “An important goal during the development of ODF was to reuse existing relevant standards where possible. Such standards used in ODF include [MathML], [Synchronized Multimedia Integration Language|SMIL], [SVG], and [XForms].” If needed, a link to the ODF TC’s charter would serve as an authoritative reference for the goal to reuse existing standards.
  9. The Standardization section seems to be split off into a linked article which is a bit outdated. Is this necessary? It might make more sense to bring this information back into the main article. Just my opinion.
  10. First sentence is not quite correct. ODF was developed by a technical committee (TC) working within the OASIS consortium.
  11. “OASIS Standard” should be capitalized as a proper noun.
  12. This section gets a bit weighted down with jargon. Does the average reader, even a technical reader, understand what a “DIS” is, or a “default ballot”? We should either explain the significance of these terms, or summarize. I don’t think this needs to contain a day-by-day retelling of how a specification made its way through ISO.
  13. OpenDocument Format 1.1 was approved as a Committee Specification in October. The ballot for approval as an OASIS Standard is occurring right now. (Would the average reader understand this distinction? Specifications are approved first by the ODF TC as Committee Specifications, then major versions are put forward for a vote by the entire OASIS membership as an OASIS Standard, and even more significant editions are then put forward for approval by ISO as an International Standard.)
  14. On the ODF 1.2 work, the parenthetical remark on spreadsheet formulas seems out of place and redundant since there is a separate Criticism header that covers this. The obvious presumption is that anything added to ODF 1.2 is added because it is not already there. Do we believe that any reader would think otherwise?
  15. Overall the 1.2 statement looks like it needs a rewrite. I’d suggest a simple statement like, “OpenDocument Format 1.2 is currently being drafted by the ODF TC. It is planned to contain additional accessibility features, metadata enhancements, spreadsheet formula definition (based on [OpenFormula]) and any errata submitted by the public.” (Discussion of various schedule predictions seems outdated since December has already come and gone.)
  16. Section on Application support — “Since there are a number of independent implementations of the ODF standard..”. This might be better in an “Interoperability” sub-section. If you make such a sub-section, the Fellowship’s test suite, mentioned earlier in the article, could be moved there as well.
  17. “Although Microsoft Office does not support OpenDocument…” should be, “Although Microsoft Office does not support OpenDocument natively…”
  18. Again, never trust engineers to come up with a good prediction of schedules. December has come and gone and no Add-in is complete.
  19. There should also be mention of Corel’s stated plans to add ODF support to WordPerfect Office. The press release you can reference is here.
  20. There is mention here of a “MS Open XML translator”. This was Microsoft’s name for their initiative. But the web page linked to here consistently refers to itself as the “ODF Add-in for Microsoft Word”. This is confusing. Maybe start with a mention of the Microsoft announcement from July 2006 (this press release) then say that one such project supported by Microsoft is the ODF Add-in for Word, etc.
  21. The ODMA mention is unrelated to ODF. It probably should be removed entirely.
  22. Under the Accessibility sub-section, might want to mention that a group at the University of Illinois has written an OpenDocument Format Accessibility Evaluator to scan uploaded ODF documents for how well they follow best practices for accessibility. A link to the project is here.
  23. Under Promotion section, we should link to the ODF Adoption TC’s web page here and mention that they also manage the web site http://OpenDocument.xml.org
  24. The promotion activities of OpenOffice.org should be included in the bullet list that follows, right? Not clear why it is not.
  25. “…as well as other companies who may or may not be working inside…” is weird. Was someone attempting to say something here? The fact that the ODF Alliance is stated as having “more than 280 members” should make it obvious that not all are members of the OASIS ODF TC. Is anything added by having this statement?
  26. ODF Alliance has 362 organizational members according to their latest newsletter here.
  27. In Adoption section, there is repetition of information that was already covered in the Application support section, such as the Microsoft-funded translator work.
  28. The Adoption section is incomplete, missing adoptions in Brazil, Argentina, Extremadura Spain, and India. The ODF Alliance newsletters have the details on these and others. This whitepaper is a good summary.
  29. In Criticism section, the statements, “Some mathematicians do not think that the choice of the MathML W3C standard for use in OpenDocument is a good choice” and “monstrosity written purely by web designers” lack an authoritative citation. All that is given is a link to an unnamed commenter on a GrokLaw article, whose credentials as a mathematician or a spokesman for mathematicians are not obvious. Consider that one of the authors of the MathML 2.0 standard, and co-chair of the W3C’s Math Working Group, is Patrick Ion, editor of the American Mathematical Society’s Mathematical Reviews. So the credibility of MathML should not so easily be set aside by a single anonymous, unsubstantiated comment. I’d also note that the Wikipedia article for MathML does not note such criticism.
  30. “The OpenDocument ISO specification does not contain a defined formula language” is more precise as “The OpenDocument ISO specification does not define a standard spreadsheet formula language.”
  31. “This means that ISO conforming files do not have to be compatible.” This is a weak argument. Even if the spreadsheet language were defined, ISO conforming documents are not required to be compatible. For example, two implementations may implement different subsets of features. And even without a formula standard, implementations can still be compatible. For example, 1-2-3, Quattro Pro and OpenOffice have been able to read Excel formulas for years, even though Microsoft had not specified this. Maybe what is meant here is “This means that spreadsheet implementations currently rely on application-level interoperability testing rather than referencing a normative specification of formula syntax and semantics.”
  32. The criticism of the ability to embed Java applets is new to me. No reference is given for this criticism. The section number establishes the existence of the feature, but does not establish grounds for criticizing it. Is this original research? If so, it does not belong on Wikipedia.
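
As a rough illustration of point 7 above, here is a minimal Python sketch that looks inside a ZIP package for the four separate XML parts. The part names (content.xml, styles.xml, meta.xml, settings.xml) follow the usual ODF package layout; the helper names and the tiny in-memory demo package are my own assumptions for illustration, and the demo package is a stand-in, not a valid ODF document.

```python
import io
import zipfile

# The four XML parts mentioned in point 7 (content, styles, metadata, settings).
ODF_PARTS = ("content.xml", "styles.xml", "meta.xml", "settings.xml")


def list_odf_parts(package_bytes):
    """Map each expected part name to its uncompressed size, or None if absent."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        names = set(zf.namelist())
        return {
            part: (zf.getinfo(part).file_size if part in names else None)
            for part in ODF_PARTS
        }


def make_demo_package():
    """Build a tiny stand-in package in memory (hypothetical, not valid ODF)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("mimetype", "application/vnd.oasis.opendocument.text")
        for part in ODF_PARTS:
            zf.writestr(part, "<placeholder/>")
    return buf.getvalue()


if __name__ == "__main__":
    for name, size in list_odf_parts(make_demo_package()).items():
        print(name, size)
```

To try this on a real document, read the file's bytes and pass them to `list_odf_parts`; each concern lives in its own part, so a tool that only needs the metadata never has to parse the content.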

Change Log
1/28/07 — corrected link to ODF’s Talk page

Filed Under: ODF, Wikipedia


Copyright © 2006-2023 Rob Weir · Site Policies