Monday, February 18, 2008

Interoperability the ELIZA way

Back in the days before the Web, before GUIs, before PCs, before many of you were even born, there was ELIZA. The brainchild of Joseph Weizenbaum, ELIZA was a 1966 computer program that seemed to understand.

I'm sure we've all played with ELIZA or variants over the years. I remember playing it during lunch in the 9th grade, working off of a teletype in the school's main office. This terminal was intended for connecting to a remote application at a local college that allowed us to query a database of US colleges. From my small town perspective, this was amazing. It was the only computer in the world that was connected to any other computer, or at least as far as I knew. This must have been 1984 or so. I wonder what this system was. BITNET?

In any case, I eventually figured out how to get to a command prompt, and found out that the terminal could be used for much more. I could play exciting text-based games, like Hunt the Wumpus and Moon Landing. I eventually figured out how to list the source code for games, and printed out some of them to take home and key in to my TI-99/4A. ELIZA was the first I ported. It was an interesting, early lesson in divergent standards, dealing with the differences between Dartmouth BASIC and TI BASIC.

In any case, for the benefit of those whose first computer had storage measured in gigabytes, here is an example of an ELIZA session:

> Hello, I am Eliza.
* Hello, I am BillG
> How are you today.. What would you like to discuss?
* ISO standards
> Tell me more...
* I want an ISO standard for OOXML
> I sometimes also want an ISO standard for OOXML.
* Good, maybe I can get you to write some letters for me.
> Oh... to write some letters for you?
* Yes, to national standards bodies, telling them why OOXML is good
> Are such questions on your mind often?
* Always
> Can you think of a specific example?
* I need an ISO standard and I need it bad
> Can you elaborate on that?
* Other people have one, so I need one too.
> Come, come, elucidate your thoughts.
* I need a standard quickly
> That is quite interesting.
* So, I went to Ecma and asked for a fast track standard
> That is quite interesting.
* And they gave me a 6,000 page one in less than a year
> Can you elaborate on that?
* That is their speciality, rubber-stamping standards
> Come, come, elucidate your thoughts.
* So, now I have an Ecma standard, but ISO doesn't like it
> Come, come, elucidate your thoughts.
* The open source zealots don't respect our standards
> I'm not sure I understand you fully.
* They are thieving communists who don't respect intellectual property
> What do you think?
* I think we should outlaw open source
> Do you really think so?
* Yes, otherwise there will be no more innovation
> Why no?
* Because without profit there is no incentive to innovate
> Why no?
* Without profit, no one would work
> Why no?
* What are you, some sort of communist?
> Would you prefer if I were not some sort of communist?
* Yes, better dead than red
> I understand.
* OK. So we need an ISO so we can continue innovation
> Come, come, elucidate your thoughts.
* An ISO allows us to sell the same old MS Office we always had with our monopolist's profit margin. That is how we ensure innovation.
> What do you think?
* I have a monopoly.
> You have a monopoly?
* Yes, otherwise we would not be able to claim innovation by having an ISO format that merely mimics a 10-year old legacy format.
> I see.

Now of course, this is something of a parlor trick, an illusion of understanding, but an impressive one nevertheless. Weizenbaum explains in his 1966 paper "ELIZA—a computer program for the study of natural language communication between man and machine" (CACM, Vol. 9, Num. 1, p. 43):

A large part of whatever elegance may be credited to ELIZA lies in the fact that ELIZA maintains the illusion of understanding with so little machinery. But there are also bounds to the extendability of ELIZA's "understanding" power, which are a function of the ELIZA program itself and not a function of any script it may be given. The crucial test of understanding, as every teacher should know, is not the subject's ability to continue a conversation, but to draw valid conclusions from what is being told.
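To give a feel for how little machinery is involved, here is a minimal sketch in Python of the kind of keyword matching and pronoun swapping that produces replies like those in the session above. It is not Weizenbaum's program (the original was written in MAD-SLIP and driven by a much richer script); the rules, the fallback lines, and the names `RULES`, `reflect`, and `respond` are all made up for illustration.

```python
import re

# Hypothetical rules for illustration only; not Weizenbaum's DOCTOR script.
REFLECTIONS = {"i": "you", "me": "you", "my": "your", "am": "are", "you": "I", "your": "my"}

RULES = [
    (re.compile(r"^i want (.*)$", re.I), "I sometimes also want {0}."),
    (re.compile(r"^i have (.*)$", re.I), "You have {0}?"),
    (re.compile(r"^what are you, (.*)$", re.I), "Would you prefer if I were not {0}?"),
]

FALLBACKS = ["Tell me more...", "Can you elaborate on that?", "That is quite interesting."]


def reflect(fragment):
    """Swap first- and second-person words so the echo reads as a reply."""
    return " ".join(REFLECTIONS.get(word.lower(), word) for word in fragment.split())


def respond(line, turn=0):
    """Return a canned reply: the first matching rule wins, else a rotating fallback."""
    text = line.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.match(text)
        if match:
            # The captured text is echoed back verbatim; nothing is understood.
            return template.format(reflect(match.group(1)))
    return FALLBACKS[turn % len(FALLBACKS)]


if __name__ == "__main__":
    for turn, line in enumerate(["ISO standards", "I want an ISO standard for OOXML", "I have a monopoly."]):
        print("*", line)
        print(">", respond(line, turn))
```

Nothing in this sketch models what an ISO standard or a monopoly is. The program can keep the conversation going, but it cannot draw a single conclusion from what it is told, which is exactly the distinction Weizenbaum makes.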

It is in a similar vein that I am suspicious of any claims to "universal document interoperability" that are not firmly based on both sides of the interaction fully understanding the data that they are exchanging. For example, I have heard some claims that a generic extension mechanism in OOXML or ODF would allow a vendor to store away additional formatting hints that would ensure round-trip interoperability between editors. But this is merely a parlor trick, regurgitating data that is stored, without understanding it. It might work for the most trivial demos, but falls apart quickly when the document is manipulated, combined with others, split, converted into other formats, edited with other editors, sections cut & pasted, etc. Real interoperability is more complicated than the trivial round-trip demos, just as real conversations are more complicated than ELIZA sessions.
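The failure mode can be sketched in a few lines of Python. This is a toy model, not how OOXML or ODF extensions actually work: the document structure, the `vendorA:bold` hint, and the two "editors" are all hypothetical. Editor A stores a formatting hint keyed by paragraph position; editor B preserves the hint faithfully without understanding it.

```python
import copy


def editor_a_save(paragraphs, bold_indexes):
    """Editor A writes its formatting as an opaque hint keyed by paragraph position."""
    return {"paragraphs": list(paragraphs), "ext": {"vendorA:bold": sorted(bold_indexes)}}


def editor_b_roundtrip(doc):
    """Editor B does not understand 'ext'; it simply copies it through unchanged."""
    return copy.deepcopy(doc)


def editor_b_insert(doc, index, text):
    """Editor B inserts a paragraph but cannot update hints it does not understand."""
    edited = copy.deepcopy(doc)
    edited["paragraphs"].insert(index, text)
    return edited


def editor_a_load(doc):
    """Editor A re-applies its hints by position and returns the paragraphs it renders bold."""
    return [doc["paragraphs"][i] for i in doc["ext"]["vendorA:bold"]]


original = editor_a_save(["Title", "Body text", "Conclusion"], {0})
untouched = editor_b_roundtrip(original)
edited = editor_b_insert(original, 0, "Draft - do not circulate")

if __name__ == "__main__":
    print(editor_a_load(untouched))  # the trivial round-trip demo
    print(editor_a_load(edited))     # the same hint after a one-line edit
```

The untouched round trip comes back perfect, which is the demo. After a single inserted paragraph the preserved hint still says "paragraph 0", so editor A now bolds the wrong text. The data survived; its meaning did not.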

So, if anyone shows you interoperability, ask yourself whether both sides of the interaction actually fully understand the data that is being exchanged. If not, this is not really full interoperability. It is just an illusion.


Sunday, October 07, 2007

Cracks in the Foundation

You must admire the tenacity of Gary Edwards and the pseudonymous "Marbux". The mythology of Silicon Valley is filled with stories of two guys and a garage founding great enterprises. And here we have two guys who, through blogs, interviews, and constant attendance at conferences, have become some of the most-heard voices on ODF. Maybe it is partly due to the power of the name? The "OpenDocument Foundation" sounds so official. Although it has no official role in the ODF standard, this name opens doors. The ODF Alliance, the ODF Fellowship, the OASIS ODF TC, the ODF Adoption TC (and many other groups without "ODF" in their name) have done far more to promote and improve ODF, yet the OpenDocument Foundation, Inc. seems to score the panel invites. Not bad for two guys without a garage.

However, in recent months the OpenDocument Foundation has found itself more and more isolated, outside of the mainstream debate. How far they have fallen can be seen in the fact that Microsoft has gone from ridiculing the Foundation's conspiracy theories to citing them in support of its own arguments. At the same time the Foundation's membership has dwindled to the point where only a small number remain. Former members have disassociated themselves from the Foundation as it turned increasingly to strident rhetoric. Whereas in the early days the Foundation had a large membership that participated fully in the OASIS TCs, now its "contributions" consist mainly of heckling and haranguing the other members. Finally, the Foundation has recently announced its intent to abandon constructive work within OASIS, to actively lobby against adoption of ODF 1.2 in ISO and to push for an alternative format, CDF, based on XHTML, CSS 3.0 and RDF. This is an odd stance for a non-profit whose charter was:

The OpenDocument Foundation, Inc. is a 501c(3) non profit chartered to work in the public interest to support, promote and develop the OASIS OpenDocument File Format affectionately known as "ODf".

So it is against this backdrop that I read with interest in Linux Today the latest correspondence from the Foundation. You can read it yourself, or take the following 8 points from me as a condensed summary of their main points:
  1. "The commercialization of interoperability remains a key driver in both big vendor deals and big vendor consortia FOSS is left on the outside looking in."

  2. "The conversion to XML [document formats] must be nondisruptive," meaning it fits into existing business processes, which are increasingly dominated by Microsoft middleware. This implies a requirement for high-fidelity, lossless round-trip conversions.

  3. The alternative is "rip and replace" and that is too costly and disruptive.

  4. Microsoft is moving toward a "grand convergence" of their services, desktop, device and servers, with OOXML at the core. "MS-OOXML is the primary transport, the document/data container of interop-integration preference."

  5. ODF was not designed as a response to these problems.

  6. Microsoft/Sun/Novell are working "to limit ODF interoperability and usefulness" because of some patent deals. (Sorry I can't summarize this one better -- I just don't understand it.)

  7. IBM/Oracle/Google are working to "limit ODF interop" because "they want a total ripout and replace of MS Office."

  8. The OpenDocument Foundation is in "the middle area of trying to perfect the conversion to XML".


Let me take these points one-by-one:


  1. The OpenDocument Foundation seems to try to clothe themselves in the mantle of the open source community and pontificate on how the big bad vendors treat interoperability. But are they speaking as a non-profit or as a vendor? Take their DaVinci plugin, for example. Where is the source code? Why isn't this open source? Are we to follow the Foundation's claim of 100% interoperability, based on blind faith, without seeing some proof in the form of working code? I've been working on document conversions and document file formats of one kind or another for almost 20 years. I've never seen 100% fidelity conversions of anything but trivial formats. Extraordinary claims require extraordinary evidence. But we have nothing here, just white papers.

  2. I would not claim a priori that all customers require lossless, 100% fidelity conversions. Remember, we do not see 100% fidelity even when upgrading from Office 2003 to Office 2007, but this appears to be adequate. What is required is that the total return from changing document formats exceeds any other profitable use of capital available to the enterprise. In other words, to a business this is an investment, and will be judged as an investment. Very few businesses will take a dogmatic, ideologically pure view of this. Ask yourself, would you accept 1% loss in fidelity if I gave you a billion dollars? Yes, of course you would. There are no purists in business who will remain in business. We're just haggling over what price/fidelity combination is needed to make a good investment.

    However, there is a notable exception to this rule, and that is where access to open document formats is mandated as a public right, not as a business investment. Think of the last 20 years or so of enabling public buildings with ramps for the disabled, bathrooms to accommodate wheelchairs, braille lettering in elevators. This was done by legislation and regulation, as a matter of public policy, to ensure that all of the public has access to public facilities. There was no requirement that an access ramp post a net profit. Similarly, today we see that some moves to ODF are based on open access principles.

  3. This is what we call the "fallacy of the excluded middle." You are either with us, or against us, etc. It is false to suggest that the only two approaches to interoperability are to either blindly follow the OpenDocument Foundation's mysterious DaVinci plugin, or to ignore interoperability altogether and advocate rip and replace. There are today two other ODF plugins available, one from Microsoft and one from Sun. This is real, running code, open source even in the case of the first plugin. So why should we be taking exclusive direction from the Foundation on how we achieve interoperability? Oh right, they are claiming that their program achieves 100% round-trip fidelity. Extraordinary claims...

  4. Gary is in the ballpark when he suspects that Microsoft is seeking some sort of "grand convergence" around protocols and formats. However, I disagree with his impression that OOXML sits at the center of this. In my opinion, OOXML is a rushed, transitional format, intended purely to disrupt ODF adoption. Just as the Office 2000, Office XP, and Office 2003 markup formats were abandoned by Microsoft, I predict that OOXML will soon be cast aside. The problem is that OOXML is such a poorly-engineered format that not even Microsoft wants to build upon this. If I had to divine the future of Microsoft's file formats, I'd look more in the XAML/XPS/Silverlight space. I believe that future MS Office document formats will look more like that than like OOXML.

  5. I find this observation amusing. ODF, which started its standards track late in 2002, was not designed to be 100% compatible with Office 2007. Mercy me, how did we manage to drop the ball on this one?! Remember, in 2002 there was no publicly available specification for Microsoft document formats. There was no Open Specification Promise or Covenant Not to Sue. So not only was 100% compatibility technically impossible, attempting it via reverse engineering was precarious from a legal standpoint. In my opinion, it still is, even in 2007.

    In any case, I'm staunchly opposed to evolving any open standard purely for the benefit of a single vendor. Microsoft Internet Explorer is the dominant web browser. Should we then require that HTML only evolve in ways that improve interoperability with Internet Explorer? I don't think so. Why should document formats be different?

  6. This comment manages to avoid confronting a heap of contrary facts. Microsoft supports the open source ODF Translator project on SourceForge. Sun has made their own ODF Plugin 1.1 for MS Office available for download. And Novell, along with helping the Microsoft effort, has integrated that translator into their version of OpenOffice and has also started work on more powerful, next-generation support for OOXML. So these three companies are seeking to "limit ODF interoperability and usefulness"? If so, they sure have a clever way of disguising their intent. To the ordinary bystander, writing conversion and translation code to allow documents to be shared between OpenOffice and MS Office would be seen as a pro-interoperability statement. But thanks to the OpenDocument Foundation's in-depth sleuthing, we now know that the opposite is true. Not!

    Although I have serious doubts as to the long-term technical feasibility of some of these translation endeavors, they do have the advantage of showing real, running code working with real, running applications. They may not claim 100% fidelity, but this is first-generation work and will undoubtedly improve. And they have an important advantage over the Foundation's DaVinci Plugin in that these other efforts demonstrably exist. Given a choice, I'll take an open source version of a partial-fidelity converter, with a reasonable architecture, over one that claims 100% fidelity, but that I can't see or touch.

  7. The claim is that IBM/Google/Oracle also want to "limit ODF interop" because (according to Gary) we want rip & replace. Strange, but just a few weeks ago I led an ODF Interoperability Camp in Barcelona, on behalf of the OASIS ODF Adoption TC, where we had a good selection of ODF vendors, open source projects and customers working to improve interoperability, including Sun, Novell, Google and IBM. The OpenDocument Foundation is a member of the OASIS ODF Adoption TC. So did they help in the organizing of the event? Did they participate? No, nothing, nada. Evidently it is easier to complain about interoperability than to do something about it.

    And again there is this fallacy of the excluded middle. You must either accept the magical DaVinci Plugin, or you are for rip & replace. There are no other alternatives considered. I'd remind the OpenDocument Foundation that interoperability was not invented yesterday, and that there are many technical approaches that can be applied to foster it. Open standards are one way, but there are others that can be applied as well, including conformance testing, test suites, plug-fests, profiles, shared code, reference implementations, etc. We should apply experience and engineering judgment to select the appropriate solution for the problem, and not fall into the trap of believing that there is only a single path to interoperability, and that this path just happens to be based on the Foundation's product.

  8. Although it sure would be nice to portray yourself as the little guy, watching out for the customer, while the big bad vendors tromp all over the flowers, the fact is that the big vendors are actively working on interoperability, with at least three major solutions available today, as well as a major initiative around interoperability in the ODF Adoption TC. In particular, IBM (with SmartSuite) and Sun (with StarOffice) have 15 or so years of experience each in working on document interoperability with MS Office. This isn't rocket science, but neither is it easy. You can either stand on the sidelines and make pronouncements about how the world is out to prevent interoperability, or you can roll up your sleeves and help get the work done. I know which one I'll be doing. What about you?

If the Foundation's approach were technically feasible, they would just go out and do it. You don't let a breakthrough technical innovation wait on a standards committee to act. You just go out and do it and then standardize it later, once you've proven it works. If the Foundation really thinks that they can achieve 100% interoperability with MS Office with just 5 simple changes to ODF, then why the heck don't they just do it? Don't wait for the formality of the ODF TC's approval. They should go ahead, as if the standard already had their 5 fixes, and show the world how they have achieved 100% interoperability with MS Office. If they are right, they would all become multi-millionaires in a very short period of time.


Thursday, August 02, 2007

An Invitation: ODF Interoperability Workshop

The OASIS ODF Adoption TC is organizing an ODF Camp to be held on September 20th in Barcelona, Spain. Facilities for this event are graciously provided by OpenOffice.org, which will be holding its annual conference concurrently.

The hope is that this will be the first of several such events to bring ODF vendors together to explore ways of greater technical coordination, especially in the area of interoperability. I've written about and presented on this topic before. Now is the time for action, and I'm extremely pleased that so many vendors will be attending.

On other occasions I've called interoperability "the price of success" because a standard implemented by only a single vendor and a single application need not worry about it. Only successful standards with many implementations need to rent a hall to bring the implementors together to review and perfect interoperability.

(It is like capital gains taxes. I grumble when I pay them, but take some solace in the fact that my investments were profitable. Those who make a losing investment don't pay capital gains taxes on it.)

The focus of this first interoperability event will be on the ODF word processor format. Follow-up events will look at spreadsheets and presentations.

Please have a look at the detailed agenda for the camp and consider joining us in Barcelona.

