Protocols, Formats and the Limits of Disclosure

A few words today on an important distinction that deserves greater appreciation, since it lies at the heart of several current interoperability debates. What I relate here will be well-known to any engineer, though I think almost anyone can understand the gist of this.

First, let’s review the basics.

Formats define how information is encoded. For example, HTML is the standard format for describing web pages.

Protocols define how encoded information is transmitted from one endpoint to another. For example, HTTP is the standard protocol for downloading web pages from web servers to web browsers.

There are other such format/protocol pairs, such as MIME and SMTP for emails. When we talk about “web standards” we talk about formats (often described by W3C Recommendations) and protocols (often described in IETF RFCs).

An instance of data that conforms to a given format standard might go by any number of names: a web page, a document, an image, a video, etc., according to the underlying standard. An instance of a format is data: bits and bytes that you can save to your hard drive, burn to a CD, email, etc. Data in a format is persistent and has a static representation.

But what is an instance of a protocol? It is a transaction. It is ephemeral. You can’t easily save an instance of HTTP or SMTP on your hard drive, or email it to someone else. A protocol is a complex dance, a set of queries and responses, often with a negotiation of capabilities that prefaces the data transmission.

There is a key distinction between formats and protocols when it comes to interoperability. The key is that a protocol typically involves the negotiation of communication details between two identifiable parties, each of whom can state their capabilities and preferences, as well as conform to the capabilities of the datalink itself. Software running on each endpoint of the transaction can adapt as part of this negotiation.

You may be familiar with this from the modem days, where this “handshaking” procedure was audibly manifest to you whenever you connected to a remote host. But although you don’t hear or see it, this negotiation still occurs with protocols today, behind the scenes.

For example, when you request a web page, your client negotiates all sorts of parameters with the web server, from packet size and timings (at the TCP/IP level) to authentication, language, character set and cache preferences (at the HTTP level). This negotiation of capabilities is essential for handling the diversity of different web servers and web clients in existence today.
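To make the HTTP-level part of this concrete, here is a minimal sketch in Python of the server's side of language negotiation. It is a simplification and not how any particular web server does it: the function names are my own, and it ignores wildcards and most of the matching rules in the HTTP specification. The only point is that logic runs at the endpoint and adapts to what the other party says it prefers.

```python
def parse_accept_language(header):
    """Parse an Accept-Language value into (tag, quality) pairs, best first."""
    prefs = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        quality = 1.0  # a tag with no q parameter counts as fully preferred
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed weight as "not acceptable"
        prefs.append((tag.strip().lower(), quality))
    # sorted() is stable, so equal weights keep the order the client sent
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


def negotiate_language(header, available, default="en"):
    """Pick the client's most preferred language that the server can serve."""
    offered = {lang.lower() for lang in available}
    for tag, quality in parse_accept_language(header):
        if quality <= 0:
            continue
        if tag in offered:
            return tag
        primary = tag.split("-")[0]  # fall back from "fr-ch" to "fr"
        if primary in offered:
            return primary
    return default


# The client states its preferences; the server adapts to them.
chosen = negotiate_language("fr-CH, fr;q=0.9, en;q=0.8, de;q=0.7", ["en", "de"])
print(chosen)
```

Here the client would rather have French, but the hypothetical server only has English and German, so the two settle on English. Neither side needed to know in advance what the other was capable of.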

With a protocol, you have two technical endpoints communicating and negotiating the parameters of the data exchange. In other words, you have software on both ends of the communication able to execute logic to adapt to the needs of the other endpoint and the capabilities of the underlying datalink.

However, when it comes to formats, things are different.

Let’s use a word processor document as an example of a format instance. I author a document, and then I send it out: via email, as an attachment on my blog, burned on a conference CD-ROM, posted to a document server or whatever. I have no idea who the party on the receiving end will be, nor what software they will be using. They could be running Microsoft Office, but they could also be using OpenOffice, Google Docs, Lotus Symphony, WordPerfect, AbiWord, KOffice, etc. I, as the document author, have no ability to target my document to the quirks of the receiving party, since their identity and capabilities are unknown and in general unknowable.

Since a document is not executable logic, it cannot adapt to the quirks of various endpoints. A document is static. When it comes time to interpret the document, you don’t see two vendor endpoints adapting and negotiating. You see only one piece of software, the receiving party’s application, and it needs to interpret a static data instance in a given format.

In other words, with document formats, there is no dynamic negotiation, because at the time when you write a document out, you have no idea what the reading application will be. And although the application that reads the document may know the identity of the writing application (via metadata stored in the document, for example), it has no ability to negotiate with the writing application, since that application is not present when the document is being loaded.
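A small sketch in Python shows how one-sided this is. It assumes an ODF-style metadata fragment held in memory, and the generator string ExampleWriter/1.0 is made up for illustration. The reader can find out who wrote the document, but that is where the conversation ends.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by ODF metadata.
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "meta": "urn:oasis:names:tc:opendocument:xmlns:meta:1.0",
}

# A static metadata fragment. "ExampleWriter/1.0" is a hypothetical writer.
SAMPLE_META = """<office:document-meta
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:meta="urn:oasis:names:tc:opendocument:xmlns:meta:1.0">
  <office:meta>
    <meta:generator>ExampleWriter/1.0</meta:generator>
  </office:meta>
</office:document-meta>
"""


def read_generator(meta_xml):
    """Return the name of the writing application, or None if not recorded."""
    root = ET.fromstring(meta_xml)
    node = root.find("office:meta/meta:generator", NS)
    return node.text if node is not None else None


generator = read_generator(SAMPLE_META)
print(generator)
# The reader now knows who wrote the bytes, but there is no function it can
# call to ask that writer for a different encoding. The bytes are all it gets.
```

Compare this with the protocol case: there is no second party to send a request to, only a static string to interpret.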

OK. Simple enough. However, a confused understanding of this distinction will lead you to muddled reasoning about interoperability and how it is achieved.

Although it is not ideal, having Microsoft disclose the details of exactly how they implement various proprietary protocols, and even their quirky implementations of standard protocols, may enable 3rd parties to code to these details. If the disclosure is timely, complete and accurate, this information may be useful. I think of the SAMBA work, for example.

However, no amount of disclosure from Microsoft on how they interpret the ODF standard will help. We see that today with Office 2007 SP2, which strips out ODF spreadsheet formulas. Having official documentation of this fact from Microsoft, in the form of “Implementation Notes”, does not help interoperability. Why? Because when I create an ODF document, I do not know who the reader will be. It may be a Microsoft Office user. But maybe it won’t. It very well could be read by many different users, using many different programs. I cannot adapt my document to the quirks of all the various ODF implementations.

When you deal with formats, interoperability is achieved by converging on a common interpretation of the format. Having well-documented but divergent interpretations does not improve interoperability. Disclosure of quirks is insufficient. Disclosure presumes a document exchange universe where the writing application knows that the reader will be Microsoft Office and only Microsoft Office, and therefore the writer can adapt to Microsoft’s quirks. That is a monopolist’s logic. Interoperability with competition only comes when all implementors converge in their interpretation of the format. When that happens we don’t need disclosures. We just follow the standard.
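The argument can be put in a few lines of Python. This is a toy model under assumptions of my own: the reader names, the dialect names and the rule that a reader drops any formula it does not recognize (keeping only the cached value) are all hypothetical, and do not describe any real product's behavior in detail.

```python
# Each hypothetical reader accepts only the formula dialects listed for it.
DIVERGENT = {"reader_a": {"dialect-a"}, "reader_b": {"dialect-b"}}
CONVERGED = {"reader_a": {"standard"}, "reader_b": {"standard"}}


def load_cell(accepted_dialects, cell):
    """Model a reader loading a static cell: keep the formula or strip it."""
    if cell["dialect"] in accepted_dialects:
        return {"formula": cell["formula"], "value": cell["value"]}
    # Unrecognized dialect: the formula is lost, only the cached value stays.
    return {"formula": None, "value": cell["value"]}


def readers_keeping_formula(readers, cell):
    """List the readers for which this one static cell survives intact."""
    return sorted(
        name
        for name, accepted in readers.items()
        if load_cell(accepted, cell)["formula"] is not None
    )


cell = {"dialect": "dialect-a", "formula": "SUM(A1:A3)", "value": 6}

# The writer must commit to one dialect before knowing who will read the file.
for dialect in ("dialect-a", "dialect-b"):
    written = dict(cell, dialect=dialect)
    print(dialect, readers_keeping_formula(DIVERGENT, written))

# With a shared interpretation, one static file works everywhere.
print("standard", readers_keeping_formula(CONVERGED, dict(cell, dialect="standard")))
```

However thoroughly the two divergent readers document their dialects, the writer still has to pick one before the file leaves its hands, and one reader loses the formula either way. Only the converged case lets a single static document work for both.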

4 comments
  • David Gerard 2009/10/13, 08:40

    "Since a document is not executable logic,"

    … unless it's got a festering sea of VBA inside it. Excel spreadsheets are the main culprit – many are just a place to put complicated BASIC programs – but Word is far from immune.

    Well done, Microsoft – inventing text documents that can carry viruses!

  • Rob 2009/10/13, 09:28

    @David, I don't think you can achieve interop with logic in the document. Aside from how ugly and fragile that approach is from an engineering perspective — and we have a lot of experience with how broken that approach is from getting conditional JavaScript to work around browser quirks — it is not future-proof. In other words, you cannot use that approach to create a document today that works with an editor that is written tomorrow. This essentially breaks the archivability of documents (if that is a word).

  • Chris AUld 2009/10/13, 10:42

    Not sure where, exactly, 'Spreadsheet Formulas' are defined in the ODF spec, but that's another topic altogether.

    If you are going to be at the SharePoint Conference in Las Vegas next week then I'd be happy to have you attend my session on Markup Compatibility and Extensibility (ISO/IEC 29500-3) where I'll talk about some of the challenges you identify above and some approaches to handling forward and backward compatibility as well as round tripping of content through client implementations.

  • Rob 2009/10/13, 11:22

    @Chris,

    This post is not primarily about spreadsheet formulas. Remember, Microsoft has produced hundreds of pages of "implementation notes" regarding their support of ODF and OOXML in MS Office. It is all useless for interoperability, IMHO, for the reasons given in this post.

    But to satisfy your rhetorical question, table:formula is defined in ODF in section 8.1.3. This attribute is undoubtedly part of the ODF Standard. The behavior of Excel 2007 SP2 is to strip out all table:formula values from other implementations, unless they are written to Microsoft's specifications. This is taking a protocol approach to a data format, and demanding that documents magically adapt to MS Office and MS Office alone. Documenting this behavior is useless from an interoperability standpoint.

    Thanks for the offer on your session in Las Vegas. But I am familiar with MCE and it is not a solution to the above problem. MCE may be fine for specifying degradation of behavior to old versions of an application, such as when Office 2010 writes out a document with features new to that version, but also gives an alternative fallback representation for Office 2007. So it has some use for evolving schemas.

    However, from an interoperability perspective, MCE doesn't cut it. MCE is really just hand waving and pixie dust. It just pushes the problem down one layer and leaves a more complicated interoperability problem to solve. Instead of just having to get implementations to support feature A, you need to rely on implementations to support MCE itself, plus either feature A or its fallback B, and to do all of that in an interoperable fashion. I think in most cases it is far easier to simply implement A in an interoperable fashion. And at best, if implementations do support MCE and A or B in an interoperable way, the implementations that support B but not A will receive a degraded experience, not the A that the original document author specified. So in my mind MCE is an overly-complicated way of getting poor results.