
An Antic Disposition


Standards

The Value of Choice

2007/06/21 By Rob 8 Comments

Here in Westford, Massachusetts, some of our public schools have boilers that can be powered by natural gas or heating oil. This way the schools have their choice of fuel, which they can alter year to year, or even month to month, according to the comparative prices of these two commodities. Such a choice has a very tangible value at any given point. For example, suppose that today the price of natural gas was $1.15/therm (100,000 BTUs) and the price of heating oil was $1.72/therm. The value of choice is ($1.72 - $1.15) * the number of therms purchased. Those clever with finance could probably estimate the long-term value by pricing the analogous commodities futures.
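That arithmetic is simple enough to sketch in a few lines of Python. The prices are the hypothetical figures from the example, and the number of therms is an assumed seasonal purchase for illustration:

```python
# Value of fuel choice: you pay the cheaper of the two prices instead of
# being locked in to one fuel.
gas_price = 1.15   # $/therm, natural gas (hypothetical)
oil_price = 1.72   # $/therm, heating oil (hypothetical)
therms = 10_000    # assumed seasonal purchase for one school

# The dual-fuel boiler lets you buy at min(gas, oil); the value of the
# choice is the spread times the volume purchased.
value_of_choice = (max(gas_price, oil_price) - min(gas_price, oil_price)) * therms
print(f"Value of choice this season: ${value_of_choice:,.2f}")
```

At these prices the choice is worth $0.57 per therm, or $5,700 on a 10,000-therm season, against which the extra cost of the dual-fuel boiler must be weighed.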

Of course, choice here has a cost as well, namely the increased cost to purchase and maintain the more complex boiler that offers the choice of gas or oil. If the value of having the choice is worth more than the cost of maintaining a world that offers that choice, then you have a net gain by preserving the choice. Otherwise, you are losing by having choice. It is odd to hear that, isn’t it? You can lose by having choice, if the cost of maintaining that choice is greater than the benefit from having a choice.

For example, take shoe sizes. In the U.S. we buy shoes in 1/2 size increments. In theory a store could offer shoes in 1/10 size increments. This would give you, the consumer, an increase in choice, and this choice would have a distinct value to you. Whereas with the previous sizes your foot could be as much as 1/4 of a size from a perfect fit (1/8 of a size on average), the new shoes would be at most 1/20th of a size off (1/40 on average). So there is a tangible benefit to you, the consumer. But this comes at a cost, since the larger inventory and slower turnover would increase the retailer's costs. Since we are unlikely to buy more shoes than we do today, this cost increase would be passed on to the consumer. So in this case, the benefit of better fitting shoes is not seen to be worth the increased cost of maintaining those choices, so the industry stays with 1/2 size increments.
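The expected misfit can be checked with a quick simulation (the foot-size range here is arbitrary, chosen only for illustration): with sizes stocked every 1/2, a random foot is on average 1/8 of a size from the nearest stocked size, and with 1/10 increments only 1/40:

```python
import random

def average_misfit(increment, trials=200_000, seed=1):
    """Average distance from a uniformly random foot size to the
    nearest stocked size, for a given stocking increment."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        foot = rng.uniform(6.0, 13.0)             # arbitrary size range
        nearest = round(foot / increment) * increment
        total += abs(foot - nearest)
    return total / trials

print(f"1/2-size increments:  avg misfit {average_misfit(0.5):.3f} sizes")
print(f"1/10-size increments: avg misfit {average_misfit(0.1):.3f} sizes")
```

In general, with sizes spaced d apart, a uniformly random foot lands an average of d/4 from the nearest stocked size (and at most d/2).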

As an aside, I’ll give you another example, as a brainteaser. You are walking down a street evenly lined with many stores, all of which sell some commodity, let’s say orange juice. The prices at the various stores are random. You want to buy orange juice at the best price, but you can only make one purchase, and you can only make one pass down the street. So you can look at many prices, but at some point you need to make a decision and purchase the orange juice, and you can’t turn back or make a second choice once you’ve made a purchase. The street is 1 kilometer long. Where do you buy your orange juice? Even with an abundance of choice, it isn’t always clear how you make an optimal decision. Note that many life decisions are like this, since time acts as a one-way street, where often we must make an important choice, based on the info we have so far, but with uncertain knowledge of the future, and often we can only choose once.
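This brainteaser is essentially the classic "secretary problem" from optimal stopping theory. The well-known strategy is to walk past roughly the first 1/e (about 37%) of the stores, remembering the best price seen, then buy at the first store that beats it. A Monte Carlo sketch, with the store count and price distribution made up for illustration:

```python
import random

def one_pass_purchase(prices, skip_fraction=0.37):
    """Skip the first fraction of stores to set a benchmark, then buy at
    the first store whose price beats everything seen so far. If none
    does, we are forced to buy at the last store on the street."""
    k = int(len(prices) * skip_fraction)
    benchmark = min(prices[:k]) if k else float("inf")
    for p in prices[k:]:
        if p < benchmark:
            return p
    return prices[-1]

random.seed(42)
n_stores, trials, hits = 100, 10_000, 0
for _ in range(trials):
    prices = [random.random() for _ in range(n_stores)]
    if one_pass_purchase(prices) == min(prices):
        hits += 1
print(f"Found the single best price in {hits / trials:.0%} of trials")
```

With this rule you land the single best price on the street about 37% of the time, which is the best any one-pass strategy can guarantee; no fixed rule does better, which is why even an abundance of choice leaves the decision genuinely hard.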

So what does this mean for document formats? It is popular these days to use the word “choice” as a “god term”, a phrase introduced by Richard Weaver in The Ethics of Rhetoric, referring to words like “progress”, “culture” and “for the Fatherland” that are used to appeal more by seduction than by rational argument. But we should avoid the seduction and ask ourselves what this choice really means. What is it really worth to you and your business? Sitting down today, writing a document, or creating a spreadsheet, what is the value to you, knowing that you could save a document to ODF, OOXML, UOF, SmartSuite, WordPerfect Suite format, etc.? And what is the tangible value of having that choice, that option?

What I want in a document format is:

  1. It is supported by my word processor.
  2. When I save the document and later retrieve it, the document looks and behaves the same.
  3. When I give it to someone else, who may be using the same or a different word processor, on the same or a different operating system, it looks and behaves the same.
  4. It is easily processable by other software tools. I care about this directly because I am a programmer. But even if I were not, I would want this characteristic, since this is what ensures that an ecosystem of other tools will emerge to support the format, offering me more choice.
  5. I want the format to be open for the same reason, so it encourages the creation of other tools that I may later choose to use.
  6. I want the format to be controlled by a group of vendors and other interests, not dominated by a single player. Further, I'd want them to be working openly and transparently, so the public can see what they are doing. We should all remember the line by Adam Smith, “People of the same trade seldom meet together, even for merriment and diversion, but the conversation ends in a conspiracy against the public.” The remedy is given by Justice Louis Brandeis in his line, “Sunlight is the best disinfectant.”
  7. I want the format to be well-designed according to industry best practices, since I know that will make it easier to work with for tools vendors and will help ensure its longevity as a format.

Given a single format that can accomplish these goals, I see zero value in having a second standard. In fact, having multiple formats brings increased complexity and expense to the software vendor who maintains and supports all the translator code and this expense gets passed on to the consumer. And then there is the opportunity cost of the features that may have been coded if my vendor hadn’t been distracted by writing translator code. Also, there is the cost, in performance and fidelity loss when translating between formats, and the resulting business losses that may be caused by errors introduced in this processing. This is all very real. But where is the benefit?

To solve this puzzle, we need to look at it from Microsoft’s perspective. A standard in this space is a very scary proposition for them. A comparison can be made to the early years of the automobile industry:

Between 1904 and 1908, more than 240 companies entered the fledgling automotive business. In 1910 there was a mini-recession, and many of these entrants went out of business. Parts suppliers realized that it would be much less risky to produce parts that they could sell to more than one manufacturer. Simultaneously, the smaller automobile manufacturers realized that they could enjoy some of the cost savings from economies of scale and competition if they also used standardized parts that were provided by a number of suppliers.

Guess which two players were not interested in parts standardization? The two largest companies in the industry: Ford Motor Company and General Motors. Why? Because they were well able to achieve strong economies of scale in their own operations, and had no interest in “interconnecting” with anyone else: standardization would (partially) level the playing field regarding economies of scale at the component level. As usual, then and now, standardization benefits entrants, complementors, and consumers, but may hold little interest for dominant incumbents. — Carl Shapiro and Hal R. Varian, Intro for Managing in a Modular Age

We’re in a very similar situation now. Microsoft, the sole dominant player in this market, is perfectly happy with having total control over their proprietary formats. It has worked very well for them for many years. But just as Ford and GM eventually gave in to the obvious necessity of true interoperability, Microsoft will as well. The companies that win in this world are the ones that adapt, not the ones that sell adapters.

We need to start talking about what we can do to ensure that we have a single open document format that can be used by everyone. Making a second ISO standard for document formats is a bad idea. What we need to do is continue to evolve ODF, continue the work to harmonize UOF and ODF, and also take on the task of harmonizing OOXML and ODF. The value of having a single standard in this space is clear. We just need to remain vigilant in the face of those commercial interests that would stand to lose the most if customers had true document portability and could choose platforms and applications based on features and price and support, and not solely on fears, uncertainty and doubt about whether they could still access their legacy documents.

Filed Under: Economics, OOXML, Standards

Hemidemisemiquavers

2007/06/11 By Rob 9 Comments

Some “short notes” to share with you:

From a GrokLaw news pick we hear that ZDNet’s David Berlind recently interviewed Tim Berners-Lee in Boston, where Sir Tim received the Massachusetts Innovation and Technology Exchange’s Lifetime Achievement Award. Watch the whole interview if you have 12 minutes, though I will transcribe one passage which highlights the importance of agreeing on a single open standard for a problem domain and fostering competition among the applications built upon that standard:

It was the standardization around HTML that allowed the web to take off. It was not only the fact that it is standard, but the fact that it is open and the fact that it is royalty-free.

So what we saw on top of the web was a huge diversity of different businesses which are built on top of the web, given that it is an open platform.

If HTML had not been free, if it had been a proprietary technology, then there would have been the business of actually selling HTML and the competing JTML, LTML, MTML products. Because we wouldn't have had the open platform, we would have had competition between these various different browser platforms, but we wouldn't have had the web. We wouldn't have had everything growing on top of it.

So I think it is very important that as we move on to new spaces … we must keep the same openness that we had before. We must keep an open internet platform, and keep the standards for the presentation languages common and royalty-free. So that means, yes, we need standards, because the money, the excitement is not in competing over the technology at that level. The excitement is in the businesses and the applications that you build on top of the web platform.

Well said. I tried to make a similar point, but with pictures, back in February.

I recently ordered some podcasting equipment. It should arrive tomorrow. I will be looking for people to interview soon. So hide while you can, don’t answer the phone, and if it looks like I’m carrying a microphone, then run for the exit.

An interesting article in the American Surveyor, by Joel Leininger, on the importance of file format standards. Although it is a different application domain, the concerns are very similar (via OpenMalaysia).

Anyone know Romanian? Something gives me the impression that this guy from Microsoft Romania is not complimenting me. I wonder what subtle hint gives me that impression…

The OOXML ballot marches on in national standards committees around the world. September 2nd is the deadline, though many committees have earlier deadlines for developing their recommendations. In the US the committee looking at OOXML is called INCITS V1, and we have until July 13th. V1 has had a few meetings so far and we’re just starting to get into the technical comments. Since we have a consensus process, all it takes is a small minority of members to bring everything to a halt, which is pretty much what is happening. For example, we spent 2 1/2 hours today and discussed only two comments. So we risk having a perfunctory technical review of OOXML. When I compare this to the BSI’s excellent work developing detailed comments on a publicly-readable wiki, I think we in the US should be ashamed at the stonewalling going on in V1.

I’ll be hosting a V1 face-to-face meeting in a couple weeks in Washington, DC. Hopefully we’ll make some more substantial progress there. If you really want to follow our work closely, you can read through our mailing list archives which Sun’s Jon Bosak was kind enough to set up for us.

Although no formal call for public comments has gone out, we’ve received a number of unsolicited pro-OOXML letters which you can read here. As you can see, they are pretty much identical form letters, all ending with the artless phrase, “Furthermore, Open XML in no way contradicts any other international document standard.” Remind anyone of the Manchurian Candidate’s, “Raymond Shaw is the kindest, bravest, warmest, most wonderful human being I’ve ever known in my life”?

In any case, if you want to provide input into this process, feel free to send in your thoughts as well. Having read many of these letters myself, I’d offer the following advice:

  1. Don’t send in a form letter. It hurts your cause more than helps it, since it makes it look like you couldn’t get real support if you tried.
  2. Use your real name and email address and postal address, so we know you are a real person and not a robot.
  3. Be polite. Remember you are trying to persuade.
  4. Give a succinct, reasoned opinion. Keep it to a page if you can.
  5. Ask for a specific action. Don’t expect the reader to draw a conclusion. Draw it yourself.

Of course, since V1 is developing the US position on OOXML, comments from US companies and citizens are especially welcome. Also, if you have specific technical comments about OOXML, you can submit them through me and, if I agree with your points, I will raise them directly with the committee. (I do this as a personal favor to you, my readers, not as an official INCITS V1 solicitation.) Assume the committee is already familiar with the GrokLaw items. But OOXML is a big standard, and there are certainly dark corners where I have not ventured. So if you’ve found something new, certainly let me know.

Canada continues to solicit comments on OOXML. And the UK is soliciting comments as well, through June 30th. Again, be succinct, and give your name and address. Otherwise you risk having a committee member reject your comment outright since it cannot be ascertained whether you are actually a resident of that country.

A blog I’d like to recommend to my readers is Lodahl’s blog. Leif Lodahl has been giving some great coverage of ODF happenings in Denmark, including analysis of the parliamentary debate on the question of whether Denmark should have one or two standards. Also a good catch of Microsoft dancing all over the place, trying to avoid giving a straight answer on why Word does not provide integrated ODF capabilities. If you can spare 45 minutes this is a great clip to listen to.

Filed Under: ODF, OOXML, Standards

Documents for the Long Term

2007/06/05 By Rob 6 Comments

We all will die. Institutions come and go. Empires and nations crumble. But what is written down may have transcendent longevity. Whether it is a personal letter from a departed friend, the minutia of administration or the recorded contemporary reports of great historical events, the durable written word has almost mythic status in our culture.

The permanence of the written word has fascinated mankind for millennia. The powerful knew the truth of this. To be sure that his deeds would outlive his contemporaries, the Emperor Augustus had his CV engraved in bronze in his “Res Gestae Divi Augusti” (Deeds accomplished of the Divine Augustus). The bronze did not survive, but the words have. Horace wrote in his Ode, “Exegi monumentum aere perennius” (I have erected a monument more lasting than brass). And his words have survived. Shakespeare in Sonnet #55 echoed this sentiment, “Not marble, nor the gilded monuments/ Of princes shall outlive this powerful rhyme”. Shelley in his Ozymandias shows the irony of the surviving boastful inscription, “Look on my Works ye Mighty, and despair!” beside the “colossal wreck” of an ancient monument.

The saying is “ars longa, vita brevis” — art is long, but life is short. But this is not entirely accurate. The performing arts such as dance or music have a very sketchy and imperfect history until the rather recent invention of written notations. So dance before around 1450 is a matter of speculation. No doubt the ancient Bacchae accompanied their ecstatic revels with an equally furious dance. But we know none of it. Thucydides has the Lacedaemonians march into battle to the accompaniment of flutes. What martial notes they played we do not know. We can only speculate, with Thomas Browne, “What song the Syrens sang”. Some, like Benjamin Bagby, may give a glimpse of earlier performance practice. And scholars like Milman Parry find echoes of ancient practices in traditional storytelling. But we cannot know for certain.

The structural arts of architecture, city design, aqueducts, monuments, and engravings have all fared better over time. Even scattered texts from antiquity have survived. Text can have longevity, but not unassisted. Left to the ravages of water, fire, insects and fungi, papyrus, vellum and paper will survive only a few hundred years. For a text to survive longer, someone must copy it. So the works of Cicero we have in rather good shape today, in part because Augustine of Hippo praised them. (Then as now, getting a good review from a recognized figure is the best marketing.)

Which ancient texts were copied, and thus became part of the canon of western literature, was somewhat a matter of chance. Nine of the surviving plays of Euripides, existing in a single partial manuscript, are curiously in alphabetical order, containing only plays beginning with the Greek letters eta through kappa, leading scholars to believe that this is merely volume 2 of a larger collection of plays that are lost. Euripides is believed to have written almost 100 plays. We have almost 20 of them today.

With digital documents, the issues are a little different. The transmission of digital data can be done without error. But digital media, the tapes, floppies and optical disks, these are susceptible to the ravages of time, light, heat, fungi and the gradual deterioration of the substrate. So, digital documents must be copied from one storage format to another every few years. And so modern digital data relies on the same haphazard selection mechanism as we see with ancient texts — survival depends on someone deciding that a document is worthy of copying and preserving.

That said, the survival of a document does not depend entirely on the whims of monks or archivists. There are certain engineering principles which are key to creating a document that lends itself to long term retention. Some of these are tasks for the individual authors:

  1. Keep the document intact. Better to preserve a document inclusive of its annexes and appendices.
  2. Separate content, structure, layout and presentation.
  3. Make it findable. A good title, an abstract, keywords and other metadata will help ensure that your document can be found and retrieved via current and future search technologies.
  4. Use a fully-specified, open document format.

From another angle we can look at archiving from a systems view and follow a basic architectural principle. The key to durability, whether in documents, monuments, institutions, or whatever, all boils down to this: Do not depend on something less stable than yourself.

(I didn’t invent that principle, but don’t recall where I first heard it. Any idea who it was?)

If you depend on something less stable, which is to say more susceptible to change, than yourself, then when it changes, it forces you to change. Stability is when you change only when you want to change.

For example, a house is built on a foundation. A frame, plumbing and electrical, walls, wallpaper and furniture are layered on top. If replacing the wallpaper triggered a need for a new foundation, then we would say that the house was inherently unstable. But it is reasonable to expect that installing new plumbing will require opening a hole in a wall and later applying wallpaper. The expected rates of change of these various layers have led to a method of construction that enforces this dependency chain. If for some reason we needed to make very frequent changes to the plumbing, then we would place the pipes outside the interior walls, or behind removable wall panels for easy access.

We carefully manage dependency chains when programming as well. For example, imagine a module A (a database client) that depends on a module B (a database server) where you believe that module B is less stable (has a greater rate of change) than A. This is a problem, since changes to B trigger changes to A. So we define a new interface layer C (maybe SQL) that is more stable than A or B. By having A depend on C rather than B directly, we transform the unstable dependency A->B, into the stable relationship (A,B)->C, where C is a standard.
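A minimal sketch of that inversion in Python, with the module and class names invented for illustration: the client (A) codes only against the stable interface (C), never against the server's concrete type (B):

```python
from abc import ABC, abstractmethod

# C: the stable interface layer (think SQL). Both A and B depend on this,
# so B's internals can churn without forcing changes on A.
class QueryInterface(ABC):
    @abstractmethod
    def execute(self, query: str) -> list:
        ...

# B: the unstable database server. As long as it honors the interface,
# its internals are free to change release to release.
class DatabaseServer(QueryInterface):
    def execute(self, query: str) -> list:
        return [f"result of {query!r}"]

# A: the client depends only on C, never on B's concrete type.
class DatabaseClient:
    def __init__(self, backend: QueryInterface):
        self.backend = backend

    def fetch_users(self) -> list:
        return self.backend.execute("SELECT * FROM users")

client = DatabaseClient(DatabaseServer())
print(client.fetch_users())
```

Swapping in a different server implementation never touches the client; the only thing both sides must track is the interface, which, being a standard, changes far more slowly than either of them.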

This same principle applies to document formats as well. Never depend on something less stable than yourself. For the first few decades of document formats, the era of binary formats in the 1980’s and early 1990’s, we did this all wrong, as the following diagram shows:

In those days the file format stood atop a large set of dependencies, and changes at all layers would lead to changes in the file formats. This created a very inflexible stack of dependencies, where changes in the less stable lower layers could trigger incompatible changes to the document format. When we see that an Excel file on the Mac has a different internal date format than an Excel file created on Windows, we are seeing remnants of this kind of dependency chain.

Note also that these interfaces between the layers were not standards, but proprietary interfaces. For example, a Word 95 document might be seen as this:

The move to XML-based file formats changes this diagram but little. The format at the top is now XML but the dependency chains are the same. The relationship of the format to the technology stack has not changed:


If using a new document format requires you to buy a new application suite, update your hardware and buy a new operating system, then that should be a clear sign that something is wrong. “The tail wags the dog,” as they say.

And note that a dependency is not the same as a layer. You can pretty things up all you want with the use of standards like XML, but still have adverse dependency chains. Taking a Microsoft Word binary format, translating it into XML, and putting it in a Technical Committee whose charter requires that it remain 100% compatible with Microsoft Word leaves you with a file format that depends on Microsoft Word, no matter how much XML Schema and Dublin Core you throw at it. The XML is just syntactic sugar. The essence of the dependency chain remains: OOXML depends on Word and Windows, a single vendor's application stack. Instead of an application supporting a format, a format is supporting an application.

I should further note that a vendor, at great expense and effort, can forestall the bad effects of an unstable dependency chain, sometimes for many years. Instability, with effort, can be managed, as jugglers, unicyclists and stilt walkers remind us. Even though the Word binary format has many dependencies on the Windows platform, and on specific internals of Word and features and behaviors from earlier versions of Word, Microsoft has managed to preserve some level of compatibility with these older formats, even in current versions of Word. The support is far from perfect, and it certainly makes their file format and their applications more complicated and more expensive to work with. But that is the burden they face from bad engineering decisions back in the early 1990’s. They and their customers live with that, and though they may not realize it, they all pay a price for it.

The alternate approach, the one that leads to better prospects for long term document access, is to have a stack, not of proprietary applications and interfaces, but of standards. ODF’s long-term stability and readability comes from the fact that it is built upon, and depends upon other standards that are widely-used, widely-adopted and widely-deployed. ODF is designed so the format depends on things more stable than itself, with a solid foundation as seen here:

The suitability of a format for long term archiving depends as much on the formal structure of the technological dependencies as it does on specific details of the technologies involved. The greatest technologies in the world, if assembled in an unstable dependency arrangement, will lead to an unstable system. Look at the details, certainly, but also step back and look at the big picture. What technology changes can render your documents obsolete? And who controls those technologies? And what economic incentives do they have to trigger a cascade of changes every 5 years, to force upgrades? As consumers and procurers we all need to make a decision as to whether we want to ride on that roller-coaster again.

The question we face today is whether we want to carry forward the mistakes of the past, and the extensive and expensive logic required to maintain this inherently unstable duct-tape-and-baling-wire Office format, or whether we move forward to an engineered format that takes into account the best practices in XML design, reuses existing international standards, and is built upon a framework of dependencies that ensures that the format is not hostage to a chain of technologies that can be manipulated by a single vendor for their sole commercial advantage.

Filed Under: ODF, OOXML, Standards

The Legend of the Rat Farmer

2007/05/31 By Rob 11 Comments

The Tale

A long time ago in a land far away there once was a prosperous town called Hamelin. Everything was perfect in Hamelin until the year the rats came. The rats ate up the grain, bit the townsfolk in the toes and scared the young children. Something had to be done! So the Bürgermeister and the Council met together and decided to bring in an outside consultant, Pied Piper Enterprises, LLC. That did not go well. The rats were back the very next year.

So in the Spring the Bürgermeister again assembled the Council and they talked and talked and talked. Should they bring in another consultant? Should they abandon the town and move someplace else? They finally decided on a market-based approach to solving the problem. They would offer a reward, a bounty, to citizens who captured, killed and turned in rats. Turn every person in Hamelin into an exterminator. The signs soon went up all over town: “A Silver Thaler for every 10 Rats.”

The Bürgermeister tracked the results on a big chart on the wall of his office and the numbers looked very good. Each day more and more rats were being caught and killed. The citizens were busy at work. The rats would soon all be gone.

But then one day the Bürgermeister went home, and in the doorway of his house was his wife and she was very upset, “You shall have no dinner tonight! The rats have eaten all of the grain!”

“How can this be?” exclaimed the Bürgermeister. “The metrics show that we're eliminating a record number of rats every day. Come with me, and I will show you the chart.”

“Chart, schmart. I’ll show you some metrics,” said the Bürgermeister’s wife, who then took him by the ear and led him around the town center, and at each house they stopped and heard the same tale. The rats are still eating up the grain. They are still biting townsfolk in the toes. They are still scaring the young children.

Nothing at all had improved in the quality of life in Hamelin. The only thing that had changed was that they now had a larger pile of dead rats, and a smaller pile of silver Thalers.

An inquest was held to account for the misuse of town funds. During this investigation it was found that a large portion of the reward money had gone to one old man who lived by himself on the outskirts of town. The Bürgermeister and the Council went to visit the old man. “How did you manage to catch so many rats?” they asked, “You are old and slow”.

“Simple,” he said, “Let me show you.” He led them back around his house to an old barn. As he opened the barn doors, he revealed to the astonished Council hundreds of small wooden cages, each one holding 10 large rats.

“I don’t care for rats much myself”, said the old man. “But since you wanted them so much, I thought I could help out a little. After all, I could use the money, and rats are so easy to breed”.

“Bu…bu…bu…but we didn’t want more rats,” stammered the Bürgermeister. “We wanted fewer”.

“Nonsense,” said the old man. “If you offer a reward for something, of course you want more of it, not less. This is just the free market in action.”

The Commentary

We see here the results of failing to specify an appropriate metric. As is often the case, we tend to latch on to metrics that are easy to measure, such as counting dead rats, rather than harder-to-measure but more appropriate metrics that truly indicate the achievement of our goals. For example, a reasonable metric might have been a “resident satisfaction index” based on a weekly survey of Hamelin's citizens to see if their rat problems were decreasing. Or the Bürgermeister could have sent out a commission to count how many rats they found in the grain and tracked that number from week to week. The point is to have a metric that clearly and directly reflects the attainment of your goals.

So the lesson is that you should always watch out and ensure that the metrics being suggested truly reflect your ultimate concerns.

With that in mind, let’s move forward to the present and what seems to me a similar confusion of metrics.

Jason Matusow, Microsoft’s Director of Corporate Standards has written a new blog post, which concludes:

The fact of the matter is that translation between formats has always been the path to interop (for document formats), and now with XML-based formats that path is even more appropriate than ever through translation.

China wants to create its own standardized XML format…translation will enable interop. Google Docs has its own format….translation will enable interop. OpenOffice has ODF..translation will enable interop (to MS Office, to Google Docs, to IBM Workspace). Adobe PDF is its own format…translation will enable interop.

Jason seems to be suggesting that increasing the number of different formats and translators leads to an increase in interoperability. This is akin to saying that increasing the number of umbrellas improves the weather. It just doesn’t work that way.

We need to step back and find the proper metric. If, for the sake of argument, we define interoperability as the ability for different formats to work together, then obviously as we increase the number of formats and the number of translators, the sum total of interoperability (by that definition) in the world increases. In that case, let's make the old 1-2-3 format an ISO standard, the WordPerfect format an ISO standard, WordStar an ISO standard, XYWrite an ISO standard, Quattro Pro an ISO standard, Manuscript an ISO standard, Harvard Graphics an ISO standard, Freelance Graphics an ISO standard, etc. Just imagine how much interoperability we could have in the world if we simply standardized more formats. Every application could have its own standard format, or maybe two or three.

But you may smell a rat in the above argument. Interoperability of formats is not the appropriate metric. A simple look at the lack of OOXML support in Microsoft's own Mac Office shows that the introduction of OOXML has reduced interoperability, not increased it. Similarly, scientific journals like Science and Nature have already come out saying that they cannot accept the OOXML format. Translation among multiple formats only partially and imperfectly attempts to work around a breakdown in interoperability caused by having multiple formats. It is a band-aid approach and does not address the core issue.

A more appropriate metric than counting piles of semi-functional translators is to look at things from the perspective of the user exchanging documents. The end user doesn’t see or care about formats. They care about their documents and the people and processes that work with these documents. The question for them is: what is the cost to exchange their document with other users and business processes? In other words, what is the cost to interoperate? That is the metric that counts.

Several cost drivers come into play here:

  1. What are the choices and costs in application software necessary to author a document?
  2. What are the choices and costs in application software needed by the recipient of this document, in order for them to read it, or to collaborate with the author in editing it?
  3. Will others see the document as I intended? Or will there be fidelity loss from conversions?
  4. Similarly, what are the performance, security, stability, legal and licensing implications of introducing any translation steps?
  5. How easy is it to program against this document format? In other words, what is the cost of business process integration?

When looked at from this business perspective, we can get away from counting piles of dead rats and thus come to a quite different conclusion:

None of the cost-driver factors lead to reduced costs with multiple formats. They all have minimal costs when there is a single format in use. So if the metric for interoperability is the “cost to interoperate”, then interoperability (and choice as well) is maximized when a single application-neutral and platform-neutral document format is natively supported by multiple applications at a range of price/function points. Introducing even a single additional format into your business will escalate costs, degrade fidelity of document exchange, and reduce interoperability.

Filed Under: ODF, OOXML, Popular Posts, Standards

Math markup marked down

2007/04/25 By Rob 16 Comments

Sun’s Erwin Tenhumberg fights some FUD about ODF and in passing provides a link that is worth a few more words. It appears that Science, the journal of the American Association for the Advancement of Science (AAAS), itself the largest scientific society in the world, has updated its authoring guidelines to include advice for Office 2007 users. The news is not good.

Because of changes Microsoft has made in its recent Word release that are incompatible with our internal workflow, which was built around previous versions of the software, Science cannot at present accept any files in the new .docx format produced through Microsoft Word 2007, either for initial submission or for revision. Users of this release of Word should convert these files to a format compatible with Word 2003 or Word for Macintosh 2004 (or, for initial submission, to a PDF file) before submitting to Science.

Well, so much for 100% compatibility, eh? That is what I’ve been talking about. Whether you move to OOXML or ODF you will be making a change that will break compatibility with your past document processing systems. You will need to change over the next couple of years and you will need to examine your choices carefully. But don’t get suckered into thinking that the choice of OOXML is magically painless. The 100% compatibility claims don’t hold water.

More bad news:

Users of Word 2007 should also be aware that equations created with the default equation editor included in Microsoft Word 2007 will be unacceptable in revision, even if the file is converted to a format compatible with earlier versions of Word; this is because conversion will render equations as graphics and prevent electronic printing of equations, and because the default equation editor packaged with Word 2007 — for reasons that, quite frankly, utterly baffle us — was not designed to be compatible with MathML. Regrettably, we will be forced to return any revised manuscript created with the Word 2007 default equation editor to authors for re-editing. To get around this, please use the MathType equation editor or the equation editor included in previous versions of Microsoft Word.

Uh oh. Not only can you not submit files in OOXML format, you can’t even use Office 2007 and save in the old binary formats. Down-conversion or using the Compatibility Pack won’t help. Microsoft’s decision to push its new Office Math Markup Language (OMML) rather than use the well-established MathML standard appears to be a serious flaw.
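For readers who have not seen MathML, here is what a trivial expression, x² + 1, looks like in standard W3C presentation MathML. (This snippet is my own illustration, not taken from either journal’s guidelines.) ODF carries exactly this kind of markup for its equations, which is why publishing toolchains built on MathML can consume it directly.

```xml
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <mrow>
    <msup><mi>x</mi><mn>2</mn></msup>
    <mo>+</mo>
    <mn>1</mn>
  </mrow>
</math>
```

Because the equation stays structured markup rather than a rendered graphic, an editor or typesetter can still alter a symbol or renumber a variable downstream.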

Nature appears to have the same problem:


We currently cannot accept files saved in Microsoft Office 2007 formats. Equations and special characters (for example, Greek letters) cannot be edited and are incompatible with Nature’s own editing and typesetting programs.

Of course, when targeting final publication of a paper, a PDF file is fine. But when collaborating with another researcher, or with an editor, you need to agree on a standard format in which you both can work.

Reuse of existing standards is important. When you reuse a standard, you are reusing more than a piece of paper. You are reusing the experience and effort that went into creating and reviewing that standard. You are reusing the experience gathered by those who have already implemented the standard. You are reusing the books and training materials already written for that standard. You are reusing the interfaces for other technologies that have already integrated with that standard or can produce or consume output that conforms to that standard.

Isaac Newton wrote, “If I have seen further it is by standing on the shoulders of giants”. When you reuse standards you reuse the accumulated wisdom of an industry and assume the vision and powers of giants. But when you ignore all precedents and go forth on your own, well, let’s just say the outcome is more variable. You may be the next Einstein, or you may be the next fool.

If Science and Nature need to update their templates, then I’d suggest they take a look at ODF. Not only does it use MathML for equations, but it is an open standard, an ISO standard, a platform- and application-neutral standard that has many implementations, including several good open source ones. If they need to update their processing anyway, then they might want to make the smart choice now, the choice that increases their choices and flexibility going forward.


18 June 2007 Update

A response from Nature and one of their vendors, explaining the complexity of migrating their publishing ecosystem to a new file format. Quoting a letter to Microsoft from Bruce Rosenblum of Inera:

Had the conversion from DOCX to DOC provided a conversion from OMML to Equation Editor format, it would have provided the necessary backwards compatibility for publishers to upgrade one system at a time. But because this compatibility is not available, it’s created the need for a “big bang” upgrade, or a delay until the ecosystem of inter-dependent systems is deliberately updated over time. In the environment of scholarly publishing, such substantive upgrades often take years, not months.

Filed Under: ODF, OOXML, Standards


Copyright © 2006-2026 Rob Weir · Site Policies