Summary: In this post I will look at MathML, a web standard for displaying mathematical equations. I will show how well established it is on the web, how it is integrated into ODF, and how Microsoft has decided to go off in another direction with OMML, another “stealth” standard hidden in their 4,000-page Office Open XML specification, but little mentioned. As I did with my prior analysis of their reliance on the rejected VML specification, I will show why this is a bad thing.
I’ve been reading Math You Can’t Use: Patents, Copyright, and Software, a book by Ben Klemens, Guest Scholar at the Brookings Institution. It examines the current state of software patents in the U.S. and the abuses thereof. He blends his legal and economic policy background with his insights as a programmer to give a perspective worth hearing. Mind you, I don’t agree with him on many points, and in fact I found the book infuriating at times, but he does make a serious argument and I respect that. In any case I like to have my opinions challenged every now and then. It keeps the mind limber.
Although I am not going to talk about patents and copyrights today, I will steal the title of this book and talk a bit about math, the kind you can use as well as the type you can’t. The topic for today is MathML.
MathML is a web standard from the W3C, an XML vocabulary for representing the structure and content of mathematical expressions. In other words, it represents equations for display, especially complicated expressions with integrals, summations, products, limits and all the Greek you can throw at it.
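To make “an XML vocabulary” a little more concrete, here is a minimal sketch in Python that builds presentation MathML for the expression x² + 1. The element names (math, mrow, msup, mi, mn, mo) and the namespace URI come from the MathML specification; the function name build_quadratic is just my own label for this illustration.

```python
import xml.etree.ElementTree as ET

# Namespace URI defined by the W3C MathML specification.
MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def m(tag: str) -> str:
    """Return a namespace-qualified MathML tag name for ElementTree."""
    return f"{{{MATHML_NS}}}{tag}"


def build_quadratic() -> ET.Element:
    """Build presentation MathML for x^2 + 1 (hypothetical helper)."""
    math = ET.Element(m("math"))
    row = ET.SubElement(math, m("mrow"))

    # msup takes two children: the base, then the superscript.
    sup = ET.SubElement(row, m("msup"))
    ET.SubElement(sup, m("mi")).text = "x"   # mi = identifier
    ET.SubElement(sup, m("mn")).text = "2"   # mn = number

    ET.SubElement(row, m("mo")).text = "+"   # mo = operator
    ET.SubElement(row, m("mn")).text = "1"
    return math


if __name__ == "__main__":
    # Serialize with MathML as the default namespace so the tags print unprefixed.
    ET.register_namespace("", MATHML_NS)
    print(ET.tostring(build_quadratic(), encoding="unicode"))
```

The point is only that the markup mirrors the structure of the expression: a row containing a superscript, an operator, and a number.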
If you are running Firefox and have installed the math fonts then you can get an idea of its capabilities by loading MathML-enabled pages right now, like this one. If you are running Internet Explorer, then sadly you lack native support for MathML, but a browser plugin is available.
MathML 1.0 dates back to 1999, and has been revised through MathML 2.0 (second edition) in 2003.
The W3C has made a special effort to get the various MathML vendors together to evaluate how well they handle MathML, and the results are published in their Implementation and Interoperability Report.
Where MathML is supported natively, such as in Firefox, it will render along with the text, and not merely as an embedded GIF image. So, it will scale to different screen resolutions and print well. In theory, since it is just text markup in the page, it can be indexed by an intelligent search engine, though I am aware of none that do this currently. (Is there any use for a Google search of all web pages that include a 3rd degree polynomial inequality? I wouldn’t want to be the first to say “No”.)
MathML is also the key to enabling better support for mathematics via screen readers and other assistive agents. When a visually impaired user is presented an equation in the form of a GIF or other image format, they are left out. But put the formula in MathML and the possibilities look better. The work is not complete yet, but progress is being made. See, for example, this report from CSUN 2004 and NIDE’s MathML Accessibility Project.
Further innovations are seen at sites like Wolfram’s MathMLCentral where we see web services for creating, displaying, or even integrating MathML expressions, using their Mathematica program as the backend.
For the above, and many other reasons, MathML was the only logical choice for us to use to support equations in OpenDocument Format (ODF). With such a thriving ecosystem of producers and consumers, with support in the tools used by academia and industry like Mathematica and Maple, strong support in web browsers like Firefox, with the accessibility initiatives around it, I don’t see how you could argue otherwise. MathML is the way the web does math.
But the choice of MathML is more than just a fashion statement. It has practical significance and enables opportunities for innovative workflows around mathematical document production. If you create an equation block in OpenOffice, it saves the equation as a standalone MathML XML document in the ODT document archive. This makes it very easy to access, read, replace, etc.
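As a rough illustration of how accessible those equations are, here is a minimal Python sketch that opens an ODT file as a ZIP archive and returns every XML member whose root element is MathML’s math element. It assumes the members are UTF-8 encoded, and it deliberately does not assume any particular name for the embedded formula object; the function name find_mathml_parts is my own invention.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace URI defined by the W3C MathML specification.
MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def find_mathml_parts(odt_file):
    """Return {member name: XML text} for each MathML document in the archive.

    odt_file may be a path or a binary file-like object (hypothetical helper).
    """
    found = {}
    with zipfile.ZipFile(odt_file) as archive:
        for name in archive.namelist():
            if not name.endswith(".xml"):
                continue
            data = archive.read(name)
            try:
                root = ET.fromstring(data)
            except ET.ParseError:
                # Skip anything this parser cannot handle as-is.
                continue
            if root.tag == f"{{{MATHML_NS}}}math":
                # Assumption: the member is UTF-8 encoded.
                found[name] = data.decode("utf-8")
    return found
```

Calling find_mathml_parts("paper.odt") on a document with one equation should give back a single entry whose value is plain MathML, ready to hand to any other MathML-aware tool. (The file name paper.odt is only a placeholder.)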
We should be thinking about workflows like the following:
- Do your complicated calculations in a tool like Mathematica
- When you get the final results you want, export them to MathML, for example, using Mathematica’s MathMLForm[ ] function.
- Copy the MathML into an ODF document archive
- Take the ODF document and complete the prose write-up of the document in OpenOffice
- Share the draft with colleagues, review, etc., in the editable ODF format
- When ready to publish, export to XHTML with embedded MathML preserved for the equations, and embedded SVG for the charts.
- Users can then view in Firefox or Internet Explorer (with extra plugin)
We’re not quite there yet, end to end. The sixth step (the XHTML export) in particular is not working as I’d expect in OpenOffice 2.03. But you get the idea. There is opportunity for fame, glory, and perhaps some profit for the person or company who provides an end-to-end mathematical editing and publishing solution based on open standards.
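The “copy the MathML into an ODF document archive” step of the workflow above can be sketched in a few lines of Python. ZIP members cannot be overwritten in place, so this sketch copies the archive member by member and swaps in the new MathML for one of them. The function name replace_member and the member name in the usage note are hypothetical, and I am assuming the target document already contains a formula object to replace.

```python
import zipfile


def replace_member(src, dst, member, new_bytes):
    """Copy the ZIP archive src to dst, replacing one member's content.

    src and dst may be paths or binary file-like objects (hypothetical helper).
    Members are written in their original order with their original
    compression method, so an uncompressed leading 'mimetype' entry,
    as ODF packages use, stays first and stays uncompressed.
    """
    replaced = False
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        for info in zin.infolist():
            if info.filename == member:
                data = new_bytes
                replaced = True
            else:
                data = zin.read(info.filename)
            zout.writestr(info.filename, data, compress_type=info.compress_type)
    if not replaced:
        raise KeyError(f"{member!r} not found in archive")
```

Usage would look something like replace_member("draft.odt", "draft-new.odt", "Object 1/content.xml", mathml_bytes), where mathml_bytes holds the MathML exported from the calculation tool. Both file names and the member name are placeholders; check the actual archive listing for the formula object you want to replace.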
So, in this happy world I’ve described, what is missing? If you guessed “Microsoft Office” then you guessed correctly! Even though MathML is a 7-year-old standard, widely implemented, supported by the leading mathematical tools, the preferred format for publishing math on the web, etc., etc. (the mantra should be familiar), Microsoft has ignored it and instead is pushing forward a new competing format in their Office Open XML (OOXML) specification rushing through Ecma.
The new math markup format is called OMML and you’ve probably never heard of it. You can check Google, you can check Wikipedia, you can check MSDN. You won’t find it. In fact, I’m not even sure what OMML stands for since the acronym is not defined in the spec. But it is there, nestled away in the 4,081 page draft OOXML specification as the markup that “specifies the structures and appearance of equations in the document”, Section 25.1, all 93 pages of it.
OMML is not MathML, though it solves the same problem. But if you use OMML, it will not work with Firefox, with Mathematica, with OpenOffice or with any of the other 100 applications that support MathML. OMML works with Office, and that’s it. One door in, no doors out.
Consider that Ecma TC45’s Programme of Work included the goal of:
…enabling the implementation of the Office Open XML Formats by a wide set of tools and platforms in order to foster interoperability across office productivity applications and with line-of-business systems.
How exactly does the OOXML specification foster this interoperability when it ignores relevant web standards like MathML (and SVG and XForms)?
Microsoft’s typical argument is to say that the existing standards are inadequate, that Microsoft users expect more, that they need more features, that this is because they need to deal with billions of documents and trillions of dollars, etc. But this rings hollow when talking about math. An examination of the history of mathematical notation demonstrates, as you may already know, that mathematical notation is not exactly experiencing a high rate of change. Equations, as used in math and the sciences, for the most part use the same notation they did 100 years ago, and many parts of the notation are 200-300 years old. Certainly there has been no essential change in notation since 1999, when MathML was created.
Now if Microsoft had merely wanted to create a proprietary format for equations and use that in Word in order to trap their customers onto that platform, then I’d simply say that’s not my concern and I’d blog about my heirloom tomatoes or something else. But when this shows up in a nominally open standard destined for approval by ISO, then this raises my eyebrows a little. The obvious choice would have been to simply reuse MathML. So, why are they creating and standardizing a whole new math markup language? Are there no standards worth reusing? Will XPS replace PDF, VML replace SVG, Windows Media Photo format replace PNG, OMML replace MathML, and OOXML replace ODF? Let’s say “No” to OMML and “Yes” to MathML, the math you can use.