Here’s a short tutorial on exchanging MathML between Mathematica and OpenOffice, showing what is possible today, and offering some suggestions for closer integration.
First, start with a new ODF document in OpenOffice. It is often easier to modify an existing document, inheriting its structure and default styles, than to create a new document from scratch. So I believe that a lot of interesting projects with ODF will start with an existing document as a template, and then add or replace content in it.
So, here’s what I made: a simple file with a formula describing the Euclidean metric, our old friend the Pythagorean Theorem. Click the image to load the ODF file.
If you rename the ODF file to a .zip extension and unzip it, you can see the XML files it contains. Always start with manifest.xml (provided here for your convenience), and note the entry with the type “application/vnd.oasis.opendocument.formula”. This, according to Appendix C of the ODF 1.0 specification, is the registered MIME type of an ODF formula document. So that sounds like what we want. Let’s replace that equation with something else.
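If you would rather not unzip by hand, the same inspection can be scripted. The following is only a rough Python sketch, not part of the original workflow: it reads META-INF/manifest.xml straight out of the package and lists each entry's path and media type. The function name `manifest_entries` is my own invention, and the namespace URI is the manifest namespace as I understand it from the ODF 1.0 specification.

```python
import zipfile
import xml.etree.ElementTree as ET

# Manifest namespace as defined by ODF 1.0 (stated here as an assumption).
MANIFEST_NS = "urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"


def manifest_entries(odf_file):
    """Return (full-path, media-type) pairs from META-INF/manifest.xml.

    odf_file may be a path or a file-like object; no renaming to .zip
    is needed because zipfile does not care about the extension.
    """
    with zipfile.ZipFile(odf_file) as zf:
        root = ET.fromstring(zf.read("META-INF/manifest.xml"))
    entries = []
    for entry in root.iter("{%s}file-entry" % MANIFEST_NS):
        entries.append((
            entry.get("{%s}full-path" % MANIFEST_NS),
            entry.get("{%s}media-type" % MANIFEST_NS),
        ))
    return entries
```

Calling `manifest_entries("pythagoras.odf")` (a hypothetical file name) should include a pair whose media type is application/vnd.oasis.opendocument.formula if the package is a formula document.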
So into Mathematica we go. Suppose I want to calculate the indefinite double integral of the Euclidean metric. Why not? This is something I’d rather not do by hand, but I know Mathematica can quickly give me the answer:
Now I really don’t want to retype that result into OpenOffice. So, what can I do? I can use Mathematica’s ExpressionToMathML function to turn the above into MathML. When I do that I get MathML like this.
Let’s see now what happens if I simply drop that content in as a replacement for the original content.xml in the ODF file. Here’s what I get (click the image to open the ODF file):
So we got something, but it is not quite right. I’m seeing some little hollow boxes, usually an indication of an unprintable character. What’s up with this?
A closer look at the XML generated from Mathematica shows that these boxes are displayed wherever the MathML uses the XML character entities listed in section 6.2.4, “Non-Marking Characters”, of the MathML specification. This includes things like “InvisibleTimes”, which handles cases where adjacency represents multiplication (xy == x*y). Using these characters provides hints to the application that can help it optimize its rendering and editing, but they should not be displayed.
In any case there appears to be a bug in OpenOffice 2.0.3 where it tries to display these characters and finds they don’t map to any printable Unicode character. No big deal, I will enter a bug report on that later. But for now I can easily clean this up by defining a new function in Mathematica, ExpressionToOO, defined as follows:
(Note I didn’t name this “ExpressionToODF”, since strictly speaking the ODF specification allows MathML 2.0, including the non-marking characters. This function is specifically to work around an OpenOffice bug. It outputs valid MathML, simply removing the non-marking characters which OO doesn’t understand.)
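For readers without Mathematica, the same cleanup can be approximated outside it. The sketch below is Python, not the ExpressionToOO definition from the notebook, and it makes two assumptions: that only the three invisible operators (InvisibleTimes, ApplyFunction, InvisibleComma) need removing, and that they may appear as named entities, numeric references, or literal characters. The name `strip_invisible` is hypothetical. It works on the raw text rather than a parsed tree, since a plain XML parser would reject the named entities without the MathML DTD.

```python
import re

# Assumption: only these three non-marking characters are stripped.
# Section 6.2.4 of MathML 2.0 lists more (spaces, break hints, and so on).
INVISIBLE = {
    "InvisibleTimes": 0x2062,
    "ApplyFunction": 0x2061,
    "InvisibleComma": 0x2063,
}


def _spellings():
    """Yield every textual form each character may take in the XML."""
    for name, codepoint in INVISIBLE.items():
        yield "&%s;" % name          # named entity, e.g. &InvisibleTimes;
        yield "&#%d;" % codepoint    # decimal reference, e.g. &#8290;
        yield "&#x%X;" % codepoint   # hexadecimal reference, e.g. &#x2062;
        yield chr(codepoint)         # the literal character itself


_PATTERN = re.compile("|".join(re.escape(s) for s in _spellings()))


def strip_invisible(mathml):
    """Return the MathML text with the invisible operators removed."""
    return _PATTERN.sub("", mathml)
```

Note that this removes only the characters, so an element such as `<mo>&InvisibleTimes;</mo>` is left behind as an empty `<mo></mo>`. Whether OpenOffice renders that cleanly is something I have not verified.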
So, back to Mathematica. I run ExpressionToOO, grab the resulting XML, inject it into the ODF document, and we get the following (click to open the ODF file):
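The injection step itself is just a zip rewrite, and could be scripted along these lines. This is a minimal Python sketch under my own assumptions, not a tested tool: `replace_member` is a hypothetical helper, and it assumes the formula lives in a member named content.xml at the root of the package, as in the standalone formula document used here. Each original entry's metadata is reused so that the order and compression settings carry over, which matters because the ODF packaging rules expect the mimetype entry to come first and be stored uncompressed.

```python
import zipfile


def replace_member(src, dst, member, new_bytes):
    """Copy the zip package src to dst, swapping one member's content.

    src and dst may be paths or file-like objects. Raises KeyError if
    the member is not present in the source package.
    """
    found = False
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        for info in zin.infolist():
            if info.filename == member:
                data = new_bytes
                found = True
            else:
                data = zin.read(info.filename)
            # Passing the original ZipInfo keeps the entry order and
            # each entry's compression method.
            zout.writestr(info, data)
    if not found:
        raise KeyError("no member named %r in package" % member)
```

A call might look like `replace_member("pythagoras.odf", "integral.odf", "content.xml", mathml_bytes)`, where both file names and `mathml_bytes` are placeholders for your own files and the cleaned-up MathML.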
That’s what we want! For those who are interested, the complete Mathematica notebook is here: Session.nb.
As you can see, this isn’t rocket science, though no doubt it may be useful to rocket scientists. Consider this a little “proof of concept”. Real end users will not be going around unzipping ODF documents and copying XML around. There needs to be some additional integration work to make this process simple and joyful. For example:
- A Mathematica function that automatically inserts a formula into an ODF document
- An OpenOffice add-in that lets the user browse formulas from Mathematica and insert them into the current working document.
- Clipboard level exchange of MathML between OpenOffice and Mathematica
- An export filter for OpenOffice that exports to the XHTML+MathML+SVG profile defined by the W3C. This, combined with Firefox, would provide kickass scientific publishing using open standards and tools.
Note that I’m using Mathematica here just as an example. There are over 100 applications out there that support MathML, both commercial and open source. I’d be interested in hearing what other ideas people have for workflows involving ODF editors and other tools that work with the standards ODF includes: not just MathML, but SVG, XForms, etc. Let’s demonstrate the value of open standards working together.