I’ve installed the new Office 2007. This isn’t my preferred platform. In fact, I find I’m not using heavyweight editors of any variety much. For every page I compose in a dedicated word processor I author perhaps 50 pages in emails, blogs or wikis. However, since I do have a license for Office 2007, and I am curious, I decided to take it for a spin. If you want to be a film critic, you’ve got to see the movies…
Here is a quick survey of what I saw in Excel 2007, concentrating on the file format support, my particular area of interest.
First, let’s look at the “Save As” dialog. As you can see from this screen capture, we have some new options:
The first choice saves in the default format. This is configurable under “Excel Options”, but by default this saves in the new Office Open XML (OOXML) format, with an “xlsx” file extension.
The “Excel Macro-Enabled Workbook” option saves with an “xlsm” extension. It is OOXML plus proprietary Microsoft extensions. These extensions, in the form of a binary blob called vbaProject.bin, represent the source code of the macros. This part of the format is not described in the OOXML specification. It does not appear to be a compiled version of the macro: I could reload the document in Excel and restore the original text of my macro, including whitespace and comments. So the source code appears to be stored, but in an opaque format that defied my attempts at deciphering it.
(What’s so hard about storing a macro, guys? It’s frickin’ text. How could you screw it up?)
This has some interesting consequences. It is effectively a container for source code that not only requires Office to run it, but requires Office to even read it. So you could have your intellectual property in the form of extensive macros that you have written, and if Microsoft one day decides that your copy of Office is not “genuine” you could effectively be locked out of your own source code.
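Since the macro-enabled file is still a Zip package, you can at least check whether the opaque macro part is present without launching Office. The Python sketch below is only an illustration under stated assumptions: it assumes the blob is stored as a part whose file name is vbaProject.bin (the demo uses xl/vbaProject.bin as an assumed location), and the helper names find_vba_parts and build_demo_package, along with the placeholder contents of the demo package, are hypothetical.

```python
import io
import zipfile


def find_vba_parts(package_file):
    """Return names of package parts whose file name is vbaProject.bin."""
    with zipfile.ZipFile(package_file) as package:
        return [
            name for name in package.namelist()
            if name.lower().rsplit("/", 1)[-1] == "vbaproject.bin"
        ]


def build_demo_package(with_macros):
    """Build a tiny in-memory Zip standing in for a workbook (placeholder content)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("[Content_Types].xml", "<Types/>")
        package.writestr("xl/workbook.xml", "<workbook/>")
        if with_macros:
            # Assumed part name; the bytes are a placeholder, not real macro storage.
            package.writestr("xl/vbaProject.bin", b"\x00\x01\x02\x03")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(find_vba_parts(build_demo_package(with_macros=True)))
    print(find_vba_parts(build_demo_package(with_macros=False)))
```

To try it on a real workbook, pass a file path instead of the in-memory demo. Finding the part only tells you that macros are there; what is inside the blob stays just as opaque.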
New Style Binary
The “Excel Binary Workbook” option caught me by surprise. This is not one of the legacy binary formats. This is not the new OOXML. This is a new binary format, with an “xlsb” extension. Like OOXML, it uses a Zip container file (the so-called Open Packaging Conventions container file format), but the payload consists (aside from a manifest) entirely of binary files.
I can’t tell whether they are some proprietary binary mapping of the OOXML XML, or whether this is an entirely new binary format unrelated to the XML format. In any case this format is entirely undocumented and is unreadable to anyone but Microsoft.
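A quick way to see how much of such a package is XML and how much is binary is to open the Zip and sniff each part. The following Python sketch is a rough heuristic, not a parser for the format: it calls a part “xml” if its first non-whitespace byte is “<” and “binary” otherwise. The function names, the part names in the demo package and the placeholder bytes are all assumptions made for illustration, not real workbook content.

```python
import io
import zipfile


def looks_like_xml(data):
    """Crude check: skip a UTF-8 byte order mark and whitespace, then expect '<'."""
    if data.startswith(b"\xef\xbb\xbf"):
        data = data[3:]
    return data.lstrip().startswith(b"<")


def classify_parts(package_file):
    """Map each part name in the Zip package to 'xml' or 'binary'."""
    with zipfile.ZipFile(package_file) as package:
        return {
            name: "xml" if looks_like_xml(package.read(name)) else "binary"
            for name in package.namelist()
        }


def build_demo_package():
    """Build an in-memory Zip with made-up parts standing in for a binary workbook."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("[Content_Types].xml", "<Types/>")
        # Placeholder bytes only; these are not real workbook records.
        package.writestr("xl/workbook.bin", b"\x83\x01\x00\x80")
        package.writestr("xl/worksheets/sheet1.bin", b"\x81\x01\x00\x93")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for name, kind in classify_parts(build_demo_package()).items():
        print(kind, name)
```

Pointing classify_parts at a real file path gives a part-by-part inventory of the container. It cannot say what the binary parts mean, only that they are not XML.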
It is also interesting that Microsoft is positioning this format as the preferred one for performance and interoperability. The online help for Excel 2007 says:
In addition to the new XML-based file formats, Office Excel 2007 also introduces a binary version of the segmented compressed file format for large or complex workbooks. This file format, the Office Excel 2007 Binary (or BIFF12) file format (.xls), can be used for optimal performance and backward compatibility.
Old Style Binary
The Excel 97-2003 option provides the legacy binary “xls” formats, the familiar BIFF format from earlier versions of Office.
This takes you to a page where you can download the “Microsoft Save as PDF or XPS” Add-in. Note that you are prompted to download an Add-in that provides support for both PDF and XPS. But if you hunt around a bit you can find another page where you can download just one format or the other, which is what I did, installing just the PDF support. This added a new option, “PDF” to the Save As dialog.
This brings up a dialog where you can choose from the previously mentioned formats as well as the several legacy export formats, including:
- XML Data
- Web Page
- Unicode Text
- XML Spreadsheet 2003
- Excel 5.0/95 Workbook
- Formatted Text
My overall impression was soured a bit by the large number of crashes I experienced. Indeed Excel crashed on exit on almost every session. This was dozens of crashes over the course of an afternoon. This will need to be fixed before I would trust it with my data.
Another curiosity was a legacy binary document that gave the following error message whenever I tried to save it to the new OOXML format:
I did not get this message when I saved it back to the binary format. So evidently I’m losing something when moving to OOXML, whatever “Line Print settings” are. So much for the claims of 100% backwards compatibility…
My examination also ended any lingering hope I had that Microsoft had fundamentally changed its position on proprietary file formats and had decided to follow the path of openness. The new proprietary binary format and the undocumented way that macros are encoded put that hope to rest.
1/22/07, a quick update: Microsoft’s Doug Mahugh helped track down and fix the crash problem I had earlier reported when exiting Excel. This is a bug in the “Send to Bluetooth” COM Add-in that Excel was loading at startup. After disabling that Add-in, I’m no longer crashing.