As we all know, rich desktop editors, such as those provided in Microsoft Office, offer a range of end-user programming options, such as Visual Basic macros. These can be used to automate repetitive clerical tasks, such as a mail merge, or to add a custom user interface over a data entry form. These capabilities have existed in personal productivity applications since the late 1980s, so 20 years now. This is not a cutting-edge feature.
Such scripting capabilities are essential for the creation of high-value scripted documents, and they are expected in modern applications. Almost every word processor or spreadsheet today has automation capabilities. Even open source applications like OpenOffice have macro features. So, considering the popularity and value of scripting in a productivity application, it is much to be lamented that DIS 29500 does not define how scripts or macros are to work. This lack will cause serious interoperability concerns, as each vendor, lacking standards guidance, will implement these features in incompatible ways.
Specifically, in order to have any interoperability among scripted documents, it is necessary to define:
- How and where a script is stored and located within the Open Packaging Convention (OPC) container file.
- How the script is bound to the document. In other words, how does the document content associate itself with the macro?
- What is the runtime language of the script?
- What are the core and extension APIs available to the script?
- What is the security model?
OOXML defines none of these. So how can it meet its goal to “represent faithfully the existing corpus of word-processing documents, spreadsheets and presentations that have been produced by Microsoft Office applications (from Microsoft Office 97 to Microsoft Office 2008 inclusive)”? How can it do that and ignore the macros that have been around for decades?
Note that there is ample precedent for a markup standard answering these questions in a flexible and interoperable manner. For example the common web paradigm would be:
- Script is located via URL specified in a “src” attribute of a script element, or is given inline
- The script is invoked by a function call at a particular point in the document, or triggered from a standard event such as onLoad().
- Multiple runtime languages are supported, often ECMAScript
- The APIs allowed are defined by the W3C’s DOM API
- There is a defined security model to deal with hazards such as cross-frame scripting, etc.
OOXML provides none of this, so interoperability of these high-value documents is not possible. Note again that scripting is widespread and has been around for 20 years. So it is especially unfortunate that a newly proposed standard lacks this capability.
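To make the contrast concrete, here is a minimal sketch of how a generic tool can find the scripts in a web page, precisely because the web conventions listed above say where they live. It uses Python's standard `html.parser` module. The names `ScriptLocator` and `locate_scripts` are my own, and treating every attribute that starts with "on" as an event handler is a rough heuristic, not something taken from a specification.

```python
from html.parser import HTMLParser


class ScriptLocator(HTMLParser):
    """Collect external scripts, inline scripts and on* event handlers."""

    def __init__(self):
        super().__init__()
        self.found = []          # list of (kind, value) tuples
        self._inline_at = None   # index in self.found of the open inline script

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "script":
            if attrs.get("src"):
                # Script located via the URL in the "src" attribute.
                self.found.append(("external", attrs["src"]))
            else:
                # Script given inline; its text arrives via handle_data().
                self._inline_at = len(self.found)
                self.found.append(("inline", ""))
        for name, value in attrs.items():
            # Heuristic (assumption): on* attributes are event handlers.
            if name.startswith("on"):
                self.found.append(("handler", f"{tag}.{name}={value}"))

    def handle_data(self, data):
        if self._inline_at is not None:
            kind, text = self.found[self._inline_at]
            self.found[self._inline_at] = (kind, text + data)

    def handle_endtag(self, tag):
        if tag == "script":
            self._inline_at = None


def locate_scripts(html):
    """Return every script reference found in an HTML string."""
    parser = ScriptLocator()
    parser.feed(html)
    parser.close()
    return parser.found
```

A mail gateway or virus scanner could apply something like this to any HTML page from any vendor. No equivalent can be written against DIS 29500 alone, since the text gives it nothing to look for.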
Note, however, that scripting is not without its problems. We all remember the Word macro viruses of several years ago, such as Melissa. Portable code has well-known risks, and these risks have well-known countermeasures. For example, it is common for anti-virus software to scan Word documents for viruses. It is also common for mail servers to scan incoming emails for attachments with viruses, and even to remove the macros or block documents with macros, according to admin policy. So there is a need to enable third-party applications that can locate, retrieve, scan and delete scripting elements from documents. However, since OOXML does not define even where the scripts are stored, or how they can be located, such third-party applications cannot be written in general for a document described by this specification. The standard provides an insufficient foundation for implementing a reasonable security policy around OOXML documents.
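As a rough illustration of the scanner's predicament, the sketch below does the most a third-party tool can do with the package alone: it reads the OPC `[Content_Types].xml` part and lists the parts whose declared content type appears in a caller-supplied set. The function names and the content type `application/x-example-macro` are hypothetical. The point is that the set of "suspect" types has to come from knowledge of each producing application, because the standard does not name one.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by the OPC [Content_Types].xml part.
CT_NS = "{http://schemas.openxmlformats.org/package/2006/content-types}"


def find_suspect_parts(package_bytes, suspect_types):
    """Return part names whose declared content type is in suspect_types."""
    hits = []
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        root = ET.fromstring(zf.read("[Content_Types].xml"))
        defaults = {d.get("Extension", "").lower(): d.get("ContentType")
                    for d in root.findall(CT_NS + "Default")}
        overrides = {o.get("PartName"): o.get("ContentType")
                     for o in root.findall(CT_NS + "Override")}
        for name in zf.namelist():
            if name == "[Content_Types].xml" or name.endswith("/"):
                continue
            part = "/" + name
            ext = name.rsplit(".", 1)[-1].lower() if "." in name else ""
            # An Override for the part wins over the extension Default.
            ctype = overrides.get(part, defaults.get(ext))
            if ctype in suspect_types:
                hits.append(part)
    return sorted(hits)


def build_demo_package():
    """Build a tiny in-memory package with one made-up macro part."""
    content_types = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
        '<Default Extension="xml" ContentType="application/xml"/>'
        '<Override PartName="/word/macros.bin" '
        'ContentType="application/x-example-macro"/>'
        '</Types>'
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("[Content_Types].xml", content_types)
        zf.writestr("word/document.xml", "<document/>")
        zf.writestr("word/macros.bin", b"\x00\x01")
    return buf.getvalue()
```

Calling `find_suspect_parts(build_demo_package(), {"application/x-example-macro"})` finds the macro part only because the caller already knew which content type to ask for. A second application that stores its macros under a different type, or inside another part, passes straight through.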
For example, take Ecma Response 101, approved in Geneva in a 9-4 vote as part of a large batch of 1027 changes, without discussion or opportunity for dissent. Four NBs, in their ballot comments from last September, pointed out that Section 188.8.131.52 of DIS 29500’s Part 4 defines a “MACROBUTTON” field that allows the definition of a button in the document that will trigger a macro. But nothing is said about how the macro is stored, how it is bound, what APIs are available, what the security model is, etc.
The request from one NB was to “Describe this feature to a level where cross-platform, cross-application interoperability is possible.” However, what Ecma provided in their draft Disposition of Comments report, approved in batch by the BRM without discussion or opportunity for objection, was something quite different. They merely added the following text:
The mechanism by which the command specified by text in field-argument-1 is located and/or executed by an application is implementation-defined
So not only is it impossible to have cross-platform interoperability of this feature, it is not even possible to implement a reasonable security policy to detect, scan or block macros. Even the location of the macro is outside the scope of the standard. It could be just another file in the Zip. It could be a binary blob with an obscure content type that varies from application to application. It could be Base64-encoded in the XML. Or it could be steganographically encoded in the low-order bits of an image file. The OOXML standard is singularly unhelpful in telling us how to deal with the risks of this macro function.
Finally, note that this lack of information on how to locate macros within a document makes it impossible for anyone to programmatically combine or divide OOXML documents which may contain macros. For example, imagine a 2-page spreadsheet, with a macro on sheet one only. How can it be split into two one-page documents, if there is no defined way to locate the script associated with page one? This is the type of automated composition and document manipulation that OOXML should be enabling. Similarly, how can one combine two single documents containing macros into one document, if there are no defined rules for locating and naming macros? Many basic types of applications, such as merging slide shows, will break in the presence of macros.
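The merge problem can be shown with a toy model. Suppose, purely for illustration, that each document's macros could somehow be extracted into a table mapping macro name to source text. Nothing in DIS 29500 defines such a table, and `merge_macros` is a hypothetical helper. Even in this idealized case, a merging tool runs into name clashes that it has no defined rule for resolving.

```python
def merge_macros(doc_a, doc_b):
    """Merge two {macro_name: source} tables and report name clashes.

    A clash is a name that both documents define with different source.
    The first document's version is kept, which is an arbitrary choice:
    no standard rule says which one should win or how to rename.
    """
    merged = dict(doc_a)
    clashes = []
    for name, source in doc_b.items():
        if name in merged and merged[name] != source:
            clashes.append(name)
        else:
            merged[name] = source
    return merged, sorted(clashes)
```

For instance, merging `{"AutoOpen": "A"}` with `{"AutoOpen": "B"}` reports a clash on `AutoOpen`. Whether to rename one macro, and how to then update the buttons or fields that invoke it by name, is exactly what a standard would need to spell out.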
The above topic was of interest to several NBs in Geneva, but could not be discussed for lack of time at the BRM.