
An Antic Disposition


Archives for 2007

Fast Track. Wrong Direction.

2007/03/13 By Rob 26 Comments

The idea was to make the C++ programming language work better in Microsoft’s .NET framework. It started off as the Managed Extensions for C++, first available in 2000, and later in Visual Studio .NET 2003. Managed Extensions were reformulated in Visual Studio 2005 where they were called C++/CLI, referring to the Common Language Infrastructure, the runtime abstraction in .NET.

CLI itself had earlier been standardized in Ecma (approved in 2000) and Fast Tracked through ISO (approved in 2001). So, it was not much of a surprise when the C++ variant for Microsoft’s .NET Framework, C++/CLI, was proposed for standardization as well. Ecma TC39/TG5 started work on C++/CLI in December 2003 and Ecma approved the specification as Ecma-372 in December 2005. Two years in committee, resulting in a 304-page specification. This used to be considered a fast pace.

After approval by Ecma, C++/CLI was submitted for Fast Track processing to ISO/IEC JTC1/SC22 as DIS 26926. Like any other Fast Track in JTC1, this process started with a 30-day contradiction period. Contradiction submissions were made by both Germany[pdf] and the UK[pdf].

The UK’s position was that calling the standard “C++/CLI” would cause, and in fact was already causing, confusion among users with the existing C++ programming language. The name of the standard was unacceptable:

We consider that C++/CLI is a new language with idioms and usage distinct from C++. Confusion between C++ and C++/CLI is already occurring and is damaging to both vendors and consumers.

A new language needs a new name. We therefore request that Ecma withdraw this document from fast-track voting and if they must re-submit it, do so under a name which will not conflict with Standard C++.

Similar views were expressed by Germany:

With reference to §13.4 of the JTC1 Directives, 4th edition, DIN brings to the attention of the JTC1 secretariat that we perceive a contradiction between document JTC 1 N 8037 “30 Day Review for Fast Track Ballot ECMA-372 1st edition C++/CLI Language Specification” and the JTC1/C++ standard ISO/IEC 14882:2004 “Programming language C++” and related technical reports.

We propose that the document is input into SC22 as a regular New Work Item Proposal and assigned to WG21 for further processing.

Ecma responded[pdf] to these objections in a 5-page letter, on 29 January 2006, that refused to make even the most basic concession, such as changing the name to remove the C++ reference.

So the objections were ignored, and the process moved on to the 5-month ballot period, which began March 9th, 2006. When the ballot closed in August and the votes were counted, C++/CLI had received 11 approvals out of 20 P-Member votes (55%) and 9 negative votes out of 26 total votes cast (34.6%). It thus failed on both counts: it did not reach the required 2/3 approval of P-Members, and it did not keep the negative votes below 25%.

Germany and the UK voted disapproval. No surprise there, since they had objected early in the process and their objections were ignored. In fact, one of Germany’s comments in the ballot was:

DIN has commented before, as well as BSI did, that allowing fast-track standardization of the “C++/CLI Language” under this name clearly conflicts with an existing and actively maintained standard: ISO 14882 – the C++ Programming Language. The document under review spells out under “NOTE FROM ITTF”, bullet 2.2, that ITTF will ascertain that this proposed standard does not conflict with any other International Standard but such a conflict was pointed out. No reason has been given why this objection was overridden. Thus, DIN wants to express its surprise that standardization of this proposal went forward.

The US comments included:

The proposed standard is not market driven, nor is it the product of an industry consensus.

We are unimpressed with the very low level of C++ community participation mustered in the design and refinement of the current document, and feel, quite frankly, that the current state of this document is not at a high enough level of technical excellence to merit the ISO imprimatur.

France said:

This document should be withdrawn from the fasttrack approval process pending re-drafting and a more adequate review prior to voting. Better yet, retain it as an Ecma standard only until a clear market consensus develops that a JTC1 standard in this area is needed.

And so on, down the list.

It should be noted that a failing vote in the 5-month ballot is not necessarily fatal. The Fast Track submitter, in this case Ecma, can call on the SC Secretariat to convene a Ballot Resolution Meeting (BRM), where the issues can be discussed and resolved, possibly leading to a positive vote after a further ballot. This is Ecma’s right as a Fast Track submitter. However, C++/CLI did not see a ballot resolution meeting. The JTC1 Secretariat recently notified SC22 members:

We have been advised that the comments accompanying the Fast Track ballot for DIS 26926 are not resolvable and that holding a Ballot Resolution Meeting (BRM) would not be productive or result in a document that would be acceptable to the JTC 1 National Bodies. Therefore, our proposal is to not hold the BRM and to cancel the project.

So the BRM, which had been scheduled for April 2007, has been canceled. And that is where it stands today: the attempted Fast Track of C++/CLI is dead, killed by flaws that seem to have been easily preventable.

Lessons, anyone?

Don’t ignore NB members. If they take the time and make the effort to point out your flaws early in the process, then you should count yourself lucky. This is like the school teacher walking around the classroom during a quiz and pointing to one of your answers and saying, “You might want to take another look at that problem”. If you ignore her advice and just turn in your paper, then you deserve the grade you get.

It is instructive as well that although only two NBs objected during the C++/CLI contradiction period, the number of objectors had grown far larger by the time the 5-month ballot ended. Ignoring problems doesn’t make them go away.

One last thing. Any guesses on how long those contradiction arguments stay online before they are taken down to preserve the shrouded secrecy of ISO process? I advise you to make a copy now. I certainly have.

Filed Under: OOXML

Document Migrations

2007/03/06 By Rob 11 Comments

If you’ve been around this business for a while, you’ve seen your share of migrations. New operating systems, new networks, new hardware, even new document formats. I’d like to share some recollections of one such migration, and then suggest a solution.

In 1995 I was working at Lotus on Freelance Graphics, along with many others, getting SmartSuite ready for Windows 95. One day, as I walked to work and rounded the corner of Binney Street, I saw something unusual, even more unusual than the usual unusual one sees in Cambridge. Something was up. There were news vans parked in front of LDB, camera crews and reporters looking for comments, Lotus security videotaping the reporters asking for comments, and me standing there, clueless.

This was how I first heard of IBM’s take-over offer. It was hard to concentrate on porting to Windows 95 with all that news going on downstairs, but we managed.

In the weeks and months that followed there were many changes. At Lotus we were 100% SmartSuite users. No surprise there. Most of us did not even have a copy of Microsoft Office on our machines, unless we worked on file compatibility. Not only did we use SmartSuite for our collaborative work, creating and reviewing specifications, giving presentations, etc., we also ran some of our business processes on it. In particular we used an expense report application, done in 1-2-3 with LotusScript.

But IBM used Microsoft Office. So when IBM took over, we needed to migrate. Sure, there was whining and moaning and gnashing of teeth on our end about having to move to an inferior product. And it did take a little while to get accustomed to the different conventions of Office, typing AVERAGE() in Excel, rather than @AVG() in 1-2-3 and stuff like that. But we did it. We moved to Office. It was clear to all that the benefits of having a single file format outweighed the short-term pain on migration.

It is interesting what we did not do:

  1. We did not go and convert all existing legacy SmartSuite documents into Office format. What would have been the point? Most old documents are never touched again. Let them rest in peace.
  2. We did not delete SmartSuite from our hard drives. We kept the application there for cases where we needed to access old documents.
  3. We did not simply continue using SmartSuite and tell it to save in Office format. We knew that both fidelity-wise and performance-wise it is far better to use an application that supports a format natively than to rely on conversion software for interoperability.
  4. We did not translate 1-2-3 macro-based applications into Excel macro-based applications. We took the opportunity to move straight to web-based applications. Aside from some standard presentation templates and similar boilerplate, we did not do a lot of conversion work.

Looking back, the migration of file formats was one of the least contentious changes that accompanied the IBM takeover. We can handle file format changes, but eliminating the traditional Friday Beer Cart, now that was something to complain about…

I’m not much of one for committing unprovoked acts of methodology, but if I had to summarize what little wisdom I have in this area, I’d say that for a migration you want to evaluate your existing documents by three criteria: stability, complexity and business criticality, and develop a migration plan based on that.

For the first criterion, you classify documents by how stable (unchanging) they are:

  1. Hot documents — the documents that are being heavily changed and edited today, works-in-progress, in active collaborations
  2. Cold documents — the documents which are no longer edited, though perhaps they are still read. Many of these documents may have zero value and are just taking up space. Others may be valuable records, but hidden away on someone’s hard-drive.
  3. Warm documents — These are the ones that are in the middle, not seeing heavy activity, but they aren’t quite frozen either.

From the perspective of complexity we have:

  1. Low complexity — simple text and graphics
  2. Medium complexity — using more advanced features, created by power users
  3. High complexity — “engineered documents”, using scripting and macros to create applications.

Finally you can also look at these documents from the perspective of business criticality. Of course, this will vary according to your business. It might be relevance to ongoing litigation, it might be according to a records retention policy, it might be whether it concerns currently open projects, etc. But for sake of argument, let’s take client or public exposure as a proxy for criticality, so we get this:

  1. Internal use documents — internal presentations and reports
  2. Customer facing documents — engagement reports, proposals, etc.
  3. Publication ready documents — white papers, journal articles, etc.

These three dimensions — stability, complexity and criticality — can be combined, creating 27 different document classes. For example, our old expense report based on 1-2-3 macros would be classified as a hot, high complexity, internal use document.

So you are transitioning from Office legacy binary formats to ODF. What do you do with each of these document classes? You have four main strategies to consider:

  1. Do nothing and preserve the document in the legacy format, maintaining, as needed, access to the legacy application.
  2. Convert the document to a portable, high-fidelity static representation, like PDF.
  3. Convert directly to ODF.
  4. Reengineer as something other than a document.

So one migration policy might look like this:

Stability  Complexity  Exposure           Strategy
Cold       Low         Internal Use       Do nothing
Cold       Low         Customer Facing    Do nothing
Cold       Low         Publication Ready  Do nothing
Cold       Medium      Internal Use       Do nothing
Cold       Medium      Customer Facing    Do nothing
Cold       Medium      Publication Ready  Do nothing
Cold       High        Internal Use       Do nothing
Cold       High        Customer Facing    Convert to PDF
Cold       High        Publication Ready  Convert to PDF
Warm       Low         Internal Use       Convert to ODF
Warm       Low         Customer Facing    Convert to ODF
Warm       Low         Publication Ready  Convert to ODF
Warm       Medium      Internal Use       Convert to ODF
Warm       Medium      Customer Facing    Convert to ODF
Warm       Medium      Publication Ready  Convert to ODF
Warm       High        Internal Use       Convert to ODF
Warm       High        Customer Facing    Publish as PDF
Warm       High        Publication Ready  Publish as PDF
Hot        Low         Internal Use       Convert to ODF
Hot        Low         Customer Facing    Convert to ODF
Hot        Low         Publication Ready  Convert to ODF
Hot        Medium      Internal Use       Convert to ODF
Hot        Medium      Customer Facing    Convert to ODF
Hot        Medium      Publication Ready  Convert to ODF
Hot        High        Internal Use       Reengineer
Hot        High        Customer Facing    Reengineer
Hot        High        Publication Ready  Reengineer

There may be a better way of expressing this (Karnaugh maps, anyone?), but that gives the idea. Also, I’m not suggesting that this is the “one true answer”, merely that this may be a useful way of framing the problem.
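In fact, the 27 rows collapse into a handful of rules. Here is a minimal sketch, in Python, of the table as a decision function. It assumes the three classifications have already been assigned to each document; the lower-cased labels are just illustrative:

    def strategy(stability: str, complexity: str, exposure: str) -> str:
        """The migration policy table above, expressed as a few rules."""
        if stability == "cold":
            # Cold documents rest in peace, except complex external ones,
            # which are frozen as PDF.
            if complexity == "high" and exposure != "internal use":
                return "convert to PDF"
            return "do nothing"
        if complexity == "high":
            if stability == "hot":
                return "reengineer"
            # Warm, high-complexity: keep internal ones editable in ODF.
            return "convert to ODF" if exposure == "internal use" else "publish as PDF"
        return "convert to ODF"

    assert strategy("hot", "high", "internal use") == "reengineer"
    assert strategy("warm", "low", "customer facing") == "convert to ODF"

Expressed this way, the policy is also easier to audit and amend than 27 table rows.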

Variations might include:

  • Have a default policy of doing no conversions, but create all new documents in ODF format.
  • By default, ignore all legacy documents. But the first time any legacy document is read or written, put it into a queue for evaluation and possible conversion.

Much of this lends itself to automation. For example:

  1. First you need to find all of the documents in an organization. This could be done by an ActiveX control on a page everyone in the company visits, by an agent that spiders the intranet web pages and file servers, etc.
  2. Each document is then scored.
  3. Finding the stability of a document could be done by looking at the last-read and last-write timestamps on the file. You could also look at web server logs, or even at metadata in the document that records how many times it has been edited. (See the sketch after this list.)
  4. Complexity could be determined by scanning the document to see what features it uses. Some features, like script, would weight heavily for complexity. Think of it as a “goodness of fit” metric for how well the features used in the document fit within the ODF model.
  5. Business criticality is harder to automate, but could be scored based on the owner of the document, metadata in the document, the location of the document (public web page versus intranet), etc.
  6. Calculate the scores, suggest actions to take, and then automate the action. This could lead to a nice automated migration solution.
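To make points 3 and 4 concrete, here is a minimal sketch, in Python, of how such scoring might work. The thresholds and feature weights are illustrative assumptions, not recommendations:

    import os
    import time

    def stability_score(path):
        """Classify stability from the file's last-write timestamp (point 3)."""
        days_idle = (time.time() - os.path.getmtime(path)) / 86400
        if days_idle < 30:
            return "hot"
        if days_idle < 365:
            return "warm"
        return "cold"

    # Hypothetical weights for a "goodness of fit" complexity score (point 4);
    # scripting weighs most heavily, as suggested above.
    FEATURE_WEIGHTS = {"macros": 10, "embedded objects": 5, "charts": 2}

    def complexity_score(features):
        """Classify complexity from a set of detected document features."""
        score = sum(FEATURE_WEIGHTS.get(f, 1) for f in features)
        if score >= 10:
            return "high"
        if score >= 3:
            return "medium"
        return "low"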

In summary, it probably is not worthwhile simply to go out and convert all of your legacy documents in a giant cathartic orgy of document transformations. Not all documents are worth that effort. In any organization you probably have many, many documents that will never be read again, ever. You also likely have some very complex documents that should be reengineered as web applications on your intranet. The other documents, the ones in the middle: that is where you focus your migration effort.

Filed Under: ODF

Compatibility According to Humpty Dumpty

2007/03/04 By Rob 15 Comments

‘I don’t know what you mean by “glory,” ’ Alice said.

Humpty Dumpty smiled contemptuously. ‘Of course you don’t — till I tell you. I meant “there’s a nice knock-down argument for you!” ’

‘But “glory” doesn’t mean “a nice knock-down argument,” ’ Alice objected.

‘When I use a word,’ Humpty Dumpty said, in a rather scornful tone, ‘it means just what I choose it to mean, neither more nor less.’

‘The question is,’ said Alice, ‘whether you can make words mean so many different things.’

‘The question is,’ said Humpty Dumpty, ‘which is to be master – that’s all.’

— Lewis Carroll from Through the Looking-Glass (1871)

I have written about Microsoft’s language games previously. These games continue and it appears to be time for yet another inoculation. Words such as “open”, “choice”, “interoperability”, “standard”, “innovation” and “freedom” have been bandied about like patriotic slogans, but with meanings that are often distorted from their normal uses.

The aggrieved word I want to examine today is “compatibility”. Let’s see how it is being used, with some illustrative examples, the ipsissima verba, Microsoft’s own words:

From an open letter “Interoperability, Choice and Open XML” by Jean Paoli and Tom Robertson:

The specification enables implementation of the standard on multiple operating systems and in heterogeneous environments, and it provides backward compatibility with billions of existing documents.

From another open letter, Chris Capossela’s “A Foundation for the New World of Documents”:

… all the features and functions of Office can be represented in XML and all your older Office documents can be moved from their binary formats into XML with 100 percent compatibility. We see our investment in XML support as the best way for us to meet customers’ interoperability needs while at the same time being compatible with the billions of documents that customers create every year.

From Doug Mahugh: “The new Open XML file formats offer true compatibility with all of the billions of Office documents that already exist.”

And from Craig Kitterman: “Is backward compatibility for documents important to you? How about choice?”

Those are just a handful of examples. Feel free to leave a comment suggesting additional ones.

Compatibility. Better yet, True Compatibility. What is that? And what do you think the average user, or even the average CTO, thinks, when hearing these claims from Microsoft about 100% compatibility?

Let’s explore some scenarios and try to reverse-engineer Microsoft’s meaning of “True compatibility”.

Suppose you get a new, more powerful PC with more memory and upgraded graphics card and you upgrade to Vista and Office 2007. You create a new presentation in PowerPoint 2007 and save it in the new OOXML format. What can you do with it?

Can you exchange it with someone using Office on the Mac? Sorry, no. OOXML is not supported there. They will not be able to read your document.

Is this 100% compatibility?

What about Windows Mobile? Can I read my document there? Sorry, OOXML is not supported there either.

Is this 100% compatibility?

What about sending the file to your friends using SmartSuite, WordPerfect Office, OpenOffice or KOffice? They are all able to read the legacy Microsoft formats, so surely a new format that is 100% compatible with the legacy formats should work here as well? Sorry, you are out of luck. None of these applications can read your OOXML presentation.

Is this 100% compatibility?

What about legacy versions of Microsoft Office? Can I simply send my OOXML file to a person using an old version of Office and have it load automatically? Sorry, older versions of Office do not understand OOXML. They must either upgrade to Office 2007 or download and install a converter.

Is this 100% compatibility?

I have Microsoft Access XP and an application built on it that reads data ranges from Excel files and imports them into data tables. Will it work with OOXML spreadsheets? Sorry, it will not. You need to upgrade to Access 2007 for this support.

Is this 100% compatibility?

What about other third-party applications that take Office files as input: statistical analysis packages, spreadsheet compilers, search engines, document viewers, etc.? Will they work with OOXML files? No. Until they update their software, your OOXML documents will not work with software that expects the legacy binary formats.

Is this 100% compatibility?

Suppose I, as a software developer, take the 6,039-page OOXML specification and write an application that can read and display OOXML perfectly. It will be hard work, but imagine I do it. Will I then be able to read the billions of legacy Office documents? Sorry, the answer is no. The ability to read and write OOXML does not give you the ability to read and write the legacy formats.

Is this 100% compatibility?

So, there it is. I don’t know if we’re any closer to finding out what “100% compatibility” means to Microsoft. But we certainly have found a lot of things it doesn’t mean.

A quick analogy. Suppose I designed a new DVD format, standardized it, and said it was 100% compatible with the existing DVD standard. What would consumers think this means? Would they think that DVDs in the new format could play in legacy DVD players? Yes, I believe that would be the expectation, based on the normal meaning of “100% compatible”.

But what if I created a new DVD player and said it supported a new DVD format, but also that the player was 100% compatible with the legacy format? What would consumers think then? Would they expect that the new DVDs would play in older players? No, that is not implied. Would they expect that older DVDs could be played in the new player? Yes, that is implied.

This is the essence of Microsoft’s language game. They are confusing the format with the application. This is easy to do when your format is just a DNA sequence of your application. However, although Microsoft Office 2007, the application, may be able to read both OOXML and the legacy formats, the OOXML format itself is not compatible with any legacy application. None. The only way to get something to work with OOXML is to write new code for it.

This is not what people expect when they hear these claims of OOXML being 100% compatible with legacy formats.

Filed Under: OOXML

OASIS Symposium and OpenDocument Workshop

2007/03/01 By Rob Leave a Comment

OASIS will have its annual Symposium April 15th-17th in San Diego, with the theme “eBusiness and Open Standards: Understanding the Facts, Fiction, and Future”. It should be noted that this is not a real symposium, where guests recline on couches, drink wine and discuss philosophy to the accompaniment of flute-girls. On the other hand, it will have a lot of ODF, which is almost as good.

Bob Sutor will give the opening keynote. Scott Hudson will give a talk on, “DocTape: A Document Standards Interoperability Framework for DocBook, DITA, ODF and more!”. I’ll be joining a panel on Tuesday looking at ODF Interoperability and related topics. And Wednesday will be a half-day Workshop on ODF, with presentations on adoption, programmability, accessibility, interoperability and future directions.

Then back home on Thursday, my birthday. This gives my wife the rare opportunity to get a large present into the house without me noticing. Hint, hint…

Filed Under: ODF

Essential and Accidental in Standards

2007/02/25 By Rob 16 Comments

The earliest standards were created to support the administration of the government, which in antiquity primarily consisted of religion, justice, taxation and warfare. Crucial standards included the calendar, units of length, area, volume and weight, and uniform coinage.

Uniform coinage in particular was a significant advance. Previously, financial transactions occurred only by barter or by exchanging lumps of metals of irregular purity, size and shape, called “aes rude”. With the mass production of coins of uniform purity and weight imprinted with the emperor’s portrait, money could now be exchanged by simply counting, a vast improvement over having to figure out the purity and weight of an arbitrary lump of metal. Standards reduced the friction of commercial transactions.

Cosmas Indicopleustes, a widely-traveled merchant, later a monk, writing in the 6th Century, said:

The second sign of the sovereignty which God has granted to the Romans is that all nations trade in their currency, and in every place from one end of the world to the other it is acceptable and envied by every other man and every kingdom

“You can see a lot just by observing,” as Yogi Berra once said. A coin can be read much like a book. So, what can you see by reading a coin, and what does this tell us about standards?

To the left are examples from my collection of a single type of coin. The first picture shows the obverse of one instance, followed by the reverse of eight copies of the same type.

The legend on the obverse is “FLIVLCONSTANTIVSNOBC”. The text is highly abbreviated and there are no breaks between the words, as is typical of classical inscriptions, whether on coins or monuments. Brass or marble was expensive, so space was not wasted. We can expand this inscription to “Flavius Julius Constantius Nobilissimus Caesar”, which translates to “Flavius Julius Constantius, Most Noble Caesar”.

So this is a coin of Constantius II (317-361 A.D.), the middle son of Constantine the Great. The fact that he is styled “Caesar” rather than “Augustus” indicates that this coin dates from his days as heir-designate (324-337), prior to his father’s death. We know from other evidence that this type of coin was current around 330-337 A.D.

There is not much else interesting on the obverse. Imperial portraits had become stylized so much by this period that you cannot really tell one from the other purely by the portrait.

The reverse is a bit more interesting. When you consider that such coins were produced by the millions and circulated to the far corners of the empire, it is clear that the coins could have propaganda as well as monetary value. In this case, the message is clear. The legend reads “Gloria Exercitus”, or “The Glory of the Army”. Since the army’s favor was usually the deciding factor in determining succession, a young Caesar could never praise the army too much. Not coincidentally, Constantius’s brothers, also named as Caesars, also produced coins praising the army before their father’s death.

At the bottom of the reverse, in what is called the “exergue”, is where we find the mint marks, telling where the coin was minted, and even which group or “officina” within the mint produced the coin. From the mint marks, we see that these particular coins were minted in Siscia (now Sisak, Croatia), Antioch (Antakya, Turkey), Cyzicus (Kapu-Dagh, Turkey), Thessalonica (Thessaloníki, Greece) and Constantinople (Istanbul, Turkey).

The image on the reverse shows two soldiers, each holding a spear and shield, with two standards, or “signa militaria”, between them. The standard was of vital importance on the battlefield, providing a common point by which the troops could orient themselves in their maneuvers. These standards appear to be of the type used by the Centuries (units of 100 men) rather than the legionary standard, which would have the imperial eagle on top. You can see a modern recreation of a standard here.

If you look closely, you’ll notice that the soldiers on these coins are not holding the standards (they already have a spear in one hand and a shield in the other), and they lack the animal-skin headpiece traditional to a standard bearer or “signifer”. So this tells us that these are merely rank-and-file soldiers encamped, with the standards stuck into the ground.

If you compare the coins carefully you will note some differences, for example:

  • Where the breaks in the legend occur. Some break the inscription into “GLOR-IAEXERC-ITVS”, while others have “GLORI-AEXER-CITVS”. Note that neither matches word boundaries.
  • The uniforms of the soldiers differ, in particular the helmets. Also, the second coin has the soldiers wearing a sash that the other coins lack. This may reflect legionary or regional differences.
  • The standards themselves have differences, in the number of disks and in the shape of the topmost ornament.
  • The stance of the soldiers varies. Compare the orientation of their spear arm, the forward foot and their vertical alignment.

There are also differences in quality. Consider the mechanics of coin manufacture. These were struck, not cast. The dies were hand-engraved in reverse (intaglio) and hand-struck with hammers into bronze planchets. The engravers, called “celators”, varied in their skills. Some were clearly better at portraits. Others were better at inscriptions (note the serifs in the ‘A’ and ‘X’ of the 4th coin). Some made sharp, deep designs that would last many strikes. Others had details that were too fine and wore away quickly. Since these coins are a little under 20 mm in diameter, and the dies were engraved by hand, without optical aids, there is considerable skill demonstrated here, even though this time period is pretty much the artistic nadir of classical numismatics.

Despite the above differences, to an ancient Roman, all of these coins were equivalent. They all would have been interchangeable, substitutable, all of the same value. Although they differed slightly in design, they matched the requirements of the type, which we can surmise to have been:

  • obverse portrait of the emperor with the prescribed legend
  • reverse two soldiers with standards, spears and shields with the prescribed legend
  • mint mark to indicate which mint and officina made the coin
  • specified purity and weight

I’d like to borrow two terms from metaphysics: “essential” and “accidental”. The essential properties are those which an object must necessarily have in order to belong to a particular category. Other properties, those which are not necessary to belong to that category, are termed “accidental” properties. These coins are all interchangeable because they all share the same essential properties, even though they differ in many accidental properties.

As another example, take corrective eye-glasses. A definition in terms of the essential properties might be, “Corrective eyeglasses consist of two transparent lenses, held in a frame, to be worn on the face to correct the wearer’s vision.” Take away any essential property and they are no longer eyeglasses. Accidental properties might include the material used to make the lenses, the shape and color of the frame, whether single or bifocals, the exact curvature of the lens etc.

The distinction between the essential and accidental is common wherever precision in words is required, in legislation, in regulations, in contracts, in patent applications, in specifications and in standards. There are risks in not specifying an essential property, as well as in specifying an accidental property. Underspecification leads to lower interoperability, but over-specification leads to increased implementation cost, with no additional benefit.

Technical standards have dealt with this issue in several ways. One is through the use of tolerances. I talked about light bulbs in a previous post and the Medium Edison Screw. The one I showed, type A21/E26, has an allowed length range of 125.4-134.9 mm. An eccentricity of up to 3 degrees is allowed along the base axis. The actual wattage may be as much as 4% plus 0.5 watts greater than specified, so a nominal 60-watt bulb may legitimately draw up to 60 × 1.04 + 0.5 = 62.9 watts. Is this just an example of a sloppy standard? Why would anyone use it if it allows 4% errors? Why not have a standard that tells exactly how large the bulb should be?

The point is that bulb sockets are already designed to accept this level of tolerance. Making the standard more precise would do nothing but increase manufacturing costs, while providing zero increase in interoperability. The reason why we have cheap and abundant lamps and light bulbs is that their interconnections are standardized to the degree necessary, but no more so.

There is often a sweet spot in standardization that gives optimal interoperability at minimal cost. Specify less than this and interoperability suffers. Specify more than that and implementation costs increase, though with diminished interoperability returns.

An allowance for implementation-dependent behaviors is another technique a standard can use to find that sweet spot. A standard can define some constraints, but explicitly state that others are implementation-dependent, or even implementation-defined. (Implementation-defined goes beyond implementation-dependent in that an implementation not only chooses its own behavior in that area, it must also document the behavior it chose.) For example, in the C and C++ programming languages the size of an integer is not specified; it is declared to be implementation-defined. Because of this, C and C++ programs are not as interoperable as, say, Java or Python programs, but they are better able to adapt to a particular machine architecture, and in that way achieve better performance. And even Java specifies that some threading behavior is implementation-dependent, knowing that runtime performance would be significantly enhanced if implementations could directly use native OS threads. Even with these implementation-dependent behaviors, C, C++ and Java have been extremely successful.
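You can observe these implementation-defined choices directly. A quick Python illustration, using ctypes to report the sizes the platform’s C implementation chose:

    import ctypes

    # The C standard leaves these sizes implementation-defined; ctypes
    # simply reports the platform C compiler's choices.
    print(ctypes.sizeof(ctypes.c_int))   # commonly 4 bytes
    print(ctypes.sizeof(ctypes.c_long))  # 4 on 64-bit Windows, 8 on 64-bit Linux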

Let’s apply this line of reasoning to document file formats. Whether you are talking about ODF or OOXML, there are pieces left undefined, important things. For example, neither format specifies the exact pixel-perfect positioning of text. There is a common acceptance that issues of text kerning and font rasterization do not belong in the file format, but instead are decisions best deferred to the application and operating environment, so they can decide based on factors such as the availability of fonts and the desired output device. Similarly, a color can be specified in RGB or another color space, but these are not exact spectral values. Red may appear one way on a CRT under fluorescent light, another way on an LCD monitor in darkness, and another way on color laser output read under tungsten lighting. An office document does not specify this level of detail.

In the end, a standard is defined as much by what it does not specify as by what it does specify. A standard that specifies everything can easily end up being merely the DNA sequence of a single application.

A standard provides interoperability within certain tolerances and with allowances for implementation-dependent behaviors. A standard can be evaluated based on how well it handles such concerns. Microsoft’s Brian Jones has criticized ODF for having a flexible framework for storing application-specific settings. He lists a range of settings that the OpenOffice application stores in its documents, and compares that to OOXML, where such settings are part of the standardized schema. But this makes me wonder: where, then, does one store application-dependent settings in OOXML? For example, when Novell completes support of OOXML in OpenOffice, where would OpenOffice store its application-dependent settings? The Microsoft-sponsored ODF Add-in for Word project has made a nice list of ODF features that cannot be expressed in OOXML. These will all need to be stored someplace, or else information will be lost when down-converting from ODF to OOXML. So how should OpenOffice store these when saving to OOXML format?

There are other places where OOXML seems to have considered the needs of Microsoft Office, but not those of other implementors. For example, section 2.15.2.32 of the WordprocessingML Reference defines an “optimizeForBrowser” element which allows the notation of optimization for Internet Explorer, but no provision is made for Firefox, Opera or Safari.

Section 2.15.1.28 of the same reference specifies a “documentProtection” element:

This element specifies the set of document protection restrictions which have been applied to the contents of a WordprocessingML document. These restrictions shall be enforced by applications editing this document when the enforcement attribute is turned on, and should be ignored (but persisted) otherwise.

This “protection” relies on storing a hashed password in the XML and comparing it to the hash of the password the user enters, a familiar technique. But rather than using a secure hash algorithm, SHA-256 for example, or any other FIPS-compliant algorithm, OOXML specifies a legacy algorithm of unknown strength. Now, I appreciate the need for Microsoft to have legacy compatibility. They fully acknowledge that the protection scheme they provide here is not secure and is only there for compatibility purposes. But why isn’t the standard flexible enough to allow an implementation to use a different algorithm, one that is secure? Where is the allowance for innovation and flexibility?
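For illustration only, here is a minimal Python sketch of the kind of flexibility being asked for: record the algorithm’s name alongside the salt and hash, so one implementation can adopt a stronger algorithm while others can still verify. The record layout is hypothetical, not anything in OOXML:

    import hashlib
    import hmac
    import os

    def protect(password, algorithm="sha256"):
        """Produce a protection record that names its own hash algorithm."""
        salt = os.urandom(16)
        digest = hashlib.new(algorithm, salt + password.encode("utf-8")).digest()
        return {"algorithm": algorithm, "salt": salt, "hash": digest}

    def verify(password, record):
        """Verify against whatever algorithm the record declares."""
        digest = hashlib.new(record["algorithm"],
                             record["salt"] + password.encode("utf-8")).digest()
        return hmac.compare_digest(digest, record["hash"])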

What makes this worse is that Microsoft’s DRM-based approach to document protection, from Office 2003 and Office 2007, is entirely undocumented in the OOXML specification. So we are left with a standard with a broken protection feature that we cannot replace, while the protection that really works is in Microsoft’s proprietary extensions to OOXML that we are not standardizing. How is this a good thing for anyone other than Microsoft?

Section 2.15.3.54 defines an element called “uiCompat97To2003” which is specified simply as, “Disable UI functionality that is not compatible with Word97-2003”. But what use is this if I am using OOXML in OpenOffice or WordPerfect Office? What if I want to disable UI functionality that is not compatible with OpenOffice 1.5? Or WordPerfect 8? Or any other application? Where is the ability for other implementations to specify their preferences?

It seems to me that OOXML does in fact have application-dependent behaviors, but only for Microsoft Office, and that Microsoft has hard-coded these application-dependent behaviors into the XML schema, without tolerance or allowance for any other implementation’s settings.

Something does not cease to be application-dependent just because you write it down. It ceases to be application-dependent only when you generalize it and accommodate the needs of more than one application.

Certainly, any application that stores document content or styles or layout as application-dependent settings, rather than in the defined XML standard, should be faulted for doing so. But I don’t think anyone has really demonstrated that OpenOffice does this. It would be easy enough to demonstrate if it were true: delete the settings.xml from an ODF document (and the reference to it from the manifest) and show that the document renders differently without it. If it does, then submit a bug report against OpenOffice or (since this is open source) submit a patch to fix it. A misuse of application settings is that easy to fix.
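Since an ODF document is just a ZIP package, that test is easy to script. A rough Python sketch, assuming a typical package layout (the file names are placeholders, and a fully conformant copy would also keep the mimetype entry first and uncompressed):

    import re
    import zipfile

    def strip_settings(src, dst):
        """Copy an ODF package, dropping settings.xml and its manifest entry."""
        with zipfile.ZipFile(src) as zin, \
             zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
            for item in zin.infolist():
                if item.filename == "settings.xml":
                    continue  # drop the application-settings stream
                data = zin.read(item.filename)
                if item.filename == "META-INF/manifest.xml":
                    # Remove the file-entry that references settings.xml.
                    data = re.sub(
                        rb'<manifest:file-entry[^>]*"settings\.xml"[^>]*/>',
                        b"", data)
                zout.writestr(item, data)

    strip_settings("report.odt", "report-stripped.odt")
    # If the stripped copy renders differently, file the bug (or the patch).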

But a standard that mistakes the accidental, application-dependent properties of a single vendor’s application for essential properties that everyone should implement, and does so without tolerance or allowance for other implementations, is certainly a step back for choice and for the freedom of applications to innovate in the marketplace. Better to use ODF, the file format that has multiple implementations, acknowledges the propriety of sensible application-dependent behaviors, and provides a framework for recording such settings.


3/16/2007 — added Cosmas Indicopleustes quote

Filed Under: Numismatics, OOXML, Standards

