OOXML

A bit about the bit with the bits

2006/10/15 By Rob 16 Comments

I had an interesting meal in Paris a few weeks ago at a small bistro. I like Louisiana Cajun-style food, especially spicy andouille sausage, so when I saw “andouillette” on the menu, my stomach grumbled in anticipation. Certainly, the word ended in “ette”, but even my limited knowledge of French told me that this is just a diminutive ending. So maybe these sausages were small. No big deal, right?

When my lunch arrived, something was not quite right. First, this did not smell like any andouille sausage I had ever had. It was a familiar scent, but I couldn’t quite place it. But as soon as I cut into the sausage, and the filling burst out of the casing, it was clear what I had ordered. Tripe. Chitterlings. Pig intestines. With french fries.

I then knew where I had smelt this before. My grandfather, a Scotsman, was fond of his kidney pies and other dishes made of “variety meats”. This is food from an earlier time. The high fat content, and (in earlier days at least) cheaper prices of these cuts of meat provided essential meals for the poor. Although my grandfather ate these dishes out of preference, I’m pretty sure that his grandfather ate them out of necessity. How times change.

This was brought to mind recently as I was reading the “final draft” of the Ecma Office Open XML (OOXML) specification. It contains something that was probably once done out of necessity in the memory-poor world of 1985, but now looks like an anachronism in the modern world of XML markup.

I’m talking about bitmasks. If you are a C programmer then you know already what I am talking about.

In C, imagine you want to store values for a number of yes/no (Boolean) type questions. C does not define a Boolean type, so the convention is to use an integer type and set it to 1 for true, and 0 for false. (Or in some conventions, 0 for true and anything else for false. Long story.) The smallest variable you can declare in C is a “char” (character) type, on most systems 8 bits (1 byte long) or even padded to a full 16 bits. But the astute reader will notice that a yes/no boolean question is really expressing only 1 bit of information, so storing it in an 8 bit character is a waste of space.

Thus the bitmask, a technique used by C programmers to encode multiple values into a single char (or int or long) variable by ascribing meaning to individual bits of the variables. For example, an 8-bit char can actually store the answer to 8 different yes/no questions, if we think of it in binary. So 10110001 is Yes/No/Yes/Yes/No/No/No/Yes. Expressed as an integer, it can be stored in a single variable, with the value of 177 (the decimal equivalent of 10110001).

The C language does not provide a direct way to set or query the value of an individual bit, but it does provide some “bitwise” operators that can be used to indirectly set and query bits in a bitmask. So if you want to test whether the fifth bit (counting from the right) is true, you do a bitwise AND with the number 16 and see if the result is anything other than zero. Why 16? Because 16 in binary is 00010000, so doing a bitwise AND will extract just that single bit. Similarly, you can set a single bit by doing a bitwise OR with the right value. This is one of the reasons why facility with binary and hexadecimal representations is important for C programmers.
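The test-and-set technique just described can be sketched in a few lines of C. This is only an illustration: the helper names (bit_mask, test_bit, set_bit) are made up for this example, and bits are numbered from the right starting at 1, to match the text.

```c
#include <assert.h>

/* Hypothetical helper: a mask with only the n-th bit set, counting from
   the right and starting at 1. bit_mask(5) is 16, binary 00010000. */
unsigned char bit_mask(unsigned n)
{
    assert(n >= 1 && n <= 8);   /* an 8-bit char holds eight yes/no flags */
    return (unsigned char)(1u << (n - 1));
}

/* Query: a bitwise AND isolates the bit; any nonzero result means "yes". */
int test_bit(unsigned char flags, unsigned n)
{
    return (flags & bit_mask(n)) != 0;
}

/* Set: a bitwise OR turns the bit on and leaves the other seven alone. */
unsigned char set_bit(unsigned char flags, unsigned n)
{
    return (unsigned char)(flags | bit_mask(n));
}
```

With the earlier value 177 (binary 10110001), test_bit(177, 5) answers yes and test_bit(177, 2) answers no. None of that is visible to someone who only sees the number 177.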

So what does this all have to do with OOXML?

Consider this C-language declaration:
typedef struct tagLOCALESIGNATURE {
    DWORD lsUsb[4];
    DWORD lsCsbDefault[2];
    DWORD lsCsbSupported[2];
} LOCALESIGNATURE, *PLOCALESIGNATURE;

This is described on MSDN as a memory structure for storing:

…extended font signature information, including two code page bitfields (CPBs) that define the default and supported character sets and code pages. This structure is typically used to represent the relationships between font coverage and locales.

Compare this data structure to the XML defined in section 2.8.2.16 (page 759) of Volume 4 of the OOXML final draft:

The astute reader will notice that this is pretty much a bit-for-bit dump of the Windows SDK memory structure. In this case the file format specification provides no abstraction or generalization. It merely is a memory dump of a Windows data structure.

This is one example of many. Other uses of bitmasks in OOXML include things such as:

paragraph conditional formatting
table cell conditional formatting
table row conditional formatting
table style conditional formatting settings exception
pane format filter

If this all sounds low-level and arcane, then you perceive correctly. I like the obscure as much as the next guy. I can recite Hammurabi in Old Babylonian, Homer in Greek, Catullus in Latin and Anonymous in Old English. But when it comes to an XML data format, I seek to be obvious, not obscure. Manipulating bits, my friends, is obscure in the realm of XML.

Why should you care? Bitmasks are used by C programmers, so why not in XML? One reason is that addressing bits within an integer runs into platform-specific byte-ordering differences. Different machine processors (physical and virtual) make different assumptions. Two popular conventions go by the names of Big-endian and Little-endian. It would divert me too far from my present argument to explain the significance of that, so if you want more detail on that I suggest you seek out a programmer with grey hairs and ask him about byte-ordering conventions.

A second reason to avoid bitmasks in XML is that they sit outside the XML data model. You’ve created a private data model inside an integer and it cannot be described or validated by XML Schema, RELAX NG, Schematron, etc. Even XSLT, the most-used method of XML transformation today, lacks functions for bit-level manipulations. TC45’s charter included the explicit goal of:

…enabling the implementation of the Office Open XML Formats by a wide set of tools and platforms in order to foster interoperability across office productivity applications and with line-of-business systems

I submit that the use of bitmasks is not the thing to do if you want support in a “wide set of tools and platforms”. It can’t be validated and it can’t be transformed.

Thirdly, the reasons for using bitmasks in the first place are not relevant in XML document processing. Don’t get me wrong. I’m not saying bit-level data structures are always wrong on all occasions. They are certainly the bread and butter of systems programmers, even today, and they were truly needed in the days when data was transferred via XModem on 12kbps lines. But in XML, when the data is already in an expansive text representation to facilitate cross-platform use, trying to save a byte of storage here or there, at the expense of the additional code and complexity required to deal with bitmasks, is the wrong trade-off. Remember, in the end the XML gets zipped up anyway, and will typically end up at 10-20% of the size of the same document in DOC format. So these bitmasks aren’t really saving you much, if any, storage.

Fourthly, bitmasks are not self-describing. If I told you the “table style conditional formatting exception” had the value of 32, would that mean anything to you? Or would it send you hunting through a 6,000+ page specification in search of a meaning? But what if I told you that the value was “APPLY_FIRST_ROW”, then what would you say? A primary virtue of XML is that it is humanly readable. Why throw that advantage away?

Finally, there are well supported alternatives to bitmasks in standard XML, such as enumeration types in XML Schema. Why avoid a data representation that allows both validation and manipulation by common XML tools?

It seems to me that the only reason that bitmasks were used here is that the Excel application already used them. Much easier for Microsoft to make the specification match the source code than to make a standard that is good, platform and application neutral XML.

So, for the second time in a month the thought enters my mind: “You expect me to eat this tripe?!”

When language goes on holiday

2006/10/15 By Rob 4 Comments

This apt phrase is from Wittgenstein, Philosophical Investigations, section 38, “Philosophical problems arise when language goes on holiday”. One cannot be sloppy in language without at the same time being sloppy in thought.

Of course, this thought is not new. In Analects 13:3, Confucius is given a hypothetical question by a disciple: “If the ruler of Wei put the administration of his state in your hands, what would you do first?”. Confucius replied, “There must be a Rectification of Names,” explaining:

If language is not correct, then what is said is not what is meant; if what is said is not what is meant, then what must be done remains undone; if this remains undone, morals and art will deteriorate; if justice goes astray, the people will stand about in helpless confusion. Hence there must be no arbitrariness in what is said. This matters above everything.

In that spirit, let us talk of “choice”, a word loaded with meaning. Choice is good, right? Who would voluntarily give up their god-given right to choose for themselves? Reducing choice is immoral. A central role of government is to ensure that we can choose freely. For a market to thrive it must be free of every regulation that reduces our ability to choose. These are all self-evident truths.

Or are they?

Let me set you a problem. I place before you a glass of water. Whether it is half full or half empty I leave to your imagination. What use is this glass of water to you? Certainly you can drink it. Or you could sell it to someone else. Or you could create a derivative option to buy the water, and sell this option to someone else. Or you could pledge the water as collateral for some other purchase. You have several options, several choices. But suppose you are thirsty. Then what do you do with this nice, cold glass of water? If you drink it, then you can no longer sell it, sell options on it, or pledge it. Drinking the water eliminates choice. So better not to drink it. Just let it sit there, on the table. But still you get thirstier and thirstier.

What a cruel dilemma I’ve given you! You cannot drink without reducing your future options, without eliminating choice. Of course, the water slowly gets warmer and evaporates. Even not choosing is itself a choice.

The Moving Finger writes; and, having writ,
Moves on: nor all your Piety nor Wit
Shall lure it back to cancel half a Line,
Nor all your Tears wash out a Word of it.
— Omar Khayyam

How are we to make sense of this paradox? The fact is that every decision, every choice you make, commits you and eliminates some other choices. We choose because without choosing we cannot claim the value in a single path among alternatives. If you want to quench your thirst then you must drink the water. It is that simple.

So I’ve found it amusing to see how Microsoft and their supporters constantly attack open source and open standards on the grounds that they reduce choice. For example, Microsoft’s lobbying arm, with the Orwellian doublespeak name “The Freedom to Innovate Network” lists this among its policy talking points:

[G]overnments should not freeze innovation by mandating use of specific technology standards

This talking point is picked up and repeated. Open Malaysia picks up on a local news article which quoted a Microsoft director speaking on Malaysia’s move toward favoring Free and Open Source Software (FOSS) in government procurements:

My opinion is that it [the policy] limits choice as the country has a software procurement preference policy

The Initiative For Software Choice is the latest face on the hundred-headed hydra spreading FUD around the world. However they have recently had the embarrassment of seeing an example of their handiwork leaked to the press which is worth a read in full.

This in itself is neither new nor news, but it just recently occurred to me that this is all just an abuse of language, with no substance behind it. When one adopts a technology standard one does it with some desired outcome in mind. One chooses this path in order to receive that benefit. Adopting a standard is like drinking a glass of water. You do it because you are thirsty.

A recent Danish report (the “Rambøll Report”) looked at the significant cost savings of moving the Danish government to OpenOffice/ODF compared to using MS Office with OOXML. Is it wrong to choose a less expensive alternative? Or is it better not to choose at all, and forgo the cost savings?

I think we need to all ask ourselves what we thirst for. Are you suffering from vendor lock-in? Are your documents tied to a single platform and vendor? Are you overpaying for software of which you use only a fraction of the functionality? Are you unable to move to a more robust desktop platform because your application vendor has tied its applications to a single platform? If you are thirsty, I have one word of advice: “Drink”.

A Leap Back

2006/10/12 By Rob

1/23/2007 — A translation of this post into Spanish has been provided by a reader. You can find it in the Los Trylobytes blog.

I’ve also taken this opportunity to update page and section references to refer to the final approved version of the Ecma Office Open XML specification, as well as providing a link to the final specification.

Early civilizations tried to rationalize the motions of the heavenly bodies. The sun rises and sets and they called that length of time a “day”. The moon changes phases and they called a complete cycle a “month”. And the sun moves through the signs of the zodiac and they called that a “year”. Unfortunately, these various lengths of time are not nice integral multiples of each other. A lunar month is not exactly 30 days. A solar year is not exactly 12 lunar months.

To work around these problems, civil calendars were introduced — some of the world’s first international standards — to provide a common understanding of date reckoning, without which commerce, justice and science would remain stunted.

In 45 B.C., Julius Caesar directed that an extra day be added to February every four years. (Interestingly, this extra day was not a February 29th as we have today in leap years, but was created by making February 24th last for two days.) This Julian system was in use for a long time, though even it has slight errors. By having a leap year every four years, we had 100 leap years every 400 years. However, to keep the seasons aligned properly with church feasts, etc., (who wants to celebrate Easter in Winter?) it was necessary to have only 97 leap years every 400 years.

So, in 1582 Pope Gregory XIII promulgated a new way of calculating leap years, saying that years divisible by 100 would be leap years only if they were also evenly divisible by 400. So, the years 1600 and 2000 were leap years, but 1700, 1800 and 1900 were not leap years. This Gregorian calendar was initially adopted by Catholic nations like Spain, Italy, France, etc. Protestant nations had pretty much adopted it by 1752, and Orthodox countries followed later, Russia in 1918 after its revolution and Greece in 1923.
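The two leap year rules, and the 100-versus-97 arithmetic behind them, can be written down in a few lines of C. The function names here are my own, chosen only for illustration.

```c
#include <assert.h>

/* Julian rule: every fourth year is a leap year. */
int is_julian_leap_year(int year)
{
    assert(year > 0);
    return year % 4 == 0;
}

/* Gregorian rule: a year divisible by 100 is a leap year only if it is
   also divisible by 400; otherwise the every-fourth-year rule applies. */
int is_gregorian_leap_year(int year)
{
    assert(year > 0);
    if (year % 400 == 0) return 1;
    if (year % 100 == 0) return 0;
    return year % 4 == 0;
}

/* Count the leap years from first to last (inclusive) under either rule. */
int count_leap_years(int first, int last, int (*rule)(int))
{
    int count = 0;
    for (int year = first; year <= last; year++) {
        if (rule(year)) count++;
    }
    return count;
}
```

Counting over any 400-year span, for example 1601 through 2000, gives 100 leap years under the Julian rule and 97 under the Gregorian rule. The three that drop out are 1700, 1800 and 1900.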

So, for most of the world, the Gregorian calendar has been the law for 250-425 years. That’s a well-established standard by anyone’s definition. Who would possibly ignore it or get it wrong at this point?

If you guessed “Microsoft”, you may advance to the head of the class.

Datetimes in Excel are represented as date serial numbers, where dates are counted from an origin, sometimes called an epoch, of January 1st, 1900. The problem is that from the earliest implementations Excel got it wrong. It thinks that 1900 was a leap year, when clearly it wasn’t under Gregorian rules, since it is divisible by 100 but not by 400. This error causes functions like the WEEKDAY() spreadsheet function to return incorrect values in some cases. See the Microsoft support article on this issue.

Now I have no problems with that bug remaining in Excel for backwards compatibility reasons. That’s an issue between Microsoft and their customers and not my concern. However, I am quite distressed to see this bug promoted into a requirement in the Ecma Office Open XML (OOXML) specification. From Section 3.17.41 of SpreadsheetML Reference Material, page 3305 of the OOXML specification (warning: 49MB PDF download!), “Date Representation”:

For legacy reasons, an implementation using the 1900 date base system shall treat 1900 as though it was a leap year. [Note: That is, serial value 59 corresponds to February 28, and serial value 61 corresponds to March 1, the next day, allowing the (nonexistent) date February 29 to have the serial value 60. end note] A consequence of this is that for dates between January 1 and February 28, WEEKDAY shall return a value for the day immediately prior to the correct day, so that the (nonexistent) date February 29 has a day-of-the-week that immediately follows that of February 28, and immediately precedes that of March 1.

So the new OOXML standard now contradicts 400 years of civil calendar practice, encodes nonexistent dates and returns the incorrect value for WEEKDAY()? And this is the mandated normative behavior? Is this some sort of joke?

The “legacy reasons” argument is entirely bogus. Microsoft could easily have defined the XML format to require correct dates and managed the compatibility issues when loading/saving files in Excel. A file format is not required to be identical to an application’s internal representation.

Here is how I would have done it. Define the OOXML specification to encode dates using serial numbers that respect the Gregorian leap year calculations used by 100% of the nations on the planet. Then, if Microsoft desires to maintain this bug in their product, have Excel add 1 to every date serial number of 60 or greater when loading, and subtract 1 from every such date when saving an OOXML file. This is not rocket science. In any case, don’t mandate the bug for every other processor of OOXML. And certainly don’t require every person who wants the correct day of the week in 1900 to perform an extra calculation.
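To show how little code this takes, here is a rough sketch in C of the load/save adjustment. Everything in it is illustrative: the function names are invented, “correct serial” means a serial number counted on the real Gregorian calendar with January 1st, 1900 as serial 1, and returning -1 for the phantom February 29th is my own assumption about what a saver might do with a date that never existed. The weekday helper relies on the fact that January 1st, 1900 was a Monday.

```c
#include <assert.h>

/* In the legacy 1900 system, serial 60 is the nonexistent Feb 29, 1900. */
#define LEGACY_PHANTOM_SERIAL 60

/* Loading: correct serial -> legacy serial. The correct serial 60 is
   March 1, 1900, which the legacy system numbers 61. */
int to_legacy_serial(int correct_serial)
{
    assert(correct_serial >= 1);
    return correct_serial >= 60 ? correct_serial + 1 : correct_serial;
}

/* Saving: legacy serial -> correct serial. The phantom day has no real
   counterpart, so this sketch reports it as -1 (an assumption). */
int to_correct_serial(int legacy_serial)
{
    assert(legacy_serial >= 1);
    if (legacy_serial == LEGACY_PHANTOM_SERIAL) return -1;
    return legacy_serial > LEGACY_PHANTOM_SERIAL ? legacy_serial - 1
                                                 : legacy_serial;
}

/* Day of the week for a correct serial: 0 = Sunday ... 6 = Saturday.
   Serial 1 (January 1, 1900) was a Monday, so serial % 7 is enough. */
int weekday_of_correct_serial(int correct_serial)
{
    assert(correct_serial >= 1);
    return correct_serial % 7;
}
```

With correct serials, the day of the week needs no special case for the first two months of 1900. The only awkward value is the legacy serial 60, and that awkwardness stays inside the one product that invented it.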

Sure, this requires extra code to be added to Excel. Excel has a bug. Of course it will require code to fix a bug. Deal with it. I think the alternative of forcing the rest of the world to adopt a new calendar system is the ultimate in chutzpah. The burden of a bug should fall on the product that has the bug, not on everyone else in the world.

Further, I’d note that section 3.2.28 (page 2693) defines a workbookPr (Workbook Properties) element with several attributes including the following flag:

date1904 (Date 1904)

Specifies a boolean value that indicates whether the date systems used in the workbook starts in 1904.

A value of on, 1, or true indicates the date system starts in 1904.
A value of off, 0, or false indicates the workbook uses the 1900 date system, where 1/1/1900 is the first day in the system.

The default value for this attribute is false.

What is so special about 1904 you might ask? This is another legacy problem with Excel, that implementations of Excel on the Mac, for reasons unknown to me, had an internal date origin of January 1st, 1904 rather than January 1st, 1900. This is unfortunate for Microsoft’s Mac Business Unit, and has likely been a source of frustration for them, needing to maintain these two date origins in their internal code.

But why is this my problem? Why should a standard XML format care about what Excel does on the Mac? Why should it care about any vendor’s quirks? If RobOffice (a fictional example) wants to internally use a date origin of March 15th, 1903 then that is my business. In my implementation I can do whatever I want. But when it comes to writing a file format standard, the caprices of my implementation should not become a requirement for all other users of the file format. Further, if I cannot make up my mind and choose a single date origin, then my indecision should not force other implementations to carry extra code.

So there you have it, two ways in which Microsoft has created a needlessly complicated file format, and made your life more difficult if you are trying to work with this format, all to the exclusive advantage of their implementation. I wish I could assure you that this is an isolated example of this approach in OOXML. But sadly, it is the rule, not the exception.

The OOXML Compatibility Pack

2006/09/06 By Rob

Just saw something worth noting. I was on a machine running Office XP and tried to open an Office Open XML (OOXML) formatted document. I don’t know why I tried that, but I did.

Word was smart enough to put up the following dialog:

Now, that is something I hadn’t seen before. I think we all knew that Microsoft was planning a compatibility pack for enabling OOXML on Office 2003 and Office XP. But my 2002 version of Windows XP knows about OOXML? I guess this wisdom must have come down in a previously downloaded Office patch.

In any case, if you click yes, you are directed to this page where you are offered a download of “Microsoft Office Compatibility Pack for Word, Excel, and PowerPoint 2007 File Formats (Beta 2)”. I had the pre-req’s, which included Windows XP SP2 and Office XP SP 3. So downloading a few file conversion filters should be simple and small, right?

Well, simple, but not so small. I was surprised to see that the converters download was 43MB. That seems a bit large. In comparison, you can download a complete copy of OpenOffice.org, with included support for ODF documents and the Office binary formats, and the entire product is only a 93MB download. The 0.2 ODF Add-in for Word is only 1MB in size. So why does adding OOXML support to Office XP require a 43MB download?

In any case, once it is downloaded and installed, the integration with Office appears seamless. You can open OOXML files from the Windows Explorer by double-clicking on them, you can browse and load them as expected from the File Open dialog in Office, you can re-save files in OOXML format via File Save, you can create a new document and save it as OOXML, and you can even configure Word XP so that the OOXML formats are the default format for all saved documents in Word. In fact, you can do all of those things that the Microsoft-supported ODF Add-in is not doing.

As reported earlier, the Microsoft support for ODF puts this ISO standard at a distinct disadvantage, providing no shell integration, removing it from its expected place in the File/Open and File/Save menus, and preventing users from making it the default format in Office.

So, let’s update the file format support matrix:

Criterion	DOC Format in OpenOffice	ODF Format in Word 2007	OOXML Format in Word XP
1. Format supported in default install	Yes.	No. Requires a download and install of a separate, unsupported Add-in.	No, but you are prompted to download a free converter pack the first time you attempt to open an OOXML file.
2. File Open integration	Yes.	No. ODF is not listed in the default File Open dialog and doing a Control-O will not show ODF documents. However, ODF import is available in a separate menu item elsewhere in the menu system.	Yes.
3. Save new document integration	Yes.	No. In fact no ODF save ability exists in the current version of the Add-in. There is a place holder for the ODF save operation, though it is on its own menu, and would not be shown when doing a simple Control-S to save a new document.	Yes.
4. Can be made the default format	Yes.	No. Although other non-Microsoft formats, such as “Plain Text” can be made the default format, ODF cannot.	Yes.
5. Simple round-tripping	Yes.	No. When an ODF document is loaded, its name is automatically changed and it is made read-only. So loading sampler.odt results in Word having a read-only version of sampler_tmp.docx. Attempting a simple Control-S to save will give an error.	Yes.
6. Shell integration	Yes.	No.	Yes.

I tip my hat to Microsoft for the way they have provided OOXML support in earlier versions of Office. Aside from the size of the download, the process was simple and the integration was seamless. That’s the way it should be. But what makes them think that customers using ODF format would want anything less than this? The fact that they’ve been able to integrate OOXML so well only increases the shame in having integrated ODF so poorly.

A Tale of Two Formats

2006/08/22 By Rob 16 Comments

As he stood staring at them, they asked him no questions, for his face told them everything.

‘I cannot find it,’ said he, ‘and I must have it. Where is it?’

His head and throat were bare, and, as he spoke with a helpless look straying all around, he took his coat off, and let it drop on the floor.

‘Where is my bench? I’ve been looking everywhere for my bench, and I can’t find it. What have they done with my work? Time presses: I must finish those shoes.’

They looked at one another, and their hearts died within them.

Charles Dickens, a careful student of human nature, provides us here a vivid portrait of Dr. Alexandre Manette, who, after being held 18 years in the Bastille, is released, but is unable to adjust to his new freedom, and in times of stress lapses back to the familiarity of his prison labors, making shoes.

We all have been prisoners of Microsoft Office and its proprietary file formats. You may no longer recognize it as a prison, because this cell has been your home for the past 15 years, but here is what it looks like:

Editing a document requires Microsoft Office.
Since Office runs only on Windows, you also require Windows.
These restrictions lead to a purely heavy-client view of document processing.
This also leads to a model of programmability that emphasizes storing executable code (macros/script) inside of the document, resulting in years of security nightmares. Here is a typical recital of the known dangers.
If you don’t want to put script inside your document, you could access the data via Office automation APIs, but this again requires a machine running Windows and Office.
It also emphasizes a view of WYSIWYG which emphasizes early formatting and layout decisions and de-emphasizes semantic richness in documents. For example, see “What has WYSIWYG Done to Us?”.
The tools that were created for us to record our thoughts instead now constrain or even substitute for our thoughts. For example, “PowerPoint Panders to our Weaker Points” in the Guardian, and Tufte’s “PowerPoint is Evil”.
The above also led to a stifled market for 3rd party document processing tools. We will never see the value of what was never allowed to occur, but the opportunity cost of the innovation that did not happen in this single-vendor world is enormous.
This also led to a general lack of competition in the productivity editor market, leading to a decade of buggy products with little innovation. Is the “Ribbon” the most we can look forward to?
We’ve been locked into one-size-fits-all offerings of bloated applications. Many people are over-served by Office and therefore are over-paying for functionality they do not need, while others are under-served by the resulting products they cannot afford.
Functionality has been arbitrarily segregated into three and only three application classes, “Spreadsheet”, “Word Processor” and “Presentation Graphics”.

The move from proprietary binary formats to new standard formats, like OpenDocument Format (ODF), is a movement from imprisonment to freedom. The technical constraints have been lifted, but have we really made the mental adjustments necessary to engage our new freedom? Or are we still silently pacing a 10-foot cell in our minds? If we merely recreate our cell walls in XML, then we are still prisoners.

I am a creature of habit and have been as much a prisoner as you have, so don’t look to me for all the answers. But I do have a few thoughts on what this new freedom might look like.

Instead of being opaque black boxes that can only be used on one vendor’s system, documents will be transparent. Anyone can access them using whatever operating system and whatever tools they want, and for any purpose they want. Python on Linux, REXX on AS/400, and C# on Windows will all have equal opportunity.

This also implies that document processing will no longer be restricted, technically or by license, to the desktop. Innovative things will occur on servers. We’re starting to see some of that with Google Docs and wikiCalc. But that is only the beginning. We will see search engines that can intelligently search content for specific MathML expressions, spiders that will collect and aggregate slides from presentations and allow you to share them, document repositories that will automatically check citations in papers and calculate the intellectual social networks these imply, stock brokers that will allow you to download your statements formatted in a spreadsheet, with additional analytics calculated via spreadsheet formulas. Creating, editing, reading, viewing, storing, collaborating will be able to be done anywhere, from your cellphone to the largest servers.

Since the server typically has access not only to your own documents but to your organization’s as well, along with easy access to other information about the users, such as your role and group via LDAP, an application can drive workflows that relate the contents of the document to similar content, as well as to your organizational role, and to your business. The companies that unlock the knowledge stored by your knowledge workers in your organization’s documents will be the companies leading us into the next decade.

The old walls will fall that once segregated functionality into the arbitrarily defined boundaries of “Spreadsheet”, “Word processor”, and “Presentation graphics”. Dan Bricklin is leading the way with his wikiCalc. Is it a Spreadsheet or is it a Wiki? If you have to ask the question then you are still a prisoner. The point is wikiCalc is whatever Dan Bricklin wants it to be. That is freedom to innovate. We will see the arbitrary divisions between application genres become fuzzy and fall away as we all recognize our new freedom.

Document programmability will be turned inside-out. Instead of putting code inside of the document, turning documents into virus vectors, the code will be carefully segregated. Once the code and the data are distinct, we can put the code on the server, where it can be more easily managed, maintained, and secured. This clean separation of code and data will be as important to system stability and security as was protected-mode in the 80286 processor when it first enforced this data/code separation at operating system level. I see macro viruses becoming a thing of the past, like smallpox, because the importance of data/code separation will finally be enforced, and users will not be emailing around code disguised in documents.

We will start thinking of documents as data, and as inputs to modules that process data. I see visual design tools that will allow you to drag and drop a document template onto a design surface and expose various fields in the document which can be wired up to databases, web services or other data sources.

I see financial analysts creating financial models in spreadsheets, then converting the spreadsheet into a web application that can then be deployed anywhere to provide browser-based access and execution of the model via any browser.

I see a variety of productivity editors available at a variety of price points, from free, open source ones, to commercial offerings for desktop and other devices, to specialized offerings with extra features for vertical markets, like legal, medical, academic, or scientific uses.

I see an escape from documents-as-pictures, where users sweat over pixel-perfection and pray that the applications don’t screw them up. Today the end user doesn’t worry about font kerning. We rely on the font managers to get this right, and we accept the results, and concentrate on what we, the authors, add to the document. We are freed from that mental burden of kerning. But why stop there? With smarter applications, we will be freed of most or all formatting burdens. We will concentrate on writing, not on styling, and rely on the applications to get the appearance right. This will free our time to give an increased emphasis on semantic richness, putting our knowledge and experience and outlooks and opinions into the document, and encoding it in a way that allows new modes of collaboration and redefines what a document is.

That is a glimpse at what freedom looks like to me. But let’s not forget that being freed is not the same as being free. There are those out there who are attempting to merely recreate the same single-vendor closed system we’ve had for the past 10 years, and recoding it in XML. This may be a comfortable choice to those who have known no other way. But is it really freedom? I look out and see the jailer offering to sell 10-foot apartments to those just released from their 10-foot prison cells. Will you follow?

Change Log

1/30/2007 — updated wikiCalc link, made other assorted wording changes at my whim, corrected a spelling error, changed to curly quotes.