An Antic Disposition

Archives for 2009

OpenDocument Format: The Standard for Office Documents

2009/05/05 By Rob

A belated note that an article of mine on ODF was recently published in IEEE Internet Computing, called “OpenDocument Format: The Standard for Office Documents”. I think it is a good introduction to ODF, what it is, where it came from and why it is important. They allow authors to post a copy on their websites. So feel free to link to it, but any redistribution will need to be negotiated with the publisher.

At the same time I’ve taken the opportunity to put together a new web page of some of my other publications, workshop and conference presentations. I have a few others that I want to add, once I find them. But this is a start.

Filed Under: ODF

Update on ODF Spreadsheet Interoperability

2009/05/03 By Rob 33 Comments

[2009/05/07 — I’ve posted a follow up article on this topic which you may want to read]

A couple of months ago I did some experiments on the interoperability of ODF spreadsheets, the theory and practice. In that earlier post I looked at the then current ODF implementations, including:

  1. OpenOffice.org 2.4
  2. Google Spreadsheets
  3. KOffice KSpread 1.6.3
  4. IBM Lotus Symphony 1.1
  5. Microsoft Office 2003 with the Microsoft-sponsored CleverAge Add-in version 2.5
  6. Microsoft Office 2003 with Sun’s ODF Plugin

I created a test document in each of those editors and then loaded each test document in each of the other editors. I showed what worked, what didn’t, and made some suggestions on how interoperability could be improved. I found only two notable failures, when the Microsoft/CleverAge Add-in for Excel loaded KSpread and Symphony documents. The other scenarios I tested were OK:

                 Created In
Read In          CleverAge   Google   KSpread   Symphony   OpenOffice   Sun Plugin
CleverAge        OK          OK       Fail      Fail       OK           OK
Google           OK          OK       OK        OK         OK           OK
KSpread          OK          OK       OK        OK         OK           OK
Symphony         OK          OK       OK        OK         OK           OK
OpenOffice       OK          OK       OK        OK         OK           OK
Sun Plugin       OK          OK       OK        OK         OK           OK

A lot has happened in the two months since I did that analysis. Several of the applications I tested have been updated:

  • CleverAge has released version 3.0 of their Add-in.
  • OpenOffice 3.01 is now out and in wide use.
  • Symphony 1.3 is now in beta.
  • The Sun ODF Plugin is now at version 3.0.
  • Microsoft Office 2007 SP2 has been released, with integrated ODF support.
  • KOffice 2.0 RC 1 is now available.

I haven’t been able to get the release candidate of KOffice installed, so I’m still including KSpread 1.6.3 in my tests, but for the rest I have created new test files in each editing environment, saved them to ODF format and then loaded the resulting documents into each of the other editors. From these test documents I was able to perform 42 different test combinations.
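The figure of 42 follows from pairing each of the seven editors with each of the six others, where creating in X and reading in Y counts separately from the reverse. A small sketch of that count, using the editor names from the results table as plain labels:

```python
from itertools import permutations

# The seven editing environments in this round of tests.
editors = [
    "Google",
    "KSpread",
    "Symphony",
    "OpenOffice",
    "Sun Plugin",
    "CleverAge",
    "MS Office 2007 SP2",
]

# Ordered (created_in, read_in) pairs, excluding an editor reading its own file.
pairs = list(permutations(editors, 2))
print(len(pairs))  # 42
```

The results table has 49 cells because it also shows each editor reading back its own document.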

I’ll explain a bit more how I tested, then give you the table of results, and finally make some observations and recommendations.

The test scenario I used was a simple wedding planner for a fictional user, Maya, who is getting married on August 15th. She wants to track how many days are left until her wedding, as well as track a simple ledger of wedding-related expenses. Nothing complicated here. I created this spreadsheet from scratch in each of the editors, by performing the following steps:

  • Enter the title “Maya’s Wedding Planner” in A1 and increase the font size to 14 point.
  • Enter the formula =TODAY() in B3 and set the US-style MM/DD/YY date format.
  • Enter the date of the wedding as a constant in cell B4, also setting the date format.
  • Add simple calculations in cells B6-B8 to calculate the days, weeks and months until the wedding.
  • Fill A11 through E16 with a simple ledger of the kind that is done thousands of times a day by spreadsheet users everywhere. Once you have the formula set up in column E (Balance = previous balance + credits – debits) then you can simply copy down the formula to the new row for each new entry.
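For readers who think better in code than in cell references, here is a minimal sketch of the same logic in Python. The “today” date, the opening balance and the ledger amounts are made up for illustration and are not the values in the test files. The months calculation is left out because the steps above do not say how it was defined.

```python
from datetime import date


def days_until(today: date, wedding: date) -> int:
    """Equivalent of subtracting the TODAY() cell from the wedding-date cell."""
    return (wedding - today).days


def running_balances(opening: float, entries: list) -> list:
    """Column E of the ledger: balance = previous balance + credits - debits."""
    balances = []
    balance = opening
    for credit, debit in entries:
        balance = balance + credit - debit
        balances.append(balance)
    return balances


# Hypothetical inputs, chosen only to show the arithmetic.
today = date(2009, 5, 3)
wedding = date(2009, 8, 15)

days = days_until(today, wedding)
print(days)                 # 104
print(round(days / 7, 1))   # 14.9 weeks

# Each entry is (credit, debit).
print(running_balances(1000, [(0, 250), (500, 0), (0, 75)]))  # [750, 1250, 1175]
```

Every balance depends on the one before it, which is why copying the formula down a row is all it takes to extend the ledger.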

The resulting spreadsheet looks something like this:

Feel free to download a zip of all of the test spreadsheet files. The file names should be self-explanatory.

Here is what I found when I tested the various scenarios:

                     Created In
Read In              Google   KSpread   Symphony   OpenOffice   Sun Plugin   CleverAge   MS Office 2007 SP2
Google               OK       OK        OK         OK           Fail         OK          Fail
KSpread              OK       OK        OK         Fail         Fail         OK          Fail
Symphony             OK       OK        OK         OK           OK           Fail        Fail
OpenOffice           OK       OK        OK         OK           OK           OK          Fail
Sun Plugin           OK       OK        OK         OK           OK           OK          Fail
CleverAge Plugin     OK       OK        OK         OK           Fail         OK          OK
MS Office 2007 SP2   Fail     Fail      Fail       Fail         Fail         Fail        OK

So what is happening here?

CleverAge appears to have heeded the advice from my earlier blog post and now correctly processes KSpread and Symphony spreadsheets. This is great news and they deserve credit for that work. But this is a small bit of good news in a table that now shows an awful lot of red. Let’s see if we can figure this out.

First, some combinations that worked previously, when I tested two months ago, are now not working:

  • Symphony 1.3 beta hangs when attempting to load the spreadsheet created with the CleverAge 3.0 ODF Add-in. Symphony 1.1 also hangs when trying to load that same spreadsheet. However both versions of Symphony work fine when loading the CleverAge 2.5 spreadsheet from two months ago. The CleverAge document appears to be valid, so my guess is that this is a bug in the Symphony 1.3 beta. I’ll pass this document on to the Symphony development team to see what they say.
  • KSpread 1.6.3 does not read formulas from OpenOffice 3.01 documents. KSpread had no problems with OO 2.4 documents. The problem appears to be that OpenOffice 3.01, by default, writes out documents according to the ODF 1.2 draft which puts formulas in the OpenFormula namespace. But KSpread is expecting them in the legacy namespace. The result is that spreadsheet formulas are dropped when the document is loaded in KSpread.
  • In a similar way, Sun’s new ODF Plugin writes out documents according to the ODF 1.2 draft. KOffice is unable to handle these files. This also causes problems for Google Spreadsheets and the Microsoft/CleverAge Plugin for Excel, which report errors “We were unable to upload this document” and “The converter failed to open this file”.
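One way to see which convention a given file follows is to look at the prefix in front of each formula in the spreadsheet’s content.xml. The sketch below assumes formulas are stored in the table:formula attribute, with “oooc” as the legacy prefix and “of” as the ODF 1.2 draft (OpenFormula) prefix. The XML fragment is hand-written for illustration and is not taken from the test files.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# ODF table namespace; ElementTree exposes namespaced attributes as "{uri}name".
TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"
FORMULA_ATTR = f"{{{TABLE_NS}}}formula"

# Hand-written sample: one legacy-style cell, one ODF 1.2 draft-style cell,
# and one cell with no formula at all.
SAMPLE = """<table:table-row xmlns:table="urn:oasis:names:tc:opendocument:xmlns:table:1.0">
  <table:table-cell table:formula="oooc:=TODAY()"/>
  <table:table-cell table:formula="of:=TODAY()"/>
  <table:table-cell/>
</table:table-row>"""


def formula_prefix(formula):
    """Return the prefix before ':=' or None if the formula has no prefix."""
    if formula.startswith("="):
        return None
    head, sep, _ = formula.partition(":=")
    return head if sep else None


def count_formula_prefixes(xml_text):
    """Count how many formula cells use each prefix."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for element in root.iter():
        formula = element.get(FORMULA_ATTR)
        if formula is not None:
            counts[formula_prefix(formula)] += 1
    return counts


print(dict(count_formula_prefixes(SAMPLE)))  # {'oooc': 1, 'of': 1}
```

A reader that only recognizes one of these prefixes has to decide what to do with the other. Dropping the formula is the behavior described above for KSpread.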

The new entry to the mix is Microsoft Office 2007 SP2, which has added integrated ODF support. Unfortunately this support did not fare well in my tests. The problem appears to be how it treats spreadsheet formulas in ODF documents. When reading an ODF document, Excel SP2 silently strips out formulas. What is left is the last value that the cell had when the document was previously saved.

This can cause subtle and not so subtle errors and data loss. For example, in the test document I presented above, the current date is encoded using the TODAY() spreadsheet function. If the formulas are stripped, then this cell no longer updates, and will return the wrong value. Similarly, if Maya tries to continue her ledger of expenses by copying the formula cells from column E down a row, this will cause incorrect calculations, since there is no longer a formula to copy, so she would just be copying the prior balance. In general, SP2 converts an ODF spreadsheet into a mere “table of numbers” and any calculation logic is lost.
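The difference between a live formula and its stripped, cached value can be shown with a toy cell model. This is only a sketch of the behavior described above, not how any of the tested applications are implemented. The Cell class, strip_formula and the dates are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Optional


@dataclass
class Cell:
    cached: object                                      # value stored at last save
    formula: Optional[Callable[[date], object]] = None  # recalculation rule, if any

    def value(self, today: date):
        """Recalculate if a formula is present, otherwise return the cached value."""
        if self.formula is None:
            return self.cached
        return self.formula(today)


def strip_formula(cell: Cell) -> Cell:
    """Keep only the last saved value, discarding the calculation logic."""
    return Cell(cached=cell.cached)


saved_on = date(2009, 5, 3)
today_cell = Cell(cached=saved_on, formula=lambda today: today)  # like =TODAY()
stripped = strip_formula(today_cell)

later = date(2009, 6, 1)
print(today_cell.value(later))  # 2009-06-01
print(stripped.value(later))    # 2009-05-03
```

The stripped cell looks identical on the day the file was saved, which is what makes the loss easy to miss.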

In the other direction, when writing out spreadsheets in ODF format, Excel 2007 SP2 does include spreadsheet formulas but places them into an Excel namespace. This namespace is not what OpenOffice and other ODF applications use. It is not the ODF 1.2 namespace. It isn’t even the OOXML namespace. I have no idea what it is or what it means. Not every ODF application checks the namespace of formulas when loading documents, but the ones that do reject the SP2 documents altogether. And the ones that do not check the namespace try and fail to load a formula, since it is syntactically different from what they expected. The applications essentially display a corrupted document that shows neither the formula nor the value correctly. For example, an SP2 document, loaded in MS Office using the Sun ODF Plugin, looks like this:

Similar corruption occurs when loading the Excel 2007 SP2 spreadsheet into KSpread, Symphony and OpenOffice. Google doesn’t import the document at all.

I must admit that I’m disappointed by these results. This is not a step forward compared to where we were two months ago. This is a big step backwards. Spreadsheet interoperability is not hard. This is not rocket science. Everyone knows what TODAY() means. Everyone knows what =A1+A2 means. To get this wrong requires more effort than getting it right. It is especially frustrating when we know that the underlying applications support the same fundamental formula language, or something very close to it, and are tripped up by lack of namespace coordination. Whether it is accidental or intentional I don’t know or care. But I cannot fail to notice that the same application — Microsoft Excel 2007 — will process ODF spreadsheet documents without problems when loaded via the Sun or CleverAge plugins, but will miserably fail when using the “improved” integrated code in Office 2007 SP2. This ain’t right.

I have some suggestions for how to move things forward again. There will be a lot less red on the above table if two simple changes are made:

  1. Sun should write out formulas in ODF 1.1 format, using the legacy “oooc” namespace prefix that the other vendors are using. Remember, the other vendors are using that namespace specifically for compatibility with OO’s ODF documents. This is the current convention. To unilaterally switch, without notice or coordination, to a new namespace, is not cool. When ODF 1.2 is an approved standard, then we all can move there in a coordinated fashion, to cause users minimal inconvenience. But the above table clearly shows the confusion that results if this move is not coordinated. I know OO 3.01 has an option to save in ODF 1.0/1.1 format. IMHO, this setting should be the default. I’m not sure if the Sun Plugin has a similar configuration option, but I hope it does.
  2. In addition to writing out compatible formulas as per the above comments on the Sun Plugin, Microsoft should remove the code in SP2 that causes it to reject every other vendor’s spreadsheet documents. Give the user a warning if you need to, but let them have the choice.

Finally, let me try to anticipate and debunk some of the counter-arguments which might be raised to argue against interoperability.

First, we might hear that ODF 1.1 does not define spreadsheet formulas and therefore it is not necessary for one vendor to use the same formula language that other vendors use. This is certainly true if your sole goal is to claim conformance. If your business model requires only conformance and not actually achieving interoperability, then I wish you well. But remember that conformance and interoperability are not mutually exclusive options. An application can be conformant to a standard and also be interoperable, if you use the legacy formula namespace and syntax. So the desire to be conformant is not an excuse for not also being interoperable, or at least not a valid excuse. One might also wryly note that Microsoft has several Directors of Interoperability, not Directors of Minimal Conformance, and their workshops are called Document Interoperability Initiatives, not Minimal Conformance Initiatives. The difference between minimal conformance and interoperability is well illustrated in these tests.

Remember, it is not particularly difficult or clever to take an adverse reading of a standard to make an incompatible, non-interoperable product. Take HTML, for example. It does not define the attributes of unstyled (default) text. So I could create a perfectly conformant browser implementation that makes all default text be 4-point Zapf Dingbats, white text on a white background. It would conform with the standard, but it would be perfectly unusable by anyone. If you try hard enough you can create 100% conformant, but non-interoperable, implementations of almost any standard. Standards are voluntary, written to help coordinate multiple parties in their desires for interoperability. Standards are not written to compel interoperability by parties who do not wish to be interoperable.

(A side point is that SP2’s implementation of ODF spreadsheets does not, in fact, conform to the requirements of the ODF standard, but that is another story, for another blog post.)

We might also hear concerns that supporting other vendors’ ODF spreadsheet formulas cannot be done because this formula language is undocumented. The irony here is that the formula language used by OpenOffice (and by other vendors) is based on that used by Excel, which itself was not fully documented when OpenOffice implemented it. So an argument, by Microsoft, not to support that language because it is not documented is rather hypocritical. Excel supports 1-2-3 files and formulas and legacy Excel versions (back to Excel 4.0) neither of which have standardized formula languages. Why are these supported? Also, the fact that the Microsoft/CleverAge add-in correctly reads and writes the legacy ODF formula syntax shows not only that it can be done, but that Microsoft already has the code to do it. The inexplicable thing is why that code never made it into Excel 2007 SP2.

We’ll probably also hear that 100% compatibility with legacy documents is critical to Microsoft users and that it is dangerous to try to save Excel formulas into interoperable ODF formulas because there are no guarantees that OpenOffice or any other ODF application will interpret them the same as Excel does. So one might try to claim that Microsoft is protecting their customers by preventing them from saving interoperable spreadsheet formulas. But we should note that fully-licensed Microsoft Office users have already been creating legacy documents in ODF format, using the Microsoft/CleverAge ODF Add-in. These paying Microsoft Office customers will now see their existing investment in ODF documents, created using Microsoft-sanctioned code, get corrupted when loaded in Excel 2007 SP2. Why are paying Microsoft customers who used ODF less important than Microsoft customers who used OOXML? That is the shocking thing here, the way in which users of the ODF Add-in are being sacrificed.

If you are cynical, you might observe that if Excel 2007 SP2 allowed Microsoft/CleverAge ODF Add-in formulas to work correctly, then SP2 would need to allow all vendors’ formulas to work, since the other vendors are using the same legacy namespace. The only way for Microsoft to make their legacy ODF documents work and to exclude other vendors would be (hypothetically) to specifically look in the document for the name of the application that created the document, and allow their ODF Add-in but reject OpenOffice, etc. IANAL, but I think something like that would look very, very bad to competition authorities. So the only way out, if your goal (hypothetically) is to avoid interoperability, is to sacrifice your existing Office customers who are using the Microsoft/CleverAge ODF Add-in. It serves them right for not sticking to the party line in the first place. This’ll teach ’em good.

Of course, I am not that cynical. I was taught to never assume malice where incompetence would be the simpler explanation. But the degree of incompetence needed to explain SP2’s poor ODF support boggles the mind and leads me to further uncharitable thoughts. So I must stop here.

As I mentioned before, this is a step backwards. But it is just one step on the journey. Let’s look forward (and move forward). This is just code. Code can be fixed. We know exactly what is needed to have good interoperability of spreadsheet formulas. In fact most of the code already exists for this. The only thing we need now is to actually go do it, and not get too far ahead of, or lag too far behind, the other implementations. This is more a question of timing and coordination than hard technical problems.

[2009/05/07 — For more on this topic, see my “A follow-up on Excel 2007 SP2’s ODF Support“]

Filed Under: Interoperability, ODF

Shooting Daffodils

2009/04/21 By Rob 2 Comments

I like daffodils. I’ve been planting a couple hundred additional bulbs each fall, so that now I have a lovely spring-time display, right around this time.

In past years I would walk through the garden and take a photo here and there, mainly while standing, shooting straight down, not paying particular attention to the lighting or the composition. Flower “mug shots” I’d call them. Then last year, I started doing macro (close-up) photography. Although the results were technically adequate — sharp, detailed closeups — they were…well… rather dull, symmetrical and artless.

This year I’ve decided to try something different. I realized that a flower can be posed like a person. I guess that is obvious in retrospect, but it never occurred to me before that the poses of classical portraiture, like the 2/3 view, over-the-shoulder, profile view, etc., apply to flowers as well as people. And you don’t need to show all of the flower. A close up of part of it can also be interesting.

I’ve also worked to improve my technique, shooting with a tripod and remote trigger, using the McClamp to steady and isolate the blossoms in the field, using small apertures to get greater depth of field, locking the mirror up before shooting to reduce any residual camera shake, shooting on days and at times where harsh shadows can be avoided, etc.

Here are three examples, intimate portraits, all shot on location in my garden. You can view more on my Flickr page.

Daffodil

Daffodil

Daffodil

Filed Under: Photography Tagged With: Daffodils, Macro Photography, Narcissus

A Time for Decision

2009/04/13 By Rob 1 Comment

April 15th is Tax Day in the United States, the day by which we must file our income tax returns for 2008 and pay any balance due.

The day before, April 14th, is also a day of reckoning, with another outstretched hand asking for our money. This is the day which marks the end of “mainstream support” for Microsoft Windows XP and Office 2003. After this date, licensed owners of these products will no longer receive free support and updates.

Depending on how consumers respond, one of three things will result from this end of life.

  1. Users migrate to Vista/Office 2007
  2. Users stay on unsupported Microsoft products for the near term and wait for Windows 7/Office 14 to come out in 2010.
  3. Users take the opportunity to evaluate the available alternatives, including open source.

Since Windows XP is the most widely deployed version of Windows, and Windows is the most widely deployed operating system in the world, many licenses will be up for grabs as IT shops decide what to do next. Especially in these tough economic times, upgrading to Vista just to see Vista become obsolete in less than a year doesn’t make sense. But neither does remaining on an unsupported version of Windows.

This is a significant opportunity for alternatives, such as Linux and other open source applications, to increase their representation on the desktop. We should spend the next nine months making it especially easy for Microsoft’s seemingly unwanted and expendable Windows XP and Office 2003 customers to migrate to better alternatives. The Windows/Office release calendar and economic conditions have combined to make this a huge upgrade cycle. In 2010 almost everyone will be looking to upgrade. An opportunity like this does not come every year. Let’s make the most of it!

Filed Under: Open Source Tagged With: Linux, Windows

Project BudBurst

2009/03/29 By Rob 2 Comments

Drift of Crocuses

The crocuses have bloomed here in Westford, one of the first four flowers of my spring garden, the others being Galanthus nivalis (Snowdrops), Iris reticulata (dwarf Iris) and Eranthis hyemalis (Winter Aconite). But the crocuses are the most noticeable, since I have naturalized them in small drifts over the lawn.

For a few years I’ve been keeping a garden journal and have recorded the dates of first bloom for various flowers. So I see that for the snowdrops, the first bloom was March 14th this year, March 26th in 2008 and March 24th in 2007. Is this global warming? From just three observations, there is no way of telling.

But what if we had thousands of people record such information all over the country and pool their observations? Then we might be able to observe some interesting patterns. That is the idea of Project BudBurst, a distributed public field study run by a group of researchers from UCAR, the Chicago Botanic Garden and the College of Forestry and Conservation of the University of Montana. You sign up for a free account, state your location (US only, sorry) and pick from a list of local plant species that you can observe.

The emphasis is on widespread, native species, so the exotic bulbs I have in my garden won’t be of use. But I can report observations on things like dandelions or white pines. Depending on the type of plant, you report the date it reaches each of various “phenophases” such as first flower, fully flowering, pollen release, first ripe fruit, etc. Different types of plants will have different phenophases. Volunteers enter their observations, which are then plotted, along with all the other data, on Google Maps.

Aside from climate change research, I wonder if this might also be useful for predicting the onset of spring allergies?

Filed Under: Gardening Tagged With: Crocus, Galanthus, Project BudBurst

Copyright © 2006-2026 Rob Weir · Site Policies