An Antic Disposition

Archives for August 2007

The dog that didn’t bark

2007/08/21 By Rob 15 Comments

“Is there any point to which you would wish to draw my attention?”

“To the curious incident of the dog in the night-time.”

“The dog did nothing in the night-time.”

“That was the curious incident,” remarked Sherlock Holmes.

— Silver Blaze by Sir Arthur Conan Doyle

A curious blog post from Brian Jones, looking at spreadsheet interoperability between Gnumeric and Apple’s new Numbers spreadsheet, using OOXML. Take a read there and come back and we can compare notes.

Did anything strike you as odd? What raised my eyebrows was the utterly trivial nature of the spreadsheet document that was tested. Typically an interoperability demonstration will be a little flashy, showing as much functionality as possible. But this one has no text attributes, only a single, default numeric style, no charts, no use of spreadsheet functions, nothing. Why bother? There is nothing in this spreadsheet that one could not easily have created in VisiCalc 25 years ago. So why is this simplistic document being used to demonstrate interoperability with OOXML? This seems very odd. Interoperability with a more substantial document would have been far more persuasive. So why didn’t they do that? Hmmm….

So I decided that I would give it a try, on my Windows XP laptop running Office 2007, OpenOffice 2.1 Novell Edition (giving it a test drive this week) and Gnumeric 1.7.10. Let’s see what really works.

First, let’s start with a more substantial spreadsheet document. I created the following in Office 2007, illustrating a variety of everyday features:

  1. numeric format
  2. simple text styles
  3. cell background fills
  4. cell alignment
  5. spreadsheet functions
  6. charts
  7. row widths
  8. worksheet password protection
  9. cell validation
  10. hyperlinks
  11. word art and shapes
  12. OLE embedding

Nothing fancy here, nothing that hasn’t been around in Office since Office 2000 or earlier. You could have done most of this in 1-2-3 version 2.3 a decade before that. I saved the document both in XLSX (OOXML) and XLS (legacy binary) formats. Both appeared identical as shown here:

Next, I tried opening the XLS file in OpenOffice. We see that it handled the file well:

The colors in the chart are clearly different, but I didn’t set any particular colors in the original, opting for the default. So this may just be an indication that the charts in OpenOffice have different default colors. As you can see from the above picture, everything else looks fine. I did verify that worksheet protections, cell validation and the hyperlink worked correctly. However, although the OLE embedding seems to be there, I was not able to activate it.

Next, I fired up Gnumeric to see how it would fare. I first tried the same XLS file which loaded and displayed like this:

What do we notice?

  1. Cell A7 did not format properly. It should be in long date format, but it is displaying in time format.
  2. Chart colors differ, but this is probably just a difference in defaults.
  3. Chart text is clipped in several instances.
  4. The OLE embedding failed to come through with correct metafile for display.
  5. Workbook protections and cell validation worked as expected.
  6. Hyperlink worked correctly.
  7. Arrow shape was dropped.

So, that is Gnumeric with a basic binary Excel worksheet. Since Gnumeric obviously supports this level of functionality, what would we expect to see when it loads the same document, but in OOXML format? See the image below for what I saw:

Hmm…OK… I think we hear the dog barking now. The OOXML import into Gnumeric is not really usable yet. In addition to the problems indicated above with the XLS import, we can add the following:

  1. None of the charts converted
  2. Worksheet password protection was lost
  3. The hyperlink is broken
  4. The OLE embedding is missing
  5. Cell validation is broken
  6. The word art is missing

Now a few points I wish to make concerning this. First I don’t want to have this taken as an attack on Gnumeric or those who work on it. Gnumeric is a fine spreadsheet with an impressive range of analytic features. The developers have done an outstanding job on it.

However, Microsoft points to Gnumeric as proof that OOXML can be implemented by other vendors. I suggest the jury is still out on this. 1-2-3 release 1.0a (1984) supported more functionality than Gnumeric does via OOXML. Note that even though there is no complete public documentation on the legacy binary formats, Gnumeric does a far better job at supporting them than it does with “standard” Ecma-376 OOXML and its 6,000 pages of documentation.

Now certainly, with much time and much effort, I’m sure Gnumeric will reach the point where it can read an OOXML document as well as it can read an XLS document. It might take another two or three years, but that day will come. But what benefit is that? All that effort will be spent writing code and testing to achieve practical results that Gnumeric already has achieved with the binary formats. This effort comes at the expense of other development activities such as adding features or fixing bugs. I hardly think that Jody wakes up in the morning overjoyed by the prospect of adding OOXML support to an application that is already compatible with billions of legacy Microsoft documents.

Similarly I have to scratch my head at OpenOffice and their announcement that they are adding OOXML support. As shown earlier, their support of the binary formats is already excellent. I guess that is why Microsoft is so eager to change their default formats. When a product like OpenOffice is able to effectively exchange documents with Office, it is too much of a threat to their Office revenue. So OpenOffice must now spend several person-years recreating this same level of interoperability, and the net result is that they will end up with the same capability they had before, but at the expense of forgoing work on new features. I wonder what Microsoft will do when OpenOffice catches up again in a few years? Hmmm…

(It would be interesting to examine some of the other products that are said to support OOXML. Of the ones that support OOXML as well as the binary formats, how many of them also have OOXML support that is far worse than their binary format support? Is any editor vendor able to stand up and say that OOXML is a blessing to them because it allows higher fidelity interchange with Office than they were able to achieve with the binary formats?)

Everyone is in the same boat with this: KOffice, Corel, Google, IBM, anyone who has applications that work with Microsoft documents. We’re all faced with the prospect of significant expenses to rewrite our file format support with no net benefit to our customers. This is the toll we all must pay to Microsoft just for the ability to fight for the scraps their monopoly may leave behind. If Microsoft jerks their format around, we all must run and chase after it, reallocating resources away from feature work, becoming in the process less competitive in the marketplace, while Microsoft forges ahead with new features. They can easily repeat this game every few years, just to keep competitors busy. This is what a death spiral looks like.

Giving absolute control of a standard document format to a monopolist that is notorious for abusing their control of file formats in the past is insanity. It doesn’t take a Sherlock Holmes to figure that out.

Filed Under: OOXML

e to the power of hype

2007/08/12 By Rob 20 Comments

I had a good chuckle over the new content at Microsoft’s Open XML Community web site. Please take a look. What it lacks in accuracy it makes up for in the use of shiny graphics and stock photos of shiny people, the kind of eye candy that years of shiny PowerPoint presentations have numbed us into believing is an adequate substitute for thought.

What especially caught my eye was this claim:

Global support for Open XML is growing exponentially. Thousands of organizations have joined OpenXMLCommunity.org, hundreds of ISVs are developing solutions on Open XML, and more and more governments are opting for Choice in standards policies. Additionally, more than 10 million compatibility packs that allow users of earlier versions of Microsoft Office to work with Open XML have been downloaded around the world. The momentum is growing, the adoption is real.

Exponential growth is quite a claim. But what is the evidence? Microsoft provides this chart further down on the page, showing the growth in their “community”:

Years ago, when I was a student, we had a technical term for curves like this. We called them “lines” and referred to this type of growth as “linear.” We did not call it “exponential growth.”

Let’s take a look at the growth in document usage, instead of community membership. Here’s an update of a chart I showed a couple of months ago:

In this chart you see two series, one for ODF (blue) and one for OOXML (red). The horizontal axis shows the number of days since each standard was published, namely May 2005 for ODF and December 2006 for OOXML. The vertical axis shows the number of documents in that format on the web, according to Google, by doing “filetype” searches. For example, a query of “filetype:ods” gives you all of the ODS (ODF spreadsheet) documents on the web.
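For anyone who wants to reproduce the horizontal axis, here is a rough sketch of the alignment. The post gives only the month of publication, so the day of the month used below is an assumption, and both function names are made up for illustration.

```python
from datetime import date

# Publication months come from the text above; using the first day
# of each month is an assumption made only for this sketch.
PUBLISHED = {
    "ODF": date(2005, 5, 1),
    "OOXML": date(2006, 12, 1),
}


def days_since_publication(fmt, observed):
    """Days between a format's publication and an observation date."""
    return (observed - PUBLISHED[fmt]).days


def filetype_query(extension):
    """Build a search query such as "filetype:ods"."""
    return f"filetype:{extension}"


if __name__ == "__main__":
    print(filetype_query("ods"))
    print(days_since_publication("OOXML", date(2007, 8, 12)))
```

Measuring each series from its own publication date is what lets the two curves be compared at the same age rather than on the same calendar day.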

(Ben Langhinrichs also has some updated numbers and analysis on this topic.)

Is this what you would call exponential growth? Eight months after Office 2007 shipped, and despite the claim of “10 million compatibility packs” downloaded, the OOXML line is only slowly and linearly rising (R-squared=0.943). ODF remains 100-times more prevalent on the web today and is growing 20-times faster than OOXML.

So “Global support for Open XML is growing exponentially”? Uh. I don’t think so. Maybe something is growing exponentially, like the hype. But the users, the documents and the “community” — these appear to be only slowly and linearly growing.
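If you want to check a growth claim like this yourself, one rough heuristic is to fit a straight line to the raw counts and another to their logarithms, then see which fits better. Exponential growth is a straight line on a log scale, so a series that is really exponential should win the second fit. This is only a sketch, not the method behind the R-squared figure quoted above, and the numbers in the demo are invented for illustration.

```python
import math


def r_squared(xs, ys):
    """R-squared of the ordinary least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # For a simple linear fit, R-squared is the squared correlation.
    return (sxy * sxy) / (sxx * syy)


def growth_shape(days, counts):
    """Return 'linear' or 'exponential', whichever fit has the higher R-squared.

    The exponential fit regresses log(count) on days, so every count
    must be positive.
    """
    linear = r_squared(days, counts)
    exponential = r_squared(days, [math.log(c) for c in counts])
    return "linear" if linear >= exponential else "exponential"


if __name__ == "__main__":
    days = [30 * i for i in range(10)]
    # Hypothetical series, not the data behind the chart above.
    steady = [100 + 5 * d for d in days]
    compounding = [100 * math.exp(0.01 * d) for d in days]
    print(growth_shape(days, steady))
    print(growth_shape(days, compounding))
```

Comparing R-squared across the raw and log scales is crude, but it is enough to tell a line from a curve that doubles on a regular schedule.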

But lest you leave without some dramatic growth to think about, let me share some with you. If you recall, back in April I brought your attention to the fact that two scientific journals, Science and Nature, were both rejecting submissions from authors in OOXML format. I’ve been looking around and found an embarrassingly large number of additional journals which explicitly disallow OOXML.

The Optical Society of America’s journal, Optics Letters, will not accept Word 2007 format. The American Phytopathological Society’s Plant Disease warns in bright red print [pdf], “This journal does not accept Microsoft Word 2007 documents at this time.” The American Institute of Physics tells their authors “Word 2007 and the new Word docx format should not be used. Docx files will currently cause problems for reviewers and complicate many existing preproduction and production routines.” Vandose Zone Journal warns submitters that they cannot use the new equation editor in Word 2007 and should use MathML instead. “Word 2007 .docx format is not accepted” according to The Journal of Nutrition.

But wait, there’s more!

Wiley InterScience tells authors for almost 200 of its journals that “This journal does not accept Microsoft WORD 2007 documents at this time,” ruling out OOXML for authors of these journals:

  1. Journal of the History of the Behavioral Sciences
  2. International Journal of Quantum Chemistry
  3. Software Process: Improvement and Practice
  4. Pediatric Blood & Cancer
  5. Lasers in Surgery and Medicine
  6. Medicinal Research Reviews
  7. American Journal of Physical Anthropology
  8. Journal of Mass Spectrometry
  9. Journal of Polymer Science Part B: Polymer Physics
  10. Developmental Dynamics
  11. Journal of Applied Polymer Science
  12. Magnetic Resonance in Medicine
  13. Synapse
  14. Genes, Chromosomes and Cancer
  15. Journal of Medical Virology
  16. Flavour and Fragrance Journal
  17. Biofuels, Bioproducts and Biorefining
  18. Clinical Anatomy
  19. Hepatology
  20. Advances in Polymer Technology
  21. Journal of Orthopaedic Research
  22. Molecular Carcinogenesis
  23. Environmental Progress
  24. Infant Mental Health Journal
  25. Annals of Neurology
  26. International Journal of Imaging Systems and Technology
  27. Developmental Neurobiology
  28. AIChE Journal
  29. Journal of Traumatic Stress
  30. genesis
  31. Meteorological Applications
  32. Process Safety Progress
  33. Atmospheric Science Letters
  34. Systems Research and Behavioral Science
  35. Journal of Community Psychology
  36. Diagnostic Cytopathology
  37. Birth Defects Research Part B
  38. Journal of Software Maintenance and Evolution
  39. International Journal of Climatology
  40. The Chemical Record
  41. Wireless Communications and Mobile Computing
  42. International Journal of Intelligent Systems
  43. Computer Animation and Virtual Worlds
  44. Statistics in Medicine
  45. Concurrency and Computation: Practice and Experience
  46. Developmental Psychobiology
  47. Applied Stochastic Models in Business and Industry
  48. The Prostate
  49. Journal of Computational Chemistry
  50. X-Ray Spectrometry
  51. Pediatric Blood & Cancer
  52. Random Structures and Algorithms
  53. Microwave and Optical Technology Letters
  54. Lasers in Surgery and Medicine
  55. Rapid Communications in Mass Spectrometry
  56. Weather
  57. Mental Retardation and Developmental Disabilities Research Reviews
  58. International Journal of Finance & Economics
  59. Psycho-Oncology
  60. Chirality
  61. Applied Cognitive Psychology
  62. American Journal of Medical Genetics Part B:
  63. Medicinal Research Reviews
  64. Biopharmaceutics & Drug Disposition
  65. Zoo Biology
  66. Catheterization and Cardiovascular Interventions
  67. Plus 103 more journals!

Oxford Journals, part of Oxford University Press, tells submitters, “This journal does not accept Microsoft Word 2007 documents at this time.” Journals directly affected by this policy include:

  1. Bioinformatics
  2. Journal of Antimicrobial Chemotherapy
  3. American Journal of Epidemiology
  4. PEDS
  5. Briefings in Functional Genomics & Proteomics
  6. The Computer Journal
  7. Health Policy and Planning
  8. Journal of Environmental Law
  9. Review of English Studies
  10. Behavioral Ecology
  11. ELT Journal
  12. Molecular Biology and Evolution
  13. CESifo Economic Studies
  14. Journal of Pediatric Psychology
  15. Cerebral Cortex
  16. Literary and Linguistic Computing
  17. Molecular Human Reproduction
  18. Enterprise & Society
  19. Age and Ageing
  20. European Journal of Public Health
  21. Publius
  22. Integrative and Comparative Biology
  23. Nephrology Dialysis Transplantation
  24. Rheumatology
  25. Glycobiology
  26. And 35 more journals!

Blackwell Publishing, publisher of over 800 journals, rejects OOXML submissions telling authors, “Will authors please note that Word 2007 is not yet compatible with journal production systems.” This adds to our list of journals where OOXML cannot be used:

  1. Psychophysiology
  2. Acta Anaesthesiologica Scandinavica
  3. Transfusion Alternatives in Transfusion Medicine
  4. Acta Neuropsychiatrica
  5. Nursing Forum: An Independent Voice for Nursing
  6. Experimental Techniques: A Publication for the Practicing Engineer
  7. Cytopathology
  8. Asian Journal of Social Psychology
  9. Journal of Anatomy
  10. Annals of Applied Biology
  11. Lethaia: An International Journal of Palaeontology
  12. Journal of the American Water Resources Association
  13. Clinical Physiology and Functional Imaging
  14. Ibis: The International Journal of Avian Science
  15. Basin Research
  16. Digestive Endoscopy
  17. Journal of Empirical Legal Studies
  18. European Journal of Neurology
  19. Surgical Practice: Formerly Annals of the College of Surgeons
  20. FEMS Yeast Research
  21. FEMS Microbiology Reviews
  22. FEMS Microbiology Ecology
  23. FEMS Microbiology Letters
  24. Regulation & Governance
  25. FEMS Immunology & Medical Microbiology
  26. Clinical and Experimental Optometry
  27. Journal of Food Process Engineering
  28. The Journal of Cardiovascular Electrophysiology
  29. Medical Education
  30. European Journal of Clinical Investigation
  31. Diseases of the Esophagus
  32. Sleep and Biological Rhythms
  33. International Migration Review
  34. Computational Intelligence
  35. Asia Pacific Viewpoint
  36. Seminars in Dialysis
  37. Peace & Change: A Journal of Peace Research
  38. Journal of Applied Social Psychology
  39. Basic & Clinical Pharmacology & Toxicology
  40. Dermatologic Therapy
  41. WorkingUSA: The Journal of Labor and Society
  42. Journal of Travel Medicine
  43. Singapore Journal of Tropical Geography
  44. Australasian Radiology
  45. Genes to Cells
  46. The Clinical Respiratory Journal
  47. Echocardiography
  48. The American Journal of Gastroenterology
  49. Histopathology
  50. Personal Relationships
  51. Clinical and Experimental Dermatology
  52. Alcoholism: Clinical and Experimental Research
  53. Experimental Dermatology
  54. Journal of Social Philosophy
  55. The Journal of Popular Culture
  56. Pathology International
  57. Pain Practice
  58. The Journal of American Culture
  59. Clinical & Experimental Immunology
  60. Religious Studies Review
  61. Entomological Science
  62. Plus 107 more journals!

I won’t claim it is exponential, but I will suggest that the most impressive growth occurring around OOXML is the number of journals that will not accept it.

Filed Under: OOXML

The most recognized tune of all time

2007/08/10 By Rob 41 Comments

Simple question. What tune would you say is the most recognized tune? If we limited ourselves to the United States and the present day, the answer might be “Happy Birthday.”

What if we included all time and all nations? “Happy Birthday” goes back to only 1893. Some tunes are much older, like “Greensleeves” (16th century), but are well known in only some nations, while others have global reach but are of more recent vintage, like McCartney’s “Yesterday” (1965).

So what do you get if you account for both factors and try to seek the tune that the most people in history would be able to recognize, something that has great durability over time as well as a global reach?

Any ideas? I’ll hold my guess and post it later.

Filed Under: Music

Two Feet, No Feathers

2007/08/02 By Rob 20 Comments

We typically use words to communicate, to be understood. That is the common case, but not the only case. In some situations, words are used like metes and bounds to carefully circumscribe a concept by the use of language, in anticipation of another party attempting a breach. This is familiar in legislative and other legal contexts. Your concept is, “I want to lease my summer home and not get screwed,” and your attorney translates that into 20 pages of detailed conditions. You can be loose with your language, so long as your lawyer is not.

But even among professionals, the attack/defense of language continues. One party writes the tax code, and another party tries to find the loopholes. Iteration of this process leads to more complex tax codes and more complex tax shelters. The extreme verbosity (to a layperson) of legislation, patent claims or insurance policies results from centuries of cumulative knowledge which has taught the drafters of these instruments the importance of writing defensively. The language of your insurance policy is not there for your understanding. Its purpose is to be unassailable.

This “war of the words” has been going on for thousands of years. Plato, teaching in the Akademia grove, defined Man as “a biped, without feathers.” This was answered by the original smart-ass, Diogenes of Sinope, aka Diogenes the Cynic, who showed up shortly after with a plucked chicken, saying, “Here is Plato’s Man.” Plato’s definition was soon updated to include an additional restriction, “with broad, flat nails.” That is how the game is played.

In a similar way Microsoft has handed us all a plucked chicken in the form of OOXML, saying, “Here is your open standard.” We can, like Plato, all have a good laugh at what they gave us, but we should also make sure that we iterate on the definition of “open standard” to preserve the concept and the benefits that we intend. A plucked chicken does not magically become a man simply because it passes a loose definition. We do not need to accept it as such. It is still a plucked chicken.

(This reminds me of the story told of Abraham Lincoln, when asked, “How many legs does a dog have if you call the tail a leg?” Lincoln responded, “Four. Calling a tail a leg does not make it a leg.”)

With the recent announcement here in Massachusetts that the ETRM 4.0 reference architecture will include OOXML as an “open standard” we have another opportunity to look at the loopholes that current definitions allow, and ask ourselves whether these make sense.

The process for recommending a standard in ETRM 4.0 is defined by the following flowchart:

So, let’s go through the first four questions that presumably have already been asked and answered affirmatively in Massachusetts, to see if they conform to the facts as we know them.

  1. Is the standard fully documented and publicly available? Can we really say that the standard is “fully documented” when the ISO review in the US and in other countries is turning up hundreds of problems that are pointing out that the standard is incomplete, inconsistent and even incorrect? We should not confuse length with information content. Just as a child can be overweight and malnourished at the same time, a standard can be 6,000 pages long and still not be “fully documented.” Of course, we could just say, “A standard fully documents the provisions that it documents” and leave it at that. But such a tautological interpretation benefits no one in Massachusetts. We should consider the concept of enablement as we do when prosecuting patent applications. If a standard does not define a feature such that a “person having ordinary skill in the art” (PHOSITA) can “make and use” the technology described by the standard without “undue experimentation” then we cannot say that it is “fully documented.” By this definition, OOXML has huge gaps.
  2. Is the standard developed and maintained in a process that is open, transparent and collaborative? We’re talking about Ecma here. How can their process be called transparent when they do not publicly list the names of their members or attendance at their meetings, do not have public archives of their meeting minutes, their discussion list or document archive, do not make publicly available their own spreadsheet of known flaws in the OOXML specification nor of the public comments they received during their public review period? How is this, by any definition, considered “transparent”? We can also question whether the process was open. When the charter constrains the committee from making changes that would be adverse to a single vendor’s interests, it really doesn’t matter what the composition of the committee is. The committee’s hands are already tied and should not be considered “open.” If I were writing a definition of an open, transparent process, I’d be sure to patch those two loopholes.
  3. Is the standard developed, approved and maintained by a Standards Body? Without further qualifying “Standards Body” this is a toothless statement. As should be apparent right now, not all SDOs are created equal. Some are the standards equivalent of diploma mills. Accreditation is the way we usually solve this kind of problem. Ecma’s Class A Liaison status with JTC1 is not an accreditation since their liaison status has no formal requirements other than expressing interest in the technical agenda of JTC1. In comparison, OASIS needed to satisfy a detailed list of organizational, process, IPR and quality criteria before their acceptance as a PAS Submitter to JTC1/SC34. Why bother having a requirement for a Standards Body unless you have language that ensures that it is not a puppet without quality control?
  4. Is there existing or growing industry support around the use of the standard? Again, very vague. A look at Google hits for OOXML documents shows that there are very few actually in use. My numbers show that only 1 in 10,000 new office documents are in OOXML format. But I guess that is more than the 0 in 10,000 that existed last year. But is this really evidence for “growing industry support”? I’d change the language to require that there be several independent, substantially full implementations.

There are two additional questions which I won’t presume to answer since they rely more on integration with internal ITD processes.

We learn lessons and move on to the next battle. Just as GPLv2 required GPLv3 to patch perceived vulnerabilities, we’ll all have much work to do cleaning up after OOXML. Certainly JTC1 Directives around Fast Tracks will need to be gutted and rewritten. Also, the vague and contradictory ballot rules in JTC1, and the non-existent Ballot Resolution Meeting procedures will need to be addressed. I suggest that ITD take another look at their flowchart as well, and try to figure out how they can avoid getting another plucked chicken in the future.

Filed Under: OOXML, Standards

An Invitation: ODF Interoperability Workshop

2007/08/02 By Rob Leave a Comment

The OASIS ODF Adoption TC is organizing an ODF Camp to be held on September 20th in Barcelona, Spain. Facilities for this event are graciously provided by OpenOffice.org, which will be holding its annual conference concurrently.

The hope is that this will be the first of several such events to bring ODF vendors together to explore ways of greater technical coordination, especially in the area of interoperability. I’ve written about and presented on this topic before. Now is the time for action, and I’m extremely pleased that so many vendors will be attending.

On other occasions I’ve called interoperability “the price of success” because a standard implemented by only a single vendor and a single application need not worry about it. Only successful standards with many implementations need to rent a hall to bring the implementors together to review and perfect interoperability.

(It is like capital gains taxes. I grumble when I pay them, but take some solace in the fact that my investments were profitable. Those who make a losing investment don’t pay capital gains taxes on it.)

The focus of this first interoperability event will be on the ODF word processor format. Follow-up events will look at spreadsheets and presentations.

Please have a look at the detailed agenda for the camp and consider joining us in Barcelona.

Filed Under: Interoperability, ODF

Copyright © 2006-2023 Rob Weir · Site Policies