Archives for 2008

What is Rick smoking?

2008/07/17 By Rob 11 Comments

Former Microsoft consultant Rick Jelliffe has posted his own particular brand of science fiction/fantasy, this time in his favorite subgenre, a parody of a drug-induced psychosis, where after uneasy slumber Rick awakes in some alternate parallel universe and finds that JTC1/SC34 is open and transparent and OASIS is closed, and decides to write a rambling blog post about it.

If you like unintentional humor, you will enjoy reading Rick’s over-the-top post.

Rick suggests that organizationally JTC1/SC34 is a more participatory environment for developing standards than OASIS.

JTC1’s process, based on National Body voting is both effective … and more genuinely open, because it is impossible to stack either directly or indirecty.

Let’s test that proposition. Let’s compare OASIS and JTC1/SC34.

Who can participate? In OASIS, anyone can participate, from any company, organization, government agency, non-profit corporation in the world. Or you can join as an unaffiliated individual, as many have. You don’t need your government’s permission to join. You just do it. Most join with a nominal membership fee ($300 for individuals) but membership grants are available in some cases, when the fee would be burden for active individual contributors.

What about participation in JTC1/SC34? First, you must be a member of your NB. How do you become a member of your NB? In the US the price is $1,200 and you must be representing a company or organization. Individuals? Sorry, you are not allowed to participate. In other countries the rules vary. In some cases membership is not available at all at any price. You are essentially wait-listed until an opening becomes available. (Sorry, we don’t have enough seats, we heard in Portugal). In some countries, like China, membership is forbidden to native citizens who are employees of foreign subsidiaries in China. In other countries you can’t join at all. It is entirely a government decision. So, good luck joining the NB of Syria, where the constitution has been suspended under emergency rule since 1963. (But somehow they managed to make time to vote on the OOXML ballot. Zimbabwe as well, that paragon of open participation.)

Now, it is entirely possible for a standards organization to appear open, but in practice to be inaccessible. So we must look at the complete cost of participation, not just the initial membership fees.

The OASIS ODF TC does its work entirely on an email list, a wiki, and via weekly phone calls, which are toll-free calls for most participants. I don’t recall there ever being a face-to-face meeting, certainly not so long as I’ve been a member. This use of technology lowers the barrier to participation, so anyone can be effective on the TC if they wish. In particular it makes it easier for those who have day jobs and can only contribute to the mailing list during non-work hours.

What about JTC1/SC34? To participate effectively requires attendance at several international meetings each year (Plenary’s, WG’s, Ad-hocs, BRM’s, etc.), as well as participation at NB meetings. Since many of the participants are representative of large corporations or government agencies, a junket mentality prevails and the meetings are often held in some of the most expensive places in the world: Geneva, Granada, London, Kyoto, Jeju Island, etc.

JTC1 does not allow meeting participation by telephone. Since important votes, are held at these meetings, and no provision is made for remote participation, one cannot effectively participate in JTC1/SC34 without a substantial budget for international travel. Attendance at a single meeting — the DIS 29500 BRM — was $3687.52 for me, and I flew coach and ate cheap. How many standards meetings like that can you as an individual or your small company afford per year?

Further, note the nature of your membership — what can you actually do? Can you vote? In OASIS, it is one person/one vote. In the TC, your vote as an individual with a $300 membership fee is counted exactly the same as my vote representing an OASIS Foundational Sponsor. At the organizational level, it is one company/one vote, and the smallest OASIS member organization has exactly the same vote as the largest.

In JTC1/SC34 however, you typically can’t vote at all. NB’s vote, not individuals, not companies. So your opinion and your wishes are subject to the will of your NB. If your opinion varies from your NB’s, you may not be accredited to attend an international meeting, and even if you are able to attend you may not be allowed to speak your opinions. This extra level of indirection and censorship means that you, as an individual, can do little. And to the extent your NB’s committee is stacked by a single vendor and their partner community, or your NB decides to overrule or ignore its technical committee, or Microsoft calls your head of state to change the NB’s vote, or any of the dozens of other documented shenanigans that recently occurred, your entire membership fee and participation will be an entire waste of time, money and effort.

Membership is OASIS is far more open and inclusive. You join. You discuss. You vote. Period. In JTC1/SC34, you are mired in layers of bureaucracy at the national and international level, in a system crafted by and for the big boys to cut back room deals and manipulate the process to the benefit of large corporations.

(Now that isn’t to say that there are not some individual consultants out there who thrive in the JTC1 environment by mastering its dark, dusty, demon-haunted hallways. Even the largest corporations occasionally have need of this expertise, as Rick and others are quite aware. If JTC1/SC34 were truly open and transparent, such skills would not be needed. You certainly don’t see anyone selling their services to help companies navigate OASIS, do you?)

What about transparency? As Rick demonstrates, OASIS meeting minutes and agenda are all posted and public. So is our mailing list. So are all of our drafts. So is our member and public comments.

But in JTC1/SC34, most of the documents are private, only accessible to SC34 members by password. And then occasionally JTC1 will step in prevent SC34 from releasing their own work , suppressing documents even from their own SC members. There are no public comments to speak of, and member comments on draft standards are secret.

So when you are back from your “trip”, Rick, please let us know again, who wins on openness, participation and transparency?

And for the record, a couple of outright deceptions in Rick’s post:

Rick says that there are 80 NB’s, and thousands people participating in JTC1, but only 13 people participating on the ODF TC. This is a particularly inept comparison. Why is he comparing all of JTC1 to a single OASIS TC? If you look at OASIS overall, you will see that OASIS has more than 5,000 participants, representing over 600 organizations and individual members in 100 countries. The ODF TC itself has 53 members, including 7 members of JTC1/SC34.
Rick picks a “random” ODF TC minutes post from a year ago to attempt to suggest domination by a single company. Not so random a choice, methinks. It was a rare joint meeting of the ODF TC and the Metadata subcommittee, which brought in a far greater number of Sun employees than typically participate in a call.

Toy Soldiers

2008/07/16 By Rob 3 Comments

It is interesting to watch the activities of JTC1/SC34 as they go through the motions of processing activities related to OOXML, long after any serious justification for their continuation has ceased. That is the nature of bureaucracy — wind up their clockwork and watch the little soldiers go through their prescribed motions. Come back in an hour and they may be stuck in a corner or knocked over onto the floor. But they’ll keep on shuffling their feet, back and forth, in small steps toward ends unknown and unknowable, the little senseless mechanical men.

One example is the proposals in SC34 to create a new project to create a Technical Report on translating between ODF 1.0 and OOXML 1.0. This might have made sense at some point in the past. But this proposal seems out of place now.

Consider:

Few applications today support exclusively ODF 1.0 and only ODF 1.0. Most of the major vendors also support ODF 1.1, one (OpenOffice 3.x), now supports draft ODF 1.2 as well.
No one supports OOXML 1.0 today, not even Microsoft.
No one supports interoperability via translation, not Sun in their Plugin, not Novell in their OOXML support, and not Microsoft in their announced ODF support in Office 2007 SP2.

So the proposal essentially will be to create an technical report for a translation task that no vendor is implementing, between versions of the ODF and OOXML standards that no vendor is supporting.

Excuse me if my enthusiasm is muted.

And yes, the proposers want accelerated processing for this proposal. But the idea was already obsolete the day it was proposed to SC34. Events have overtaken it, though the clockwork motions continue, and SC34 is currently having a ballot for this proposal, ending on 29 July. I’m not in favor of it. Perhaps it would be worth considering if resubmitted in one year’s time, and was targeted to consider ODF 1.2 and OOXML 1.1 (or whatever their next version is). But is it really a priority for SC34 now?

Another example of working on autopilot is the ad-hoc working group in SC34 looking at OOXML maintenance. Although it was heralded with much pomp “SC takes control of OOXML”, the fact is SC34 currently can’t even look at OOXML, let alone maintain it. They are entirely impotent. But still they will go through the motions and meet next week in London to advise Alex Brown, who will then take all this advice and later formulate and write up his OOXML maintenance plan for SC34 to vote on.

All the best to them. They voted on OOXML without seeing it. Now they’ll determine how to maintain it without seeing it. Maybe ISO should stand for Invisible Standards Organization? Maybe one of the participants can let me know where can I submit my invisible defect report?

In any case, since Microsoft has effective voting control of SC34, after almost two years of packing the committee, my bet is that OOXML will effectively be handed over to Ecma for maintenance. That is what JTC1 has done for every other Ecma Fast Track that has been approved. They might call it a “maintenance group” and allow token participation from SC34 liaisons in a non-voting capacity, but in all important ways it will remain Microsoft/Ecma standard. In the end, this makes some sense. Who is better positioned to clarify exactly how Excel financial functions work, the Microsoft engineer who has access to the Excel source code, or an SC34 representative from Kazakhstan?

Given the leisure to do the job right, my bet is on Microsoft. Everyone knows it for what it is now. There is no longer need for elaborate attempts to disguise the fact that OOXML is and will remain a Microsoft-only standard. Why continue the charade? If Microsoft put OOXML on MSDN, at least we would all have access to it and would know where to send our defect reports to, which is more than we can say about ISO OOXML. A real open standard is preferred, of course. But given a choice of fake ISO standard and a real MSDN specification, I’ll take the real MSDN specification any day.

Spirits

2008/06/26 By Rob 8 Comments

Google Books has provided me endless hours of enjoyment. Sure, the collection seems to consist mainly of old, out of copyright books, but that makes it even more enticing. It is like a fine used-book store, without the musty smell.

Imagine. It is late at night, and the library is closed and Amazon cannot deliver fast enough. But you absolutely must have immediate access to Kramer and Sprenger’s witch-hunting manual, the Malleus Maleficarum, specifically the 1669 Lyon edition. What do you do?

A quick search of Google Books, and there you have it, in full text, with the ability to download it in PDF if you want.

What you do with the “Hammer of Witches” is your own business. But who wouldn’t have a little extra bounce in their step knowing they have a copy tucked away on their USB memory stick, “just in case”?

And so, I was browsing through “many a quaint and curious volume of forgotten lore” when I came across a book with the imposing title The Moral Aspects of Medical Life, Consisting of the ‘Akesios’ of K.F.H. Marx, translated from the German with Biographical Notes and Illustrative Remarks by James Mackness, M.D., published in London in 1846. Note that this is by Karl Friedrich Heinrich Marx (1796-1877), the German physician, not to be confused with the founder of modern communism Karl Heinrich Marx (1818-1893).

Within this text is printed a most remarkable diagram, “A Moral and Physical Thermometer”. A footnote ascribes it to another work,’Public Characters’, London, 1801, pg 499.

In any case here it is, a marvelous example of data presentation and visualization from 200 years ago. It shows, in parallel scales, the strength of various drinks as well as their effects.

Now, mind you that this diagram is coming from an advocate of water drinking, but it is an interesting view of the state of drink in 1801 London.

A few things stand out:

Cider (to American readers) is non-alcoholic. Here it apparently refers to hard cider. Perry is like hard cider, only made from pears. Don’t see much of that around here.

Why is wine listed as weaker than porter and great beer? Today porter is 5% ABV or so, while wine is 12% or so. And even strong beers today are 6-7% ABV. So were things so much different back in 1801? Or is the chart correlated with the social stigma attached to the drink, and not necessarily the potency of the drink?

Some of the drinks are obscure today, in particular toddy (any hot, spiced drink, like mulled cider), crank (unknown — not listed in the OED), flip (eggnog-like drinks) and shrub (lemon juice and rum)

Hysteric Water — we may forget that the root word here is from the Greek “hystera”, meaning uterus. It was thought that hysteria was predominately a disorder of women, and was treated by various means, including “Hysteric Water”, which according to one period recipe was brandy infused with medicinal herbs: valerian, pennyroyal, rue, mugwort, savin, orange peel, and lovage. It was taken in small doses: 1 ounce.

So why is Hysteric Water listed as the worst of the lot, the one where drinking it leads one to risk whipping or exile to the Australian penal colony of Botany Bay? Perhaps there was some additional social stigma attached.

One period commenter described the state of affairs like this:

Many a good woman, who would start at the very mention of strong waters, cannot conceive there can be any harm in a cordial. And as the fair sex are more particularly subject to a depression of spirits, it is no wonder that they should convert their apothecaries’ shops into rich cordial warehouses, and take drams by way of physic; as the common people make gin serve for meat, drink, and clothes. The ladies perhaps may not be aware, that every time they have recourse to their Hartshorn or Lavender Drops, to drive away the vapours, they in effect take a dram ; and they may be assured, that their Colic, Surfeit, and plague Waters, are to be ranked among spirituous liquors, as well as the common stuff at the gin-shop. The college of Physicians, in their last review of the London Dispensatory, for this very reason expelled the Strong Water, generally known by the soothing name of Hysteric Water; because it was a lure to the female sex to dram it by authority, and to get tipsy secundum artem.

If any of my fair readers have at all given into this pernicious practice of dram-drinking, I must intreat them to leave it off betimes, before it has taken such hold of them, as they can never shake off. For the desire of drams steals upon them, and grows to be habitual, by imperceptible degrees : as those who are accustomed to take opiates, are obliged to encrease the dose gradually, and at last cannot sleep without it.

But it is comforting to know that then, as today, a moderate beer is still good for “reputation, long life and happiness”.

Beautiful Word Clouds

2008/06/26 By Rob 16 Comments

We’ve all seen tag clouds by now, the visualization technique that shows the importance (however defined, but typically by prevalence) of a word by assigning a proportionately sized font.

But now comes along a tool that treats these clouds as art. Wordle’s “Beautiful Word Clouds” is quite addictive, allowing you to enter the raw text and then play around with layout algorithms, fonts and coloring schemes to produce some very nice looking clouds. The author — Jonathan Feinberg — works here at IBM, a fact I did not discover until I had already wasted hours playing with the tool. So maybe I can count this as work now?

Here are a few examples of word clouds formed by analyzing three different texts. Can you guess the identity of the three texts?

Some of my wish-list items are:

Apply a stemming algorithm to conflate words with the same root. So in the last example, “standard” and “standards” are counted separately, when they are probably best counted as the same word.
Auto generate an image map associated with the cloud
Export to PNG (even if just written temporarily to server, I can download it from there)
I’d love to read a paper on how the layout algorithms works
What would happen if you combined Kohonen self-organizing maps with word clouds? Arrange the words so their proximity in the cloud was correlated with co-occurrence in the text.

Fractured YEARFRAC and Discounted DISC

2008/05/19 By Rob 48 Comments

There are a few things that a spreadsheet can get wrong. You can be hard to use. You can be ugly. You can be slow. You can even crash occasionally. These are all annoyances. With a monopoly, annoyances are a way of life. However, there is one place where a spreadsheet can never be wrong, and that is in the area of calculations, especially financial calculations. A business uses spreadsheets to model and track its business. It makes decisions based on these spreadsheets. If the spreadsheet gives the wrong answer, then the business makes the wrong decision. And a business that makes the wrong decisions does not stay in business for long.

So it was with much lament last year that I reported that OOXML, then under ballot in ISO/IEC JTC, had many egregious errors in its spreadsheet formula definitions. In addition to enumerating these errors in my blog, I submitted them for consideration by INCITS V1, the US SC34 mirror committee, and these became part of the bundle of comments with which the US accompanied its ballot.

Although the ballot wasn’t due until September 2nd, the extravagant “sit and do nothing” provisions in INCITS lead to our technical review being cut off in July. Because of the lack of review time, I was not able to do an in-depth review of all the spreadsheet functions, but only a cursory review. But the existence of such errors as I did identify, in the already-approved Ecma Standard, was disquieting. It should have lead Ecma TC45 to conduct a more through review of the spreadsheet functions. But this did not appear to happen.

When we received the response from Ecma to the NB comments, in January, INCITS V1 was asked to go through all the responses (over 1,000 of them) to determine whether they were acceptable. This review period was again insufficient.

As for the BRM in February, this was a travesty, as I and others have noted.

I have a theory concerning committees. A committee may have different states, like water has gas, liquid or solid phases, depending on temperate and pressure. The same committee, depending on external circumstances of time and pressure will enter well-defined states that determine its effectiveness. If a committee works in a deliberate mode, where issues are freely discussed, objections heard, and consensus is sought, then the committee will make slow progress, but the decisions of the committee will collectively be smarter than its smartest member. However, if a committee refuses to deliberate and instead merely votes on things without discussion, then it will be as dumb as its dumbest members. Voting dulls the edge of expertise. But discussion among experts socializes that expertise. This should be obvious. If you put a bunch of smart people in a room and don’t let them think or talk, then don’t expect smart things to happen as if the mere exhalation of their breath brings forth improvements to the standard.

So the the BRM ended a different committee than it started, and the mode of operation it was lead into caused it to act like a very stupid committee indeed. I don’t say this to be accusatory. I’m just making an observation about crowd behavior. When a committee of experts ceases to be a deliberate committee, then you will achieve subpar results.

One of the ways the BRM was stupid is that it approved changes to OOXML that have totally broken SpreadsheetML’s financial calculations, rendering the resulting calculations both mathematically incorrect as well as inconsistent with what Excel actually calculates. More about this later.

Here be dragons.

One of the persistent problems with OOXML has been in the area of “day count conventions” as used in SpreadsheetML.

Why should day counting be complicated? Just count how many days between the two dates and you are done, right? Indeed, everywhere but in finance it is simple. Some of the complications are for historical reasons, to try to make hand calculations easier in the pre-computer era. Also, these conventions made the calendar more regular, so financial instruments were less distorted by calendar irregularities like leap years or variable length months. If you assume a year is exactly 360 days in 12 months with 30 days each, then some things in life are simpler. Of course, this makes other things in life more complicated, including defining spreadsheet functions.

The devil is in the details. You must get these count conventions right. Being wrong by just one day in a year may only be 0.3%, but in a million dollar transaction, that is $3000. Not many people can afford to routinely ignore a $3000 error.

Different financial organizations have developed their own different day count conventions. We have the Banker’s Rule, the LIBOR rule, the NASD rule, the ISDA rule, ICMA rule, SIFMA rule, etc.

In Excel, these conventions are defined by a function parameter called the “basis” and this basis is used by many financial spreadsheet functions. In fact, none of the following functions are adequately defined unless the day count basis values are adequately defined:

ACCRINT()
ACCRINTM()
AMORDEGRC
AMORLINC
COUPDAYBS()
COUPDAYS()
COUPDAYSNC()
COUPNCD()
COUPNUM()
COUPPCD()
DISC()
DURATION()
INTRATE
MDURATION()
ODDFPRICE()
ODDFYIELD()
ODDLPRICE()
ODDLYIELD()
PRICE()
PRICEDISC()
PRICEMAT()
RECEIVED()
YEARFRAC()
YIELD()
YIELDDISC()
YIELDMAT()

There are five basis conventions defined in OOXML, with values 0-4:

Basis 0 = US 30/360
Basis 1 = Actual/Actual
Basis 2 = Actual/360
Basic 3 = Actual/365
Basic 4 = European 30/360

In this convention X/Y indicates that X is the assumption for month length, and Y is the assumption for year length. Note that there is no external authority given as a reference for any of these conventions. So it is not clear, whether the Actual/Actual convention is following, for example ICMA rules or ISDA rules. In fact, it may be following neither rule. So lacking any external authority, all we have to go on are the definitions provided in the specification itself. OOXML is its own authority for day count conventions.

As I reported last July, the definitions provided by OOXML did not sufficiently define the behavior of the conventions that lie behind most of the financial functions in Excel. Unfortunately, Microsoft/Ecma have failed to fix this problem in their proposed Resolutions, and in fact have made it worse. Further, the DIS 29500 BRM, in its negligent bulk approval of Ecma’s responses merely advanced these serious errors into the text which was then approved as an International Standard. So essentially, my work in pointing out errors in the spreadsheet language was for naught. Microsoft just shoved it all though anyways.

This puts me in a delicate situation. On the one hand, the ODF TC really would like to finish its work on ODF 1.2, and part of that is completing the OpenFormula work. A key remaining part of OpenFormula is to ensure that our financial functions synch up with how Excel works. What OOXML says is irrelevant, except to the extent that it accurately tells how Excel defines these functions.

However, inquiries to Ecma on these day count conventions, inqueries made months ago, have received no response. Also, the final DIS text of OOXML has not been made available, not even to SC34 members and liaisons. And there is not mechanism in place yet in SC34 for collecting defect reports on OOXML. So we’re stuck.

Or maybe not.

David Wheeler, Chair of the OpenFormula subcomittee of the OASIS ODF TC, has been trying to nail down the behavior of Excel’s spreadsheet functions for over a year now. One of the last remaining pieces is to nail down the day count conventions. After waiting and waiting for this to be clarified in OOXML, David took matters into his own hands and decided to solve the problem by brute force, enumerating millions of test cases, indeed a comprehensive set of date pairs over a 6 year period, to try to determine exactly how the date bases in Excel work. You can read David’s conclusions on his blog.

What strikes me in David’s report is that not only are the OOXML definitions incomplete and inconsistent, but they do not accurately reflect what Excel actually calculates. So either Excel is wrong, or the OOXML standard is wrong when calculating almost every financial spreadsheet function. This is quite an embarrassment for an ISO standard, and an unnecessary one, since I have been talking about how poorly defined these functions are for almost a year now.

Let’s work through some of David Wheeler’s test cases by hand, to get a better feel for how OOXML is broken.

Let’s take the YEARFRAC() function as the simplest example. YEARFRAC() takes two dates and a basis value as inputs, and returns the fraction of a year that is between those two dates. So logically, the calculation is like:

YEARFRAC = (interval between start-date and end-date) / length of year.

That is the logical definition. The only complication is the 5 date bases, and what exactly the year length is. This last point is something that stuck out in my mind when I first reviewed the draft of OOXML last summer. You might think the year length is 365 days. But what about leap years? And what about date ranges that straddle normal years and leap years? This is the key fact that this function requires in is definition. This is the problem I reported on my blog last July. This was the problem INCITS V1 submitted with our ballot comments last September. This is the problem that Ecma responded to in a severely flawed way in January. This is the problem that the BRM refused to discuss and merely agreed with Ecma’s flawed changes. And this is the problem that is now in the final DIS version of OOXML.

Take a look for yourself in this brief extract from the final DIS version [pdf] of OOXML, provided for purposes of critical review and commentary.

(Yes, I now have a complete copy of the final DIS version of OOXML. If you think that this is unfair — and I would agree with you on that — then maybe you should ask ITTF why I was able to get a copy of the final DIS, but no one else in SC34 was.)

Let’s take a look at the ISO OOXML definitions and try some test calculations to reproduce some of David’s findings.

First, let’s take the date basis 1 first, Actual/Actual, since that is the easiest. But we immediately run into a problem. The standard says two different things. In the “description” table it says:

Actual/actual. The actual number of days between the two dates are counted. If the date range includes the date 29 February, the year is 366 days; otherwise it is 365 days.

However, later, when defining the return value, the standard says this:

If the Actual/actual basis is used, the year length used is the average length of the years that the range crosses, regardless of where start-date and end-date fall in their respective years.

There is absolutely no way in which these can both be correct. This would have been easily fixable at the BRM, if the BRM had been allowed to do its job. But I wasn’t even allowed to open my mouth and point out this problem. So now this fatal ambiguity sits in the text of OOXML, as authorized by the BRM experts and approved by JTC1 NB’s. Gotta love it.

But let’s forge on and assume that we really have two algorithms, the first (from the description) which we will call basis 1, and the second basis (from the return value section) which we will call 1′ (“one-prime”). We’ll calculate it both ways.

Let calculate YEARFRAC(DATE(2000,1,1),DATE(2001,1,1),1)

With basis 1, this is simple. 2000 was a leap year (since it was a century divisible by 400) so the interval between the two dates is 366 days. Similarly, since the date range includes a February 29th, the length of the year used in the calculation is 366. So the returned value returned by YEARFRAC should be 366/366 or 1.0.

With basis 1′, this is also simple. The date interval is still 366. But the length of the year is now the average of the year lengths crossed by the date range. So the average of 366 (for 2000) and 365 (for 2001), or an average of 365.5. So using basis 1′, YEARFRAC should return 366/365.5 or 1.00137.

Let’s try another, with basis 1. What is YEARFRAC(DATE(2000,1,1),DATE(2002,1,1),1) ?

In basis 1, the interval is 731 days (366 days in 2000 plus 365 days in 2001). The year length is 366, since the interval includes a February 29th. So YEARFRAC should return 731/366 = 1.997.

With basis 1′, the interval is also 731 days, but the year length is the average of 366 (for 2000), 365 (for 2001) and 365 (for 20002). So YEARFRAC should return 2.0009

Just to make sure we have this down, let’s try another example:

What is YEARFRAC(DATE(2000,1,1),DATE(2000,1,2),1) ?

A one day interval? Yes, please humor me.

OK. With basis 1, the interval is one day. Since it does not cross February 29th, the year length is 365 days. So it will return 1/365 or 0.0028.

With basis 1′ the interval is also one day, and the year length is 366, So it will return 1/366 or 0.0027

OK. That was too easy. One more example, to make sure that you have it down.

What is YEARFRAC(DATE(2000,1,1),DATE(2004,1,31),1) ?

Hmm…. this one will require more thought. I might have to take off my shoes and count using my toes as well.

With basis 1, the interval is 1491 days = 366 + 365 + 365 +365 +30.

The year lengths are:

2000=366 (since the range cross February 29th)
2001=365
2002=365
2003=365
2004=365 (since the date range does not cross February 29th)

So we have 4 full years plus 30 days of a 365 day year. YEARFRAC should return 4 + 30/365 = 4.0822.

With basis 1′ we treat the 2004 as having 366 days and average the years in the interval, so average year length = (366+365+365+365+366)/5 = 365.4. So YEARFRAC should return 4 + 30/365.4 = 4.0821.

Now that we’re done with the examples, we can throw them into a table, and compare them to what Excel 2007 calculates for these same parameters:

start-date	end-date	basis	ISO Value	Excel’s Value	Excel Correct?
2000-01-01	2001-01-01	1	1.0000	1.0000	Yes
2000-01-01	2001-01-01	1′	1.00137	1.0000	No
2000-01-01	2002-01-01	1	1.9972	2.0009	No
2000-01-01	2002-01-01	1′	2.0009	2.0009	Yes
2000-01-01	2000-01-02	1	0.0028	0.0027	No
2000-01-01	2000-01-02	1′	0.0027	0.0027	Yes
2000-01-01	2004-01-31	1	4.0822	4.0805	No
2000-01-01	2004-01-31	1′	4.0821	4.0805	No

Let your mind linger on these results for a bit. Let it sink in. Look at this table until you recognize its significance and cringe in disgust. In some cases, Excel seems to be using the first definition of date basis 1. In other cases it is using the second definition of date basis 1. And in one case it is using neither definition. In other words, this is a lot more screwed up than at first it appears. This is not just a simple ambiguity. The OOXML definition of date basis 1 is totally wrong.

That is just the Actual/Actual date basis. The other 4 conventions are more complicated. I encourage you to read David’s write up in full to see how 3 of 5 basis conventions defined in OOXML differ from what Excel actually calculates. David also shows how he believes Excel really calculates these day count conventions, based on his extensive tests.

Now if this was just a matter of one function in Excel, just YEARFRAC(), then this would not be a big deal. But this is flaw is inherited into most financial functions in OOXML. Let’s take an example at random, DISC(), which calculates the discount rate for a security, given settlement and maturity dates, as well as par and redemption values. You can read the definition of DISC() from the final DIS text here [pdf].

You don’t need to be an Wall Street quantitative analyst to see some obvious problems here. First, in the formula given, the 2nd term is divided by “SM”. What is SM? There is no “SM” defined here. There is a “DSM” defined, however. Is that what is meant? Let’s assume so.

We can try a test calculation, using the example given in the text of OOXML:

What is the value of: DISC(DATE(2007,1,25),DATE(2007,6,15),97.975,100,1) ?

B = 365, since the date range does not include February 29th. (Note that DISC has a single definition for year length in basis 1, not the two conflicting definitions we saw in YEARFRAC)
DSM = 141 days = 6 in Jan+28 in Feb+31 in March+30 in April+31 in May+15 in June
par = 97.975, which was our input parameter
redemption = 100, which is another one of our input parameters

You can do the calculation given in the text, and find the return value of DISC should be 5.35037%. Plug the example into Excel and you will get the answer 5.24202%.

So Excel is off by 2% or so. Do we really care. It’s just money, right?

The problem is that the function in OOXML is defined incorrectly, from the financial perspective. The discount rate is the discount from the redemption value, not the discount from the purchase price. So the first term in the formula should be (redemption-par)/redemption, not (redemption-par)/par. If you make this change, then the calculated value matches the value Excel gives.

Does anything strike you as odd in the above? Do you have a chill running down your spine? Do you have renewed sense of dread? You should, because I just illustrated another grave problem with the OOXML standard: The spreadsheet examples in OOXML are a fraud.

You might have mistakenly been reassured by these numerous examples in the spreadsheet formula, that these actually had some relationship to the standard, that they were examples of how the calculations should be done, that they were evidence of some sort of quality assurance, that they may even be of assistance to an implementor to see whether they implemented the function correctly.

But they aren’t.

What would be normal practice would be to take the definitions, as given in the OOXML text, and to calculate the values according to the definition provided in the text, and then to compare the resulting values with what Excel returns. That would provide a useful double-check of the definitions in the text. But OOXML doesn’t do that. The examples here are mere fluff.

The discrepancy here also indicates that no one has actually reviewed these formulas for accuracy. Errors like this are immediately evident, but only if you look. The fact that things like this have escaped the notice of Microsoft, Ecma TC45, their Project Editors, 80 NB reviews, the BRM experts, and the eagle eyes of ITTF, should make one have considerable concerns over the the sufficiency of the Fast Track review and approval process.

Let’s try one more example before we wrap this up.

What is the value of: DISC(DATE(2000,1,1),DATE(2004,1,31),97.975,100,1) ?

B = 366, since the date range includes February 29th
DSM = 1491 days = 366 + 365 + 365 +365 +30.
par = 97.975, which was our input parameter
redemption = 100, which is another one of our input parameters

So, using the stated formula in the OOXML text, the answer would be 0.50736%. Using the “correct” formula, in the financial sense, the answer would be 0.49708%. What does Excel say the value is? Well, 0.49627%, of course.

So with the DISC() function we found:

The given text provided a formula that referred to a non-existant “SM” variable. This appears to be a cut & paste error.
After accounting for that, we found that the formula was incorrect according to recognized financial standards. Securities are discounted from their redemption values, not from their purchase prices.
Even correcting for the formula errors, we find that the given definition of DISC() does not match what Excel returns, due to errors and ambiguities in the day count conventions, errors that David Wheeler delves into more deeply in his report.
The examples given in the standard are bogus. They are not actually examples of the defined function.
Excel does not implement OOXML.