A quick update on my post from last week on the “Microsoft Shuffle“, where I looked at how Microsoft’s “random” browser ballot was far from random.

First, I’d like to thanks those who commented on that post, or sent me notes, offering additional analysis. I think we nailed this one. Within a few days of my report Microsoft updated their Javascript on the browserchoice.eu website, fixing the error. But more on that in a minute.

Some random observations

Several commenters mentioned that if you search Google for “javascript random array sort” the first link returned will be a Javascript tutorial that has the same offending code as Microsoft’s algorithm. This is not surprising. As I said in my original post, this is a well-known mistake. But it is no less a mistake. If you use Google Code Search for the query “0.5 – Math.random()” lang:javascript you will find 50 or so other instances of the faulty algorithm. So if anyone else is using this same algorithm, they should evaluate whether it is really sufficiently random for their needs. In some case, such as a children’s game, it might be fine. But know that there are better and faster algorithms available that are not much more complicated to code.

Another thing to note is that the Microsoft Shuffle algorithm is bad enough with 5-elements in the array, but the non-randomness gets more pronounced as you increase the length of the array. Regardless of the size of the array, it appears that on Internet Explorer the 1st element will end up in last place 50% of the time. There are other pronounced patterns as well. You can see this yourself this this test file, which allows you to specify the size of the array as well as the number of iterations. Try a 50-element array for 10,000 iterations to get a good sense of how non-random the results can be.

I used that script to run a large test of 1,000,000 iterations of a 1024-element array. The raw results are here. I took that table, and using R’s image() function produced a rendering of that matrix. You can see here the clear over-representation at some positions, including (in the lower left) the flip of the first position to last place. (I’m not quite satisfied with this rendering. Maybe someone can get a better-looking visualization of this same data.)

Evaluating Microsoft’s new shuffle

Sometime last week — I don’t know the exact date — Microsoft updated the code for the browser choice website with a new random shuffle algorithm. You see see the code, in situ, here. The core of it is in this function:

function ArrayShuffle(a)
{
    var d, c, b=a.length;
    while(b)
    {
        c=Math.floor(Math.random()*b);
        d=a[--b];
        a[b]=a[c];
        a[c]=d
    }
}

This looks fine to me. I created a new test driver for this routine, which you can try out here. Aside from being much faster, it is gives much better results. Here is a run with a million iterations:

Raw counts

Position I.E. Firefox Opera Chrome Safari
1 199988 200754 199944 199431 199883
2 200320 200016 199838 199752 200074
3 199702 199680 199911 200865 199842
4 200408 200286 199740 199861 199705
5 199582 199264 200567 200091 200496

Fraction of total

Position I.E. Firefox Opera Chrome Safari
1 0.2000 0.2008 0.1999 0.1994 0.1999
2 0.2003 0.2000 0.1998 0.1998 0.2001
3 0.1997 0.1997 0.1999 0.2009 0.1998
4 0.2004 0.2003 0.1997 0.1999 0.1997
5 0.1996 0.1993 0.2006 0.2001 0.2005

And the results of the Chi-square test:

X-squared = 18.9593, df = 16, p-value = 0.2708

Final thoughts

In the end I don’t think it is reasonable to expect every programmer to be memorize the Fisher-Yates algorithm. These things belong in our standard libraries. But what I would expect every programmer to know is:

  1. That the problem here is one that requires a “random shuffle”. If you don’t know what it is called, then it will be difficult to lookup the known approaches. So this is partially a vocabulary problem. We, as programmers, have a shared vocabulary which we use to describe data structures and algorithms; binary searches, priority heaps, tries, and dozens of other concepts. I don’t blame anyone for not memorizing algorithms, but I would expect a programmer to know what types of algorithms apply to their work.
  2. How to research which algorithm to use in a specific context, including where to find reliable information, how to evaluate the classic trade-offs of time and space, etc.  There is almost always more than one way to solve a problem.
  3. That where randomized outputs are needed,  the outputs should be statistically tested. I would not expect the average programmer to know how to do a chi-square test, or even to know what one is. But I would expect a mature programmer to know either find this out or seek help.

{ 21 comments }

Evidently today is National Grammar Day.  I am not a fan.

Like most Americans of my generation I was taught to identify parts of speech, diagram sentences and intone with the rest of the class the mysteries of the three-and-twenty most holy helping verbs: “is, am, are, was, were, be, being, been, have, has, had, do, did, does, may, must, might, can, could, will, would, shall, should”.   Because I was good at it, and felt a call to the service of pedantry, I continued my novitiate in stranger accents, in German, Latin and Greek.

I was well on my path the the priesthood of a grammarian, when in 1992 I abandoned all my vows in a bus in Somerville, Massachusetts, when a drunk showed me what language was really all about.

Somerville, Massachusetts
Image via Wikipedia

I’m not one to start a conversation with a stranger –  even a sober one — on public transportation.  But in this case I had little choice in the matter, since this particular gentleman insisted on initiating a debate on the virtues of the Allman Brothers, a subject which I was neither equipped nor inclined to discuss with him.

When I expressed my disinclination to debate, and further, my ignorance of all things Allman, the dear fellow was offended and let out a string of expletives, starting with “Un-freakin’-believable” (albeit with a more emphatic, saltier interposed participial adjective than I can relate to you here) and continuing for several minutes.  Nothing he said was grammatical.  Little was even coherent.  But what I did understand was pure genius.  I wish I had a tape recorder.  As my stop approached, I hesitated a moment, intending to thank the man, offer him my congratulations and laud him as a poet of the first order.  But the smell, as well as my own instinct for self-preservation, held me in abeyance.

Since that day I have been an apostate to grammar.  I think we should all have a range of ways to speak and write,  and should be able to modulate according to circumstances. Language is like a wardrobe.  A man should have jogging shorts as well as a tuxedo.  In the end, language is not about rules.  It is about suiting the words to the occasion, of putting the right words in the right places, and what is “right’ will depend on circumstances.

So down with grammar, down with the rules! Go, split an infinitive, dangle your participles, and like my good friend on the #86 bus, even separate your inseparable prefixes.  To quote Duke Ellington, “If it sounds good, it is good”.  And remember that the goal in the end is expression and understanding.  If you are understood, then you’ve accomplished more than many.

As Gertrude Stein wrote:

Clarity is of no importance because nobody listens and nobody knows what you mean no matter what you mean, nor how clearly you mean what you mean. But if you have vitality enough of knowing enough of what you mean, somebody and sometime and sometimes a great many will have to realize that you know what you mean and so they will agree that you mean what you know, what you know you mean, which is as near as anybody can come to understanding anyone.

Reblog this post [with Zemanta]

{ 4 comments }

March 6th Update:  Microsoft appears to have updated the www.browserchoice.eu website and corrected the error I describe in thi post.  More details on the fix can be found in The New & Improved Microsoft Shuffle.  However, I think you will still find the following analysis interesting.

-Rob


Introduction

The story first hit in last week on the Slovakian tech site DSL.sk.  Since I am not linguistically equipped to follow the Slovakian tech scene, I didn’t hear about the story until it was brought up in English on TechCrunch.  The gist of these reports is this: DSL.sk did a test of the “ballot” screen at www.browserchoice.eu, used in Microsoft Windows 7 to prompt the user to install a browser.  It was a Microsoft concession to the EU, to provide a randomized ballot screen for users to select a browser.  However, the DSL.sk test suggested that the ordering of the browsers was far from random.

But this wasn’t a simple case of Internet Explorer showing up more in the first position.  The non-randomness was pronounced, but more complicated.  For example, Chrome was more likely to show up in one of the first 3 positions.  And Internet Explorer showed up 50% of the time in the last position.  This isn’t just a minor case of it being slightly random.  Try this test yourself: Load www.browserchoice.eu, in Internet Explorer, and press refresh 20 times.  Count how many times the Internet Explorer choice is on the far right.   Can this be right?

The DLS.sk findings have lead to various theories, made on the likely mistaken theory that this is an intentional non-randomness.  Does Microsoft have secret research showing that the 5th position is actually chosen more often?  Is the Internet Explorer random number generator not random?  There were also comments asserting that the tests proved nothing, and the results were just chance, and others saying that the results are expected to be non-random because computers can only make pseudo-random numbers, not genuinely random numbers.

Maybe there was cogent technical analysis of this issue posted someplace, but if there was, I could not find it.  So I’m providing my own analysis here, a little statistics and a little algorithms 101.  I’ll tell you what went wrong, and how Microsoft can fix it.  In the end it is a rookie mistake in the code, but it is an interesting mistake that we can learn from, so I’ll examine it in some depth.

Are the results random?

The ordering of the browser choices is determined by JavaScript code on the BrowserChoice.eu web site.  You can see the core function in the GenerateBrowserOrder function.  I took that function and supporting functions,  put it into my own HTML file, added some test driver code and ran it for 10,000 iterations on Internet Explorer.  The results are as follows:

Internet Explorer raw counts
Position I.E. Firefox Opera Chrome Safari
1 1304 2099 2132 2595 1870
2 1325 2161 2036 2565 1913
3 1105 2244 1374 3679 1598
4 1232 2248 1916 590 4014
5 5034 1248 2542 571 605
Internet Explorer fraction of total
Position I.E. Firefox Opera Chrome Safari
1 0.1304 0.2099 0.2132 0.2595 0.1870
2 0.1325 0.2161 0.2036 0.2565 0.1913
3 0.1105 0.2244 0.1374 0.3679 0.1598
4 0.1232 0.2248 0.1916 0.0590 0.4014
5 0.5034 0.1248 0.2542 0.0571 0.0605

This confirms the DSL.sk results.  Chrome appears more often in one of the first 3 positions and I.E. is most likely to be in the 5th position.

You can also see this graphically in a 3D bar chart:

But is this a statistically significant result?  I think most  of us have an intuitive feeling that results are more significant if many tests are run, and if the results also vary much from an even distribution of positions.  On the other hand, we also know that a finite run of even a perfectly random algorithm will not give a perfectly uniform distribution.  It would be quite unusual if every cell in the above table was exactly 2,000.

This is not a question one answers with debate.  To go beyond intuition you need to perform a statistical test.  In this case, a good test is Pearson’s Chi-square test, which tests how well observed results match a specified distribution.  In this test we assume the null-hypothesis that the observed data is taken from a uniform distribution.  The test then tells us the probability that the observed results can be explained by chance.  In other words, what is the probability that the difference between observation and a uniform distribution was just the luck of the draw?  If that probability is very small, say less than 1%, then we can say with high confidence, say 99% confidence, that the positions are not uniformly distributed.   However, if the test returns a larger number, then we cannot disprove our null-hypothesis.  That doesn’t mean the null-hypothesis is true.  It just means we can’t disprove it.  In the end we can never prove the null hypothesis.  We can only try to disprove it.

Note also that having a uniform distribution is not the same as having uniformly distributed random positions.  There are ways of getting a uniform distribution that are not random, for example, by treating the order as a circular buffer and rotating through the list on each invocation.  Whether or not randomization is needed is ultimately dictated by the architectural assumptions of your application.  If you determine the order on a central server and then serve out that order on each innovation, then you can use non-random solutions, like the rotating circular buffer.  But if the ordering is determined independently on each client, for each invocation, then you need some source of randomness on each client to achieve a uniform distribution overall.  But regardless of how you attempt to achieve a uniform distribution the way to test it is the same, using the Chi-square test.

Using the open source statistical package R, I ran the  chisq.test() routine on the above data.  The results are:

X-squared = 13340.23, df = 16, p-value < 2.2e-16

The p-value is much, much less than 1%.  So, we can say with high confidence that the results are not random.

Repeating the same test on Firefox is also non-random, but in a different way:

Firefox raw counts
Position I.E. Firefox Opera Chrome Safari
1 2494 2489 1612 947 2458
2 2892 2820 1909 1111 1268
3 2398 2435 2643 1891 633
4 1628 1638 2632 3779 323
5 588 618 1204 2272 5318
Firefox fraction of total
Position I.E. Firefox Opera Chrome Safari
1 0.2494 0.2489 0.1612 0.0947 0.2458
2 0.2892 0.2820 0.1909 0.1111 0.1268
3 0.2398 0.2435 0.2643 0.1891 0.0633
4 0.1628 0.1638 0.2632 0.3779 0.0323
5 0.0588 0.0618 0.1204 0.2272 0.5318

On Firefox, Internet Explorer is more frequently in one of the first 3 positions, while Safari is most often in last position.  Strange.  The same code, but vastly different results.

The results here are also highly significant:

X-squared = 14831.41, df = 16, p-value < 2.2e-16

So given the above, we know two things:  1) The problem is real.  2) The problem is not related to a flaw only in Internet Explorer.

In the next section we look at the algorithm and show what the real problem is, and how to fix it.

Random shuffles

The browser choice screen requires what we call a “random shuffle”.  You start with an array of values and return those same values, but in a randomized order. This computational problem has been known since the earliest days of computing.  There are 4 well-known approaches: 2 good solutions, 1 acceptable (“good enough”) solution that is slower than necessary, and 1 bad approach that doesn’t really work.  Microsoft appears to have picked the bad approach. But I do not believe there is some nefarious intent to this bug.  It is more in the nature of a “naive” algorithm”, like the bubble sort, that inexperienced programmers  inevitably will fall upon when solving a given problem.  I bet if we gave this same problem to 100 freshmen computer science majors, at least one of them would make the same mistake.  But with education and experience, one learns about these things.  And one of the things one learns early on is to reach for Knuth.

The Art of Computer Programming, Vol. 2, section 3.4.2 “Random sampling and shuffling” describes two solutions:

  1. If the number of items to sort is small, then simply put all possible orderings in a table and select one ordering at random.  In our case, with 5 browsers, the table would need 5! = 120 rows.
  2. “Algorithm P” which Knuth attributes to Moses and Oakford (1963), but is now known to have been anticipated by Fisher and Yates (1938) so it is now called the Fisher-Yates Shuffle.

Another solution, one I use when I need a random shuffle in a database or spreadsheet, is to add a new column, fill that column with random numbers and then sort by that column.  This is very easy to implement in those environments. However, sorting is an O(N log N)  operation where the Fisher-Yates algorithm is O(N), so you need to keep that in mind if performance is critical.

Microsoft used none of these well-known solutions in their random solution.  Instead they fell for the well-known trap.  What they did is sort the array, but with a custom-defined comparison function or “comparator”.  JavaScript, like many other programming languages, allows a custom comparator function to be specified.  In the case of JavaScript, this function takes two indexes into the value array and returns a value which is:

  • <0 if the value at the first index should be sorted before the value at the second index
  • 0 if the values at the first index and the second index are equal, which is to say you are indifferent as to what order they are sorted
  • >0 if the value at the first index should be sorted after the value at the second index

This is a very flexible approach, and allows the programmer to handle all sorts of sorting tasks, from making case-insensitive sorts to defining locale-specific collation orders, etc..

In this case Microsoft gave the following comparison function:

function RandomSort (a,b)
{
    return (0.5 - Math.random());
}

Since Math.random() should return a random number chosen uniformly between 0 and 1, the RandomSort() function will return a random value between -0.5 and 0.5.  If you know anything about sorting, you can see the problem here.  Sorting requires a self-consistent definition of ordering. The following assertions must be true if sorting is to make any sense at all:

  1. If a<b then b>a
  2. If a>b then b<a
  3. If a=b then b=a
  4. if a<b and b<c then a<c
  5. If a>b and b>c then a>c
  6. If a=b and b=c then a=c

All of these statements are violated by the Microsoft comparison function.  Since the comparison function returns random results, a sort routine that depends on any of these logical implications would receive inconsistent information regarding the progress of the sort.  Given that, the fact that the results were non-random is hardly surprising.  Depending on the exact search algorithm used, it may just do a few exchanges operations and then prematurely stop.  Or, it could be worse.  It could lead to an infinite loop.

Fixing the Microsoft Shuffle

The simplest approach is to adopt a well-known and respected algorithm like the Fisher-Yates Shuffle, which has been known since 1938.  I tested with that algorithm, using a JavaScript implementation taken from the Fisher-Yates Wikpedia page, with the following results for 10,000 iterations in Internet Explorer:

Internet Explorer raw counts
Position I.E. Firefox Opera Chrome Safari
1 2023 1996 2007 1944 2030
2 1906 2052 1986 2036 2020
3 2023 1988 1981 1984 2024
4 2065 1985 1934 2019 1997
5 1983 1979 2092 2017 1929
Internet Explorer fraction of total
Position I.E. Firefox Opera Chrome Safari
1 0.2023 0.1996 0.2007 0.1944 0.2030
2 0.1906 0.2052 0.1986 0.2036 0.2020
3 0.2023 0.1988 0.1981 0.1984 0.2024
4 0.2065 0.1985 0.1934 0.2019 0.1997
5 0.1983 0.1979 0.2092 0.2017 0.1929

Applying Pearson’s Chi-square test we see:

X-squared = 21.814, df = 16, p-value = 0.1493

In other words, these results are not significantly different than a truly random distribution of positions.  This is good.  This is what we want to see.

Here it is, in graphical form, to the same scale as the “Microsoft Shuffle” chart earlier:

Summary

The lesson here is that getting randomness on a computer cannot be left to chance.  You cannot just throw Math.random() at a problem and stir the pot, and expect good results.  Random is not the same as being casual.  Getting random results on a deterministic computer is one of the hardest things you can do with a computer and requires deliberate effort, including avoiding known traps.  But it also requires testing.  Where serious money is on the line, such as with online gambling sites, random number generators and shuffling algorithms are audited, tested and subject to inspection.  I suspect that the stakes involved in the browser market are no less significant.  Although I commend DSL.sk for finding this issue in the first place, I am astonished that the bug got as far as it did.  This should have been caught far earlier, by Microsoft, before this ballot screen was ever made public.  And if the EC is not already demanding a periodic audit of the aggregate browser presentation orderings, I think that would be a prudent thing to do.

If anyone is interested, you can take a look at the file I used for running the tests.  You type in an iteration count and press the execute button.  After a (hopefully) short delay you will get a table of results, using the Microsoft Shuffle as well as the Fisher-Yates Shuffle.  With 10,000 iterations you will get results in around 5 seconds.  Since all execution is in the browser, use larger numbers at your own risk.  At some large value you will presumably run out of memory, time out, hang, or otherwise get an unsatisfactory experience.

Reblog this post [with Zemanta]

{ 163 comments }

How to photograph an asteroid

February 22, 2010

Over the years, I’ve seen Mercury, Venus, Mars, Jupiter and Saturn with my naked eyes.  And I’ve seen Uranus and Neptune through a telescope.  But I’ve never seen an asteroid until last night, when I photographed the 2nd largest minor planet, Vesta.
Vesta is currently near opposition, meaning as seen from the Earth, Vesta and the [...]

0 comments Read the full article →

Weekly Links #3

February 20, 2010

Danish Open Source Vendors declares victory in open standards war
“The mood at this year’s general meeting was joyous. In late January 2010, OSL could declare victory in maybe the most important and hard fought battle that OSL has been part of since its formation.
On 29 January 2010 the Danish Parliament (Folketinget) decided unanimously to place [...]

0 comments Read the full article →

Microsoft Office document corruption: Testing the OOXML claims

February 15, 2010

Summary
In this post I take a look at Microsoft’s claims for robust data recovery with their Office Open XML (OOXML) file format.  I show the results of an experiment, where I introduce random errors into documents and observe whether word processors can recover from these errors.  Based on these result, I estimate data recovery rates [...]

22 comments Read the full article →

Weekly Links #2

February 13, 2010

Q&A: IBM’s Alistair Rennie on the big picture for Lotus | Between the Lines | ZDNet.com
“Ultimately, new types of documents are possible. I don’t see my kids creating content 5 to 10 years from now going into a dumb text editor and doing a presentation. I see them using rich media and aggregating small bits [...]

2 comments Read the full article →

Weekly Links #1

February 6, 2010

Advogato: Proprietary File Formats conflict with Equal Opportunities
tags:  standards

Bob Sutor: What would ODF support for WordPress look like?
tags: ODF

Lotus Solutions Development Lab: Lab 04: ODFDOM: Generating ODF Documents from a Notes Agent
tags: ODF, Notes

Gwennel – A WYSIWYG and WYSIWYM editor
“Gwennel is a free WYSIWYG and WYSIWYM editor for Windows supporting natively the Open Document Format.”
tags: [...]

0 comments Read the full article →

ODF 1.2 Part 1 Public Review

January 25, 2010

A major milestone was reached for the OASIS ODF TC today.  The latest Committee Draft of ODF 1.2 Part 1 was sent out for a 60-day public review.
“What does this mean, and why should I care?” you might be asking.  That’s a fair question.
First, a quick review of the OASIS standards approval process.  The stages [...]

6 comments Read the full article →

At the Setting of the Sun

January 21, 2010

As my readers have no doubt heard by now, today the EC cleared Oracle’s proposed acquisition of Sun Microsystems.  This will undoubtedly have a significant impact on all Sun employees,  many of whom I have worked with toward common purposes, on standards or open source projects, and whom I am [...]

4 comments Read the full article →