(This post represents my personal opinion only. The standard disclaimer applies.)
Part II is here and Part III is here.
I’ve recently read some implausible claims from the LibreOffice project concerning their stats for downloads and users. (These two different statistics are unfortunately conflated in their publicity campaigns, but more about that later.) Their claims fall apart if given any scrutiny and placed against comparable numbers from Apache OpenOffice. I think you’ll agree by the time you are done reading my analysis.
If this were merely yet another case of puffery from the LibreOffice marketing department then I might just let it go, as I have with many other similar claims in the past couple of years. But to the extent that some people seem to take these claims as facts, and are repeating them, I hope I will be forgiven for giving truth a chance to be heard. I’ll lay out the numbers as I know them and let you be the judge.
First, what do we have on the Apache OpenOffice side? Most of our downloads are from our download site hosted by SourceForge. The download stats are public and exposed by SourceForge via their REST API. We gather these stats with a Python script (also public here) and that data is saved to a data file, which is then plotted on our website. So everything is open and transparent here. The downloads are counted by a respected 3rd party and the entire processing of these numbers is open for inspection. It is all there, day-to-day, including breakdown by country and operating system. We have nothing to hide.
The LibreOffice numbers, on the other hand, we only know from download claims in press releases, and then only at long intervals. We have no idea what exactly they are counting. They have never made the detailed stats public. This does not mean that the numbers are incorrect of course. It just means that no one outside of their project’s leadership is able to verify the claims.
But taking them for what they are worth, let’s look at the recent LibreOffice claims and compare them to the actual data posted by Apache.
- On Sept 27th, LibreOffice claimed “Downloads since January 25, 2011, the date of the first stable release, have just exceeded 18 million”.
- On that same day, OpenOffice had accumulated 18,207,610 downloads via SourceForge. (Per the posted data file, which you can verify against SourceForge’s Stats API if you wish.)
So both projects are doing equally well, yes?
Well, no, not at all. You need to take the time interval into consideration. The LibreOffice counts were from January, 2011. The OpenOffice counts were from May, 2012. So in less than five months OpenOffice was downloaded as many times as LibreOffice was in its first 20 months.
If we convert to an average daily download rate we see:
- LibreOffice: 18,000,000/611 days = 29,460/day
- Apache OpenOffice: 18,207,610/143 days = 127,326/day
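The arithmetic behind these two rates is easy to reproduce. Here is a minimal Python sketch that uses only the totals and day counts quoted above. The function name `daily_rate` is mine, for illustration only, and the year of the Sept 27th claim (2012) is inferred from the 611-day interval rather than quoted from the press release.

```python
from datetime import date

def daily_rate(total_downloads, days):
    """Average downloads per day over an interval of `days` days."""
    return total_downloads / days

# LibreOffice: from the first stable release to the date of the claim.
# Assumption: the claim date is Sept 27, 2012 (year inferred from 611 days).
lo_days = (date(2012, 9, 27) - date(2011, 1, 25)).days

lo_rate = daily_rate(18_000_000, lo_days)
aoo_rate = daily_rate(18_207_610, 143)  # 143 days, as stated in the list above

print(f"LibreOffice: {lo_rate:,.0f}/day")
print(f"Apache OpenOffice: {aoo_rate:,.0f}/day")
print(f"ratio: {aoo_rate / lo_rate:.1f}x")
```

Run as written, this gives the same rounded daily figures as the list and a ratio a little above 4.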
So the download rate has been 4x greater for Apache OpenOffice, and shows no sign of slowing.
A chart might make this clearer, showing the actual OpenOffice download figures (that is why the line is a little wavy) and the claimed LibreOffice trend. (Anyone want to guess what this chart will look like six months from now?)
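For anyone who wants to redraw the comparison: the LibreOffice line can only be a straight line, since a start date and a single claimed total are all we have. A rough Python sketch of that interpolation follows. `claimed_trend` is a hypothetical helper name, and the constant-rate assumption is mine, forced by the lack of published daily data.

```python
from datetime import date

START = date(2011, 1, 25)       # first stable LibreOffice release
CLAIM_DATE = date(2012, 9, 27)  # date of the "18 million" claim (year inferred)
CLAIMED_TOTAL = 18_000_000

def claimed_trend(day):
    """Cumulative downloads on `day`, assuming a constant daily rate
    between START and CLAIM_DATE (an assumption, not published data)."""
    elapsed = (day - START).days
    span = (CLAIM_DATE - START).days
    return CLAIMED_TOTAL * elapsed / span

# Example: where the straight-line trend stood on an arbitrary date.
print(round(claimed_trend(date(2012, 5, 1))))
```

The OpenOffice line needs no such interpolation, because the daily counts are in the public data file.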
Downloads versus Users
It is important when looking at download numbers that one does not equate download counts with user counts. This is especially true when you are dealing with upgrade cycles. As you probably know, neither OpenOffice nor LibreOffice has an incremental update facility. If you want to update, say from Apache OpenOffice 3.4.0 to 3.4.1, then you need to download a complete copy of 3.4.1 and install it over your 3.4.0.
This complicates things. Upgrades tend to inflate the download counts, since an upgraded user is counted twice: once for their original download and a second time for their upgrade. This makes estimating the number of users from the number of downloads tricky. So to be fair, when estimating the number of Apache OpenOffice users we must not neglect the fact that our minor maintenance release caused a second download for every user who upgraded.
But if this is an impact on OpenOffice, which had only two releases to reach 18,207,610 downloads, then how much greater must be the impact for LibreOffice? For example, their 3.5.x series had 7 releases to fix critical bugs. Their download counts included downloads from 3.3.x, 3.4.x and 3.6.x series as well, each one with its own set of bug fix releases. One must assume, due to the long duration of this reporting interval (nearly two years) and the instability of the early releases within each series, that LibreOffice users have upgraded numerous times each, causing numerous duplicate download counts, and leading the aggregate download count to reflect several times the number of actual users.
In other words, having a rapid release cycle with no incremental update facility will juice your download numbers since each real user will end up downloading many copies of your product. Since LibreOffice had a dozen or more releases, and OpenOffice only two, it is logical to conclude that the LibreOffice user numbers are far less than suggested by their download numbers, perhaps lower by a factor of 4 or 5.
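To make the adjustment concrete, here is a back-of-the-envelope Python sketch. The downloads-per-user factors are assumptions for illustration (the 4 and 5 are my rough guesses from above, not measured values), and `estimated_users` is a hypothetical helper.

```python
def estimated_users(total_downloads, downloads_per_user):
    """Rough user count if each user downloaded `downloads_per_user` full copies."""
    if downloads_per_user < 1:
        raise ValueError("each user accounts for at least one download")
    return total_downloads / downloads_per_user

# Apply the guessed factors to the claimed 18 million LibreOffice downloads.
for factor in (4, 5):
    users = estimated_users(18_000_000, factor)
    print(f"{factor} downloads per user -> about {users:,.0f} users")
```

The same function applies to OpenOffice, where the factor lies somewhere between 1 and 2 since there have been only two releases.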
Objection: External Sites
I anticipate several objections against the above analysis, so let’s treat those one by one.
First, one might note that LibreOffice has claimed an additional two million downloads from “external sites offering the same package”. Since these claims are not backed with names or numbers, I cannot say much other than the fact that OpenOffice is downloaded from external websites as well. But we don’t count those in our main download counts. But suppose we wanted to, and wanted to do an apples-to-apples comparison with LibreOffice, with numbers from a neutral 3rd party source?
Let’s take Download.com, CNET’s software download repository, one of the most popular download sites around, as an example. Here are the download numbers for the 3-month period from 7/28/12 through 10/28/12:
- OpenOffice: 328,846 downloads
- LibreOffice: 18,008 downloads
In this case the OpenOffice download numbers are greater by a factor of 18x.
So I don’t think the external download sites change things much. The numbers are small overall, but per day the OpenOffice numbers are far higher than LibreOffice’s.
Objection: Linux users
On top of the 20 million users LibreOffice claims on Windows and Mac, they also stick a finger in the air and decide they have 30 million Linux users as well. This leads to extravagant claims like, “As of today, LibreOffice is being used by close to 60 million people”. They don’t detail how they arrive at this number, but it appears to be the culmination of a series of implausible assumptions:
- Take the highest of the several estimates for the number of Linux desktops
- Assume that everyone is using their Linux desktop for document editing
- Conveniently ignore AbiWord, KOffice, Gnumeric, Calligra, Google Docs or even MS Office under Wine, and assume that everyone on Linux uses LibreOffice.
- Ignore the many Linux users who are displeased with LibreOffice and who have uninstalled it and replaced it with OpenOffice instead.
They make these assumptions and then claim another 30 million LibreOffice users on top of their inflated claim of 20 million Windows/Mac users.
But this really misses the point. The trajectory is what matters. In a long race you bet on the faster horse, not the one who has a small head start. You can have 100% of the 3% Linux desktop market and even under the rosiest assumptions that is only 3%. And that number is decreasing, as desktop users move to tablets, where Android is the player and the Apache License is preferred for userspace code. And I doubt Google will prefer LibreOffice in this space over their own recent QuickOffice acquisition, which already has an app supporting Android (and iOS).
Another point is that one should not equate users who intentionally download and install a product with users who have it automatically installed as part of an OS, without their knowledge. These are not the same thing, and to treat them as such is to confuse a downhill skier with someone who fell down a snowy hill. The one does something intentionally; the other has something done to them.
That is not to say that Linux users are not important. We certainly treat Linux as a first-class platform within the project. I’d like to see us do the packaging work necessary to make Apache OpenOffice available to users on Linux, via their distros. Users should have choice, even on Linux. If you’re interested in helping with this, send me an email.
Objection: All numbers are incomparable
Another objection is to say that all projects live in a different context, with a different user base and that the numbers can never be compared against each other in a fair way. All is relative, subjective, and LibreOffice is justified in making any claim it feels like, since it is its own reference and base of comparison.
There are several counters to this objection. First, when LibreOffice publishes numbers, in press releases and blog posts, it has an obligation not to deceive its readers. This is basic professional ethics. When you claim a certain number of users, there should be some solid basis for making that claim, not merely the absence of contradictory information. In any case, I’ve provided adequate contradictory data in this post.
Another counter is to point out that some comparisons are closer to apples-to-apples than others. For example, the number of Windows downloads directly from a project’s website. Or downloads from a neutral 3rd party website like CNET’s Download.com. Of course we can debate the fine details and nuances to the right of the decimal place. But that does not provide an excuse for conflating download numbers with user numbers in a press release. You may not know everything, but you should know that this is not right.
Apache OpenOffice makes available detailed download statistics in near-realtime for inspection. LibreOffice makes download claims in press releases at wide intervals with no supporting data.
Consider an apples-to-apples comparison of Windows and Mac users, which together constitute 97% of the desktop market. Although Apache OpenOffice took a while to make its first release, 3.4.0, it has taken off like a rocket, has eliminated any head-start advantage LibreOffice had, and is racing ahead with 4x the daily downloads that LibreOffice is reporting. And since the LibreOffice numbers are inflated by duplicate counting of upgrade downloads, OpenOffice is probably already ahead of LibreOffice in users on these platforms by a factor of 10 or more.
Under a series of implausible assumptions, LibreOffice claims an additional 30 million users on Linux. The actual number is unknown, but likely far less. But since Linux desktops are only 3% of the desktop market, and that market is shrinking, this is not a realistic growth opportunity for LibreOffice.