{"id":2117,"date":"2012-11-04T15:43:40","date_gmt":"2012-11-04T20:43:40","guid":{"rendered":"http:\/\/2d823b65bb.nxcli.io\/?p=2117"},"modified":"2013-12-16T17:09:55","modified_gmt":"2013-12-16T22:09:55","slug":"libreoffices-dubious-claims-part-3-developers","status":"publish","type":"post","link":"https:\/\/www.robweir.com\/blog\/2012\/11\/libreoffices-dubious-claims-part-3-developers.html","title":{"rendered":"LibreOffice\u2019s Dubious Claims: Part 3, Developers"},"content":{"rendered":"<p>(This post represents my personal opinion only.\u00a0 The <a href=\"https:\/\/2d823b65bb.nxcli.io\/blog\/who-is-rob-weir\">standard disclaimer<\/a> applies.)<\/p>\n<p>In previous posts I looked at claims made by LibreOffice, in project blog posts and press releases, related to the <a href=\"https:\/\/2d823b65bb.nxcli.io\/blog\/2012\/10\/libreoffices-dubious-claims-part-i-downloads.html\">number of LibreOffice users<\/a> and the <a href=\"https:\/\/2d823b65bb.nxcli.io\/blog\/2012\/11\/libreoffices-dubious-claims-part-2-community-size.html\">number of active LibreOffice contributors<\/a>.\u00a0 I showed that in both cases the claims from LibreOffice were greatly inflated due to various flaws.\u00a0 For example, they double counted users who upgraded from earlier releases of LibreOffice, often several times over.\u00a0 And they counted as &#8220;active contributors&#8221; those who registered for a wiki account but never actually contributed anything.\u00a0 In this blog post we&#8217;ll look at the even more egregious ways in which the LibreOffice project is overstating the number of developers that are active with the project.<\/p>\n<h3>A Quick Quiz<\/h3>\n<p>To prepare your frame of mind for what you are about to learn, I encourage you to first take the following quiz.<\/p>\n<blockquote><p>When asked to report on the population of your home town, what would you report?<\/p>\n<p>A.\u00a0 The number of people with primary residences in the town.<\/p>\n<p>B. 
The number of people who have ever lived in the town, even if they no longer live there.<\/p>\n<p>C. The number of people who drive through the town on their way to somewhere else.<\/p>\n<p>D.\u00a0 All of the above.<\/p><\/blockquote>\n<p>If you picked D, you would be an excellent candidate for the LibreOffice marketing department.<\/p>\n<p>With that mental preparation out of the way, let&#8217;s continue.<\/p>\n<h3>The Claims<\/h3>\n<ul>\n<li>From a<a href=\"http:\/\/blog.documentfoundation.org\/2012\/11\/01\/the-document-foundation-announces-libreoffice-3-6-3\/\"> recent LibreOffice announcement<\/a>:\u00a0 &#8220;growing developer base, which has just reached the number of 550 since the launch of the project, making LibreOffice one of the fastest growing free software projects of the decade.&#8221;<\/li>\n<li>Or a<a href=\"http:\/\/blog.documentfoundation.org\/2012\/09\/27\/the-document-foundation-celebrates-its-second-anniversary-and-starts-fundraising-campaign-to-reach-the-next-stage\/\"> couple of weeks ago<\/a>: &#8220;LibreOffice is the result of the combined activity of 540 contributors&#8221;<\/li>\n<li>Quoted on<a href=\"http:\/\/www.linux.com\/news\/software\/applications\/660608-libreoffice-a-continuing-tale-of-foss-success\"> Linux.com<\/a>:\u00a0 &#8220;our large developer base &#8212; over 540 people at the end of September 2012 &#8212; is an incredibly efficient self-governing machine&#8221;<\/li>\n<\/ul>\n<p>You can find many variations on this same claim.<\/p>\n<h3>All Your Developer Are Belong to Me<\/h3>\n<p>With a number this large, it should not be hard to find these 550 developers.\u00a0 So let&#8217;s see if we can track this down.\u00a0\u00a0 One place to start is to look at the <a href=\"https:\/\/www.libreoffice.org\/about-us\/credits\/\">LibreOffice credits page<\/a>.\u00a0 We see there a large table of &#8220;Developers committing code since 2010-09-28&#8221;.\u00a0 If we count the names in this table we get 469.\u00a0 Not quite 550, 
but pretty close, yes?<\/p>\n<p>But if you look closer at the names in the list, you begin to scratch your head.\u00a0 There are names here of former Sun\/Oracle developers who lost their jobs when Oracle stopped developing the project.\u00a0 Some commentators, like Mark Shuttleworth, put much of the blame for Oracle divesting from OpenOffice on the &#8220;<a href=\"http:\/\/pages.citebite.com\/e7v0f3m9sder\">radical faction<\/a>&#8221; that forked to create LibreOffice.\u00a0\u00a0 Aside from costing them their jobs, LibreOffice now insults them by using their names for propaganda purposes to puff up LibreOffice&#8217;s developer claims?!<\/p>\n<p>Looking further, I see the names of IBM colleagues who have never participated in the LibreOffice project.\u00a0 They are active developers on Apache OpenOffice, and former OpenOffice.org developers, but here they are listed in a table of &#8220;Developers committing code&#8221;.\u00a0 How curious the ways of LibreLand!<\/p>\n<p>If you scroll down to the bottom of the table you get a clue in the fine print:\u00a0 &#8220;We can not distinguish between commits that were imported from the OOo\/AOO code base and those who went directly into the LibreOffice code base.&#8221;<\/p>\n<p>Hmmm&#8230; so let me get this right.\u00a0\u00a0 If you take my code, you say that I committed it to the LibreOffice project.\u00a0\u00a0 And if I contributed code to OpenOffice.org or Apache OpenOffice, and you take it, you&#8217;ll list me in your LibreOffice developers table for a contribution I never made to LibreOffice and put a &#8220;joined&#8221; date next to my name for an organization I never joined.\u00a0\u00a0 Really?<\/p>\n<p>This is an odd way of accounting for developers.\u00a0\u00a0\u00a0 I&#8217;m pretty sure that 100% of readers of LibreOffice press releases and 100% of journalists who write articles based on LibreOffice claims would feel somewhat abused by such idiosyncratic definitions.\u00a0\u00a0 It is certainly 
not the most honest and forthright way of stating how many developers LibreOffice has.\u00a0\u00a0 One does not expect that OpenOffice.org developers, who were never involved with the LibreOffice project, and may not even know that their code is being used, will be included in the count.<\/p>\n<h3>Monotonically Increasing<\/h3>\n<p>Aside from counting people who are not actually involved in LibreOffice and never were, the LibreOffice claims are peculiar because of the low threshold for inclusion and the perpetuity of inclusion once added.\u00a0 When you hear claims of a &#8220;developer base&#8221; you are led to think of a body of actual developers actually working at present on the code.\u00a0 That would be the normal usage.\u00a0 But in LibreLand it is not done that way.\u00a0 If you made a single contribution ever (or as we know now from the above, even never) then you are in the &#8220;developer base&#8221; and will be listed as a LibreOffice developer for all time.<\/p>\n<p>From the perspective of gratitude and acknowledgement, giving credit is fair and generous.\u00a0 Apache OpenOffice also has a long list of names on its <a href=\"http:\/\/www.openoffice.org\/welcome\/credits.html\">Credits page.\u00a0<\/a>\u00a0 But we don&#8217;t tally this retrospective list of past contributors and claim that number as an active community size.\u00a0 From the perspective of claiming a community size, this would be deceptive.\u00a0 That is like calculating the population of a town by listing everyone who ever lived there.<\/p>\n<p>Because of this odd practice, the LibreOffice developer count will never decrease.\u00a0 It can only go up.\u00a0 Even in the worst case, if an asteroid hits their next hackfest, the numbers would merely be flat.\u00a0 (So would the developers present.)\u00a0 In any case, if you&#8217;ve designed a metric that can never decrease, then it should not be newsworthy for you to report that it is increasing.\u00a0 This is not an 
accomplishment.\u00a0 That is just mathematics.<\/p>\n<h3>How to Juice the Developer Count<\/h3>\n<p>An easy way to increase, for reporting purposes, the number of &#8220;developers&#8221; a project can claim is to encourage trivial churning of the code base.\u00a0 For example, translating comments from German to English, removing dead code and other similar tasks can be done without even really knowing C++, or at least not knowing it well.\u00a0 But it can prompt the temporary or even one-time participation of many &#8220;developers&#8221;, and in the process increase your developer count.\u00a0 LibreOffice made a tremendous effort to enable a low threshold for contributions and this effort paid off, at least in developer counts.<\/p>\n<p>As an example of the impact such practices can have, I took a look at the &#8220;core&#8221; git repository for LibreOffice, and all of the commits since 2010-09-28.\u00a0 After identifying and collapsing multiple email addresses used by some persons, I ended up with 518 names.\u00a0 Of those names, 166, or about 1\/3 of them, have made only a single commit, and then were never heard from again.\u00a0 So it is curious to count them as part of LibreOffice&#8217;s vaunted &#8220;developer base&#8221;.\u00a0 A community is not made up of those who contribute once and then leave.<\/p>\n<p>In fact, once you take out those who never participated in LibreOffice but had their code taken from OpenOffice, you find that almost no one in this &#8220;developer base&#8221; actually does anything. 
For example, 261 of the &#8220;developers&#8221; (over 1\/2 of all of the claimed developers) together did only 1% of total commits.\u00a0 So there is a long tail of inactive &#8220;developers&#8221; who are puffing up the LibreOffice claims.<\/p>\n<p>This is a little easier to see with reference to the following chart, which shows the cumulative number of code commits (y-axis) against the cumulative number of developers (x-axis).\u00a0 It shows, for example, that 10% of the developers, mainly Novell\/SUSE and Red Hat employees, were responsible for nearly 90% of all of the work.\u00a0 It also backs up my observation that the vast majority of the claimed &#8220;large developer base &#8212; over 540 people&#8221; and the &#8220;incredibly efficient self-governing machine&#8221; makes an overall minuscule contribution.\u00a0 There is nothing wrong with this graph per se.\u00a0 Many projects will show some form of this.\u00a0 But if your primary claim to your project&#8217;s success is having an independent developer community of 550 people, it is a bit embarrassing that most of them are not actually active, and that many of them never were.<\/p>\n<p style=\"text-align: center;\"><a href=\"https:\/\/2d823b65bb.nxcli.io\/blog\/wp-content\/uploads\/2012\/11\/lo-chartjpg.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2119\" style=\"border: 1px solid black;\" title=\"lo-chart,jpg\" alt=\"\" src=\"https:\/\/2d823b65bb.nxcli.io\/blog\/wp-content\/uploads\/2012\/11\/lo-chartjpg.jpg\" width=\"754\" height=\"611\" srcset=\"https:\/\/www.robweir.com\/blog\/wp-content\/uploads\/2012\/11\/lo-chartjpg.jpg 754w, https:\/\/www.robweir.com\/blog\/wp-content\/uploads\/2012\/11\/lo-chartjpg-300x243.jpg 300w\" sizes=\"auto, (max-width: 754px) 100vw, 754px\" \/><\/a><\/p>\n<h3>Toward a Better Metric<\/h3>\n<p>Part of the confusion here seems to stem from the desire to illustrate two things with one metric:\u00a0 project 
capabilities and project diversity.\u00a0 That is asking too much for one metric.\u00a0 If you want to look at the capabilities of the project, and do it from the input side, then you need to deal with normalizing differences in skills, experience, time on task, motivation, etc.\u00a0 This is very difficult in a project where you have a mix of full-time Novell\/SUSE employees mixed in with part time and occasional developers.\u00a0 But of all available options, a raw count of developers is the worst possible metric to pick.\u00a0 It is meaningless.\u00a0 Better would be to look at commits, or better, line counts, or even better, function points, or hours on task, or features, or some measure of output.\u00a0\u00a0 No one cares what your input is.\u00a0 A feature developed by 3000 is not necessarily better than a feature developed by 3.\u00a0 Results are what count.<\/p>\n<p>From the diversity standpoint, adding hundreds of names who do nothing is not a way to increase diversity.\u00a0 There are standard metrics for measuring diversity, inspired by Shannon&#8217;s definition of information entropy and commonly used in ecological species surveys.\u00a0\u00a0 The Shannon Equitability Index is a scaled value, from 0 to 1, that measures diversity.\u00a0 A value of 0 would indicate no diversity, that one person did all the work and the other names had zero contribution.\u00a0 A value of 1 would indicate that the work was evenly done.\u00a0 In the chart above a value of 1 would have a line at 45-degrees up from lower left to upper right.\u00a0\u00a0 If you calculate the Shannon Equitability Index for LibreOffice for all commits since 2010-09-28 you get a value of 0.6413.\u00a0 It would be interesting to see how this value evolves over time.\u00a0\u00a0 Oh, and if you calculate this index for Apache OpenOffice, the value is 0.7268, which is even better, more diverse.<\/p>\n<p>(This post represents my personal opinion only.\u00a0 The <a 
href=\"https:\/\/2d823b65bb.nxcli.io\/blog\/who-is-rob-weir\">standard disclaimer<\/a> applies.)<\/p>\n","protected":false},"excerpt":{"rendered":"<p>(This post represents my personal opinion only.\u00a0 The standard disclaimer applies.) In previous posts I looked at claims made by LibreOffice, in project blog posts and press releases, related to the number of LibreOffice users and the number of active LibreOffice contributors.\u00a0 I showed that in both cases the claims from LibreOffice were greatly inflated [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_genesis_hide_title":false,"_genesis_hide_breadcrumbs":false,"_genesis_hide_singular_image":false,"_genesis_hide_footer_widgets":false,"_genesis_custom_body_class":"","_genesis_custom_post_class":"","_genesis_layout":"","footnotes":""},"categories":[48,22],"tags":[],"class_list":{"0":"post-2117","1":"post","2":"type-post","3":"status-publish","4":"format-standard","6":"category-fud","7":"category-openoffice","8":"entry"},"_links":{"self":[{"href":"https:\/\/www.robweir.com\/blog\/wp-json\/wp\/v2\/posts\/2117","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.robweir.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.robweir.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.robweir.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.robweir.com\/blog\/wp-json\/wp\/v2\/comments?post=2117"}],"version-history":[{"count":16,"href":"https:\/\/www.robweir.com\/blog\/wp-json\/wp\/v2\/posts\/2117\/revisions"}],"predecessor-version":[{"id":2131,"href":"https:\/\/www.robweir.com\/blog\/wp-json\/wp\/v2\/posts\/2117\/revisions\/2131"}],"wp:attachment":[{"href":"https:\/\/www.robweir.com\/blog\/wp-json\/wp\/v2\/media?parent=2117"}],"wp:term":[{"taxonomy":"category","embeddable":true,
"href":"https:\/\/www.robweir.com\/blog\/wp-json\/wp\/v2\/categories?post=2117"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.robweir.com\/blog\/wp-json\/wp\/v2\/tags?post=2117"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}