Archives for 2013

Apache OpenOffice 2013 Mailing List Review

2013/12/18 By Rob 1 Comment

I did a quick study of the 2013 mailing list traffic for the Apache OpenOffice project. I looked at all project mailing lists, including native language lists. I omitted the purely transactional mailing lists, the ones that merely echo code check-ins and bug reports. Altogether 14 mailing lists were included in this study.

In 2013 the OpenOffice community mailing lists saw 24,423 posts from 2,211 unique posters, in 4,819 threads.

A word cloud of the most frequent words in post titles (thanks to Jonathan Feinberg’s Wordle app) follows. As you can see, the terms used in the Propose/Approve/Code/Test/Release workflow rise to the top. That shows the project’s focus.

I thought it would also be interesting to look at this from a social network perspective, looking at the atomic unit of collaboration on a mailing list: responding to a post. Of course, not all posts involve a response. It is common for someone to post information, not requiring or expecting a response. But there are many responses. As mentioned above, there were 24,423 posts in 4,819 threads, so an average of about 4 responses per thread. We can represent this as a directed graph, with each poster treated as a node, and a directed arc to each responder node from the node of the original post author. (This might seem backwards, and you could argue for reversing the arcs, but in general in mailing lists the responder is providing value to the original poster, so the centrality of the responder will be more relevant. Consider, for example, the questions coming from random users, and the experienced project members who answer them.)

Forming a graph in this way gives us a giant component (representing 98.84% of the whole graph) with 1,955 nodes and 7,069 arcs. Average degree (number of collaboration partners for each person) is 3.6. Forty-six people responded to more than 50 other people. Maximum degree is 714 (Apache OpenOffice V.P. Andrea Pescetti). A visualization of this graph, using the open source tool Gephi, follows. You can click on the image for a larger version. Nodes have been scaled to reflect betweenness centrality (a measure of the degree to which a node helps connect others into the graph) and colored via a modularity algorithm which finds sets of nodes that have a high degree of interconnection.
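
The graph construction described above can be sketched in a few lines of Python. This is only an illustration, not the script used for the study: the `replies` list, the poster names and the helper functions are made up, and repeated replies between the same two people are assumed to collapse into a single arc.

```python
from collections import defaultdict

# Hypothetical (original_author, responder) pairs, one per reply.
replies = [
    ("alice", "bob"),
    ("alice", "carol"),
    ("dave", "bob"),
    ("bob", "carol"),
    ("alice", "bob"),  # a repeated reply collapses into the same arc
]

def build_graph(pairs):
    """Map each poster to the set of people who responded to them."""
    arcs = defaultdict(set)
    for author, responder in pairs:
        if author != responder:  # ignore replies to one's own post
            arcs[author].add(responder)
            arcs.setdefault(responder, set())  # responders are nodes too
    return arcs

def people_answered(arcs):
    """For each responder, count the distinct people they responded to."""
    counts = defaultdict(int)
    for responders in arcs.values():
        for responder in responders:
            counts[responder] += 1
    return dict(counts)

graph = build_graph(replies)
num_nodes = len(graph)
num_arcs = sum(len(responders) for responders in graph.values())
average_degree = num_arcs / num_nodes  # arcs per node
answered = people_answered(graph)
```

Applied to the figures above, 7,069 arcs over 1,955 nodes gives 7069 / 1955 ≈ 3.6, which matches the average degree quoted.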

What a marvelous, large and complex project we have in Apache OpenOffice!

IBM Support for Apache OpenOffice

2013/11/04 By Rob 2 Comments

As you probably know, IBM has been involved with the OpenOffice.org community for many years. This included collaboration on ODF and accessibility at first, as we worked on our separate Lotus Symphony fork. And then in 2011 we followed the OpenOffice.org community to Apache, where Apache OpenOffice then took off. Since then we’ve been merging in features and bug fixes from Symphony, essentially ending the Symphony fork. The first results of this collaboration showed up in Apache OpenOffice 4.0, with the new side panel UI. The reception of this new release has been phenomenal. The release received great reviews, including a 2013 InfoWorld Best of Open Source (Bossie) award. The success of this release propelled us to recently hit a new download milestone: over 75 million copies of Apache OpenOffice in less than 18 months since the first release of Apache OpenOffice.

The overall market for office productivity suites is changing. Microsoft Office 2003 is hitting End of Life in April 2014, causing companies still using it to explore other options. The introduction of new subscription models from Microsoft, as well as the emergence of new cloud-based editors from several players, including IBM, is also making customers reevaluate their dependency on Microsoft Office. Do we really need Office? For everyone? What are the alternatives?

I’m really pleased to see other parts of IBM starting to see the opportunities available with Apache OpenOffice. Integrations already publicly announced include IBM Connections, IBM SmartCloud, and IBM ECM and Case Manager. (If there are other IBM products that you think would benefit greatly from integration, let me know!)

The latest, and most significant, enabler of enterprise use of Apache OpenOffice is our IBM Support for Apache OpenOffice offering. Although individual end-users and even small businesses can easily deploy Apache OpenOffice on their own (75 million downloads testify to that), larger enterprises with more complicated and demanding needs benefit from the kind of expertise that IBM can provide. So I’m glad to see this offering available to fill out the ecosystem, so everyone can use and be successful with Apache OpenOffice, from individual university students, to small non-profits, to large international corporations.

The Power of Brand and the Power of Product, Part 3

2013/10/21 By Rob

In the previous two parts (one and two) I described a model of product adoption and market share that could be built with a single survey question. I applied this model to the open source productivity suites OpenOffice and LibreOffice, looking at adoption in September 2012 and April 2013.

The results were described in detail in the previous article in this series, but can be summarized as:

OpenOffice	September 2012	April 2013	Change
Customer Awareness	24.3%	27.6%	14% growth
Customer Motivation	63.0%	65.9%	5% growth
Customer Satisfaction	70.6%	68.7%	3% decline
Market Share	10.8%	12.5%	16% growth

Six months have now passed and it is worth taking another look to see how things have evolved. As I did previously, I used Google’s Consumer Survey service which uses sampling and post-stratification weighting to match the target population, which in this case was the US internet population. In other words, the survey is weighted to reflect the population demographics, for age, sex, region of the country, urban versus rural, income, etc. I did this survey in a personal capacity for my own interest. The Standard Disclaimer applies.

OpenOffice (N=1519)	September 2012	April 2013	September 2013	Change (September to September)
Customer Awareness	24.3%	27.6%	30.7%	26% growth
Customer Motivation	63.0%	65.9%	67.4%	7% growth
Customer Satisfaction	70.6%	68.7%	77.8%	10% growth
Market Share	10.8%	12.5%	16.1%	49% growth

So what do we see? Very nice results, indeed. The OpenOffice brand is strong and growing. Over 30% of consumers surveyed had heard of it. Of those who had heard of it, 67% had given it a try. That number has changed little. This is an opportunity for Apache OpenOffice marketing volunteers to improve both of these numbers. Of those who tried OpenOffice, almost 78% continued to use OpenOffice. This is a modest increase, but there is certainly room to improve here. Put it all together, and the estimated user share, the percentage of US internet users who use OpenOffice “sometimes” or “regularly”, is 16.1%, nearly a 50% improvement year-over-year.
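
The “Change” column is a relative change, not a difference in percentage points. A minimal sketch, assuming the column was computed from the rounded percentages shown in the table (the function and variable names are made up for illustration):

```python
def relative_change(old, new):
    """Relative change from old to new, as a percentage of old."""
    return (new - old) / old * 100.0

# Percentages copied from the OpenOffice table above.
september_2012 = {
    "Customer Awareness": 24.3,
    "Customer Motivation": 63.0,
    "Customer Satisfaction": 70.6,
    "Market Share": 10.8,
}
september_2013 = {
    "Customer Awareness": 30.7,
    "Customer Motivation": 67.4,
    "Customer Satisfaction": 77.8,
    "Market Share": 16.1,
}

# Round to whole percent, as the table does.
changes = {
    name: round(relative_change(september_2012[name], september_2013[name]))
    for name in september_2012
}
```

For market share this is (16.1 – 10.8) / 10.8 ≈ 49%, even though the share itself rose by only 5.3 percentage points.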

In any case, to summarize and to illustrate the improvements graphically, I’ve charted the growth in user share over the three surveys, including results for LibreOffice as well:

Visualizing OASIS Technical Committees

2013/07/01 By Rob 1 Comment

So what do we have here? This is a simple social network visualization of OASIS Technical Committees. Each circle in this graph represents a single Technical Committee (TC). The size of the circle is proportionate to how many members are on the committee. The lines between the committees have a weight that is proportionate to the overlap in membership between the TCs. In this case I used Dice’s coefficient as a metric, although any of the several set similarity metrics (Jaccard, etc.) would work here. The color of each node represents the modularity class, a measure of communities or sub-networks within the graph. The resulting graph was then run through Gephi and its Force Atlas layout algorithm, which brings together the TCs that are more closely related by overlapping membership. Click the image for a larger version.
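
For readers unfamiliar with Dice’s coefficient, here is a minimal sketch of how edge weights of this kind can be computed from membership sets. It is not the script used for the graph: the committee names and rosters are invented, and Jaccard is included only for comparison.

```python
from itertools import combinations

def dice_coefficient(a, b):
    """Dice's coefficient: 2 * |A & B| / (|A| + |B|)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))

def jaccard_index(a, b):
    """Jaccard index: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)

# Hypothetical membership rosters, keyed by committee name.
committees = {
    "TC-A": {"ann", "ben", "cho", "dee"},
    "TC-B": {"cho", "dee", "eli"},
    "TC-C": {"fay"},
}

# Node size: number of members on each committee.
node_sizes = {name: len(members) for name, members in committees.items()}

# Edge weight: membership overlap; pairs with no overlap get no edge.
edges = {}
for first, second in combinations(sorted(committees), 2):
    weight = dice_coefficient(committees[first], committees[second])
    if weight > 0:
        edges[(first, second)] = weight
```

Here TC-A and TC-B share two members, so the Dice weight is 2·2 / (4 + 3) = 4/7, while the Jaccard index for the same pair is 2/5. Both grow with the overlap, which is why either would serve for the layout.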

(For those who are interested, the raw data for this is all publicly available, on the OASIS website. Scraping the webpages for the data, calculating the graph and outputting a GEXF format file for Gephi was accomplished in 133 lines of Python.)

Note one important fact: the graph is formed entirely on abstract concepts, the size of each committee and the overlaps in membership. It has no knowledge of what the underlying technologies are, the companies and individuals involved, or of other items of semantic value that could describe the work of the committee. The structure is essentially based on the interests and affiliations of individual committee members. Where there is common interest it is assumed that there is commonality in the work of the TCs.

So how well does this match reality? The image that follows (click for an enlarged version) is the same chart, but with each node labeled by the short name of the TC. As you can see, the above approach does a fine job bringing together related TCs. This occurs both at the fine-grained level, where the DITA TC and the DITA Adoption TC, or the SCA and SCA Assembly TCs are adjacent, and it also applies at the broader level, where we see communities for content-related standards, for privacy/identity standards, legal/emergency, etc.

The Power of Brand and the Power of Product, Part 2

2013/06/12 By Rob

In Part 1 of this series we looked at a model of product adoption and market share that had a special and valuable property: the parameters of the model could be derived from a single survey question, e.g.:

“What is your awareness of the hand cream called Whizzo-Soft?”

A. I have never heard of it.

B. I have heard of it but I have never tried it.

C. I have tried it once.

D. I use it sometimes.

E. I use it regularly.

Given N responses to that survey question, where A through E are the number of respondents choosing each answer, you can derive the factors in the model by simple math:

Customer Awareness = 1 – A / N
Customer Motivation = (C + D + E) / (N – A)
Customer Satisfaction = (D + E) / (N – A – B)
Market Share = Customer Awareness * Customer Motivation * Customer Satisfaction
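
As a sketch, the same formulas in Python. The function name and the response counts are hypothetical, and the code assumes that at least one respondent has heard of the product and at least one has tried it, so that no denominator is zero.

```python
def adoption_model(a, b, c, d, e):
    """Derive the model factors from the counts of answers A through E."""
    n = a + b + c + d + e
    aware = n - a        # everyone who has heard of the product
    tried = n - a - b    # everyone who has tried it (C + D + E)
    awareness = 1 - a / n
    motivation = (c + d + e) / aware
    satisfaction = (d + e) / tried
    return {
        "awareness": awareness,
        "motivation": motivation,
        "satisfaction": satisfaction,
        "market_share": awareness * motivation * satisfaction,
    }

# Hypothetical counts for the Whizzo-Soft question, 1,000 responses in total.
result = adoption_model(a=500, b=200, c=100, d=120, e=80)
```

Note that the product of the three factors telescopes to (D + E) / N, the fraction of all respondents who use the product sometimes or regularly. With the counts above that is 200 / 1000 = 20%.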

So let’s take a look at how this can be used in practice, taking the leading open source office productivity editor, OpenOffice, and the lesser known LibreOffice fork, as examples.

As mentioned in Part 1, the execution of the survey is critical here. Without a proper, random survey of the market, the results will not be accurate. In particular a survey of your current users will not work, since one of your goals is to find out what proportion of users are not familiar with your product.

So in this case I used Google’s new Consumer Survey service which uses sampling and post-stratification weighting to match the target population, which in this case was the US internet population. In other words, the survey is weighted to reflect the population demographics, for age, sex, region of the country, urban versus rural, income, etc. I did this survey in a personal capacity for my own interest. The Standard Disclaimer applies.
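
To illustrate the general idea of post-stratification weighting, here is a generic sketch. It is not a description of Google’s actual procedure; the strata, the population shares and the responses are all invented. Each respondent is weighted by the ratio of their stratum’s share of the target population to its share of the sample.

```python
from collections import Counter

def poststratified_share(respondents, population_shares):
    """Weighted share of 'yes' answers.

    respondents: list of (stratum, answered_yes) pairs.
    population_shares: stratum -> share of the target population.
    """
    n = len(respondents)
    sample_counts = Counter(stratum for stratum, _ in respondents)
    # Weight = population share / sample share, per stratum.
    weights = {
        stratum: population_shares[stratum] / (count / n)
        for stratum, count in sample_counts.items()
    }
    weighted_yes = sum(weights[s] for s, yes in respondents if yes)
    weighted_total = sum(weights[s] for s, _ in respondents)
    return weighted_yes / weighted_total

# Hypothetical sample that over-represents the younger stratum.
respondents = (
    [("18-34", True)] * 3 + [("18-34", False)] * 3
    + [("35+", True)] * 1 + [("35+", False)] * 3
)
population_shares = {"18-34": 0.4, "35+": 0.6}

raw_share = sum(1 for _, yes in respondents if yes) / len(respondents)
weighted_share = poststratified_share(respondents, population_shares)
```

In this made-up sample the raw share is 40%, but after weighting it drops to 0.4 × 50% + 0.6 × 25% = 35%, because the over-sampled younger group answered “yes” more often.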

The survey question (and responses) were:

What is your familiarity with the software application called “OpenOffice”?

I have never heard of it
I am aware of it but have never used it
I have tried it once
I use it only sometimes
I use it on a regular basis

With 1502 responses, the results were:

I have never heard of it	72.4%
I am aware of it but have never used it	9.3%
I have tried it once	5.7%
I use it only sometimes	5.9%
I use it on a regular basis	6.6%

And then with some simple arithmetic we have:

Customer Awareness	27.6%
Customer Motivation	65.9%
Customer Satisfaction	68.7%
Market Share	12.5%

What does that mean? In plain English:

Around 1/4 of US internet users have heard of the OpenOffice software application. That is the brand recognition.
Of those who have heard of OpenOffice, around 2/3 of them were sufficiently motivated to try the software.
And of those who tried OpenOffice 69% were sufficiently satisfied with the software that they continue to use it.
Overall, 1/8 of the surveyed population uses OpenOffice sometimes or regularly.

The absolute numbers are tricky to interpret in isolation. More interesting is to look at the numbers over time. The same survey question, with the same methodology was also given last September. The results and the change are in the following table, with changes having statistical significance (90% confidence level) emphasized in bold.

OpenOffice	September 2012	April 2013	Change
Customer Awareness	24.3%	27.6%	14% growth
Customer Motivation	63.0%	65.9%	5% growth
Customer Satisfaction	70.6%	68.7%	3% decline
Market Share	10.8%	12.5%	16% growth

The Apache OpenOffice project should be gratified that their efforts have paid off, and awareness of the product is increasing, as well as market share. This goes contrary to some loudly expressed concerns that the OpenOffice brand would languish at Apache. Clearly this is not so. The brand is growing, as well as the market share.

Since these factors are multiplicative, an increase in any one of them, or any combination of them, will grow the market share. But it is probably easiest to grow the factor that is smallest today. So looking to the future, increasing the awareness of the existence of OpenOffice would give the “biggest bang for the buck”.

For an entirely different view we can look at the same survey question and methodology, administered at the same times, only substituting the product name “LibreOffice” for “OpenOffice”. Again, statistically significant changes are shown in bold.

LibreOffice	September 2012	April 2013	Change
Customer Awareness	10.7%	9.9%	7% decline
Customer Motivation	53.3%	66.7%	27% growth
Customer Satisfaction	73.7%	59.7%	19% decline
Market Share	4.2%	4.0%	5% decline

The brand recognition is not growing and is stuck at 10%. The fact that in its third year of product availability the LibreOffice brand recognition has plateaued (if not declined) should be a concern.

But the more interesting thing here is the large increase in users trying LibreOffice (Motivation) offset by the large decrease in users who continue to use the product (Satisfaction). What does this mean? Only the LibreOffice folks can say for certain, but this pattern is exactly what one would expect from a product where marketing has got ahead of quality. It is like a movie that previews well, but suffers from bad reviews and poor sales after the first weekend. Product development aims to make products that users want. And marketing persuades users to try the product. But where there is a disconnect between the two, where the product is not fulfilling the needs of those to whom it is being marketed, or (the same thing really) the product is being marketed to unsuitable users, this is what you see.

I should note that LibreOffice supporters like to blame their lack of success on not having the OpenOffice brand. Yes, having a familiar brand is a nice thing to have, but the drop in Satisfaction for those trying LibreOffice is not a brand issue, since it is entirely among those who are already familiar with the LibreOffice brand. Satisfaction is an attribute of the product, not due to brand.

Also, we can compare the metrics across products. When we look at the most recent data OpenOffice clearly has an enormous lead in name recognition and market share, but also a large lead in Satisfaction. 69% of those who tried OpenOffice remained users, compared to 60% for those who attempted to use LibreOffice. Keep your users satisfied and it is hard to go wrong.

Finally, and to reiterate what I wrote earlier in my Scarcity Fallacy post, when you consider the position of Microsoft Office in this market, both products have a relatively small presence, with ample room to grow, at Microsoft’s expense. This is a great area to advance the cause of open source software, in a product category that almost every user needs. There is no shortage of opportunity here, only a shortage of imagination. Imagine if we combined the stability/quality and brand recognition of Apache OpenOffice with the enthusiastic marketing team of LibreOffice. (Combine our 50 million downloads with their 50 million press releases.) What if we combined the disciplined development approach of OpenOffice with LibreOffice’s talented developers? Imagine what we could do!

Let’s admit it. LibreOffice has plateaued. They have their Linux desktop users, all 3% of the market that runs Linux on the desktop. This market share was not earned. These are not users that they won over. These are users they got via the control their corporate sponsors have over Linux distributions. They flipped a bit and instantly had that market share. But their sponsors are Linux vendors that have little motivation to reach beyond that niche market. (They certainly have little success doing so.) The opportunity for growth is not on the Linux desktop, unless the goal is merely to be a small fish in an even smaller pond. Of course, LibreOffice could continue, and languish indefinitely as a pet project of a handful of Linux developers. Or they could work with us at Apache, and satisfy the Linux users, but do so much more as well. This would also be a cost savings for LibreOffice’s corporate sponsors, no small factor in a world of declining PC sales. The choice now, as it always has been, is theirs.