Apache

Perspectives on Apache OpenOffice 3.4 download numbers

2012/06/22 By Rob 1 Comment

You may have read, on the Apache OpenOffice blog, news that the project has had 5 million downloads in the first 6 weeks since the release of version 3.4. And as the above chart shows, the download rate has increased in the past two weeks, as we’ve started to roll out the upgrade notifications to OpenOffice.org 3.3 users.

When I mention the “5 million” achievement, the reaction is generally along the lines of, “That’s excellent !!! Right? That is good, isn’t it?” The fact is the number is large, but without comparison or context, it is hard to gauge. I think I can provide some comparisons and context to put these numbers in perspective.

First, let’s look at OpenOffice.org 3, and their famous claim of 100 million downloads. That was in the time period from October 13, 2008 to October 28, 2009, so 380 days. That averages out to around 260K downloads/day. I’m not quite sure what they counted as a “download”, whether just full installers, or language packs as well. And that time period overlaps with several releases (3.0.0, 3.0.1, 3.1.0 and 3.1.1), so there is some double, triple and quadruple counting of users due to upgrades.

For another comparison, let’s take a look at LibreOffice. They claimed 7.5 million downloads between January 2011 and October 2011. That averages out to 27K downloads/day. And again, it is not clear if that counts all downloads, including multiple downloads by the same user as they update from release to release.

So how does Apache OpenOffice 3.4 compare? Let’s state the numbers as conservatively as we can. We’ll count only installer downloads, not language packs or SDKs. And we’re counting only for a single release, AOO 3.4, so there is no double counting due to upgrades. Based on these assumptions, our average download rate has been 118K downloads/day. (But as the chart shows, since we enabled the update notifications, the rate is now more like 170K downloads/day.)
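For anyone who wants to check the arithmetic, here is a minimal sketch that recomputes the three daily averages from the rounded totals quoted above. The OpenOffice.org 3 dates come from the post. The 275-day LibreOffice window and the 42-day AOO window are my own assumptions, since only "January to October" and "6 weeks" are stated, and the class and method names are purely illustrative.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

public class DownloadRates {

    // OpenOffice.org 3 counting window, using the dates given in the post.
    public static final long OOO_DAYS = ChronoUnit.DAYS.between(
            LocalDate.of(2008, 10, 13), LocalDate.of(2009, 10, 28));

    // Assumption: "January 2011 to October 2011" taken as roughly nine months.
    public static final long LO_DAYS = 275;

    // Assumption: "the first 6 weeks" taken as exactly 42 days.
    public static final long AOO_DAYS = 6 * 7;

    /** Average downloads per day, rounded to the nearest whole download. */
    public static long perDay(long downloads, long days) {
        return Math.round((double) downloads / days);
    }

    public static void main(String[] args) {
        System.out.println("OOo 3:       " + perDay(100_000_000L, OOO_DAYS) + " per day");
        System.out.println("LibreOffice: " + perDay(7_500_000L, LO_DAYS) + " per day");
        System.out.println("AOO 3.4:     " + perDay(5_000_000L, AOO_DAYS) + " per day");
    }
}
```

With these inputs the averages come out at roughly 263K, 27K and 119K downloads/day. The last is slightly above the 118K quoted above, presumably because the 5 million total and the six-week window are both rounded.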

So overall I think we’re doing quite well. There is room for improvement, but it is a good baseline against which we can show progress. One thing we can do to grow these numbers is to increase the native language support, to restore some of the key translations. If you are interested in volunteering with Apache OpenOffice, you should read this page, and then send an email to our mailing list to introduce yourself and your interest.

A quick side note: Some readers will observe that Linux users get their software from the distros, not from downloading from a website. This is a safe assumption for most, but not all, Linux users. But that doesn’t really change the math much. Assume that none of the LibreOffice downloads are from Linux users. Assume that the entire 27K/day comes from Windows and Mac users. Then we can do an apples-to-apples comparison to the Apache OpenOffice numbers, where we know that only 3% of the downloads are from Linux users. So the better comparison would be a very conservative 0.97 * 118K = 114K/day versus LO’s best-case 27K/day for Windows and Mac. A similar calculation could be done on the legacy OOo 3 figure, with similar results.

+1 for Apache OpenOffice 3.4

2012/05/08 By Rob 3 Comments

Read more in the official announcement. You can download Apache OpenOffice 3.4 now, from http://download.openoffice.org/. Tell your friends. And welcome home.

Ending the Symphony Fork

2012/02/01 By Rob

What is a fork?

A fork is a form of software reuse. I like your software module. It meets some or many of my needs, but I need some additional features.

When I want to reuse existing functionality from another software product, I generally have four choices:

1. If your module is nicely designed and extensible, then I might be able to simply use your code as-is and write new code to extend it.
2. I can convince you to modify your module so it meets my needs.
3. I can work with you in your open source project to make the module (“our” module in this case) meet our mutual needs.
4. I can copy the source code of your module and change the code in my copy, and integrate that modified module into my product.

Note that options #1 and #2 are the only options available with most proprietary modules, since these techniques don’t require access to the module’s source code. Options #3 and #4 are the additional options made possible by open source. Option #4 is what we mean by “forking”. Forking is enabled by open source software and is fundamental to open source ecosystems. It is neither good nor bad. It is a tool, part social, part technological, for overcoming an inability or unwillingness to collaborate. The problem is not with forking. The problem is the conditions that lead to forking.

Why do forks come about and how do they end?

Forks can come about for many reasons, including leadership conflicts, ideological differences and other political issues, as well as differences in vision and technical direction of the project.

Generally, a fork ends when the conditions that necessitated the formation of the fork have been resolved. At least that is true for rational participants who are merely trying to optimize outcomes. But intransigent ideological forks can continue indefinitely, and often do.

The technical side of ending a fork is typically a code merge, as different branches of the project are brought back together again. This can be laborious, but it is a one-time task.

Ending the Symphony Fork

With the move of OpenOffice to Apache, this open source project has made the critical move from a corporate-led open source project under asymmetrical licensing terms, to a community-led open source project under a single permissive license. This is a tremendous change and one that should lead all forks of OpenOffice, and all those who wanted to get involved with OpenOffice before but never did, to reexamine their orientation to the project.

John Maynard Keynes, when criticized for reversing his position in a dispute, famously quipped, “When the facts change, I change my opinion. What do you do, sir?” The “facts” of OpenOffice have changed, with the move to Apache, and this change of venue has made a huge impact on the Symphony team, which recently announced that it was ending its fork and committing to contribute its code to Apache and to work with that community going forward.

This does not mean that Symphony enhancements are going away. Far from it. We’re very proud of the UI work and other innovations in performance, accessibility and interoperability we’ve brought to Symphony and we will be offering the source code of these enhancements to Apache, and if accepted, will work within that project to merge these changes into Apache OpenOffice. The DNA of Symphony is not going away. What is going away is Symphony as a fork, as a divided effort. The Symphony DNA, the cool work the Symphony team has worked so hard on, will live on, in Apache OpenOffice, combined with other ongoing contributions from the community, in a larger, stronger development effort.

Now that the Symphony fork is ending, the obvious question is: Who will be next? If we can end a four-year-old fork and merge our work with Apache, then it should be that much easier for forks that have been around for far less time. “When the facts change, I change my opinion. What do you do, sir?”

If you are interested in learning more about the Apache OpenOffice project, I recommend browsing the project’s website and blog. If you want to get involved, you can sign up for the ooo-dev mailing list and post a note to introduce yourself. As we push closer to our 3.4 release candidate we’re in particular need of volunteers to help us test this release, on Windows, Mac or Linux. If you are interested in helping with that, be sure to say so in your note.

(This post has also been translated into Serbo-Croatian by Anja Skrba.)

First release of the Apache ODF Toolkit

2012/01/26 By Rob 2 Comments

The Apache ODF Toolkit 0.5 (incubating) release is now available for download. Detailed change notes are also posted. The ODF Toolkit is a Java library for reading, writing and creating ODF documents. It is entirely in Java and does not require that you install a desktop editor like OpenOffice. It operates directly on the file format and is suitable for server-side use, for tasks such as document automation, report generation, information extraction, etc.

As mentioned in a previous post, the Java components from the ODF Toolkit Union have moved over to Apache. Since this open source project was already using the Apache 2.0 license, the work required to achieve our first Apache release was relatively straightforward. The major task was to take the various components of the Toolkit, which were treated as independent projects at the ODF Toolkit Union, and get them to work better together as a single Toolkit, e.g., building them together using the same version of the JDK and packaging them together into a consolidated release bundle. Not rocket science, but it did require some iteration.

We’re starting now to put together a plan for the next release and future releases. Some of the items under consideration include:

Add document encryption/decryption support
Add digital signature support
Update to the final published ODF 1.2 schema
Update the demo applications
Add concurrency testing
Add support for ODF 1.2’s RDFa/RDF XML semantic metadata feature
Implement ODF 1.2’s OpenFormula spreadsheet formula language
Add a high-performance, event-driven streaming API for the subset of tasks that can be done efficiently that way
Write more cookbook examples
Do more testing and bug fixing

If you are interested in learning more about the ODF Toolkit, you should visit our website. If you have further questions, we have a users list and a development list that you are welcome to join.

If you know some Java and are interested in ODF, I’d encourage you to take a look at this project and consider participating. We are a small, international, welcoming group working on this project, with a strong focus on quality. Come, take a look.

An Invitation to the Apache ODF Toolkit

2011/08/15 By Rob 3 Comments

Perhaps overlooked in all the excitement generated by the move of OpenOffice.org to Apache was the fact that a parallel move is occurring with the ODF Toolkit. A few weeks ago we submitted a proposal to Apache to start a new project based on the Java components that were until then hosted by the ODF Toolkit Union. This was done after consulting with the ODF Toolkit community and getting approval from the ODF Toolkit Union’s Steering Committee. This proposal was recently reviewed, voted on and approved by Apache. So now we have the Apache ODF Toolkit project in the Apache Incubator.

So what is this project and what is it good for?

This project consists of Java libraries and tools for working with ODF documents. Not editors, not viewers, not anything with a user interface. These are not end-user tools. These are tools for developers who need to write programs that read, write or manipulate ODF documents. These tools do not require that you have any ODF editor installed. They operate directly on the files. So they are ideal for running on a server, for things like report generation, information extraction, document validation, conversion, etc. We have a page of demos that gives a good idea of the range of things possible with the ODF Toolkit.
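To make “operate directly on the files” concrete: an ODF document is a ZIP package, and the body of the document lives in an entry named content.xml. The sketch below uses only the JDK, not the ODF Toolkit API, and its class and method names are illustrative. It builds a tiny stand-in package in memory so that it runs without any input file. That stand-in is not a valid ODF document, since a real package also carries a mimetype entry and a manifest.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class OdfPeek {

    /** Returns the text of one entry in a ZIP package, or null if it is absent. */
    public static String readEntry(byte[] zipBytes, String entryName) {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.getName().equals(entryName)) {
                    // readAllBytes() stops at the end of the current entry.
                    return new String(zip.readAllBytes(), StandardCharsets.UTF_8);
                }
            }
            return null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds a one-entry ZIP in memory; a stand-in for a real .odt file. */
    public static byte[] zipOf(String entryName, String body) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            zip.putNextEntry(new ZipEntry(entryName));
            zip.write(body.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] pkg = zipOf("content.xml", "<text:p>Hello world!</text:p>");
        System.out.println(readEntry(pkg, "content.xml"));
    }
}
```

Getting at the raw XML is the easy part. Interpreting it correctly (styles, lists, tables, metadata) is where a library earns its keep.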

The ODF Toolkit is important because it enables innovation on top of ODF. By analogy, look at HTML. At one point, the web consisted mainly of hand-authored documents at a handful of academic and government websites. If that was all there was to the web, it would not have been very interesting. What made the web the platform it is today has been the technologies that enable server-side generation of web pages from database queries, or services that analyze web pages and extract and aggregate information. Google was made possible because HTML was an open standard that could be programmatically understood. PHP was possible because HTML was an open standard that could be written.

ODF, unlike the previous generation of binary document formats, is also an open standard. You can read and write ODF documents freely. But writing the code to understand the nitty-gritty of the ODF format is a considerable task. The ODF Toolkit makes this easy for Java programmers. How easy? Here is a “hello world” text document:

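// Simple API class from the ODF Toolkit; these calls can throw Exception.
import org.odftoolkit.simple.TextDocument;

TextDocument doc = TextDocument.newTextDocument();
doc.addParagraph("Hello world!");
doc.save("hello.odt");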

Other tasks, like changing styles, combining presentation slide decks, searching and replacing text in a document, and extracting text from a document, are also simple. More examples that give a flavor of the ODF Toolkit are in the “cookbook”.

But along with the “Simple API” the ODF Toolkit has the ODFDOM layer. This layer allows you to get to every part of an ODF document, at the finest-grained level. Some tools out there give you only a high-level API but then leave you hanging if you want to do something more complicated. Not so with the ODF Toolkit. If you want to drill down and adjust the line spacing of a bullet list in a footnote, then you can do it.

These components enable innovation on top of ODF, innovation that thinks “outside the editors” and “beyond office”.

So how do you get involved? If you want to help with the project then I invite you to sign up on the project’s development mailing list. And if you have questions about using the ODF Toolkit, but don’t want the additional email traffic from the dev list, then you can sign up for the users list. Of course, I’ve signed up for both lists. I hope I’ll see you there!