
An Antic Disposition


ODF

Best Practices for Authoring Interoperable ODF Documents

2011/03/10 By Rob 2 Comments

In the OASIS ODF Interoperability and Conformance TC we have recently started work on a new document, a “Committee Note” which will be called, “Best Practices for Authoring Interoperable ODF Documents”.

I will be the editor for this document.

If you are not yet familiar with a “Committee Note”, it is a new category of document that has recently been added to the OASIS process.  Think of it as analogous to an ISO Technical Report.  A Committee Note (or CN) goes through the same level of review and approval with a Technical Committee, the same public review requirement, etc.  But it does not get approved as a standard, so it does not define, for example, conformance requirements.  It is intended for things like implementation guides, best practices, white papers, etc.

The general aim of the new CN is to collect and describe guidelines for authors on how best to create interoperable (portable) ODF documents.  What to do and what to avoid.  Although the focus will be on ODF, much of this will be applicable to any WYSIWYG word processing environment.

I’m thinking of this as being analogous to the “How to write portable C” books we saw years ago.  As many of you know, C programs can range from the perverse (see the Obfuscated C Competition for examples) to highly portable.  But portability does not come about by accident.  The language permits portability, but it does not enforce it on the user.  C is powerful enough for a user to hang themselves.

The modern WYSIWYG word processor is similar.  A user can create interoperable (portable) documents, but the word processor also allows them to create documents that will be tied tightly to their precise operating environment and will render poorly everywhere else.  The tool takes you only so far, and then user education must help with the rest of the way.  I hope that this Committee Note will provide some of that user education.

I am absolutely certain that I am not the first one to have thought about this problem.  In fact, I suspect (and hope) that many of my readers have done so themselves.  So before I start drafting this document, I’d like to solicit contributions of material.  Maybe you have written a paper, report or blog post on this topic?  Maybe you can take a few minutes to jot down your ideas?  Maybe you can refer us to other sources of information?

But please don’t give me the information here.  Per OASIS IPR rules we need to channel any contributions to the Technical Committee, so permission to use your original material is secured, from the copyright perspective.  So if you have a contribution you want to make for this document, please do so on the OIC TC’s comment list.  And if you want to participate more closely in the creation and editing of this document, then you are always welcome to join OASIS and participate directly in the TC’s work.  The cost for individual memberships is quite reasonable.


Filed Under: Interoperability, ODF

The Versions of ODF

2011/02/10 By Rob 8 Comments

It has been a few months now since the OASIS ODF TC has done substantive technical work on ODF 1.2.  We had a 60-day public review last summer, a 15-day public review last December and we will start another (hopefully final) 15-day public review this week.  Every time we make a change to the specification in response to public comments we are required to have another 15-day review of the changes.  This is all necessary procedural work, to make sure all stakeholders have the opportunity to comment.  But it is not very exciting.

However, as the ODF 1.2 specification goes through the remainder of its review/approval process in OASIS, we’ve increasingly turned our attention to ODF-Next.  Tentatively (and we should have a TC vote on this work plan in the next few weeks), we’re looking at a two-year schedule for ODF 1.3, with four intermediate drafts (Committee Specification Drafts or CSDs).  The first CSD would appear in September, 2011.  We have not yet defined what features will be in ODF 1.3.  So this is a great time to join the ODF TC, to “get in on the ground floor” for defining the next release.

While we await approval of ODF 1.2 and start work on ODF 1.3, we continue to maintain ODF 1.0 and ODF 1.1, the previous versions of ODF.  And by “maintain” I mean we receive and track defect reports and publish corrections to the specification.  So effectively, the OASIS ODF TC is working on four versions of ODF.

Since the progression from ODF 1.0 –> ODF 1.1 –> ODF 1.2 –> ODF 1.3 is designed to be compatible, the average user will not notice a difference.  Your ODF 1.0 documents should load just fine in your ODF 1.2 or ODF 1.3 editor.  We try very hard not to introduce “breaking changes” that would cause trouble with older documents.  Of course, the application vendor has a responsibility here as well, to pay attention to version compatibility issues.  But from the perspective of the standard I do not believe that we’ve done anything that would prevent an editor from being (at the same time) a conforming ODF 1.0, ODF 1.1 and ODF 1.2 application.  In fact, I’d expect most ODF editors today to be able to read any version of ODF, though they might only save the most current version, or maybe the two most current versions.

An additional complexity is that we have ODF standards in OASIS and ISO.  I’ve heard that some are confused by this, especially how these different versions correspond.  I hope I can make this clearer.

First, flash back to the 1990s.  After decades of success with standardizing nuts and bolts and shipping containers and the various aspects of the physical world relevant to international trade, ISO was at the crossroads.  There wasn’t much more left for them to standardize in that physical world.  They were seeing success with management standards, which would soon become a major part of their work, e.g., ISO 9001, quality management.  But ISO was not doing that well with technology standards.  Their OSI reference network model was a flop.  C++ was laboring on, six years in committee.  And then competition emerged from new, more agile, standards consortia, like the IETF and W3C.  They were rocking the industry with highly relevant specifications that essentially created the web.  Almost every core technology of the internet, including TCP/IP, HTTP, HTML, XML, JavaScript, SMTP, MIME, POP3, IMAP, etc., was developed outside of the ISO system.

You can be quite sure that this new competition did not escape notice in Geneva.  As they say, “If you can’t beat them, join them”.  Or in this case, get them to join you.  One of the ways in which ISO/IEC JTC1 (the ISO committee that controls tech standards) responded was to introduce the Publicly Available Specification (PAS) transposition process.  The idea here was to allow recognized standards consortia (and there is a formal ISO process to gain such recognition) to submit already-approved market relevant standards to ISO/IEC JTC1 for accelerated processing and approval as an International Standard.  Essentially, such PAS submissions skip over the ISO Working Group and Subcommittee stages of work, and advance directly to a final approval ballot.   This is a win-win situation.  ISO has more relevant standards in its catalog, and consortia can continue to produce their work at a more nimble pace.

So when we look at the versions of ODF, we have more than just ODF 1.0, 1.1, 1.2 and 1.3.  Each of these has an OASIS and an ISO version.  And for each numbered version we have published corrections, and these are reflected both in the OASIS and the ISO catalogs.  It sounds messy at first, but the important thing to note is that OASIS and ISO/IEC JTC1 have agreed to keep their corresponding versions of ODF “technically equivalent”.  This was agreed to in a Memorandum of Understanding.  This means that you should be able to use the OASIS or the ISO version according to your needs and have confidence that they are compatible.  If you require an ISO version, then you can use that.  If you want the very latest version, then use the OASIS version, since the ISO version typically lags by a year or more.

I hope the above diagram clarifies which versions of ODF are technically equivalent.  Note that this is not a time line.  The actual order that the various versions were published in is more complicated, since corrections to older versions of ODF can (and do) come after publication of newer editions.  But this diagram shows the correspondence of “technically equivalent”  OASIS and ISO versions of ODF.  The big rounded blocks are published standards, the indented smaller ovals are published corrections (“Errata” in OASIS and “Corrigenda” in ISO), and the indented rectangle on the ISO side is an amendment.

In particular, note:

  • OASIS ODF 1.0 corresponds to ISO/IEC 26300:2006
  • OASIS has published two Errata documents for ODF 1.0, and both have corresponding Corrigenda in ISO, the first one already approved, the second one currently under ballot.
  • OASIS ODF 1.1 + Errata 01  corresponds to ISO/IEC 26300:2006 + Corr.1 + Corr.2 + Amd. 1.  This is a more complicated case, since we’re rolling up several corrigenda as well as the changes from OASIS ODF 1.1.  But the net result is that after Amd. 1 is approved (and the ISO ballot is now underway) we will have an ISO version of ODF 1.1.
  • The plan is to submit OASIS ODF 1.2 to ISO/IEC JTC1 under PAS transposition rules.  I expect that we will receive defect reports on ODF 1.2, and these would be addressed as Errata in OASIS and Corrigenda in ISO, to maintain technical equivalence.
  • Ditto for ODF 1.3.  Once approved by OASIS, we submit for PAS transposition and maintain to preserve technical equivalence.

So this isn’t really all that complicated.  We have a series of compatible ODF versions over several years.  The technical work is done in OASIS, in a technical committee.  Once approved by the OASIS membership the OASIS version of ODF is submitted under PAS rules to JTC1.  Once approved by ISO, the OASIS ODF committee and the ISO ODF committee (called ISO/IEC JTC1 SC34/WG6) meet regularly to ensure that the two versions remain aligned, with specific attention to ensuring that we’re both looking at the same set of defect reports and keeping corrections in sync.


Filed Under: ODF

Microsoft Office and ODF: Best Practices

2010/12/20 By Rob 13 Comments

I’ve received a few questions about how to read/write ODF documents from Microsoft Office.  I looked around and did not find a comprehensive Microsoft web page on this topic, so I’m putting together this page as a reference for best practices on how to use ODF with Office.

I intend to update this post as I find more information, so feel free to add a comment if you have a link to some additional material.

Depending on what version of Microsoft Office you are running you may have up to three different ways of working with ODF documents, either through native support in Office or through a third-party extension.  Your options are listed in the following table:

[table id=7 /]

A few notes on each option:

  • Native support for ODF 1.1 is available in Office 2010, and in Office 2007 once you install Service Pack 2 (SP2).  Using the Office Customization Tool, administrators can configure Office to default to ODF format for new documents.  Office will give you a warning message whenever you try to save a document in ODF format, but this can be disabled according to these instructions.  Some ODF features are either not available, or are implemented in a way that is not interoperable with other ODF editors like OpenOffice.org.  Examples include spreadsheet formulas and change tracking.  Microsoft has written up in detail what features are and are not supported when saving Word, Excel and PowerPoint documents in ODF format.
  • Oracle’s ODF Plugin is available commercially, with support.  It is the only current option for those who require ODF 1.2 support.  It is also the only option that supports Office 2000.  An earlier version of this plugin, originally made available by Sun at no cost for individual use, is still available for download at Softpedia.
  • The ODF Add-in for Microsoft Office is an open source project developed under Microsoft sponsorship by several smaller companies.

Unanswered questions:

  • Are there any head-to-head reviews of these three options, preferably one that looks at standards conformance and interop?
  • Are there any ODF options for Office 2008 or Office 2011 (Mac Office)?

Filed Under: ODF

ODF TC Creates Advanced Document Collaboration Subcommittee

2010/12/05 By Rob 14 Comments

The OASIS ODF Technical Committee voted a couple of weeks ago to create a new subcommittee, on “Advanced Document Collaboration”.  Robin LaFontaine, from DeltaXML, will chair the subcommittee.

Since the entire ODF TC is quite large now (almost 20 active members attend each meeting) it is impossible to do a technical “deep dive” on every topic in our meetings.  So when a particular specification domain requires sustained attention for a period of time, we can create a subcommittee, to allow interested TC members to study and draft specification enhancements.   We’ve done this several times before.  For example,  the Accessibility SC  developed the accessibility enhancements for ODF 1.1.  And the Formula and Metadata subcommittees drafted those key parts of ODF 1.2.  I hope that this new SC will be equally successful in their work.

So what is “Advanced Document Collaboration”?  A key part of this will be enhancing change tracking in ODF.  I’ve been looking at how existing applications implement change tracking and I’m not 100% satisfied.  And I don’t mean only ODF editors.  Even Microsoft Office using OOXML lacks full and complete change tracking support.  For example, Microsoft Word does not track changes that occur in an OLE object.  And change tracking in PowerPoint is entirely absent.  And starting in ODF 1.2 we have an additional RDF metadata layer in documents and we need to consider how change tracking deals with this.  So there is a good opportunity here for us to advance the state of the art.

We are fortunate that earlier this year the OpenDoc Society, with sponsorship from NLnet Foundation, commissioned a proposal of a feature-complete change tracking specification from DeltaXML.  This draft has also been contributed to the ODF TC and has attracted some implementor interest, with prototyping work occurring both in KOffice and AbiWord.

While studying change tracking, I’m hoping the SC will be able to give some thought to how we might canonically represent an “editing change” artifact.  By this I mean a high level change which in the general case might be a correlated set of content, style and metadata changes which appears atomic to the user, but which at the implementation level might touch several XML files in the ODF document.  This editing change artifact, aside from being necessary to represent change tracking, could also be quite useful in other problems, such as a runtime clipboard format, as a quantum of change in a real-time collaborative editor, or to represent the persistent form of a document selection, which itself is useful in contexts such as fine-grained digital signatures.  Not all of this happens overnight of course.  But I’m hoping that the initial work on feature-complete change-tracking will give other benefits down the road.

The charter for the new Subcommittee follows.  If you are interested in these topics but are not already a member of OASIS, then I’d encourage you to join now, so you can “get in on the ground floor” with these exciting new discussions.

Statement of Purpose

Many ODF documents do not involve collaboration. They are created by a single user, edited by a single user, and then perhaps presented or shared with multiple users, or maybe even just converted to PDF for distribution.

However, collaboration-based document scenarios are also common, including review and comment, change tracking as well as emerging work in real-time collaborative editing, in-context document collaboration, persistence of structured document fragments, and so on.

In order to bring together technical experts in these areas, and for them to evaluate trends, investigate opportunities and draft enhancements to ODF in these areas, we are proposing a dedicated subcommittee for this topic.

The initial and highest priority for the Subcommittee will be change tracking. Reliable and user-friendly revision management is critical for professional document workflows in corporate and public sector environments, and as such an important feature of Open Document Format.

The SC is asked to prepare a draft specification of a markup vocabulary that can accurately describe any incremental change to the content and structure of documents – typically made in multiple editing sessions by different authors.

Deliverables

  1. A draft specification for change tracking, including Relax NG schema
  2. A description on how to apply change tracking markup to the various versions of the OpenDocument Format (ODF) as a host format.
  3. A set of test documents that will allow implementers to validate their change tracking implementations.
  4. A document that describes in detail how the existing change tracking mechanism in ODF can be converted to the new markup.
  5. Other proposals, draft specifications and in-scope work related to the subcommittee’s Purpose.

Filed Under: ODF

Introducing: the Simple Java API for ODF

2010/11/01 By Rob Leave a Comment

The Announcement

The first public release of the new Simple Java API for ODF is now available for download. This API radically simplifies common document automation tasks, allowing you to perform tasks in a few lines of code that would require hundreds if you were manipulating the ODF XML directly.

The Simple API is part of the ODF Toolkit Union open source community and is available under the Apache 2.0 license.   JavaDoc, demonstration code and a “Cookbook” are also available on the project’s website.

The Background

I first proposed an ODF Toolkit back in 2006, shortly after I got involved with ODF.  It was clear then that one of the big advantages of ODF, compared to proprietary binary formats, is that ODF lent itself to manipulation using common, high level tools.  I made a list of the top 20 document-based “patterns of use”, but the key ones are in the following areas:

  • Mail merge style field replacement
  • Combining document fragments/document assembly
  • Data-driven document generation
  • Information extraction

The hope was that we could make it easy to write such applications using ODF.

So why wouldn’t this be easy?   In the end ODF is just ZIP and XML and every programming platform knows how to deal with these formats, right?

Yes, this is true.  However there clearly are a lot of details to worry about.  Although ZIP and XML are relatively simple technologies, defining exactly how ODF works requires over a thousand pages.  This level of detail is necessary if you are writing a word processor or a spreadsheet.  But you really don’t need to know ODF at this level in order to accomplish typical document automation tasks.
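
To make the "just ZIP and XML" point concrete, here is a minimal sketch that uses only the Java standard library.  It writes a toy stand-in package to a temporary file, then opens it as a ZIP archive and parses content.xml with a stock XML parser.  The class name OdfPackagePeek, its method names and the sample content are assumptions made up for illustration, and the toy package is not a complete or conforming ODF document.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper class, not part of any ODF toolkit.
class OdfPackagePeek {
    // Illustrative content only; a real content.xml carries far more markup.
    static final String CONTENT_XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
        + "<office:document-content"
        + " xmlns:office=\"urn:oasis:names:tc:opendocument:xmlns:office:1.0\""
        + " xmlns:text=\"urn:oasis:names:tc:opendocument:xmlns:text:1.0\""
        + " office:version=\"1.2\">"
        + "<office:body><office:text><text:p>FOO is up today.</text:p>"
        + "</office:text></office:body>"
        + "</office:document-content>";

    // Writes a toy package for the demo.  A real ODF file has more parts
    // (manifest, styles, and so on) and stores "mimetype" uncompressed.
    static Path writeToyPackage() throws IOException {
        Path file = Files.createTempFile("toy", ".odt");
        try (OutputStream out = Files.newOutputStream(file);
             ZipOutputStream zip = new ZipOutputStream(out)) {
            zip.putNextEntry(new ZipEntry("mimetype"));
            zip.write("application/vnd.oasis.opendocument.text"
                    .getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
            zip.putNextEntry(new ZipEntry("content.xml"));
            zip.write(CONTENT_XML.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return file;
    }

    // Opens the package as a ZIP and returns all text found in content.xml.
    static String readBodyText(Path odfFile) throws Exception {
        try (ZipFile zip = new ZipFile(odfFile.toFile())) {
            ZipEntry entry = zip.getEntry("content.xml");
            try (InputStream in = zip.getInputStream(entry)) {
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                factory.setNamespaceAware(true);
                Document dom = factory.newDocumentBuilder().parse(in);
                return dom.getDocumentElement().getTextContent();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = writeToyPackage();
        System.out.println(readBodyText(file));
        Files.delete(file);
    }
}
```

Getting the raw text out is the easy part.  Doing anything structural, such as adding a hyperlink without breaking styles or the manifest, is where knowledge of the full specification comes in.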

There have been several other attempts at writing toolkits in this space.  Some, such as the ODF Toolkit Union’s ODFDOM project have aimed for a low-level, Java API, with a 1-to-1 correspondence with ODF’s elements and attributes.  Others, like lpOD’s Python API have taken a higher-level view of ODF.  You can make a good argument for either approach.  Each has its advantages and disadvantages.

The advantage of the low-level API is that if you want to manipulate an existing ODF document, which in general can contain any legal ODF markup, then you need an API that understands 100% of ODF.  But understanding that API would require understanding the entire ODF standard.  So that is too complicated for most application developers.

If you write a high level API, then it may be easy to use, but how can you then guarantee that it can losslessly manipulate an arbitrary ODF document?

I think the best approach might be a blended approach.  Build a low-level API that does 100% of ODF, and then on top of that have a layer that provides higher-level functions that do the most-common tasks.  This gives you the benefits of completeness and simplicity.  This is the approach we have taken with the Simple Java API for ODF.  It is built upon the schema-driven ODFDOM API, to give it a solid low-level foundation.  And on top of that it adds high-level functions.  How high?  The aim is to provide operations that are similar to what you as an end-user would have available in the UI, or what you as an application developer would have with VBA or UNO macros.  So adding high level content, like tables or images.  Search and replace operations.  Cut and paste.  Simple, but still powerful.
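
The layering idea can be sketched in a few lines.  What follows is not the Simple API or ODFDOM.  It is a hypothetical class, LayeredDocSketch, that uses the standard W3C DOM as a stand-in for the low-level layer, with one high-level convenience method on top and an escape hatch back down to the raw tree.  All class and method names here are assumptions for illustration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical sketch of a two-layer design; not a real toolkit class.
class LayeredDocSketch {
    static final String OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0";
    static final String TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";

    private final Document dom;   // low-level layer: the full XML tree
    private final Element body;   // container that paragraphs are appended to

    LayeredDocSketch() throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        dom = factory.newDocumentBuilder().newDocument();
        body = dom.createElementNS(OFFICE_NS, "office:text");
        dom.appendChild(body);
    }

    // High-level operation: the caller needs no knowledge of the markup.
    Element addParagraph(String text) {
        Element p = dom.createElementNS(TEXT_NS, "text:p");
        p.setTextContent(text);
        body.appendChild(p);
        return p;
    }

    // Counts paragraphs by querying the underlying tree.
    int paragraphCount() {
        return dom.getElementsByTagNameNS(TEXT_NS, "p").getLength();
    }

    // Escape hatch: full access to the low-level layer when needed.
    Document getDom() {
        return dom;
    }

    public static void main(String[] args) throws Exception {
        LayeredDocSketch doc = new LayeredDocSketch();
        doc.addParagraph("Hello");
        Element p = doc.addParagraph("World");
        // Drop down a level for something the high-level layer does not cover.
        p.setAttributeNS(TEXT_NS, "text:style-name", "Standard");
        System.out.println(doc.paragraphCount());
    }
}
```

Because the high-level method returns the underlying element, a caller can start simple and reach for the lower layer only when a task demands it.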

A Quick Example

As a quick illustration of the level of abstraction provided by the Simple Java API for ODF, let’s build a simple app.  We want to load ODF documents, search for stock ticker symbols and add a hyperlink for each one to the company’s home page.

So, start with a document that looks like this:

We want to take that and find each instance of “FOO” and add a hyperlink to “http://www.foo.com” and so on.  You could certainly do this by manipulating the ODF XML directly, but it would require a good deal of familiarity with the ODF standard.  Using the Simple API, you can do it without touching XML directly.

Let’s see how this is done.


// basic Java core libraries that any Java developer knows about
import java.net.URL;
import java.io.File;

// Simple API classes for text documents, selections and text navigation
import org.odftoolkit.simple.TextDocument;
import org.odftoolkit.simple.text.search.TextSelection;
import org.odftoolkit.simple.text.search.TextNavigation;

public class Linkify
{
    public static void main(String[] args)
    {
        try
        {
            // load text document (ODT) from disk.
            // could also load from URL or stream
            TextDocument document=(TextDocument)TextDocument.loadDocument("foobar.odt");

            // initialize a search for "FOO".
            // we'll be adding regular expression support as well
            TextNavigation search = new TextNavigation("FOO", document);

            // iterate through the search results
            while (search.hasNext())
            {
                // for each match, add a hyperlink to it
                TextSelection item = (TextSelection) search.getCurrentItem();
                item.addHref(new URL("http://www.foo.com"));
            }

            // save the modified document back to a new file
            document.save(new File("foobar_out.odt"));
        }

        catch (Exception e)
        {
            e.printStackTrace();
        }
    }

}

Run the code and you get a new document, with the hyperlinks added, like this:
Simple enough? I think so.

How to get involved

We really want your help with this API.  This is not one of those faux-open source projects, where all the code is developed by one company.  We want to have a real community around this project.  So if you are at all interested in ODF and Java, I invite you to take a look:

  1. Download the 0.2 release of the Simple Java API for ODF.  The wiki also has important info on install pre-reqs.
  2. Work through some of the cookbook to get an idea of how the API works.
  3. Sign up and join the ODF Toolkit Union project.
  4. Join the users mailing list and ask questions.  Defect reports can go to our Bugzilla tracker.
  5. If you want to contribute patches, there is more info on the wiki on how to access our repository.

Filed Under: ODF



Copyright © 2006-2023 Rob Weir · Site Policies

 
