The Challenge

2008/05/05 By Rob 17 Comments

<?xml version="1.0" encoding="UTF-8"?>
<office:document-content
xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
office:version="1.0">
<office:body>
<office:text>
<text:p>Dear Alex Brown. Please prove that I am invalid ODF 1.0 (ISO 26300:2006).
I do not think that I am. In fact I think that your statement that there are
no valid ISO ODF documents in the world, and that there cannot be, is a brash,
irresponsible and indefensible piece of bombast that you should retract.</text:p>
<text:p>(Please note that this document contains no ID, IDREF or IDREFS attributes.
Nor does it contain custom content.)</text:p>
</office:text>
</office:body>
</office:document-content>

Comments

penguiniator says

2008/05/05 at 11:31 pm

Okay, I’m dumb. I took your text, copied it into a plain file, saved it, and opened it in OOo 2.3 and Kword 1.6.3.

OpenOffice just showed me the raw markup. Kword declared it was invalid and refused to open it, complaining it had no declared mime type.

Is this meant only to be fed to a validator, or is it supposed to work in real office applications? And if so, how?

Alex says

2008/05/06 at 12:54 am

The response

1. As you have just written on my blog … Relax NG defines validity in section 3.25. A document is valid with respect to a particular schema when it is a “member of the set of XML documents described by the schema”.

2. The ODF schema contains an unresolvable ambiguity, as reported by me, and by both of the well-established validation applications.

3. Therefore the set of documents described by the schema is an empty set.

4. Therefore there are no valid ODF documents.

QED

P.S. My challenge to you: either point out a factual inaccuracy in the above, expose a flaw in my logic, or issue a “retraction” yourself.

Anonymous says

2008/05/06 at 5:37 am

“Please note that this document contains no ID, IDREF or IDREFS attributes. Nor does it contain custom content.”

Rob, as an exercise, it would be great to have a “hello world” example with the things above, and show just theoretically what should be done to assure conformance (the right word here I believe, right?) to the specification, since you can not use the Relax NG validator for that.

Rob says

2008/05/06 at 8:45 am

Wrong, Alex. Ambiguity increases the set of documents which are valid, not decreases it.

Also, from the Relax NG standard, section 9.1 “Inference Rules”:

“The semantics of a RELAX NG schema consist of a specification of what XML documents are valid with respect to that schema. The semantics are described formally. The formalism uses axioms and inference rules. Axioms are propositions that are provable unconditionally. An inference rule consists of one or more antecedents and exactly one consequent. An antecedent is either positive or negative. If all the positive antecedents of an inference rule are provable and none of the negative antecedents are provable, then the consequent of the inference rule is provable. An XML document is valid with respect to a RELAX NG schema if and only if the proposition that the element representing it in the data model is valid is provable in the formalism specified in this clause.”

Since this document instance provably matches one of the patterns described in the ODF 1.0 grammar, and this document instance contains nothing that is ambiguous with respect to the ODF 1.0 schema, it is valid. Can you show me an inference rule that is provably false among those necessary to validate this document instance?

So I question your logic for going from your step 2 to your step 3 in your reasoning. I don’t see this as following.
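To make the point about choice concrete, here is a toy sketch in Python. It is not the ODF schema, and it is not how jing or msv work internally; the helper names `element`, `choice`, `group` and `is_valid` are hypothetical, invented only for this illustration. Each pattern returns the number of distinct ways it matches a sequence of element names. A document is a member of the set described by the pattern when that count is at least one, so a pattern that matches in two ways (an ambiguity) still accepts the document.

```python
from typing import Callable, List

# A pattern maps a list of element names to the number of distinct ways
# ("derivations") in which it matches the whole list. Toy model only.
Pattern = Callable[[List[str]], int]


def element(name: str) -> Pattern:
    """Match exactly one element with the given name."""
    return lambda names: 1 if names == [name] else 0


def choice(p1: Pattern, p2: Pattern) -> Pattern:
    """Match if either alternative matches; derivations add up."""
    return lambda names: p1(names) + p2(names)


def group(p1: Pattern, p2: Pattern) -> Pattern:
    """Match p1 followed by p2, trying every split point."""
    def match(names: List[str]) -> int:
        return sum(p1(names[:i]) * p2(names[i:]) for i in range(len(names) + 1))
    return match


def is_valid(pattern: Pattern, names: List[str]) -> bool:
    """A document is valid when at least one derivation exists."""
    return pattern(names) > 0


# An ambiguous pattern: both alternatives accept the same element.
ambiguous = choice(element("p"), element("p"))

# Adding an alternative can only widen the accepted set, never shrink it.
widened = choice(element("p"), element("h"))

print(ambiguous(["p"]))            # number of derivations for a single "p"
print(is_valid(ambiguous, ["p"]))  # still a member of the described set
print(is_valid(widened, ["h"]))    # accepted thanks to the extra alternative
```

In this toy model the ambiguous pattern accepts the single-element document in two ways, which is more than enough for membership. An empty set would require that no derivation exists at all.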

(Btw, both jing and msv validate this document.)

By analogy, I go into a store to buy a banana. The fruit counter also has oranges. The oranges have some price stickers upside down. Some say 0.99 and some say 0.66. The price is ambiguous. Does that mean that the bananas have no valid price and I cannot buy a banana? Of course not. You seem to suggest that the whole store cannot work and that no wares have prices because of an upside-down sticker on one fruit, and in particular a fruit that I don’t want at this time.

As for the question of loading this document in OpenOffice or KOffice, you would need to also follow the required packaging rules for ODF for this to work, adding a manifest file, putting in a Zip, etc. These are the conformance requirements beyond validity.

Jeetje says

2008/05/06 at 10:45 am

Just out of curiosity, how many ambiguities have been accounted for regarding DIS29500?

If more than one, why weren’t they caught at the BRM, seeing that such a meeting should result in meaningful schemas (not stuff that only describes empty sets ^^)?

Joel Stobart says

2008/05/06 at 11:06 am

As I understand it:

ODF has a (potential) flaw in version 1.0 that means that validation throws up lots of warnings/errors. Well, it’s annoying, but not the end of the world.

Hopefully ODF 1.1 does better with this test? I imagine if we spent ages we could come up with thousands of ways we could improve ODF… So Alex, what would you do to fix it? Rob; what would make it neater, more explicit?

Let’s all forget the invisible-standard – nothing to see there.

– Joel

orcmid says

2008/05/06 at 11:53 am

Although this is in the single-XML-file case of the OpenDocument format, it is difficult to find an application that natively and automatically recognizes this as an ODF document, which is one of those great curiosities of standards, their specifications, and what gets implemented.

So, if a valid ODF document has no processor for it, is this like a tree falling in the woods?

There is (or was) a filter that allows this to be imported and exported from OO.o (Windows version), but you’ll need a Java Runtime because the filter requires it. Then you can try importing and exporting this document after using a hack to register the filter. (I forget, but it is on one of the ODF lists somewhere.)

Rob says

2008/05/06 at 12:24 pm

Don’t confuse conformity with validity. These are two different concepts. Validity is an XML concept, the relationship between an XML document instance and a schema. For the sake of simplicity, I’ve given a single XML document instance that is valid against the ODF 1.0 schema.

Conformance is the relationship of an ODF document, an ODF producer or an ODF consumer to the ODF 1.0 standard. Validity is part of it, but conformance requires more than just validity.

Alex has made a claim about validity. That is why I’ve given a single XML file. If I had put it in a Zip file and made it a full ODF document then it would not have been possible for me to make such a visually appealing, smart-ass blog post, would it?

This doesn’t mean that larger and full documents are not also valid. It just means that I’m stripping it down to the essentials so we can focus on the area of disagreement and not get lost in the forest.

Rob says

2008/05/06 at 3:09 pm

Do I love my readers or what?

I made a simple ODF document, in packaged format, using this post’s XML example as content.xml, adding a simple manifest.xml file, and zipping these two up and renaming to an ODT file. It loads fine for me in OpenOffice 2.4.0. It won’t look all that exciting since the document does not define any character or paragraph styles. But you don’t need ID/IDREF or custom content for styles, so the principle would be the same there as well.

You can download the ODT file here.
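The packaging steps described in this comment can be sketched roughly as follows in Python, using only the standard `zipfile` module. This is a minimal illustration under stated assumptions, not the exact procedure used for the downloadable file: the `build_odt` helper and the `CONTENT` placeholder are hypothetical, and the `mimetype` entry is an addition that the comment does not mention (the ODF packaging rules suggest it as the first, uncompressed entry, but treat that detail as an assumption here).

```python
import io
import zipfile

# Media type for an ODF text document.
ODT_MIME = "application/vnd.oasis.opendocument.text"

# A simple manifest listing the package root and content.xml.
MANIFEST = """<?xml version="1.0" encoding="UTF-8"?>
<manifest:manifest xmlns:manifest="urn:oasis:names:tc:opendocument:xmlns:manifest:1.0">
 <manifest:file-entry manifest:media-type="application/vnd.oasis.opendocument.text" manifest:full-path="/"/>
 <manifest:file-entry manifest:media-type="text/xml" manifest:full-path="content.xml"/>
</manifest:manifest>
"""

# Placeholder content with the same structure as the post's XML example.
CONTENT = """<?xml version="1.0" encoding="UTF-8"?>
<office:document-content
 xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
 xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
 office:version="1.0">
 <office:body>
  <office:text>
   <text:p>Hello, ODF.</text:p>
  </office:text>
 </office:body>
</office:document-content>
"""


def build_odt(content_xml: str) -> bytes:
    """Return the bytes of a minimal ODT package (hypothetical helper)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # Assumption: mimetype goes first and is stored without compression.
        zf.writestr("mimetype", ODT_MIME, compress_type=zipfile.ZIP_STORED)
        zf.writestr("META-INF/manifest.xml", MANIFEST,
                    compress_type=zipfile.ZIP_DEFLATED)
        zf.writestr("content.xml", content_xml,
                    compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()


if __name__ == "__main__":
    # Write the package to disk with an .odt extension.
    with open("challenge.odt", "wb") as fh:
        fh.write(build_odt(CONTENT))
```

Whether a given office application opens the result depends on that application. The sketch only shows the shape of the package: a manifest and a content.xml inside a Zip archive.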

orcmid says

2008/05/06 at 3:54 pm

I couldn’t tell if the comment about conformity and validity (a useful discussion that you have taken up in a later post) was addressed to my observation about the plain XML version.

As far as I can tell, ODF specifies a plain XML version of an ODF document, although it is hard to find any ODF-supporting processors that recognize it without help. So, technically, you needn’t have gifted us with the wrapped up .odt version (and it is nice to have done that).

Rob says

2008/05/06 at 4:16 pm

Orcmid, I think I missed your original point then. Let me try again.

Are you referring to the single-file version of an ODF document described in section 2.1? If so OpenOffice doesn’t handle that by default, but there was an extension of some sort I heard about (but never tried) that was said to add that capability.

This seems to be a feature that was never widely implemented. This might be because it is bulky (no compression), doesn’t allow images (no package to store them in), and doesn’t give as clean a separation of content and styles (no styles.xml).

To your question about validity and trees falling in the woods. It depends on whether you want a technical or a business answer.

Technically, validity is a logical question, something that could be answered true or false, even with pencil and paper given sufficient time (and sufficient pencils and paper).

From the business perspective, there is no shortage of irrelevant standards. The real business imperative is interoperability. I wouldn’t say that validity is not important, but its narrow, binary 1/0 view of the world is poorly suited to large, complex, heterogeneous systems in the presence of software flaws, human errors and data of varying levels of quality. Validity is a tool. It is not a god. Same for conformance.

Ask yourself, who writes conformance clauses? Often a technical committee who may be masters of the underlying technology, but may know nothing about interoperability. Or in the case of OOXML, the conformance clause was written by ballot resolution meeting members working nights and in coffee breaks to rewrite a conformance clause for a 6,045 page standard during a one week editing orgy in Geneva. Are you going to bet your organization’s interoperability needs based on OOXML’s definition of conformance? I hope not.

Conformance clauses only scope the work of interoperability into neat buckets. They don’t do the real work. You need test suites, conformity assessment methodologies, profiles, and testing, testing, testing, to get interoperability.

Alex says

2008/05/07 at 3:47 am

@Rob

It seems we’re going to need to work on this proof (though I am encouraged we got as far as item “3”) before we’re all happy with it — I’ll take that work over to my blog …

You appear to have reduced your ambitions to showing that there is a small number of artificially contrived (i.e. useless) ODF documents which we can call valid (like your example). However, you should know that even this meagre goal cannot be fulfilled because of the uncomputable nature of the ODF schema.

As for the rest of your comment, you seem to be saying that it is open to a validator to evaluate its schema lazily in response to the content it finds in an instance being validated. If this _is_ what you’re saying, it shows a fundamental misunderstanding both of the RELAX NG validation model and of the necessities of validation for the purposes of document processing. A correct schema is a prerequisite for validation; validation is a prerequisite for correct onward processing. This is why there are no validators that can validate ODF 1.0, and why there are no valid ODF documents, as I shall both prove, and (for bonus points) demonstrate in code.

– Alex.

Rob says

2008/05/07 at 10:36 am

Alex, the simplicity of the example was better to illustrate the simplicity of your error. The intent was purely pedagogical.

You seem to have reduced your ambitions, no longer trying to justify your absurd, biased test: that you used the wrong version of the ODF schema, that you tested OOXML with a document that Microsoft had been testing for 18 months, that you gave invalid input to the ODF test, etc.

You say that there are no validators that can validate ODF. But still I can easily run both jing and msv on this document without errors. If your argument is that what I’ve demonstrated with two different validators is impossible, then this should be an interesting proof you’re coming up with. I just hope you’re not dividing by zero anyplace, eh.

Anonymous says

2008/05/07 at 6:50 pm

Dear Rob,

Will you excuse me? I don’t want to distract you all from the valuable discussion going on… but allow me a small comment, perhaps nearly outside your comment policy:

Dear Alex,

I appreciate your efforts regarding ODF, really. I’m just surprised that you haven’t given as much dedication, time, and effort to blogging about the directives governing the ISO standardization process, and how well or how badly they were followed in recent months. I’m also extremely surprised at how little attention you have given to analyzing the international standardization process as it is nowadays. After all, technical experts are the ones most qualified to make that kind of analysis. Since you seem very demanding about the quality of standard specifications, I’m also surprised that DIS 29500 didn’t give you enough substance to examine as deeply as you have been examining ISO 26300. Oh, right… sorry, you are probably waiting to read the document of the blindly approved specification.

The Mad Hatter says

2008/05/07 at 8:32 pm

Hum, well it opens fine in Text Edit on my Mac.

This discussion reminds me of the Tortoise and the Hare. If you pick one wrong assumption it can be proved that it’s impossible for the Hare to catch the Tortoise once he falls behind.

Anonymous says

2008/05/08 at 10:28 am

Rob,

My excuse to post it here, but it is the only place where I’m sure that our friend Alex Brown (a.k.a. Austin Powers) would read it.

Alex,

Now I really understand why you didn’t allow Brazil to ask for the Binary Mappings at the BRM.

Truth Happens !!!

Yeah, baby, yeah !!!

Cheers,

Mini Me

Ben Langhinrichs says

2008/05/13 at 10:46 am

Alex (and Rob) –

While I hesitate to jump in the middle of all this, I have created thousands of ODF documents, albeit for test purposes, that also skip the IDs and other non-validating points. The OpenSesame utility I am releasing shortly allows that as an option, so for testing purposes, I have worked with all sorts of documents to validate that it works and is acceptable. While I have not tried validating with these validators, they meet the criteria Rob described, so I imagine that they would validate. I’ll check a few and let you know, if it is of interest.

– Ben Langhinrichs
