Although Microsoft publicly testifies from every available pulpit of their deep longing for multiple document formats, a quick glance at reality shows that this love remains unrequited in their products. For example, what new formats does Office 2007 include out of the box? A new Microsoft XML format (OOXML), an updated Microsoft binary format, and a different new Microsoft binary format for Excel. So Microsoft clearly loves multiple Microsoft document formats! (Discuss among yourselves whether this love is amour de soi or amour propre.) But what about other, standard formats? ODF support is available only as a separate download, in their ODF Add-in for Word. However this tool is very poorly integrated into the Office user interface, making it almost impossible to use for real work.
For document exchange between different versions of MS Office, on the surface it looks a little bit better. Office 2007 provides a “compatibility mode” for users of Office 2007 who wish to create or edit documents that will remain compatible with earlier versions of Office.
That’s the theory at least.
In practice, things are rather messy. I recently received an email from Julie Watson, a project manager who has been doing enterprise deployments & migrations for 15 years. She has spent the last few months working on a plan to migrate 18,000+ workstations, trying to find a way to have a gradual rollout while still maintaining round-trip collaboration between her Office 2003 and Office 2007 users. Julie has put together a nice report showing what works and what doesn’t. Ignore the official documentation and ignore intuition, since neither will serve you well here. Take a gawk at the seedy side of reality in “[Compatibility Mode] Confusion in Office 2007.”
Wow, what an incredible document. I had the same kind of feeling while reading it that I had while reading Peter Gutmann’s “A Cost Analysis of Windows Vista Content Protection” (aka “The Longest Suicide Note in History”). Thank you Ms. Watson for making your hard work available to us all. Even to this hardened Microsoft observer your illumination of their sloppy work left me slack-jawed.
I had the “opportunity” to work with Office 2003 at a small non-profit for about 9 months. I had been familiar with OpenOffice on GNU/Linux. I finally got so frustrated with the inconsistencies in Office 2003 that I left. Isn’t it remarkable that a dispersed community populated with people of varying abilities can put together an office suite that is powerful and easy to use while the most powerful software company in the world has given us an example of how *not* to build an office suite. I hope people pay heed to your document Ms. Watson, otherwise they may discover a new kind of pain.
Hmm… Compatibility mode never worked perfectly in previous versions of Office – why should it do now ?
OTOH there are another compatibility mode which works very well indeed: switch to OOXML is not very painful because OOXML is designed to keep all information from all previous formats in “round-trip ready mode” (i.e.: you can convert data from old .DOC or .XLS to OOXML and back without fidelity loss – with Office 2003, of course). Why the hell MS Office (and only MS Office) should have this ability built-in in ISO standard ?
P.S. Yes, there are few features which can not be saved in OOXML – but they are quite rare. ODF->OOXML->ODF is 10 times more lossy… And ODF is an ISO format…
This slideshow itself is an unnecessarily confusing document. Interestingly, it obliquely shows, by example, that OOXML doesn’t really meet one of its stated design goals of compatibility with legacy documents. But that’s no surprise.
In most of the examples in the document, application behaviour would seem to be fairly intuitive, but with one or other of the applications not adhering to it.
The document would be much easier to read if it just listed the exceptions (and probably about 90% smaller). For example:
When opening an Office 97-2003 Document in Office 2007:
Standard Behaviour: Document opens in compatibility mode
Exceptions: In PowerPoint, if the document contains no new features it will open in Normal Mode.
and so on.
As a side note, the author of the document should learn the correct usage of the words “affect” and “effect”.
Wow that’s… really hard to read :(
I don’t suppose they’d mind putting that up as a PDF somewhere? I don’t care about the slides so much as the text. I’d sure like to have something like this on hand in case we get migrated to the latest versions of anything.
I tried to read the PDF and it’s almost unreadable under any of the PDF viewers in Linux.
I even installed the Microsoft true-type core fonts to get the gamut of Windows fonts normally found on a Windows box. This did not help the readability – I still get several lines of text displayed as boxes with question-marks inside.
Further research has found that the author of the PDF apparently used Office 2007 to write the PDF because it was done in the new Vista-only Calibri font family and no-one but another Office 2007 system will have this font installed by default.
Searching the web turns up the fact that Calibri is the new default font in Office 2007 and I found several inquiries on whether it is possible to set the default to something more portable for sharing of documents.
This idea of changing the default font in Office 2007 to something that is proprietary to MS, then not releasing that font into the wild, appears to be another way for Microsoft to lock people into their product line.
Would it be too much to ask for Ms Watson to re-do the PDF using the previous Windows default font (either Times New Roman or Arial) – or better yet for full cross-platform compatibility, use Courier ?
Thank you – I’d really like to read this document but not having the latest Microsoft proprietary font on my machine because I’m not running Vista makes that difficult to impossible.
Thanks again.
One, two, three, four, I declare a thumb^W flamewar.
Looks like Miguel de Icaza is boosting OOXML, having called it a “superb standard” and accusing the rest of us (including you, Rob) of FUDing it. He’s even answering some of the posts over there on Slashdot, although I find it telling which points he *isn’t* responding to.
That document is a good one in that it speaks volumes about how program managers at Microsoft in the Office team can’t get a damn thing done properly.
This is at least consistent with Brian Jones (a senior program manager) sending praises to non-Microsoft products without actually testing them.
To add my own 50 cents to the topic, I have a section in my article “OOXML is defective by design” which might be worth a read. This is the section about BIFF where I explain that Microsoft not only created a completely new BIFF format (BIFF12) as part of Excel binary workbooks, but they have also created BIFF11+, an extension to Excel 2003’s file format, to support one of those compatibility scenarios.
An interesting bit is that, for instance, if you create a databar (new feature), this feature will be preserved when you save the file in compatibility mode, whereas if you enter data beyond the 64K row limit, it will simply be lost. From a user point of view, this is a non-functional product for collaboration purposes in a heterogeneous environment.
-Stephane Rodriguez
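The row-limit point above can be made concrete with a minimal sketch. It assumes only the documented sheet size of the Excel 97-2003 format (65,536 rows); the function name and the sample data are hypothetical, and the code models the truncation described in the comment rather than anything Excel actually does internally.

```python
# Row limit of a worksheet in the Excel 97-2003 binary format.
XLS_MAX_ROWS = 65_536


def save_in_legacy_format(rows):
    """Hypothetical model of a compatibility-mode save.

    Returns (kept, lost): the rows that fit in a legacy sheet and the
    rows past the limit, which the older format cannot hold.
    """
    kept = rows[:XLS_MAX_ROWS]
    lost = rows[XLS_MAX_ROWS:]
    return kept, lost


# Illustrative data: a sheet with 70,000 rows, as Excel 2007 allows.
rows = [f"row {i}" for i in range(1, 70_001)]
kept, lost = save_in_legacy_format(rows)
print(len(kept), "rows kept,", len(lost), "rows lost")
```

With 70,000 rows, everything after row 65,536 ends up in the lost list, which is the collaboration problem described: the Office 2003 user never sees that data.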
I’m using a 27″ flat panel widescreen monitor, so the presentation looks uncluttered to me in full-screen mode. Julie wrote me that she was thinking of reforming this material into a more expository, linear writeup.
The remark about fonts is a good one. As some will recall, the ubiquitous support for Arial, Times New Roman, Courier New, etc., was accidental, due to a quirk in the EULA for Microsoft’s Core Font Pack that allowed liberal redistribution of the installer for these fonts. However, some figured out how to unpack and install these fonts on Linux as well, and the industry saw some unintended interoperability. This had to be stopped, of course, so Microsoft yanked the font pack from the web in 2002.
Now with Vista/Office 2007 Microsoft has another opportunity to break interoperability with Linux by introducing a new set of fonts. All of the Office 2007 document templates use these fonts. So even with a perfect OOXML standard, a Linux implementation that lacks these fonts (they would need to be licensed) or metrically equivalent ones will suffer interoperability problems: overlapping text in graphics, etc.
As for Miguel’s comments on OOXML, I don’t find them to be particularly interesting. If they came from anyone but Miguel they would be ignored as being only regurgitated arguments from Brian Jones’s blog, just a rehash of the Microsoft line. If someone sees any morsel of original thought in his Slashdot posts, please point them out.
But I’m not much interested in imitating Microsoft’s arguments or their APIs. I just don’t get it. Civil War reenactors, model ship builders and Mono developers. I don’t understand what motivates them. The apprentices of the world live by emulating the masters. But at some point one should aspire to be a journeyman or a master, and think and perform new things. If open source is just going to be a cheap knockoff of Microsoft libraries, then why bother?
Well, the only ‘interesting’ thing (if you can call it that) were some insults directed at you, Rob, and at Groklaw.
I, too, am getting called all kinds of names even though no one knows who I am! The only payroll I’m on has no idea what I’m doing and my boss doesn’t even give a damn so long as I jump when there’s a computer to fix.
The rest of his posts were just a nonsensical defense of poor technical decisions. Yes, we know why they did things that way. That doesn’t make the choices right or good and we’re pissed off because we know how de facto standards work already, and we know that we will take all the crap for those poor decisions when we have to support them. Crap flows downhill. I mean, I don’t even control our WUS server, but I take crap when the patches screw us over like they did last month. And people wonder why some of us are so bitter…
What does worry me, though, is the way this incident will get spun. No doubt, it will come out as “Look at how all those OSS zealots turned on him! And after all he did for them!” while not bothering to understand that when it comes to technology, geeks are not swayed by charisma. We want sound technical choices and we get angry whenever someone tries to blow smoke, even if they are on “our side.”
May the best and least encumbered technology win!
Thanks. Whenever I’m insulted in the same breath as Groklaw I take that as a compliment.
just out of interest, does the word “Calibri” appear in the OOXML spec?
Quick question – why should it matter if you use a different font on different systems when editing a document?
Surely everything will just (re)flow for you in your replacement font and you can just go on editing. If you give it back to someone who has the correct font, it will (re)flow for them “correctly”.
The effect would be similar to changing back and forth between A4 and letter page layouts for different printers, no?
(I’ve got the “re” part of reflow in brackets as I’m not sure if it’s technically a reflow as it would occur on the first “flow” of the document as it gets loaded. So, it’s not like changing the font halfway through an editing session, which would require an actual reflow.)
The question is whether documents will simply reflow to accommodate the font change. The short answer is yes, they will simply reflow, but often “simple” is not good enough.
There are different schools of thought on document authoring. Or I should say one school of thought and one bad habit. The structuralist school puts the emphasis on the structure of the document. Here is the title, the headers, the paragraphs, the quotations, the footnotes, etc. They avoid direct application of styling attributes and instead rely on named styles to separate style and content. Cross references are encoded at the markup level, not by just typing “See Figure 2 on page 8.” If you design documents like this, with a structural approach, then you are encoding your document structure and content in a way that lends itself to further processing, whether changing page size, moving to a different device like a cell phone, or whatever. This approach is supported but not mandated by all major word processors.
However, many users (perhaps most users) treat a word processor like a digital typewriter. WYSIWYG has encouraged direct manipulation of text attributes on the page. The average user never touches named styles. They do things like align headers in tables with spaces, hard code page references (“See page 39 for more information”) and make other such ad-hoc changes to their document. A document created in this fashion will have greater problems moving from one word processor to another, or between operating systems, or even between users with different fonts.
Once you move beyond text documents, it gets even worse. In particular, presentation slides that mix text and graphics do so typically in a non-structural way. For example, how many times have you simply drawn an arrow in PowerPoint and then dropped a text block next to it and added a label? If this is loaded on a different machine, with different fonts, the width of that text block may be different, causing the text to either overlap the arrow, or to flow over to another line, possibly then overlapping with something else. The “correct” way of handling this, from a structural perspective, would be to associate the label with the arrow, in a similar way to how a label is associated with a form field in XForms. That way the application is given more flexibility in how it lays out the page. It may not appear “identical” on every machine, but at least it will appear “right” on every machine, and that is an important distinction.
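To illustrate the arrow-and-label problem, here is a minimal sketch. The advance-width figures and the "FallbackSans" font name are made up for illustration (real metrics come from the font files), and the layout is reduced to one dimension; it is not how PowerPoint or any other application actually lays out a slide.

```python
# Hypothetical average glyph advance, as a fraction of the font size.
# These values are invented for the example, not measured from real fonts.
AVG_ADVANCE = {
    "Calibri": 0.5,       # the font the author had installed
    "FallbackSans": 0.6,  # a wider substitute on the reader's machine
}


def text_width(text, font, size_pt):
    """Estimate the rendered width of a text run, in points."""
    return len(text) * AVG_ADVANCE[font] * size_pt


def overlaps_arrow(label, font, size_pt, label_x, arrow_x):
    """True if a label starting at label_x runs past the arrow at arrow_x."""
    return label_x + text_width(label, font, size_pt) > arrow_x


label = "Quarterly revenue"
# The author positioned the arrow just clear of the label as it
# measured in the original font.
for font in ("Calibri", "FallbackSans"):
    clash = overlaps_arrow(label, font, size_pt=18, label_x=100, arrow_x=260)
    print(font, "overlaps arrow:", clash)
```

The label fits with the original metrics and collides with the arrow under the wider substitute. A structural association between label and arrow would let the application move one relative to the other instead of trusting fixed coordinates.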
As the person who asked the question about font changes, I probably should have mentioned that I’m more of a “document editor” than a “word processor” guy (to use LyX terminology).
Anyway, I read in Jensen Harris’s blog a year or so ago that Office 2007 was going to put more emphasis on styles than random markup in the new Ribbon, and with the new format being XML based (sort of :) I figured that this would hopefully help users move to a more structural way of working.
I was also hoping that the continuing spread of html and xml would have moved things in that direction anyway.
I probably should have also pointed out that I’ve not really used a word processor much in the last few years, and when I have it’s only been to read or make small changes to documents other people have sent me. Most of the stuff I’ve done recently (last 8 years or so) has been either plain text, html, or TeX (with LaTeX or LyX), as well as some work with Docbook XML (with text editor).
I stopped using Word many years ago because of its many failings with handling large documents, and never really got round to moving back to a word processor.
And, fortunately, I’ve never even needed to use Powerpoint, let alone had to draw an arrow in it. :)
Hmm… Compatibility mode never worked perfectly in previous versions of Office – why should it do now ?
Well….. Exactly…….
OTOH there are another compatibility mode which works very well indeed: switch to OOXML is not very painful because OOXML is designed to keep all information from all previous formats in “round-trip ready mode”
As has been pointed out on umpteen occasions, that’s the job of the application to reproduce that and not that of the new format to simply rehash old format and application features with new XML tags. It’s p-o-i-n-t-l-e-s-s, just to spell that out.
(i.e.: you can convert data from old .DOC or .XLS to OOXML and back without fidelity loss – with Office 2003, of course).
Yer. All with Microsoft’s famed compatibility, even between their own formats and different versions of Office…………
Why the hell MS Office (and only MS Office) should have this ability built-in in ISO standard ?
Not too sure what this means, but if you mean why should Microsoft build this backwards compatibility junk into a new format that they are submitting to ISO, I quite agree.
Yes, there are few features which can not be saved in OOXML – but they are quite rare.
Yer, like bin files, macros, VML from documents saved with older versions of Office…… Nothing that people don’t currently have saved in their documents today ;-).
ODF->OOXML->ODF is 10 times more lossy…
And why would anyone want to go through that ridiculous conversion? Of course ODF -> OOXML -> ODF is lossy, because OOXML lacks the features within it to reproduce the original ODF content! Thanks for confirming that one. I can remember Gary Edwards giving some detail on that, but don’t have a link right now.
And ODF is an ISO format…
Errrrr. Yer.
2segedunum: I think you understood perfectly each separate sentence but misunderstood what I’ve tried to say: DOC=>OOXML=>DOC works very well because this roundtrip was explicitly supported by OOXML design. But ODF=>OOXML=>ODF does not work as well since OOXML was not designed with such round-tripping in mind. The question is why?
New international standard should try to support old international standard first, some other proprietary standards second (if at all). And OOXML violates this fundamental rule.
P.S. Of course if old international standard is stillborn, broken and totally unusable – there are no need to maintain compatibility with it. Sometimes it’s better to forget about failure than to try to revive it. Shit happens (think ISO networking vs TCP/IP). But I’ve yet to see Microsoft’s explanation about why ODF should be abandoned and OOXML used instead. “There Can Be Only One” is guiding principle of ISO and I’ve yet to see the explanation why OOXML should be exempt. Vague accusations about unfairness of “first come, first served” rule are just stupid: if the second comer is really better than first one then first one can be declared obsolete and newcomer should be used instead. Actually this is exactly what happened with ODF: ISO/IEC 26300 (ODF) replaced ISO 8613 (ODA) because ODA was stillborn, unworkable format which was mostly ignored by industry and few ODA documents were ever spotted “in the wild” while ODF had such support from the start and was actually used by the industry. So the OOXML is the third comer, not the second and it must replace ODF if it’s better – or go away if it’s not. “Competition between formats” goes against ISO principles…
> Would it be too much to ask for Ms Watson
> to re-do the PDF using the previous Windows defualt font
there is an option to include fonts in the pdf. *If* MS Office supports that option then there is no reason to re-do the PDF
The fonts are already embedded in the PDF. I downloaded the PDF and could view it in both Ubuntu7.04 and Fedora7 without any problem.
I’m working on making it easier to follow and read. Thanks for all of your suggestions. (Affect/Effect.. DOH! Sorry about that!)
Julie Watson (aka Funnybroad)
P.S.
I created the whole document in Word 2007, and exported it to PDF using the Microsoft-supplied plug-in. I did this because Slideshare.net does not support uploading documents in the new .docx format. Some of the formatting actually got lost. For example, in the titles, I used –> which got converted to a right-pointing arrow in Word, and then into a strange rectangle when exported to PDF.
The remark about fonts is actually amusing as Office Open XML supports PANOSE for identifying near matching fonts so that implementations can replace fonts by installed versions that look similar.
PANOSE has been around forever. We used it in Lotus SmartSuite 15 years ago. It does, as you describe, attempt to find the closest font on your system. But if you don’t have a metrically equivalent font on your system, then close may not be good enough.
Embedding fonts is another possibility, but this will depend on the license restrictions of the individual fonts. In the case of Calibri, the terms allow “editable embedding” which sounds interesting.
MS has released SP3 for Office2003. Once you install this you can no longer open/save certain old file formats viz. wk1,wk3 etc (see – http://support.microsoft.com/kb/938810 ) because these formats are less secure.
Talking about fonts and Office 2007, it appears that Cambria Math (an MS font) has some undocumented tables that work only with Word 2007. See the current comment message at http://www.StixFonts.org (a somewhat free font for academic documents), which says
Our testing shows that the STIX Fonts cannot be used in place of Cambria Math within Word 2007 unless these (undocumented) tables are in place.