If you’ve been around this business for a while, you’ve seen your share of migrations. New operating systems, new networks, new hardware, even new document formats. I’d like to share some recollections of one such migration, and then suggest a solution.
In 1995 I was working at Lotus on Freelance Graphics, along with many others, getting SmartSuite ready for Windows 95. One day, as I walked to work and rounded the corner of Binney Street, I saw something unusual, even more unusual than the usual unusual one sees in Cambridge. Something was up. There were news vans parked in front of LDB, camera crews and reporters looking for comments, Lotus security videotaping the reporters asking for comments, and me standing there, clueless.
This was how I first heard of IBM’s take-over offer. It was hard to concentrate on porting to Windows 95 with all that news going on downstairs, but we managed.
In the weeks and months that followed there were many changes. At Lotus we were 100% SmartSuite users. No surprise there. Most of us did not even have a copy of Microsoft Office on our machines, unless we worked on file compatibility. Not only did we use SmartSuite for our collaborative work, creating and reviewing specifications, giving presentations, etc., we also ran some of our business processes on it. In particular we used an expense report application, done in 1-2-3 with LotusScript.
But IBM used Microsoft Office. So when IBM took over, we needed to migrate. Sure, there was whining and moaning and gnashing of teeth on our end about having to move to an inferior product. And it did take a little while to get accustomed to the different conventions of Office, typing AVERAGE() in Excel rather than @AVG() in 1-2-3, and stuff like that. But we did it. We moved to Office. It was clear to all that the benefits of having a single file format outweighed the short-term pain of migration.
It is interesting what we did not do:
- We did not go and convert all existing legacy SmartSuite documents into Office format. What would have been the point? Most old documents are never touched again. Let them rest in peace.
- We did not delete SmartSuite from our hard drives. We kept the application there for cases where we needed to access old documents.
- We did not simply continue using SmartSuite and tell it to save in Office format. We knew that both fidelity-wise and performance-wise it is far better to use an application that supports a format natively than to rely on conversion software for interoperability.
- We did not translate 1-2-3 macro-based applications into Excel macro-based applications. We took the opportunity to move straight to web based applications. Aside from some standard presentation templates and similar boiler-plate templates we did not do a lot of conversion work.
In retrospect, the migration of file formats was one of the least contentious changes that accompanied the IBM takeover. We can handle file format changes, but eliminating the traditional Friday Beer Cart, now that was something to complain about…
I’m not much of one for committing unprovoked acts of methodology, but if I had to summarize what little wisdom I have in this area, I’d say that for a migration you want to evaluate your existing documents by three criteria: stability, complexity and business criticality, and develop a migration plan based on that.
In the first case you classify documents by how stable (unchanging) they are:
- Hot documents — the documents that are being heavily changed and edited today, works-in-progress, in active collaborations
- Cold documents — the documents which are no longer edited, though perhaps they are still read. Many of these documents may have zero value and are just taking up space. Others may be valuable records, but hidden away on someone’s hard-drive.
- Warm documents — These are the ones that are in the middle, not seeing heavy activity, but they aren’t quite frozen either.
From the perspective of complexity we have:
- Low complexity — simple text and graphics
- Medium complexity — using more advanced features, created by power users
- High complexity — “engineered documents”, using scripting and macros to create applications.
Finally you can also look at these documents from the perspective of business criticality. Of course, this will vary according to your business. It might be relevance to ongoing litigation, it might be according to a records retention policy, it might be whether it concerns currently open projects, etc. But for sake of argument, let’s take client or public exposure as a proxy for criticality, so we get this:
- Internal use documents — internal presentations and reports
- Customer facing documents — engagement reports, proposals, etc.
- Publication ready documents — white papers, journal articles, etc.
These three dimensions — stability, complexity and criticality — can be combined, creating 27 different document classes. For example, our old expense report based on 1-2-3 macros would be classified as a hot, high complexity, internal use document.
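To make the combinatorics concrete, here is a minimal sketch of the three dimensions as Python enumerations. The type and field names (`Stability`, `Complexity`, `Exposure`, `DocumentClass`) are my own illustrative choices, not part of any tool, and exposure stands in for criticality as it does above.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class Stability(Enum):
    COLD = "cold"
    WARM = "warm"
    HOT = "hot"


class Complexity(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


class Exposure(Enum):
    # Client or public exposure, used here as a proxy for business criticality.
    INTERNAL = "internal use"
    CUSTOMER = "customer facing"
    PUBLICATION = "publication ready"


@dataclass(frozen=True)
class DocumentClass:
    stability: Stability
    complexity: Complexity
    exposure: Exposure


# Every combination of the three dimensions: 3 x 3 x 3 classes.
ALL_CLASSES = [
    DocumentClass(s, c, e) for s, c, e in product(Stability, Complexity, Exposure)
]

# The old 1-2-3 expense report: hot, high complexity, internal use.
expense_report = DocumentClass(Stability.HOT, Complexity.HIGH, Exposure.INTERNAL)

print(len(ALL_CLASSES))  # 27
```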
So you are transitioning from Office legacy binary formats to ODF. What do you do with each of these document classes? You have four main strategies to consider:
- Do nothing and preserve the document in the legacy format, maintaining, as needed, access to the legacy application.
- Convert the document to a portable, high-fidelity static representation, like PDF.
- Convert directly to ODF.
- Reengineer as something other than a document.
So one migration policy might look like this:
Stability | Complexity | Exposure | Strategy |
---|---|---|---|
Cold | Low | Internal Use | Do nothing |
Cold | Low | Customer Facing | Do nothing |
Cold | Low | Publication Ready | Do nothing |
Cold | Medium | Internal Use | Do nothing |
Cold | Medium | Customer Facing | Do nothing |
Cold | Medium | Publication Ready | Do nothing |
Cold | High | Internal Use | Do nothing |
Cold | High | Customer Facing | Convert to PDF |
Cold | High | Publication Ready | Convert to PDF |
Warm | Low | Internal Use | Convert to ODF |
Warm | Low | Customer Facing | Convert to ODF |
Warm | Low | Publication Ready | Convert to ODF |
Warm | Medium | Internal Use | Convert to ODF |
Warm | Medium | Customer Facing | Convert to ODF |
Warm | Medium | Publication Ready | Convert to ODF |
Warm | High | Internal Use | Convert to ODF |
Warm | High | Customer Facing | Publish as PDF |
Warm | High | Publication Ready | Publish as PDF |
Hot | Low | Internal Use | Convert to ODF |
Hot | Low | Customer Facing | Convert to ODF |
Hot | Low | Publication Ready | Convert to ODF |
Hot | Medium | Internal Use | Convert to ODF |
Hot | Medium | Customer Facing | Convert to ODF |
Hot | Medium | Publication Ready | Convert to ODF |
Hot | High | Internal Use | Reengineer |
Hot | High | Customer Facing | Reengineer |
Hot | High | Publication Ready | Reengineer |
There may be a better way of expressing the table above (Karnaugh maps, anyone?) but that gives the idea. Also, I’m not suggesting that this is the “one true answer”, but merely that this may be a useful way of framing the problem.
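One more compact way to express the same example policy is as a handful of rules rather than 27 rows. The sketch below is only a restatement of the table above. The function name `strategy` and the string labels are hypothetical, and I have folded the table's "Publish as PDF" rows into the same PDF strategy as "Convert to PDF".

```python
STABILITIES = ("cold", "warm", "hot")
COMPLEXITIES = ("low", "medium", "high")
EXPOSURES = ("internal use", "customer facing", "publication ready")


def strategy(stability, complexity, exposure):
    """Return the migration strategy the example policy table assigns."""
    if stability not in STABILITIES:
        raise ValueError(f"unknown stability: {stability!r}")
    if complexity not in COMPLEXITIES:
        raise ValueError(f"unknown complexity: {complexity!r}")
    if exposure not in EXPOSURES:
        raise ValueError(f"unknown exposure: {exposure!r}")

    # Hot documents: reengineer the engineered ones, convert the rest.
    if stability == "hot":
        return "reengineer" if complexity == "high" else "convert to ODF"

    # Cold and warm documents: complex documents that outsiders see
    # get a static, high-fidelity PDF.
    if complexity == "high" and exposure != "internal use":
        return "convert to PDF"

    # Everything else: leave cold documents alone, convert warm ones.
    return "do nothing" if stability == "cold" else "convert to ODF"


# The old expense report: hot, high complexity, internal use.
print(strategy("hot", "high", "internal use"))  # reengineer
```

Writing it this way also makes it easier to see where the policy draws its lines: only high complexity ever changes the answer within a stability level.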
Variations might include:
- Have a default policy of doing no conversions, but create all new documents in ODF format.
- By default, ignore all legacy documents. But the first time any legacy document is read or written, put it into a queue for evaluation and possible conversion.
Much of this lends itself to automation. For example:
- First you need to find all of the documents in an organization. This could be done by an ActiveX control on a page everyone in the company visits, an agent that spiders the intranet web pages and file servers, etc.
- Each document is then scored.
- Finding the stability of a document could be done by looking at the last read and last write stamps on the file. You could also look at web logs, or even at metadata in the document that tells how many times it has been edited.
- Complexity could be determined by scanning the document to see what features it uses. Some features, like script, would weight heavily for complexity. Think of it as a “goodness of fit” metric for how well the features used in the document fit within the ODF model.
- Business criticality is harder to automate, but could be done based on owner of the document, metadata in the document, location of the document (public web page versus intranet), etc.
- Calculate the scores, suggest actions to take, and then automate the action. This could lead to a nice automated migration solution.
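As a rough illustration of the stability step only, here is a sketch that walks a directory tree and buckets legacy files by their last-write stamp, while also noting whether they have been read recently. The thresholds (30 and 365 days), the extension list and the function names are all assumptions of mine, not recommendations. Bear in mind too that last-read stamps are unreliable on file systems that do not update access times.

```python
import os
import time

DAY = 86400  # seconds

# Hypothetical thresholds; tune these for your own organization.
HOT_DAYS = 30
WARM_DAYS = 365

# Assumed set of legacy binary formats to look for.
LEGACY_EXTENSIONS = {".doc", ".xls", ".ppt"}


def classify_stability(last_write, now):
    """Bucket a document as hot, warm or cold by how recently it was edited."""
    age_days = (now - last_write) / DAY
    if age_days <= HOT_DAYS:
        return "hot"
    if age_days <= WARM_DAYS:
        return "warm"
    return "cold"


def scan(root, now=None):
    """Yield (path, stability, still_read) for each legacy document under root."""
    now = time.time() if now is None else now
    for folder, _dirs, files in os.walk(root):
        for name in files:
            if os.path.splitext(name)[1].lower() not in LEGACY_EXTENSIONS:
                continue
            path = os.path.join(folder, name)
            info = os.stat(path)
            # A cold document that is still being read may be a valuable record.
            still_read = (now - info.st_atime) / DAY <= WARM_DAYS
            yield path, classify_stability(info.st_mtime, now), still_read
```

Something like `for path, stability, still_read in scan("/some/share"): ...` would then feed the complexity and criticality scoring, which need format-specific parsing and are not attempted here.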
In summary, it is probably not worthwhile simply to go out and convert all of your legacy documents in a giant cathartic orgy of document transformations. Not all documents are worth that effort. In any organization you probably have a great many documents that will never be read again, ever. You also likely have some very complex documents that probably should be reengineered as web applications on your intranet. The other documents, the ones in the middle, are where you should focus your migration effort.