Office XML versus OpenDocument: Everyone is missing the point

Ars Technica recently covered yet another update in the endless "Office XML versus OpenDocument" war.

I wrote a bunch of those articles myself, back when I was working full-time for Ars. Writing about the political infighting among standards committees was fun, but it was ultimately pointless. Positioning Microsoft versus Everybody Else is a great way to get page views, but in the real world, where some of us actually have to get real work done, the problems are far more complex.

The problem with these sorts of debates is that nobody really has a good understanding of the underlying issues involved in something like an "Office file format".

Joel's take is a good place to start, although he does get a few things wrong.

But the gist of his argument is correct: the file formats are complicated beasts because Office software itself is chock full of complicated features.

It's important to note that Microsoft itself recognized this problem early on, which is why they came up with RTF. RTF is supported by absolutely EVERYTHING and is a huge boon to interoperability, but nobody uses it. Why? Because it doesn't support all those fancy features that everybody says they don't care about.

So we have a conundrum. People want to create documents in ridiculously complex applications like Word and OpenOffice. But they don't want the file format to be complicated at all: they want it to be simple, so that interoperability is easy.

Well, life doesn't work that way.

I've been to some interesting conferences about pure XML-based document creation workflows, where the file format's tag structure is really simple and the files can happily move around anywhere they want. It's interesting stuff.

But you still have to create the document somewhere, and in the demo I saw it was XMetaL, which (let's face it) is a pretty ugly way to create documents. And XMetaL didn't support a lot of the advanced features that people were used to in Word or FrameMaker, such as conditional text. So you had to go get another batch of proprietary software to add that functionality, and so forth. It got very messy very quickly.

All the shouting and political bickering over which format is more "open" is just a distraction from the real issue. OpenOffice and ODF are nearly as crufty and complicated as Microsoft's XML efforts. The differences mostly boil down to the fact that OpenOffice didn't have to worry about backwards compatibility with legacy versions of the program.

None of this matters. The problem is that the human desire for document creation applications with millions of features runs head-first into the new desire for everything to be "connected" and "interoperable" and "open" and "XML" and so forth. The two desires simply cannot coexist. When you try to mash them together, you end up with ugly hacks like ODF and OOXML.

Exactly what the industry will do to solve this problem is not clear, but one thing is for sure: IBM is hoping that it will involve millions of dollars spent on "infrastructure" consulting. Microsoft wouldn't mind a few of those consulting dollars either, but they'll be happy enough to sell you new versions of Office. Now with more features!

And people will buy them, hoping for some magic bullet to manage complexity. Good luck with that.