Why compete with .doc?

Given the sheer ubiquity of Microsoft Office documents, it may seem a bit quixotic to invent a competing set of document formats, and drag it through the standards bodies, all the way to ISO. The Open Document Format people have just accomplished that, and are now being hotly pursued by … Microsoft and its preferred Office Open XML specification.

If the creation of interoperability between similar but different programs were the sole purpose of a standard, office documents wouldn't look like much of a priority. Insofar as there is any competition to Microsoft's Office at all, the first requirement of such programs is to read from, and write to, Office's file formats as if they were Office itself. Most of these competitors have succeeded to such an extent that the formats are now entrenched even further. For example, if you want to use the same spreadsheet on a Palm and in Google Spreadsheet, or send it to an OpenOffice-using colleague as well as to a complete stranger, Excel's .xls is practically your only choice.

Yet the Open Document Format (ODF) has slowly wound its way from its origins in the OpenOffice file format through the OASIS specification body, and is now a full ISO standard: the first of its kind. But not necessarily the last: the confusingly named Office Open XML (OOXML) is already an Ecma specification, and the intention is to shift it to ISO too.

To understand why the ODF and OOXML standards steeplechase is happening at all, it is necessary to look beyond the basic interoperability scenario of lecturer A writing a handout that's then downloaded and looked at by student B, and perhaps printed by lecturer C next year. That sort of user-to-user sharing was solved a long time ago, once Corel's suite ceased to be a major factor. But there are some longer-running issues with data exchange at enterprise level, and with document preservation.

Enterprise data exchange is perhaps the more pressing of the two. For an awful lot of information management purposes, it would be handy to have the same information in a predictable, structured format. Recent JISC project calls, for example, ask for stores of course validation and course description data, preferably in XML. Such information needs to be written by practitioners, which usually means trying to extract that structured data from a pile of opaque, binary Word files. It would be much easier to extract it from a set of predictable XML documents.

Provided you use a template or form, that's precisely what the new XML office formats offer. Both ODF and OOXML try to separate information from presentation and metadata. Both store the data as XML, keep other media such as images in separate directories, and stick the lot in a Zip-compressed archive. Yes, that's very similar indeed to IMS Content Packaging, so transforming either ODF or OOXML slides to that format for use in a VLE shouldn't be that difficult either. It would also be much easier to automatically put both enterprise data and course content back into office documents.
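As a minimal sketch of what that buys you, assume the content.xml of an ODF text document has already been pulled out of the Zip archive (the snippet inlines a tiny, made-up example for illustration). Once it is plain XML, extracting, say, every paragraph of a templated course description is ordinary DOM work rather than reverse engineering:

    // Hypothetical sketch: contentXml stands in for the content.xml of an ODF package,
    // e.g. a course description written against a template.
    var contentXml =
      '<office:document-content' +
      ' xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"' +
      ' xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0">' +
      '  <office:body><office:text>' +
      '    <text:p>Course title: Standards in Practice</text:p>' +
      '    <text:p>Credits: 20</text:p>' +
      '  </office:text></office:body>' +
      '</office:document-content>';

    var TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"; // ODF text namespace
    var doc = new DOMParser().parseFromString(contentXml, "text/xml");
    var paragraphs = doc.getElementsByTagNameNS(TEXT_NS, "p");

    var extracted = [];
    for (var i = 0; i < paragraphs.length; i++) {
      // Each text:p element carries the document's actual prose; styles, images
      // and metadata live elsewhere in the package, so the data comes out clean.
      extracted.push(paragraphs[i].textContent);
    }

The OOXML equivalent works the same way, just against different element names and namespaces.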

The fact that the new formats are based on XML explains their suitability as a source for any number of data manipulation workflows, but it is the preservation angle that explains the standards bodies aspect. Even if access to current Office documents is not an issue for most users, that's only because a number of companies are willing to do Microsoft's bidding, and at least one set of open source developers has been prepared to spend countless tedious hours trying to reverse engineer the formats. One move by Microsoft, and that whole expensive license-or-reverse-engineer business could start all over again.

For most governments, that's not good enough. Access must be guaranteed over decades or more, to anyone who comes along, without let or hindrance. It is this aspect that has driven most of the development of ODF in particular, and of OOXML by extension. The Massachusetts state policy, notably, with its insistence on the storage of public information in non-proprietary formats, led Microsoft first to give a much more complete description of its XML format, and later to give assurances that it wouldn't assert discriminatory or royalty-bearing patent licenses. The state is still going to use ODF, though, not OOXML.

On technical merit, you can see why that would be: OOXML is a pretty hideous concoction that looks like it closely encodes layers of Microsoft Office legacy in the most obtuse XML possible. The spec is a 47 MB whopper running to 6,039 pages. On the upside, its single- or double-character terseness can make it more compact in certain cases, and Microsoft touts its support for legacy documents. That appears to be true mainly if the OOXML implementation is Microsoft Office, and if you believe legacy formats should be dealt with in a new format rather than simply in a new converter.

ODF is much simpler, far easier to read for anyone who has perused XHTML or other common XML applications, and much more generalisable. It could be less efficient in some circumstances, though, because of its verbosity and because it allows mixed content (both character data and tags inside the same element), and it doesn't support some of the more esoteric programming extensions that, for example, Excel does.

All other things being equal, the simpler, more comprehensible option should win every time. Alas for ODF, things aren't equal, because OOXML is going to be the default file format in Microsoft Office 2007. That simple fact alone will probably ensure it succeeds. Whether that matters is probably going to depend on whether you need to work with the formats at a deeply technical level.

For most of us, it is probably good enough that the work we all create throughout our lives is that little bit more open and future-proofed, and that little bit less tied to what one vendor chooses to sell us at the moment.

AJAX alliance to start interoperability work

Funny how, after an initial development rush, a community around a new technology will hit some interoperability issues, and then start to address them via some kind of specification initiative. AJAX, the browser-side interaction technique that brought you Google Maps, is in that phase right now.

Making Asynchronous JavaScript and XML (AJAX) work smoothly matters, not only because it can help make webpages more engaging and responsive, but also because it is one of the most rapid ways to build front ends to web services.
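At its core, the technique is small: a script asks the browser to fetch some data in the background, then updates part of the page when it arrives, without a full reload. The sketch below is a bare-bones, hypothetical example (the URL and element id are invented for illustration; older Internet Explorer versions would need an ActiveXObject fallback):

    // Minimal, hypothetical AJAX round trip: fetch some data in the background
    // and drop the result into the page without reloading it.
    var request = new XMLHttpRequest();
    request.open("GET", "/courses/validation-data.xml", true); // true = asynchronous
    request.onreadystatechange = function () {
      if (request.readyState === 4 && request.status === 200) {
        // The response could also be read as a DOM via request.responseXML.
        document.getElementById("results").innerHTML = request.responseText;
      }
    };
    request.send(null);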

It may seem a bit peculiar at first that there should be any AJAX interoperability issues at all, since it is built on a whole stack of existing, mature, open standards: the W3C's XML and DOM for data and data manipulation, XHTML for webpages, ECMAScript (JavaScript) for scripting, and much more besides. Though there are a few compliance issues with those standards in modern browsers, that's not actually the biggest interoperability problem.

That lies more in the fact that most AJAX libraries have been written with the assumption that they’ll be the only ones on the page. That is, in a typical AJAX application, an ECMAScript library is loaded along with the webpage, and starts to control the fetching and sending of data, and the recording of user clicks, drags, drops and more, depending on how exactly the whole application is set up.

This is all nice and straightforward unless there’s another library loaded that also assumes that it’s the only game in town, and starts manipulating the state of objects before the other library can do its job, or starts manipulating completely different objects that happen to have the same name.
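A hypothetical but typical collision looks like this: two libraries each claim a short global name and the page's onload hook for themselves, so whichever loads last silently wins.

    // Library A
    var $ = function (id) { return document.getElementById(id); };
    window.onload = function () { /* wire up library A's widgets */ };

    // Library B, loaded further down the same page
    var $ = function (selector) { /* something entirely different */ };
    window.onload = function () { /* wire up library B's widgets */ };

    // Result: library A's $ and its onload handler have been silently
    // replaced, so its widgets never initialise.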

Making sure that JavaScript libraries play nice is the stated aim of the OpenAjax Alliance. Formed earlier this year, the alliance now has a pretty impressive roster of all the major open source projects in the area as well as major IT vendors such as Sun, IBM, Adobe and Google (OpenAjax Alliance). Pretty much everyone but Microsoft…

The main, concrete way in which the alliance wants to make sure that AJAX JavaScript libraries play nice with each other is by building the OpenAjax hub. This is a set of standard JavaScript functions that addresses issues such as load order and component naming, but also provides a means for libraries to address each other's functionality in a standard way.
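The hub itself hasn't been published yet, but the general idea can be sketched. The code below is purely illustrative and is not the OpenAjax hub API: it simply shows the pattern of one agreed-upon registry object under which each library signs in by name, so that load order stops mattering and one library can find another's entry points.

    // Illustrative registry pattern only; not the actual OpenAjax hub API.
    var Hub = window.Hub || { libraries: {} };

    Hub.register = function (name, api) {
      Hub.libraries[name] = api;          // runtime registry of a library's methods
    };
    Hub.get = function (name) {
      return Hub.libraries[name] || null; // look up another library's entry points
    };

    // A charting library registers itself under its own name...
    Hub.register("charts", { draw: function (el, data) { /* draw a chart */ } });

    // ...and an unrelated library can call it without guessing at globals.
    var charts = Hub.get("charts");
    if (charts) { charts.draw(document.getElementById("sales"), [3, 1, 4]); }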

For that to happen, the alliance first intends to build an open source reference implementation of the hub (OpenAjax Alliance). This piece of software is meant to control the load and execution order of libraries, and serve as a runtime registry of the libraries’ methods so that each can call on the other. This software is promised to appear in early 2007 (Infoworld), but the SourceForge filestore and subversion tree are still eerily empty (SourceForge).

It’d be a shame if the hub were to remain vapourware, because it is easy to see the benefits of a way to get a number of mature, focussed JavaScript libraries to work together in a single AJAX application. Done properly, it would make it much easier to string such components together than to write all that functionality from scratch. This, in turn, could make it much easier to realise the mash-ups and composite applications made possible by the increasing availability of web services.

Still, at least the white paper (OpenAjax Alliance) is well worth a look for a thorough non-techy introduction to AJAX.

When software becomes too social

Here’s an idea that’s equal parts compelling and frightening: a presence service that is integrated not with an instant messaging client, but with your mobile’s contact book.

A presence service is the bit of an Instant Messaging (IM) application such as MSN, AIM or Yahoo Messenger that makes it possible to see which of your buddies is online, usually with a mood message thrown in (“bored”, “working from home” etc.).

It is what makes IM so compelling, to the point that many kids think email is just for old people (ars technica). Mobiles, of course, are not so stratified by age, but a quick glance around any public place in Western Europe will demonstrate that the under-25s are particularly attached to theirs. Hence the notion that tying presence information to a mobile’s contact list could prove very popular indeed with that crowd.

As befits the place, a couple of Finns have had that brainwave, and are now busy implementing it in, you guessed it, Nokia phones (Jaiku). More than that, they’ve clocked that a mobile can show much more “social peripheral vision” than simple presence online: it can also show location.

Once you’ve got that, you can let your imagination run riot. No more “I’m on the train” type conversations, because your mate can simply see that before even calling you. You could see how close, physically, your mates are to you when you look for them in a busy place. You can also see where all your mates are at any given point: all together, say, but not with you. And since these are mobiles, not PCs, even not seeing someone online can become quite significant.

With most modern tech, there’s usually a period of grace before abuse starts happening and rules are required. Think email before spam, or the web before people figured out how to make money on it. With this one, even the pioneers are keenly aware of the massive potential for abuse and a possible need to put some limits on the technology.

It might still prove too compelling to stop, though.