SOA only really works webscale

Just sat through a few more SOA talks today, and, as usual, the presentations circled ’round to governance pretty quick and stayed there.

The issue is this: SOA promises to make life more pleasant by removing duplication of data and functionality. Money is saved, and information is more accurate and flows more freely, because we tap directly into the source systems via their services.

So much for the theory. The problem is that organisations in SOA exercises have a well-documented tendency to re-invent their old monolithic applications as sets of isolated services that make most sense to themselves. And there goes the re-use argument: everyone uses their own set of services, with lots of data and functionality duplication.

Unless, of course, your organisation has managed to set up a Governance Police that makes everyone use the same set of centrally sanctioned services. Which is, let’s say, not always politically feasible.

Which made me think of how this stuff works on the original service-oriented architecture: the web. The most obvious attribute of the web, of course, is that there is no central authority over service provision and use. People just use what is most useful to them, and that is precisely the point. Instead of governance, the web has survival of the fittest: the search engine that gives the best answers gets used by everyone.

Trying to recreate that sort of Darwinian jungle within the enterprise seems both impossible and a little misguided. No organisation has the resources to just punt twenty versions of a single service in the full knowledge that at least nineteen will fail.

Or does it? Once you think about the issue webscale, such a trial-and-error approach begins to look more doable. For a start, an awful lot of current services are commodities that are the same across the board: email, calendars, CRM and so on. These are already being sourced from the web, and there are plenty more that could be punted by entrepreneurial shared-service providers with a nous for the education system (student record systems, HR and the like).

That leaves the individual HE institutions to concentrate on those services that provide data and functionality that are unique to themselves. Those services will survive, because users need them, and they’re also so crucial that institutions can afford to experiment before a version is found that does the job best.

I’ll weasel out of naming what those services will be: I don’t know. But I suspect it will be those that deal with the institution’s community (‘social network’ if you like) itself.

If Enterprise Architecture is about the business, where are the business people?

The Open Group Enterprise Architecture conference in Munich last month saw a first meeting of Dutch and British Enterprise Architecture projects in Higher Education.

Probably the most noticeable aspect of the session on enterprise architecture in higher education was the commonality of theme, not just between the Dutch and British HE institutions, but also between the HE contingent and the enterprise architects of the wider conference. There are various aspects to the theme, but it really boils down to one thing: how does a bunch of architects get a grip on, and then re-fashion, the structure of a whole organisation?

In the early days, the answer was simply to set the scope of an architecture job to the expertise and jurisdiction of the typical enterprise architect team: IT systems. Both in the notional goal of the architecting work and in its practice, though, that focus on IT alone seems too limiting. Even a relatively narrow interpretation of the frequently cited goal of enterprise architecture – to better align systems to the business – presupposes a heavy involvement of all departments and the clout to change practices across the organisation.

The HE projects reported a number of strategies for overcoming the conundrum. One popular method is to focus on one concrete development project at a time, and evolve an architecture iteratively. Another is to involve everyone by letting them determine and agree on a set of principles that underpin the architecture work before it starts. Yet other organisations tackle the scope and authority issue head-on and sort out governance structures before tackling the structure of the organisation, much as businesses tend to do.

In all of these cases, though, architects remain mostly focussed on IT systems, while staying wholly reliant on the rest of the organisation both for what the systems actually look like and for clues about what they should do.

Presentations can be seen on the JISC website.

Why compete with .doc?

Given the sheer ubiquity of Microsoft Office documents, it may seem a bit quixotic to invent a competing set of document formats, and drag it through the standards bodies, all the way to ISO. The Open Document Format people have just accomplished that, and are now being hotly pursued by … Microsoft and its preferred Office Open XML specification.

If the creation of interoperability between similar but different programs is the sole purpose of a standard, office documents don’t look like much of a priority. In so far as there is any competition to Microsoft’s Office at all, the first requirement of such programs is to read from, and write to, Office’s file formats as if they were Office itself. Most of these competitors have succeeded to such an extent that it has entrenched the formats even further. For example, if you want to use the same spreadsheet on a Palm and Google Spreadsheet, or send it to an OpenOffice-using colleague as well as a complete stranger, Excel’s .xls is practically your only choice.

Yet the Open Document Format (ODF) has slowly wound its way from its origins in the OpenOffice file format through the OASIS specification body, and is now a full ISO standard, the first of its kind. But not necessarily the last: the confusingly named Office Open XML (OOXML) is already an Ecma specification, and the intention is to shift it to ISO too.

To understand why the ODF and OOXML standards steeplechase is happening at all, it is necessary to look beyond the basic interoperability scenario of lecturer A writing a handout that’s then downloaded and looked at by student B, and perhaps printed by lecturer C next year. That sort of user-to-user sharing was solved a long time ago, once Corel’s suite ceased to be a major factor. But there are some longer-running issues with data exchange at enterprise level, and with document preservation.

Enterprise data exchange is perhaps the more pressing of the two. For an awful lot of information management purposes, it would be handy to have the same information in a predictable, structured format. Recent JISC project calls, for example, ask for stores of course validation and course description data, preferably in XML. Such info needs to be written by practitioners, which usually means trying to extract that structured data from a pile of opaque, binary Word files. It’d be much easier to extract it from a set of predictable XML files.

Provided you use a template or form, that’s precisely what the new XML office formats offer. Both ODF and OOXML try to separate information from presentation and metadata. Both store data as XML, keep other media such as images in separate directories, and stick the lot in a Zip-compressed archive. Yes, that’s very similar indeed to IMS Content Packaging, so transforming either ODF or OOXML slides into that format for use in a VLE shouldn’t be that difficult either. It’d also be much easier to automatically put both enterprise data and course content back into office documents.
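As a quick way of seeing that structure for yourself, here is a minimal sketch, assuming Node.js with the JSZip package (any zip reader would do), that lists the parts of a document package and pulls out the main XML; the file name handout.odt is invented for the example.

    import { promises as fs } from "fs";
    import JSZip from "jszip"; // any zip library works; JSZip is just one option

    // List the parts of an ODF or OOXML package and print the start of the main XML.
    async function dumpPackage(path: string): Promise<void> {
      const zip = await JSZip.loadAsync(await fs.readFile(path));

      // Both formats are plain zip archives, so listing the entries shows the layout:
      // ODF:   content.xml, styles.xml, meta.xml, Pictures/...
      // OOXML: word/document.xml, word/styles.xml, word/media/..., docProps/core.xml
      zip.forEach((name) => console.log(name));

      // The document body itself is ordinary XML, ready for an XSLT or a script.
      const body =
        (await zip.file("content.xml")?.async("string")) ?? // ODF
        (await zip.file("word/document.xml")?.async("string")); // OOXML
      console.log(body?.slice(0, 400));
    }

    dumpPackage("handout.odt").catch(console.error);

From there, turning the XML into course validation records, IMS packages or anything else is a matter of ordinary transformation work rather than reverse engineering.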

The fact that the new formats are based on XML explains their suitability as a source for any number of data manipulation workflows, but it is the preservation angle that explains the standards bodies aspect. Even if access to current Office documents is not an issue for most users, that’s only because a number of companies are willing to do Microsoft’s bidding, and at least one set of open source developers has been prepared to spend countless tedious hours trying to reverse engineer the formats. One move by Microsoft and that whole expensive license-or-reverse-engineer business could start again.

For most governments, that’s not good enough. Access must be guaranteed over decades or more, to anyone who comes along, without let or hindrance. It is this aspect that has driven most of the development of ODF in particular, and OOXML by extension. The Massachusetts state policy especially, with its insistence on the storage of public information in non-proprietary formats, led to Microsoft first giving a much more complete description of its XML format, and later to assurances that it wouldn’t assert discriminatory or royalty-bearing patent licenses. The state is still going to use ODF, though, not OOXML.

On technical merit, you can see why that would be: OOXML is a pretty hideous concoction that looks like it closely encodes layers of Microsoft Office legacy in the most obtuse XML possible. The spec is a 47 Mb whopper running to 6,039 pages. On the upside, its single- or double-character terseness can make it more compact in certain cases, and Microsoft touts its support for legacy documents. That appears to be true mainly if the OOXML implementation is Microsoft Office, and if you believe legacy formats should be dealt with in a new format rather than simply in a new converter.

ODF is much simpler, far easier to read for anyone who has perused XHTML or other common XML applications, and much more generalisable. It could be less efficient in some circumstances, though, because of its verbosity and because it allows mixed content (both character data and tags inside an element), and it doesn’t support some of the more esoteric programming extensions that, for example, Excel does.
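To make the contrast concrete, the fragments below show roughly how each format marks up the same short sentence, together with the kind of crude text extraction that works on either. The snippets are simplified illustrations, without the namespace declarations and most of the attributes that real files carry.

    // Illustrative, simplified fragments; real files carry namespace declarations
    // and far more attributes than shown here.
    const odfParagraph =
      '<text:p text:style-name="Standard">Hello <text:span text:style-name="T1">world</text:span></text:p>';
    const ooxmlParagraph =
      '<w:p><w:r><w:t xml:space="preserve">Hello </w:t></w:r><w:r><w:rPr><w:b/></w:rPr><w:t>world</w:t></w:r></w:p>';

    // A crude tag-stripping pass recovers the text from either; a real pipeline
    // would use a proper XML parser, but the point is that the data is reachable
    // with ordinary tools rather than a reverse-engineered binary reader.
    const stripTags = (xml: string): string => xml.replace(/<[^>]+>/g, "");
    console.log(stripTags(odfParagraph));   // "Hello world"
    console.log(stripTags(ooxmlParagraph)); // "Hello world"

Even in a toy example, ODF reads like the XHTML-ish markup it is, while OOXML’s runs and two-letter element names take some decoding.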

All other things being equal, the simpler, more comprehensible option should win every time. Alas for ODF, things aren’t equal, because OOXML is going to be the default file format in Microsoft Office 2007. That simple fact alone will probably ensure it succeeds. Whether it matters is probably going to depend on whether you need to work with the formats on a deeply technical level.

For most of us, it is probably good enough that the work we all create throughout our lives is that little bit more open and future-proofed, and that little bit less tied to what one vendor chooses to sell us at the moment.

AJAX alliance to start interoperability work

Funny how, after an initial development rush, a community around a new technology will hit some interoperability issues, and then start to address them via some kind of specification initiative. AJAX, the browser-side interaction technique that brought you Google Maps, is in that phase right now.

Making Asynchronous JavaScript and XML (AJAX) work smoothly matters, not only because it can help make webpages more engaging and responsive, but also because it is one of the most rapid ways to build front ends to web services.
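For anyone who hasn’t peered under the bonnet, the pattern itself is small. The sketch below fetches some XML in the background and updates part of the page without a reload; the service URL and element id are invented for the example.

    // Fetch XML from a web service asynchronously and update part of the page.
    const request = new XMLHttpRequest();
    request.open("GET", "/services/timetable.xml", true); // true = asynchronous
    request.onreadystatechange = () => {
      if (request.readyState === 4 && request.status === 200) {
        const slots = request.responseXML?.getElementsByTagName("slot");
        const target = document.getElementById("timetable");
        if (slots && target) {
          target.textContent = slots.length + " teaching slots this week";
        }
      }
    };
    request.send();

Everything else, from libraries to drag-and-drop widgets, is layered on top of that one asynchronous round trip.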

It may seem a bit peculiar at first that there should be any AJAX interoperability issues at all, since it is built on a whole stack of existing, mature, open standards: the W3C’s XML and DOM for data and data manipulation, XHTML for webpages, ECMAScript (JavaScript) for scripting, and much more besides. Though there are a few compliance issues with those standards in modern browsers, that’s not actually the biggest interoperability problem.

That lies more in the fact that most AJAX libraries have been written with the assumption that they’ll be the only ones on the page. That is, in a typical AJAX application, an ECMAScript library is loaded along with the webpage, and starts to control the fetching and sending of data, and the recording of user clicks, drags, drops and more, depending on how exactly the whole application is set up.

This is all nice and straightforward unless there’s another library loaded that also assumes that it’s the only game in town, and starts manipulating the state of objects before the other library can do its job, or starts manipulating completely different objects that happen to have the same name.
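A minimal sketch of the clash, with the two ‘libraries’ and their behaviour invented purely to show the mechanism:

    // "Library A" wires up its behaviour when the page loads.
    window.onload = () => {
      console.log("Library A: fetching data and decorating the page");
    };

    // "Library B", loaded afterwards, assigns to the same property and silently
    // throws Library A's handler away; only B's setup code ever runs.
    window.onload = () => {
      console.log("Library B: registering drag-and-drop handlers");
    };

    // The same goes for identically named globals: whichever script defines
    // $ last wins, and code written against the other library quietly breaks.
    (window as any).$ = (id: string) => document.getElementById(id);

Defensive tricks such as chaining the previous onload handler or using addEventListener help, but only if every library on the page plays along.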

Making sure that JavaScript libraries play nice is the stated aim of the OpenAjax alliance. Formed earlier this year, the alliance now has a pretty impressive roster of all the major open source projects in the area as well as major IT vendors such as Sun, IBM, Adobe and Google (OpenAjax Alliance). Pretty much everyone but Microsoft…

The main, concrete way in which the alliance wants to make sure that AJAX JavaScript libraries play nice with each other is by building the OpenAjax hub. This is a set of standard JavaScript functions that address issues such as load order and component naming, but also give libraries a standard way of addressing each other’s functionality.

For that to happen, the alliance first intends to build an open source reference implementation of the hub (OpenAjax Alliance). This piece of software is meant to control the load and execution order of libraries, and serve as a runtime registry of the libraries’ methods so that each can call on the other. This software is promised to appear in early 2007 (Infoworld), but the SourceForge filestore and subversion tree are still eerily empty (SourceForge).
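Since that code isn’t out yet, any example has to be speculative. The sketch below is a hypothetical illustration of the kind of runtime registry the hub is described as providing; names such as AjaxHub and registerLibrary are invented here, not taken from the alliance’s actual API.

    // Hypothetical registry: libraries claim a unique prefix instead of grabbing
    // globals, and call each other by name through the hub.
    type LibraryInfo = {
      version: string;
      api: Record<string, (...args: any[]) => unknown>;
    };

    const registry = new Map<string, LibraryInfo>();

    const AjaxHub = {
      // Refuse duplicate prefixes, so name clashes surface at load time.
      registerLibrary(prefix: string, info: LibraryInfo): void {
        if (registry.has(prefix)) {
          throw new Error('prefix "' + prefix + '" is already registered');
        }
        registry.set(prefix, info);
      },

      // Look a library up by prefix rather than assuming a global name.
      call(prefix: string, method: string, ...args: unknown[]): unknown {
        return registry.get(prefix)?.api[method]?.(...args);
      },
    };

    // Usage: a widget library registers itself, and other code calls it by name.
    AjaxHub.registerLibrary("dragdrop", {
      version: "0.1",
      api: { enable: (id: string) => console.log("drag enabled on " + id) },
    });
    AjaxHub.call("dragdrop", "enable", "sidebar");

Whether the real hub ends up looking anything like this remains to be seen, of course.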

It’d be a shame if the hub remained vapourware, because it is easy to see the benefits of a way to get a number of mature and focussed JavaScript libraries to work together in a single AJAX application. Done properly, it would make it much easier to string such components together rather than write all that functionality from scratch. This, in turn, could make it much easier to realise the mash-ups and composite applications made possible by the increasing availability of web services.

Still, at least the white paper (OpenAjax Alliance) is well worth a look for a thorough non-techy introduction to AJAX.