PROD; a practical case for Linked Data

Georgi wanted to know what problem Linked Data solves. Mapman wanted a list of all UK universities and colleges with postcodes. David needed a map of JISC Flexible Service Delivery projects that use Archimate. David Sherlock and I got mashing.

Linked Data, because of its association with Linked Open Data, is often presented as an altruistic activity, all about opening up public data and making it re-usable for the benefit of mankind, or at least the tax-payers who facilitated its creation. Those are a very valid reasons, but they tend to obscure the fact that there are some sound selfish reasons for getting into the web of data as well.

In our case, we have a database of JISC projects and their doings called PROD. It focusses on what JISC-CETIS focusses on: what technologies have been used by these projects, and what for. We also have some information on who was involved with the projects, and were they worked, but it doesn’t go much beyond bare names.

In practice, many interesting questions require more information than that. David’s need to present a map of JISC Flexible Service Delivery projects that use Archimate is one of those.

This presents us with a dilemma: we can either keep adding more info to PROD, make ad-hoc mash-ups, or play in the Linked Data area.

The trouble with adding more data is that there is an unending amount of interesting data that we could add, if we had infinite resources to collect and maintain it. Which we don’t. Fortunately, other people make it their business to collect and publish such data, so you can usually string something together on the spot. That gets you far enough in many cases, but it is limited by having to start from scratch for virtually every mashup.

Which is where Linked Data comes in: it allows you to link into the information you want but don’t have.

For David’s question, the information we want is about the geographical position of institutions. Easily the best source for that info and much more besides is the dataset held by the JISC’s Monitoring Unit. Now this dataset is not available as Linked Data yet, but one other part of the Linked Data case is that it’s pretty easy to convert a wide variety of data into RDF. Especially when it is as nicely modelled as the JISC MU’s XML.

All universities with JISC Flexible Delivery projects that use Archimate. Click for the google map

All universities with JISC Flexible Delivery projects that use Archimate. Click for the google map

Having done this once, answering David’s question was trivial. Not just that, answering Mapmans’ interesting question on a list of UK universities of colleges with postcodes was a piece of cake too. That answer prompted Scott and Tony’s question on mapping UCAS and HESA codes, which was another five second job. As was my idle wonder whether JISC projects by Russell group universities used Moodle more or less than those led by post ’92 institutions (answer: yes, they use it more).

Russell group led JISC projects with and without Moodle

Russell group led JISC projects with and without Moodle

Post 92 led JISC projects with or without Moodle

Post '92 led JISC projects with or without Moodle

And it doesn’t need to stop there. I know about interesting datasets from UKOLN and OSSWatch that I’d like to link into. Links from the PROD data to the goodness of Freebase.com and dbpedia.org already exist, as do links to MIMAS’ names project. And each time such a link is made, everyone else (including ourselves!) can build on top of what has already been done.

This is not to say that creating Linked Data out of PROD was for free, nor that no effort is involved in linking datasets. It’s just that the effort seems less than with other technologies, and the return considerably more.

Linked Data also doesn’t make your data automatically usable in all circumstances or bug free. David, for example, expected to see institutions on the map that do use the Archimate standard, but not necessarily as a part of a JISC project. A valid point, and a potential improvement for the PROD dataset. It may also have never come to light if we hadn’t been able to slice and dice our data so readily.

An outline with recipes and ingredients is to follow.

2 thoughts on “PROD; a practical case for Linked Data

  1. Pingback: Enhanced programme maps « JISC Curriculum Design & Delivery

  2. Pingback: #OpenDataEDB: the results | Open Knowledge Foundation Blog