CETIS OER Visualisation Project

As part of our work in the areas of open educational resources and data analysis, CETIS are undertaking a new project to visualise the outputs of the JISC / HEA Open Educational Resource Programmes, and we are very lucky to have recruited data wrangler extraordinaire Martin Hawksey to undertake this work. Martin’s job will be firstly to develop examples and workflows for visualising OER project data stored in the JISC CETIS PROD database, and secondly to produce visualisations around OER content and collections produced by the JISC / HEA programmes. Oh, and he’s only got 40 days to do it! You can read Martin’s thoughts on the task ahead over at his own blog, MASHe:

40 days to let you see the impact of the OER Programme #ukoer

PROD Data Analysis

A core aspect of CETIS support for the OER Phase 1 and 2 Programmes has been the technical analysis of tools and systems used by the projects. The primary data collection tool used for this purpose is the PROD database. An initial synthesis of this data has already been completed by R. John Robertson; however, there is scope for further analysis to uncover richer information sets around the technologies used to create and share OERs.
This part of the project will aim to deliver:

  • Examples of enhanced data visualisations from OER Phase 1 and 2.
  • Recommendations on use and applicability of visualisation libraries with PROD data to enhance the existing OER dataset.
  • Recommendations and example workflows including sample database queries used to create the enhanced visualisations.
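
As a purely illustrative sketch of what a sample query of this kind might look like, the Python below runs an aggregate query against an in-memory SQLite table. The table name, column names and rows are hypothetical stand-ins; they do not reflect the actual PROD schema or its data.

```python
import sqlite3

# Hypothetical stand-in for a PROD-like table: one row per (project, technology).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE project_technology (project TEXT, technology TEXT)")
conn.executemany(
    "INSERT INTO project_technology VALUES (?, ?)",
    [
        ("Project A", "RSS"),
        ("Project A", "WordPress"),
        ("Project B", "RSS"),
        ("Project C", "DSpace"),
    ],
)


def technology_counts(connection):
    """Return (technology, number of distinct projects) pairs, most used first."""
    rows = connection.execute(
        "SELECT technology, COUNT(DISTINCT project) AS projects "
        "FROM project_technology "
        "GROUP BY technology "
        "ORDER BY projects DESC, technology ASC"
    )
    return rows.fetchall()


counts = technology_counts(conn)
for technology, projects in counts:
    print(technology, projects)
```

A result set shaped like this (a label plus a count) is the sort of thing that could be handed straight to a charting library.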

And we also hope this work will uncover some general issues including:

  • Issues around potential workflows for mirroring data from our PROD database and linking it to other datasets in our Kasabi triple store.
  • Identification of other datasets that would enhance PROD queries, and some exploration of how to transform and upload them.
  • General recommendations on wider issues of data, and observed data maintenance issues within PROD.

Visualising OER Content Outputs

The first two phases of the OER Programme produced a significant volume of content; however, the programme requirements were deliberately agnostic about where that content should be stored, aside from a requirement to deposit or reference it in Jorum. This has enabled a range of authentic practices to surface regarding the management and hosting of open educational content, but it also means that there is no central directory of UKOER content, and no quick way to visualise the programme outputs. For example, the content in Jorum varies from a single record for a whole collection to a record per item. Jorum is working on improved ways to surface content and JISC has funded the creation of a prototype UKOER showcase. In the meantime, though, it would be useful to be able to visualise the outputs of the Programmes in a compelling way. For example:

  • Collections mapped by geographical location of the host institution.
  • Collections mapped by subject focus.
  • Visualisations of the volume of collections.

We realise that the data that can be surfaced in such a limited period will be incomplete, and that as a result these visualisations will not be comprehensive. However, we hope that the project will be able to produce compelling, attractive images that can be used to represent the work of the programme.

The deliverables of this part of the project will be:

  • Blog posts on the experience of capturing and using the data.
  • A set of static or dynamic images that can be viewed without specialist software, with the raw data also available.
  • Documentation/recipes on the visualisations produced.
  • Recommendations to JISC and JISC CETIS on visualising content outputs.

Briefing Paper: the Semantic Web, Linked and Open Data

CETIS has published a new Briefing Paper on the Semantic Web, Linked and Open Data. This briefing paper provides a high level overview of key concepts relating to the Semantic Web, semantic technologies, linked and open data; along with references to relevant examples and standards. The briefing is intended to provide a starting point for those within the teaching and learning community who may have come across the concept of semantic technologies and the Semantic Web but who do not regard themselves as experts and wish to learn more. Examples and links are provided as starting points for further exploration. The briefing paper is supplemented by the blog post When is Linked Data not Linked Data? which provides a summary of the debate surrounding the definition and characteristics of Linked Data.

The briefing paper can be downloaded in PDF format here, or you can pick up a printed copy at a CETIS event near you soon!

….more what you’d call “guidelines”….

Owen Stephens has written a helpful post which makes a very useful contribution to the debate regarding the interpretation of Tim Berners-Lee’s Linked Data Design Issues. See my earlier post for a summary of the debate. With all these attempts to clarify the ambiguity I couldn’t help being reminded of the infamous Pirate Code from Pirates of the Caribbean:

“And thirdly, the code is more what you’d call ‘guidelines’ than actual rules. Welcome aboard!”

Sorry, couldn’t resist it ;)

When is Linked Data not Linked Data? – A summary of the debate

One of the activities identified during last December’s Semantic Technology Working Group meeting to be taken forward by CETIS was the production of a briefing paper that disambiguated some of the terminology for those that are less familiar with this domain. The following terms in particular were highlighted:

  • Semantic Web
  • semantic technologies
  • Linked Data
  • linked data
  • linkable data
  • Open Data

I’ve finally started drafting this briefing paper and, unsurprisingly, defining the above terms is proving to be a non-trivial task! Pinning down agreed definitions for Linked Data, linked data and linkable data is particularly problematic. And I’m not the only one having trouble. If you look up Semantic Web and Linked Data / linked data on Wikipedia you will find entries flagged as having multiple issues. It does rather feel like we’re edging close to holy war territory here. But having said that, I do enjoy a good holy war as long as I’m watching safely from the sidelines.

So what’s it all about? As far as I can make out, much of the debate boils down to whether Linked Data must adhere to the four principles outlined in Tim Berners-Lee’s Linked Data Design Issues, and in particular whether use of RDF and SPARQL is mandatory. Some argue that RDF is integral to Linked Data; others suggest that while it may be desirable, use of RDF is optional rather than mandatory. Some reserve the capitalized term Linked Data for data that is based on RDF and SPARQL, preferring lower case “linked data”, or “linkable data”, for data that uses other technologies.

The fact that the Linked Data Design Issues paper is a personal note by Tim Berners-Lee, and is not formally endorsed by W3C, also contributes to the ambiguity. The note states:

  1. Use URIs as names for things
  2. Use HTTP URIs so that people can look up those names.
  3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)
  4. Include links to other URIs. so that they can discover more things.

I’ll refer to the steps above as rules, but they are expectations of behaviour. Breaking them does not destroy anything, but misses an opportunity to make data interconnected. This in turn limits the ways it can later be reused in unexpected ways. It is the unexpected re-use of information which is the value added by the web. (Berners-Lee, http://www.w3.org/DesignIssues/LinkedData.html)
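
To make the four rules a little more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the example.org URIs are invented, the WEB dictionary is an in-memory stand-in for real HTTP lookups, and the tuples only imitate RDF triples rather than using an RDF library. It simply shows how naming things with HTTP URIs, returning useful information for each one, and linking to other URIs lets a client discover more things.

```python
from collections import deque

# Hypothetical stand-in for the web. Each HTTP URI names a thing (rules 1 and 2)
# and maps to the (subject, predicate, object) statements that a server might
# return when that URI is looked up (rule 3).
WEB = {
    "http://example.org/id/project/1": [
        ("http://example.org/id/project/1",
         "http://example.org/vocab/name",
         "Example OER project"),
        # The object here is another URI, i.e. a link to follow (rule 4).
        ("http://example.org/id/project/1",
         "http://example.org/vocab/hostedBy",
         "http://example.org/id/org/1"),
    ],
    "http://example.org/id/org/1": [
        ("http://example.org/id/org/1",
         "http://example.org/vocab/name",
         "Example University"),
    ],
}


def look_up(uri):
    """Pretend to dereference a URI; unknown URIs return no statements."""
    return WEB.get(uri, [])


def discover(start_uri):
    """Follow object links breadth-first and list every URI reached, in order."""
    seen = [start_uri]
    queue = deque([start_uri])
    while queue:
        uri = queue.popleft()
        for _subject, _predicate, obj in look_up(uri):
            if obj.startswith("http://") and obj not in seen:
                seen.append(obj)
                queue.append(obj)
    return seen


found = discover("http://example.org/id/project/1")
print(found)
```

Starting from the project URI alone, the client reaches the hosting organisation only because the project's description links to it. Remove that link and nothing breaks, but the opportunity to discover more is lost, which is the point the note makes.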

In the course of trying to untangle some of the arguments both for and against the necessity of using RDF and SPARQL I’ve read a lot of very thoughtful blog posts which it may be useful to link to here for future reference. Clearly these are not the only, or indeed the most recent, posts that discuss this most topical of topics; they happen to be the ones I have read and which I believe present a balanced overview of the debate in such a way as to be of relevance to the JISC CETIS community.

Linked data vs. Web of data vs. …
– Andy Powell, Eduserv, July 2009

The first useful post I read on this particular aspect of the debate is Andy Powell’s from July 2009. This post resulted from the following question Andy raised on Twitter:

is there an agreed name for an approach that adopts the 4 principles of #linkeddata minus the phrase, “using the standards (RDF, SPARQL)” ??

Andy was of the opinion that Linked Data “implies use of the RDF model – full stop” adding:

“it’s too late to re-appropriate the “Linked Data” label to mean anything other than “use http URIs and the RDF model”.”

However he is unable to provide a satisfactory answer to his own question, i.e. what do you call linked data that does not use the RDF model, and despite exploring alternative models he concludes by professing himself to be worried about this.

Andy returned to this theme in a more recent post in January 2010, Readability and linkability, which ponders the relative emphasis given to readability and linkability by initiatives such as the JISC Information Environment. Andy’s general principles have not changed but he presents the term machine readable data (MRD) as a potential answer to the question he originally asked in his earlier post.

Does Linked Data need RDF?
– Paul Miller, The Cloud of Data, July 2009

Paul Miller’s post is partially a response to Andy’s query. Paul begins by noting that while RDF is key to the Semantic Web and

“an obvious means of publishing — and consuming — Linked Data powerfully, flexibly, and interoperably.”

he is uneasy about conflating RDF with Linked Data and with assertions that

“‘Linked Data’ can only be Linked Data if expressed in RDF.”

Paul discusses the wording and status of Tim Berners-Lee’s Linked Data Design Issues and suggests that it can be read either way. He then goes on to argue that by elevating RDF from the best mechanism for achieving Linked Data to the only permissible approach we risk barring a large group

“with data to share, a willingness to learn, and an enthusiasm to engage.”

Paul concludes by asking the question:

“What are we after? More Linked Data, or more RDF? I sincerely hope it’s the former.”

No data here – just Linked Concepts and Linked, open, semantic?
– Paul Walk, UKOLN, July & November 2009

Paul Walk has published two useful posts on this topic: the first summarising and commenting on the debate sparked by the two posts above, and the second following the Giant Global Graph session at the CETIS 2009 Conference. This latter post presents a very useful attempt at disambiguating the terms Open Data, Linked Data and Semantic Web. Paul also tries to untangle the relationship between these three memes and helpfully notes:

  • data can be open, while not being linked
  • data can be linked, while not being open
  • data which is both open and linked is increasingly viable
  • the Semantic Web can only function with data which is both open and linked

So What Is It About Linked Data that Makes it Linked Data™?
– Tony Hirst, Open University, March 2010

Much more recently Tony Hirst published this post, which begins with a version of the four Linked Data principles cut from Wikipedia. This particular version makes no mention of either RDF or SPARQL. Tony goes on to present a very neat example of data linked using HTTP URIs and Yahoo Pipes and asks

“So, the starter for ten: do we have an example of Linked Data™ here?”

Tony broadly believes the answer is yes and shares Paul Miller’s opinion that too rigid an adherence to RDF and SPARQL

“will put a lot of folk who are really excited about the idea of trying to build services across distributed (linkable) datasets off…”

Perhaps more controversially Tony questions the necessity of universal unique URIs that resolve to content suggesting that:

“local identifiers can fulfil the same role if you can guarantee the context as in a Yahoo Pipe or a spreadsheet”

Tony signs off with:

“My name’s Tony Hirst, I like linking things together, but RDF and SPARQL just don’t cut it for me…”

Meshing up a JISC e-learning project timeline, or: It’s Linked Data on the Web, stupid
– Wilbert Kraan, JISC CETIS, March 2009

Back here at CETIS Wilbert Kraan has been experimenting with linked data meshups of JISC project data held in our PROD system. In contrast to the approach taken by Tony, Wilbert goes down the RDF and SPARQL route. Wilbert confesses that he originally believed that:

“SPARQL endpoints were these magic oracles that we could ask anything about anything.”

However his attempts to mesh up real data sets on the web highlighted the fact that SPARQL has no federated search facility.

“And that the most obvious way of querying across more than one dataset – pulling in datasets from outside via SPARQL’s FROM – is not allowed by many SPARQL endpoints. And that if they do allow FROM, they frequently cr*p out.”
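
For anyone unfamiliar with FROM, the short Python sketch below assembles the kind of query being described. The function name and the dataset URIs are hypothetical, and the code only builds the query text; whether any particular endpoint would accept and run it is exactly the problem Wilbert describes.

```python
def build_from_query(dataset_uris, limit=10):
    """Build a SPARQL SELECT that asks the endpoint to load several datasets via FROM."""
    from_clauses = "\n".join(f"FROM <{uri}>" for uri in dataset_uris)
    return (
        "SELECT ?s ?p ?o\n"
        f"{from_clauses}\n"
        "WHERE { ?s ?p ?o }\n"
        f"LIMIT {limit}"
    )


# Hypothetical dataset locations, for illustration only.
query = build_from_query(
    [
        "http://example.org/data/projects.rdf",
        "http://example.org/data/institutions.rdf",
    ]
)
print(query)
```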

Wilbert concludes that:

“The consequence is that exposing a data set as Linked Data is not so much a matter of installing a SPARQL endpoint, but of serving sensibly factored datasets in RDF with cool URLs, as outlined in Designing URI Sets for the UK Public Sector (pdf).”

And in response to a direct query regarding the necessity of RDF and SPARQL to Linked Data Wilbert answered

“SPARQL and RDF are a sine qua non of Linked Data, IMHO. You can keep the label, widen the definition out, and include other things, but then I’d have to find another label for what I’m interested in here.”

Which kind of brings us right back to the question that Andy Powell asked in July 2009!

So there you have it. A fascinating but currently inconclusive debate I believe. Apologies for the length of this post. Hopefully one day this will go on to accompany our “Semantic Web and Linked Data” briefing paper.

Semantic Technologies: Which Way Now – outputs and activities

Rather belatedly we have finally found time to synthesise the outputs of the “Semantic Technologies: Which Way Now” event CETIS hosted at the University of Strathclyde at the beginning of December. All the presentations from the event are available from the wiki page and you can read Sheila’s liveblog from the day here.

Based on the discussions that took place throughout the event we have identified the following activities that could potentially be taken forward by JISC and CETIS.

1. Briefing paper
There is considerable ambiguity regarding the use of terminology in this space, particularly among those who are less familiar with the semantic technology domain. There was general agreement that it would be useful to attempt to disambiguate some of this terminology including: Semantic Web, semantic technologies, linked data, linkable data and open data with relevant examples where possible and to identify the role that different standards play in this domain.

Output
A short CETIS briefing paper targeted at the teaching and learning community and referencing the forthcoming JISC Linked Data Horizon Scan. This briefing paper could be supplemented by a Delicious page linking to a collection of relevant resources.

2. Business cases
Based on the outputs of the JISC SemTech project, the Linked Data Horizon Scan and other relevant resources, develop a series of business cases for institutional senior managers and information systems directors outlining the potential benefits of investing time and developing expertise in exposing semantic data.

Output
Business cases for senior managers and IS directors.

3. Tools and services
Develop tools, applications and services for consuming and manipulating existing linked data sources to show how they might benefit the domain of teaching and learning and to demonstrate how the business cases identified by Activity 2 might be addressed, e.g. a linked directory of teaching expertise as an exemplar of the sort of service that could be built from FOAF and DOAP.

Output
Tools and services for consuming and manipulating existing linked data sources.

4. Affordances for curriculum design and course approval
Exploration of the affordances of linked data within institutions to facilitate a number of institutional processes including curriculum design and course description. Identify areas in the course approval process where open and/or linkable data could be exploited. There may be potential to work with existing programmes and initiatives such as Curriculum Design and XCRI to start exploring affordances and barriers in this area and possibly to begin scoping requirements for XCRI phase 2.

5. Developer events
One or more technical events, possibly similar to Google’s Summer of Code, open to developers and students, to produce implementations and resources relevant to education based on open and linked data. These activities could potentially be developed around existing JISC and Talis events.

Outputs
Demonstrators, proof of concept implementations, etc.

We would welcome feedback on these or indeed other activities, so please post comments below. CETIS will continue to explore the domains of semantic technologies, open and linked data with a view to facilitating further working groups in this area. We look forward to hearing from you!

Semantic technologies: which way now?

Cast your mind back to the CETIS Conference 2007 and you may remember a session on Semantic Technologies for Teaching and Learning. This session sought to introduce current developments in semantic technologies, explore their potential application to the domain of teaching and learning and facilitate discussion between these two apparently disparate communities. The case for the relevance and potential of semantic technologies was ably presented by a range of international experts through a series of short position papers which formed the basis for a wide ranging discussion. Following this discussion there seemed to be general consensus that it would be valuable for JISC to facilitate further exploration of the affordances of semantic technologies to the domain of education.

JISC responded to this requirement by issuing an ITT for a scoping study to:

“…investigate how applications which use semantic technologies can add value to learning and teaching.”

This study was awarded to the SemTech Project at the University of Southampton and at the same time CETIS established the Semantic Technology Working Group. The remit of this group was firstly to act as an expert working group for the SemTech Project, and secondly to develop recommendations for potential future work based on the outputs of the project.

The SemTech project successfully concluded in July 2009 having undertaken an extensive survey of semantic technologies relevant to learning and teaching and an investigation of the use and uptake of related tools and services by UK HE institutions. In addition to producing a comprehensive report the SemTech Project has also drafted a roadmap for semantic technology adoption by the UK F/HE community.

Semantic technologies appeared again at this year’s CETIS Conference, this time in the guise of linked data which was discussed in both the Find and Seek and Giant Global Graph sessions. The latter session has already generated a number of blog posts by Adam Cooper, Paul Walk and Andy Powell.

In order to disseminate and discuss the SemTech roadmap, the outputs of the CETIS conference and potential future activities in the area of semantic technologies for teaching and learning CETIS are holding a public meeting of the Semantic Technologies Working Group on the 10th of December at the University of Strathclyde. This meeting will:

  • Review the outputs of the SemTech project.
  • Consider the roadmap and recommendations to JISC.
  • Respond to these recommendations and explore future directions.
  • Investigate ways that CETIS can raise awareness of the potential affordances of semantic technologies to the teaching and learning sector.
  • Discuss future activities in this area that CETIS could potentially engage in.

The meeting is open to all those with an interest in semantic technologies and their potential application to the domain of teaching and learning. We will be actively seeking comments and feedback from the community and would encourage colleagues to join the discussion.

To register for this meeting and for further information please visit the CETIS events page.

Repository Fringe 2009

A few brief notes from the first day of the Repository Fringe (#rf09) event in Edinburgh. A lot of the presentations were somewhat orthogonal (can’t use that word without thinking of the late great Claude Ostyn!) to my main areas of interest. There were one or two mentions of using repositories to manage teaching and learning materials (two to be precise) but the main focus of the majority of the presentations was squarely on institutional repositories of scholarly works and the research publication workflow and lifecycle.

Having said that, Sally Rumsey and Ben O’Steen’s opening keynote raised some interesting general points which I’ve noted randomly below:

“Sir Thomas Bodley built an “ark to save learning from deluge” and instigated a “republic of lettered men”. Are we building the digital equivalent of the Bodleian?”

“Repository staff act as catalysts for community building.”

“The most successful repository is the internet. How can we make institutional repositories more like the internet? Adding urls to resources for example.”

“People search for “things” not documents. Things have names in real life, however not everything on the web has a name. We can give things names? We can certainly give them urls. It is key to know how a document relates to the thing. The real power comes from the relating of things.”

“We’ve reinvented too many wheels. We need to use the de facto standards of the web, they work, don’t fight them.”

“Almost anything can be regarded as a repository (e.g. flickr, youtube, eprints, etc) but these things don’t have much in common.”

“We need to cut the complexity and aim for one click deposit. We need a solution to the multiple repository deposit regime (MuRDeR) problem.”

“Preservation is useless without access. We should rename preservation – assured secure storage and permanent access.”

“Disproportionate feedback loop – the perception that a small effort brings enormous benefit. The ultimate feedback for the academic is peer review.” (I thought that this particular disproportionate feedback loop sounded rather like harnessing the power of professional vanity to fill repositories.)

“Print on demand is going to be huge.” (Oh really??)

A few other notable, and in some cases questionable, quotes from the day:

“…..of course if we’re talking about people as strings….”

“Linked data is going to take over the world.”

“The Semantic Web isn’t just about better search, it’s about aggregation.”

“Institutional repositories are ultimately marketing tools really.”

One of the mentions of learning resources came from Richard Jones of Symplectic, who said they were involved in a project that was developing a learning object repository based on DSpace, augmented with Mahara to facilitate communities of practice.

One last thing, one of the “novel” aspects of the Repository Fringe was the Pecha Kucha sessions. Some of these were notably more successful than others. Les Carr was excellent of course, as were William Nixon and his colleague from Glasgow University’s Enrich and Enlighten projects. However I couldn’t help being reminded of Alt-C panel sessions with three or four short rushed powerpoint presentations with very little time or inclination for comments at the end. More opportunity for discussion would have been greatly appreciated! As one of my colleagues diplomatically put it:

“….the message was somewhat hampered by the medium.”

I decided against attending the second day of the conference but was very sorry to miss Cliff Lynch’s closing keynote. Hopefully it’ll appear online sooner rather than later.

SemTech project draft questionnaire

The JISC SemTech Project, funded as a direct result of last year’s JISC CETIS Conference, has published a draft of the questionnaire that they will be using to gather information about examples of semantic technologies used in the domain of teaching and learning. Thanassis and the project team are seeking comments on the draft questionnaire which can be accessed from the project blog at http://blog.lsl.ecs.soton.ac.uk/semtech/

Semantic Technology Working Group

Last Friday saw the first meeting of the new CETIS Semantic Technology Working Group. CETIS Working Groups are a little different from the Special Interest Groups you all know and love in that they have a much tighter focus, a finite lifespan and a remit to produce one or more deliverables. I was particularly interested to attend the launch of the Semantic Technologies Working Group as it is a direct offshoot of the Semantic Technologies for Teaching and Learning session that Phil and I ran at last year’s CETIS Conference. Sheila has already written a short blog post about this meeting but here’s a little more detail.

The working group has two primary aims: firstly to act as an expert working group for the new JISC SemTech project, also funded as a result of the conference session, and secondly to develop recommendations for potential future work based on the outputs of the project. The first meeting of the working group was closed to enable us to focus in detail on the scope of the SemTech project; however, future meetings are likely to be open to the wider JISC community and all those with an interest in the use of semantic technologies for teaching and learning.

Participants at this initial meeting included Robin Wylie of Learning and Teaching Scotland, Michael Gardner from Essex, Sue Manuel from Loughborough, Tony Linde from Leicester, Simon Buckingham Shum from the OU, Helen Beetham from JISC, Hugh Davis and Thanassis Tiropanis from Southampton, and Sheila, Wilbert, Phil and me from CETIS. And not forgetting, as Wilbert tweeted at the time, “iSight, conference phone, projector, 3g modems, ipod, mobile phone herd and the odd mouse.”

Thanassis Tiropanis opened the meeting with an enthusiastic and engaging introduction to the SemTech project, which is based at the University of Southampton and will run until February 2009. The aims and objectives of the project are:

  1. Survey of the relevance and use of semantic tools and services in HE/FE, informal and exploratory learning. The impact of current work on semantic enhancement of successful Web 2.0 services will be reported.
  2. A roadmap for further developments in semantic technology adoption in HE/FE, informal learning and exploratory learning.
  3. The HE/FE institutional perspective of tools, services, relevance and quantifiable benefits.

Much of the rest of the meeting was taken up by a discussion of what constitutes “semantic technology” for the purpose of the project. Unsurprisingly this discussion was not entirely conclusive but there seemed to be some agreement that there should be some level of reasoning involved at the machine level. “Inference” was another term that kept cropping up. There was also general agreement that to be relevant to the project the technology must be used with some pedagogic intent and not simply for recording or resource discovery. For example, mindmapping tools may not be regarded as semantic technologies for the purpose of the project, whereas an application such as Omnigator, which consumes topic maps and merges them on the fly, is very much in scope. There’s still a lot of discussion to be had on these issues and it’ll be very intriguing to see what kind of technologies Thanassis and the SemTech project turn up.

For further information on the SemTech project please visit the project website at http://www.semtech.ecs.soton.ac.uk/ or, to learn more about the CETIS Semantic Technologies Working Group, contact Sheila or me.