Lorna Campbell » standards
Cetis Blog, http://blogs.cetis.org.uk/lmc

inBloom to implement Learning Registry and LRMI
Fri, 08 Feb 2013
http://blogs.cetis.org.uk/lmc/2013/02/08/inbloom-to-implement-learning-registry-and-lrmi/

There have been a number of reports in the tech press this week about inBloom, a new technology integration initiative for the US schools’ sector launched by the Shared Learning Collective. inBloom is “a nonprofit provider of technology services aimed at connecting data, applications and people that work together to create better opportunities for students and educators,” and it’s backed by a cool $100 million of funding from the Carnegie Corporation and the Bill and Melinda Gates Foundation. In the press release, Iwan Streichenberger, CEO of inBloom Inc, is quoted as saying:

“Education technology and data need to work better together to fulfill their potential for students and teachers. Until now, tackling this problem has often been too expensive for states and districts, but inBloom is easing that burden and ushering in a new era of personalized learning.”

This initiative first came to my attention when Sheila circulated a TechCrunch article earlier in the week. Normally any article that quotes both Jeb Bush and Rupert Murdoch would have me running for the hills, but Sheila is made of sterner stuff and dug a bit deeper to find the inBloom Learning Standards Alignment whitepaper. And this is where things get interesting, because inBloom incorporates two core technologies that CETIS has had considerable involvement with over the last while: the Learning Registry and the Learning Resource Metadata Initiative (LRMI), which Phil Barker has contributed to as co-author and Technical Working Group member.

I’m not going to attempt to summarise the entire technical architecture of inBloom, however the core components are:

  • Data Store: Secure data management service that allows states and districts to bring together and manage student and school data and connect it to learning tools used in classrooms.
  • APIs: Provide authorized applications and school data systems with access to the Data Store.
  • Sandbox: A publicly-available testing version of the inBloom service where developers can test new applications with dummy data.
  • inBloom Index: Provides valuable data about learning resources and learning objectives to inBloom-compatible applications.
  • Optional Starter Apps: A handful of apps to get educators, content developers and system administrators started with inBloom, including a basic dashboard and data and content management tools.

Of the above components, it’s the inBloom index that is of most interest to me, as it appears to be a service built on top of a dedicated inBloom Learning Registry node, which in turn connects to the Learning Registry more widely as illustrated below.

inBloom Learning Resource Advertisement and Discovery

According to the Standards Alignment whitepaper, the inBloom Index will work as follows (apologies for the long techy quote; it’s interesting, I promise you!):

The inBloom Index establishes a link between applications and learning resources by storing and cataloging resource descriptions, allowing the described resources to be located quickly by the users who seek them, based in part on the resources’ alignment with learning standards. (Note, in this context, learning standards refers to curriculum standards such as the Common Core.)

inBloom’s Learning Registry participant node listens to assertions published to the Learning Registry network, consolidating them in the inBloom Index for easy access by applications. The usefulness of the information collected depends upon content publishers, who must populate the Learning Registry with properly formatted and accurately “tagged” descriptions of their available resources. This information enables applications to discover the content most relevant to their users.

Content descriptions are introduced into the Learning Registry via “announcement” messages sent through a publishing node. Learning Registry nodes, including inBloom’s Learning Registry participant node, may keep the published learning resource descriptions in local data stores, for later recall. The registry will include metadata such as resource locations, LRMI-specified classification tags, and activity-related tags, as described in Section 3.1.

The inBloom Index has an API, called the Learning Object Dereferencing Service, which is used by inBloom technology-compatible applications to search for and retrieve learning object descriptions (of both objectives and resources). This interface provides a powerful vocabulary that supports expression of either precise or broad search parameters. It allows applications, and therefore users, to find resources that are most appropriate within a given context or expected usage.

inBloom’s Learning Registry participant node is peered with other Learning Registry nodes so that it can receive resource description publications, and filters out announcements received from the network that are not relevant.

In addition, it is expected that some inBloom technology-compatible applications, depending on their intended functionality, will contribute information to the Learning Registry network as a whole, and therefore indirectly feed useful data back into the inBloom Index. In this capacity, such applications would require the use of the Learning Registry participant node.
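For the sake of concreteness, here is a rough sketch in Python of what the “announcement” step might look like from a content publisher’s side. The node URL is invented and the envelope is a simplification of the Learning Registry resource data envelope as I understand it; a real publish call involves identity, terms of service and, optionally, signing details that are omitted here.

```python
import json

import requests  # third-party HTTP client

# Hypothetical Learning Registry publishing node; a real deployment has its own URL.
NODE = "http://lr-node.example.org"

# A much simplified "resource data" envelope. The field names follow the public
# Learning Registry spec as I understand it, but a real publish call also needs
# identity and terms-of-service blocks and may be digitally signed.
envelope = {
    "documents": [
        {
            "doc_type": "resource_data",
            "resource_data_type": "metadata",
            "resource_locator": "http://example.org/resources/fractions-lesson",
            "payload_placement": "inline",
            "payload_schema": ["LRMI"],
            # The description itself travels as an opaque string payload.
            "resource_data": json.dumps({
                "name": "Introducing fractions",
                "url": "http://example.org/resources/fractions-lesson",
            }),
        }
    ]
}

response = requests.post(f"{NODE}/publish", json=envelope, timeout=30)
print(response.status_code, response.text)
```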

One reason that this is so interesting is that this is exactly the way that the Learning Registry was designed to work. It was always intended that the Learning Registry would provide a layer of “plumbing” to allow the data to flow, education providers would push any kind of data into the Learning Registry network and developers would create services built on top of it to process and expose the data in ways that are meaningful to their stakeholders. Phil and I have both written a number of blog posts on the potential of this approach for dealing with messy educational content data, but one of our reservations has been that this approach has never been tested at scale. If inBloom succeeds in implementing their proposed technical architecture it should address these reservations, however I can’t help noticing that, to some extent, this model is predicated on there being an existing network of Learning Registry nodes populated with a considerable volume of educational content data, and as far as I’m aware, that isn’t yet the case.

I’m also rather curious about the whitepaper’s assertion that:

“The usefulness of the information collected depends upon content publishers, who must populate the Learning Registry with properly formatted and accurately “tagged” descriptions of their available resources.”

While this is certainly true, it’s also rather contrary to one of the original goals of the Learning Registry, which was to be able to ingest data in any format, regardless of schema. Of course the result of this “anything goes” approach to data aggregation is that the bulk of the processing is pushed up to the services and applications layer: any service built on top of the Learning Registry has to do most of the work of turning the aggregated data into meaningful information. The JLeRN Experiment at Mimas highlighted this as one of their concerns about the Learning Registry approach, so it’s interesting to note that inBloom appears to be pushing some of that processing, not down to the node level, but out to the data providers. I can understand why they are doing this, but it potentially means that they will lose some of the flexibility that the Learning Registry was designed to accommodate.
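To illustrate the consumer side of that bargain, the sketch below shows the sort of defensive normalisation a service has to do when it pulls arbitrarily formatted descriptions back out of a node. The /obtain call, the response shape and the field names are assumptions loosely based on the Learning Registry documentation rather than something tested against a live node.

```python
import json

import requests  # third-party HTTP client

NODE = "http://lr-node.example.org"  # hypothetical node, as above


def normalise(record):
    """Coerce one registry envelope into a flat dict for indexing.

    Because the registry accepts any schema, a consuming service has to sniff
    the payload: it might be LRMI JSON, Dublin Core XML or plain text.
    """
    payload = record.get("resource_data", "")
    title = None
    if isinstance(payload, dict):
        title = payload.get("name") or payload.get("title")
    elif isinstance(payload, str):
        try:
            data = json.loads(payload)
            if isinstance(data, dict):
                title = data.get("name") or data.get("title")
        except ValueError:
            pass  # not JSON; an XML or free-text fallback would go here
    return {
        "locator": record.get("resource_locator"),
        "title": title or "(untitled)",
        "schema": record.get("payload_schema", []),
    }


# Assumed /obtain endpoint and response shape, loosely based on the LR docs.
docs = requests.get(f"{NODE}/obtain", timeout=30).json().get("documents", [])
for doc in docs:
    for record in doc.get("document", []):
        print(normalise(record))
```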

Another interesting aspect of the inBloom implementation is that the more detailed technical architecture in the voluminous Developer Documentation indicates that at least one component of the Data Store, the Persistent Database, will be running on MongoDB, as opposed to CouchDB, which is used by the Learning Registry. Both are schema-free document databases, though to be honest I don’t know in detail how their functionality differs.
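For what it’s worth, at the level of “store a schema-free document” the two look very similar; the differences show up in querying, indexing and replication. A minimal sketch, assuming local MongoDB and CouchDB servers and invented database names:

```python
import couchdb  # third-party "CouchDB" client package
from pymongo import MongoClient  # third-party MongoDB client

# The same schema-free document, stored in both databases.
doc = {
    "resource_locator": "http://example.org/resources/fractions-lesson",
    "payload_schema": ["LRMI"],
}

# MongoDB, the store behind inBloom's persistent database.
mongo = MongoClient("mongodb://localhost:27017")
result = mongo["sketch_db"]["resources"].insert_one(dict(doc))
print("MongoDB id:", result.inserted_id)

# CouchDB, the store behind the Learning Registry reference implementation.
couch = couchdb.Server("http://localhost:5984")
db = couch["sketch_db"] if "sketch_db" in couch else couch.create("sketch_db")
doc_id, doc_rev = db.save(dict(doc))
print("CouchDB id/rev:", doc_id, doc_rev)

# The practical differences lie elsewhere: MongoDB offers ad hoc queries and
# secondary indexes, while CouchDB favours map/reduce views and built-in
# replication, which is what Learning Registry nodes use to share documents.
```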

inBloom Technical Architecture

In terms of the metadata, inBloom appears to be mandating the adoption of LRMI as their primary metadata schema.

When scaling up teams and tools to tag or re-tag content for alignment to the Common Core, state and local education agencies should require that LRMI-compatible tagging tools and structures be used, to ensure compatibility with the data and applications made available through the inBloom technology.
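By way of illustration, this is roughly what an LRMI description looks like when expressed as schema.org-flavoured JSON-LD. The property names come from the LRMI specification, but the resource, its URL and the alignment values are invented:

```python
import json

# An invented resource described with LRMI properties on a schema.org CreativeWork.
lrmi_description = {
    "@context": "http://schema.org",
    "@type": "CreativeWork",
    "name": "Introducing fractions",
    "url": "http://example.org/resources/fractions-lesson",
    "learningResourceType": "lesson plan",
    "typicalAgeRange": "8-9",
    "educationalUse": "group work",
    "timeRequired": "PT40M",
    "useRightsUrl": "http://creativecommons.org/licenses/by/3.0/",
    "educationalAlignment": {
        "@type": "AlignmentObject",
        "alignmentType": "teaches",
        "educationalFramework": "Common Core State Standards",
        "targetName": "CCSS.Math.Content.3.NF.A.1",
        "targetUrl": "http://www.corestandards.org/Math/Content/3/NF/A/1",
    },
}

print(json.dumps(lrmi_description, indent=2))
```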

A profile of the Learning Registry paradata specification will also be adopted, but as far as I can make out this has not yet been developed.

It is important to note that while the Paradata Specification provides a framework for expressing usage information, it may not specify a standardized set of actors or verbs, or inBloom.org may produce a set that falls short of enabling inBloom’s most compelling use cases. inBloom will produce guidelines for expression of additional properties, or tags, which fulfill its users’ needs, and will specify how such metadata and paradata will conform to the LRMI and Learning Registry standards, as well as to other relevant or necessary content description standards.
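For reference, usage paradata in the Learning Registry takes the form of an activity-stream-like statement. The sketch below follows my reading of the draft paradata specification; the actor and verb values are exactly the sort of thing that is currently unstandardised and that an inBloom profile would have to pin down.

```python
# A sketch of a Learning Registry paradata "activity", following my reading of
# the draft paradata specification. The actor/verb vocabulary is not
# standardised; agreeing one is the gap an inBloom profile would need to fill.
paradata_activity = {
    "activity": {
        "actor": {
            "objectType": "educator",
            "description": ["mathematics", "grade 3"],
        },
        "verb": {
            "action": "taught",
            "date": "2013-01-14/2013-01-18",
            "measure": {"measureType": "count", "value": 12},
        },
        "object": {"id": "http://example.org/resources/fractions-lesson"},
        "content": "Taught by 12 grade 3 mathematics teachers in one week",
    }
}
```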

All very interesting. I suspect that, with the volume of Gates and Carnegie funding backing inBloom, we’ll be hearing a lot more about this development and, although it may have no direct impact on the UK F/HE sector, it is going to be very interesting to see whether the technologies inBloom adopts, and the Learning Registry in particular, can really work at scale.

PS I haven’t had a look at the parts of the inBloom spec that cover assessment but Wilbert has noted that it seems to be “a straight competitor to the Assessment Interoperability Framework that the Obama administration Race To The Top projects are supposed to be building now…”

Back to the Future – revisiting the CETIS codebashes
Wed, 05 Dec 2012
http://blogs.cetis.org.uk/lmc/2012/12/05/codebashes/

As a result of a request from the Cabinet Office to contribute to a paper on the use of hackdays during the procurement process, CETIS have been revisiting the “Codebash” events that we ran between 2002 and 2007. The codebashes were a series of developer events that focused on testing the practical interoperability of implementations of a wide range of content specifications current at the time, including IMS Content Packaging, Question and Test Interoperability, Simple Sequencing (I’d forgotten that even existed!), Learning Design and Learning Resource Meta-data, IEEE LOM, Dublin Core Metadata and ADL SCORM. The term “codebash” was coined to distinguish the CETIS events from the ADL Plugfests, which tested the interoperability and conformance of SCORM implementations. Over a five year period CETIS ran four content codebashes that attracted participants from 45 companies and 8 countries. In addition to the content codebashes, CETIS also ran events focused on individual specifications, such as IMS QTI, or on the outputs of specific JISC programmes, such as the Designbashes and Widgetbash facilitated by Sheila MacNeill. As there was considerable interest in the codebashes and we were frequently asked for guidance on running events of this kind, I wrote and circulated a Codebash Facilitation document. It’s years since I revisited this document, but I looked it out for Scott Wilson a couple of weeks ago as potential input for the Cabinet Office paper he was drafting together with a group of independent consultants. The resulting paper, Hackdays – Levelling the Playing Field, can be read and downloaded here.

The CETIS codebashes have been rather eclipsed by hackdays and connectathons in recent years; however it appears that these very practical, focused events still have something to offer the community, so I thought it might be worth summarising the Codebash Facilitation document here.

Codebash Aims and Objectives

The primary aim of CETIS codebashes was to test the functional interoperability of systems and applications that implemented open learning technology interoperability standards, specifications and application profiles. In reality that meant bringing together the developers of systems and applications to test whether it was possible to exchange content and data between their products.

A secondary objective of the codebashes was to identify problems, inconsistencies and ambiguities in published standards and specifications. These were then fed back to the appropriate maintenance body in order that they could be rectified in subsequent releases of the standard or specification. In this way codebashes offered developers a channel through which they could contribute to the specification development process.

A tertiary aim of these events was to identify and share common practice in the implementation of standards and specifications and to foster communities of practice where developers could discuss how and why they had taken specific implementation decisions. A subsidiary benefit of the codebashes was that they acted as useful networking events for technical developers from a wide range of backgrounds.

The CETIS codebashes were promoted as closed technical interoperability testing events, though every effort was made to accommodate all developers who wished to participate. The events were aimed specifically at technical developers and we tried to discourage companies from sending marketing or sales representatives, though I should add that we were not always successful! Managers who played a strategic role in overseeing the development and implementation of systems and specifications were, however, encouraged to participate.

Capturing the Evidence

Capturing evidence of interoperability during early codebashes proved to be extremely difficult, so Wilbert Kraan developed a dedicated website built on a Zope application server to facilitate the recording process. Participants were able to register the tools and applications that they were testing and to upload content or data generated by these applications. Other participants could then take this content and test it in their own applications, allowing “daisy chains” of interoperability to be recorded. In addition, developers had the option of making their contributions openly available to the general public or visible only to other codebash participants. All participants were encouraged to register their applications prior to the event and to identify specific bugs and issues that they hoped to address. Developers who could not attend in person were able to participate remotely via the codebash website.
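In essence the daisy chains were a simple graph of which application could successfully consume which other application’s output. A minimal sketch of that bookkeeping, with invented class names and sample data rather than anything from the original Zope site:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class InteropResult:
    """One edge in a 'daisy chain': content exported by one tool, loaded by another."""
    producer: str   # application that exported the content or data
    consumer: str   # application that imported / rendered it
    spec: str       # e.g. "IMS Content Packaging 1.1.4"
    success: bool
    notes: str = ""


@dataclass
class Codebash:
    results: List[InteropResult] = field(default_factory=list)

    def record(self, result: InteropResult) -> None:
        self.results.append(result)

    def daisy_chains(self, spec: str) -> List[List[str]]:
        """Follow successful exchanges for one spec: A -> B -> C ..."""
        edges = [r for r in self.results if r.spec == spec and r.success]
        chains = []
        for start in edges:
            chain = [start.producer, start.consumer]
            current = start.consumer
            for nxt in edges:
                if nxt.producer == current and nxt.consumer not in chain:
                    chain.append(nxt.consumer)
                    current = nxt.consumer
            chains.append(chain)
        return chains


bash = Codebash()
bash.record(InteropResult("Authoring Tool A", "VLE B", "IMS Content Packaging 1.1.4", True))
bash.record(InteropResult("VLE B", "Repository C", "IMS Content Packaging 1.1.4", True))
print(bash.daisy_chains("IMS Content Packaging 1.1.4"))
```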

IPR, Copyright and Dissemination

The IPR and copyright of all resources produced during the CETIS codebashes remained with the original authors, and developers were neither required nor expected to expose the source code of their tools and applications to other participants.

Although CETIS disseminated the outputs of all the codebashes, and identified all those that had taken part, the specific performance of individual participants was never revealed. Bug reports and technical issues were fed back to relevant standards and specifications bodies and a general overview of the levels of interoperability achieved was disseminated to the developer community. All participants were free to publish their own reports on the codebashes; however they were strongly discouraged from publicising the performance of other vendors and potential competitors. At the time, we did not require participants to sign non-disclosure agreements, and relied entirely on developers’ sense of fair play not to reveal their competitors’ performance. Thankfully no problems arose in this regard, although one or two of the bigger commercial VLE developers were very protective of their code.

Conformance and Interoperability

It’s important to note that the aim of the CETIS codebashes was to facilitate increased interoperability across the developer community, rather than to evaluate implementations or test conformance. Conformance testing can be difficult and costly to facilitate and govern and does not necessarily guarantee interoperability, particularly if applications implement different profiles of a specification or standard. Events that enable developers to establish and demonstrate practical interoperability are arguably of considerably greater value to the community.

Although CETIS codebashes had a very technical focus they were facilitated as social events and this social interaction proved to be a crucial component in encouraging participants to work closely together to achieve interoperability.

Legacy

These days the value of technical developer events in the domain of education is well established, and a wide range of specialist events have emerged as a result. Some are general in focus, such as the hugely successful DevCSI hackdays; others are more specific, such as the CETIS Widgetbash, the CETIS / DevCSI OER Hackday and the EDINA Wills World Hack running this week, which aims to build a Shakespeare Registry of metadata about digital resources relating to Shakespeare, covering anything from his work and life to modern performance, interpretation or geographical and historical contextual information. At the time however, aside from the ADL Plugfests, the CETIS codebashes were unique in offering technical developers an informal forum to test the interoperability of their tools and applications, and I think it’s fair to say that they had a positive impact not just on developers and vendors but also on the specification development process and the education technology community more widely.

Links

Facilitating CETIS CodeBashes paper
Codebash 1-3 Reports, 2002 – 2005
Codebash 4, 2007
Codebash 4 blog post, 2007
Designbash, 2009
Designbash, 2010
Designbash, 2011
Widgetbash, 2011
OER Hackday, 2011
QTI Bash, 2012
Dev8eD Hackday, 2012

A TAACCCTful mandate? OER, SCORM and the $2bn grant
Tue, 25 Jan 2011
http://blogs.cetis.org.uk/lmc/2011/01/25/a-taaccctful-mandate-oer-scorm-and-the-2bn-grant/

Last week’s announcement that the US Department of Labour is planning to allocate $2 billion in grant funds to the Trade Adjustment Assistance Community College and Career Training (TAACCCT) grants programme over the next four years has already generated a huge response online. $2 billion is a lot of money #inthiscurrentclimate, or indeed in any climate; however the reason that this announcement has generated so much heat is that it has been billed as $2 billion for open educational resources and, furthermore, it mandates the use of SCORM. Although there has been almost universal approval of the fact that the TAACCCT call mandates the use of the CC BY license, the inclusion of the SCORM mandate has stirred up a bit of a hornets’ nest. John Robertson of CETIS has helpfully curated the tweet storm as it escalated over the course of the day. You can follow it here: To SCORM or not to SCORM.

Before attempting to summarise the arguments for and against this mandate, it is worth highlighting the following points from the Department of Labour’s Solicitation for Grant Applications:

The programme, which is releasing $500 million in the first instance, states its aim as follows:

“The TAACCCT provides community colleges and other eligible institutions of higher education with funds to expand and improve their ability to deliver education and career training programs that can be completed in two years or less, are suited for workers who are eligible for training under the Trade Adjustment Assistance for Workers program, and prepare program participants for employment in high-wage, high-skill occupations.”

It goes on to state that:

“The Department is interested in accessible online learning strategies that can effectively serve the targeted population. Online learning strategies can allow adults who are struggling to balance the competing demands of work and family to acquire new skills at a time, place and pace that are convenient for them.”

The SCORM mandate appears under the heading Funding Priorities:

“All successful applicants that propose online and technology-enabled learning projects will develop materials in compliance with SCORM, as referenced in Section I.B.4 of this SGA. These courses and materials will be made available to the Department for free public use and distribution, including the ability to re-use course modules, via an online repository for learning materials to be established by the Federal Government.”

And the Creative Commons mandate is covered in Funding Restrictions: Intellectual Property rights.

“In order to further the goal of career training and education and encourage innovation in the development of new learning materials, as a condition of the receipt of a Trade Adjustment Assistance Community College and Career Training Grant (“Grant”), the Grantee will be required to license to the public (not including the Federal Government) all work created with the support of the grant (“Work”) under a Creative Commons Attribution 3.0 License (“License”).”

It is interesting to note that although the call mandates license and content interoperability formats it does not mandate the use of a specific metadata standard:

“All grant products will be provided to the Department with meta-data (as described in Section III.G.4) in an open format mutually agreed-upon by the grantee and the Department.”

The section in question refers to an appendix of keywords and tags which grantees are advised to use, although I am unclear from the call whether “grant products” refers to bids and documentation or actual educational resources.
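On the SCORM requirement itself, “compliance with SCORM” at the very least implies shipping each course as a content package with an imsmanifest.xml at its root. The sketch below is only that basic packaging check, not a conformance test; real SCORM conformance also covers the manifest schema, sequencing and the run-time API.

```python
import zipfile


def looks_like_scorm_package(path: str) -> bool:
    """Very rough check: a SCORM package is a zip with imsmanifest.xml at its root.

    This only inspects the packaging layer; full SCORM conformance also covers
    the manifest's schema, sequencing rules and the run-time API, none of which
    a check like this can see.
    """
    try:
        with zipfile.ZipFile(path) as pkg:
            return "imsmanifest.xml" in pkg.namelist()
    except zipfile.BadZipFile:
        return False


print(looks_like_scorm_package("course_export.zip"))  # hypothetical file
```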

To coincide with the publication of the call, Creative Commons issued a press release with the following endorsement from incoming CEO Cathy Casserly:

“This exciting program signifies a massive leap forward in the sharing of education and training materials. Resources licensed under CC BY can be freely used, remixed, translated, and built upon, and will enable collaboration between states, organizations, and businesses to create high quality OER. This announcement also communicates a commitment to international sharing and cooperation, as the materials will be available to audiences worldwide via the CC license.”

Some bloggers, including Dave Cormier, University of Prince Edward Island, and Stephen Downes, National Research Council of Canada, initially responded with cautious optimism, seeing this initiative as a possible step towards ending “the text book industry as we know it.”

Cormier commented:

“This kind of commitment from the government, money at that scale, that much commitment to the idea of creative commons… this tells me that we might be ready to rid ourselves of the $150 introductory textbook and move to open content.”

Downes concurred because:

“First, government support removes the risk from using a Creative Commons license. Second, it’s enough money. $2 billion will actually produce a measurable amount of educational content. And third, it’s not the only game in town.”

However Cormier was sufficiently incensed about the inclusion of the SCORM mandate to launch a petition on twitter titled “Educational professionals against the enforcement of SCORM by the US Department of Education.”

(In actual fact the TAACCCT call comes from the Department of Labour rather than the Department of Education.)

Rob Abel, CEO of IMS, also responded in no uncertain terms to the inclusion of the SCORM mandate. In a blog post and open letter to IMS members Abel quoted President Obama’s pledge to “remove outdated regulations that stifle job creation and make our economy less competitive,” adding that the inclusion of the SCORM mandate is a “clear violation” of this pledge. Abel claimed that the SCORM mandate is a “ticking time bomb” that will “add enormous cost to the creation of the courses and to the platforms that must deliver them” and “stifle the intended outcomes of the historic TAACCCT investment”. Abel provides a long and detailed critique of SCORM and points out that “IMS has spent the last five years bringing to market standards that will actually deliver on what SCORM promised”, namely Common Cartridge, Learning Tools Interoperability (LTI), and Learning Information Services (LIS).

Chuck Severance, University of Michigan School of Information and IMS consultant, agreed with Abel’s comments while expanding his self-described rant into a critique of OER initiatives more generally. Severance argued that “this obsession with ‘making and publishing’ OER artefacts that are unsuitable for editing is why nearly all of this kind of work ends up dead and obsolete.” He adds that most OER initiatives “make some slick web site and then try to drive people to their site – virtually none of these efforts can demonstrate any real learning impact.” However Severance does believe that if educational resources are published in a remixable format with a Creative Commons license they can be of real value, and he cites his own book Python for Informatics by way of example. He also concedes that the problem is “difficult to solve” before concluding “IMS Common Cartridge is the best we have but it needs a lot more investment in both the specification and tools to support the specification fully.”

A rather more balanced argument was put forward by Michael Feldstein, Academic Enterprise Solutions, Oracle Corporation. While he agreed that mandating SCORM was a mistake he noted that SCORM and IMS CC have “substantially different affordances that are appropriate for substantially different use cases”. While recognizing that it is understandable that “the Federal government wants to mandate a particular standard for content reuse” he added that mandating any specific standard, whether SCORM, IMS CC, RSS or Atom, is likely to be problematic because “educational content re-use is highly context-dependent”. Instead Feldstein suggests:

“The better thing to do would be to require that grantees include in their proposal a plan for promoting re-use, which would include the selection of appropriate format standards.”

Which is exactly the approach taken by the JISC / Higher Education Academy OER Programmes.

Reflecting on these developments from across the pond, I have to agree that mandating the use of SCORM for the creation of open educational resources does strike me as somewhat curious, to say the least. It is very much at odds with the approach taken by the JISC / HEA OER Programmes: UK OER does not mandate the use of any specific standards, although there are detailed technical guidelines for the programme and CETIS provides technical support to all projects. That said, the TAACCCT programme is not an OER programme in the same sense as the JISC / HEA UK OER Programmes. It’s interesting to note that while the White House announcement by Hal Plotkin focuses squarely on open educational resources, the call itself uses slightly different terminology, referring instead to “open-source courses”.

As Scott Wilson of CETIS pointed out on twitter, given TAACCCT’s focus on adaptive self-paced interactive content, the initiative appears to be more akin to the National Learning Network’s NLN Materials programme, which ran for five years from 1999 and which also mandated the use of SCORM, with a greater or lesser degree of success depending on your perspective. This reflection led Amber Thomas of JISC to comment:

“It’s not that mandating standards for learning materials is always wrong, it was the right thing for the NLN Materials – its more nuanced than that. It’s about the point people are at and which direction things need taking in.”

At this stage and at this remove it’s difficult to comment further without knowing more about the rationale behind the Department of Labour’s decision to mandate the use of SCORM for this particular programme. Needless to say, CETIS will be following these developments with interest and will continue to disseminate any further developments.

Then and Now
Fri, 16 Apr 2010
http://blogs.cetis.org.uk/lmc/2010/04/16/then-and-now/

A position paper for the ADL Repositories and Registries Summit by Lorna M. Campbell, Phil Barker and R. John Robertson

Between 2002 and 2010 the UK Joint Information Systems Committee (JISC)1 funded a wide range of development programmes with the aim of improving access within the UK Further and Higher Education (F/HE) sector to content produced by F/HE institutions and to establish policies and technical infrastructure to facilitate its discovery and use. The Centre for Educational Technology and Interoperability Standards (CETIS)2 is a JISC innovation support centre that provides technical and strategic support and guidance to the JISC development programmes and the F/HE sector. CETIS contributed to scoping the technical requirements of the programmes summarised here.

Programmes such as Exchange for Learning (X4L, 2002 – 2006)3 focused on the creation of reusable learning resources and tools to facilitate their production and management, while Re-purposing and Re-use of Digital University-level Content4 (RePRODUCE, 2008 – 2009) aimed to encourage the re-use of high quality externally produced materials and to facilitate the transfer of learning content between institutions. At the same time the Digital Repositories5 (2005 – 2007) and Repositories Preservation Programmes6 (2006 – 2009) focused on establishing technical infrastructure within institutions and across the sector.

These programmes were informed by a strategic and technical vision which was expressed through initiatives including the e-Learning Framework7, the e-Framework8, the Information Environment Technical Architecture9 and the Digital Repositories Roadmap10. The IE Architecture for example sought to “specify a set of standards and protocols intended to support the development and delivery of an integrated set of networked services that allowed the end-user to discover, access, use and publish digital and physical resources as part of their learning and research activities.”

These programmes and initiatives have met with varying degrees of success across the different sectors of the UK F/HE community. The rapid growth in the number of open access institutional repositories of scholarly works, including both journal papers and e-theses, may be attributed directly to the impact of JISC funding and policy. The number of open access institutional repositories has approximately doubled since 2007, to 172 currently11. Arguably there has been less success supporting and facilitating access to teaching and learning materials. Although the number of repositories of teaching and learning materials is growing slowly, few institutions have policies for managing these resources. Indeed one of the final conclusions of the Repositories and Preservation Programme Advisory Group, which advised the JISC repositories programmes, was that teaching and learning resources have not been served well by the debate about institutional repositories seeking to cover both open access to research outputs and management of teaching and learning materials, as the issues relating to their use and management are fundamentally different12. The late Rachel Heery also commented that greater value may be derived from programmes that focus more on achieving strategic objectives (e.g. improving access to resources) and less on a specific technology to meet these objectives (e.g. repositories). In addition, the findings of the RePRODUCE Programme13 suggested that projects had significantly underestimated the difficulty of finding high quality teaching and learning materials that were suitable for copyright clearance and reuse.

Rather than a radical shift in policy, these conclusions should be regarded as reflecting a gradual development in policy, licensing and technology right across the web. This includes the advent of web 2.0, the appearance of media-specific dissemination platforms such as SlideShare, YouTube, Flickr and iTunes U, interaction through RESTful APIs, OpenID, OAuth and other web-wide technologies, increasing acceptance of Creative Commons licenses and the rise of the OER movement. As a result there has been a movement away from developing centralised, education-specific tools and services and towards the integration of institutional systems with applications and services scattered across the web. Furthermore there has been growing awareness of the importance of the web itself as a technical architecture, as opposed to a simple interface or delivery platform.

These developments are reflected in current JISC development programmes, where the priority is less about using a particular technology (e.g. repositories) or implementing a particular standard and more about getting useful, usable content out to the UK F/HE community and beyond by whatever means possible. The JISC / Higher Education Academy Open Educational Resources Pilot Programme14 (OER, 2009 – 2010) is a case in point. To illustrate how both strategic policy and technology have developed it is interesting to compare and contrast the 2002 X4L Programme and the current OER Pilot Programme.

X4L Programme 2002 – 2006

The X4L programme aimed to explore the re-purposing of existing content suitable for use in learning. Part of this activity was to explore the process of integrating interoperable learning objects with VLEs. A small number of tools projects were funded to facilitate this task: an assessment management system (TOIA), a content packaging tool (Reload) and a learning object repository (Jorum). Projects were given a strong steer to use interoperability standards such as IMS QTI, IMS Content Packaging, ADL SCORM and IEEE LOM. A mandatory application profile of the IEEE LOM was developed for the programme and formal subject classification vocabularies were identified, including JACS and Dewey. Projects were strongly recommended to deposit their content in the Jorum repository and institutions were required to sign formal licence agreements before doing so. Access to content deposited in Jorum was restricted to UK F/HE institutions only.
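To give a flavour of what the X4L-era requirements meant in practice, here is a sketch of a minimal IEEE LOM record built with ElementTree. The element names follow the LOM XML binding, but the values are invented and this is not the actual X4L application profile:

```python
import xml.etree.ElementTree as ET

LOM_NS = "http://ltsc.ieee.org/xsd/LOM"
ET.register_namespace("", LOM_NS)


def el(parent, tag, text=None):
    """Add a namespaced child element, optionally with text content."""
    node = ET.SubElement(parent, f"{{{LOM_NS}}}{tag}")
    if text is not None:
        node.text = text
    return node


lom = ET.Element(f"{{{LOM_NS}}}lom")

general = el(lom, "general")
title = el(general, "title")
el(title, "string", "Introducing fractions").set("language", "en")

technical = el(lom, "technical")
el(technical, "format", "application/zip")  # e.g. an IMS content package

# Classification against a formal vocabulary (JACS, Dewey, etc.), as the X4L
# application profile required; the source and taxon values here are invented.
classification = el(lom, "classification")
taxon_path = el(classification, "taxonPath")
source = el(taxon_path, "source")
el(source, "string", "JACS").set("language", "en")
taxon = el(taxon_path, "taxon")
el(taxon, "id", "G100")

print(ET.tostring(lom, encoding="unicode"))
```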

OER Pilot Programme 2009 – 2010

The aim of the OER Pilot Programme is to make a significant volume of existing teaching and learning resources freely available online, licensed in such a way as to enable them to be reused worldwide. Projects may release any kind of content in any format and, although projects are encouraged to use open standards where applicable, proprietary formats are also acceptable. CETIS advised projects on the type of information they should record about their resources but not how to go about recording it. There is no programme-specific metadata application profile and no formal metadata standard or vocabularies have been recommended. The only mandatory metadata that projects were directed to record was the programme tag #ukoer. Projects were given free rein to use any dissemination platform they chose provided that the content is freely available and under an open licence. In addition, projects must also represent their resources in JorumOpen, either by linking or through direct deposit. All resources represented in JorumOpen are freely available worldwide and released under Creative Commons licences.

During the course of the OER Pilot Programme CETIS have interviewed all 29 projects to record their technical choices and the issues that have surfaced. This information has been recorded in the CETIS PROD15 system and has been synthesised in a series of blog posts16. CETIS is also undertaking additional exploratory work to investigate different methods of aggregating and tracking resources produced by the OER Programme. The contrast between the two programmes is marked and the success or otherwise of the technical approach adopted by the OER Pilot Programme remains to be seen. The programme concludes in April 2010 and a formal programme level synthesis and evaluation is already underway.
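One of the simpler aggregation approaches imaginable is just to watch projects’ RSS or Atom feeds for the programme tag. The sketch below is an illustration of that idea rather than the method CETIS actually used; the feed URL is invented and feedparser is a third-party library:

```python
import feedparser  # third-party feed parsing library

# Hypothetical project feed; any RSS or Atom feed whose items carry tag terms will do.
FEED_URL = "http://example.ac.uk/oer/feed"
PROGRAMME_TAG = "ukoer"

feed = feedparser.parse(FEED_URL)
for entry in feed.entries:
    # Collect the entry's category/tag terms, normalising case and a leading "#".
    terms = {t.get("term", "").lower().lstrip("#") for t in entry.get("tags", [])}
    if PROGRAMME_TAG in terms:
        print(entry.get("title", "(untitled)"), "->", entry.get("link"))
```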

References
1. Joint Information Systems Committee, JISC, http://www.jisc.ac.uk
2. Centre for Educational Technology and Interoperability Standards, CETIS, http://www.cetis.org.uk
3. Exchange for Learning Programme, X4L, http://www.jisc.ac.uk/whatwedo/programmes/x4l.aspx
4. Re-purposing and Re-use of Digital University-level Content Programme, RePRODUCE, http://www.jisc.ac.uk/whatwedo/programmes/elearningcapital/reproduce.aspx
5. Digital Repositories Programme, http://www.jisc.ac.uk/whatwedo/programmes/digitalrepositories2005.aspx
6. Repositories Preservation Programmes, http://www.jisc.ac.uk/whatwedo/programmes/reppres.aspx
7. E-Learning Framework, http://www.elframework.org/
8. E-Framework, http://www.e-framework.org/
9. JISC Information Environment Technical Architecture, http://www.jisc.ac.uk/whatwedo/themes/informationenvironment/iearchitecture.aspx
10. Digital Repositories Roadmap, http://www.jisc.ac.uk/whatwedo/themes/informationenvironment/reproadmaprev.aspx
11. The Directory of Open Access Repositories, OpenDOAR, http://www.opendoar.org/
12. Exclude Teaching and Learning Materials from the Open Access Repositories Debate. Discuss, http://blogs.cetis.org.uk/lmc/2008/10/27/exclude-teaching-and-learning-materials-from-the-open-access-repositories-debate-discuss/
13. RePRODUCE Programme Summary Report, http://www.jisc.ac.uk/media/documents/programmes/elreproduce/jisc_programme_summary_report_reproduce.doc
14. JISC / Higher Education Academy Open Educational Resources Pilot Programme, http://www.jisc.ac.uk/whatwedo/programmes/elearning/oer.aspx
15. CETIS PROD, monitoring projects, software and standards, http://prod.cetis.org.uk/query.php?theme=UKOER
16. John’s JISC CETIS Blog, http://blogs.cetis.org.uk/johnr/category/ukoer/
17. OER Synthesis and Evaluation Project, http://www.caledonianacademy.net/spaces/oer/

When is Linked Data not Linked Data? – A summary of the debate
Tue, 16 Mar 2010
http://blogs.cetis.org.uk/lmc/2010/03/16/when-is-linked-data-not-linked-data-a-summary-of-the-debate/

One of the activities identified during last December’s Semantic Technology Working Group meeting to be taken forward by CETIS was the production of a briefing paper that disambiguated some of the terminology for those who are less familiar with this domain. The following terms in particular were highlighted:

  • Semantic Web
  • semantic technologies
  • Linked Data
  • linked data
  • linkable data
  • Open Data

I’ve finally started drafting this briefing paper and, unsurprisingly, defining the above terms is proving to be a non-trivial task! Pinning down agreed definitions for Linked Data, linked data and linkable data is particularly problematic. And I’m not the only one having trouble. If you look up Semantic Web and Linked Data / linked data on Wikipedia you will find entries flagged as having multiple issues. It does rather feel like we’re edging close to holy war territory here. But having said that, I do enjoy a good holy war, as long as I’m watching safely from the sidelines.

So what’s it all about? As far as I can make out, much of the debate boils down to whether Linked Data must adhere to the four principles outlined in Tim Berners-Lee’s Linked Data Design Issues, and in particular whether use of RDF and SPARQL is mandatory. Some argue that RDF is integral to Linked Data; others suggest that while it may be desirable, use of RDF is optional rather than mandatory. Some reserve the capitalized term Linked Data for data that is based on RDF and SPARQL, preferring lower case “linked data”, or “linkable data”, for data that uses other technologies.

The fact that the Linked Data Design Issues paper is a personal note by Tim Berners-Lee, and is not formally endorsed by the W3C, also contributes to the ambiguity. The note states:

  1. Use URIs as names for things
  2. Use HTTP URIs so that people can look up those names.
  3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)
  4. Include links to other URIs. so that they can discover more things.

I’ll refer to the steps above as rules, but they are expectations of behaviour. Breaking them does not destroy anything, but misses an opportunity to make data interconnected. This in turn limits the ways it can later be reused in unexpected ways. It is the unexpected re-use of information which is the value added by the web. (Berners-Lee, http://www.w3.org/DesignIssues/LinkedData.html)
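To see what is at stake in the “using the standards (RDF, SPARQL)” clause, the sketch below publishes the same facts both ways: first as RDF triples with HTTP URIs (using the third-party rdflib library and invented URIs), then as the kind of plain “linkable” JSON that the looser reading would also admit.

```python
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, FOAF

# "Linked Data" on the stricter reading: HTTP URIs plus the RDF model.
EX = Namespace("http://example.org/id/")
g = Graph()
g.add((EX.project42, RDF.type, FOAF.Project))
g.add((EX.project42, FOAF.name, Literal("Widget interoperability demonstrator")))
g.add((EX.project42, FOAF.homepage, URIRef("http://example.org/projects/42")))
print(g.serialize(format="turtle"))

# "linkable data" on the looser reading: the same facts as plain JSON with
# resolvable URIs, but no shared data model and nothing a SPARQL endpoint
# could query directly.
linkable = {
    "id": "http://example.org/id/project42",
    "type": "Project",
    "name": "Widget interoperability demonstrator",
    "homepage": "http://example.org/projects/42",
}
```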

In the course of trying to untangle some of the arguments both for and against the necessity of using RDF and SPARQL I’ve read a lot of very thoughtful blog posts, which it may be useful to link to here for future reference. Clearly these are not the only, or indeed the most recent, posts that discuss this most topical of topics; they happen to be the ones I have read and which I believe present a balanced overview of the debate in such a way as to be of relevance to the JISC CETIS community.

Linked data vs. Web of data vs. …
– Andy Powell, Eduserv, July 2009

The first useful post I read on this particular aspect of the debate is Andy Powell’s from July 2009. This post resulted from the following question Andy raised on twitter:

is there an agreed name for an approach that adopts the 4 principles of #linkeddata minus the phrase, “using the standards (RDF, SPARQL)” ??

Andy was of the opinion that Linked Data “implies use of the RDF model – full stop” adding:

“it’s too late to re-appropriate the “Linked Data” label to mean anything other than “use http URIs and the RDF model”.”

However he is unable to provide a satisfactory answer to his own question (i.e. what do you call linked data that does not use the RDF model?) and, despite exploring alternative models, he concludes by professing himself to be worried about this.

Andy returned to this theme in a more recent post in January 2010, Readability and linkability, which ponders the relative emphasis given to readability and linkability by initiatives such as the JISC Information Environment. Andy’s general principles have not changed, but he presents the term machine readable data (MRD) as a potential answer to the question he originally asked in his earlier post.

Does Linked Data need RDF?
– Paul Miller, The Cloud of Data, July 2009

Paul Miller’s post is partially a response to Andy’s query. Paul begins by noting that while RDF is key to the Semantic Web and

“an obvious means of publishing — and consuming — Linked Data powerfully, flexibly, and interoperably.”

he is uneasy about conflating RDF with Linked Data and with assertions that

“‘Linked Data’ can only be Linked Data if expressed in RDF.”

Paul discusses the wording and status of Tim Berners-Lee’s Linked Data Design Issues and suggests that it can be read either way. He then goes on to argue that by elevating RDF from the best mechanism for achieving Linked Data to the only permissible approach, we risk barring a large group

“with data to share, a willingness to learn, and an enthusiasm to engage.”

Paul concludes by asking the question:

“What are we after? More Linked Data, or more RDF? I sincerely hope it’s the former.”

No data here – just Linked Concepts and Linked, open, semantic?
– Paul Walk, UKOLN, July & November 2009

Paul Walk has published two useful posts on this topic; the first summarising and commenting on the debate sparked by the two posts above, and the second following the Giant Global Graph session at the CETIS 2009 Conference. This latter post presents a very useful attempt at disambiguating the terms Open Data, Linked Data and Semantic Web. Paul also tries to untangle the relationship between these three memes and helpfully notes:

  • data can be open, while not being linked
  • data can be linked, while not being open
  • data which is both open and linked is increasingly viable
  • the Semantic Web can only function with data which is both open and linked

So What Is It About Linked Data that Makes it Linked Data™?
– Tony Hirst, Open University, March 2010

Much more recently Tony Hirst published this post, which begins with a version of the four Linked Data principles taken from Wikipedia. This particular version makes no mention of either RDF or SPARQL. Tony goes on to present a very neat example of data linked using HTTP URIs and Yahoo Pipes and asks

“So, the starter for ten: do we have an example of Linked Data™ here?”

Tony broadly believes the answer is yes and is of a similar opinion to Paul Miller that too rigid adherence to RDF and SPARQL

“will put a lot of folk who are really excited about the idea of trying to build services across distributed (linkable) datasets off…”

Perhaps more controversially, Tony questions the necessity of universal unique URIs that resolve to content, suggesting that:

“local identifiers can fulfil the same role if you can guarantee the context as in a Yahoo Pipe or a spreadsheet”

Tony signs off with:

“My name’s Tony Hirst, I like linking things together, but RDF and SPARQL just don’t cut it for me…”

Meshing up a JISC e-learning project timeline, or: It’s Linked Data on the Web, stupid
– Wilbert Kraan, JISC CETIS, March 2009

Back here at CETIS Wilbert Kraan has been experimenting with linked data meshups of JISC project data held in our PROD system. In contrast to the approach taken by Tony, Wilbert goes down the RDF and SPARQL route. Wilbert confesses that he originally believed that:

“SPARQL endpoints were these magic oracles that we could ask anything about anything.”

However his attempts to mesh up real data sets on the web highlighted the fact that SPARQL has no federated search facility.

“And that the most obvious way of querying across more than one dataset – pulling in datasets from outside via SPARQL’s FROM – is not allowed by many SPARQL endpoints. And that if they do allow FROM, they frequently cr*p out.”
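To make that concrete, the kind of cross-dataset query Wilbert is describing looks something like the sketch below, which uses the third-party SPARQLWrapper library against an invented endpoint and dataset. Whether any given endpoint actually honours the FROM clause is, as he found, exactly the problem.

```python
from SPARQLWrapper import SPARQLWrapper, JSON

# Hypothetical endpoint and dataset URLs. The point is the FROM clause, which
# asks the endpoint to pull in an external RDF document; many public endpoints
# refuse to do this, or time out trying.
endpoint = SPARQLWrapper("http://sparql.example.org/endpoint")
endpoint.setQuery("""
    PREFIX doap: <http://usefulinc.com/ns/doap#>
    SELECT ?project ?name
    FROM <http://example.org/data/jisc-projects.rdf>
    WHERE {
        ?project a doap:Project ;
                 doap:name ?name .
    }
    LIMIT 10
""")
endpoint.setReturnFormat(JSON)
results = endpoint.query().convert()
for row in results["results"]["bindings"]:
    print(row["name"]["value"], row["project"]["value"])
```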

Wilbert concludes that:

“The consequence is that exposing a data set as Linked Data is not so much a matter of installing a SPARQL endpoint, but of serving sensibly factored datasets in RDF with cool URLs, as outlined in Designing URI Sets for the UK Public Sector (pdf).”

And in response to a direct query regarding the necessity of RDF and SPARQL to Linked Data, Wilbert answered:

“SPARQL and RDF are a sine qua non of Linked Data, IMHO. You can keep the label, widen the definition out, and include other things, but then I’d have to find another label for what I’m interested in here.”

Which kind of brings us right back to the question that Andy Powell asked in July 2009!

So there you have it. A fascinating but currently inconclusive debate I believe. Apologies for the length of this post. Hopefully one day this will go on to accompany our “Semantic Web and Linked Data” briefing paper.
