Lorna Campbell » paradata – Cetis Blog
http://blogs.cetis.org.uk/lmc

New Activity Data and Paradata Briefing Paper
http://blogs.cetis.org.uk/lmc/2013/05/01/new-activity-data-and-paradata-briefing-paper/
Wed, 01 May 2013

Cetis have published a new briefing paper on Activity Data and Paradata. The paper presents a concise overview of a range of approaches and specifications for recording and exchanging data generated by the interactions of users with resources.

Such data is a form of Activity Data, which can be defined as “the record of any user action that can be logged on a computer”. Meaning can be derived from Activity Data by querying it to reveal patterns and context, a process often referred to as analytics. Activity Data can be shared as an Activity Stream, a list of recent activities performed by an individual. Activity Streams are often specific to a particular platform or application, e.g. Facebook; however, initiatives such as OpenSocial, ActivityStreams and the Tin Can API have produced specifications and APIs to share Activity Data across platforms and applications.
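To give a concrete flavour of the actor–verb–object pattern these specifications share, the sketch below builds a single activity loosely following the Activity Streams 1.0 JSON serialisation (the person, resource and verb values are all invented for illustration):

```python
import json

# A minimal Activity Streams 1.0 style activity (illustrative values only).
# An activity records who (actor) did what (verb) to what (object), and when.
activity = {
    "published": "2013-05-01T14:53:08Z",
    "actor": {
        "objectType": "person",
        "id": "urn:example:person:martin",
        "displayName": "Martin Smith",
    },
    "verb": "bookmark",
    "object": {
        "objectType": "article",
        "url": "http://example.org/resources/intro-to-paradata",
        "displayName": "An Introduction to Paradata",
    },
}

# An Activity Stream is then simply a list of such items for one user.
stream = {"items": [activity]}
print(json.dumps(stream, indent=2))
```

Because the serialisation is plain JSON, streams like this can be exchanged between platforms without either end knowing anything about the other's internal data model.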

While Activity Streams record the actions of individual users and their interactions with multiple resources and services, other specifications have been developed to record the actions of multiple users on individual resources. This data about how and in what context resources are used is often referred to as Paradata. Paradata complements formal metadata by providing an additional layer of contextual information about how resources are being used. A specification for recording and exchanging paradata has been developed by the Learning Registry, an open source content-distribution network for storing and sharing information about learning resources.
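A paradata statement inverts the Activity Stream perspective: the resource is the fixed point and the actor is typically an aggregate group rather than an individual. The sketch below follows the general actor/verb/object shape of the Learning Registry paradata specification (this is a simplified illustration, and all the values are invented):

```python
import json

# A paradata statement, loosely following the Learning Registry's
# actor / verb / object pattern (all values here are invented).
# It asserts that a group of users acted on a resource in some context.
paradata = {
    "activity": {
        "actor": {
            "objectType": "teacher",
            "description": ["secondary", "mathematics"],
        },
        "verb": {
            "action": "taught",
            # An aggregate measure: how often the action occurred.
            "measure": {"measureType": "count", "value": 14},
            "date": "2013-01-01/2013-03-31",
        },
        "object": "http://example.org/resources/intro-to-algebra",
    }
}
print(json.dumps(paradata, indent=2))
```

Note how the statement describes many users' interactions with one resource, where an Activity Stream describes one user's interactions with many resources.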

The briefing paper provides an overview of each of these approaches and specifications along with examples of implementations and links to further information.

The Cetis Activity Data and Paradata briefing paper written by Lorna M. Campbell and Phil Barker can be downloaded from the Cetis website here: http://publications.cetis.org.uk/2013/808

CETIS at OER13
http://blogs.cetis.org.uk/lmc/2013/03/21/cetis-at-oer13/
Thu, 21 Mar 2013

I was really encouraged to hear from our CETIS13 keynote speaker Patrick McAndrew that next week’s OER13 conference in Nottingham is shaping up to be the biggest yet. In our Open Practice and OER Sustainability session Patrick mentioned that the organising committee had expected numbers to be down from last year, as the 2012 conference had been run in conjunction with OCWC and attracted a considerable number of international delegates, and because UKOER funding has come to an end. In actual fact numbers have risen significantly. I can’t remember the exact figure Patrick quoted, but I’m sure he said that over 200 delegates were expected to attend this year. This is good news, as it does rather suggest that the UKOER programmes have had some success in developing and embedding open educational practice. It’s also good news for us, because CETIS are giving three (count ‘em!) presentations at this year’s conference :}

The Learning Registry: social networking for open educational resources?
Authors: Lorna M. Campbell, Phil Barker, CETIS; Sarah Currier, Nick Syrotiuk, Mimas
Presenters: Lorna M. Campbell, Sarah Currier
Tuesday 26 March, 14:00-14:30, Room: B52
Full abstract here.

This presentation will reflect on CETIS’ involvement with the Learning Registry, JISC’s Learning Registry Node Experiment at Mimas (The JLeRN Experiment), and their potential application to OER initiatives. Initially funded by the US Departments of Education and Defense, the Learning Registry (LR) is an open source network for storing and distributing metadata and curriculum, activity and social usage data about learning resources across diverse educational systems. The JLeRN Experiment was commissioned by JISC to explore the affordances of the Learning Registry for the UK F/HE community within the context of the HEFCE funded UKOER programmes.

An overview of approaches to the description and discovery of Open Educational Resources
Authors: Phil Barker, Lorna M. Campbell and Martin Hawksey, CETIS
Presenter: Phil Barker
Tuesday 26 March, 14:30-15:00, Room: B52
Full abstract here.

This presentation will report and reflect on the innovative technical approaches adopted by UKOER projects to resource description, search engine optimisation and resource discovery. The HEFCE UKOER programmes ran for three years from 2009 to 2012 and funded a large number and variety of projects focused on releasing OERs and embedding open practice. The CETIS Innovation Support Centre was tasked by JISC with providing strategic advice, technical support and direction throughout the programme. One constant across the diverse UKOER projects was their desire to ensure the resources they released could be discovered by people who might benefit from them: if no one can find an OER, no one will use it. This presentation will focus on three specific approaches with potential to achieve this aim: search engine optimisation, embedding metadata in the form of schema.org microdata, and sharing “paradata”, information about how resources are used.
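As a rough illustration of the microdata approach, the fragment below embeds schema.org properties (including LRMI terms such as learningResourceType) in a resource page and extracts them with Python's standard-library HTML parser; the resource itself is invented, and this is a sketch rather than a full microdata processor:

```python
from html.parser import HTMLParser

# An HTML fragment with schema.org/LRMI microdata embedded via <meta> tags.
# The property names are real schema.org terms; the resource is invented.
HTML = """
<div itemscope itemtype="http://schema.org/CreativeWork">
  <meta itemprop="name" content="Introduction to Paradata">
  <meta itemprop="learningResourceType" content="presentation">
  <meta itemprop="typicalAgeRange" content="18-">
</div>
"""

class ItempropExtractor(HTMLParser):
    """Collect itemprop/content pairs from <meta> microdata tags."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and "itemprop" in attrs:
            self.properties[attrs["itemprop"]] = attrs.get("content")

parser = ItempropExtractor()
parser.feed(HTML)
print(parser.properties)
```

The attraction of this approach for discovery is that the same properties a local aggregator might extract are also visible to search engine crawlers, so one round of markup serves both purposes.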

Writing in Book Sprints
Authors: Phil Barker, Lorna M Campbell, Martin Hawksey, CETIS; Amber Thomas, University of Warwick.
Presenter: Phil Barker
Wednesday 27 March, 11:00-11:15, Room: A25
Full abstract here.

This lightning talk will outline a novel approach taken by JISC and CETIS to synthesise and disseminate the technical outputs and findings of three years of HEFCE funded UK OER Programmes. Rather than employing a consultant to produce a final synthesis report, the authors decided to undertake the task themselves by participating in a three-day book sprint facilitated by Adam Hyde of booksprints.net. Over the course of the three days the authors wrote and edited a complete draft of a 21,000-word book titled “Technology for Open Educational Resources: Into the Wild – Reflections of three years of the UK OER programmes”. While the authors all had considerable experience of the technical issues and challenges surfaced by the UK OER programmes, and had blogged extensively about these topics, it was a challenge to write a large, coherent volume of text in such a short period. By employing the book sprint methodology and the Booktype open source book authoring platform, the editorial team were able to meet that challenge.

inBloom to implement Learning Registry and LRMI
http://blogs.cetis.org.uk/lmc/2013/02/08/inbloom-to-implement-learning-registry-and-lrmi/
Fri, 08 Feb 2013

There have been a number of reports in the tech press this week about inBloom, a new technology integration initiative for the US schools’ sector launched by the Shared Learning Collective. inBloom is “a nonprofit provider of technology services aimed at connecting data, applications and people that work together to create better opportunities for students and educators,” and it’s backed by a cool $100 million of funding from the Carnegie Corporation and the Bill and Melinda Gates Foundation. In the press release, Iwan Streichenberger, CEO of inBloom Inc, is quoted as saying:

“Education technology and data need to work better together to fulfill their potential for students and teachers. Until now, tackling this problem has often been too expensive for states and districts, but inBloom is easing that burden and ushering in a new era of personalized learning.”

This initiative first came to my attention when Sheila circulated a TechCrunch article earlier in the week. Normally any article that quotes both Jeb Bush and Rupert Murdoch would have me running for the hills, but Sheila is made of sterner stuff and dug a bit deeper to find the inBloom Learning Standards Alignment whitepaper. And this is where things get interesting, because inBloom incorporates two core technologies that CETIS has had considerable involvement with over the last while: the Learning Registry and the Learning Resource Metadata Initiative (LRMI), which Phil Barker has contributed to as co-author and Technical Working Group member.

I’m not going to attempt to summarise the entire technical architecture of inBloom; however, the core components are:

  • Data Store: Secure data management service that allows states and districts to bring together and manage student and school data and connect it to learning tools used in classrooms.
  • APIs: Provide authorized applications and school data systems with access to the Data Store.
  • Sandbox: A publicly-available testing version of the inBloom service where developers can test new applications with dummy data.
  • inBloom Index: Provides valuable data about learning resources and learning objectives to inBloom-compatible applications.
  • Optional Starter Apps: A handful of apps to get educators, content developers and system administrators started with inBloom, including a basic dashboard and data and content management tools.

Of the above components, it’s the inBloom Index that is of most interest to me, as it appears to be a service built on top of a dedicated inBloom Learning Registry node, which in turn connects to the wider Learning Registry network, as illustrated below.

inBloom Learning Resource Advertisement and Discovery

According to the Standards Alignment whitepaper, the inBloom Index will work as follows (apologies for the long techy quote; it’s interesting, I promise you!):

The inBloom Index establishes a link between applications and learning resources by storing and cataloging resource descriptions, allowing the described resources to be located quickly by the users who seek them, based in part on the resources’ alignment with learning standards. (Note, in this context, learning standards refers to curriculum standards such as the Common Core.)

inBloom’s Learning Registry participant node listens to assertions published to the Learning Registry network, consolidating them in the inBloom Index for easy access by applications. The usefulness of the information collected depends upon content publishers, who must populate the Learning Registry with properly formatted and accurately “tagged” descriptions of their available resources. This information enables applications to discover the content most relevant to their users.

Content descriptions are introduced into the Learning Registry via “announcement” messages sent through a publishing node. Learning Registry nodes, including inBloom’s Learning Registry participant node, may keep the published learning resource descriptions in local data stores, for later recall. The registry will include metadata such as resource locations, LRMI-specified classification tags, and activity-related tags, as described in Section 3.1.

The inBloom Index has an API, called the Learning Object Dereferencing Service, which is used by inBloom technology-compatible applications to search for and retrieve learning object descriptions (of both objectives and resources). This interface provides a powerful vocabulary that supports expression of either precise or broad search parameters. It allows applications, and therefore users, to find resources that are most appropriate within a given context or expected usage.

inBloom’s Learning Registry participant node is peered with other Learning Registry nodes so that it can receive resource description publications, and filters out announcements received from the network that are not relevant.

In addition, it is expected that some inBloom technology-compatible applications, depending on their intended functionality, will contribute information to the Learning Registry network as a whole, and therefore indirectly feed useful data back into the inBloom Index. In this capacity, such applications would require the use of the Learning Registry participant node.

One reason that this is so interesting is that this is exactly the way the Learning Registry was designed to work. It was always intended that the Learning Registry would provide a layer of “plumbing” to let the data flow: education providers would push any kind of data into the Learning Registry network, and developers would create services built on top of it to process and expose the data in ways that are meaningful to their stakeholders. Phil and I have both written a number of blog posts on the potential of this approach for dealing with messy educational content data, but one of our reservations has been that it has never been tested at scale. If inBloom succeeds in implementing their proposed technical architecture it should address these reservations; however, I can’t help noticing that, to some extent, this model is predicated on there being an existing network of Learning Registry nodes populated with a considerable volume of educational content data, and as far as I’m aware, that isn’t yet the case.
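For a sense of what that “plumbing” looks like in practice, the whitepaper’s “announcement” messages correspond to resource data documents published to a node. The sketch below builds a simplified, unsigned version of such a document (real submissions are cryptographically signed and POSTed as a documents list to a node’s publish endpoint; the identity, TOS and resource values here are all invented):

```python
import json

# A simplified Learning Registry "resource data" document (unsigned,
# with invented identity/TOS/resource values). Field names follow the
# general shape of the Learning Registry document model.
envelope = {
    "doc_type": "resource_data",
    "doc_version": "0.23.0",
    "resource_data_type": "metadata",   # or "paradata" for usage data
    "active": True,
    "identity": {"submitter": "Example College", "submitter_type": "agent"},
    "TOS": {"submission_TOS": "http://example.org/tos"},
    "resource_locator": "http://example.org/resources/intro-to-algebra",
    "payload_placement": "inline",
    "payload_schema": ["LRMI"],
    # The payload itself can be metadata or paradata in any format;
    # services consuming the node's data do the interpretation.
    "resource_data": {
        "name": "Introduction to Algebra",
        "learningResourceType": "lesson",
    },
}

# Submissions wrap one or more documents in a single message.
submission = {"documents": [envelope]}
print(json.dumps(submission, indent=2))
```

The key design point is visible in `resource_data`: the envelope constrains only the routing and provenance fields, while the payload is deliberately left open, which is exactly the flexibility discussed below.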

I’m also rather curious about the whitepaper’s assertion that:

“The usefulness of the information collected depends upon content publishers, who must populate the Learning Registry with properly formatted and accurately “tagged” descriptions of their available resources.”

While this is certainly true, it’s also rather contrary to one of the original goals of the Learning Registry, which was to be able to ingest data in any format, regardless of schema. Of course the result of this “anything goes” approach to data aggregation is that the bulk of the processing is pushed up to the services and applications layer. So any service built on top of the Learning Registry will have to do the bulk of the data processing to spit out meaningful information. The JLeRN Experiment at Mimas highlighted this as one of their concerns about the Learning Registry approach, so it’s interesting to note that inBloom appears to be pushing some of that processing, not down to the node level, but out to the data providers. I can understand why they are doing this, but it potentially means that they will lose some of the flexibility that the Learning Registry was designed to accommodate.

Another interesting aspect of the inBloom implementation is that the more detailed technical architecture in the voluminous Developer Documentation indicates that at least one component of the Data Store, the Persistent Database, will be running on MongoDB, as opposed to the CouchDB used by the Learning Registry. Both are schema-free document databases, but to be honest I don’t know how their functionality differs.

inBloom Technical Architecture

In terms of the metadata, inBloom appears to be mandating the adoption of LRMI as their primary metadata schema.

When scaling up teams and tools to tag or re-tag content for alignment to the Common Core, state and local education agencies should require that LRMI-compatible tagging tools and structures be used, to ensure compatibility with the data and applications made available through the inBloom technology.

A profile of the Learning Registry paradata specification will also be adopted but as far as I can make out this has not yet been developed.

It is important to note that while the Paradata Specification provides a framework for expressing usage information, it may not specify a standardized set of actors or verbs, or inBloom.org may produce a set that falls short of enabling inBloom’s most compelling use cases. inBloom will produce guidelines for expression of additional properties, or tags, which fulfill its users’ needs, and will specify how such metadata and paradata will conform to the LRMI and Learning Registry standards, as well as to other relevant or necessary content description standards.

All very interesting. I suspect that, with the volume of Gates and Carnegie funding backing inBloom, we’ll be hearing a lot more about this development and, although it may have no direct impact on the UK F/HE sector, it is going to be very interesting to see whether the technologies inBloom adopts, and the Learning Registry in particular, can really work at scale.

PS I haven’t had a look at the parts of the inBloom spec that cover assessment but Wilbert has noted that it seems to be “a straight competitor to the Assessment Interoperability Framework that the Obama administration Race To The Top projects are supposed to be building now…”
