CETIS Gathering

At the end of June we ran an event about technical approaches to gathering open educational resources. Our intent was to provide space and facilities for people to come and talk about these issues, without prescribing anything like a schedule of presentations or discussion topics. So, people came, but what did they talk about?

In the morning we had a large group discussing approaches to aggregating resources and information about them through feeds such as RSS or Atom, and another smaller group discussing tracking what happens to OER resources once they are released.

I wasn’t part of the larger discussion, but I gather that they were interested in the limits of what can be brought in by RSS and the difficulties due to the (shall we say) flexible semantics of the elements typically used in RSS, even when extended in the typical way with Dublin Core. They would like to bring in information which was more tightly defined, and also information from a broader range of sources relating to the actual use of the resource. They would also like to identify the contents of resources at a finer granularity (e.g. an image or movie rather than a lesson) while retaining the context of the larger resource. These are perennial issues, and bring to my mind technologies such as OAI-PMH with metadata richer than the default Dublin Core, Dublin Core Terms (in contrast to the Dublin Core Element Set), OAI-ORE, and projects such as PerX and TicToCs (see JournalToCs) (just to mention two which happened to be based in the same office as me). At CETIS we will continue to explore these issues, but I think it is recognised that the solution is not as simple as adopting a new metadata standard that is in some way better than what we have now.
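To make the "flexible semantics" point concrete, here is a small illustrative sketch (the item, names and URL are invented, not from any real feed) of an RSS item extended with Dublin Core. Nothing constrains the value of an element like `dc:type`, which is exactly why aggregators struggle to get tightly defined information out of such feeds:

```python
import xml.etree.ElementTree as ET

# Namespace prefix mapping for the Dublin Core Element Set,
# as typically used to extend RSS items.
NS = {"dc": "http://purl.org/dc/elements/1.1/"}

# An invented example item; real feeds vary wildly in which
# elements they use and what they put in them.
rss_item = """
<item xmlns:dc="http://purl.org/dc/elements/1.1/">
  <title>Introduction to Thermodynamics</title>
  <link>http://example.org/oer/thermo-101</link>
  <description>A lesson containing slides and a video.</description>
  <dc:creator>A. N. Author</dc:creator>
  <dc:type>Lesson</dc:type>
</item>
"""

item = ET.fromstring(rss_item)
# dc:type is free text: "Lesson", "lesson" and "LearningResource"
# would all be legal here, which is the loose semantics at issue --
# an aggregator cannot rely on it to identify granularity or type.
resource_type = item.find("dc:type", NS).text
print(resource_type)  # Lesson
```

Parsing the element is trivial; interpreting it consistently across hundreds of differently-authored feeds is the hard part discussed above.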

The discussion on tracking resources (summarized here by Michelle Bachler) was prompted by some work from the Open University’s OLNet on Collective Intelligence, and also some CETIS work on tracking OERs. For me the big “take-home” idea was that many individual OER providers and services must have information about the use of their resources which, while interesting in itself, would become really useful if made available more widely. So how about, for example, open usage information about open resources? That could really give us some data to analyse.

There were some interesting overlaps between the two discussions: for example, how to make sure that a resource is identified in such a way that you can track it and gather information about it from many sources, and what role usage information can play in the full description of a resource.

After lunch we had a demo of a search service built by cross-searching web 2.0 resource hosts via their APIs, which has been used by the Engineering Subject Centre’s OER pilot project. This led on to a discussion of the strengths and limitations of this approach: essentially it is relatively simple to implement and can be used to provide a tailored search for a specialised OER collection, so long as the number of targets being searched is reasonably low and their APIs are stable and reliable. The general approach of pulling in information via APIs could be useful in pulling in some of the richer information discussed in the morning. The diversity of APIs led on to another well-rehearsed discussion mentioning SRU and OpenSearch as standard alternatives.
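The cross-search pattern can be sketched in a few lines. This is a toy illustration, not the Engineering Subject Centre's implementation: the two `search_*` functions stand in for real HTTP calls to hosts' APIs, and everything in them is invented. It shows both why the approach is simple and why it scales poorly — every new or changed target API means another hand-written adapter:

```python
# Stub "adapters", one per target host. In a real client each would
# make an HTTP request to that host's search API and normalise the
# response; the results below are invented for illustration.

def search_host_a(query):
    return [{"title": "Beam bending basics", "source": "host-a"}]

def search_host_b(query):
    return [{"title": "Beam bending video", "source": "host-b"}]

def cross_search(query, searchers):
    """Fan the query out to each target API and merge the results.

    Tolerable while the number of targets is low and their APIs are
    stable; each adapter is bespoke, so the maintenance cost grows
    with every target added.
    """
    results = []
    for search in searchers:
        try:
            results.extend(search(query))
        except Exception:
            # One flaky or changed API should not break the whole search.
            continue
    return results

hits = cross_search("beam bending", [search_host_a, search_host_b])
print(len(hits))  # 2
```

Standard query interfaces such as SRU or OpenSearch, mentioned above, are precisely an attempt to replace the per-host adapters with one shared one.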

We also had a demonstration of the iCOPER search / metadata enrichment tool which uses REST, Atom and SPI to allow annotation of metadata records–very interesting as a follow-on from the discussions above which were beginning to see metadata not as a static record but as an evolving body of information associated with a resource.

Throughout the day, but especially after these demos, people were talking in twos and threes, finding out about QTI, Xerte, Cohere, and anything else that one person knew about and others wanted to. I hope people who came found it useful, but it’s very difficult as an organiser of such an event to provide a single definitive summary!

OASIS release Content Management Interoperability Services standard

The OASIS interoperability standards body have recently released the Content Management Interoperability Services specification (CMIS) 1.0. Why are we interested? Well, though it is aimed at content management services, the distinction between an enterprise content management service, a repository and a virtual learning environment is somewhat superficial: a quick look at CMIS will show that the services involved are applicable to both repositories of educational content and VLEs.

From the specification abstract:

The Content Management Interoperability Services (CMIS) standard defines a domain model and Web Services and Restful AtomPub bindings that can be used by applications to work with one or more Content Management repositories/systems.

The domain model conceptualizes CMIS as providing “an interface for an application to access a Repository. To do so, CMIS specifies a core data model that defines the persistent information entities that are managed by the repository”. Transient state information, event information, workflows and user administration are not modelled where these are considered internal to the working of a specific CMS. The data model does define the following entities: repositories (which manage objects), document objects (stand-alone information assets), folder objects (hierarchical collections of objects), relationship objects, policy objects, access control, versioning, query (based on a subset of SQL) and a change log.

A large number of services are defined relating to these objects, but these are pretty much what one would expect: getRepositories returns a list of repositories in the system, getRepositoryInfo returns information about a specified repository; getObject and getProperties do similar for objects; navigation is facilitated by a range of services such as getFolderTree, getObjectParents, getChildren which relate to folders and relationships. There are also services for creating, moving and deleting objects or adding them to folders. Core content management is catered for by services for checking objects in and out, retrieving versions of an object and applying policies.
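To make the navigation services above a little more concrete, here is a toy in-memory model of the folder/document hierarchy — emphatically not a CMIS implementation, just an illustration of what `getChildren` and `getObjectParents` operate over. The service names mirror the spec; the classes, objects and file names are invented:

```python
# Toy model of CMIS's object hierarchy: folders contain documents
# (and other folders); navigation services walk these links.

class CmisObject:
    def __init__(self, name):
        self.name = name
        self.parent = None

class Folder(CmisObject):
    def __init__(self, name):
        super().__init__(name)
        self.children = []

    def add(self, obj):
        obj.parent = self
        self.children.append(obj)

class Document(CmisObject):
    """A stand-alone information asset, per the CMIS domain model."""

def get_children(folder):
    # Analogue of CMIS getChildren: immediate children of a folder.
    return [child.name for child in folder.children]

def get_object_parents(obj):
    # Analogue of CMIS getObjectParents: ancestors up to the root.
    parents = []
    while obj.parent is not None:
        parents.append(obj.parent.name)
        obj = obj.parent
    return parents

root = Folder("root")
lessons = Folder("lessons")
root.add(lessons)
doc = Document("thermo.pdf")
lessons.add(doc)

print(get_children(lessons))    # ['thermo.pdf']
print(get_object_parents(doc))  # ['lessons', 'root']
```

In the real spec these operations are invoked over the AtomPub or Web Services bindings rather than on in-process objects, but the shape of the data they traverse is the same.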

There is a RESTful binding for these services using Atom feeds, AtomPub, and the underlying HTTP GET, PUT and DELETE methods. There is also a Web Services / WSDL binding, which references WS-I, WS-Security and SOAP. To conform to CMIS, clients must implement either the REST or the Web Services binding; repositories, however, must implement both.

There are already some implementations, for example Alfresco. Open Text, working with SAP, have pledged to support CMIS, while EMC, IBM and Microsoft have made a similar announcement.

It will be interesting to see what influence CMIS has on the education sector, e.g. whether there is any uptake of parts of it by VLE vendors or repositories used by the sector. I say “parts of it” because the core spec is a 200+ page document, which in turn references several others, especially for the bindings. One can imagine the reaction when it lands on the desk of a developer for one of our niche repositories or VLEs, commercial or open source, especially when they learn that they have to do both REST and WS bindings. Aside from full-on or partial implementations, it does give us a further point of comparison for interoperability specs in the education domain, to check that we remain in line with the wider world: for example, how does CMIS’s use of AtomPub compare with SWORD and SPI, and how does its use of SQL for query contrast with those standards which come more from the library domain (e.g. CQL in SRU)?

see also
The specification docs HTML, PDF or MS DOC.
Full description on Cover Pages
OASIS CMIS Technical Committee page
WikiPedia article

Additional technical work for UKOER

CETIS has been funded by JISC to do some additional technical work relevant to the UKOER programme. The work will cover three topics: deposit via RSS feeds, aggregation of OERs, and tracking & analysis of OER use.

Feed deposit
There is a need for services hosting OERs to provide a mechanism for depositors to upload multiple resources with minimal human intervention per resource. One possible way to meet this requirement that has already been identified by some projects is “feed deposit”. This approach is inspired by the way in which metadata and content are loaded onto user devices and applications in podcasting. In short, RSS and Atom feeds are capable, in principle, of delivering the metadata required for deposit into a repository, and in addition can provide either a pointer to the content or the content itself embedded in the feed. There are a number of issues with this approach that would need to be overcome.
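A minimal sketch of the ingest side of feed deposit, assuming the podcast-style pattern of one entry per resource with an enclosure link pointing at the content (the feed, titles and URLs below are invented for illustration):

```python
import xml.etree.ElementTree as ET

ATOM = {"atom": "http://www.w3.org/2005/Atom"}

# An invented two-entry Atom feed: each entry carries the metadata
# for one resource, plus an enclosure link to the content itself.
feed_xml = """
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>OER batch deposit</title>
  <entry>
    <title>Lecture 1 slides</title>
    <link rel="enclosure" href="http://example.org/lec1.pdf"
          type="application/pdf"/>
  </entry>
  <entry>
    <title>Lecture 2 slides</title>
    <link rel="enclosure" href="http://example.org/lec2.pdf"
          type="application/pdf"/>
  </entry>
</feed>
"""

def deposits_from_feed(xml_text):
    """Yield (title, content_url) pairs a repository could ingest,
    one per feed entry, with no human intervention per resource."""
    feed = ET.fromstring(xml_text)
    for entry in feed.findall("atom:entry", ATOM):
        title = entry.find("atom:title", ATOM).text
        for link in entry.findall("atom:link", ATOM):
            if link.get("rel") == "enclosure":
                yield title, link.get("href")

items = list(deposits_from_feed(feed_xml))
print(len(items))  # 2
```

Even this toy version surfaces some of the open issues: what to do when an entry has no enclosure, whether to fetch the content or store the pointer, and how to avoid re-depositing entries already seen on a previous poll of the feed.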

In this work we will: (1) Identify projects, initiatives, services, etc. that are engaged in relevant work [–if that’s you, please get in touch]. (2) Identify and validate the issues that would arise with respect to feed deposit, starting with those outlined in the Jorum paper linked to above. (3) Identify current approaches used to address these issues, and identify where consensus may be readily achieved.

Aggregation of OERs
There is interest in facilitating a range of options for the provision of aggregations of resources representing the whole or a subset of the UKOER programme output (possibly along with resources from other sources). There have been some developments that implement solutions based on RSS aggregation, e.g. Ensemble and Xpert; and the UKOLN tagometer measures the number of resources on various sites that are tagged as relevant to the UKOER programme.

In this work we will illustrate and report on other approaches, namely (a) Google custom search, (b) query and result aggregation through Yahoo pipes and (c) querying through the host service APIs. We will document the benefits and affordances as well as drawbacks and limitations of each of these approaches. These include the ease with which they may be adopted, and the technical expertise necessary for their development, their dependency on external services (which may still be in beta), their scalability, etc.

Tracking and analysis of OER use
Monitoring, through technical means, the release of resources through various channels, how those resources are used and reused, and the comments and ratings associated with them, is highly relevant to evaluating the uptake of OERs. CETIS have already described some of the options for resource tracking that are relevant to the UKOER programme.

In this work we will write and commission case studies to illustrate the use of these methods, and synthesise the results learnt from this use.

Who’s involved in this work
The work will be managed by me, Phil Barker, and Lorna M Campbell.

Lisa J Rogers will be doing most of the work related to feed deposit and aggregation of OERs.

R John Robertson will be doing most of the work relating to Tracking and analysis of OER use.

Please do contact us if you’re interested in this work.

Repositories and the Open Web: report

The CETISROW event took place at Birkbeck college, London, on the 19 April 2010, and I have to say it wasn’t really much of a row. There seemed to me to be more agreement on common themes than disagreement, so I’ll try to pull those together in this report, and if anyone disagrees with them there’s a “comment” form at the bottom of this page :-)

Focus on our aims not the means and by “means” I mean repositories. The sort of aims I have in mind are hosting, disseminating (or sharing), organising, and managing resources, facilitating social interaction around resources, and facilitating resource discovery. I was struck by how the Sheen Sharing project (about which Sarah Currier reported) had started by building what their community of users actually wanted and could use at that time, rather than working with early adopters in the hope that they could somehow persuade the mainstream and laggards to follow. Roger Greenhalgh illustrated how wider aims such as social cohesion and knowledge transfer could be fostered through sites focussed on meeting community needs.

One of the participants mentioned at the end how pleased she was that we had progressed to talking in these terms rather than hitting people over the head with all the requirements that come from running a repository. I hope this would please Rachel Heery who, reflecting on various JISC Repositories programmes, made the point a while back that we might get better value from a focus on strategic objectives rather than a specific technology supposed to achieve those objectives.

So, what’s to do if we want to progress with this? We need to be clear about what the requirements are, so there is work to do building on and extending the survey on what people look for when they search online for learning resources from the MeDeV Organising OERs project presented by David Davies, and more work on getting systems to fit with needs–what the EdShare participants call cognitive ergonomics.

There was also a broad theme of working with what is already there, which I think came through in a couple of sub-themes about web-scale systems and web-wide standards.

Firstly there were several accounts of working with existing services to provide hosting or community. Sheen Sharing (see above) did this, as did the Materials and Engineering subject centres’ OER projects that Lisa J Rogers reported on. Joss Winn also reported on using existing tools and communities saying

I don’t think it’s worth developing social features for repositories when there is already an abundance of social software available. It’s a waste of time and effort and the repository scene will never be able to trump the features that the social web scene offers and that people increasingly expect to use.

Perhaps this is where we get closest to disagreement, since the EdShare team have been developing social features for ePrints that mirror those found on Web 2.0 sites. (The comment form is at the bottom…)

Related to this was the second theme of working with the technologies and specifications of web 2.0 sites, most notably RSS/ATOM syndication feeds. Patrick Lockley’s presentation on the Xpert repository was entirely about this, and Lisa Rogers and Sarah Currier both emphasised the importance of RSS (and in Lisa’s case easily-implemented APIs in general) in getting what they had done to work.

So, again, what do we need to do to continue this? Firstly, there was a call to do more to synthesise and disseminate information about what approaches people are trying and what is working, so that other projects can follow the successful pioneers. Secondly, there is potentially work to be done in smoothing the path that is taken; for example, the Xpert project has found many complexities and irregularities in syndication feeds that could perhaps be avoided if we could provide some norms and guidelines for how to use them.

A theme that didn’t quite get discussed, but is nonetheless interesting was around openness. Joss Winn made a very valid distinction between the open web and the social web, one which I had blurred in the build up to the event. So facebook is part of the social web but is by no means open. There was some discussion about whether openness is important in achieving the goals of, e.g., disseminating learning resources. For example, iTunesU is used successfully by many to disseminate pod- and videocasts of lectures, and arguably the vertical integration offered by Apple’s ownership/control of all the levels leads to a better user experience than is the case for some of the alternatives.

All in all, I think we found ourselves broadly in agreement with the outcomes of the ADL Repository and Registries summit, as summarised by Dan Rehak, especially in: the increase in interest in social media and web 2.0 rather than conventional, formal repositories; the focus on understanding what we are really trying to do and finding out what users really want; and in not wanting new standards, especially not new repository-specific standards.

Finally, thanks to Roger Greenhalgh, I now know that there is a world carrot museum online.

Resource profiles for learning materials

Stephen Downes has written a position paper which builds on his idea of Resource Profiles from 2003. The abstract runs:

Existing learning object metadata describing learning resources postulates descriptions contained in a single document. This document, typically authored in IEEE-LOM, is intended to be descriptively complete, that is, it is intended to contain all relevant metadata related to the resource. Based on my 2003 paper, Resource Profiles, an alternative approach is proposed. Any given resource may be described in any number of documents, each conforming to a specification relevant to the resource. A workflow is suggested whereby a resource profiles engine manages and combines this data, producing various views of the resource, or in other words, a set of resource profiles.

I’ve been interested in the idea of resource profiles since I first read about them, but more recently had them in mind while doing the Learning Materials Application Profile Scoping Study. Throughout that work we found heterogeneity to be a big theme: different metadata requirements and standards for different resource types, different requirements for supporting different activities, and information (potentially) available from a diverse range of systems. These all align well with what Downes says about resource profiles (and I wish I had said more along those lines in the report).

One thing I’d like to see demonstrated is how you link all the information together. The same resource is going to be identified differently in different systems, and sometimes not at all. So if you have a video of a lecture you might want to pull in technical metadata from one system (and remember the same video may be available in different technical formats), licence metadata from another system which uses a different identifier, and link it to information about the course for which the lecture was delivered held in a system that doesn’t know about the video at all. How do you make and maintain these links? Some of the semantic web ideas will help here (e.g. providing ways of saying that the resource identified here and the resource identified there are the same as each other; or providing typed relations, “this resource was used in that course”). One of the positive things I’ve seen in the DC-Ed Application Profile and ISO-MLR Education work is that they are both building domain models that make these relationships explicit (see the DC-Ed model and ISO MLR model).

This work also reminds us that much of the educational information that we would like to gather relates not so much to the resource per se as to the use (intended or actual) of the resource in an educational setting. Maybe some of the work on tracking OER use could be helpful here: one of the challenges with tracking OERs is to discover when an OER has been used by someone and what they used it for. If (and it is a very big if, it won’t happen accidentally) that leads you on to metadata about their use, then perhaps you could start to gather meaningful information about what education level the resource relates to, etc.

Repositories and the Open Web

On the 19 April CETIS are holding a meeting in London on Repositories and the Open Web. The theme of the meeting is how repositories and social sharing / web 2.0 web sites compare as hosts for learning materials: how well does each facilitate the tasks of resource discovery and resource management; what approaches to resource description do the different approaches take; and are there any lessons that users of one approach can draw from the other?

Both the title of the event (does the ‘and’ imply a distinction? why not repositories on the open web?) and the tag CETISROW may be taken as slightly provocative. Well, the tag is meant lightheartedly, of course, and yes, there is a rich vein of work on how repositories can work as part of the web. Just looking back at previous CETIS events, I would like to highlight these contributions to previous meetings:

  • Lara Whitelaw presented on the PROWE Project, about using wikis and blogs as shared repositories to support part-time distance tutors in June 2006.
  • David Davies spoke about RSS, Yahoo! Pipes and mashups in June 2007.
  • Roger Greenhalgh, talking about the National Rural Knowledge Exchange, in the May 2008 meeting. And many of us remember his “what’s hot in pigs” intervention in an earlier meeting.
  • Richard Davis talking about SNEEP (social network extensions for ePrints) at the same meeting

Most recently we’ve seen a natural intersection between the aims of Open Educational Resources initiatives and the use of hosting on web 2 and social sharing sites, so, for example, the technical requirements suggested for the UKOER programme said this under delivery platforms:

Projects are free to use any system or application as long as it is capable of delivering content freely on the open web. However all projects must also deposit their content in JorumOpen. In addition projects should use platforms that are capable of generating RSS/Atom feeds, particularly for collections of resources e.g. YouTube channels. Although this programme is not about technical development projects are encouraged to make the most of the functionality provided by their chosen delivery platforms.

We have followed this up with some work looking at the use of distribution platforms for UKOER resources which treats web 2 platforms and repository software as equally useful for that task.

So, there’s a longstanding recognition that repositories live on the open web, and that formal repositories aren’t the only platform suitable for the management and dissemination of learning materials. But I would be missing something I think important if I left it at that. For some time I’ve had misgivings about the direction that conceptualising your resource management and dissemination as a repository leads. A while back a colleague noticed that a description of some proposed specification work, which originated from repository vendors, developers and project managers, talked about content being “hidden inside repositories”, which we thought revealing. Similarly, I’ve written before that repository-think leads to talk of interoperability between repositories and repository-related services. Pretty soon one ends up with a focus on repositories and repository-specific standards per se and not on the original problem of resource management and dissemination. A better solution, if you want to disseminate your resources widely, is not to “hide them in repositories” in the first place. Also, in repository-world the focus is on metadata, rather than resource description: the encoding of descriptive data into fields can be great for machines, but I don’t think that we’ve done a great job of getting that encoding right for educational characteristics of resources, and this has been at the expense of providing suitable information for people.

Of course not every educational resource is open, and so the open web isn’t an appropriate place for all collections. Also, once you start using some of the web 2.0 social sharing sites for resource management you begin to hit some problems (no option for creative commons licensing, assumptions that the uploader created/owns the resource, limitations on export formats, etc.)–though there are some exceptions. It is, however, my belief that all repository software could benefit from the examples shown by the best of the social sharing websites, and my hope that we will see that in action during this meeting.

Detail about the meeting (agenda, location, etc.) will be posted on the CETIS wiki.

Registration is open, through the CETIS events system.

A short update on Ramlet

Ramlet, or Resource Aggregation Model for Learning, Education and Training (which is working group 13 of the IEEE Learning Technology Standards Subcommittee) is an ongoing piece of work which aims to define a conceptual model that includes an ontology and a nomenclature for enabling the interpretation of externalized representations of digital aggregates of resources for learning, education, and training applications. In other words, it will help show semantic relationships between content aggregation formats such as IMS CP, ATOM, MPEG 21 DID and OAI-ORE.

Like many standardization efforts, progress is slow and gradual, so it’s difficult to know when it’s worth giving an update. But last week the RAMLET technical editor, Scott Lewis, sent this message about the conceptual model:

This standard has taken a long time, but it is a complex standard that presents an ontology for resource aggregation and down-loadable files to help implement the ontology.

The good news is that virtually all of the technical work has been done for the standard and for a series of IEEE recommend practices that will be published after the standard is published. The working group expects to have a draft of the base standard for internal review by year’s end and a balloting draft submitted to IEEE in Q1 of 2010. The series of recommended practices that specify mappings for IMS CP, ATOM, METS, MPEG-21 DID, and OAI-PMH ORE will be published as soon as possible after the standard is published. Again, the technical work for these recommend practice has been done, and it is just a matter of converting that work to IEEE recommended practices after the base standard has been approved.

CETIS’s Wilbert Kraan is taking part in the RAMLET work, working on a proof of concept implementation using standard open source components.

Repository standards

Tore Hoel tweeted:

The most successful repository initiatives do not engage with LT standards EDRENE report concludes #icoper

pointing me to what looks like a very interesting report which also concludes

Important needs expressed by content users include:

  • Minimize number of repositories necessary to access

Of these, the first bullet point clearly relates to interoperability of repositories, and indicates the importance of focusing on repository federations, including metadata harvesting and providing central indexes for searching for educational content.

Coincidentally I had just finished an email replying to someone who asked about repository aggregation in the context of Open Educational Resources because she is “Trying to get colleagues here to engage with the culture of sharing learning content. Some of them are aware that there are open educational learning resources out there but they don’t want to visit and search each repository.” My reply covered Google advanced search (with the option to limit by licence type), Google custom search engines for OERs, OER Commons, OpenCourseWare Consortium search, the Creative Commons Search, the Steeple podcast aggregator and the similar-in-concept Ensemble Feed finder.

I concluded: you’ll probably notice that everything I’ve written above relies on resources being on the open web (as full text and summarized in RSS feeds) but not necessarily in repositories. If there are any OER discovery services built on repository standards like OAI-PMH or SRU or the like then they are pretty modest in their success. Of course using a repository is a fine way of putting resources onto the web, but you might want to think about things like search engine optimization, making sure Google has access to the full text resource, making sure you have a site map, encouraging (lots of) links from other domains to resources (rather than metadata records), making sure you have a rich choice of RSS feeds and so on.
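One of the measures suggested above, providing a site map, is easy to illustrate. This is a hedged sketch of generating a minimal sitemap.xml in the Sitemaps protocol format, with invented resource URLs; a real repository would generate one entry per resource landing page:

```python
import xml.etree.ElementTree as ET

# Namespace of the Sitemaps protocol (sitemaps.org).
SM = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(urls):
    """Return a sitemap.xml string listing each resource page, so
    search engine crawlers can find every resource directly."""
    urlset = ET.Element("urlset", {"xmlns": SM})
    for url in urls:
        u = ET.SubElement(urlset, "url")
        loc = ET.SubElement(u, "loc")
        loc.text = url
    return ET.tostring(urlset, encoding="unicode")

# Invented example resource pages.
xml = build_sitemap([
    "http://example.org/oer/thermo-101",
    "http://example.org/oer/beam-bending",
])
print(xml.count("<loc>"))  # 2
```

The point is that the crawler is pointed at the resource pages themselves (full text, on the open web), not at metadata records behind a repository search interface.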

I have some scribbled notes on 4 or 5 things that people think are good about repositories but which may also be harmful; a focus on interoperability between repositories and repository-related services (when it is at the expense of being part of the open web) is on there.

Tracking the Use of Open Educational Resources

As part of our support for the HEFCE, HE Academy, JISC UKOER programme, CETIS are running a “2nd Tuesday” online seminar to discuss tracking the use of OERs on Thursday 20 Nov (* Yes, I know, perhaps they should be called alternating 2nd Tuesday and 3rd Thursday seminars). Details about timing and how to join will be sent to UKOER projects through the usual strand mail lists; others who are interested should contact David Kernohan (d.kernohan /at/ JISC.AC.UK) about possible extra spaces.

Here’s the full description:

“As far as is possible projects will need to track the volume and use of the resources they make available”

At least that is what the call for projects for this programme said; the aim of this session is to help projects with this requirement. The rationale for tracking use from the funder’s perspective is clear: they want to know whether the resources being released with their money are useful to anyone apart from those who created them. Of course, as anyone who has tried to work with access statistics for a web site knows, we have to be cautious in interpreting such data. For example, how do we compare a simple “viewing” of a resource with someone taking the resource and embedding it in their own course site? Is it even possible to measure how often the latter happens? Another, perhaps more interesting, aspect of tracking use is what it tells us about what a resource is useful for. Being able to show how other people have used a resource might help someone considering using it themselves, but is there any way to capture this information?

As well as simple access logs and tools like Google Analytics, tools similar to track-back on blog postings and the usage information provided by sites such as Flickr and SlideShare (i.e. counting the number of views on-site and the number of embeds in other sites) are worth considering. Perhaps more contentious, but also worth considering, are techniques such as re-direct URLs and web bugs.
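The redirect-URL technique mentioned above can be sketched very simply: publish a tracking URL instead of the resource's real location, count the hit, then send the visitor on. This toy version (paths, URLs and the in-memory counter are all invented) returns the destination instead of issuing a real HTTP 302:

```python
from collections import Counter

# Map published tracking paths to the resources' real locations.
# Both sides are invented for illustration.
REDIRECTS = {
    "/go/thermo-101": "http://example.org/oer/thermo-101",
}
hits = Counter()

def handle(path):
    """Resolve a tracking path: log the access, return the target.

    In a real service this would be an HTTP handler responding with
    a 302 redirect; every access to the resource via the published
    URL then leaves a countable trace, wherever it was linked from.
    """
    target = REDIRECTS.get(path)
    if target is not None:
        hits[path] += 1
    return target

handle("/go/thermo-101")
handle("/go/thermo-101")
print(hits["/go/thermo-101"])  # 2
```

The trade-off, and part of why the technique is contentious, is that the published URL no longer points directly at the resource, and anyone who copies the direct link bypasses the counter entirely.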

We shall seek to clarify what tracking is required and pragmatically desirable and how it may be achieved. This session will be led by CETIS but we don’t pretend to know the answers to this problem, in fact we’re trying to learn from projects what is useful and achievable, so we will be relying on participants in the meeting to bring their own experiences and potential solutions. For this to work we would like to know in advance who has anything to say, so any project or individual with experience to share should contact Phil Barker philb@icbl.hw.ac.uk as soon as possible.