Comparing metadata requirements for OERs (part 2)

Posted on August 31, 2009 by johnr

In Comparing metadata requirements (part 1) I examined the required and suggested metadata for Open Educational Resources in the UKOER programme, for the Jorum deposit tool, and for the DiscoverEd aggregator. In this second part of the comparison I’m going to try to capture some of our initial discussions from the UKOER programme session about metadata, and then very roughly compare the brainstorming we did as part of that event with the requirements of the programme and other initiatives.

To begin with let’s have a look at a graphical overview of the metadata requirements of the three initiatives: the UKOER programme, the Jorum deposit tool, and the DiscoverEd aggregator service. In the graph below, full-height bars are mandatory elements, three-quarter-height bars are system-generated elements, and half-height bars are recommended metadata elements.

Graphical overview of metadata requirements relating to the UKOER programme

In the Elluminate session, we asked the participating projects to consider what metadata they would require to:

identify
find
select
use
cite
manage

a resource. Participants then shared their suggestions in the chat box. I’ve put the data together and, where appropriate, combined or split entries. The graph below represents the group’s suggestions of which pieces of information are important for each of the outlined functions. Some caveats are in order: the exercise was not rigorous; the number of participants answering at a given time varied; the first question, ‘identify’, had a higher and broader response rate – this may relate to how we explained the exercise; the answers were free text and not from a prior list; and, unless specified, the idea of a date is counted for both creation/ initial use and upload/ publication. It also became clear that for functional purposes rights mostly collapsed into licence (thus it was dropped from the graph).

Condensed outputs of metadata responses

Graph displaying a summary of metadata requirements for functions from brainstorming in 2nd Tuesday session

The idea of educational level is implicit in educational description/ context and so it should probably be included with that category (this has not been done in this graph, however, as educational level is singled out in the requirements). It is not entirely clear what was intended by some of the descriptions – e.g. coverage.

Examining the graph it is clear that there are some key pieces of information for particular uses. Across the entire set of functions the key information appears to be: author, date of publication, subject and description, educational description/ intended use (if combined with educational level) and usage data.

The key information for each function was:

identify – subject
find – description
select – licence
use – licence
cite – date of publication
manage – usage data

To enable a comparison with the metadata requirements, the top three responses for each function were collated (and then each taken as a factor of one) – this was done to provide an approximate indication of overall importance that could be compared to the requirements data.
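A rough sketch of that collation step might look like the following. It assumes that "taken as a factor of one" means each appearance in a function's top three scores one point, which is only one reading of the method; the function name is hypothetical and the counts are invented placeholders, not the session data.

```python
from collections import Counter


def top_three_tally(responses_by_function):
    """Score one point per element each time it is in a function's top three.

    Note: most_common() breaks ties by insertion order, so tied counts
    would need an explicit rule in any real analysis.
    """
    tally = Counter()
    for counts in responses_by_function.values():
        # most_common(3) returns the three elements with the highest counts
        for element, _count in Counter(counts).most_common(3):
            tally[element] += 1
    return tally


# Invented placeholder counts for illustration only (NOT the session data)
example_responses = {
    "identify": {"subject": 5, "title": 4, "author": 3, "date": 1},
    "select": {"licence": 6, "description": 4, "subject": 2, "format": 1},
}

tally = top_three_tally(example_responses)
for element, score in tally.most_common():
    print(element, score)
```

Scoring each top-three appearance equally throws away the raw counts, which seems in keeping with the caveat that the number of participants varied from question to question.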

OER required metadata compared to brainstorming

From this comparison it is interesting to note the following:

Of the two contentious mandatory metadata elements (file format and file size), file format/ MIME type was actually considered to be functionally important.
Recording the language of an OER was not considered to be critical for any of the functions – though all the initiatives consider it important (this may be attributable to the programme-based context of our discussion).
The institution/ publisher is surprisingly unimportant functionally (unless you are, like Jorum, hosting materials).
Licence is probably the most important piece of information for an OER.
Usage data and user ratings are considered to be critical pieces of information. They are not included in metadata profiles, but it is likely that this information would be generated over time by the relevant host services.

There’s probably more to say about this data and more to do with it but for now at least – that’s plenty to reflect on.

Comparing metadata requirements for OERs (part 1)

Posted on August 26, 2009 by johnr

In our Elluminate session on metadata and aggregation for Open Educational Resources, Phil and I spent some time getting everyone to think through the information required to interact with an educational resource in certain ways (such as: (re-)use, cite, find, identify, manage). This produced a lot of responses prioritizing different bits of information that are needed. I’ve not gone through my notes thoroughly yet but on the whole participants agreed that the metadata which the programme asked for was needed (the main element of contention was file format and size, which thankfully are probably the most automatable of metadata).

With this in mind I was interested to read about ccLearn’s work on a tool to provide an enhanced search of aggregated OERs, and their metadata recommendations for sources.

“DiscoverEd is an experimental project from ccLearn which attempts to provide scalable search and discovery for educational resources on the web. Metadata, including the license and subject information available, are exposed in the result set.” http://wiki.creativecommons.org/DiscoverEd_FAQ

There’s a lot more to be said about their work as I’m still trying to figure out how it is similar to and differs from all the previous work done on aggregating repositories (at first glance – it’s got the advantage of web-friendly syndication/ transport standards but potentially less robust/ standardised descriptive standards). Today, however, I thought it would be interesting to compare the minimum metadata sets for OERs that I’m aware of and that are intended for multi-organisation/ institutional use (i.e. not just what a given organisation has decided on as a minimal set for its own metadata).

UKOER Mandatory Metadata:

from: http://blogs.cetis.org.uk/lmc/2009/03/30/metadata-guidelines-for-the-oer-programme/

programme tag
author
title
date (uploaded/ creation)
url
file format
file size
[I’m fairly sure rights is on some versions of this list but it doesn’t appear on this one]

Suggested metadata

language
subject classifications
keywords
tags
comments
description

DiscoverEd metadata

http://wiki.creativecommons.org/CcLearn_Search_Metadata

All the metadata is optional but the following is highly recommended :

title
summary
language
education level
licence
subject

Jorum’s OER deposit tool

Gareth Waller summarized the metadata requirements of the Jorum OER deposit tool in a comment on http://repositorynews.wordpress.com/2009/08/05/musing-about-metadata-for-oer/
“The profile is as follows:

Mandatory metadata set:

Title
Overview (Description)
Keywords
Author Name
Licence

Recommended metadata set:

Project name
Creation date
Classification (JACS subject classification)

System Generated metadata set:

Publisher
Contributed Date
Language
Identifier

The ‘keywords’ metadata is currently user generated and does not use a controlled vocabulary.”
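As a rough sketch, the three lists above can be treated as sets to see where they overlap. The normalised labels are assumptions made for illustration: ‘summary’ and ‘Overview (Description)’ are folded into ‘description’, the classification fields into ‘subject’, and the various dates into ‘date’. The variable and function names are likewise hypothetical.

```python
# Element names normalised by hand (an assumption, see the note above)
UKOER = {
    "mandatory": {"programme tag", "author", "title", "date", "url",
                  "file format", "file size"},
    "suggested": {"language", "subject", "keywords", "tags", "comments",
                  "description"},
}
DISCOVERED = {
    "recommended": {"title", "description", "language", "education level",
                    "licence", "subject"},
}
JORUM = {
    "mandatory": {"title", "description", "keywords", "author", "licence"},
    "recommended": {"project name", "date", "subject"},
    "system generated": {"publisher", "date", "language", "identifier"},
}


def all_elements(profile):
    """Union of every element in a profile, whatever its obligation level."""
    return set().union(*profile.values())


profiles = {"UKOER": UKOER, "DiscoverEd": DISCOVERED, "Jorum": JORUM}

# Elements that appear, at any obligation level, in all three profiles
shared = set.intersection(*(all_elements(p) for p in profiles.values()))
print("In all three:", sorted(shared))

# Elements that appear in one profile only
for name, profile in profiles.items():
    others = set().union(
        *(all_elements(p) for n, p in profiles.items() if n != name)
    )
    print("Only in", name + ":", sorted(all_elements(profile) - others))
```

Collapsing the obligation levels hides the difference between mandatory and merely suggested elements, so this only shows which kinds of information the profiles have in common.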

Comparing the lists it’s easy to see some of the reasoning behind the chosen metadata sets: for example, Jorum’s deposit tool can take advantage of information from Shibboleth and user profiles. It is also very encouraging to see how much they overlap, but for me these sets raise a few issues:

Knowing a file size is important, but are we reaching a point when this information is part of the programme/ browser?
- I think we still need to record it, but I’m not sure, as I’m fairly certain that when a file size is displayed to someone selecting/ downloading it’s often being generated from the file or by the browser, not from the metadata.
Educational Level…
- I’m surprised to see this in ccLearn’s list – for all its simplicity it has thus far proved a nightmare to agree on educational levels. Not only is it nightmarish cross-culturally but even within countries it’s not easy. I’ll pass over UK Educational Levels quickly and point out a project I’ve mentioned before, Standard Connection – an NSF project trying to map curricula within the US. I’m not sure what progress they made but do know it certainly wasn’t straightforward.

The inclusion of educational level does, however, point to the difference between what educators think is necessary and what is easy to provide. I’ll come back to this in part 2 when I’ll try to wrangle some sense out of our Elluminate session surveys.

I’ll note two things in passing by way of interim conclusion:

that OCW are discussing if they should have a minimal metadata set (http://cloudworks.ac.uk/cloud/view/1493).
that the required and suggested metadata for UKOER is similar enough to the metadata recommended for DiscoverEd that there’s no reason that UKOER projects can’t (at no extra cost) publish their collections there too. The University of Nottingham initiative UNOW is doing this already. [edit: the Open University’s initiative Open Learn is there too]

RSS, Yahoo Pipes, and UKOER projects

Posted on August 21, 2009 by johnr

or Making finding my way around the UK OER programme one feed at a time a little easier with some help from Yahoo.

In an effort to familiarize myself with the OER programme I hunted for project websites and blogs to add their feeds to my netvibes. As not all projects are blogging this is only a partial method of engagement, but the mixture of news and discussion of issues on the blogs that exist has helped me begin to find my way around. In the process of doing this I found not only blog RSS feeds but also one or two feeds of OER resources. Although projects will all be producing RSS feeds for their resources as part of the programme, I hadn’t expected to find any yet.

As convenient as this all was, I now still had twenty or so new boxes in netvibes to scan along with all the others. However, one of the things that clearly emerged from the UKOER 2nd Tuesday on metadata that Phil and I ran was that feeds of resources are going to be very important in this programme. I think we all knew this, and the programme had mandated that projects should produce a feed of their resources, but I was struck by what projects were already doing and by some of their future plans. Coming from a repositories background I’ve tended to think of feeds for announcements or, with SWORD, deposit, but thought of OAI-PMH or the like for ‘serious’ resource discovery or aggregation. I think OAI-PMH will have a role (and so do CCLearn: http://learn.creativecommons.org/wp-content/uploads/2009/07/discovered-paper-17-july-2009.pdf – more on that another time) but it came home to me how important RSS/Atom is – especially in an environment where resources are being managed and made available using many different types of software. With all that in mind it was time to finally try some pipes.

Simple Yahoo pipe of feeds from UKOER Individual strand projects

I pulled together, strand by strand, the feeds from the project blogs I could find to make these feeds:

Institutional
http://pipes.yahoo.com/pipes/pipe.info?_id=e2288118932fa9ea996b7bb41120cfb7

Subject
http://pipes.yahoo.com/pipes/pipe.info?_id=baef584cd9fcb923605936dea916f47c

Individual
http://pipes.yahoo.com/pipes/pipe.info?_id=422a1e8bbd65b24a12421db4a91e25ee

And then the one feed to rule them all…
http://pipes.yahoo.com/pipes/pipe.info?_id=65080a2934b865342686e96deaf9add3

However, that feed could get quite busy – so for those days when life’s too short to hover over 10 new blog post titles – here’s a version that only gathers posts with the word metadata associated with them.
http://pipes.yahoo.com/pipes/pipe.info?_id=9bd5e2d97bb3267d52fc7e77103aacdb

It looks a bit like this:

Yahoo pipe for UKOER project blogs with word metadata associated
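For anyone who would rather script it than wire it up in Pipes, the same merge-then-filter idea can be sketched in a few lines. This is not how the pipes above are built internally: the matching rule (keyword anywhere in the title, description, or categories) is an assumption, the function names are hypothetical, and the two inline feeds with their example.org links are invented stand-ins for real project feeds.

```python
import xml.etree.ElementTree as ET

# Invented sample feeds standing in for real project blog feeds
FEED_A = """<rss version="2.0"><channel><title>Project A</title>
<item><title>Metadata for our OERs</title>
<link>http://example.org/a/1</link><description>Notes</description></item>
<item><title>Launch event</title>
<link>http://example.org/a/2</link><description>Photos</description></item>
</channel></rss>"""

FEED_B = """<rss version="2.0"><channel><title>Project B</title>
<item><title>Workflow update</title><link>http://example.org/b/1</link>
<description>Deposit steps</description><category>metadata</category></item>
</channel></rss>"""


def items_from_rss(rss_text):
    """Pull title, link, description and categories out of each RSS item."""
    root = ET.fromstring(rss_text)
    return [
        {
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
            "categories": [c.text or "" for c in item.findall("category")],
        }
        for item in root.iter("item")
    ]


def merge_and_filter(feeds, keyword):
    """Merge several feeds and keep items mentioning the keyword anywhere."""
    keyword = keyword.lower()
    seen_links = set()
    matches = []
    for rss_text in feeds:
        for item in items_from_rss(rss_text):
            text = " ".join(
                [item["title"], item["description"], *item["categories"]]
            ).lower()
            # Skip items already collected from another feed
            if keyword in text and item["link"] not in seen_links:
                seen_links.add(item["link"])
                matches.append(item)
    return matches


matches = merge_and_filter([FEED_A, FEED_B], "metadata")
for match in matches:
    print(match["title"], match["link"])
```

Real feeds would need fetching over HTTP and some handling for Atom, which uses different element names from RSS.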

By the way, for those interested in a feed of UKOER / OER tagged posts from CETIS here’s a feed from our (John, Phil, Li, Lorna, Sheila, Rowin, Scott) blogs http://pipes.yahoo.com/pipes/pipe.info?_id=13b03a2e49b7eb7e7a57d1e1c8961916

With this done I decided to use the time I’d saved (…) to try something else: a pipe for the resource feeds.
http://pipes.yahoo.com/pipes/pipe.info?_id=d93dbd0965e0e9d8399b7818e446b0da

Now I need to point out that this last feed is in many ways a ‘toy’ – I don’t know how the resource feeds I’m grabbing have been set up, so I don’t know what coverage over time it’ll give. I also know it’s a very poor imitation of the much more careful work done on aggregating by the Steeple project http://www.steeple.org.uk/wiki/Ensemble and Scott in the Ensemble project http://galadriel.cetis.org.uk/ensemble/feeds?q=poetry. One thing I noticed in particular was that the feeds from different software seem to diverge in their use of different metadata fields within RSS.
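To illustrate the kind of divergence meant here, one feed might name the creator in the RSS author element while another uses the Dublin Core dc:creator element. The sketch below tries a list of candidate fields in turn. Which fields the UKOER feeds actually use is not recorded here, so the candidate lists, the function names, and the sample items are all assumptions.

```python
import xml.etree.ElementTree as ET

# Clark-notation prefix for the Dublin Core elements namespace
DC = "{http://purl.org/dc/elements/1.1/}"

# Invented sample feed: one item uses core RSS fields, the other Dublin Core
SAMPLE = """<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
<channel><title>Sample resources</title>
<item><title>Resource one</title><author>a.person@example.org</author>
<pubDate>Fri, 21 Aug 2009 10:00:00 GMT</pubDate></item>
<item><title>Resource two</title><dc:creator>B. Person</dc:creator>
<dc:date>2009-08-20</dc:date><dc:rights>CC BY</dc:rights></item>
</channel></rss>"""


def first_text(item, tags):
    """Return the text of the first candidate element that is present."""
    for tag in tags:
        text = item.findtext(tag)
        if text and text.strip():
            return text.strip()
    return None


def normalise_item(item):
    """Map whichever fields a feed uses onto one set of keys."""
    return {
        "title": first_text(item, ["title", DC + "title"]),
        "creator": first_text(item, ["author", DC + "creator"]),
        "date": first_text(item, ["pubDate", DC + "date"]),
        "rights": first_text(item, [DC + "rights"]),
    }


records = [normalise_item(i) for i in ET.fromstring(SAMPLE).iter("item")]
for record in records:
    print(record)
```

The dates are left as strings because the two conventions use different formats; an aggregator would need to parse both before sorting or comparing them.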

That said, as of today, my imperfect aggregation is showing 112 OERs so far and the programme’s just getting started.

Changing projects: RRT to CETIS OER Support

Posted on August 19, 2009 by johnr

The Repositories Research Team finished up recently and from the beginning of August I’ve been working on a new project providing some of CETIS’ support to the JISC and HE Academy’s Open Educational Resources programme (http://www.jisc.ac.uk/whatwedo/programmes/oer.aspx).

Lorna has already provided a great concluding review of the RRT project over on her blog, so I won’t attempt to repeat ground she’s covered. It was a great project to have worked on and, I think, a successful experiment in exploring a model of providing programme-level support. I’m very grateful to the different colleagues I got to work with within CETIS, UKOLN, and JISC for making my years on the project enjoyable and challenging (in a good way).

From the beginning of this month I’ve been working on the CETIS support project for the UKOER programme. The project will provide a mixture of project and programme support. Along with other CETIS colleagues, I’ll be developing guides reviewing relevant standards for syndication, packaging, and metadata as they relate to OERs; we’ll also look at relevant tools and the experience of earlier projects. Alongside the projects, in this pilot programme, we’ll be figuring out best practice for Open Educational Resources in these areas and others.

We’ve begun this process by running one of the programme’s monthly training/ discussion events on Elluminate. Phil’s posted the slides from the session we ran: http://www.slideshare.net/philb/metadata-and-content-aggregation-for-ukoer.

Open Educational Resources, metadata, and self-description

Posted on December 8, 2008 by johnr

If we share learning materials, do we have a professional responsibility to describe them?

At the CETIS conference Open Educational Resources / Content session in the midst of the discussions about metadata someone, I think John Casey, made an offhand comment about embedded metadata. As valuable as his next statement was, it was the notion of what information is contained within an object that caught my attention.

There is a basic principle of identity and authorship in a world of distributed information that we don’t seem to be talking about – what elements of self-description is it reasonable to assume from an academic sharing their resources? What constitutes good practice for labelling the digital stuff we want to be professionally associated with? Let’s be clear – I’m not talking about academics creating metadata or the debate about whether metadata is embedded or bundled – I’m talking about the equivalent of title pages and referencing (for want of a better way to put it).

Most university courses include modules on how to write an academic paper, including how to put together the parts of a paper. Departments produce templates so that assignments/ term papers and theses have a standard title page, format, and way of citing things. The front parts of a paper help: manage the process of attribution and avoid accidental plagiarism; promote more careful writing; assert authorship and/or rights over a work; navigate the work; and help manage collections of such papers. A title section typically contains the following information: a title, author(s), date (usually of submission or acceptance), and frequently a course and/or institutional affiliation. This provides the reader with enough information to know what something claims to be, and begins to allow them to judge if they should read it.

I’m not suggesting title pages should be standard for everything, or that everything casually shared needs all this information, but in the context of deliberately shared educational resources surely we should regard providing information of this type as a professional responsibility. Whether we see it as an obligation of the ‘guild’, an opportunity to self-publicise, or compliance with institutional branding requirements, this information should be as standard for educational resources as it is for theses and articles. Of course not all learning materials lend themselves to a title page, but text documents and presentations do, and web sites allow for home or about pages. Audio and video files can support introductions but the editing process is more complex. Independent images and some other forms of learning material are not as suitable for title ‘pages’ – but I strongly suspect more than half the learning materials shared through this call will be documents, presentations, or web sites.

I guess I’m suggesting that, for relevant materials the following should be assumable: Title, Author, Date (of some relevant kind), Institution, Course (code or name).

There are good and valid debates about what, if any, metadata academics should be asked to create, but there is a more fundamental question about professional self-description and good practice. Our conversation about what metadata is needed and who should create it should start from the premise that basic bibliographic information should be contained within the resource.

I don’t think anyone is suggesting resources should not have ‘title pages’, I just think we need to be clear, before we start talking about metadata, that it is reasonable to expect this type of information to be there. It’s just good practice.

John Robertson

Cetis Blogs
