Comparing metadata requirements for OERs (part 1)

In our Elluminate session on metadata and aggregation for Open Educational Resources, Phil and I spent some time getting everyone to think through the information required to interact with an educational resource in certain ways (such as: (re-)use, cite, find, identify, manage). This produced a lot of responses prioritizing different bits of information. I’ve not gone through my notes thoroughly yet, but on the whole participants agreed that the metadata which the programme asked for was needed (the main elements of contention were file format and size, which thankfully are probably the most automatable of metadata).
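As a rough illustration of why format and size are the easy ones to automate, here is a minimal Python sketch that derives both from the file itself. The function name, the field labels and the sample file name are my own assumptions for illustration, not part of any programme guidance, and the format is only guessed from the file extension.

```python
import mimetypes
import os
import tempfile


def technical_metadata(path):
    """Derive file format and size from the file rather than from a depositor."""
    # guess_type only looks at the file name, so this is a guess, not a check
    mime_type, _encoding = mimetypes.guess_type(path)
    return {
        "file format": mime_type or "application/octet-stream",
        "file size": os.path.getsize(path),  # size in bytes
    }


# Demonstration with a throwaway file (the name is hypothetical).
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "lecture-notes.txt")
    with open(sample, "wb") as handle:
        handle.write(b"open educational resource")
    print(technical_metadata(sample))
```

A repository would more likely inspect the file contents than trust the extension, but the point stands that nobody needs to type these two values in.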

With this in mind I was interested to read about ccLearn’s work on a tool to provide an enhanced search of aggregated OERs, and their metadata recommendations for sources.

“DiscoverEd is an experimental project from ccLearn which attempts to provide scalable search and discovery for educational resources on the web. Metadata, including the license and subject information available, are exposed in the result set.” http://wiki.creativecommons.org/DiscoverEd_FAQ

There’s a lot more to be said about their work as I’m still trying to figure out how it is similar to and differs from all the previous work done on aggregating repositories (at first glance – it’s got the advantage of web-friendly syndication/ transport standards but potentially less robust/ standardised descriptive standards). Today, however, I thought it would be interesting to compare the minimum metadata sets for OERs that I’m aware of and that are intended for multi-organisation/ institutional use (i.e. not just what a given organisation has decided on as a minimal set for its own metadata).

UKOER Mandatory Metadata:

from: http://blogs.cetis.org.uk/lmc/2009/03/30/metadata-guidelines-for-the-oer-programme/

programme tag
author
title
date (uploaded/ creation)
url
file format
file size
[I’m fairly sure rights is on some versions of this list but it doesn’t appear on this one]

Suggested metadata

language
subject classifications
keywords
tags
comments
description

DiscoverEd metadata

http://wiki.creativecommons.org/CcLearn_Search_Metadata

All the metadata is optional but the following is highly recommended :

title
summary
language
education level
licence
subject

Jorum’s OER deposit tool

Gareth Waller summarized the metadata requirements of the Jorum OER deposit tool in a comment on http://repositorynews.wordpress.com/2009/08/05/musing-about-metadata-for-oer/
“The profile is as follows:

Mandatory metadata set:

Title
Overview (Description)
Keywords
Author Name
Licence

Recommended metadata set:

Project name
Creation date
Classification (JACS subject classification)

System Generated metadata set:

Publisher
Contributed Date
Language
Identifier

The ‘keywords’ metadata is currently user generated and does not use a controlled vocabulary.”
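One way to put the three profiles side by side is to treat each as a set of field names and look at the overlap. The sketch below is only that, a sketch: I have normalised the field names by hand (for example treating ‘summary’ and ‘Overview (Description)’ as ‘description’, and both UKOER’s date and Jorum’s creation date as ‘date’), and that mapping is my assumption rather than anything stated in the profiles.

```python
# Field names are normalised by hand; the mapping is an assumption,
# not something defined by UKOER, ccLearn or Jorum.
UKOER = {
    "mandatory": {"programme tag", "author", "title", "date", "url",
                  "file format", "file size"},
    "suggested": {"language", "subject", "keywords", "tags", "comments",
                  "description"},
}
DISCOVERED = {
    "recommended": {"title", "description", "language", "education level",
                    "licence", "subject"},
}
JORUM = {
    "mandatory": {"title", "description", "keywords", "author", "licence"},
    "recommended": {"project name", "date", "subject"},
    "system": {"publisher", "contributed date", "language", "identifier"},
}


def all_fields(profile):
    """Union of every field in a profile, whatever its obligation level."""
    return set().union(*profile.values())


profiles = {"UKOER": UKOER, "DiscoverEd": DISCOVERED, "Jorum": JORUM}

# Fields that appear somewhere in all three profiles
common = set.intersection(*(all_fields(p) for p in profiles.values()))
print("In all three:", sorted(common))

# Fields that appear in one profile only
for name, profile in profiles.items():
    others = set().union(
        *(all_fields(p) for n, p in profiles.items() if n != name)
    )
    print(f"Only in {name}:", sorted(all_fields(profile) - others))
```

Under that hand-made mapping the fields shared by all three are title, description, language and subject; a different mapping (say, treating url and identifier as the same thing) would shift the result.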

Comparing the lists, it’s easy to see some of the reasoning behind the chosen metadata sets; for example, Jorum’s deposit tool can take advantage of information from Shibboleth and user profiles. It is also very encouraging to see how much they overlap, but for me these sets raise a few issues:

Knowing a file size is important, but are we reaching a point when this information is part of the programme/ browser?
- I think we still need to record it, but I’m not sure: I’m fairly certain that when a file size is displayed to someone selecting/ downloading a file, it’s often being generated from the file/ by the browser, not from the metadata.
Educational Level…
- I’m surprised to see this in ccLearn’s list – for all its apparent simplicity it has thus far proved a nightmare to agree on educational levels. Not only is it nightmarish cross-culturally, but even within countries it’s not easy. I’ll pass over UK Educational Levels quickly and point out a project I’ve mentioned before, Standard Connection – an NSF project trying to map curricula within the US. I’m not sure what progress they made, but I do know it certainly wasn’t straightforward.

The inclusion of educational level does, however, point to the difference between what educators think is necessary and what is easy to provide. I’ll come back to this in part 2, when I’ll try to wrangle some sense out of our Elluminate session surveys.

I’ll note two things in passing by way of interim conclusion:

that OCW are discussing if they should have a minimal metadata set (http://cloudworks.ac.uk/cloud/view/1493).
that the required and suggested metadata for UKOER is similar enough to the recommended metadata for ccLearn’s DiscoverEd that there’s no reason that UKOER projects can’t (at no extra cost) publish their collections there too. The University of Nottingham initiative UNOW is doing this already. [edit: the Open University’s initiative OpenLearn is there too]

13 thoughts on “Comparing metadata requirements for OERs (part 1)”

amber thomas says:

August 27, 2009 at 10:35 am

This is really useful (and reassuring!). Some thoughts …

I wonder what proportion of OER content is shared as zipped files? Either because it’s mixed file types, because it’s big to download, and/or because it’s IMS/METS content packaged. If users want to see the file format in a list of search results (“I’m looking for a presentation format such as a .ppt”) then it would be useful to record the file format in the metadata, even if it’s downloadable as a compressed zip file. But then the field would need to allow multiple file types to be listed. And if we build this into metadata requirements as a mandatory field to support this particular scenario, would that be elegance at the expense of uptake? Hmmm … feel free to unpick my woolly logic there.

Another thought is about rights. Rights should be mandatory metadata, and I guess it should include: rights holder + licence. That licence could be a URL or just a summary (e.g. CC:BY:NonComm). But we’ve been discussing around the OER projects as well that the rights information should ideally be embedded in the resource itself in case the resource gets detached from the metadata (as happens if you download a resource to your C: drive). So in this case the metadata isn’t really enough, for practical purposes.

Educational level – I totally agree with you. Interested to hear further thoughts on that!

Lorna M. Campbell says:

August 27, 2009 at 3:18 pm

Completely agree re educational level. This is a very difficult field to populate, especially for open educational resources that are likely to be used in a variety of different contexts and levels. Having experienced the pain and suffering that resulted from UKEL (how difficult could it be??) any mention of educational level makes me want to run for the hills. Having said that, I can think of plenty of use cases where it would be very helpful to have some idea of the original intended educational level of a resource. However, I would hesitate to make educational level mandatory or even highly recommended.

Also agree with Amber re embedding rights metadata within the resources, particularly for the OER Programme.

JohnR says:

August 27, 2009 at 3:54 pm

Hi Amber,

Zip files do present a case for having the file format(s) in the metadata. With metadata about the zip/ IMS CP/ METS file I guess that it depends how granular the assets and metadata are going to be. As I understand it both packaging formats allow the description of component assets as well as the package. Although even with zip files extracting file format metadata should be automatable as the repository will hold (or at least process) the unzipped files.

I agree we need clear rights / licence information – ideally in the metadata and on the resource. I think it needs to be both stated and a URI. Ideally resources should have ‘cover pages’ where they can; beyond that, I think embedding the licence into the resource could be good and I can see a good case for rights provenance travelling with bits of resources, but I’m wary of anything that tries to take that a step further and ‘enforce’ the movement of rights. As I understand it DRM has consistently proved double-edged and a preservation headache.

Educational level – this will crop up again in part 2 (whenever I get it written)

Phil Barker says:

August 27, 2009 at 4:18 pm

The importance of providing information on educational levels and how difficult it is to do so both depend on the scope of the collection. If you’re collecting resources for everything from kindergarten through to postgraduate / professional development (as I guess ccLearn/DiscoverEd are) then you really need something by way of education level to allow people to filter down to just the primary or just University level. If your collection is focussed on just one sector, e.g. all your material is University level (as is pretty much the case for UKOER), then providing a finer level description becomes more difficult and less useful.

Pingback: John’s JISC CETIS blog » Comparing metadata requirements (part 2)

Andy Lane says:

September 1, 2009 at 8:44 am

Educational level even within a sector, e.g. HE, may be very important to learners rather than teachers. Whether covered in metadata or in additional information on site, it was something we considered from the outset, and much of the reasoning for what we did is covered in:

Lane, A.B. From Pillar to Post: exploring the issues involved in re-purposing distance learning materials for use as Open Educational Resources, 25 pp, 2006, OpenLearn Working Paper No. 1, available from http://kn.open.ac.uk/person.cfm?userid=5861

JohnR says:

September 2, 2009 at 10:38 am

Andy,

the link seems to take me to an OU user log-in page – is that the right link?

Andy Powell says:

September 2, 2009 at 11:42 am

Re: educational level – one could limit the available levels to ‘primary’, ‘secondary’, ‘further’, ‘higher’ – oh wait… no… you’re right… best forget it! Too problematic.

On filesize… I’m amazed anyone is even interested in this (however it is generated). I take the point about multiple resources embedded into a single zip – but might be better to encourage a move away from the whole ‘content packages’ thing anyway?

Pingback: John’s JISC CETIS blog » Comparing metadata requirements for OERs (part 3)

Andy Lane says:

September 2, 2009 at 2:23 pm

@JohnR Sorry I seem to have given an internal url not the public one for this http://kn.open.ac.uk/public/document.cfm?docid=9724

Lorna says:

September 3, 2009 at 2:50 pm

In response to Andy, I’d be happy to let file size go but I know others feel quite strongly that this is critical info.

Pingback: Phil’s JISC CETIS blog» Blog Archive » About metadata & resource description (pt 1)
Pingback: Musings on the developing OER infrastructure « Repository News

John Robertson

Cetis Blogs