John Robertson » cetis-support (Cetis Blogs, http://blogs.cetis.org.uk/johnr)

RSS for deposit, Jorum and UKOER: part 2 commentary
http://blogs.cetis.org.uk/johnr/2010/02/04/rss-for-deposit-jorum-and-ukoer-part-2-commentary/
Thu, 04 Feb 2010 14:23:17 +0000

Following on from part 1, which reviewed Jorum’s requirements for RSS-based deposit, this section synthesises the comments and feedback emerging in response to it.

Community views

In response to the requirements and position papers a number of feeds were submitted for testing and there has been some thoughtful reflection on the issues in the blogs and by email. This is a brief summary of responses to the key issues:

Issues around generating the RSS

Although most platforms in use can easily create RSS feeds and some can create a feed from any search result, it has become clear that the RSS profile that is created is frequently fixed and does not match the profile requested by Jorum (which is very similar to the profile suggested by OCWC).

Irrespective of the repository software or other OER management ‘platforms’ in use, adjusting the RSS output profile has proved to be a non-trivial task. Emerging issues in adjusting the RSS outputs include:

  • Users of commercial platforms may have to rely on the company’s developers and development schedule.
  • Open source platforms may require additional local coding or at the least will require adjusting an XSLT.
  • It is likely that the RSS output of web 2.0 platforms will simply not be editable.

In all three cases there may also be possible solutions that utilise independent tools, such as Yahoo Pipes, to process the feed after production or create a feed from another interface. However, such an approach to adjust the RSS profile is either still reliant on the information present in the original source feed or is dependent on adding standard profile information or extracting additional information and creating a new feed. See for example http://repositorynews.wordpress.com/2010/01/07/really-not-so-simple-syndication/.
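As a rough illustration of the "process the feed after production" route, here is a minimal Python sketch that adds a default dc:rights statement to any item that lacks one. The function name, the sample feed, and the idea of supplying one blanket licence URL are my assumptions for illustration, not anything Jorum or the projects have specified. It also shows the limitation noted above: the only information it can add is standard profile information supplied from outside the feed.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)  # serialise the namespace with the usual "dc" prefix


def add_default_rights(rss_text, licence_url):
    """Return the feed as text, with dc:rights added to items that lack it.

    Hypothetical helper: items that already carry dc:rights are left untouched.
    """
    root = ET.fromstring(rss_text)
    for item in root.iter("item"):
        if item.find(f"{{{DC_NS}}}rights") is None:
            rights = ET.SubElement(item, f"{{{DC_NS}}}rights")
            rights.text = licence_url
    return ET.tostring(root, encoding="unicode")


# Made-up sample feed: one item without rights, one with an existing statement.
SAMPLE = """<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
<channel><title>Example OER feed</title>
<item><title>Lecture 1</title><link>http://example.org/oer/1</link></item>
<item><title>Lecture 2</title><link>http://example.org/oer/2</link>
<dc:rights>Existing rights statement</dc:rights></item>
</channel></rss>"""

LICENCE = "http://creativecommons.org/licenses/by-nc-sa/2.0/uk/"
processed = add_default_rights(SAMPLE, LICENCE)
```

Anything richer than a blanket statement (per-item subjects, authors and so on) would still have to come from the source platform or some other interface.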

Xml validity

Xpert note that of the 60 feeds they harvest, 5 aren’t valid xml and 20 aren’t valid RSS (see comment on Lorna’s post). It is worth noting that aggregators can and do deal with poorly formed data; however, in the timescale of the programme the kind of manual effort involved in dealing with poorly formed feeds (and the quality of the item metadata they would generate) is not likely to be supportable.
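To make the distinction between "not valid xml" and "not valid RSS" concrete, here is a small Python sketch that sorts a feed into three rough buckets. It only tests XML well-formedness and a couple of structural features of RSS 2.0 (root element, version attribute, a channel element); it is not a full validator, and the bucket labels are my own.

```python
import xml.etree.ElementTree as ET


def classify_feed(text):
    """Very rough triage of a feed document (a sketch, not a validator)."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        # The parser gave up: nothing downstream can read this feed.
        return "not well-formed XML"
    if root.tag != "rss" or root.get("version") != "2.0" or root.find("channel") is None:
        return "well-formed XML, but not RSS 2.0"
    return "looks like RSS 2.0"


# Made-up examples: an unescaped ampersand, an Atom-like document, a minimal RSS feed.
broken = "<rss version='2.0'><channel><title>A & B</title></channel></rss>"
other = "<feed><entry/></feed>"
fine = "<rss version='2.0'><channel><title>ok</title></channel></rss>"
```

A check like this could be run by a project before submitting a feed, which would at least catch the first category of problem without any manual effort on the harvesting side.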

Conclusion

With the exception of some web 2.0 platforms and (potentially) some commercial repositories a revised or reprocessed RSS feed to meet Jorum’s requirements is, in theory, a possibility. However, few projects currently produce RSS meeting Jorum’s required RSS profile. From the project trials thus far, one project has been able to conform sufficiently to permit successful harvest. In the context of UKOER, adjusting the RSS profile requires the right congruence of platform(s), skills, and time within the project team. Hence it is unlikely that this solution will work for all projects that need a bulk upload option. The longer term feasibility of using RSS to facilitate bulk deposit may change over time, particularly if the OCWC profile is adopted more widely.

Issues around metadata

The following issues were noted about the feed content:

  1. There is tentative agreement that it would be good if RSS feeds used a DC namespace where possible and ideally supported the OCWC profile.
  2. The addition of custom elements (for example for the purposes of tracking OER currency) is not regarded as a good idea.
  3. It cannot be taken for granted that the item identifier in a feed is the same as the identifier of the OER within the platform.
  4. It cannot currently be assumed that the item identifier in a feed is either unique or persistent. This is a critical issue for processing the feed.
  5. There may be multiple feed identifiers for a given OER.
  6. Feeds may contain more than one namespace and / or feed formatting.
  7. In a number of repository platforms the identifier supplied frequently points to the splash page rather than the OER itself. This is an issue if the resource itself is to be harvested.
  8. Few feeds have rights information for the items or for the feed itself. Including this information is regarded as good practice.
    1. Feeds that use one of the variants of Creative Commons encoding may allow aggregators and Jorum to provide enhanced services.
    2. Projects should clearly license their feeds and underlying items.
    3. In the last resort projects should state clearly, on their site or by telling Jorum, what rights and licensing exist in connection with their feeds.
  9. Metadata quality (including completeness) in feeds is variable.
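The identifier problems in points 3 to 5 are easier to spot with a quick check. The Python sketch below reports items with no guid and guid values that appear more than once within a single feed. It assumes the item identifier is carried in the RSS guid element, and it can say nothing about persistence, which would need the same check repeated across harvests over time.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def identifier_report(rss_text):
    """Count items, items without a guid, and guid values used more than once."""
    root = ET.fromstring(rss_text)
    guids = []
    missing = 0
    for item in root.iter("item"):
        guid = item.findtext("guid")
        if guid and guid.strip():
            guids.append(guid.strip())
        else:
            missing += 1
    counts = Counter(guids)
    duplicates = sorted(g for g, n in counts.items() if n > 1)
    return {"items": len(guids) + missing, "missing": missing, "duplicates": duplicates}


# Made-up feed: two items share a guid and one has none.
SAMPLE = """<rss version="2.0"><channel><title>Example</title>
<item><title>A</title><guid>http://example.org/oer/1</guid></item>
<item><title>B</title><guid>http://example.org/oer/1</guid></item>
<item><title>C</title></item>
</channel></rss>"""

report = identifier_report(SAMPLE)
```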

Issues around processing the RSS

There are a number of issues about feed size, currency, updating, identifiers and OER deletion but these depend on whether the service is collecting information to help point to current OERs (like an aggregator) or whether it is seeking to provide a central collection of OERs (a library – even if some of those OERs are actually elsewhere). These distinctions are not clearly made in much of the discussion.

Feed classification:

Pushing everything into one classification seems to negate the classification work done by projects and many projects will be producing OERs with more than one JACS code. This could be addressed by having multiple feeds per platform (either from the platform or by subsequently dividing the feed) but there are potential duplication management issues with multiple feeds.
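Here is a hedged Python sketch of "subsequently dividing the feed". It assumes, purely for illustration, that each item carries its classification codes in dc:subject elements; real feeds may hold JACS codes elsewhere or not at all. The codes in the sample are placeholders.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

DC = "{http://purl.org/dc/elements/1.1/}"


def split_by_subject(rss_text):
    """Group item links by dc:subject; an item with several subjects lands in several groups."""
    root = ET.fromstring(rss_text)
    groups = defaultdict(list)
    for item in root.iter("item"):
        link = item.findtext("link")
        subjects = [s.text.strip() for s in item.findall(DC + "subject") if s.text]
        # Items with no subject are kept together rather than dropped.
        for subject in subjects or ["unclassified"]:
            groups[subject].append(link)
    return dict(groups)


# Made-up feed: the first item has two classification codes.
SAMPLE = """<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
<channel><title>Example</title>
<item><link>http://example.org/oer/1</link>
<dc:subject>G100</dc:subject><dc:subject>F300</dc:subject></item>
<item><link>http://example.org/oer/2</link><dc:subject>G100</dc:subject></item>
<item><link>http://example.org/oer/3</link></item>
</channel></rss>"""

groups = split_by_subject(SAMPLE)
```

The first item ends up in two groups, which is exactly the duplication management issue: if each group is deposited as a separate feed, the same OER is deposited twice unless something downstream recognises it.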

Feed setup

News feeds are the most common use of RSS or Atom; they are typically limited to a fixed number of the most recent items (this does not preclude multiple subject-based feeds from a repository, as mentioned above). Feeds can, however, contain the entire contents of the repository or all the results for a search term.

If the feed contains the entire contents of the repository (or search) it inevitably becomes very large. Large feeds tend to time out in browsers and can be difficult to ingest (as outlined in Xpert’s paper). However, many aggregator services prefer this approach as it provides a straightforward way to maintain currency, avoid duplication, and not have to consider partial deletion. This is because each time the feeds are polled/gathered the previous index created by the aggregator is deleted, and only the content currently in the feed is indexed.

Magazine type feeds are the most common form of RSS and are more likely to be the default feed produced by repositories or other platforms; they are usually small. However, to build an aggregation service or collection based upon them would require items from feeds to be stored in an incrementally built index (i.e. new items from feeds are added to a persistent index that retains their information even after they are no longer present in the feed). This works if there are unique and persistent identifiers for feed items or OERs included in the feed record and OERs do not end up with multiple feed identifiers.
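The two indexing approaches described above can be sketched in a few lines of Python. The data structures are hypothetical (an index is just a dict keyed by item identifier, and a feed is a list of dicts); the point is only to show why replacement handles deletion for free while incremental addition depends entirely on stable identifiers.

```python
def replace_index(_previous, feed_items):
    """Whole-feed approach: throw away the old index and keep only what is in the feed now."""
    return {item["id"]: item for item in feed_items}


def incremental_index(previous, feed_items):
    """Magazine-feed approach: keep everything seen so far, adding or refreshing by id."""
    merged = dict(previous)
    for item in feed_items:
        merged[item["id"]] = item
    return merged


# Two made-up polls of the same feed: "a" has dropped out and "c" has appeared.
poll_1 = [{"id": "a", "title": "First"}, {"id": "b", "title": "Second"}]
poll_2 = [{"id": "b", "title": "Second"}, {"id": "c", "title": "Third"}]

replaced = replace_index(replace_index({}, poll_1), poll_2)
accumulated = incremental_index(incremental_index({}, poll_1), poll_2)

# The same OER reappearing under a new identifier is stored twice by the incremental index.
poll_3 = [{"id": "b-v2", "title": "Second"}]
with_duplicate = incremental_index(accumulated, poll_3)
```

After the second poll the replaced index holds only "b" and "c", so "a" has effectively been deleted, while the accumulated index still holds all three. The third poll shows the deduplication problem: nothing in the index can tell that "b" and "b-v2" are the same resource.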

OER currency

I’d suggest that the discussion about how to tell if a given OER has updated is a management and policy question to do with versioning and should be out of scope for this discussion. If a uri/url is provided for an OER, I think subsequent versions of the OER should have different urls as they are different things! There is a difference between an academic’s view of an OER as constantly in flux and a digital asset management perspective which needs a clear notion of the persistence or fixity of a released OER.

Feed currency

How often feeds should be polled to check for new items is something which has to be agreed. The polling frequency is affected by the type of feeds being consumed (magazine feeds will need to be polled more frequently) and will impact on the performance of the index.

Upload

Jorum have currently indicated that uploading OERs via RSS is out of scope. Upload would probably require some form of persistent and locally unique identifier for each OER to be included in the feed.

Deletion

There are wider questions in connection to deletion from Jorum, but in the context of RSS link deposit, deletion is only an issue if Jorum opts for some form of incrementally built index. RSS is not designed to manage the deletion of items.

Overview of combinations of RSS options

I’ve created this table to try to pull together some of the interdependent issues relating to feed processing.

                   | A                             | B                             | C                             | D                             | E                             | F
Feed type          | Feed of all OERs              | Feed of all OERs              | Subject feed of OERs          | Subject feed of OERs          | Magazine feed of OERs         | Magazine feed of OERs
Feed size          | Very Big                      | Very Big                      | ‘Medium’                      | ‘Medium’                      | Small                         | Small
Update             | Replace                       | Incremental addition          | Update                        | Incremental addition          | Incremental addition          | Update
Coverage           | Whole current collection      | Whole cumulative collection   | Current subject collection    | Cumulative subject collection | Whole collection (gradually)  | Transient snapshot of collection
Deletion           | occurs as a feed is replaced  | does not occur automatically  | occurs as a feed is replaced  | does not occur automatically  | does not occur automatically  | occurs as a feed is replaced
OER Deduplication? | Not significant               | Issue                         | Minor issue                   | Issue                         | Issue                         | Not significant

Other options

OAI-PMH

As a precursor, JorumOpen does not currently act as an OAI harvester so this is a somewhat moot point (Note: the software required for OAI-PMH export is distinct from the software needed for harvesting).

Within the programme not all OER producers are using repositories and, of those that are, not all repositories have OAI-PMH enabled. So, although there is some established practice of harvesting metadata via OAI-PMH, it would be at best a partial solution.
There are pieces of software which can add support for OAI-PMH export but adapting and implementing them creates an additional development task for projects.

OAI-PMH harvesting has some built in support for resumption (incremental harvesting of metadata from large repositories) and has some support for record deletion but this is not always well supported.
OAI-PMH harvesting services have a mixed record – a key point of note is that they invariably need time to set up.
OAI-PMH harvesting will face many similar issues to RSS harvesting – in that identifiers will point to splash pages and that resources themselves are not harvested.
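For anyone unfamiliar with how resumption and deletion show up in practice, here is a small Python sketch that reads one ListRecords response. It uses the OAI-PMH 2.0 namespace and the protocol's header status="deleted" and resumptionToken conventions, but the sample response and the function are made up for illustration, and no network request is made.

```python
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"


def read_list_records(response_text):
    """Return (live identifiers, deleted identifiers, resumption token or None)."""
    root = ET.fromstring(response_text)
    live, deleted = [], []
    for header in root.iter(OAI + "header"):
        identifier = header.findtext(OAI + "identifier")
        # Repositories that track deletions flag them on the record header.
        if header.get("status") == "deleted":
            deleted.append(identifier)
        else:
            live.append(identifier)
    token_el = root.find(f".//{OAI}resumptionToken")
    # An absent or empty token means there is nothing more to fetch.
    token = token_el.text.strip() if token_el is not None and token_el.text else None
    return live, deleted, token


# Made-up response: one live record, one deleted record, and more to come.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
<ListRecords>
<record><header><identifier>oai:example.org:1</identifier>
<datestamp>2010-01-10</datestamp></header></record>
<record><header status="deleted"><identifier>oai:example.org:2</identifier>
<datestamp>2010-01-12</datestamp></header></record>
<resumptionToken>batch-2</resumptionToken>
</ListRecords></OAI-PMH>"""

live, deleted, token = read_list_records(SAMPLE)
```

A harvester would keep re-issuing the request with the returned token until none comes back. Note that the identifiers here are record identifiers, so the splash page issue described above is untouched by any of this.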

SRU

As a precursor: JorumOpen does not support SRU based harvesting so this is another moot point.
Considering SRU would require this functionality both in contributing repositories or other platforms and in Jorum. It is, however, not yet widely supported or used, though some commercial repositories do implement it and there are open source clients to bolt on to repositories or other platforms. This requires developer time and is likely to be a partial solution only.

Deposit API?

The current deposit tool is based on third party software MRCUTE which runs through Moodle – as such it cannot easily be adapted by Jorum to provide an API.
Jorum are, however, exploring the addition of a SWORD deposit endpoint. Suitable SWORD deposit tools would need to be identified (i.e. those that can handle the right metadata and cope with something that isn’t a research paper – given the research focus of SWORD tool development these are likely to be some of the less-developed tools).

RSS for deposit, Jorum and UKOER: part 1 review
http://blogs.cetis.org.uk/johnr/2010/02/04/rss-for-deposit-jorum-and-ukoer-part-1-review/
Thu, 04 Feb 2010 14:17:20 +0000

Over the past few months CETIS and Jorum have been discussing approaches to bulk deposit to support the projects in the UKOER programme as they deposit or represent their OERs in Jorum. Based on feedback from projects gathered through our technical reviews of projects, we’ve investigated approaches which might work for the programme.

One option we have investigated is the use of RSS. Gareth Waller from Jorum produced a set of feed requirements and a discussion paper suggesting possible issues with the use of RSS. A number of projects have trialled their feeds and provided feedback on Lorna’s blog post introducing Gareth’s paper and outlining the issues. The Xpert project has also produced a briefing paper looking at issues around RSS-based deposit. (Considerations and evaluations of the development of distributed repositories when using RSS aggregation as a submission protocol. By Pat Lockley, The University of Nottingham http://webapps.nottingham.ac.uk/elgg/xpert/files/-1/803/xpert+metadata+final.pdf )

Many thanks to Gareth and Laura from Jorum and everyone else who’s contributed to the discussion thus far. This post is a summary of that discussion and comment about other options suggested.

Please note this conversation is shaped by the constraints of the programme. The discussion below focuses on the relation of a single OER project producing a feed or feeds of resources to contribute to Jorum. Issues of how Jorum addresses and combines data feeds from different projects and provides standardised data are a separate discussion.

RSS

Suitability

Although submission to a repository isn’t the primary purpose of RSS, it does have functionality and features that may make it suitable for such a purpose. The investigation of RSS as an option for submitting content to Jorum began with the observations:

Jorum’s requirements

Jorum produced an outline of their minimum requirements for feed-based ingest and a briefing paper summarizing their current take on issues around RSS for deposit.

Feed format and content

Jorum’s current requirements are:

  1. RSS version 2.0 feed
  2. At least one element belonging to one of the following namespaces directly under the channel element. Metadata for all items must be represented in elements belonging to this namespace.
    1. http://www.imsglobal.org/xsd/imsmd_v1p2
    2. http://ltsc.ieee.org/xsd/LOM
    3. http://purl.org/dc/elements/1.1/
  3. Licence information on each item (in the relevant metadata element). This must contain a v2 Eng & Wales CC licence url e.g.
    1. DC : rights e.g. Licensed under a Creative Commons Attribution – NonCommercial-ShareAlike 2.0 Licence – see http://creativecommons.org/licenses/by-nc-sa/2.0/uk/
    2. IMSMD : rights/description/langstring
    3. LOM: rights/description/string
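The requirements above can be turned into a rough pre-submission check. The Python sketch below is my own reading of them, not Jorum's ingest code: it only looks for licence information in dc:rights (the IMS and LOM paths are not handled), and the licence URL pattern is an assumption generalised from the single example URL given above.

```python
import re
import xml.etree.ElementTree as ET

ACCEPTED_NAMESPACES = (
    "http://www.imsglobal.org/xsd/imsmd_v1p2",
    "http://ltsc.ieee.org/xsd/LOM",
    "http://purl.org/dc/elements/1.1/",
)
DC_RIGHTS = "{http://purl.org/dc/elements/1.1/}rights"
# Assumed pattern, based on the example licence URL in the requirements.
LICENCE_URL = re.compile(r"https?://creativecommons\.org/licenses/[a-z-]+/2\.0/uk/")


def check_feed(rss_text):
    """Return a list of problems found; an empty list means the sketch found none."""
    problems = []
    root = ET.fromstring(rss_text)
    if root.tag != "rss" or root.get("version") != "2.0":
        problems.append("not an RSS 2.0 feed")
    channel = root.find("channel")
    if channel is None:
        problems.append("no channel element")
        return problems
    # Requirement 2: some direct child of channel must be in an accepted namespace.
    has_ns_child = any(
        child.tag.startswith("{" + ns + "}")
        for child in channel
        for ns in ACCEPTED_NAMESPACES
    )
    if not has_ns_child:
        problems.append("no metadata-namespace element directly under channel")
    # Requirement 3 (DC case only): every item needs a licence URL in dc:rights.
    for n, item in enumerate(channel.findall("item"), start=1):
        rights = item.findtext(DC_RIGHTS) or ""
        if not LICENCE_URL.search(rights):
            problems.append(f"item {n}: no v2 UK CC licence URL in dc:rights")
    return problems


# Made-up feed that satisfies the three checks, and a variant with a different licence.
GOOD = """<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
<channel><title>Example project OERs</title>
<dc:publisher>Example project</dc:publisher>
<item><title>Lecture notes</title><link>http://example.org/oer/1</link>
<dc:rights>See http://creativecommons.org/licenses/by-nc-sa/2.0/uk/</dc:rights></item>
</channel></rss>"""
BAD = GOOD.replace("/2.0/uk/", "/3.0/")
```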

Feed processing

Jorum currently processes the feed as follows:

  1. “The feed is *not* continually polled for new content. […] The current functionality simply reads the feed when it is deposited and all the items are created in DSpace. It’s a snapshot in time of that RSS feed. If you add in the same feed again, it will store duplicates.
  2. The physical data of a resource in the feed is not stored in JorumOpen. A link is simply created pointing to the resource as indicated by the RSS feed (the “link” element).”
  3. “The feed MUST be valid XML – if the XML coming back isn’t valid in the first place then we cannot process it (neither can any validator, XML reader etc). ”
  4. “Items within a feed are not auto classified within Jorum. In other words, every item in a feed is stored within a single collection as chosen by the admin user i.e. a top level JACS or LearnDirect classification. Having individual feed for each classification such as the OpenLearn model would ensure that items are classified correctly as these feeds can be deposited separately.”

Possible issues about the use of feeds

In his paper Gareth raises a number of issues and questions including the following:

  1. RSS items need to contain the unique id of the OER
  2. It’s not yet clear how to tell from the feed if an OER has changed or been deleted
  3. Feeds should not contain the whole repository contents
  4. There is the possibility that OERs might fall between arbitrary limits for feed creation (a feed of the 50 most recent items, polled every day, misses resources above this number)
  5. The richness of the metadata held within the platform creating the RSS may be lost, because the feed creation process or the feed consumption process restricts it to a subset of the available fields.
  6. Feed deposit needs to make assumptions about licensing
  7. Current exploration of feed deposit relates only to harvesting metadata, not to harvesting resources.
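The arithmetic behind point 4 is simple but worth spelling out. This Python sketch assumes one poll per day and a feed that only ever shows the most recent items up to a fixed window; the daily figures are invented.

```python
def missed_items(added_per_day, window):
    """For each day, count new items that had already dropped out of the feed by poll time."""
    return [max(0, added - window) for added in added_per_day]


# Invented figures: a quiet day, a bulk upload, and a day that exactly fills the window.
missed = missed_items([10, 120, 50], window=50)
```

On the bulk upload day 70 of the 120 new items are never seen by the harvester, which is precisely the situation a project depositing a backlog of OERs would be in.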

Part 2 of this post will look at the community responses to this proposal and look at emerging issues.

Comparing metadata requirements for OERs (part 3)
http://blogs.cetis.org.uk/johnr/2009/09/02/comparing-metadata-requirements-for-oers-part-3/
Wed, 02 Sep 2009 13:42:39 +0000

The first two parts of this foray into metadata requirements for Open Educational Resources examined: 1) how the required information for the UKOER programme compared with the requirements for the Jorum deposit tool and the DiscoverEd aggregator; 2) how the UKOER requirements compared to the information projects thought would be necessary for particular activities (find, identify, use, cite, manage, select). In this final part I’ll offer some personal reflections on the implications of these comparisons and comment on the role of educational metadata and annotations.

It was perhaps predictable, though not essential, that there would be close correspondence between the programme requirements and Jorum’s requirements but it was good to see that the UKOER’s metadata requirements were comparable to those of ccLearn’s aggregator. I’m glad to note that alongside The University of Nottingham initiative UNOW and the Open University’s Open Learn,  Leeds Met’s Unicycle project are also thinking about this (http://twitter.com/mrnick/statuses/3575276663 ). As outlined in part 2, what proved more interesting is that, when as a programme we thought about some of the information that the users of our resources would need, the programme requirements were a subset of that list.

One thing I’d highlight here, before I do that, is that the one piece of information that we agreed was essential to use an OER was clear licence information. I’m sure that this will get discussed a lot more but in the wider discussions going around the programme it is becoming clear that this information needs to be available as part of the asset for people to read (for example, as a cover page statement), in the metadata (to support licence specific searching), and in the RSS feed – so that it’s clear to aggregators.

Educational description

Part 1 noted DiscoverEd’s use of educational content and sparked some comments about the use of educational context – specifically focused on the issue of educational level; in part 2 information about educational context emerged from our discussions about what we would want to know to interact with the OERs.

Developing best practice guidelines for educational metadata in the UKOER programme is an ongoing process and one in which we’ll probably be tracking what the projects find useful as much as, if not more than, we make recommendations. CETIS has existing guides to metadata at http://wiki.cetis.org.uk/Guides_to_metadata and summaries of relevant metadata standards at http://wiki.cetis.org.uk/Educational_metadata_standards . We’re beginning to gather specific guidance, links, and best practice information at http://wiki.cetis.org.uk/Educational_Content_OER but these guidelines are very much just beginning.

As Phil and Andy L noted on part 1 – if you’re dealing with resources ranging from primary school to postgraduate or CPD, the ability to quickly filter by broad educational level is quite important, and for the purpose of wider interoperability recording educational level allows a richer service in some aggregators (currently DiscoverEd). But it’s hard to know what an appropriate and useful granularity would be – especially given that the audience of an OER is global. Perhaps stating on the resource what class or course something was used for is a good idea, as it provides the user (though not the system) with an understandable point of reference. In terms of metadata – if Jorum uses UKEL (I can’t remember and can’t access it) this is probably the right vocabulary to use. Given the focus of the programme on HE the appropriate (multiple) UKELs may be able to be added to batches of project resources – though broad, this would give aggregators with a wider remit something to work with.

In terms of wider educational description – intended use, context of use, requirements for use, instructional method, are a few of the candidates. However, providing this sort of information has the potential to rapidly move away from the light touch approach to metadata that has characterised the programme thus far, and significantly add to the ‘cataloguing’. A quick glance at the practice of many of the successful OER initiatives suggests limited educational metadata may be the way to go; OCW & MERLOT record the item type but with vocabularies geared to their collections.

The upcoming technical discussions with projects will begin to establish what educational information they are recording and help frame some guidelines.

Usage information

Some other types of information that our discussions suggested would be really helpful (especially in the context of managing OER collections) were usage information, user ratings, and comments. This presents something of a challenge, as:

  1. Usage statistics are often application specific and not part of structured metadata.
  2. Annotations are a sort of metadata but they actually form resources in their own right and it’s both tricky and somewhat messy to include them in metadata – especially if the resource and metadata then move (as they are intended to). As I understand it one possible approach would be to use OAI-ORE to associate distributed annotations that you were aware of with a resource.

I’m not yet aware of best practice in this area, nor aware of what projects are planning to do about recording OER use or distribution, but suspect that, taking the two points above in turn:

  1. Recording usage statistics is beginning to move towards the bigger discussion about tracking of OERs.
  2. Handling annotations is going to depend a lot on the capabilities of the tools and systems projects are using and how they record annotations or ratings.

More details about both of these issues will again emerge through the technical discussions but I suspect best practice for statistics or an investigation of tracking are getting outside of the scope of our work.

Comparing metadata requirements for OERs (part 2)
http://blogs.cetis.org.uk/johnr/2009/08/31/comparing-metadata-requirements-part-2/
Mon, 31 Aug 2009 15:47:20 +0000

In Comparing metadata requirements (part 1) I examined the required and suggested metadata for Open Educational Resources in the UKOER programme, for the Jorum deposit tool, and the DiscoverEd aggregator. In this second part of the comparison I’m going to try to capture some of our initial discussions from the UKOER programme session about metadata and then very roughly compare the brainstorming we did as part of that event with the requirements of the programme and other initiatives.

To begin with let’s have a look at a graphical overview of the requirements from the three initiatives. The graph below displays an overview of the metadata requirements of the UKOER programme, the Jorum deposit tool, and the DiscoverEd aggregator service. Full height bars are mandatory elements, three quarter height bars are system generated elements and half height bars are recommended metadata elements.

[Figure: Graphical overview of metadata requirements relating to the UKOER programme]

In the elluminate session, we asked the participating projects to consider what metadata they would require to:

  • identify
  • find
  • select
  • use
  • cite
  • manage

a resource. Participants then shared their suggestions in the chat box. I’ve put the data together and, where appropriate, combined or split entries. The graph below represents the group’s suggestions of which possible pieces of information are important for each of the outlined functions. (Some caveats are in order. The exercise was not rigorous; the number of participants answering at a given time varied; the first question, ‘identify’, had a higher and broader response rate – this may relate to how we explained the exercise. The answers were free text and not from a prior list. Unless specified, the idea of a date is counted for both creation/initial use and upload/publication. It also became clear that for functional purposes rights mostly collapsed into licence, and thus it was dropped from the graph.)

Condensed outputs of metadata responses

[Figure: Graph displaying a summary of metadata requirements for functions from brainstorming in 2nd Tuesday session]

The idea of educational level is implicit in educational description/context and so it should probably be included with that category (this has not been done in this graph, however, as educational level is singled out in the requirements). It is not entirely clear what was intended by some of the descriptions – e.g. coverage.

Examining the graph it is clear that there are some key pieces of information for particular uses. Across the entire set of functions the key information appears to be: author, date of publication, subject and description, educational description/intended use (if combined with educational level) and usage data.

The key information for each function was:

  • identify – subject
  • find – description
  • select – licence
  • use – licence
  • cite – date of publication
  • manage – usage data

By way of enabling a comparison with the metadata requirements, the top three responses for each function were collated (and then taken as factor of one) – this was done to provide an approximate indication of overall importance that could be compared to the requirements data.

[Figure: OER required metadata compared to brainstorming]

From this comparison it is interesting to note the following:

  1. Of the two contentious mandatory metadata elements (file format and file size), file format/MIME type was actually considered to be functionally important.
  2. Recording the language of an OER was not considered to be critical for any of the functions – though all the initiatives consider it important (this may be attributable to the programme-based context of our discussion).
  3. The institution/publisher is surprisingly unimportant functionally (unless you are, like Jorum, hosting materials).
  4. Licence is probably the most important piece of information for an OER.
  5. Usage data and user ratings are considered to be critical pieces of information. They are not, however, included in metadata profiles, though it is likely that this information would be generated over time by the relevant host services.

There’s probably more to say about this data and more to do with it but for now at least – that’s plenty to reflect on.

Comparing metadata requirements for OERs (part 1)
http://blogs.cetis.org.uk/johnr/2009/08/26/comparing-metadata-requirements-for-oers-part-1/
Wed, 26 Aug 2009 16:07:42 +0000

In our elluminate session on metadata and aggregation for Open Educational Resources, Phil and I spent some time getting everyone to think through the information required to interact with an educational resource in certain ways (such as: (re-)use, cite, find, identify, manage). This produced a lot of responses prioritizing different bits of information that are needed. I’ve not gone through my notes thoroughly yet but on the whole participants agreed that the metadata which the programme asked for was needed (the main element of contention was file format and size which thankfully are probably the most automatable of metadata).

With this in mind I was interested to read about ccLearn’s work on developing a tool to provide an enhanced search of aggregated OERs and their metadata recommendations for sources.

“DiscoverEd is an experimental project from ccLearn which attempts to provide scalable search and discovery for educational resources on the web. Metadata, including the license and subject information available, are exposed in the result set.” http://wiki.creativecommons.org/DiscoverEd_FAQ

There’s a lot more to be said about their work as I’m still trying to figure out how it is similar to and differs from all the previous work done on aggregating repositories (at first glance – it’s got the advantage of web friendly syndication/transport standards but potentially less robust/standardised descriptive standards). Today, however, I thought it would be interesting to compare minimum metadata sets for OERs that I’m aware of and that are intended for multi-organisation/institutional use (i.e. not just what a given organisation has decided as a minimal set for its metadata).

UKOER Mandatory Metadata:

from: http://blogs.cetis.org.uk/lmc/2009/03/30/metadata-guidelines-for-the-oer-programme/

  • programme tag
  • author
  • title
  • date (uploaded/ creation)
  • url
  • file format
  • file size
  • [I’m fairly sure rights is on some versions of this list but it doesn’t appear on this one]

Suggested metadata

  • language
  • subject classifications
  • keywords
  • tags
  • comments
  • description

DiscoverEd metadata

http://wiki.creativecommons.org/CcLearn_Search_Metadata

All the metadata is optional but the following is highly recommended :

  • title
  • summary
  • language
  • education level
  • licence
  • subject

Jorum’s OER deposit tool

Gareth Waller summarized the metadata requirements of the Jorum OER deposit tool in a comment on http://repositorynews.wordpress.com/2009/08/05/musing-about-metadata-for-oer/
“The profile is as follows:

Mandatory metadata set:

  • Title
  • Overview (Description)
  • Keywords
  • Author Name
  • Licence

Recommended metadata set:

  • Project name
  • Creation date
  • Classification (JACS subject classification)

System Generated metadata set:

  • Publisher
  • Contributed Date
  • Language
  • Identifier

The ‘keywords’ metadata is currently user generated and does not use a controlled vocabulary.”

Comparing the lists, it’s easy to see some of the reasoning behind the chosen metadata sets: for example, Jorum’s deposit tool can take advantage of information from Shibboleth and user profiles. It is also very encouraging to see their overlap, but for me these sets raise a few issues:

  • Knowing a file size is important, but are we reaching a point when this information is part of the programme/ browser?
    • I think we still need to record it but am not sure, as I’m fairly certain that often when a file size is displayed to someone selecting/ downloading, it’s being generated from the file/ by the browser, not from the metadata.
  • Educational Level…
    • I’m surprised to see this in ccLearn’s list – for all its simplicity it’s thus far proved a nightmare to agree on educational levels. Not only is it nightmarish cross-culturally, but even within countries it’s not easy. I’ll pass over UK Educational Levels quickly and point out a project I’ve mentioned before, Standard Connection – an NSF project trying to map curricula within the US. I’m not sure what progress they made but do know it certainly wasn’t straightforward.

The inclusion of educational level does however point to the difference between what educators think is necessary and what is easy to provide. I’ll come back to this in part 2 when I’ll try to wrangle some sense out of our Elluminate session surveys.

I’ll note two things in passing by way of interim conclusion:

  • that OCW are discussing if they should have a minimal metadata set (http://cloudworks.ac.uk/cloud/view/1493).
  • that the suggested basic metadata for ccLearn is similar enough to the required and suggested metadata for DiscoverEd that there’s no reason that UKOER projects can’t (at no extra cost :) ) publish their collections there too. The University of Nottingham initiative UNOW is doing this already. [edit: the Open University’s initiative Open Learn is there too]
http://blogs.cetis.org.uk/johnr/2009/08/26/comparing-metadata-requirements-for-oers-part-1/feed/ 13
RSS, Yahoo Pipes, and UKOER projects http://blogs.cetis.org.uk/johnr/2009/08/21/rss-yahoo-pipes-and-ukoer-projects/ http://blogs.cetis.org.uk/johnr/2009/08/21/rss-yahoo-pipes-and-ukoer-projects/#comments Fri, 21 Aug 2009 11:36:41 +0000 http://blogs.cetis.org.uk/johnr/?p=141 or Making finding my way around the UK OER programme one feed at a time a little easier with some help from Yahoo.

In an effort to familiarize myself with the OER programme I hunted for project websites and blogs to add their feeds to my Netvibes. As not all projects are blogging this is only a partial method of engagement, but the mixture of news and discussion of issues on the blogs that exist has helped me begin to find my way around. In the process of doing this I found not only blog RSS feeds but also one or two feeds of OER resources. Although projects will all be producing RSS feeds for their resources as part of the programme, I hadn’t expected to find any yet.

As convenient as this all was, I still had twenty or so new boxes in Netvibes to scan along with all the others. However, one of the things that clearly emerged from the UKOER 2nd Tuesday on metadata that Phil and I ran was that feeds of resources are going to be very important in this programme. I think we all knew this, and the programme had mandated that projects should produce a feed of their resources, but I was struck by what projects were already doing and some of their future plans. Coming from a repositories background I’ve tended to think of feeds for announcements or, with SWORD, deposit, but thought of OAI-PMH or the like for ‘serious’ resource discovery or aggregation. I think OAI-PMH will have a role (and so do CCLearn: http://learn.creativecommons.org/wp-content/uploads/2009/07/discovered-paper-17-july-2009.pdf – more on that another time) but it came home to me how important RSS/Atom is – especially in an environment where resources are being managed and made available using many different types of software. With all that in mind it was time to finally try some pipes.

Simple Yahoo pipe of feeds from UKOER Individual strand projects

I pulled together, strand by strand, the feeds from the project blogs I could find to make these feeds:

Institutional
http://pipes.yahoo.com/pipes/pipe.info?_id=e2288118932fa9ea996b7bb41120cfb7

Subject
http://pipes.yahoo.com/pipes/pipe.info?_id=baef584cd9fcb923605936dea916f47c

Individual
http://pipes.yahoo.com/pipes/pipe.info?_id=422a1e8bbd65b24a12421db4a91e25ee

And then the one feed to rule them all…
http://pipes.yahoo.com/pipes/pipe.info?_id=65080a2934b865342686e96deaf9add3

However, that feed could get quite busy – so for those days when life’s too short to hover over 10 new blog post titles – here’s a version that only gathers posts with the word metadata associated with them.
http://pipes.yahoo.com/pipes/pipe.info?_id=9bd5e2d97bb3267d52fc7e77103aacdb

It looks a bit like this:

Yahoo pipe for UKOER project blogs with word metadata associated
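
For anyone who would rather not rely on Pipes, the same merge-and-filter idea can be sketched in a few lines of Python. This is only an illustration of the idea, not how the pipe itself is built: the two sample feeds and their example.org URLs are made up, the function names are mine, and matching the keyword against the title and description is an assumption about what ‘associated with’ should mean.

```python
import xml.etree.ElementTree as ET

# Two made-up RSS 2.0 feeds standing in for project blogs (hypothetical content).
BLOG_A = """<rss version="2.0"><channel><title>Project A</title>
<item><title>Metadata for our first OERs</title><link>http://example.org/a/1</link>
<description>Choosing fields.</description></item>
<item><title>Project kick-off</title><link>http://example.org/a/2</link>
<description>Meeting notes.</description></item>
</channel></rss>"""

BLOG_B = """<rss version="2.0"><channel><title>Project B</title>
<item><title>Deposit workflow</title><link>http://example.org/b/1</link>
<description>How metadata is added at deposit.</description></item>
</channel></rss>"""


def rss_items(rss_xml):
    """Yield the title, link and description of each <item> in an RSS string."""
    root = ET.fromstring(rss_xml)
    for item in root.iter("item"):
        yield {
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        }


def merge_and_filter(feeds, keyword):
    """Merge several feeds, keeping each item that mentions the keyword once."""
    keyword = keyword.lower()
    seen_links = set()
    matches = []
    for rss_xml in feeds:
        for entry in rss_items(rss_xml):
            text = (entry["title"] + " " + entry["description"]).lower()
            if keyword in text and entry["link"] not in seen_links:
                seen_links.add(entry["link"])
                matches.append(entry)
    return matches


for entry in merge_and_filter([BLOG_A, BLOG_B], "metadata"):
    print(entry["title"], entry["link"])
```

In practice the feed strings would be fetched from the blogs rather than written inline, and a real aggregator would also need to cope with Atom feeds and with sorting by date.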

By the way, for those interested in a feed of UKOER / OER tagged posts from CETIS here’s a feed from our (John, Phil, Li, Lorna, Sheila, Rowin, Scott) blogs http://pipes.yahoo.com/pipes/pipe.info?_id=13b03a2e49b7eb7e7a57d1e1c8961916

With this done I decided to use the time I’d saved (…) to try something else: a pipe for the resource feeds.
http://pipes.yahoo.com/pipes/pipe.info?_id=d93dbd0965e0e9d8399b7818e446b0da

Now I need to point out that this last feed is in many ways a ‘toy’ – I don’t know how the resource feeds I’m grabbing have been set up, so I don’t know what coverage over time it’ll give. I also know it’s a very poor imitation of the much more careful work done on aggregating by the Steeple project http://www.steeple.org.uk/wiki/Ensemble and Scott in the Ensemble project http://galadriel.cetis.org.uk/ensemble/feeds?q=poetry. One thing I noticed in particular was that the feeds from different software seem to diverge in their use of different metadata fields within RSS.
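
To give a flavour of what that divergence means for anyone aggregating, here is a small sketch of reading the same piece of information from whichever field a feed happens to use. I haven’t recorded which fields actually differed between the feeds, so the choice of the RSS author element versus Dublin Core’s dc:creator is an assumption for illustration, and the sample feed is made up.

```python
import xml.etree.ElementTree as ET

# Dublin Core creator element in ElementTree's {namespace}tag notation.
DC_CREATOR = "{http://purl.org/dc/elements/1.1/}creator"

# A made-up feed whose items record the author in different ways.
FEED = """<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/"><channel>
<item><title>Resource one</title><author>a.person@example.org (A. Person)</author></item>
<item><title>Resource two</title><dc:creator>B. Person</dc:creator></item>
<item><title>Resource three</title></item>
</channel></rss>"""


def item_author(item):
    """Return the first non-empty author-like value for an item, or None."""
    for tag in ("author", DC_CREATOR):
        value = item.findtext(tag)
        if value and value.strip():
            return value.strip()
    return None


authors = [item_author(item) for item in ET.fromstring(FEED).iter("item")]
print(authors)
```

The same fall-back pattern would apply to dates, descriptions or licence information, but each extra field is another guess about what the source software has done.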

That said, as of today, my imperfect aggregation is showing 112 OERs so far and the programme’s just getting started.

http://blogs.cetis.org.uk/johnr/2009/08/21/rss-yahoo-pipes-and-ukoer-projects/feed/ 2
Notes from the web: metadata related reports http://blogs.cetis.org.uk/johnr/2009/03/11/notes-from-the-web-metadata-related-reports/ http://blogs.cetis.org.uk/johnr/2009/03/11/notes-from-the-web-metadata-related-reports/#comments Wed, 11 Mar 2009 12:09:56 +0000 http://blogs.cetis.org.uk/johnr/?p=140 There have been two reports relating to metadata released recently that I’ve been meaning to read and blog about: OCLC’s What We’ve Learned from the RLG Partners Metadata Creation Workflows Survey and DLF’s Future Directions in Metadata Remediation for Metadata Aggregators. However, I’m not going to get a chance to do more than skim these for a while – so while they’re fresh here’s the links.

What We’ve Learned from the RLG Partners Metadata Creation Workflows Survey

http://www.oclc.org/programs/publications/reports/2009-04.pdf
There’s an interesting comparison of the tools and people used to create MARC and non-MARC materials.
Of particular interest is how little libraries seem to be involved in ‘educational’ metadata – 8 out of 78 respondents are involved in creating metadata for learning objects. Now I know there’s a world of difference between learning materials with educational metadata and learning object metadata, but looking at the metadata standards being used, educational description still seems to be out of scope. It is also of note how little of the respondents’ metadata is currently being pushed to/ used by newer forms of exposure – web2.0 tools and SRU.

Future Directions in Metadata Remediation for Metadata Aggregators

http://www.diglib.org/pubs/dlf110.pdf
Many repository and digital library services operate on the premise that exposing their metadata will enable information about their content to be made available in larger resource discovery services that aggregate metadata from many sources. Such aggregators provide a valuable service but often have well documented problems with variation in the metadata which they harvest. Whether the metadata is of poor quality or simply designed to support local needs without consideration of a wider context, aggregated metadata needs to be cleaned and otherwise processed to provide a better discovery service. This report examines key services/ features that an aggregated search service would hope to provide and, for each, documents how metadata supports that service, what tools exist to ‘fix’ harvested metadata, and what tools are desired; it also provides a bibliography and comments. As such this report should represent an overview of the state of the art (and I really need to read it soon…).

http://blogs.cetis.org.uk/johnr/2009/03/11/notes-from-the-web-metadata-related-reports/feed/ 0
From the web: tools for project management and how to contract developers http://blogs.cetis.org.uk/johnr/2008/11/07/from-the-web-tools-for-project-management-and-how-to-contract-developers/ http://blogs.cetis.org.uk/johnr/2008/11/07/from-the-web-tools-for-project-management-and-how-to-contract-developers/#comments Fri, 07 Nov 2008 14:39:01 +0000 http://blogs.cetis.org.uk/johnr/2008/11/07/from-the-web-tools-for-project-management-and-how-to-contract-developers/ Over on the IE blog Andy McGregor has a useful annotated list of some of the online tools that he has used to help with programme management.

Many could be just as useful for projects – distributed or not.

http://infteam.jiscinvolve.org/2008/11/06/web-tools-for-programme-management/

On a related note, David Flanders has a useful and extensive reflection and how-to guide on contracting out software development.

 http://dfflanders.wordpress.com/2008/11/04/how-to-contract-consultant-developers/

http://blogs.cetis.org.uk/johnr/2008/11/07/from-the-web-tools-for-project-management-and-how-to-contract-developers/feed/ 0
Metadata in an Ecosystem of Presentation Dissemination http://blogs.cetis.org.uk/johnr/2008/09/25/metadata-in-an-ecosystem-of-presentation-dissemination/ http://blogs.cetis.org.uk/johnr/2008/09/25/metadata-in-an-ecosystem-of-presentation-dissemination/#comments Thu, 25 Sep 2008 11:36:18 +0000 http://blogs.cetis.org.uk/johnr/2008/09/25/metadata-in-an-ecosystem-of-presentation-dissemination/ Metadata in an Ecosystem of Presentation Dissemination
R. John Robertson, Phil Barker, Mahendra Mahey

How and why do academics disseminate their presentations? How does this relate to their other forms of dissemination? What academic and organisational influences affect their dissemination? What influences their choice of tool? What metadata is created about the various things that are disseminated? Who creates that metadata, and is the duplication necessary?

Our poster at DC2008 (http://dc2008.de/programme/posters) is a case study of ‘one’ academic’s dissemination of their presentations. It uses the repository ecology approach we’ve been developing and the resulting poster allows some interesting questions to be raised.  Here’s a copy for reference…

Metadata in an Ecosystem of Presentation Dissemination image

Here’s a pdf: Metadata in an Ecosystem of Presentation Dissemination at Dublin Core 2008

http://blogs.cetis.org.uk/johnr/2008/09/25/metadata-in-an-ecosystem-of-presentation-dissemination/feed/ 1
Repositories Research Team http://blogs.cetis.org.uk/johnr/2006/10/23/repositories-research-team/ http://blogs.cetis.org.uk/johnr/2006/10/23/repositories-research-team/#comments Mon, 23 Oct 2006 15:40:38 +0000 http://blogs.cetis.org.uk/johnr/2006/10/23/repositories-research-team/ The short introduction to my job is that I’m the JISCCETIS representative on JISC’s Repositories Research Team (RRT).

Within the team I have a focus on e-learning and provide a degree of support for the e-learning projects in the DRP Programme. As this programme comes to an end and we begin to work more with the Repositories and Preservation strand projects, the team’s focus will shift towards research and synthesis and we will have fewer direct support commitments. In the ongoing research of the team I provide a perspective from the JISCCETIS community and will also be focusing on liaising with the e-Framework.
The team’s self-description from the RRT wiki is:

Repositories Research Team
As part of the Digital Repositories Programme, JISC have established the Repositories Research Team. The remit for the work of the research team is quite wide and includes helping projects find and exploit synergies across the programme and beyond, gathering scenarios and use cases from projects, liaising with other national and international repositories activities, including liaison with the e-Framework, synthesizing project and programme outcomes, and engaging with interoperability standards activity and repository architectures.

The Repositories Research Team is a collaboration between UKOLN and CETIS. UKOLN have worked previously on repositories in a number of contexts including ePrints UK, the Open Archives Forum and Delos, and CETIS, the Centre for Educational Technology Interoperability Standards, has considerable experience in supporting the development of digital repositories for e-learning.
http://www.ukoln.ac.uk/repositories/digirep/index/JISC_Digital_Repository_Wiki

http://blogs.cetis.org.uk/johnr/2006/10/23/repositories-research-team/feed/ 0