Comments on: OER, RSS and JorumOpen
http://blogs.cetis.org.uk/lmc/2009/12/09/oer-rss-and-jorumopen/

By: OER repositories and preservation – the elephant (not in) the room? « Repository News (Fri, 17 Sep 2010)
[…] depending on the tools they were using and the types of resources they were releasing (see http://blogs.cetis.org.uk/lmc/2009/12/09/oer-rss-and-jorumopen/). Responding to a clear demand from the ukoer community who understandably did not wish to dual […]

By: JIF10 at Royal Holloway « Repository News (Fri, 06 Aug 2010)
[…] the demo, I have, in fact, liaised with already in the context of RSS aggregation for ukoer (see http://blogs.cetis.org.uk/lmc/2009/12/09/oer-rss-and-jorumopen/). I was also able to speak with Pat Lockley from the project who told me that Xpert can now […]

By: John’s JISC CETIS blog » RSS for deposit, Jorum and UKOER: part 2 commentary (Thu, 04 Feb 2010)
[…] note that of the 60 feeds they harvest, 5 aren’t valid xml and 20 aren’t valid RSS (see comment on Lorna’s post). It is worth noting that aggregators can and do deal with poorly formed data, however, in the […]

By: Lorna (Mon, 11 Jan 2010)
Thanks for all the comments and discussion folks. There’s certainly no “one size fits all” solution emerging but there’s a huge amount of valuable information here. We will be attempting to synthesise some useful outputs from this discussion shortly.

By: Nick Sheppard (Thu, 07 Jan 2010)
If you can face it, I’ve just posted more on this at http://repositorynews.wordpress.com/2010/01/07/really-not-so-simple-syndication/

By: Really (not so) Simple Syndication « Repository News (Thu, 07 Jan 2010)
[…] has been elevated in priority recently due to two separate, though similar, use-cases being explored by JorumOpen and the Xpert project at Nottingham University, that effectively seek to extend RSS from simply […]

By: Pat Lockley (Mon, 04 Jan 2010)
Hello,

I am the aforementioned Pat.

Working from the topics listed in the paper and from my experience with Xpert:

Item identification
——————-

Xpert works on the basis that the URL for each item is unique (well, it had best be), so this is the “key” in database terms. This is how we distinguish one item from another.
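As a rough sketch of that URL-as-key approach (the table and field names here are hypothetical, not Xpert’s actual schema):

```python
# A minimal sketch of URL-as-key item identification for a feed harvester.
# The schema is illustrative only.
import sqlite3

conn = sqlite3.connect("harvest.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS items (
        url       TEXT PRIMARY KEY,  -- the item URL is the unique key
        title     TEXT,
        last_seen TEXT
    )
""")

def upsert_item(url, title, seen_at):
    # INSERT OR REPLACE keyed on the URL: the same link seen in two
    # harvests is treated as the same item, never as a duplicate.
    conn.execute(
        "INSERT OR REPLACE INTO items (url, title, last_seen) VALUES (?, ?, ?)",
        (url, title, seen_at),
    )
    conn.commit()
```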

Item updates
————

If you’re harvesting URLs, I don’t think there is an issue with this. I would assume that all feeds are very much alive and continually changing or being modified.

Xpert empties all its metadata each night and reharvests, taking a fresh copy of the metadata every time. So if the metadata has been updated, Xpert will reflect that.

I would agree with Jenny that any Jorum-specific information in a feed is going to be confusing to end users, and could limit the ability to submit the same feed to other repositories.

Item deletions
————–
Although Xpert deletes its metadata every night, an item that has been taken out of the feed will still be in the system (we record when it was last found in a feed), but it will be less likely to be found in a search result.

We’ve not explicitly had a problem with this, though.
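A minimal sketch of that “record when it was last found” idea, with a purely illustrative decay so stale items sink in search results rather than being deleted:

```python
# Down-weight items not seen in a recent harvest instead of deleting them.
# The decay function is illustrative only, not Xpert's actual ranking.
from datetime import datetime

def search_weight(last_seen_iso, now=None):
    """Return a ranking weight that decays as an item goes unseen."""
    now = now or datetime.utcnow()
    days_stale = (now - datetime.fromisoformat(last_seen_iso)).days
    # Full weight if found in the last harvest; fades the longer it's missing.
    return 1.0 if days_stale <= 1 else 1.0 / (1 + days_stale)
```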

Missing items
————-
I’m covering this below, but feeds can contain a tag saying how often they should be harvested.

Polling period
————–
See above, but we harvest once a day. Each day brings about 10 or so new items.
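For reference, the tags in question are RSS 2.0’s <ttl> and the RSS 1.0 syndication module’s sy:updatePeriod / sy:updateFrequency; a sketch of reading them (an RSS 2.0 feed layout is assumed):

```python
# Read the "how often to harvest" hints from a feed: RSS 2.0's <ttl>
# (minutes between refreshes) and the syndication module's elements.
import xml.etree.ElementTree as ET

SY = "{http://purl.org/rss/1.0/modules/syndication/}"

def harvest_hint(feed_xml):
    channel = ET.fromstring(feed_xml).find("channel")  # RSS 2.0 layout assumed
    if channel is None:
        return None, None, None
    ttl = channel.findtext("ttl")                    # minutes between refreshes
    period = channel.findtext(SY + "updatePeriod")   # e.g. "daily"
    frequency = channel.findtext(SY + "updateFrequency")
    return ttl, period, frequency

feed = """<rss version="2.0"><channel>
  <title>Example OER feed</title>
  <ttl>1440</ttl>
</channel></rss>"""
print(harvest_hint(feed))  # ('1440', None, None) -> refresh roughly daily
```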

Feed formats / Metadata formats
——————————-
DC seems by far the most widely used. I would worry more about whether people will stick to it and be consistent with the data they input.

Repository Required Metadata Profile
————————————
Xpert doesn’t have a minimum requirement, other than that an item needs to have a link. Items with less metadata will just be found less often, but I wouldn’t refuse to harvest just because metadata was sparse – people may still find the items.

Licensing Content
—————–
You can have an overall licence for the feed, or licence information per item. I am not sure why subsequent release is an issue, given the resources are already released?
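A sketch of per-item licence resolution with a feed-level fallback, using the creativeCommons RSS module as the example vocabulary (how Xpert actually resolves licences isn’t specified here, so treat this as illustrative):

```python
# Resolve each item's licence, falling back to the feed-wide licence when
# the item carries none. Uses the creativeCommons RSS module namespace.
import xml.etree.ElementTree as ET

CC = "{http://backend.userland.com/creativeCommonsRssModule}"

def item_licences(feed_xml):
    channel = ET.fromstring(feed_xml).find("channel")
    feed_licence = channel.findtext(CC + "license")  # overall licence per feed
    for item in channel.findall("item"):
        # A per-item licence wins; otherwise inherit the feed's licence.
        licence = item.findtext(CC + "license") or feed_licence
        yield item.findtext("link"), licence
```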

Links or resource download
————————–
We only take links, no downloading occurs.

Other issues
————

Who is the end user? – I think it’s acceptable to have an RSS feed for harvesters and an RSS feed for people. Most feeds contain all their site’s content (I think only a few limit the number of items) and so are well suited to being harvested without any worries over missing items due to harvest frequency. That, to me, is better than trying to make one feed for all.

It might be logical that, if we are making a second RSS feed purely for harvesting, we should use some other technology instead.

Feeds that aren’t valid – Of the 60 feeds Xpert takes, 5 aren’t valid XML, and approximately 20 aren’t valid RSS. The harvesting service has to be flexible enough to deal with this. It also needs to handle lots of character sets, as a lot of content isn’t in English.
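One way to get that flexibility is a tolerant parser such as Python’s feedparser library, which salvages entries from malformed feeds and does character-set detection itself; a minimal sketch:

```python
# Tolerant harvesting with feedparser: it will usually salvage entries
# from feeds that aren't valid XML or valid RSS, and reports problems
# via its "bozo" flag rather than failing outright.
import feedparser

def harvest(url):
    parsed = feedparser.parse(url)
    if parsed.bozo:
        # The feed is malformed; note why, but keep whatever was salvaged.
        print(f"{url}: malformed feed ({parsed.bozo_exception})")
    for entry in parsed.entries:
        link = entry.get("link")
        if link:  # a link is the one thing we insist on
            yield link, entry.get("title", ""), entry.get("summary", "")
```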

Format of presentation – I think a link should go to a page. Some Xpert links prompt for a download, and some go off to a SCORM package – which, when accessed outside a SCORM client or service, looks absolutely awful to an end user (and would almost certainly put them off using it). It’s not an ideal situation.

Quality of metadata – It’s often very thin indeed.

Author format – what is the preferred way of displaying the author?

Making metadata consistent – one problem we have in Xpert is the different forms of attribution: some organisations set dc:creator to the organisation itself, some use it for the individual creator. This makes presenting meaningful, consistent data to someone searching the data awkward.
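A purely illustrative heuristic for splitting organisational from personal dc:creator values so attribution can be displayed consistently (the keyword list is a guess, not anything Xpert does):

```python
# Classify a dc:creator string as an organisation or a person, and
# normalise "Surname, Forename" to "Forename Surname" for display.
ORG_HINTS = ("university", "college", "centre", "center", "institute", "ltd")

def classify_creator(creator):
    value = creator.strip()
    if any(hint in value.lower() for hint in ORG_HINTS):
        return ("organisation", value)
    if "," in value:
        surname, forename = (part.strip() for part in value.split(",", 1))
        value = f"{forename} {surname}"
    return ("person", value)

print(classify_creator("Lockley, Pat"))              # ('person', 'Pat Lockley')
print(classify_creator("University of Nottingham"))  # ('organisation', ...)
```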

DC nodes – Personally, I think they are a bit weak, and more categories are needed for more relevant information (level, accessibility, duration).

By: Julian Tenney (Fri, 18 Dec 2009)
We have also hit many of those issues, but what strikes me is that many people are looking for a simple way to expose their resources and, to their mind, RSS fits that bill – we know that because it is what many people are choosing to use.

OAI-PMH, SWORD, etc. are big technical barriers for many people who have resources to expose – anyone can make a feed. Along with some good guidelines on how best to use the feeds, this surely presents a good opportunity for Jorum, especially while significant collections such as OER Recommender continue to use feeds and people find them easy to use?

By: Charles Duncan (Fri, 18 Dec 2009)
Gareth’s paper raises a number of interesting issues. In the introductory part he reviews technologies for bulk transfer/deposit of metadata from one source to another. However most of the paper expands on his point “At first glance using syndicated feeds to achieve this would be an obvious choice”. I would like to examine whether this is still true at second glance and to suggest why the other technologies, which were developed expressly for this purpose, are more appropriate.

First let’s look at RSS feeds. In their favour we have:

– They are commonly used and familiar to everyone

– Many repositories can expose resources through RSS feeds

– They offer “continuous update”

However, the implementation suggested for JorumOpen negates many of these benefits:

– The standard form of an RSS feed is not suitable for ingest into JorumOpen because it requires suitable metadata and licence information to be added to the feed. This would require everyone who creates feeds to ensure they are enhanced for JorumOpen and will require clarity on what metadata formats are required and what minimum fields are needed for acceptance into JorumOpen. How are people to know if their feeds do not meet these minimum standards?

– Who can submit feeds? Presumably resources in feeds can be submitted on behalf of others. The metadata will need to make clear who are the owners of resources.

– The “continuous update” is not implemented, as stated by Laura Shaw: “The feed isn’t continually polled for new content (and obviously no functionality for deletes/updates within a feed)”

– “Items within a feed are not auto classified within Jorum.” Even if the classification information is provided within the metadata?

If no other alternatives are available then RSS feeds are perhaps a suitable quick solution.

However, given the availability of OAI-PMH and SWORD it could be quite straightforward to harvest metadata from one repository in the required metadata format (and to set up periodic updates) and to deposit these metadata records using SWORD, ensuring that all the metadata from the original resource is preserved – including classification metadata and ownership information.
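A hedged sketch of that pipeline, using raw OAI-PMH verbs over HTTP and a minimal SWORD-style Atom deposit; the endpoint URLs and credentials are placeholders, and a real deposit would carry the full metadata record and handle OAI-PMH resumption tokens:

```python
# Harvest oai_dc records over OAI-PMH, then deposit each into a SWORD
# collection. SOURCE, SWORD_COLLECTION and the credentials are hypothetical.
import requests
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"
SOURCE = "https://source.example.org/oai"                        # placeholder
SWORD_COLLECTION = "https://jorum.example.org/sword/collection"  # placeholder

resp = requests.get(SOURCE, params={"verb": "ListRecords",
                                    "metadataPrefix": "oai_dc"})
root = ET.fromstring(resp.content)

for record in root.iter(OAI + "record"):
    header = record.find(OAI + "header")
    if header.get("status") == "deleted":
        continue  # OAI-PMH can flag deletions, which plain RSS cannot
    identifier = header.findtext(OAI + "identifier")
    entry = (
        '<?xml version="1.0"?>'
        '<entry xmlns="http://www.w3.org/2005/Atom">'
        f"<title>{identifier}</title>"
        "</entry>"
    )
    requests.post(
        SWORD_COLLECTION,
        data=entry.encode("utf-8"),
        headers={"Content-Type": "application/atom+xml;type=entry"},
        auth=("depositor", "secret"),  # placeholder credentials
    )
```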

Licence information is always likely to be more problematic, as a CC licence will need to be selected for JorumOpen.

By: Jenny Gray (Fri, 18 Dec 2009)
I’m a bit late commenting on this so many of the things I would have said have already been said: RSS gets you into a number of different places like OCWC, OERRecommender; RSS and DC and CC extensions are easy so lowest common denominator; OpenLearn has too many resources for a manual upload.

The paper concludes that the OAI-PMH metadata harvesting tool is a better option for JorumOpen. I’d strongly agree. I realise it is technically more challenging to offer an OAI interface, but there are many open-source libraries available which you can plug in to Java or PHP apps.

There’s no well-maintained LOM extension for RSS which makes it a risky implementation if you want your feeds to pass an external validation too. OAI would also allow you to have a custom xml format for import into Jorum if you felt it necessary.

I was concerned by the author’s suggestions for item update identification. As he states, checksums etc. will not work for website notification harvesting. But adding custom text like “jorum:update” to an RSS feed will annoy all the other users of my feed and pollute my data in their search results. Shouldn’t we just recommend people use the existing date tags appropriately and compare a date stored in the repository with the date on the feed?
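A sketch of that date-comparison approach, using feedparser’s parsed date tags; stored_dates stands in for whatever the repository records at harvest time:

```python
# Detect new or updated items by comparing the date the repository stored
# for each URL with the date currently on the feed, using the feed's own
# date tags rather than any custom markers.
import feedparser

stored_dates = {}  # url -> time.struct_time recorded at the last harvest

def changed_entries(url):
    for entry in feedparser.parse(url).entries:
        link = entry.get("link")
        feed_date = entry.get("updated_parsed") or entry.get("published_parsed")
        if link is None or feed_date is None:
            continue  # no usable link or date tag; needs some other check
        previous = stored_dates.get(link)
        if previous is None or feed_date > previous:
            stored_dates[link] = feed_date
            yield entry  # new, or updated since we last looked
```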

Deletion of an asset is a tricky area. OAI would let you mark an item deleted. I’m more worried by the suggestion that a feed has a finite number of items in it, and I can see you are too, as many of the other problems listed as drawbacks stem from this assumption. There’s no such restriction in the RSS standard and no expectation of a limited list in other OER aggregators like OERRecommender, which expect you to list everything you’ve got. Have you talked to the developers of that tool to see how they handle updates, deletions, missing items etc.? I know it was knocked together pretty quickly, but I expect they’ll have thought of these things. I can put you in touch if you need an email address, though I don’t think Joel works for USU COSL any more.

One final thought – if you get everyone to offer a single feed format into JorumOpen, then one advantage of JorumOpen can be to offer all the other feed formats so that all the UKOER projects can be aggregated elsewhere.
