CETIS Gathering

At the end of June we ran an event about technical approaches to gathering open educational resources. Our intent was that we would provide space and facilities for people to come and talk about these issues, but we would not prescribe anything like a schedule of presentations or discussion topics. So, people came, but what did they talk about?

In the morning we had a large group discussing approaches to aggregating resources and information about them through feeds such as RSS or Atom, and another, smaller group discussing tracking what happens to OERs once they are released.

I wasn’t part of the larger discussion, but I gather that they were interested in the limits of what can be brought in by RSS and the difficulties due to the (shall we say) flexible semantics of the elements typically used in RSS, even when extended in the typical way with Dublin Core. They would like to bring in information which is more tightly defined, and also information from a broader range of sources relating to the actual use of the resource. They would also like to identify the contents of resources at a finer granularity (e.g. an image or movie rather than a lesson) while retaining the context of the larger resource. These are perennial issues, and bring to my mind technologies such as OAI-PMH with metadata richer than the default Dublin Core, Dublin Core Terms (in contrast to the Dublin Core Element Set), OAI-ORE, and projects such as PerX and TicToCs (see JournalToCs) (just to mention two which happened to be based in the same office as me). At CETIS we will continue to explore these issues, but I think it is recognised that the solution is not as simple as using a new metadata standard that is in some way better than what we have now.
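As an aside, and purely by way of illustration, the sort of harvesting I have in mind is easy to sketch in Python: ask an OAI-PMH target for records, requesting a richer metadata format than simple Dublin Core if it offers one. The repository base URL below is made up, the requests library does the fetching, and real-world details such as resumption tokens are ignored.

# Minimal sketch: harvest OAI-PMH records, asking for a (possibly richer)
# metadata format. The endpoint URL is hypothetical; many repositories only
# offer the default oai_dc, and resumption tokens are not handled here.
import requests
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"
BASE_URL = "http://repository.example.ac.uk/oai"   # made-up endpoint

def list_records(metadata_prefix="oai_dc"):
    """Yield (identifier, metadata element) pairs from a ListRecords response."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    response = requests.get(BASE_URL, params=params, timeout=30)
    root = ET.fromstring(response.content)
    for record in root.iter(OAI + "record"):
        header = record.find(OAI + "header")
        identifier = header.findtext(OAI + "identifier")
        metadata = record.find(OAI + "metadata")  # the richer payload lives here
        yield identifier, metadata

if __name__ == "__main__":
    for oai_id, md in list_records():
        print(oai_id, "has metadata" if md is not None else "is deleted")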

The discussion on tracking resources (summarized here by Michelle Bachler) was prompted by some work from the Open University’s OLNet on Collective Intelligence, and also some CETIS work on tracking OERs. For me the big “take-home” idea was that many individual OER providers and services must have information about the use of their resources which, while interesting in itself, would become really useful if made available more widely. So how about, for example, open usage information about open resources? That could really give us some data to analyse.

There were some interesting overlaps between the two discussions: for example, how to make sure that a resource is identified in such a way that you can track it and gather information about it from many sources, and what role usage information can play in the full description of a resource.

After lunch we had a demo of a search service built by cross-searching Web 2.0 resource hosts via their APIs, which has been used by the Engineering Subject Centre’s OER pilot project. This led on to a discussion of the strengths and limitations of this approach: essentially it is relatively simple to implement and can be used to provide a tailored search for a specialised OER collection, so long as the number of targets being searched is reasonably low and their APIs are stable and reliable. The general approach of pulling in information via APIs could be useful for bringing in some of the richer information discussed in the morning. The diversity of APIs led on to another well-rehearsed discussion mentioning SRU and OpenSearch as standard alternatives.
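To make the cross-searching approach a little more concrete, here is a minimal Python sketch of the general idea (not the Engineering Subject Centre’s actual implementation): query each host through a search API that returns RSS or Atom, OpenSearch-style, and merge the results. The endpoint URLs are placeholders, and the feedparser library does the fetching and parsing.

# Minimal sketch of cross-searching a few hosts via search APIs that
# return RSS/Atom. The endpoint URLs below are placeholders.
import urllib.parse
import feedparser

SEARCH_ENDPOINTS = {
    "host_a": "http://hosta.example.org/opensearch?q={terms}",
    "host_b": "http://hostb.example.org/feeds/search?query={terms}",
}

def cross_search(terms):
    """Query each host in turn and merge the results into one list."""
    results = []
    for host, template in SEARCH_ENDPOINTS.items():
        url = template.format(terms=urllib.parse.quote_plus(terms))
        feed = feedparser.parse(url)
        for entry in feed.entries:
            results.append({
                "host": host,
                "title": entry.get("title", ""),
                "link": entry.get("link", ""),
            })
    return results

if __name__ == "__main__":
    for hit in cross_search("fluid dynamics"):
        print(hit["host"], hit["title"], hit["link"])

The limitation discussed above is visible even in this toy version: every additional host means another URL template to maintain and another response format to check.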

We also had a demonstration of the iCOPER search / metadata enrichment tool, which uses REST, Atom and SPI to allow annotation of metadata records. This was very interesting as a follow-on from the discussions above, which were beginning to see metadata not as a static record but as an evolving body of information associated with a resource.

Throughout the day, but especially after these demos, people were talking in twos and threes, finding out about QTI, Xerte, Cohere, and anything else that one person knew about and others wanted to. I hope the people who came found it useful, but it’s very difficult as an organiser of such an event to provide a single definitive summary!

Additional technical work for UKOER

CETIS has been funded by JISC to do some additional technical work relevant to the UKOER programme. The work will cover three topics: deposit via RSS feeds, aggregation of OERs, and tracking & analysis of OER use.

Feed deposit
There is a need for services hosting OERs to provide a mechanism for depositors to upload multiple resources with minimal human intervention per resource. One possible way to meet this requirement that has already been identified by some projects is “feed deposit”. This approach is inspired by the way in which metadata and content are loaded onto user devices and applications in podcasting. In short, RSS and Atom feeds are capable, in principle, of delivering the metadata required for deposit into a repository, and in addition can either provide a pointer to the content or embed the content itself in the feed. There are a number of issues with this approach that would need to be overcome.
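As a rough sketch of what the ingest side of feed deposit might look like (this is illustrative only, not a description of Jorum or any other service), the Python code below uses the feedparser library to read a feed and, for each entry, gather the basic metadata plus either enclosure links pointing at the content or content embedded in the entry itself. The feed URL is a placeholder and no real deposit interface is called.

# Minimal sketch of the ingest side of "feed deposit": read an RSS/Atom
# feed and, for each entry, collect basic metadata plus either a pointer
# to the content (an enclosure) or content embedded in the entry itself.
# The feed URL is a placeholder; no real repository deposit API is called.
import feedparser

FEED_URL = "http://project.example.ac.uk/oer/deposit-feed.xml"  # placeholder

def items_for_deposit(feed_url):
    feed = feedparser.parse(feed_url)
    for entry in feed.entries:
        yield {
            "title": entry.get("title", ""),
            "description": entry.get("summary", ""),
            "source_link": entry.get("link", ""),
            # pointers to the content itself, if supplied as enclosures
            "enclosures": [enc.get("href") for enc in entry.get("enclosures", [])],
            # or content embedded in the entry (e.g. Atom <content>)
            "embedded_content": [c.get("value") for c in entry.get("content", [])],
        }

if __name__ == "__main__":
    for item in items_for_deposit(FEED_URL):
        print(item["title"], item["enclosures"] or "no enclosure")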

In this work we will: (1) identify projects, initiatives, services, etc. that are engaged in relevant work [if that’s you, please get in touch]; (2) identify and validate the issues that would arise with respect to feed deposit, starting with those outlined in the Jorum paper linked to above; (3) identify current approaches used to address these issues, and identify where consensus may be readily achieved.

Aggregation of OERs
There is interest in facilitating a range of options for the provision of aggregations of resources representing the whole or a subset of the UKOER programme output (possibly along with resources from other sources). There have been some developments that implement solutions based on RSS aggregation, e.g. Ensemble and Xpert; and the UKOLN tagometer measures the number of resources on various sites that are tagged as relevant to the UKOER programme.
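Just to illustrate the flavour of that sort of counting (this is not how the UKOLN tagometer actually works, and the feed URLs below are made up), a crude version can be sketched in a few lines of Python: poll each site’s feed of items tagged “ukoer” and count the entries returned.

# Very rough sketch of tagometer-style counting: poll each site's feed of
# items tagged "ukoer" and count the entries returned. The feed URLs are
# placeholders, and real feeds are usually paged, so this counts only the
# most recent items.
import feedparser

TAG_FEEDS = {
    "host_a": "http://hosta.example.org/rss/tag/ukoer",
    "host_b": "http://hostb.example.org/tag/ukoer/feed",
}

def count_tagged_items(feeds):
    return {name: len(feedparser.parse(url).entries) for name, url in feeds.items()}

if __name__ == "__main__":
    for site, count in count_tagged_items(TAG_FEEDS).items():
        print(site, count)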

In this work we will illustrate and report on other approaches, namely (a) Google custom search, (b) query and result aggregation through Yahoo Pipes and (c) querying through the host service APIs. We will document the benefits and affordances as well as the drawbacks and limitations of each of these approaches: the ease with which they may be adopted, the technical expertise necessary for their development, their dependency on external services (which may still be in beta), their scalability, etc.

Tracking and analysis of OER use
Monitoring, through technical means, the release of resources through various channels, how those resources are used and reused, and the comments and ratings associated with them, is highly relevant to evaluating the uptake of OERs. CETIS have already described some of the options for resource tracking that are relevant to the UKOER programme.

In this work we will write and commission case studies to illustrate the use of these methods, and synthesise what is learnt from this use.

Who’s involved in this work
The work will be managed by me, Phil Barker, and Lorna M Campbell.

Lisa J Rogers will be doing most of the work related to feed deposit and aggregation of OERs.

R John Robertson will be doing most of the work relating to Tracking and analysis of OER use.

Please do contact us if you’re interested in this work.

Repository standards

Tore Hoel tweeted:

The most successful repository initiatives do not engage with LT standards EDRENE report concludes #icoper

pointing me to what looks like a very interesting report which also concludes

Important needs expressed by content users include:

  • Minimize number of repositories necessary to access

Of these, the first bullet point clearly relates to interoperability of repositories, and indicates the importance of focusing on repository federations, including metadata harvesting and providing central indexes for searching for educational content.

Coincidentally I had just finished an email replying to someone who asked about repository aggregation in the context of Open Educational Resources because she is “Trying to get colleagues here to engage with the culture of sharing learning content. Some of them are aware that there are open educational learning resources out there but they don’t want to visit and search each repository.” My reply covered Google advanced search (with the option to limit by licence type), Google custom search engines for OERs, OER Commons, OpenCourseWare Consortium search, the Creative Commons Search, the Steeple podcast aggregator and the similar-in-concept Ensemble Feed finder.

I concluded: you’ll probably notice that everything I’ve written above relies on resources being on the open web (as full text and summarized in RSS feeds) but not necessarily in repositories. If there are any OER discovery services built on repository standards like OAI-PMH or SRU or the like then they are pretty modest in their success. Of course using a repository is a fine way of putting resources onto the web, but you might want to think about things like search engine optimization, making sure Google has access to the full text resource, making sure you have a site map, encouraging (lots of) links from other domains to resources (rather than metadata records), making sure you have a rich choice of RSS feeds and so on.
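On the sitemap point, generating one is not a big job; the short Python sketch below writes a basic sitemap.xml for a list of resource pages so that search engine crawlers are pointed at the resources themselves rather than at metadata records. The URLs are made up for illustration.

# Minimal sketch: write a basic sitemap.xml listing resource pages so that
# search engine crawlers can find the resources themselves (not just
# metadata records). The URLs are made up for illustration.
import xml.etree.ElementTree as ET

RESOURCE_URLS = [
    "http://oer.example.ac.uk/resources/intro-to-thermodynamics",
    "http://oer.example.ac.uk/resources/introductory-statistics",
]

def write_sitemap(urls, path="sitemap.xml"):
    urlset = ET.Element("urlset",
                        xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
    for url in urls:
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = url
    ET.ElementTree(urlset).write(path, encoding="utf-8", xml_declaration=True)

if __name__ == "__main__":
    write_sitemap(RESOURCE_URLS)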

I have some scribbled notes on four or five things that people think are good about repositories but which may also be harmful; a focus on interoperability between repositories and repository-related services (when it is at the expense of being part of the open web) is one of them.

Tracking the Use of Open Educational Resources

As part of our support for the HEFCE, HE Academy and JISC UKOER programme, CETIS are running a “2nd Tuesday” online seminar to discuss tracking the use of OERs on Thursday 20 Nov (yes, I know, perhaps they should be called alternating 2nd Tuesday and 3rd Thursday seminars). Details about timing and how to join will be sent to UKOER projects through the usual strand mailing lists; others who are interested should contact David Kernohan (d.kernohan /at/ JISC.AC.UK) about possible extra spaces.

Here’s the full description:

“As far as is possible projects will need to track the volume and use of the resources they make available”

At least, that is what the call for projects for this programme said; the aim of this session is to help projects with this requirement. The rationale for tracking use from the funder’s perspective is clear: they want to know whether the resources being released with their money are useful to anyone apart from those who created them. Of course, as anyone who has tried to work with access statistics for a web site knows, we have to be cautious in interpreting such data. For example, how do we compare a simple “viewing” of a resource with someone taking the resource and embedding it in their own course site? Is it even possible to measure how often the latter happens? Another, perhaps more interesting, aspect of tracking use is what it tells us about what a resource is useful for. Being able to show how other people have used a resource might help someone considering using it themselves, but is there any way to capture this information?

As well as simple access logs and tools like Google Analytics, tools similar to trackback on blog postings and the usage information provided by sites such as Flickr and SlideShare (i.e. counting the number of views on-site and the number of embeds in other sites) are worth considering. Perhaps more contentious, but also worth considering, are techniques such as redirect URLs and web bugs.

We shall seek to clarify what tracking is required and pragmatically desirable, and how it may be achieved. This session will be led by CETIS, but we don’t pretend to know the answers to this problem; in fact we’re trying to learn from projects what is useful and achievable, so we will be relying on participants in the meeting to bring their own experiences and potential solutions. For this to work we would like to know in advance who has anything to say, so any project or individual with experience to share should contact Phil Barker (philb@icbl.hw.ac.uk) as soon as possible.
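To give a flavour of the redirect-URL technique mentioned above, here is a minimal sketch using Flask, a small Python web framework: resources are advertised via a short /go/ URL, each request to that URL is logged, and the visitor is then redirected to the real location. The resource table and log format are invented for illustration; a web bug works on the same principle, but with a tiny embedded image rather than a link.

# Minimal sketch of the redirect-URL technique: advertise resources via a
# short /go/<id> URL, log each time it is followed, then redirect to the
# real location. The resource table and log format are made up.
import logging
from flask import Flask, abort, redirect, request

app = Flask(__name__)
logging.basicConfig(filename="oer_hits.log", level=logging.INFO)

RESOURCES = {  # hypothetical mapping of short ids to real locations
    "thermo101": "http://hosta.example.org/slides/thermodynamics-101",
    "stats-intro": "http://hostb.example.org/video/introductory-statistics",
}

@app.route("/go/<resource_id>")
def track_and_redirect(resource_id):
    target = RESOURCES.get(resource_id)
    if target is None:
        abort(404)
    # record which resource was requested, by whom, and from where
    logging.info("%s %s %s", resource_id,
                 request.remote_addr, request.referrer)
    return redirect(target)

if __name__ == "__main__":
    app.run()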

About metadata & resource description (pt 2)

Trying to show how resource description on sites such as Flickr relates to metadata…

Some people have looked at the metadata requirements for the UK OER programme and taken them as a prescription for which LOM or Dublin Core elements they should use. After all, that’s what metadata is, isn’t it? But UK OER projects are also encouraged to use Web 2.0 or social sharing platforms (Flickr, YouTube, SlideShare etc.) to make their resources available, and these sites don’t know anything about metadata, do they?

Well, in my previous post I tried to distinguish between resource description and metadata, where resource description is pretty much any information about anything, and metadata is the structured information about a resource (acknowledging that the distinction is not always made by everyone). I think that some of the “metadata” requirements given for OER in various discussions are actually better seen at first as resource description requirements.

The second problem with seeing the UK OER metadata requirements as a prescription for which elements to use is that, to me at least, it misses the point of what metadata does best. I think that the best view of metadata is that it shows the relationships between resources. “Resources” here means anything — information resources like the OERs, people, places, things, organizations, abstract concepts — so long as the thing can be identified. What metadata does is express or assert a relationship such as “this OER was created by this person”.
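To illustrate that view, the relationship “this OER was created by this person” can be written down as a single statement linking two identified things. The sketch below does this with the Python rdflib library and the Dublin Core Terms vocabulary; the identifiers for the resource and the person are made up.

# Minimal sketch: metadata as a relationship between identified resources.
# "This OER was created by this person", expressed as a single statement
# using rdflib and Dublin Core Terms. The identifiers are made up.
from rdflib import Graph, Namespace, URIRef

DCTERMS = Namespace("http://purl.org/dc/terms/")

oer = URIRef("http://oer.example.ac.uk/resources/intro-to-thermodynamics")
person = URIRef("http://example.ac.uk/people/phil-barker")

g = Graph()
g.add((oer, DCTERMS.creator, person))  # the assertion: oer -> creator -> person

print(g.serialize(format="turtle"))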

So looking at an image’s “canonical” page on Flickr, we see a resource description which has a link to the photo stream of the person who uploaded it (me) and from there there is a link to my profile page on Flickr. That’s done with metadata, but how do we get at it?

Well, in the HTML for the image page the link is rendered as

<a href="/photos/philbarker/"
   title="Link to phil barker's photostream"
   rel="dc:creator cc:attributionURL">
       <b>phil barker</b>
</a>

The rel="dc:creator cc:attributionURL" tells a computer what the relationship between this page and the URL is, i.e. that the URL identifies the creator of the page and should be used for attribution. That’s not great, because I’m not my photostream; in fact my photostream doesn’t even describe me.

Things are better on the photostream page, though: it has in its HTML

<link rel="alternate"
  type="application/rss+xml"
  title="Flickr: phil barker's Photostream RSS feed"
  href="http://api.flickr.com/services/feeds/photos_public.gne?id=56583935@N00&lang=en-us&format=rss_200">

which points any application that knows how to read HTML and RSS to the RSS feed for my photostream, where we see in the entry for that picture the following:

<author flickr:profile="http://www.flickr.com/people/philbarker/">nobody@flickr.com (phil barker)</author>

As well as the description of me (my name and not-my-email-address), there is the link to my profile page. Looking at the HTML for that profile page, not only does it generate a human-readable rendering in a browser, but it includes the following:


<div class="vcard">
    <span class="nickname">phil barker</span>
...
    <span class="RealName">/
        <span class="fn n">
           <span class="given-name">Phil</span>
           <span class="family-name">Barker</span>
        </span>
    </span>
...
</div>

That is a computer-readable hCard microformat version of my contact information (coincidentally, it’s the same underlying schema for person data that is used in the LOM).
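Just to show that this really is computer readable, here is a small Python sketch using the BeautifulSoup library to pull the name fields out of an hCard fragment like the one above (the HTML is pasted straight in rather than fetched from Flickr).

# Small sketch: pull the name fields out of an hCard fragment like the one
# above, using BeautifulSoup. The HTML is pasted in rather than fetched.
from bs4 import BeautifulSoup

HCARD_HTML = """
<div class="vcard">
    <span class="nickname">phil barker</span>
    <span class="fn n">
       <span class="given-name">Phil</span>
       <span class="family-name">Barker</span>
    </span>
</div>
"""

soup = BeautifulSoup(HCARD_HTML, "html.parser")
card = soup.find(class_="vcard")
print("nickname:   ", card.find(class_="nickname").get_text())
print("given name: ", card.find(class_="given-name").get_text())
print("family name:", card.find(class_="family-name").get_text())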

So there’s your Author metadata on Flickr. And I’ll note that all this happened without me ever thinking that I was “cataloguing”!

To generalise and idealise slightly, the resource pages (the canonical page for the image, the photostream page, my profile page) have embedded in them one or more of the following:

  • links which describe the relationship of the resources described on those pages to each other in a computer-readable manner
  • links to alternative descriptions in machine readable metadata, e.g. an RSS or ATOM XML file for the resource described on the page
  • embedded computer readable metadata, e.g. vCard person-data embedded in the hCard microformat.

See also Adam’s post Objects in this Mirror are Closer than they Appear: Linked Data and the Web of Concepts.

Web2 vs iTunesU

There was an interesting discussion last week on the JISC-Repositories email list that kicked off after Les Carr asked

Does anyone have any experience with iTunes U? Our University is thinking of starting a presence on Apple’s iTunes U (the section of the iTunes store that distributes podcasts and video podcasts from higher education institutions). It looks very professional (see for example the OU’s presence at http://projects.kmi.open.ac.uk/itunesu/ ) and there are over 300 institutions who are represented there.

HOWEVER, I can’t shake the feeling that this is a very bad idea, even for lovers of Apple products. My main misgiving is that the content isn’t accessible apart from through the iTunes browser, and hence it is not Googleable and hence it is pretty-much invisible. Why would anyone want to do that? Isn’t it a much better idea to put material on YouTube and use the whole web/web2 infrastructure?

I’d like to summarize the discussion here so that the important points raised get a wider airing; however, it is a feature of high quality discussions like this one that people learn and change their minds as a result, so please don’t assume that people quoted below still hold the opinions attributed to them. (For example, invisibility on Google turned out to be far from the case for some resources.) If you would like to see the whole discussion, look in the JISCMAIL archive.

The first answer from a few posters was that it is not an either/or decision.

Patricia Killiard:

Cambridge has an iTunesU site. […] the material is normally deposited first with the university Streaming Media Service. It can then be made accessible through a variety of platforms, including YouTube, the university web pages and departmental/faculty sites, and the Streaming Media Service’s own site, as well as iTunesU.

Mike Fraser:

Oxford does both using the same datafeed: an iTunesU presence (which is very popular in terms of downloads and as a success story within the institution); and a local, openly available site serving up the same content.

Jenny Delasalle and David Davies of Warwick and Brian Kelly of UKOLN also highlighted how iTunesU complemented rather than competed with other hosting options, and was discoverable on Google.

Andy Powell, however, pointed out that it was so “Googleable” that a video from Warwick University on iTunesU came higher in the search results for University of Warwick No Paradise without Banks than the same video on Warwick’s own site. (The first result I get is from Warwick, about the event, but it doesn’t seem to give access to the video, at least not so easily that I can find it; the second result I get is the copy from iTunes U, on deimos.apple.com. Incidentally, I get nothing for the same search term on Google Videos.) He pointed out that this is “(implicitly) encouraging use of the iTunes U version (and therefore use of iTunes) rather than the lighter-weight ‘web’ version.”

Andy also raised other “softer issues” about which versions students will be referred to, which might reinforce one version rather than another as the copy of choice even if it isn’t the best one for them.

Ideally it would be possible to refer people to a canonical version or a list of available versions (Graham Triggs mentioned Google’s canonical URLs, perhaps if Google relax the rules on how they’re applied), but I’m not convinced that’s likely to happen. So there’s a compromise: a variety of platforms for a variety of needs vs possibly diluting the web presence for any given resource.

And a response from David Davies:

iTunesU is simply an RSS aggregator with a fancy presentation layer.
[…]
iTunesU content is discoverable by Google – should you want to, but as we’ve seen there are easier ways of discovering the same content, it doesn’t generate new URLs for the underlying content, is based upon a principle of reusable content, Apple doesn’t claim exclusivity for published content so is not being evil, and it fits within the accepted definition of web architecture. Perhaps we should simply accept that some people just don’t like it. Maybe because they don’t understand what it is or why an institution would want to use it, or they just have a gut feeling there’s something funny about it. And that’s just fine.

Mmm, I don’t know about all these web architecture principles; I just know that I can’t access the only copy I find on Google. But then I admit I do have something of a gut feeling against iTunesU; maybe that’s fine, maybe it’s not; and maybe it’s just something about the example Andy chose: searching Google for University of Warwick slow poetry video gives access to copies at YouTube and Warwick, but no copy on iTunes.

I’m left with the feeling that I need to understand more about how using these services affects the discoverability of resources using Google–which is one of the things I would like to address during the session I’m organising for the CETIS conference in November.

Semantic Web in HE meeting in Nice

Just announced: “SemHE ’09: semantic technologies for teaching and learning support in higher education”, a meeting co-located with the 4th European Conference on Technology Enhanced Learning in Nice on 29 or 30 September (tbc). This isn’t a CETIS meeting, but it is part-organized by the JISC SemTech project at Southampton University, a project which is supported by a CETIS working group and which had its origins in the Semantic Structures for Teaching and Learning session at the 2007 CETIS conference.

Full details and the call for papers for the Nice meeting are at http://www.semhe.org/.