Self description and licences

One of the things that I noticed when I was looking for sources of UKOERs was that when I got to a resource there was often no indication on it that it was open: no UKOER tag, no CC licence information or logo. There may have been some indication of this somewhere along the way, e.g. on a repository’s information page about that resource, but that’s no good if someone arrives from a Google search or a direct link to the resource, or once someone has downloaded the file and put it on their own VLE.

Naomi Korn has written a very useful briefing paper on embedding metadata about Creative Commons licences into digital resources as part of the OER IPR Support project starter pack. All the advice in it is worth following, but please also make sure that licence and attribution information is visible on the resource itself. John has written about this in general terms in his excellent post on OERs, metadata, and self-description, where he points out that this type of self-description “is just good practice” which is complemented, not supplanted, by technical metadata.

So OERs, when viewed on their own, as if someone had found them through Google or a direct link, should display enough information about authorship, provenance, etc. for the viewer to know that they are open without needing an application to extract the metadata. The cut-and-paste legal text and technical code generated by the licence selection form on the Creative Commons website is good for this. (Incidentally, for HTML resources this code also includes technical markup so that the displayed text works as encoded metadata, which has been exploited recently by the OpenAttribute browser add-on. I know the OpenAttribute team are working on tools for embedding licence selection and code generation into web content management systems and blogs.)
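To make the idea concrete, here is a minimal sketch of the kind of licence statement the Creative Commons chooser produces: human-readable text that doubles as machine-readable RDFa via attributes such as `rel="license"`. The exact markup the chooser emits varies over time, and the helper function, resource title and URLs here are hypothetical illustrations, not the chooser's actual output.

```python
# Build a licence/attribution snippet that displays as ordinary text but
# carries RDFa attributes (rel="license", property="dct:title") that
# tools like the OpenAttribute browser add-on can read as metadata.

def cc_attribution_html(title, author, author_url, licence_url, licence_name):
    """Return an HTML snippet that is both visible licence text and RDFa."""
    return (
        '<div xmlns:cc="http://creativecommons.org/ns#" '
        'xmlns:dct="http://purl.org/dc/terms/">'
        f'<span property="dct:title">{title}</span> by '
        f'<a rel="cc:attributionURL" href="{author_url}">{author}</a> '
        'is licensed under a '
        f'<a rel="license" href="{licence_url}">{licence_name}</a>.'
        '</div>'
    )

snippet = cc_attribution_html(
    "Example OER", "A. Author", "http://example.org/author",
    "http://creativecommons.org/licenses/by/3.0/", "CC BY 3.0 licence")
print(snippet)
```

The point is that the same text serves both audiences: a person who downloads the file sees the licence, and a metadata-aware tool can extract it.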

Images, videos and sounds present their own specific problems for including human-readable licence text. Following practice from the publishing industry would suggest that a small amount of text discreetly tucked away on the bottom or side of an image can be enough to help. That example was generated by the Xpert attribution tool from an image of a bridge found on Flickr. The Xpert tool also does useful work for sounds and videos; but for sounds it is also possible to follow the example of BBC podcasts and provide spoken information at the beginning or end of the audio, and for videos of course one can have scrolling credits at the end.

UKOER Sources

I have been compiling a directory of how people can get at the resources released by the UKOER pilot phase projects: that is, the websites for human users and the “interoperability end points” for machines, i.e. the RSS and Atom feed URLs, SRU targets, OAI-PMH base URLs and API documentation. This wasn’t nearly as easy as it should have been: I would have hoped that just listing the main URL for each project would have been enough for anyone to get at the resources they wanted, or the interoperability end point, in a click or two, but that often wasn’t the case.

So here are some questions I would like OER providers to answer by way of self assessment, which will hopefully simplify this in the future.

Does your project website have a very prominent link to where the OERs you have released may be found?

The technical requirements for phase 1 for delivery platforms said:

Projects are free to use any system or application as long as it is capable of delivering content freely on the open web. … In addition projects should use platforms that are capable of generating RSS/Atom feeds, particularly for collections of resources

So: what RSS feeds do you provide for collections of resources and where do you describe these? Have you thought about how many items you have in each feed and how well described they are?

Are your RSS feed URLs and other interoperability endpoints easy to find?

Do your interoperability end points work? I mean, have you tested them? Have you spoken to people who might use them?
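Testing an end point need not be elaborate. Below is a minimal sketch of the sort of sanity check a provider could run against their own RSS feed: parse it and confirm it contains the elements a consumer will expect. The feed here is an inline invented sample; in practice you would fetch the real feed URL (e.g. with `urllib.request`) and pass its contents in.

```python
# Check that an RSS 2.0 feed parses and that each item carries the
# basics (title, link, description) a downstream aggregator needs.
import xml.etree.ElementTree as ET

SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example OER collection</title>
    <link>http://example.org/oer</link>
    <description>Openly licensed materials</description>
    <item>
      <title>Intro to mitosis</title>
      <link>http://example.org/oer/biology/mitosis</link>
      <description>A CC BY licensed lecture</description>
    </item>
  </channel>
</rss>"""

def check_rss(feed_xml):
    """Return a list of problems found; an empty list means the feed looks usable."""
    problems = []
    root = ET.fromstring(feed_xml)
    channel = root.find("channel")
    if channel is None:
        return ["no <channel> element"]
    items = channel.findall("item")
    if not items:
        problems.append("feed contains no items")
    for i, item in enumerate(items):
        for required in ("title", "link", "description"):
            if item.findtext(required) in (None, ""):
                problems.append(f"item {i} missing <{required}>")
    return problems

print(check_rss(SAMPLE_FEED))  # an empty list means no problems found
```

Even a check this crude would catch empty feeds and undescribed items before a would-be consumer does.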

While you’re thinking about interoperability end points: have you ever thought of your URI scheme as one? If, for example, you have a coherent scheme that puts all your OERs under a base URI and, better, provides URIs with some easily identifiable pattern for those OERs that form a coherent collection, then building simple applications such as Google Custom Search Engines becomes a whole lot easier. A good example is how MIT OCW is arranged: almost all of the URIs follow the pattern [department]/[courseName]/[resourceType]/[filename].[ext] (the exceptions are things like video recordings, where the actual media file is held elsewhere).
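A sketch of why such a scheme matters: if resource paths really do follow a pattern like the (here idealised) [department]/[courseName]/[resourceType]/[filename].[ext] layout, then a single regular expression turns the URI scheme into a queryable interface. The pattern and example paths below are invented for illustration, not MIT OCW's actual rules.

```python
# Parse an OER path of the form
# /[department]/[courseName]/[resourceType]/[filename].[ext]
# so a consumer can filter or group resources by any component.
import re

PATTERN = re.compile(
    r"^/(?P<department>[\w-]+)/(?P<course>[\w-]+)"
    r"/(?P<rtype>[\w-]+)/(?P<filename>[\w-]+)\.(?P<ext>\w+)$")

def parse_oer_uri(path):
    """Return the components of an OER path, or None if it isn't part of the collection."""
    m = PATTERN.match(path)
    return m.groupdict() if m else None

parsed = parse_oer_uri("/physics/quantum-mechanics/lecture-notes/week1.pdf")
print(parsed["course"])               # quantum-mechanics
print(parse_oer_uri("/random/page"))  # None: not part of the collection
```

The same predictability is what lets you point a Google Custom Search Engine at, say, everything under one department with a single URL pattern.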

JISC CETIS OER Technical Mini Projects Call

JISC has provided CETIS with funding to commission a series of OER Technical Mini Projects to explore specific technical issues that have been identified by the community during CETIS events such as #cetisrow and #cetiswmd and which have arisen from the JISC / HEA OER Programmes.

Mini project grants will be awarded as a fixed fee of £10,000 payable on receipt of agreed deliverables. Funding is not restricted to UK Higher and Further Education Institutions. This call is open to all OER Technical Interest Group members, including those outwith the UK. Membership of the OER TIG is defined as those who engage with the JISC CETIS technical discussions.

The CETIS OER Mini Projects are building on rapid innovation funding models already employed by the JISC. In addition to exploring specific technical issues these Mini Projects will aim to make effective use of technical expertise, build capacity, create focussed pre-defined outputs, and accelerate sharing of knowledge and practice. Open innovation is encouraged: projects are expected to build on existing knowledge and share their work openly.

It is expected that three projects will be funded in the first instance. If this model proves successful, additional funding may be made available for further projects.

Technical Mini Project Topics
Project 1: Analysis of Learning Resource Metadata Records

The aim of this mini project is to identify those descriptive characteristics that are frequently recorded and associated with learning resources, and that collection managers deem to be important.

The project will undertake a semantic analysis of a large corpus of educational metadata records to identify which properties and characteristics of the resources are being described. Analysis of textual descriptions within these records will be of particular interest, e.g. free text used to describe licence conditions, educational levels and approaches.
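By way of illustration, a first step in such an analysis might simply count which properties are actually populated across the corpus. The sketch below does this for simple Dublin Core records; a real project would also handle LOM and the free-text analysis described above. The two records are invented examples.

```python
# Count how often each Dublin Core property is used across a set of
# metadata records, as a crude first pass at "what is being described".
import xml.etree.ElementTree as ET
from collections import Counter

DC = "http://purl.org/dc/elements/1.1/"

RECORDS = [
    f'<record xmlns:dc="{DC}"><dc:title>Cell division</dc:title>'
    f'<dc:subject>Biology</dc:subject><dc:rights>CC BY</dc:rights></record>',
    f'<record xmlns:dc="{DC}"><dc:title>The English Civil War</dc:title>'
    f'<dc:description>First-year lecture notes</dc:description></record>',
]

def property_usage(records):
    """Return a Counter of DC property names over all records."""
    counts = Counter()
    for xml in records:
        for elem in ET.fromstring(xml):
            if elem.tag.startswith("{" + DC + "}"):
                counts[elem.tag.split("}")[1]] += 1
    return counts

print(property_usage(RECORDS))
```

Scaled up to ten or more collections, counts like these would show which characteristics collection managers actually bother to record.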

The data set selected for analysis must include multiple metadata formats (e.g. LOM and DC) and be drawn from at least ten collections. The data set should include metadata from a number of open educational resource collections, but it is not necessary for all records to be from OER collections.

For further background information on this topic and for a list of potential metadata sources please see Lorna’s blog post on #cetiswmd activities.

Funding: £10,000 payable on receipt of agreed deliverables.

Project 2: Search Log Analysis

Many sites hosting collections of educational materials keep logs of the search terms used by visitors to the site when searching for resources. The aim of this mini project is to develop a simple tool that facilitates the analysis of these logs to classify the search terms used with reference to the characteristics of a resource that may be described in the metadata. Such information should assist a collection manager in building their collection (e.g. by showing what resources were in demand) and in describing their resources in such a way that helps users find them.

The analysis tool should be shown to work with search logs from a number of sites (we have identified some who are willing to share their data) and should produce reports in a format that is readily understood, for example a breakdown of how many searches were for “subjects” and which were the most popular subjects searched for. It is expected that a degree of manual classification will be required, but we would expect the system to be capable of learning how to handle certain terms and for this learning to be shared between users: a user should not have to tell the system that “Biology” is a subject once they or any other user has done so. The analysis tool should be free to use or install without restriction and should be developed as Open Source Software.
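The shared-learning behaviour asked for above can be sketched very simply: once any user classifies a term, the classification is remembered and reused. A real tool would persist this dictionary and share it between users and sites; the class and the `ask_user` callback here are hypothetical stand-ins for that machinery.

```python
# Classify search terms against resource characteristics, asking for a
# manual classification only the first time a term is seen.

class SearchTermClassifier:
    def __init__(self):
        self.known = {}  # term (lower-cased) -> characteristic

    def classify(self, term, ask_user):
        """Return the characteristic for a term, invoking ask_user only when unknown."""
        key = term.lower()
        if key not in self.known:
            self.known[key] = ask_user(term)  # the manual classification step
        return self.known[key]

clf = SearchTermClassifier()
# First sighting needs manual input; later sightings reuse the stored answer.
print(clf.classify("Biology", ask_user=lambda t: "subject"))  # subject
print(clf.classify("biology", ask_user=lambda t: "other"))    # subject
```

The second call ignores its callback entirely, which is exactly the "tell the system once" behaviour the call describes.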

Further information on the sort of data that is available and what it might mean is outlined in my blog post Metadata Requirements from the Analysis of Search Logs.

Funding: £10,000 payable on receipt of agreed deliverables.

Project 3: Open Call

Proposals are invited for one short technical project or demonstrator in any area relevant to the management, distribution, discovery, use, reuse and tracking of open educational resources. Topics that applicants may wish to explore include, but are not restricted to: resource aggregations, presentation / visualisation of aggregations, embedded licences, “activity data”, sustainable approaches to RSS endpoint registries, common formats for sharing search logs, analysis of use of advanced search facilities, use of OAI ORE.

Funding: £10,000 payable on receipt of agreed deliverables.


Proposals must be no more than 1500 words long and must include the following information:

  1. The name of the mini project.
  2. The name and affiliation and full contact details of the person or team undertaking the work plus a statement of their experience in the relevant area.
  3. A brief analysis of the issues the project will be addressing.
  4. The aims and objectives of the project.
  5. An outline of the project methodology and the technical approaches the project will explore.
  6. Identification of proposed outputs and deliverables.

Proposals are not required to include a budget breakdown, as projects will be awarded a fixed fee on completion.

All projects must be completed within six months of date of approval.

Submission Dates

In order to encourage open working practices, project proposals must be submitted to the oer-discuss mailing list by 17.00 on Friday 8th April. List members will then have until the 17th of April to discuss the proposals and to provide constructive comments. Proposals will be selected by a panel of JISC and CETIS representatives, who will take into consideration comments put forward by OER TIG members. Successful bidders will be notified by the 21st of April, and projects are expected to start in May and end by 31st October 2011.

Successful bidders will be required to disseminate all project outputs under a relevant open licence, such as CC-BY. Projects must post regular short progress updates and all deliverables including a final report to the oer-discuss list and to JISC CETIS.

We encourage all list members to engage with the Mini Projects and to input comments, suggestions and feedback through the list.

If you have any queries about this call please contact Phil Barker at

Metadata requirements from analysis of search logs

Many sites hosting collections of educational materials keep logs of the search terms used by visitors to the site who search for resources. Since it came up during the CETIS What Metadata (CETISWMD) event I have been thinking about what we could learn about metadata requirements from the analysis of these search logs. I’ve been helped by having some real search logs from Xpert to poke at with some Perl scripts (thanks Pat).

Essentially the idea is to classify the search terms used with reference to the characteristics of a resource that may be described in metadata. For example, terms such as “biology”, “English civil war” and “quantum mechanics” can readily be identified as relating to the subject of a resource; “beginners”, “101” and “college-level” relate to educational level; “power point”, “online tutorial” and “lecture” relate in some way to the type of the resource. We believe that knowing such information would assist a collection manager in building their collection (by showing what resources were in demand) and in describing their resources in such a way that helps users find them. It would also be useful to those who build standards for the description of learning resources to know which characteristics of a resource are worth describing in order to facilitate resource discovery. (I had an early run at doing this when OCWSearch published a list of top searches.)

Looking at the Xpert data has helped me identify some complications that will need to be dealt with. Some of the examples above show how a search phrase with more than one word can relate to a single concept, but in other cases, e.g. “biology 101” and “quantum mechanics for beginners”, the search term relates to more than one characteristic of the resource. Some search terms may be ambiguous: “French” may relate to the subject of the resource or the language (or both); “Charles Darwin” may relate to the subject or the author of a resource. Some terms are initially opaque but on investigation turn out to be quite rich; for example, 15.822 is the course code for an MIT OCW course, and so implies a publisher/source, a subject and an educational level. Also, in real data I see the same search term being used repeatedly in a short period of time: I guess this is an artifact of someone paging through results being logged as a series of searches. Should these be counted as a single search or multiple searches?
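One plausible way to handle the paging artifact is to collapse consecutive identical searches that fall within a short time window into a single logical search. The sketch below does that; the 60-second window is an arbitrary choice for illustration, not something derived from the Xpert data, and the log format is invented.

```python
# Collapse runs of identical search terms within a time window, on the
# assumption that they represent one user paging through results.

def collapse_paging(log, window_seconds=60):
    """log is a list of (timestamp_seconds, term); return the deduplicated list."""
    collapsed = []
    for ts, term in log:
        if collapsed:
            prev_ts, prev_term = collapsed[-1]
            if term == prev_term and ts - prev_ts <= window_seconds:
                continue  # same term, soon after: treat as paging, not a new search
        collapsed.append((ts, term))
    return collapsed

log = [(0, "mitosis"), (10, "mitosis"), (20, "mitosis"), (300, "mitosis")]
print(collapse_paging(log))  # [(0, 'mitosis'), (300, 'mitosis')]
```

Note the design choice embedded here: the same term searched again much later is counted as a fresh search, which seems right if we care about demand.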

I think these are all tractable problems, though different people may want to deal with them in different ways. So I can imagine an application that would help someone do this analysis. In my mind it would import a search log and allow the user to go through it search by search, classifying the results with respect to the characteristic of the resource to which the search term relates. Tedious work, perhaps, but it wouldn’t take too long to classify enough search terms to get an adequate statistical snapshot (you might want to randomise the order in which the terms are classified to help ensure the snapshot isn’t looking at a particularly unrepresentative period of the logs). The interface should help speed things up by allowing the user to classify most searches with a single key press. There could be some computational support: the system would learn how to handle certain terms, and this learning would be shared between users; a user should not have to tell the system that “Biology” is a subject once they or any other user has done so. It may also be useful to distinguish between broad top-level subjects (like biology) and more specific terms like “mitosis”, or alternatively to know that specific terms like “mitosis” relate to the broader term “biology”: in other words, the option to link to a thesaurus might be useful.
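The broader-term idea can be sketched with nothing more than a lookup table. A real tool would link to an established thesaurus or subject vocabulary rather than the tiny hand-made mapping used here, which is purely illustrative.

```python
# Resolve specific search terms to a broad top-level subject via a
# (hypothetical, hand-made) thesaurus mapping.

BROADER = {
    "mitosis": "biology",
    "meiosis": "biology",
    "quantum mechanics": "physics",
}

def top_level_subject(term):
    """Return the broad subject for a term, or the term itself if already broad/unknown."""
    return BROADER.get(term.lower(), term.lower())

print(top_level_subject("Mitosis"))  # biology
print(top_level_subject("biology"))  # biology
```

With this in place, the report could aggregate "mitosis" and "meiosis" searches under biology while still keeping the specific terms available.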

This still seems achievable and useful to me.

Important advice on licensing

It frequently comes to the attention of the CETIS-pedantry department that certain among the people with whom we interact, while they have much to say and write that is worth heeding, do not know when to use “licence” and when to use “license”. Those of you who prefer to use US English can stop reading now, unless you’re intrigued by the convolutions of the UK variant of the language: this won’t ever be an issue for you.

It’s quite simple: licence is a noun, it’s the thing; license is a verb, it’s what you do. But how to remember that? Well, hopefully you’ll see that advice is a noun but advise is a verb; similarly device (noun), devise (verb); practice (noun), practise (verb). Words ending –ise are normally verbs[*]. So license/licence sticks to the pattern of c for noun, s for verb.

Hope this helps.

[* OK, you may prefer –ize, which isn’t just for US usage in some cases–but that’s a different story]

Sharing service information?

Over the past few weeks the question of how to find service end points keeps coming up in conversation (I know, says a lot about the sort of conversations I have); for example, we have been asked whether we can provide information about where the RSS feed locations are for the services/collections created by all the UKOER projects. I would generalise this to service end points, by which I mean things like the base URLs for OAI-PMH, RSS/Atom feed locations or SRU target locations; more generally, the locations of the web API or protocol implementations that provide machine-to-machine interoperability. It seems that these are often harder to find than they should be, and I would like to recommend one approach, and suggest another, to help make them easier to find.

The approach I would like to recommend to those who provide service end points, i.e. those of you who have a web-based service (e.g. a repository or OER collection) that supports machine-to-machine interoperability (e.g. for metadata dissemination, remote search, or remote upload), is that taken by web 2.0 hosts. Most of these have reasonably easy-to-find sections of their website devoted to documenting their API, providing “how-to” information for what can be done with it, with examples you can follow, and the best of them with simple step-by-step instructions. Here’s a quick list by way of providing examples.

I’ll mention Xpert Labs as well because, while the “labs” or “backstage” approach in general isn’t quite what I mean by simple “how-to” information, it looks like Xpert are heading that way and “labs” sums up the experimental nature of what they provide.

That helps people wanting to interoperate with those services and sites they know about, but it raises a more fundamental question, which is how to find those services in the first place; for example, how do you find all those collections of OERs? Well, some interested third party could build a registry for you, but that’s an extra effort for someone who is neither providing nor using the data/service/API. Furthermore, once the information is in the registry it’s dead, or at least at risk of death. What I mean is that there is little contact between the service provider and the service registry: the provider doesn’t really rely on the service registry for people to use their services, and the service registry doesn’t actually use the information that it stores. Thus it’s easy for the provider to forget to tell the service registry when the information changes, and if it does change there is little chance of the registry maintainer noticing. So my suggestion is that those who are building aggregation services based on interoperating with various other sites provide access to information about the endpoints they use. An example of this working is the JournalTOCs service, which is an RSS aggregator for research journal tables of contents but which has an API that allows you to find information about the journals it knows about (JOPML showed the way here, taking information from a JISC project that spawned JournalTOCs and passing on lists of RSS feeds as OPML). Hopefully this approach of endpoint users providing information about what they use would mean that only information that actually works and is useful (at least for them) gets shared.
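The JOPML-style idea, an aggregator republishing the feed list it actually consumes, amounts to very little code. Here is a minimal sketch that turns a list of feeds into an OPML document; the feed title and URL are invented examples.

```python
# Publish the set of RSS endpoints an aggregator uses as OPML, so the
# endpoint list itself becomes machine-readable.
import xml.etree.ElementTree as ET

def feeds_to_opml(feeds):
    """feeds is a list of (title, xml_url) pairs; return an OPML document string."""
    opml = ET.Element("opml", version="2.0")
    ET.SubElement(ET.SubElement(opml, "head"), "title").text = "OER feeds"
    body = ET.SubElement(opml, "body")
    for title, url in feeds:
        # In OPML the feed URL lives in the xmlUrl attribute of an outline.
        ET.SubElement(body, "outline", type="rss", text=title, xmlUrl=url)
    return ET.tostring(opml, encoding="unicode")

opml = feeds_to_opml([
    ("Example OER collection", "http://example.org/oer/feed.rss"),
])
print(opml)
```

Because the aggregator depends on these endpoints itself, the list stays current in a way a standalone registry's entries tend not to.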

CETIS “What metadata…?” meeting summary

Yesterday we had a meeting in London with about 25 people thinking about the question “What metadata is really useful?”

My thinking behind having a meeting on this subject was that resource description can be a lot of effort, so we need to be careful that the decisions we make about how it is done are evidence-based. Given the right data we should be able to get evidence about what metadata is really used for, as opposed to what we might speculate that it is useful for (with the caveat that we need to allow for innovation, which sometimes involves supporting speculative usage scenarios). So, what data do we have and what evidence could we get that would help us decide such things as whether providing a description of a characteristic such as the “typical learning time for using a resource” is or isn’t helpful enough to justify the effort? Pierre Far went to an even more basic level and asked in his presentation: why do we use XML for sharing metadata? Is it the result of a reasoned appraisal of the alternatives, such as JSON, or did it just seem the right thing to do at some point?

Dan Rehak made the very useful point to me that we need a reason for wanting to answer such questions, i.e. what is it we want to do? what is the challenge? Most of the people in the room were interested in disseminating educational resources (often OERs): some have an interest in disseminating resources that had been provided by their own project or organization, others have an interest in services that help users find resources from a wide range of providers. So I had “help users find resources they needed” as the sort of reason for asking these questions; but I think Dan was after something new, less generic, and (though he would never say this) less vague and unimaginative. What he suggested as a challenge was something like “how do you build a recommender system for learning materials?” Which is a good angle, and I know it’s one that Dan is interested in at the moment; I hope that others can either buy into that challenge or have something equally interesting that they want to do.

I had suggested that user surveys, existing metadata and search logs are potential sources of data reflecting real use and real user behaviour, and no one disagreed, so I structured much of the meeting around discussion of those. We had short overviews of examples of previous work on each of these, with some discussion, followed by group discussions in more depth on each. I didn’t want this to be an academic exercise; I wanted the group discussions to turn up ideas that could be taken forward and acted on, and I was happy at the end of the day. Here’s a sampler of the ideas turned up during the day:
* continue to build the resources with background information that I gathered for the meeting.
* promote the use of common survey tools, for example the online tool used by David Davies for the MeDeV subject centre (results here).
* textual analysis of metadata records to show what is being described in what terms.
* sharing search logs in a common format so that they can be analysed by others (echoes here of Dave Pattern’s sharing of library usage data and subsequent work on the business intelligence that can be extracted from it).
* analysis of search logs to show which queries yield zero hits, which would identify topics for which there was unmet demand.
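The zero-hit analysis in the last bullet is straightforward to sketch: given a log of queries and their hit counts, surface the most frequently tried queries that returned nothing. The log below is made-up data for illustration; a real log would need the paging and ambiguity handling discussed elsewhere in this post.

```python
# Find the zero-hit search terms in a log, ordered by how often users
# tried them: a direct measure of unmet demand.
from collections import Counter

LOG = [("mitosis", 12), ("welsh history", 0), ("biology", 40),
       ("welsh history", 0), ("knitting", 0)]

def unmet_demand(log):
    """log is a list of (term, hit_count); return zero-hit terms with frequencies."""
    misses = Counter(term for term, hits in log if hits == 0)
    return misses.most_common()

print(unmet_demand(LOG))  # [('welsh history', 2), ('knitting', 1)]
```

A collection manager could read the top of this list as a prioritised wish-list for new resources (or better descriptions of existing ones).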

In the coming weeks we shall be working through the ideas generated at the meeting in more depth, with the intention of seeing which can actually be brought to fruition. In the meantime keep an eye on the wiki page for the meeting, which I shall be turning into a more detailed record of the event.

Jorum and Google ranking

Les Carr has posted an interesting analysis of Visibility of OER Material: the Jorum Learning and Teaching Competition. He searches for six resources on Google and compares the ranking in the results page of the copy on Jorum with copies of the resource hosted elsewhere. The results are mixed: sometimes Jorum has the top place, sometimes some other site (institutional or author’s site) is top, though it should be said that with one exception we’re talking about which is first and which is second. In other words, both would be found quite easily.

Les concludes:

Can we draw any general patterns from this small sample? To be honest, I don’t think so! The range of institutions is too diverse. Some of the alternative locations are highly visible, so it is not surprising that Jorum is eclipsed by their ranking (e.g. Cambridge, very newsworthy Gurkhas international organisation). Some 49% of Open Jorum’s records provide links to external sources rather than holding bitstream contents directly. It would be very interesting to see the bigger picture of OER visibility by undertaking a more comprehensive survey.

Yes it would be very interesting to see the bigger picture, and also it would be interesting to see a more thorough investigation of just the Jorum’s role (I don’t think Les will mind the implication that he has no more than scraped the surface).

Some random thoughts that this raises in my mind:

  • Title searches are too easy; the quality of resource description will only be tested by searching for the keywords that are really used by people looking for these resources. Some will know the title of the resource, but not many. Just have a play with using the most important one or two words from the title rather than the whole title and see how the results change.
  • To say that Jorum enhances/doesn’t enhance visibility depending on whether it comes above or below the alternative sites is too simplistic. If it links to the other site Jorum will enhance the visibility of that site even if it ranks below it; having the same resource represented twice in the search engine results page enhances its visibility no matter what the ordering; on the other hand, having links from elsewhere pointing to two separate sites probably reduces the visibility of both.
  • Sometimes Jorum hosts a copy of the resource, sometimes it just points to a copy elsewhere; that’s got to have an effect (hasn’t it?).
  • What is the cause of the difference? When I’ve tried similar (very superficial) comparisons, I’ve noticed that Jorum gets some of the basics of SEO right (e.g. using the resource’s title in the HTML title element; curiously it doesn’t seem to provide a meta description). How does this compare to other hosts? I’ve noticed some other OER sites that don’t get this right, so we could see Jorum as guaranteeing a certain basic quality of resource discovery rather than as necessarily enhancing visibility. (Question: is this really necessary?)
  • What happens over time? Do people link to the copy in the Jorum or elsewhere? This will vary a lot, but there may be a trend. I’ll note in passing that choosing six resources that had been promoted by Jorum’s learning and teaching competition may have skewed the results.
  • Which should be highest ranked anyway? Do we want Jorum to be highly ranked to reflect its role as part of the national infrastructure, a place to showcase what you’ve produced; or do institutions see releasing OERs as part of a marketing strategy, and the best Jorum can do is quietly improve the ranking of the OERs on the institution’s site by linking to them? This surely relates to the choice between having Jorum host the resource or just having it link to the resource on the institutions site (doesn’t it?).
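The basic SEO check mentioned above (does a page set an HTML title and a meta description?) is easy to automate with the standard library. The sketch below parses a page and extracts both; the sample page is invented, and a real check would fetch each resource's URL first.

```python
# Extract the <title> text and the meta description from an HTML page,
# the two on-page basics that search engines use for result snippets.
from html.parser import HTMLParser

class SeoCheck(HTMLParser):
    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta_description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") == "description":
            self.meta_description = attrs.get("content", "")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data

page = ('<html><head><title>Intro to mitosis</title>'
        '<meta name="description" content="A CC BY licensed lecture">'
        '</head><body></body></html>')
checker = SeoCheck()
checker.feed(page)
print(checker.title)             # Intro to mitosis
print(checker.meta_description)  # A CC BY licensed lecture
```

Run over a sample of OER hosts, a check like this would give some evidence for the "guaranteeing a basic quality of resource discovery" point above.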

Sorry, all questions and no answers!

An open and closed case for educational resources

I gave a pecha kucha presentation (20 slides, 20 seconds per slide) at the Repository Fringe in Edinburgh last week. I’ve put the slides on slideshare, and there’s also a video of the presentation but since the slides are just pictures, and the notes are a bit disjointed, and my delivery was rather rushed, it seems to me that it would be useful to reproduce what I said here. Without the 20 second per slide constraint.

The main thrust of the argument is that Open Educational Resource (OER) or OpenCourseWare (OCW) release can be a good way of overcoming some of the problems institutions have regarding the management of their learning materials. By OER or OCW release we mean an institution, group or individual disseminating their educational resources under creative commons licences that allow anyone to take and use those resources for free. As you probably know over the last year or so HEFCE have put a lot of money into the UKOER programme.

I first started thinking about this approach in relation to building repositories four or five years ago.

I was on the advisory group for a typical institutional learning object repository project. The approach that they, and many others like them at the time, had chosen was to build a closed, inward-facing repository, providing access and services only within the institution. The project was concerned about interoperability with their library systems and worried a lot about metadata.

The repository was not a success. In the final advisory group meeting I was asked whether I could provide an example of an institution with a successful learning object repository. I gave some rambling, unsatisfactory answer about how there were a few institutions trying the same approach, but it was difficult to know what was happening since they (like the one I was working with) didn’t want to broadcast much information about what they were doing.

And two days later it dawned on me that what I should have said was MIT.

MIT OpenCourseWare
At that time MIT’s OpenCourseWare initiative was by far the most mature open educational resource initiative, but now we have many more examples. But in what way does OER-related activity relate to the sort of internal management of educational materials that concerns projects like the one with which I was involved?

The challenges of managing educational resources
The problems that institutional learning object repositories were trying to solve at that time were typically these:

  • they wanted to account for what educational content they had and where it was;
  • they wanted to promote reuse and sharing within the Institution;
  • they wanted more effective and efficient use of resources that they had paid to develop.

And why, in general, did they fail? I would say that there was a lack of buy-in or commitment all round: a lack of motivation from staff to deposit, and a lack of awareness that the repository even existed. Also, there was more focus on the repository per se and on systems interoperability than on directly addressing the needs of stakeholders.

Does an open approach address these challenges?

Well, firstly, by putting your resources on the open web everyone will be able to access them, including the institution’s own staff and students. What’s more once these resources are on the open web they can be found using Google, which is how those staff and students search. Helping your staff find and have access to the resources created by other staff helps a lot with promoting reuse and sharing within the institution.

It is also becoming apparent that there are good institution-level benefits from releasing OERs.

For example the OU have traced a direct link from use of their OpenLearn website to course enrolment.

In general terms, open content raises the profile of the institution and its courses on the web, providing an effective shop window for the institution’s teaching, in a way that an inward facing repository cannot. Open content also gives prospective students a better understanding of what is offered by an institution’s courses than a prospectus can, and so helps with recruitment and retention.

There’s also a social responsibility angle on OERs. On launching the Open University’s OpenLearn initiative, Prof. David Vincent said:

Our mission has always been to be open to people, places, methods and ideas and OpenLearn allows us to extend these values into the 21st century.

While the OU is clearly a special case in UK Higher Education, I don’t think there are many working in Universities who would say that something similar wasn’t at least part of what they were trying to do. Furthermore, there is a growing feeling that material produced with public funds should be available to all members of the public, and that Universities should be of benefit to the wider community not just to those scholars who happen to work within the system.

Another, less positive, harder-edged angle on social responsibility was highlighted in the ruling on a Freedom of Information request where the release of course material was required. The Information Tribunal said

it must be open to those outside the academic community to question what is being taught and to what level in our universities

We would suggest that we are looking at a future where open educational resources should be seen as the default approach, and that a special case should need to be made for resources that a public institution such as a university wants to keep “private”. But for now the point we’re making is that social responsibility is a strong motivator for some individuals, institutions and funders.

Releasing educational content openly on the web requires active management of intellectual property rights associated with the content used for teaching at the institution. This is something that institutions should be doing anyway, but they often fudge it. They should address questions such as:

  • Who is responsible for ensuring there is no copyright violation?
  • Who owns the teaching materials, the lecturer who wrote them or the institution?
  • Who is allowed to use materials created by a member of staff who moves on to another institution?

The process of applying open licences helps institutions address these issues, and other legal requirements such as responding to freedom of information requests relating to teaching materials (and they do happen).

Not all doom and gloom
Some things do become simpler when you release learning materials as OERs.

For example access management for the majority of users (those who just want read-only access) is a whole lot simpler if you decide to make a collection open; no need for the authentication or authorization burden that typically comes with making sure that only the right people have access.

On a larger scale, the Open University have found that setting up partnerships for teaching and learning with other institutions becomes easier if you no longer have to negotiate terms and conditions for mutual access to course materials from each institution.

Some aspects of resource description also become easier.

Some (but not all) OER projects present material in the context in which it was originally delivered, i.e. arranged as courses (the MIT OCW course, a screen capture of which I used above, is one example). This may have some disadvantages, but the advantage is that the resource is self-describing: you don’t have to rely solely on metadata to convey information such as educational level and potential educational use. This is especially important because, whereas most universities can describe their courses in ways that make sense, we struggle to agree controlled vocabularies that can be applied across the sector.

Course or resources?
The other advantage of presenting the material as courses, rather than disaggregated as individual objects, is that the course as a whole is more likely to be useful to learners.

Of course, the presentation of resources in the context of a course should not stop anyone from taking or pointing to a single component resource and using it in another context. That should be made as simple as possible; but it’s always going to be very hard to go in the other direction: once a course is disaggregated it’s very hard to put it back together. (The source of the material could describe how to put it back together, or how it fitted in with other parts of a course, but then we’re back into the creation of additional metadata.)

Summary and technical
What I’ve tried to say is that putting clearly licensed stuff onto the open web solves many problems.

What is the best technology genre for this: repository, content management system, VLE, or Web 2.0 service? Within the UKOER programme all four approaches were used successfully. Some of these technologies are primarily designed for local management and presentation of resources rather than open dissemination, and vice versa. There’s no consensus, but there is a discernible trend towards using a diversity of approaches and mixing-and-matching; for example, some UKOER projects used repositories to hold the material and push it to Web 2.0 services, while others pulled material in the other direction.

ps: While I was writing this, Timothy Vollmer over on the CreativeCommons blog was writing “Do Open Educational Resources Increase Efficiency?” making some similar points.

Image credits
Most of the images are sourced from Flickr and have one or another flavour of creative commons licence. From the top:

CETIS Gathering

At the end of June we ran an event about technical approaches to gathering open educational resources. Our intent was that we would provide space and facilities for people to come and talk about these issues, but we would not prescribe anything like a schedule of presentations or discussion topics. So, people came, but what did they talk about?

In the morning we had a large group discussing approaches to aggregating resources and information about them through feeds such as RSS or Atom, and another smaller group discussing tracking what happens to OER resources once they are released.

I wasn’t part of the larger discussion, but I gather that they were interested in the limits of what can be brought in by RSS, and in difficulties due to the (shall we say) flexible semantics of the elements typically used in RSS, even when extended in the typical way with Dublin Core. They would like to bring in information which is more tightly defined, and also information from a broader range of sources relating to the actual use of the resource. They would also like to identify the contents of resources at a finer granularity (e.g. an image or movie rather than a lesson) while retaining the context of the larger resource. These are perennial issues, and bring to my mind technologies such as OAI-PMH with metadata richer than the default Dublin Core, Dublin Core Terms (in contrast to the Dublin Core Element Set), OAI-ORE, and projects such as PerX and TicToCs (see JournalToCs) (just to mention two which happened to be based in the same office as me). At CETIS we will continue to explore these issues, but I think it is recognised that the solution is not as simple as using a new metadata standard that is in some way better than what we have now.
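To make the “flexible semantics” point concrete, here is a minimal sketch of pulling Dublin Core elements out of an RSS 2.0 feed item using only the Python standard library. The feed content is invented for illustration; the element names follow the RSS 2.0 and DC Element Set specifications, but real feeds differ widely in which elements they actually populate, which is exactly the aggregation problem discussed above.

```python
# Sketch: extracting Dublin Core elements from an RSS 2.0 feed.
# The feed below is a made-up example; real feeds may omit any of
# these elements, so every field has to be treated as optional.
import xml.etree.ElementTree as ET

RSS = """<?xml version="1.0"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Example OER feed</title>
    <item>
      <title>Lesson 1: Beam bending</title>
      <link>http://example.org/oer/beam-bending</link>
      <dc:creator>A. Lecturer</dc:creator>
      <dc:rights>http://creativecommons.org/licenses/by/3.0/</dc:rights>
    </item>
  </channel>
</rss>"""

NS = {"dc": "http://purl.org/dc/elements/1.1/"}

def items(feed_xml):
    """Yield a dict per item, with whatever DC elements happen to be there."""
    root = ET.fromstring(feed_xml)
    for item in root.iter("item"):
        yield {
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            "creator": item.findtext("dc:creator", namespaces=NS),
            "rights": item.findtext("dc:rights", namespaces=NS),
        }

for record in items(RSS):
    print(record["title"], "--", record["rights"] or "no licence info")
```

Note that nothing in the feed says whether `dc:rights` holds a licence URL, a free-text statement, or a rights-holder name; that ambiguity is what pushes people towards richer formats such as OAI-PMH records or OAI-ORE aggregations.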

The discussion on tracking resources (summarized here by Michelle Bachler) was prompted by some work from the Open University’s OLNet on Collective Intelligence, and also some CETIS work on tracking OERs. For me the big “take-home” idea was that many individual OER providers and services must have information about the use of their resources which, while interesting in itself, would become really useful if made available more widely. So how about, for example, open usage information about open resources? That could really give us some data to analyse.

There were some interesting overlaps between the two discussions: for example, how to make sure that a resource is identified in such a way that you can track it and gather information about it from many sources, and what role usage information can play in the full description of a resource.

After lunch we had a demo of a search service built by cross-searching web 2.0 resource hosts via their APIs, which has been used by the Engineering Subject Centre’s OER pilot project. This led on to a discussion of the strengths and limitations of this approach: essentially it is relatively simple to implement and can be used to provide a tailored search for a specialised OER collection, so long as the number of targets being searched is reasonably low and their APIs are stable and reliable. The general approach of pulling in information via APIs could be useful in pulling in some of the richer information discussed in the morning. The diversity of APIs led on to another well-rehearsed discussion mentioning SRU and OpenSearch as standard alternatives.
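The cross-search pattern can be sketched in a few lines. This is not the Engineering Subject Centre’s actual implementation; the target names and URL templates below are invented, and a real service would fetch each URL and parse the response feed. The `{searchTerms}` placeholder follows the OpenSearch URL-template convention, which is the standardisation the discussion pointed to.

```python
# Sketch of cross-searching several hosts: each target publishes an
# OpenSearch-style URL template, the aggregator fills in the search
# terms, fetches each URL (omitted here), and merges the results.
from urllib.parse import quote

# Invented targets for illustration -- real hosts each have their own API.
TARGETS = {
    "flickr": "https://example.org/flickr-proxy?q={searchTerms}",
    "slideshare": "https://example.org/slideshare-proxy?q={searchTerms}",
}

def build_queries(terms):
    """Fill the {searchTerms} slot in each target's URL template."""
    return {name: tmpl.replace("{searchTerms}", quote(terms))
            for name, tmpl in TARGETS.items()}

def merge(result_lists):
    """Merge per-target result lists, de-duplicating on resource URL."""
    seen, merged = set(), []
    for results in result_lists:
        for r in results:
            if r["url"] not in seen:
                seen.add(r["url"])
                merged.append(r)
    return merged
```

The fragility the discussion identified lives in what is omitted here: every target’s response format, rate limits, and error behaviour must be handled separately, which is why the approach only scales to a handful of stable APIs.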

We also had a demonstration of the iCOPER search / metadata enrichment tool which uses REST, Atom and SPI to allow annotation of metadata records–very interesting as a follow-on from the discussions above which were beginning to see metadata not as a static record but as an evolving body of information associated with a resource.
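To illustrate the idea of metadata as an evolving body of information rather than a static record, here is a minimal sketch (not the actual iCOPER SPI) of representing an annotation as an Atom entry, which could then be POSTed to an AtomPub collection. The element names follow RFC 4287; the annotation payload and function name are invented.

```python
# Sketch: an annotation of a resource expressed as an Atom entry,
# suitable for POSTing to an AtomPub collection. Not the iCOPER API.
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
ET.register_namespace("", ATOM)  # serialise Atom as the default namespace

def annotation_entry(resource_url, author, comment):
    """Build an Atom entry whose rel="related" link points at the
    resource being annotated and whose content holds the comment."""
    entry = ET.Element(f"{{{ATOM}}}entry")
    ET.SubElement(entry, f"{{{ATOM}}}title").text = f"Annotation of {resource_url}"
    who = ET.SubElement(entry, f"{{{ATOM}}}author")
    ET.SubElement(who, f"{{{ATOM}}}name").text = author
    link = ET.SubElement(entry, f"{{{ATOM}}}link")
    link.set("rel", "related")
    link.set("href", resource_url)
    ET.SubElement(entry, f"{{{ATOM}}}content").text = comment
    return ET.tostring(entry, encoding="unicode")
```

Because each annotation is its own entry linked to the resource, the description accumulates over time from many contributors instead of being overwritten, which is the shift in thinking the demo prompted.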

Throughout the day, but especially after these demos, people were talking in twos and threes, finding out about QTI, Xerte, Cohere, and anything else that one person knew about and others wanted to hear about. I hope the people who came found it useful, but it’s very difficult as an organiser of such an event to provide a single definitive summary!