JISC Learning Registry Node Experiment

Posted on November 7, 2011 by Lorna Campbell

Over the last decade the volume and range of educational content available on the Internet has grown exponentially, boosted by the recent proliferation of open educational resources. While search engines such as Google have made it easier to discover all kinds of content, one critical factor is missing where educational resources are concerned – context. Whether you are a teacher, learner or content provider, when it comes to discovering and using educational resources, context is key. Search engines may help you to find educational resources but they will tell you little of how those resources have been used, by whom, in what context and with which outcome.

Formal educational metadata standards have gone some way to addressing this problem, but it has proved to be extremely difficult to capture the educational characteristics of resources and the nuances of educational context within the constraints of a formal metadata standard. Indeed it is notoriously difficult to formally describe what a learning resource is, never mind how and by whom it may be used. Despite the not inconsiderable effort that has gone into the development of formal metadata standards, data models, bindings, application profiles and crosswalks the ability to quickly and easily find educational resources that match a specific educational context, competency level or pedagogic style has remained something of a holy grail.

A new approach to this problem is currently being explored by the Learning Registry, an innovative project being led and funded by the U.S. Department of Education and U.S. Department of Defense. In a guest blog post for CETIS in March this year, ADL Senior Technical Advisor Dan Rehak explained that the Learning Registry intends to offer an alternative approach to learning resource discovery, sharing and usage tracking by prioritising sharing of second-party usage data and analytics over first-party metadata.

Dan set out the Learning Registry’s use case as follows:

“Let’s assume you found several animations on orbital mechanics. Can you tell which of these are right for your students (without having to preview each)? Is there any information about who else has used them and how effective they were? How can you provide your feedback about the resources you used, both to other teachers and to the organizations that published or curated them? Is there any way to aggregate this feedback to improve discoverability?

The Learning Registry is defining and building an infrastructure to help answer these questions. It provides a means for anyone to ‘publish’ information about learning resources. Beyond metadata and descriptions, this information includes usage data, feedback, rankings, likes, etc.; we call this ‘paradata’”

Paradata is essentially a stream of activity data about a learning resource that effectively provides a dynamic timeline of how that resource has been used. As more usage data is collaboratively gathered and published the paradata timeline grows and evolves, amplifying the available knowledge about what educational resources are effective in which learning contexts. The Learning Registry team refer to this approach as “social networking for metadata”.

The Learning Registry itself is not a search engine, a repository, or a registry in the conventional sense. Instead the project aims to produce a core transport network infrastructure and will rely on the community to develop their own discovery tools and services, such as search engines, community portals and recommender systems, on top of this infrastructure. Dan commented: “We assume some smart people will do some interesting (and unanticipated) things with the timeline data stream.”

The Learning Registry infrastructure is built on CouchDB, a NoSQL-style “document oriented database” providing a RESTful JSON API. The initial Learning Registry development implementation, or node, is available as an Amazon Machine Image, hosted on Amazon EC2. This enables anyone to set up their own node on the Amazon cloud quickly and easily. As CouchDB is a cross-platform application, nodes can be run on most systems (e.g. Windows, Mac, Linux). The Learning Registry plan to produce zero-config installers to simplify the process of adding nodes to the network, with the aim that developers should be able to set up their own node within a day. These nodes will form a decentralised network with each participant configuring their own rules regarding access permissions and what data they gather and share.
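As a rough illustration of what a RESTful JSON API means in practice, the sketch below prepares (but does not send) an HTTP request that would store a JSON document in a CouchDB database. The host, database name, document id and document fields are all hypothetical, and this uses CouchDB's generic document interface rather than any Learning Registry service.

```python
import json
import urllib.request

# Hypothetical values for illustration only; not a real Learning Registry node.
NODE_URL = "http://localhost:5984"
DATABASE = "resource_data"


def build_put_request(doc_id, document):
    """Prepare a CouchDB-style PUT request that would store one JSON document."""
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url=f"{NODE_URL}/{DATABASE}/{doc_id}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# The request is only constructed here; nothing is sent over the network.
request = build_put_request(
    "example-doc-1",
    {"resource_url": "http://example.org/orbital-mechanics", "note": "illustrative only"},
)
print(request.get_method(), request.full_url)
```

Passing the prepared request to urllib.request.urlopen would send it to a running CouchDB instance.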

Although the Learning Registry will encourage users to produce their own tools and services on top of the network of nodes, the development team have defined a small set of non-core APIs for integration with existing edge services, e.g. SWORD for repository publishing and OAI-PMH for harvesting from the network to local stores.
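To give a flavour of what harvesting over OAI-PMH involves, here is a minimal sketch that builds a ListRecords request URL. The base URL is a made-up placeholder; the verb and parameter names (verb, metadataPrefix, from) are those defined by the OAI-PMH protocol itself.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; a real node would publish its own base URL.
BASE_URL = "http://node.example.org/OAI-PMH"


def list_records_url(metadata_prefix="oai_dc", from_date=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        params["from"] = from_date  # harvest only records changed since this date
    return f"{BASE_URL}?{urlencode(params)}"


print(list_records_url(from_date="2011-11-01"))
```

A harvester would fetch this URL, store the returned records locally, and repeat the request later with a more recent from date.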

A key feature of the Learning Registry is that it is metadata agnostic; it will accept legacy metadata in any format and will not attempt to harmonise the metadata it consumes. The team have also developed a specification for sharing and exchanging paradata which is inspired by the Activity Streams format.
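The idea of paradata as a growing timeline of activity can be sketched in a few lines. The records below use actor / verb / object fields in the general spirit of Activity Streams, but the field names and values are illustrative assumptions and are not taken from the Learning Registry paradata specification.

```python
from collections import Counter

# Hypothetical paradata-style records; field names are illustrative,
# not taken from the Learning Registry paradata specification.
timeline = [
    {"actor": "teacher-a", "verb": "viewed", "object": "http://example.org/orbits", "date": "2011-03-01"},
    {"actor": "teacher-b", "verb": "rated", "object": "http://example.org/orbits", "date": "2011-03-04"},
    {"actor": "teacher-c", "verb": "viewed", "object": "http://example.org/orbits", "date": "2011-03-02"},
    {"actor": "teacher-a", "verb": "viewed", "object": "http://example.org/tides", "date": "2011-03-05"},
]


def resource_timeline(records, resource_url):
    """Return the activity recorded for one resource, oldest first."""
    matching = [r for r in records if r["object"] == resource_url]
    return sorted(matching, key=lambda r: r["date"])


def verb_counts(records, resource_url):
    """Count how often each kind of activity was recorded for a resource."""
    return Counter(r["verb"] for r in records if r["object"] == resource_url)


orbits = resource_timeline(timeline, "http://example.org/orbits")
print([r["date"] for r in orbits])
print(dict(verb_counts(timeline, "http://example.org/orbits")))
```

As more records are published the same two functions simply return a longer timeline and larger counts, which is the sense in which the timeline “grows and evolves”.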

As a leading innovator in digital infrastructure for resource discovery JISC have followed the development of the Learning Registry with interest, and in keeping with our remit as a JISC Innovation Support Centre CETIS have fostered a strategic working relationship with the Learning Registry team. In addition to maintaining a watching brief on the project, participating in the technical development working group, and submitting position papers to the Learning Registry summit, CETIS have also liaised directly with the project’s developers and technical advisor and communicated relevant strategic and technical developments back to JISC and the community. The Learning Registry team have also engaged closely with the JISC, CETIS and the UK technical development community by participating in two DevCSI hackdays, contributing to several CETIS events, and attending a number of JISC strategic planning meetings.

JISC have now extended this innovative collaboration with the announcement that they will fund the development of a Learning Registry test node, the first to be developed outwith the US. The node will be developed at MIMAS with input and support from JISC CETIS.

In a press release JISC’s Amber Thomas commented,

“This international collaboration will see us contributing the UK’s expertise to the Learning Registry. We are working with Mimas and JISC Cetis to support the Registry’s vision of gathering together the conversations, ratings, recommendations and usage data around digital content.”

And Steve Midgley, Deputy Director, Office of Education Technology at the US Department of Education added,

“I am greatly encouraged by the collaboration and opportunity presented by our work with JISC on the Learning Registry.”

The Learning Registry project has already generated considerable interest in the UK. We believe that technical developers, infrastructure managers and resource providers will have much to learn from the JISC Learning Registry test node development and we hope that ultimately educational communities in both the US and the UK will benefit from this innovative project.

Further Reading

The Learning Registry: Social Networking for Metadata by Dan Rehak, ADL.
The Learning Registry Plugfest: Report and Developments by Pat Lockley, University of Oxford.
The Learning Registry in 20 minutes or less.
Paradata in 20 minutes or less.
Paradata Specification, Version 1 (JSON schema).
Resource Data Data Model: Learning Registry Technical Specification V RM:0.49.0.

Event: Advances in Open Systems for Learning Resources

Posted on July 7, 2011 by Lorna Campbell

Interested in new developments and advances in open systems for managing learning resources? Yes? Good! Because CETIS are running an event on this very topic as part of this year’s Repository Fringe in Edinburgh. Repository Fringe 2011 takes place on Wednesday 3rd and Thursday 4th August with the CETIS “Advances in Open Systems for Learning Resources” event on Friday 5th August.

Encouraged by recent initiatives promoting the release of openly licensed educational resources there have been considerable developments in the innovative use of repositories, content management systems and web based tools to manage and share materials for teaching and learning. This event will bring together developers and implementers of open repositories, content management systems and other tools to present and discuss recent updates to their systems and their application to learning resources.

The speakers lined up for this event will cover a diverse range of topics that relate to “open systems”. These include open source repository system software, repositories of openly licensed content, open access repositories, open standards and open APIs.

Confirmed speakers include:

Patrick Mc Sweeney, University of Southampton, talking about “Community Engagement in Teaching and Learning Repositories: ePrints, HumBox and OER”.
John Casey, University of the Arts, presenting the ALTO OER Ecosystem.
Dan Rehak of ADL, outlining progress on the US Learning Registry initiative.
Terry McAndrew, University of Leeds, “Getting Bioscience Open Educational resources into ‘Academic Orbit’. Tales from the OeRBITAL launchpad”.
Charles Duncan, Intrallect Ltd, will discuss the development of an item bank repository.

More speakers are still being confirmed so keep an eye on the agenda for updates.

Both the Repository Fringe and the CETIS workshop are free to attend and you can register for either or both events via Eventbrite here.

DevCSI OER Hack Day Report

Posted on May 24, 2011 by Lorna Campbell

I am woefully late in amplifying this; however, Kirsty Pitkin has produced an excellent summary of the joint UKOLN CETIS DevCSI OER Hackday that took place in Manchester last month. The two day event drew a wide range of participants from the UK and US including delegates from the Universities of Leeds, Newcastle, Oxford, Bolton and Nottingham, East Riding College, Harper Adams University College, the Open University, the US Learning Registry Initiative, Open Michigan and ISKME, together with colleagues from JISC, CETIS and UKOLN.

Kirsty’s report includes video interviews with many of the hackday participants and also presents a comprehensive summary of the projects developed at the event. These included:

The Course Detective – a Google custom search engine to search over the undergraduate prospectus pages for all UK universities.
WordPress tools, hacks and workflows for OER
Generating Paradata from MediaWiki – how to contribute paradata back into the Learning Registry by building a simple data pump that mines MediaWiki and transforms it into a paradata envelope for the Learning Registry.
Sacreligious – an OER version of Delicious, built on Django.
Xpert / Learning Registry Connection – working with the Xpert search API to parse it and push it into the Learning Registry
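The “data pump” idea behind the MediaWiki project can be sketched as a simple transformation from wiki revision records to activity statements. This is not the hackday code: the input records, the output field names and the wiki URL are all hypothetical.

```python
def revision_to_paradata(page_url, revision):
    """Map one wiki revision record onto a simple actor/verb/object statement."""
    return {
        "actor": revision["user"],
        "verb": "edited",
        "object": page_url,
        "date": revision["timestamp"],
    }


# Hypothetical revision records, as they might be mined from a wiki's history.
revisions = [
    {"user": "ExampleEditor", "timestamp": "2011-04-01T10:15:00Z"},
    {"user": "AnotherEditor", "timestamp": "2011-04-02T09:00:00Z"},
]
statements = [
    revision_to_paradata("http://wiki.example.org/Orbital_mechanics", r)
    for r in revisions
]
print(len(statements), statements[0]["verb"])
```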

You can read Kirsty’s full report here: OER Hack Day, and I can also recommend the OER Hack Day Social Summary

JISC CETIS OER Technical Mini-Projects Proposals and Discussion

Posted on April 14, 2011 by Lorna Campbell

The bids are in for the JISC CETIS OER Technical Mini Projects and there’s a lively discussion going on over at oer-discuss@jiscmail.com.

We’ve taken a new approach to the Technical Mini Projects call that builds on rapid innovation funding models already employed by the JISC. Interested parties were asked to submit short 1,500-word proposals to the mailing list so that all bids can be openly discussed by members of the CETIS OER Technical Interest Group and anyone else who happens to be interested.

Four diverse proposals were submitted for the open strand of the call, though unfortunately we didn’t get any bids for the other two strands (more about that later…). The open strand bids are as follows:

1. Development of Visual Vocabulary Management Tools from Dr Ian Piper, Tellura Information Services Ltd
2. OER Bookmarking Initiative from Paul Horner, James Outterside, Suzanne Hardy and Simon Cotterill, University of Newcastle.
3. Representing Aggregations of Open Educational Resources Utilising OAI-ORE from Alex Lydiate, Vic Jenkins and Kyriaki Anagnostopoulou, University of Bath
4. CaPRéT Cut and PAste reuse and Tracking from Brandon Muramatsu, MIT OEIT and Justin Ball and Joel Duffin, Tatemae.

We’d welcome any constructive comments on these proposals, either here or on the oer-discuss mailing list which is open to all. You can catch up on the discussions at the oer-discuss archive.

The outcome of the call will be decided by a panel of JISC and CETIS staff on Tuesday the 19th of April. All comments posted by the end of the day on Monday the 18th will be considered.

Thanks to all those who were brave enough to submit public proposals to this experimental open call, and also to those who have already contributed to the discussion!

Ranking and SEO – light on a dark art

Posted on February 9, 2011 by Lorna Campbell

Search engine optimisation can seem like a bit of a black art, particularly given that search engines can and do change their algorithms with little or no prior warning or documentation. However there is growing awareness that if institutions, projects or individuals wish to have a visible web presence and to disseminate their resources efficiently and effectively, search engine optimisation and ranking cannot be ignored. Indeed at the JISC HEA OER Phase 2 Programme meeting in January the projects flagged up SEO as being an area where they would appreciate more support and guidance.

Coincidentally the day before the programme meeting Jenny Gray of the OU raised a query on the oer-discuss list about an unexplained drop off in traffic to OpenLearn from Google, which she suspected was a result of a change to the Google algorithm. Several people responded with helpful suggestions including Lisa McLaughlin of the Institute for the Study of Knowledge Management in Education (ISKME) who forwarded some invaluable advice on search engine ranking and optimisation from her colleague Julie Walling.

Julie has now written a similar post on the ISKME Research blog: Trouble shooting a Drop in Search Engine Rankings. This helpful blog post outlines a set of questions that can be used to troubleshoot whether a drop in rankings is the result of a change in a search engine algorithm, or due to an issue with the website in question. Recognising that SEO can be extremely complex and that the cause of ranking changes can be elusive, Julie sets out some basic principles to bear in mind. These include:

1. Structure sites so they are as content rich as possible
2. Pick one keyword per page and stick to it
3. Include your keyword in the anchor text of internal links
4. Attract high value external links
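The third principle lends itself to a quick automated check. The sketch below, which is my own illustration rather than anything from Julie’s post, scans a page for internal links whose anchor text does not contain a chosen keyword. It assumes that internal links are those with site-relative hrefs beginning with “/”, and the sample page is invented.

```python
from html.parser import HTMLParser


class AnchorTextCollector(HTMLParser):
    """Collect (href, anchor text) pairs for every link in a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only gather text while inside a link that has an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def internal_links_missing_keyword(html, keyword):
    """Return internal links (site-relative hrefs) whose anchor text lacks the keyword."""
    collector = AnchorTextCollector()
    collector.feed(html)
    return [
        (href, text)
        for href, text in collector.links
        if href.startswith("/") and keyword.lower() not in text.lower()
    ]


page = (
    '<p><a href="/oer/physics">physics open resources</a> '
    '<a href="/oer/maths">click here</a> '
    '<a href="http://example.org/">elsewhere</a></p>'
)
print(internal_links_missing_keyword(page, "open resources"))
```

Here only the “click here” link is reported, since the first link already carries the keyword and the third is external.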

I can highly recommend Julie’s blog post to anyone interested in learning more about Google ranking and search engine optimisation more generally, and as an added bonus she also provides links to other useful resources on this arcane but important topic.

A TAACCCTful mandate? OER, SCORM and the $2bn grant

Posted on January 25, 2011 by Lorna Campbell

Last week’s announcement that the US Department of Labour is planning to allocate $2 billion in grant funds to the Trade Adjustment Assistance Community College and Career Training grants programme over the next four years has already generated a huge response online. $2 billion is a lot of money #inthiscurrentclimate, or indeed in any climate; however, the reason that this announcement has generated so much heat is that it has been billed as $2 billion for open educational resources and, furthermore, it mandates the use of SCORM. Although there has been almost universal approval that the TAACCCT call mandates the use of the CC-By license, the inclusion of the SCORM mandate has stirred up a bit of a hornets’ nest. John Robertson of CETIS has helpfully curated the tweet storm as it escalated over the course of the day. You can follow it here: To SCORM or not to SCORM.

Before attempting to summarise the arguments for and against this mandate, it is worth highlighting the following points from the Department of Labour’s Solicitation for Grant Applications:

The programme, which is releasing $500 million in the first instance, states its aim as follows:

“The TAACCCT provides community colleges and other eligible institutions of higher education with funds to expand and improve their ability to deliver education and career training programs that can be completed in two years or less, are suited for workers who are eligible for training under the Trade Adjustment Assistance for Workers program, and prepare program participants for employment in high-wage, high-skill occupations.”

It goes on to state that:

“The Department is interested in accessible online learning strategies that can effectively serve the targeted population. Online learning strategies can allow adults who are struggling to balance the competing demands of work and family to acquire new skills at a time, place and pace that are convenient for them.”

The SCORM mandate appears under the heading Funding Priorities:

“All successful applicants that propose online and technology-enabled learning projects will develop materials in compliance with SCORM, as referenced in Section I.B.4 of this SGA. These courses and materials will be made available to the Department for free public use and distribution, including the ability to re-use course modules, via an online repository for learning materials to be established by the Federal Government.”

And the Creative Commons mandate is covered in Funding Restrictions: Intellectual Property rights.

“In order to further the goal of career training and education and encourage innovation in the development of new learning materials, as a condition of the receipt of a Trade Adjustment Assistance Community College and Career Training Grant (“Grant”), the Grantee will be required to license to the public (not including the Federal Government) all work created with the support of the grant (“Work”) under a Creative Commons Attribution 3.0 License (“License”).”

It is interesting to note that although the call mandates license and content interoperability formats it does not mandate the use of a specific metadata standard:

“All grant products will be provided to the Department with meta-data (as described in Section III.G.4) in an open format mutually agreed-upon by the grantee and the Department.”

The section in question refers to an appendix of keywords and tags which grantees are advised to use. However, I am unclear from the call whether “grant products” refers to bids and documentation or actual educational resources.

To coincide with the publication of the call, Creative Commons issued a press release with the following endorsement from incoming CEO Cathy Casserly:

“This exciting program signifies a massive leap forward in the sharing of education and training materials. Resources licensed under CC BY can be freely used, remixed, translated, and built upon, and will enable collaboration between states, organizations, and businesses to create high quality OER. This announcement also communicates a commitment to international sharing and cooperation, as the materials will be available to audiences worldwide via the CC license.”

Some bloggers, including Dave Cormier, University of Prince Edward Island, and Stephen Downes, National Research Council of Canada, initially responded with cautious optimism, seeing this initiative as a possible step towards ending “the text book industry as we know it.”

Cormier commented:

“This kind of commitment from the government, money at that scale, that much commitment to the idea of creative commons… this tells me that we might be ready to rid ourselves of the $150 introductory textbook and move to open content.”

Downes concurred because:

“First, government support removes the risk from using a Creative Commons license. Second, it’s enough money. $2 billion will actually produce a measurable amount of educational content. And third, it’s not the only game in town.”

However Cormier was sufficiently incensed about the inclusion of the SCORM mandate to launch a petition on twitter titled “Educational professionals against the enforcement of SCORM by the US Department of Education.”

(In actual fact the TAACCCT call comes from the Department of Labour rather than the Department of Education.)

Rob Abel, CEO of IMS, also responded in no uncertain terms to the inclusion of the SCORM mandate. In a blog post and open letter to IMS members Abel quoted President Obama’s pledge to “remove outdated regulations that stifle job creation and make our economy less competitive,” adding that the inclusion of the SCORM mandate is a “clear violation” of this pledge. Abel claimed that the SCORM mandate is a “ticking time bomb” that will “add enormous cost to the creation of the courses and to the platforms that must deliver them” and “stifle the intended outcomes of the historic TAACCCT investment”. Abel provides a long and detailed critique of SCORM and points out that, “IMS has spent the last five years bringing to market standards that will actually deliver on what SCORM promised” namely Common Cartridge, Learning Tools Interoperability (LTI), and Learning Information Services (LIS).

Chuck Severance, University of Michigan School of Information and IMS consultant, agreed with Abel’s comments while expanding his self-described rant into a critique of OER initiatives more generally. Severance argued that “this obsession with ‘making and publishing’ OER artefacts that are unsuitable for editing is why nearly all of this kind of work ends up dead and obsolete.” He adds that most OER initiatives “make some slick web site and then try to drive people to their site – virtually none of these efforts can demonstrate any real learning impact.” However Severance does believe that if educational resources are published in a remixable format with a creative commons license they can be of real value and cites his own book Python for Informatics by way of example. He also concedes that the problem is “difficult to solve” before concluding “IMS Common Cartridge is the best we have but it needs a lot more investment in both the specification and tools to support the specification fully.”

A rather more balanced argument was put forward by Michael Feldstein, Academic Enterprise Solutions, Oracle Corporation. While he agreed that mandating SCORM was a mistake he noted that SCORM and IMS CC have “substantially different affordances that are appropriate for substantially different use cases”. While recognizing that it is understandable that “the Federal government wants to mandate a particular standard for content reuse” he added that mandating any specific standard, whether SCORM, IMS CC, RSS or Atom, is likely to be problematic because “educational content re-use is highly context-dependent”. Instead Feldstein suggests:

“The better thing to do would be to require that grantees include in their proposal a plan for promoting re-use, which would include the selection of appropriate format standards.”

Which is exactly the approach taken by the JISC / Higher Education Academy OER Programmes.

Reflecting on these developments from across the pond I have to agree that mandating the use of SCORM for the creation of open educational resources does strike me as being somewhat curious to say the least. This is very much at odds with the approach taken by the JISC / HEA OER Programmes. UK OER does not mandate the use of any specific standards; however, there are detailed technical guidelines for the programme and CETIS provides technical support to all projects. That said, the TAACCCT programme is not an OER programme in the same sense as the JISC / HEA UK OER Programmes. It’s interesting to note that while the White House announcement by Hal Plotkin focuses squarely on open educational resources, the call itself uses slightly different terminology, referring instead to “open-source courses”.

As CETIS’s Scott Wilson pointed out on twitter, given TAACCCT’s focus on adaptive self-paced interactive content, the initiative appears to be more akin to the National Learning Network’s NLN Materials programme which ran for five years from 1999 and which also mandated the use of SCORM with a greater or lesser degree of success, depending on your perspective. This reflection led Amber Thomas of JISC to comment:

“It’s not that mandating standards for learning materials is always wrong, it was the right thing for the NLN Materials – its more nuanced than that. It’s about the point people are at and which direction things need taking in.”

At this stage and at this remove it’s difficult to comment further without knowing more about the rationale behind the Department of Labour’s decision to mandate the use of SCORM for this particular programme. Needless to say, CETIS will be following these developments with interest and will continue to disseminate any further developments.

This is not a blog post…#lwf11

Posted on January 11, 2011 by Lorna Campbell

Earlier today while listening to the livefeed of Learning Without Frontiers #lwf11 I had the misfortune to hear Katharine Birbalsingh presenting. Not since a recent Alt-C keynote have I seen such a vitriolic twitter backchannel. And in my opinion it was justified. Several people in my twitter feed missed the presentation but picked up on the backchannel and asked me to blog a summary of the talk. Well, I’ve written a short summary but I’ve decided not to post it because that would just be providing publicity for opinions that I actually find quite objectionable. So if you want the summary let me know and I’ll send it to you. I don’t really want such nonsense on my blog.

Dan Stucke has blogged a short summary of the presentation on his blog here.

OER 2 Technical Requirements

Posted on December 3, 2010 by Lorna Campbell

Following the experiences of projects funded under the HEFCE / Academy / JISC Open Educational Resources Pilot Programme CETIS have made some minor revisions to the technical guidelines for the current OER 2 Programme. These guidelines reiterate and hopefully clarify the guidelines provided in the Programme Circular and presented at the Programme Start Up Meeting.

Resource Description

As with the OER Pilot Programme, the OER 2 Programme will not mandate the use of one single platform to disseminate resources and one single metadata application profile to describe content. However projects still need to ensure that content released through the programme can be found, used, analysed, aggregated and tagged. In order to facilitate this, content will have to be accompanied by some form of metadata. In this instance metadata doesn’t necessarily mean de jure standards, application profiles, formal structured records, cataloging rules, subject classifications, controlled vocabularies and web forms. Metadata can also take the form of tags added to resources in applications such as flickr and YouTube, time and date information automatically added by services such as slideshare, and author name, affiliation and other details added from user profiles when resources are uploaded. Consequently the OER 2 Programme only mandates the following “metadata”:

Programme tag – ukoer

Project tag – each project should devise a short tag for use in conjunction with the programme tag. e.g. projectname

Title – of the resource being described

Author / owner / contributor – Most systems, whether repositories, VLEs or applications such as SlideShare, YouTube, etc., allow registered users to create a user profile detailing their name and other relevant details. When a user uploads a resource to such a system these details are usually associated with the resource.

Date – This is difficult to define in the context of open educational resources which have no formal publication date. Most applications are likely to record the date a resource is uploaded but it will also be important to record date of creation so users can judge the currency of a resource.

URL – Metadata must include a URL that locates the resource being described. The system must assign each item a unique URL.

Licence information – Creative Commons is the preferred licence for programme outputs. The cc:license element can be used to provide a URI for the licence chosen and the dc:rights element can be used to provide general textual information about copyrights, other IPR and licence. Embedding the licence within the resource is also recommended where practicable. Projects may refer to the OER IPR Support Project for further guidance.

Technical information such as file format, name and size may be added but is no longer mandatory.

The hash symbol # should be added to the programme and project tag for use on twitter. E.g. #ukoer for twitter, ukoer for blogs etc.
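As a small worked example, the mandated items above could be checked automatically before a resource is released. This is only a sketch: the field names are hypothetical labels for the items listed, not part of any OER 2 specification, and the sample record is invented.

```python
# Field names are hypothetical labels for the mandated items listed above.
MANDATORY_FIELDS = ["programme_tag", "project_tag", "title", "author", "date", "url", "licence"]


def missing_fields(record):
    """Return the mandated fields that are absent or empty in a record."""
    return [name for name in MANDATORY_FIELDS if not record.get(name)]


def twitter_tags(record):
    """Prefix the programme and project tags with # for use on twitter."""
    return ["#" + record["programme_tag"], "#" + record["project_tag"]]


record = {
    "programme_tag": "ukoer",
    "project_tag": "projectname",
    "title": "Introduction to orbital mechanics",
    "author": "A. Lecturer",
    "date": "2010-12-03",
    "url": "http://example.org/resources/orbital-mechanics",
    "licence": "",  # deliberately left empty to show the check
}
print(missing_fields(record))
print(twitter_tags(record))
```

For this record the check reports the empty licence field, and the tags come out as #ukoer and #projectname.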

Projects are also encouraged to think about providing additional information that will help people to find and access resources. For example:

Language information – The language of the resource.

Subject classifications – Specific subject classifications vocabularies are not mandated for the OER Programme. However if a controlled vocabulary is required, projects are advised to use a vocabulary that is already being used by their subject and domain communities. It is not recommended that projects attempt to create new subject classification vocabularies.

Keywords – May be selected from controlled vocabularies or may be free text.

Additional Tags – Tags are similar to keywords. They may be entered by the creator / publisher of a resource and by users of the resource and they are normally free text. Many applications such as flickr, SlideShare and YouTube support the use of tags.

Comments – Are usually generated by users of a resource and may describe how that resource has been used, in what context and whether its use was successful or otherwise.

Descriptions – In contrast to comments, descriptions are usually generated by the creator/ publisher of a resource and tend to be more authoritative. Descriptions may provide a wide range of additional information about a resource including information on how it may be used or repurposed.

It’s also useful for projects to be aware that once OERs are released they can easily become separated from their metadata descriptions, if this information is recorded in an associated file. Consequently projects are encouraged to consider embedding relevant descriptive information within the open educational resource where practicable. For further discussion of this approach see Open Educational resources, metadata and self description.

Delivery Platforms

Projects should deposit their content in JorumOpen and in at least one other openly accessible system or application with the ability to produce RSS and / or Atom feeds; for example an open institutional repository, an international or subject area open repository, an institutional website or blog, or a Web 2.0 service.

The RSS / Atom feed should list and describe the resources produced by the project, and should itself be easy to find. Where a project produces a large number of resources it may not be practical to include them all in one single feed. In such cases it may be necessary to create several feeds in order to list all the resources. If a number of feeds are required to represent the whole collection, the discovery of the complete set of feeds should be facilitated. A number of approaches to enable this are possible, e.g. by creating an OPML file and using multiple instances of the link element in the HTML header, or simply listing all feeds in a human readable web page.
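The sketch below shows one way a project with several feeds might generate both an OPML file and a set of feed-discovery link elements for an HTML header. The feed titles and URLs are invented, and the choice of RSS as the feed type is an assumption; an Atom feed would use application/atom+xml instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed titles and URLs.
FEEDS = [
    ("Project resources, part 1", "http://example.org/feeds/part1.rss"),
    ("Project resources, part 2", "http://example.org/feeds/part2.rss"),
]


def build_opml(title, feeds):
    """Build an OPML document listing each feed as an outline entry."""
    opml = ET.Element("opml", version="2.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for text, url in feeds:
        ET.SubElement(body, "outline", type="rss", text=text, xmlUrl=url)
    return ET.tostring(opml, encoding="unicode")


def html_link_elements(feeds):
    """Build one feed-discovery link element per feed for an HTML header."""
    links = []
    for title, url in feeds:
        link = ET.Element(
            "link", rel="alternate", type="application/rss+xml", title=title, href=url
        )
        links.append(ET.tostring(link, encoding="unicode"))
    return links


print(build_opml("Example project feeds", FEEDS))
for element in html_link_elements(FEEDS):
    print(element)
```

Either output gives an aggregator a single place from which to discover every feed in the collection.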

There are many other approaches that projects may choose to investigate and use to facilitate resource discovery including search engine optimisation, site maps, OAI-PMH or APIs for remote search (SRU, OpenSearch, ad hoc RESTful search). CETIS will provide further guidance on these approaches in due course.

Projects will be expected to report to JISC on resource use so it is highly recommended that if the chosen delivery platform has tracking functionality this should be switched on and monitored.

For an overview of the wide range of delivery platforms used by the OER Pilot Programme, projects may find it useful to refer to the UKOER Technology Overview.

Content Standards

The OER 2 Programme is expected to generate a wide range of content types so mandating specific content standards is impractical. However projects should consider using appropriate standards for sharing complex objects e.g. IMS Content Packaging, IMS Common Cartridge and IMS QTI for assessment items.

What We Hope To Learn

We have learned a great deal from the technical choices and experiences of the OER Pilot Projects but we still have much to learn about how to describe and distribute open educational resources most effectively on the open web. Consequently we strongly encourage projects to share their comments, queries, successes and frustrations with CETIS and with other OER 2 projects. CETIS OER Programme Support Officer R. John Robertson will be undertaking informal technical review calls with all OER 2 projects over the course of the programme. Feel free to comment here, or contact John with comments, queries and suggestions.

The #cetis10 Locate, Collate and Aggregate extravaganza

Posted on November 8, 2010 by Lorna Campbell

Next week Phil, John and I will be running a session at the JISC CETIS conference with the snappy title Locate, Collate and Aggregate. The aim of this session is to explore innovative technical approaches related to, but not confined to, the JISC / HEA OER 2 Programme which are applicable to finding, using and managing content for teaching and learning, including:

Building collections of OERs.
Drawing together information about learning resources
Building rich descriptions from disparate sources of information

We’ve got an eclectic bunch of contributors lined up including David Kay, Sero; Vic Lyte, MIMAS; James Burke, deBurca; Chris Taylor, oErbital; Rob Pearce, Engineering a Lo-Carbon Future; Pierre Far, OCW Search; Pat Lockley, Xpert and some bloke called Phil Barker. Our contributors will be presenting and leading short discussions on a diverse range of topics including cross-silo semantic search opportunities, using mainstream and niche search engines to discover OERs and automatic selection of resources for a UKOER collection.

We’ve also been promised the world premiere of the long awaited dogme masterpiece The Plight of Metadata by acclaimed repository manager and film maker Pat Lockley. Mr Lockley assures us that the film will be “awesome, despite the limited CGI budget.”

So who should attend this Locate, Collate and Aggregate extravaganza? Anyone interested in open content, innovative use and management of teaching and learning resources, techies, geeks, RSS wranglers, data miners and even the odd repository manager.

And what do we want? We want ideas! Lots of them! We want ideas, comments and input to other people’s ideas. We’re also looking for ideas for JISC CETIS technical mini-projects we can potentially take forward to run in parallel with the OER 2 Programme.

We’re not quite sure what the outputs of this session will be but we’re aiming to go beyond the boundaries of JISC programmes and domain focussed initiatives and we’re hoping for cross pollination and propagation of innovation ~~throughout the nation~~.

cetiswmd Activities

Posted on October 29, 2010 by Lorna Campbell

Phil has already blogged a summary of last week’s memorably tagged What Metadata or cetiswmd meeting. During the latter part of the meeting we split up to discuss practical tasks and projects that the community could undertake with support from CETIS and JISC to explore the kind of issues that were raised at the meeting. We agreed to draft a rough outline of some of these potential activities and then feed them back to the community for comment and discussion. So if you have any thoughts or suggestions please let us know. CETIS are proposing to set up a task group or working group of some kind to develop this work and to provide a forum to explore technical issues relating to resource description, management and discovery in the context of open educational resources.

I helped to facilitate the breakout group that focused on what we might be able to achieve by looking at existing metadata collections. Here’s an outline of the activity we discussed.

Textual Analysis of Metadata Records

A large number of existing collections of metadata records were identified by participants, including NDLR, JorumOpen, OU OpenLearn and US data.gov collections, all of which could be analysed to ascertain which fields are used most widely and how they are described. Clearly this metadata exists in a wide range of heterogeneous formats so the task is not as simple as comparing like with like. The “traditional” way to compare different metadata schema and records is through the use of crosswalks. However developing crosswalks is a non-trivial task that in itself requires considerable time and resource.

An alternative approach was put forward by ADL’s Dan Rehak who suggested treating the metadata collections as text, stripping out fields and formatting and running the raw data through a semantic analysis tool such as Open Calais. Open Calais uses natural language processing, machine learning and other methods to analyse documents and find the entities within them. Calais claim to go “well beyond classic entity identification and return the facts and events hidden within your text as well.”
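The first step of this approach, stripping out fields and formatting so that a record can be treated as plain text, might look something like the Python sketch below. The sample record is invented for illustration and assumes simple XML metadata; the sketch stops before any semantic analysis and does not call Open Calais or any other service.

```python
import xml.etree.ElementTree as ET

# An invented Dublin Core style record, used only for illustration.
SAMPLE_RECORD = """<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Introduction to Thermodynamics</dc:title>
  <dc:description>Lecture notes for first year engineering students.</dc:description>
  <dc:subject>physics</dc:subject>
</record>"""


def record_to_text(xml_string):
    """Discard element names and attributes, keeping only the text content."""
    root = ET.fromstring(xml_string)
    chunks = (chunk.strip() for chunk in root.itertext())
    return " ".join(chunk for chunk in chunks if chunk)


if __name__ == "__main__":
    print(record_to_text(SAMPLE_RECORD))
```

Note that after this step the text no longer records which field each phrase came from, which is exactly the information loss that would need to be weighed up when interpreting the results.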

Applying data mining and semantic analysis techniques to a large corpus of educational metadata records would be an interesting exercise in itself but until we attempt such an analysis it’s hard to speculate what it might be possible to achieve with the output data. It would certainly be valuable to compare frequently occurring terms and relationships with an analysis of search web logs to see if the metadata records are actually describing the characteristics that users are searching for.
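A very crude version of that comparison can be sketched in Python by counting terms in the flattened metadata text and checking which search terms never appear in it. The sample metadata strings and search queries below are invented, and simple word matching is an assumption standing in for whatever analysis a real project would choose.

```python
import re
from collections import Counter


def term_counts(texts):
    """Count lower-cased alphabetic terms across a list of strings."""
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"[a-z]+", text.lower()))
    return counts


def coverage(search_queries, metadata_texts):
    """Return (fraction of distinct search terms found in the metadata,
    set of search terms that never appear in the metadata)."""
    query_terms = set(term_counts(search_queries))
    metadata_terms = set(term_counts(metadata_texts))
    if not query_terms:
        return 0.0, set()
    missing = query_terms - metadata_terms
    return 1 - len(missing) / len(query_terms), missing


if __name__ == "__main__":
    # Invented sample data, for illustration only.
    metadata = ["Introduction to thermodynamics lecture notes", "Physics revision quiz"]
    queries = ["thermodynamics quiz", "physics video"]
    fraction, missing = coverage(queries, metadata)
    print(fraction, sorted(missing))
```

Terms that users search for but which never appear in the records would be one simple signal that the metadata is not describing the characteristics users care about.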

There was general agreement amongst participants that this would be an interesting and innovative project. Participants felt it would be advisable to start small with a comparison of two or three metadata collections, possibly those of JorumOpen, Xpert and the OU OpenLearn, before taking this further.

One thing I am slightly unsure about regarding this method is that Open Calais identifies the relationship between words but once we strip out the metadata encoding of our sample records this information will be lost. I don’t know enough about how these semantic analysis tools work to know whether this is a problem or if they are clever enough for this not to be an issue. I suppose the only way we’ll find out if the results are sensible or useful is to give it a try!

I’d also be very interested to hear how this approach compares with work being undertaken on a much larger scale by the Digging into Data Challenge projects and Mimas’ Bringing Meaning into Search initiative.

Other Activities

Phil has already summarised the other possible tasks and activities put forward by the other breakout groups which include:

Establish a common format for sharing search logs.
Identify which fields are used on advanced search forms and how many people use advanced search facilities.
Analyse the relative proportion of users who search and browse for resources and how many people click onwards from the initial resources.
Further develop the search questionnaire used by David Davies. If sufficient responses could be gathered to the same questions this would facilitate meta analysis of the results.
Work with communities around specific repositories and find out what works and doesn’t work across individual platforms and installations.
Create a research question inventory on the CETIS wiki and invite people to put forward ideas.

If anyone has any comments or suggestions on any of the above ideas we’d love to hear from you!

Lorna Campbell
