Phil Barker » resource description http://blogs.cetis.org.uk/philb Cetis Blog Fri, 06 Jun 2014 11:06:54 +0000 en-US hourly 1 http://wordpress.org/?v=4.1.22 LRMI, Open badges and alignment objects http://blogs.cetis.org.uk/philb/2014/04/03/lrmi-open-badges-and-alignment-objects/ http://blogs.cetis.org.uk/philb/2014/04/03/lrmi-open-badges-and-alignment-objects/#comments Thu, 03 Apr 2014 11:59:55 +0000 http://blogs.cetis.org.uk/philb/?p=968 I had the pleasure yesterday of talking on the Mozilla Open Badges community call about how LRMI and Open Badges may intersect. Open Badges are a means of displaying digital recognition of skills and achievements; there’s a technical framework behind the badges that offers the means of providing data in support of the claimed achievement. A particular part of this technical framework is the assertion specification, which includes a pointer from each badge to “the educational standards this badge aligns to, if any”. This parallels the LRMI alignment object very closely: in short, the educationalAlignment property that LRMI added to schema.org allows encoding of statements along the lines of “this resource [teaches|assesses|requires|has level] X” where X is some point in a shared educational framework, e.g. of attainment standards, topics or educational levels, or a shared curriculum. Diagrammatically:

The creative work aligns with a node in an educational framework. The alignment object identifies that node and the nature of the alignment.

The Mozilla badge alignment object is described thus:

Property    | Expected Type | Description
name        | Text          | Name of the alignment.
url         | URL           | URL linking to the official description of the standard.
description | Text          | Short description of the standard.

and an example is provided

{
  "name": "Awesome Robotics Badge",
...
  "alignment": [
    { "name": "CCSS.ELA-Literacy.RST.11-12.3", 
      "url": "http://www.corestandards.org/ELA-Literacy/RST/11-12/3", 
      "description": "Follow precisely a complex multistep procedure when carrying out experiments, taking measurements, or performing technical tasks; analyze the specific results based on explanations in the text."
    }]
...
}

Diagrammatically:

The badge information includes an assertion that the skill or achievement aligns with some point in an educational standard

Not only do the LRMI and Open Badge alignment objects do the same thing, they also seem to have the following semantically equivalent properties for identifying the thing that is aligned to:

  • Open Badge alignment object url == LRMI alignment object targetUrl
  • Open Badge alignment object name == LRMI alignment object targetName
  • Open Badge alignment object description == LRMI alignment object targetDescription
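
The correspondence listed above can be sketched as a small Python function. This is only an illustration: the function name and its two optional parameters are my own, and the framework name passed in the usage line is an assumption, since the badge data does not name its framework.

```python
def badge_alignment_to_lrmi(badge_alignment, alignment_type=None, framework=None):
    """Map one Open Badges alignment entry onto a schema.org AlignmentObject dict.

    alignment_type and framework are optional because the badge entry
    carries neither; the caller has to supply them if they are wanted.
    """
    mapping = {
        "url": "targetUrl",
        "name": "targetName",
        "description": "targetDescription",
    }
    lrmi = {"@type": "AlignmentObject"}
    for badge_key, lrmi_key in mapping.items():
        if badge_key in badge_alignment:
            lrmi[lrmi_key] = badge_alignment[badge_key]
    if alignment_type is not None:
        lrmi["alignmentType"] = alignment_type
    if framework is not None:
        lrmi["educationalFramework"] = framework
    return lrmi


# The alignment entry from the badge example above.
badge_entry = {
    "name": "CCSS.ELA-Literacy.RST.11-12.3",
    "url": "http://www.corestandards.org/ELA-Literacy/RST/11-12/3",
    "description": "Follow precisely a complex multistep procedure when carrying out "
                   "experiments, taking measurements, or performing technical tasks; "
                   "analyze the specific results based on explanations in the text.",
}

# The framework name here is an assumed label, not something the badge supplies.
converted = badge_alignment_to_lrmi(badge_entry, framework="Common Core State Standards")
```

Nothing is invented for alignmentType unless the caller passes one, which reflects the first difference discussed below.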

(I like to think that this is not coincidence, but I don’t know how the similarity arose.)

The differences:

  • Open Badges do not identify the type of alignment. They have no need to, I guess, since the alignment is always one of “asserts ability at” or something similar. LRMI currently recommends no alignment type value that corresponds to this.
  • Open Badges do not name the framework; I guess they assume that identifying the node will lead to knowledge of the framework. LRMI felt that this would not always be enough.
  • The LRMI alignment object is used in conjunction with a property of schema.org/CreativeWork. I don’t think Mozilla Open Badge assertions are creative works in that sense; I think they are some type of schema.org/Intangible.
  • Syntactically, Open Badge assertions are made using JSON; I don’t think they use microdata. Through schema.org, LRMI uses microdata and JSON-LD.

aligning the alignment objects

The discussion that I hope to kick off with the Mozilla Open Badge and LRMI communities is: should/could we make the similarities between the two alignment objects more explicit? This would give developers a two-for-one offer: understand the way Open Badges express alignment and you’ve understood what LRMI does, and vice versa. I don’t suppose either group wants to change a spec that is in productive use, but an informative statement about the similarities could be provided without changing either.

Beyond that I wonder if the Open Badge community have thought about use of schema.org when advertising badges, i.e. if you provide a webpage saying “we offer the following badges for X, Y and Z” would there be benefit in marking this up with schema.org microdata to improve discoverability by search engines? If there is benefit in doing so, then it would be worth thinking about what type of schema.org Thing badges are and how the LRMI alignment object might be attached to it.

The bigger picture is that someone working with the starting point of wanting to learn about something could find resources to help them learn it with the help of LRMI alignments and discover the means of showing that they had learnt it via Open Badge alignments.

Explaining the LRMI Alignment Object http://blogs.cetis.org.uk/philb/2014/03/06/explaining-the-lrmi-alignment-object/ http://blogs.cetis.org.uk/philb/2014/03/06/explaining-the-lrmi-alignment-object/#comments Thu, 06 Mar 2014 15:04:20 +0000 http://blogs.cetis.org.uk/philb/?p=924 The educational alignment property and the associated alignment object that LRMI introduced into schema.org have been described as the “killer feature” for LRMI. However, I know from the number of questions asked about the alignment object and from examples I have seen of it being used wrongly that it is not the easiest construct to understand.

Perhaps the problems come from the nature of the alignment object as a conceptual abstraction, so maybe it will help to show some concrete examples of how it may be used. However, bear in mind that the abstraction was a deliberate design decision made so that the alignment object should be more widely applicable than the examples given here. So I will first discuss a little about why some simpler, more direct approaches were considered and rejected (as were some approaches that would be even more abstract).

basic use case

The general use case that the alignment object was introduced to meet was, in brief,

“help people find resources that can be useful in teaching or learning in some specific scenario.”

That looks deceptively simple. The complications come when defining the “specific scenario” and unpacking the word “useful”.

enter “educational frameworks”

One practical approach to defining various aspects of the specific scenario involves reference to an educational framework of some sort.  By educational framework I mean a structured description of educational concepts such as a shared curriculum, syllabus or set of learning objectives, or a vocabulary for describing some other aspect of education such as educational levels or reading ability.

“Educational framework” is a deliberately broad concept as we wanted LRMI to be applicable globally and across many levels and modes of education. Some specific examples are school-level curricula or attainment standards such as:

Perhaps more relevant to higher education many professional bodies define the competencies required to become a member of their  profession, for example:

As well as having a role in  defining competences and outcomes, measures of academic level or difficulty may be useful independently as reference points, for example:

  • the US K12 grade levels, which are well understood in terms of school level,
  • the more formally defined Scottish Credit and Qualifications Framework (SCQF) level descriptors,
  • various empirical measures of reading difficulty, for example the general idea of “reading age” and the specific measures of reading ability and text level used by Lexile.

On the other hand you may just want to specify the subject being taught, or the educational discipline for which it is being taught. Various classification schemes for academic subjects are available, for example:

All of these frameworks (and many others) may be used to describe aspects of an educational scenario.

ways of being useful

Life isn’t simple enough for us to meet the use case described above by adding a single property to schema.org Creative Works to say that the resource “aligns with” (i.e. is useful in the context defined by) some entry or node in an educational framework. In prescribing a “useful” resource we would want to distinguish between resources that teach and resources that assess a topic; we also want a resource that assumes suitable previous knowledge, or requires some specific reading level, or assumes a certain general academic level. There may be other forms of alignment. There isn’t agreement on a minimum core set of properties required to address that word “useful” in the use case, but there is agreement that a resource can “align” with an “educational framework” in several ways, some of which we can enumerate. Hence the birth of the alignment property and abstract Educational Alignment object.

the abstraction

I think of it like this:

We start with a creative work and an educational framework.
(Note, there is no schema.org class of type EducationalFramework, but we assume that we can refer to some of the following properties pertaining to it: some text that identifies the framework as a whole (let’s call it a name), and the URLs, names and/or descriptions of nodes within the framework.)

The alignment object was created to describe the relationship between the two. The following properties of alignment objects are defined: educationalFramework, which can be used to hold text that identifies the educational framework you are pointing to; targetDescription, targetName and targetUrl, which can hold the values that correspond to the properties we assumed that nodes in the educational framework would have. It also has an alignmentType property that I think of as switching the object to specify the different types of alignment that are possible. So we can put them together to express an alignment between a creative work and some node in an educational framework:

[Diagram: educationalAlignment]

common mistakes

I have seen both of these mistakes in actual markup of webpages.

1. The alignment object on its own is fairly meaningless. Unless it is referenced by the educationalAlignment property of a creative work it’s as useful as half a link.

2. Since the alignment object is a proper schema.org Thing (to be specific, a subtype of an Intangible Thing) it inherits the properties that every schema.org Thing has, e.g. a name, a URL, a description, an image. Some of these make some sense in some cases (see below) but importantly, none of them are used in expressing the alignment: the url of an alignment object is not the same as the url of the creative work or the node to which it aligns.
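
A rough check for these two mistakes might look like the following sketch. It assumes the markup has already been extracted into plain Python dicts shaped like JSON-LD (with "@type" keys); the function names and the message strings are my own and are not part of any schema.org tooling.

```python
TARGET_PROPS = ("targetUrl", "targetName", "targetDescription")
INHERITED_PROPS = ("url", "name", "description")


def as_list(value):
    """Normalise a property value that may be missing, single or repeated."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def find_alignment_problems(items):
    """Return human-readable warnings for a list of top-level item dicts."""
    problems = []
    for item in items:
        # Mistake 1: an alignment object floating on its own at the top level.
        if item.get("@type") == "AlignmentObject":
            problems.append("AlignmentObject is not the value of any educationalAlignment")
            continue
        for alignment in as_list(item.get("educationalAlignment")):
            # Mistake 2: the inherited Thing properties used in place of target*.
            has_target = any(prop in alignment for prop in TARGET_PROPS)
            misused = [prop for prop in INHERITED_PROPS if prop in alignment]
            if not has_target and misused:
                problems.append(
                    "alignment uses %s but no target* property" % ", ".join(misused)
                )
    return problems


good_items = [
    {"@type": "CreativeWork",
     "educationalAlignment": {"@type": "AlignmentObject", "targetName": "X"}},
]
bad_items = [
    {"@type": "AlignmentObject", "targetName": "X"},
    {"@type": "CreativeWork",
     "educationalAlignment": {"@type": "AlignmentObject", "url": "http://example.org/node"}},
]
```

An alignment that has both url and targetUrl is not flagged, since the inherited properties are legitimate as long as they are not standing in for the target.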

real-world examples of alignment assertions

I would like to use two real-world examples of where services provide information that can be seen as an assertion that a resource is useful in connection with (i.e. aligns with) an educational framework:

1. Kritikos, where students can tell other students what is useful for their course.

Screen shot of Kritikos information page about an MIT OCW lecture video. See it in kritikos.

Kritikos is a custom search engine for visual media relevant to teaching and learning engineering.  In part the customisation comes through the use of a Google CSE,  but more relevant to this post is the part that comes through allowing users to classify whether resources found on it are useful for specific courses [aside: this part of the kritikos service is built on a Learning Registry node].

The example shown here is the kritikos information page for a video of a lecture from MIT Open CourseWare. It includes “what others are saying about this resource” with the information from a year 3 MEng Aerospace Engineering student that it is relevant to “Flight Dynamics and Control”. The link from this assertion leads to other resources deemed useful by users for that module. “Flight Dynamics and Control” is a module at the University of Liverpool (code AERO317) that exists within the framework of Liverpool’s Aerospace Engineering programme. It is worth noting that kritikos can also be used to record when a resource is not relevant to a course–this is useful for weeding out false positives that get through the Google custom search engine. [Disclosure/bragging: I had an advisory role in the project that led to kritikos.]

So, there’s an expression of an educational alignment; how does it relate to the alignment object?

The creative work in question is the MIT lecture (to be precise it’s a http://schema.org/VideoObject), we could describe a few of its characteristics with schema.org properties:
name = “Lec 7 | MIT 16.885J Aircraft Systems Engineering, Fall 2005”
url = http://www.youtube.com/watch?v=2QRfkG7jOfY
duration = PT110M22S
I’m not guessing this: the YouTube page has schema.org microdata in it.

The node in the educational framework is a bit less well defined, but we would be justified in calling the module description a node in a framework called “University of Liverpool Modules” and saying the name for this node is “AERO317” and its description is “Flight Dynamics and Control”. It has a page on the web which gives us a url, http://tulip.liv.ac.uk/mods/vital/vital_AERO317_200809.htm. So we can express the alignment:

item type=http://schema.org/VideoObject
    name = "Lec 7 | MIT 16.885J Aircraft Systems Engineering, Fall 2005"
    url = http://www.youtube.com/watch?v=2QRfkG7jOfY
    duration = PT110M22S
    educationalAlignment = item1

item1 type= http://schema.org/AlignmentObject
    alignmentType = "Teaches"
    educationalFramework = "University of Liverpool Modules"
    targetName = "AERO317"
    targetDescription = "Flight Dynamics and Control"
    targetUrl = http://tulip.liv.ac.uk/mods/vital/vital_AERO317_200809.htm
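
The same two items can be written out as JSON-LD, one of the syntaxes schema.org supports. The sketch below builds the structure as a Python dict and serialises it; nesting the alignment object inside the video (rather than linking the two by identifier) is my choice for the illustration, and the variable names are arbitrary.

```python
import json

# The creative work, with the alignment object nested as the value of
# its educationalAlignment property.
video = {
    "@context": "http://schema.org",
    "@type": "VideoObject",
    "name": "Lec 7 | MIT 16.885J Aircraft Systems Engineering, Fall 2005",
    "url": "http://www.youtube.com/watch?v=2QRfkG7jOfY",
    "duration": "PT110M22S",
    "educationalAlignment": {
        "@type": "AlignmentObject",
        "alignmentType": "Teaches",
        "educationalFramework": "University of Liverpool Modules",
        "targetName": "AERO317",
        "targetDescription": "Flight Dynamics and Control",
        "targetUrl": "http://tulip.liv.ac.uk/mods/vital/vital_AERO317_200809.htm",
    },
}

# Serialise for embedding in a page or storing alongside the resource.
serialised = json.dumps(video, indent=2)
print(serialised)
```

Note that the url of the video and the targetUrl of the alignment are different things, which is the point of the second common mistake above.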

What about the other properties of the AlignmentObject, the ones it inherited by virtue of being an official Intangible Thing in the schema.org hierarchy? Well, you could envisage the image property pointing to the screenshot above, and the url property being a url with a fragment identifier that points to the “what others are saying” part of the kritikos page. Sure, you can give it a name and description if you want to. Maybe these aren’t especially useful, but the point is that they are clearly different from the url, name and description of the University of Liverpool course to which the MIT OCW video aligns.

2. OER Commons, aligning to US Common Core State Standards

I’ll cover this in less detail. The main problem with the example above is that the educational framework, while locally useful, is somewhat ad hoc: we had to look at the course structure at Liverpool University in a certain way to see it as an educational framework. Better examples of more widely shared and more formally constructed educational frameworks are those of the US Common Core State Standards Initiative. OER Commons is a repository and search engine for Open Educational Resources that expresses alignment to these frameworks in its descriptions.

Screenshot from a resource description on OERCommons showing educational alignment information on the right.

The screenshot on the left shows such an alignment being displayed (the image links to the actual page in question, which is more legible). You see that in this case the creative work called “Chocolate Chocolate Chocolate” aligns with the Common Core Standard “CCSS.ELA-Literacy.RL.1.9 : Compare and contrast the adventures and experiences of characters in stories.”

Interestingly there is some other information given about the “degree of alignment”, i.e. how good a match that resource is to teaching that State Standard.

justification for the abstraction of the alignment object

In part the motivation for creating an alignment object class in schema.org was the issue mentioned above about not knowing what might be all the possible forms of alignment between a resource and an educational framework used to characterise some aspect of a teaching and learning scenario. However, I hope the examples above go some way to showing that alignments are real (if intangible) things: you can give them URLs and names if you want. Furthermore they do have properties. For example, they are asserted by someone: a student at Liverpool University in the kritikos example and a user of OER Commons in the other. In the OER Commons example there is other information about the degree of alignment. This goes some way to convincing me that the alignment object isn’t just some computer science trick of indirection.

Where to put your EPUB metadata http://blogs.cetis.org.uk/philb/2014/01/15/where-to-put-your-epub-metadata/ http://blogs.cetis.org.uk/philb/2014/01/15/where-to-put-your-epub-metadata/#comments Wed, 15 Jan 2014 10:18:03 +0000 http://blogs.cetis.org.uk/philb/?p=918 Even in the knowledge that current mainstream EPUB readers and applications for managing eBooks will most likely ignore all but the most trivial metadata, we still have use cases that involve more sophisticated metadata. For example we would like to use the LRMI alignment object in schema.org to say that a particular subsection of a book can be useful in the context of a specific unit in a shared curriculum.

So, without evaluating pros and cons, starting from the most basic/most common, what are the options? This summary takes information from Garrish and Gylling, EPUB 3 Best Practices, O’Reilly 2013 (which I take to be authoritative and also as an example of best practice with regard to the metadata in the epub file), as well as the EPUB 3.0 Publications and Content Documents specifications. Any comments would be greatly appreciated.

1. Simple Dublin Core

Within the OEBPS directory of an unpacked EPUB3 is the content.opf file. It pretty much equates to the manifest of an IMS Content Package. The top-level element is <package> and <metadata> is a required first child of <package>.

The default metadata vocabulary is the Dublin Core Metadata Element Set (DCMES, simple DC), with prefix dc:. Three elements are mandatory–title, identifier and language–others are optional. For example, in /OEBPS/content.opf

<?xml version='1.0' encoding='UTF-8'?>
<package xmlns:dc="http://purl.org/dc/elements/1.1/" [...]>
    <metadata>
        <dc:identifier>urn:isbn:9781449325299</dc:identifier>
        <dc:title>EPUB 3 Best Practices</dc:title>
        <dc:language>en</dc:language>
        <dc:rights>Copyright © 2013 Matt Garrish and Markus Gylling</dc:rights>
[...]
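
As a minimal sketch of how one might check the three mandatory elements, the following uses only Python’s standard library. The function name is my own, and the sample document is a cut-down stand-in for a real content.opf, not the actual file from the book.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set (the dc: prefix).
DC = "http://purl.org/dc/elements/1.1/"


def missing_mandatory_dc(opf_xml):
    """Return the mandatory dc: elements that are absent or empty in an OPF document."""
    root = ET.fromstring(opf_xml)
    missing = []
    for name in ("title", "identifier", "language"):
        # Search the whole tree so the check does not depend on where
        # the namespaces happen to be declared.
        present = [
            el for el in root.iter("{%s}%s" % (DC, name))
            if (el.text or "").strip()
        ]
        if not present:
            missing.append("dc:" + name)
    return missing


# Illustrative sample only; a real package file carries much more.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier>urn:isbn:9781449325299</dc:identifier>
    <dc:title>EPUB 3 Best Practices</dc:title>
    <dc:language>en</dc:language>
  </metadata>
</package>"""

print(missing_mandatory_dc(SAMPLE_OPF))
```

This only tests for presence; it makes no attempt to validate the values themselves.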

2. Other metadata schemas

The package element has a prefix attribute that may be used to declare prefixes for metadata schemas other than DCMES. Four vocabularies are reserved, i.e. the prefix may be used without a declaration: dcterms, marc, onix and media (the vocabulary used for EPUB3 media overlays). Example

<dcterms:title>EPUB 3 Best Practices</dcterms:title>

Other vocabularies may be used by providing a prefix and a URL in a way so similar to xmlns that it makes you wonder why they didn’t just use xmlns.

<package prefix="prism: http://prismstandard.org/namespaces/basic/3.0/" [...]>

3. The meta element

If used without the refines attribute (see below) the meta element can provide information about the package as a whole, e.g.

<meta property="dcterms:title">EPUB 3 Best Practices</meta>

I have no idea what would be the benefit of this over <dcterms:title>.

4. Refining metadata elements: id attribute and the meta element

The id attribute can be used to give an identifier to any element in the metadata so that it may be refined. One example of this is mandatory, i.e. one occurrence of the dc:identifier element must be the publication identifier:

<dc:identifier id="pub-identifier">urn:isbn:9781449325299</dc:identifier>

In general the refinements are described using the meta element with the refines attribute and a property attribute that specifies the nature of the refinement. It’s kind of like RDF reification. The default vocabulary for the property attribute includes “file-as” – an alternative string for a name to be used when filing, “identifier-type” – a way to distinguish between different identifiers, “meta-auth” – the authority for a given instance of metadata, “title-type” – which of the six forms of title is being provided.

<dc:creator id="1234">Matt Garrish</dc:creator>
<meta refines="#1234" property="file-as" id="5678">Garrish, Matt</meta>
<meta refines="#1234" property="role">Author</meta>

Terms from other vocabularies may be used for “property” so long as a prefix is declared.

Refinements may have ids and so may be refined.

<meta refines="#5678" property="meta-auth">Phil Barker</meta>

And so on: you can make statements about your metadata statements to your heart’s content (though including the whole of the linked data graph in each epub would be silly).
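
To make the chaining concrete, here is a rough sketch that follows refines references from an element to its refinements, and on to refinements of those refinements. The function names are my own, the sample wraps the snippets above in a bare <metadata> element for the sake of a parseable document, and the guard against circular references is an extra precaution of mine.

```python
import xml.etree.ElementTree as ET


def refinements_by_target(metadata_xml):
    """Index every element carrying a refines attribute by the id it points at."""
    root = ET.fromstring(metadata_xml)
    by_target = {}
    for el in root.iter():
        refines = el.get("refines")
        if refines and refines.startswith("#"):
            by_target.setdefault(refines[1:], []).append(el)
    return by_target


def describe(element_id, by_target, depth=0, seen=None):
    """List the refinements of element_id, indenting refinements of refinements."""
    if seen is None:
        seen = set()
    if element_id in seen:  # guard against circular refines references
        return []
    seen.add(element_id)
    lines = []
    for meta in by_target.get(element_id, []):
        text = (meta.text or "").strip()
        lines.append("  " * depth + "%s = %s" % (meta.get("property"), text))
        if meta.get("id"):
            lines.extend(describe(meta.get("id"), by_target, depth + 1, seen))
    return lines


SAMPLE_METADATA = """<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:creator id="1234">Matt Garrish</dc:creator>
  <meta refines="#1234" property="file-as" id="5678">Garrish, Matt</meta>
  <meta refines="#1234" property="role">Author</meta>
  <meta refines="#5678" property="meta-auth">Phil Barker</meta>
</metadata>"""

index = refinements_by_target(SAMPLE_METADATA)
for line in describe("1234", index):
    print(line)
```

Only fragment references (values starting with #) are followed; anything else in a refines attribute is ignored by this sketch.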

The scheme attribute may be used to identify the controlled vocabulary from which the meta element’s value is drawn. For example, if the identifier is a DOI (which in onix is apparently entry 06 of codelist 5) you can have

<dc:identifier id="pub-id">urn:doi:10.1016/j.iheduc.2008.03.001 </dc:identifier>
<meta refines="#pub-id"
      property="identifier-type"
      scheme="onix:codelist5">06</meta>

Or, using the marc relator value Aut to specify author

<meta refines="#1234" property="role" scheme="marc:relators">Aut</meta>

5. Sub-package level metadata

The id attribute may be used to provide an identifier for a subelement of <package> or any element in the XHTML content documents, down to a span element around a phrase, word or character. So a chapter may have id="chap1" and then we can use meta elements in the metadata to describe it separately from the rest of the epub:

<meta refines="#chap1" property="prism:contentType">bookChapter</meta>

6. Links to metadata records

The link element is an optional, repeatable subelement of <metadata>, “used to associate resources with a publication, such as metadata records”. The metadata record may be within the package or anywhere on the web.
Example:

<link rel="marc21xml-record" href="pub/meta/nor-wood-marc21.xml" />
<link refines="#chap1" rel="ex:schema_org-record"
      media-type="application/ld+json"
      href="http://example.org/nor-wood-lrmi.json" />

Metadata embedded in the XHTML5 content

As far as I can see the EPUB3 specs are silent on metadata in the HTML of the content documents, e.g. as html:meta elements or as microdata or RDFa, but there doesn’t seem to be any reason why one should not put metadata here. I wouldn’t expect any EPUB system to look that deeply into the package but it would be a good approach to helping the metadata travel with the resource if the EPUB is disaggregated and passed into a non-EPUB3 CMS.

Heads up for HEDIIP http://blogs.cetis.org.uk/philb/2013/07/24/heads-up-for-hediip/ http://blogs.cetis.org.uk/philb/2013/07/24/heads-up-for-hediip/#comments Wed, 24 Jul 2013 14:10:30 +0000 http://blogs.cetis.org.uk/philb/?p=865 A while back I summarised the input about semantics and academic coding that Lorna and I had made on behalf of Cetis for a study on possible reforms to JACS, the Joint Academic Coding System. That study has now been published.

JACS is maintained by HESA (the Higher Education Statistics Agency) and UCAS (Universities and Colleges Admissions Service) as a means of classifying UK University courses by subject; it is also used by a number of other organisations for classification of other resources, for example teaching and learning resources. The report (with appendices) considers the varying requirements and uses of subject coding in HE and sets out options for the development of a replacement for JACS.

Of course, this is all only of glancing interest, until you realise that stuff like Unistats and the Key Information Set (KIS) are powered by JACS.
- See more at Followers of the apocalypse

If you’re not sure why this should interest you (and yet for some reason have read this far) David Kernohan has written what I can only describe as an appreciation of the report, Hit the road JACS, from which the quote above is taken.

To move forward from this and the other reports commissioned from the Redesigning the HE data landscape study, the Higher Education Data and Information Improvement Programme (HEDIIP) is being established to enhance the arrangements for the collection, sharing and dissemination of data and information about the HE system. Follow them on twitter.

ePub metadata what gets shown? http://blogs.cetis.org.uk/philb/2013/06/18/epub-metadata/ http://blogs.cetis.org.uk/philb/2013/06/18/epub-metadata/#comments Tue, 18 Jun 2013 08:54:06 +0000 http://blogs.cetis.org.uk/philb/?p=848 One of the issues around eTextBooks is how to describe them, specifically by way of educational metadata in ePub. That’s something that on the face of it shouldn’t be too difficult to address (at least to the extent that we know how to describe any educational resource). One thing that would be useful in demonstrating different choices for educational metadata is an app or tool that will display any metadata found in the ePub package in a sensible way. As a bit of a long shot I tried four eBook readers to see whether they would; they don’t. The details follow, if you’re interested, but do let me know if you know of any tool that might be useful.

The package metadata of an ePub can include a selection of Dublin Core elements and terms. These can be refined, for example you may have two dc:title elements with refinements to specify that one is the main title and the other the subtitle. You can also extend with elements from other XML namespaces, or if you prefer you can just link to a metadata record of your favourite flavour which can be either inside the ePub package or elsewhere on the web. Any of this metadata can relate to the eBook as a whole or some part of it, e.g. a single chapter or image. Without going into details there seems to be enough scope there to experiment with how educational characteristics of the eBook might be described.

But how to see the results? I took an ePub (a copy of O’Reilly’s EPUB 3 Best Practices, since it seemed likely to provide as good a starting point as I was going to find in a real book), made a copy, unzipped it and changed the values of the meta elements so that I could easily identify what elements were being displayed. For example I changed
<dc:title id="pub-title">EPUB 3 Best Practices</dc:title> to
<dc:title id="pub-title">dc:title</dc:title> and so on.
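
For anyone wanting to repeat the experiment, the relabelling can be scripted. The sketch below is not what I actually ran, just one way of doing it with Python’s standard library: it assumes the package file ends in .opf and is UTF-8 encoded, it only relabels dc: elements (the meta elements would need a similar pattern), and it uses a regular expression rather than a full XML parser so that the rest of the file is left untouched. The function names are made up for the illustration.

```python
import re
import zipfile


def label_metadata_values(opf_text):
    """Replace the text of each dc:* element with the element's own name."""
    return re.sub(
        r"(<(dc:[\w-]+)\b[^>]*>)[^<]*(</\2>)",
        lambda m: m.group(1) + m.group(2) + m.group(3),
        opf_text,
    )


def relabel_epub(src, dst):
    """Copy an EPUB from src to dst, relabelling dc: values in its .opf file."""
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        # The mimetype file has to be the first entry and stored uncompressed.
        zout.writestr("mimetype", zin.read("mimetype"),
                      compress_type=zipfile.ZIP_STORED)
        for info in zin.infolist():
            if info.filename == "mimetype":
                continue
            data = zin.read(info.filename)
            if info.filename.endswith(".opf"):
                data = label_metadata_values(data.decode("utf-8")).encode("utf-8")
            zout.writestr(info.filename, data,
                          compress_type=zipfile.ZIP_DEFLATED)


print(label_metadata_values('<dc:title id="pub-title">EPUB 3 Best Practices</dc:title>'))
```

Calling relabel_epub with the path of the original file and a new path writes a modified copy that can be opened in each reader to see which labels turn up on screen.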

Here’s a list of the metadata elements in that file:

  • <dc:title id="pub-title">
  • <dc:creator id="..." >
  • <dc:publisher>
  • <dc:date>
  • <meta property="dcterms:modified">
  • <dc:identifier id="pub-identifier">
  • <dc:language id="pub-language">
  • <dc:contributor> (repeated)
  • <dc:rights>
  • <dc:subject>
  • <dc:description>
  • <meta id="meta-identifier" property="dcterms:identifier">
  • <meta property="dcterms:title" id="meta-title">
  • <meta property="dcterms:language" id="meta-language">
  • <meta property="dcterms:rights">
  • <meta property="dcterms:rightsHolder">
  • <meta property="dcterms:publisher">
  • <meta property="dcterms:subject">
  • <meta property="dcterms:description">
  • <meta id="...." property="dcterms:creator"> (repeated, different ids)
  • <meta name="cover" content="cover-image"/>
  • <meta property="ibooks:specified-fonts">

I then looked at this with various eBook readers:

Readium

I had hopes for Readium since it is pretty much the reference implementation of EPUB3. It displayed

in Readium

  • dc:title
  • dc:creator
  • dc:publisher
  • dc:date
  • meta dcterms:modified
  • dc:identifier

Note that it doesn’t even check for a valid value for dates.

Calibre

Calibre, while it doesn’t claim to support ePub3, is targeted at managing personal book libraries. It displays:

in Calibre

  • dc:title
  • dc:creator
  • dc:subject (for tags)
  • dc:description
  • dc:publisher

It probably uses dc:language and dc:date (for published) as well but recognises that the values dc:language / dc:date aren’t valid.

Ideal Reader for Android

The Ideal Reader for Android is the other ePub3 reader I use. It displays

In Ideal Android Reader

  • dc:title
  • meta dcterms:creator (just one of them)
  • dc:date
  • dc:publisher
  • dc:description
  • dc:subject
  • dc:rights

iTunes

Finally I gave a chunk of diskspace to Apple

in iTunes desktop for Windows 7

  • dc:title
  • dc:creator
  • dc:title (again)
  • dc:subject (in the info tab, as Genre)

Yep, title is there twice: the info tab shows dc:title in the Name and Album fields, so you can gauge the amount of effort that Apple have put into adapting iTunes for books.

What did I learn?

I learnt that none of the ePub reading/management apps or tools that I have show more than the bare minimum of metadata, even if it is there. None of them will be much good for trying out ideas for how educational characteristics can be described since I strongly suspect that none of it will be viewable. That’s not too surprising, especially when you consider that none of the tools I looked at are geared around resource discovery, but I can’t really go uploading dummy ePub files to book seller sites just to see what they look like. Maybe any meaningful exploration/demonstration of educational metadata in ePub is going to need a bespoke application, but if you know of a tool that might be helpful do drop me a line.

ebooks 2013 http://blogs.cetis.org.uk/philb/2013/05/13/ebooks-2013/ http://blogs.cetis.org.uk/philb/2013/05/13/ebooks-2013/#comments Mon, 13 May 2013 11:33:06 +0000 http://blogs.cetis.org.uk/philb/?p=828 Every year for the past dozen or so years the Department of Information Sciences at UCL have organised a meeting on ebooks. I’ve only been to one of them before, two or three years ago, when the big issues were around what publishers’ DRM requirements for ebooks meant for libraries. I came away from that musing on what the web would look like if it had been designed by publishers and librarians (imagine questions like: “when you lend out our web page, how will you know that the person looking at the screen is a member of your library?”…). So I wasn’t sure what to expect when I decided to go to this year’s meeting. It turned out to be far more interesting than I had hoped. I latched on to three themes of particular interest to me: changing paradigms (what is an ebook?), eTextBooks and discovery.

Changing paradigms

With the earliest printed books, or incunabula, such as the Gutenberg Bible, printers sought to mimic the hand written manuscripts with which 15th cent scholars were familiar; in much the same way as publishers now seek to replicate printed books as ebooks.

In the first presentation of the day Lorraine Estelle, chief executive of Jisc Collections, focussed on access to electronic resources. Access not lending; resources not ebooks. She highlighted the problem of applying yesterday’s language and thinking in this context, like having a “horseless carriage” and buying it hay. [This is my chance to make the analogy between incunabula and ebooks again, see right.] The sort of discussions I recalled from the previous meeting I attended reflect this thinking: publishers wanting a digital copy of a book to be equivalent to the physical book, only lendable to one person at a time and requiring replacement after a certain number of loans.

We need to treat digital content as offering new possibilities and requiring new ways of working. This might be uncomfortable for publishers (some more than others) and there was some discussion about how we cannot assume that all students will naturally see the advantages, especially if they have mostly encountered problematic content that presents little that could not be put on paper but is encumbered with DRM to the point that it is questionable as to whether they really own the book. But there is potential as well as resistance. Of course there can be more interesting, more interactive content–Will Russell of the Royal Society of Chemistry described how they have been publishing to mobile devices, with tools such as Chem Goggles that will recognise a chemical structure and display information about the chemical. More radically, there can also be new business models: Lorraine suggested Institutions could become publishers of their own teaching content, and later in the day Caren Milloy, also of Jisc Collections, and Brian Hole of Ubiquity Press pointed to the possibilities of open access scholarly publishing.

Caren’s work with the OAPEN Library is worth looking through for useful information relating to quality assurance in open monographs, such as notifying readers of updates or errata. Caren also talked about the difficulties in advertising that a free online version of a resource is available when much of the dissemination and discovery ecosystem (you know, Amazon, Google…) is geared around selling stuff, difficulties that work with EDItEUR on the ONIX metadata scheme will hopefully address soon.

Brian described how Ubiquity Press can publish open access ebooks by driving down costs and being transparent about what they charge for. They work from XML source, created overseas, from which they can publish in various formats including print on demand, and explore economies of scale by working with university presses, resulting in a charge to the author (or their funders) of about £150 for a chapter, assuming there is nothing too complex in that chapter.

eTextBooks

All through the day there were mentions of eTextBooks, starting again with Lorraine who highlighted the paperless medic and how his quest to work only with digital resources is complicated by the non-articulation of the numerous systems he has to use. When she said that what he wanted was all his content (ebooks, lecture handouts, his own notes etc.) on the same platform, integrated with knowledge about when and where he had to be for lectures and when he had exams, I really started to wonder how much functionality can you put into an eContent platform before it really becomes a single-person content-oriented VLE. And when you add in the ability to share notes with the social and communication capability of most mobile devices, what then do you have?

A couple of presentations addressed eTextBooks directly, from a commercial point of view. Jenni Evans spoke about Vital Source and Andrejs Alferovs about Kortext, both of which are in the business of working with institutions distributing online textbooks to students. Both seem to have a good grasp of what students want, which I think should provide useful requirements to feed into eTextBook standardization efforts such as eTernity. These include:

  • ability to print
  • offline access
  • availability across multiple devices
  • reliable access under load
  • integration with VLE
  • integration with syllabus/curriculum
  • epub3 interactive content
  • long term access
  • ability for student to highlight/annotate text and share this with chosen friends
  • ability to search text and annotations

Discovery

There was also a theme of resource discovery running through the day, and I have already mentioned in passing that this referenced Google and Amazon, but also social media. Nick Canty spoke about a survey of library use of social media. I thought it interesting that there seemed to be some sophisticated use of the immediacy of Twitter to direct people to more permanent content, e.g. to engagement on Facebook or the library website.

Both Richard Wallis of OCLC and Robert Faber of OUP emphasized that users tend to use Google to search and gave figures for how much of the access to library catalogue pages came direct from Google and other external systems, not from their own catalogue search interface. For example the Bibliothèque nationale de France found that 80% of access to their catalogue pages came directly from web search engines not catalogue searches, and Robert gave similar figures for access to Oxford Journals. The immediate consequence of this is that if most people are trying to find content using external systems then you need to make sure that at least some (as much as possible, in fact) of your content is visible to them; this feeds in to arguments about how open access helps solve discoverability problems. But Richard went further: he spoke about how the metadata describing the resources needs to be in a language that Google/Bing/Yahoo understand, and that language is schema.org. He did a very good job of distinguishing between specialist metadata schemas, which are useful for exchanging precise information between libraries or publishers, and the needs of passing general information to Google, where:

it’s no use using a language only you speak.

Richard went on to speak about the Google Knowledge graph and their “things not strings” approach facilitated by linked data. He urged libraries to stop copying text and to start linking, for example not to copy an author name from an authority file but to link to the entry in that file, in Eric Miller’s words to move from cataloguing to “catalinking”.

ebooks?

So was this really about ebooks? Probably not, and the point was made that over the years the name of the event has variously stressed ebooks and econtent and that over that time what is meant by “ebook” has changed. I must admit that for me there is something about the idea of an [e]book that I prefer over a “content aggregation”, but if we use the term ebook, let’s use it acknowledging that the book of the future will be as different from what we have now as what we have now is from the medieval scroll.

Picture Credit
Scanned image of page of the Epistle of St Jerome in the Gutenberg bible taken from Wikipedia. No Copyright.

Learning Resource Metadata is Go for Schema http://blogs.cetis.org.uk/philb/2013/04/24/lrmi-in-schema/ http://blogs.cetis.org.uk/philb/2013/04/24/lrmi-in-schema/#comments Wed, 24 Apr 2013 13:44:39 +0000 http://blogs.cetis.org.uk/philb/?p=819 The Learning Resource Metadata Initiative aimed to help people discover useful learning resources by adding to the schema.org ontology properties to describe educational characteristics of creative works. Well, as of the release of schema draft version 1.0a a couple of weeks ago, the LRMI properties are in the official schema.org ontology.

Schema.org represents two things: 1, an ontology for describing resources on the web, with a hierarchical set of resource types each with defined properties that relate to their characteristics and relationships with other things in the schema hierarchy; and 2, a syntax for embedding these into HTML pages–well, two syntaxes, microdata and RDFa lite. The important factor in schema.org is that it is backed by Google, Yahoo, Bing and Yandex, which should be useful for resource discovery. The inclusion of the LRMI properties means that you can now use schema.org to mark up your descriptions of the following characteristics of a creative work:

audience: the educational audience for whom the resource was created, who might have educational roles such as teacher, learner, parent.

educational alignment: an alignment to an established educational framework, for example a curriculum or frameworks of educational levels or competencies. Expressed through an abstract thing called an Alignment Object which allows a link to and description of the node in the framework to which the resource aligns, and specifies the nature of the alignment, which might be that the resource ‘assesses’, ‘teaches’ or ‘requires’ the knowledge/skills/competency to which the resource aligns or that it has the ‘textComplexity’, ‘readingLevel’, ‘educationalSubject’ or ‘educationLevel’ expressed by that node in the educational framework.

educational use: a text description of the purpose of the resource in education, for example assignment, group work.

interactivity type: the predominant mode of learning supported by the learning resource. Acceptable values are ‘active’, ‘expositive’, or ‘mixed’.

is based on url: a resource that was used in the creation of this resource. Useful for when a learning resource is a derivative of some other resource.

learning resource type: the predominant type or kind characterizing the learning resource. For example, ‘presentation’, ‘handout’.

time required: the approximate or typical time it takes to work with or through this learning resource for the typical intended target audience.

typical age range: the typical range of ages of the content’s intended end user.

Of course, much of the other information one would want to provide about a learning resource (what it is about, who wrote it, who published it, when it was written/published, where it is available, what it costs) was already in schema.org.
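To make the list above a little more concrete, here is a minimal sketch in Python that renders a description of a resource as HTML with schema.org microdata. The property and type names are the schema.org/LRMI terms, but the resource, the curriculum framework and all the URLs are invented for illustration, and the markup is only one plausible way of laying it out.

```python
from html import escape

# Hypothetical resource description. The property names are schema.org / LRMI
# terms; every value (title, framework, URL) is invented for illustration.
resource = {
    "name": "Introduction to Statistics",
    "learningResourceType": "presentation",
    "educationalUse": "group work",
    "interactivityType": "expositive",
    "typicalAgeRange": "16-18",
    "timeRequired": "PT1H",  # ISO 8601 duration: one hour
}
alignment = {
    "alignmentType": "teaches",
    "educationalFramework": "Example Curriculum",
    "targetName": "Descriptive statistics",
    "targetUrl": "http://example.org/curriculum/stats-1",
}


def microdata(resource, alignment):
    """Render the description as HTML carrying schema.org microdata attributes."""
    lines = ['<div itemscope itemtype="http://schema.org/CreativeWork">']
    for prop, value in resource.items():
        lines.append(f'  <span itemprop="{prop}">{escape(value)}</span>')
    # The alignment is a nested item of type AlignmentObject.
    lines.append('  <div itemprop="educationalAlignment" itemscope '
                 'itemtype="http://schema.org/AlignmentObject">')
    for prop, value in alignment.items():
        if prop == "targetUrl":
            lines.append(f'    <a itemprop="{prop}" href="{escape(value)}">{escape(value)}</a>')
        else:
            lines.append(f'    <span itemprop="{prop}">{escape(value)}</span>')
    lines.append("  </div>")
    lines.append("</div>")
    return "\n".join(lines)


print(microdata(resource, alignment))
```

Every value ends up as visible text in the page as well as being machine readable, which is the point of embedding the metadata rather than supplying it separately.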

Unfortunately one really important property suggested by LRMI hasn’t yet made the cut, that is useRightsURL, a link to the licence under which the resource may be used, for example the Creative Commons licence under which it has been released. This was held back because of obvious overlaps with non-educational resources. The managers of schema.org want to make sure that there is a single solution that works across all domains.

Guides and tools

To promote the uptake of these properties, the Association of Educational Publishers has released two new user guides.

The Smart Publisher’s Guide to LRMI Tagging (pdf)

The Content Developer’s Guide to the LRMI and Learning Registry (pdf)

There is also the InBloom Tagger described and demonstrated in this video.

LRMI in the Learning Registry

As the last two resources show, LRMI metadata is used by the Learning Registry and services built on it. For what it is worth, I am not sure that is a great example of its potential. For me the strong point of LRMI/schema.org is that it allows resource descriptions in human readable web pages to be interpreted as machine readable metadata, helping create services to find those pages; crucially the metadata is embedded in the web page in a way that Google trusts because the values of the metadata are displayed to users. Take away the embedding in human readable pages, which is what seems to happen when used with the learning registry, and I am not sure there is much of an advantage for LRMI compared to other metadata schema, though to be fair I’m not sure that there is any comparative disadvantage either, and the effect on uptake will be positive for both sides. Of course the Learning Registry is metadata agnostic, so having LRMI/schema.org metadata in there won’t get in the way of using other metadata schema.

Disclosure (or bragging)

I was lucky enough to be on the LRMI technical working group that helped make this happen. It makes me very happy to see this progress.

On Semantics and the Joint Academic Coding System http://blogs.cetis.org.uk/philb/2013/04/17/on-semantics-and-the-joint-academic-coding-system/ http://blogs.cetis.org.uk/philb/2013/04/17/on-semantics-and-the-joint-academic-coding-system/#comments Wed, 17 Apr 2013 11:22:38 +0000 http://blogs.cetis.org.uk/philb/?p=802 Lorna and I recently contributed a study on possible reforms to JACS, a study which is part of a larger piece of work on Redesigning the HE data landscape. JACS, the Joint Academic Coding System, is maintained by HESA (the Higher Education Statistics Agency) and UCAS (Universities and Colleges Admissions Service) as a means of classifying UK University courses by subject; it is also used by a number of other organisations for classification of other resources, for example teaching and learning resources. The study to which we were contributing our thoughts had already identified a problem with different people using JACS in different ways, which prompted the first part of this post. We were keen to promote technical changes to the way that JACS is managed that would make it easier for other people to use (and incidentally might help solve some of the problems in further developing JACS for use by HESA and UCAS), which are outlined in the second part.

There’s nothing new here, I’m posting these thoughts here just so that they don’t get completely lost.

Subjects and disciplines in JACS

One of the issues identified with the use of JACS is that “although ostensibly believing themselves to be using a single system of classification, stakeholders are actually applying JACS for a variety of different purposes” including Universities who “often try to align JACS codes to their cost centres rather than adopting a strictly subject-based approach”. The cost centres in question are academic schools or departments, which are discipline based. This is problematic to the use of JACS to monitor which subjects are being learnt since the same subject may be taught in several departments. A good example of this is statistics, which is taught across many fields from Mathematics through to social sciences, but there are many other examples: languages taught in mediaeval studies and business translation courses, elements of computing taught in electronic engineering and computer science and so on. One approach would be to ignore the discipline dimension, to say the subject is the same regardless of the different disciplinary slants taken, that is to say statistics taught to mathematicians is the same as statistics taught to physicists is the same as statistics taught to social scientists. This may be true at a very superficial level, but obviously the relevance of theoretical versus practical elements will vary between those disciplines, as will the nature of the data to be analysed (typically a physicist will design an experiment to control each variable independently so as not to deal with multivariate data; this is not often possible in social sciences and so multivariate analysis is far more important). When it comes to teaching and learning resources something aimed at one discipline is likely to contain examples or use an approach not suited to others.

Perhaps more important is that academics identify with a discipline as being more than a collection of subjects being taught. It encapsulates a way of thinking, a framework for deciding on which problems are worth studying and a way of approaching these problems. A discipline is a community, and an academic who has grown up in a community will likely have acquired that community’s view of the subjects important to it. This should be taken into account when designing a coding scheme that is to be used by academics since any perception that the topic they teach is being placed under someone else’s discipline will be resisted as misrepresenting what is actually being taught, indeed as a threat to the integrity of the discipline.

More objectively, the case for different disciplinary slants on a problem space being important is demonstrated by the importance of multidisciplinary approaches to solving many problems. Both the reductionist approach of physics and the holistic approach of humanities and social sciences have their strengths, and it would be a shame if the distinction were lost.

The ideal coding scheme would be able to represent both the subject learnt and the discipline context in which it was learnt.

JACS and 5* data

Tim Berners-Lee suggested a 5 star deployment scheme for open data on the world wide web:
* make your stuff available on the Web (whatever format) under an open licence
** make it available as structured data (e.g., Excel instead of image scan of a table)
*** use non-proprietary formats (e.g., CSV instead of Excel)
**** use URIs to denote things, so that people can point at your stuff
***** link your data to other data to provide context

Currently JACS fails to meet the open licence requirement for 1-star data explicitly, but that seems to be a simple omission of a licensing statement that shows the intention that JACS should be freely available for others to use. It is important that this is fixed, but aside from this, JACS operates at about 3-star level. Assigning URIs to JACS subjects and providing useful information when someone accesses these URIs will allow JACS to be part of the web of linked open data. The benefits of linking data over the web include:

  • The identifiers are globally unique and unambiguous, they can be used in any system without fear of conflicting with other identifiers.
  • The subjects can be referenced globally by humans from websites, emails, and by computer systems in/from data feeds and web applications.
  • The subjects can be used for semantic web approaches to representing ontologies, such as RDF.
  • These allow relationships such as subject hierarchies and relationships with other concepts (e.g. academic discipline) to be represented independently of the coding scheme used. An example of this is SKOS, see below.

In practical terms, implementing this would mean:

  • Devising a URI scheme. This could be as simple as adding the JACS codes to a suitable base URI. For example H713 could become http://id.jacs.ac.uk/H713
  • Setting up a web service to provide suitable information. Anyone connecting to that URL would be redirected to information that matched parameters in their request. A simple web browser would request an HTML page and so be redirected to http://id.jacs.ac.uk/H713.html; web applications would request data in a machine readable form such as xml, rdf or json.
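The redirect step in the second bullet can be sketched in a few lines of Python. This is only an illustration under assumptions: http://id.jacs.ac.uk/ is the base URI suggested above rather than a live service, only the .html extension comes from the text (the other extensions and media types are my guesses at what such a service might offer), and a real service would typically answer with an HTTP redirect rather than just returning a string.

```python
BASE_URI = "http://id.jacs.ac.uk/"  # base URI suggested in the post, not a live service

# Media type -> file extension. Only ".html" appears in the post; the rest are assumptions.
REPRESENTATIONS = {
    "text/html": ".html",
    "application/rdf+xml": ".rdf",
    "application/json": ".json",
    "application/xml": ".xml",
}


def parse_accept(header):
    """Return the media types in an Accept header, most preferred first."""
    prefs = []
    for position, part in enumerate(header.split(",")):
        fields = [f.strip() for f in part.split(";")]
        q = 1.0  # default quality value when none is given
        for f in fields[1:]:
            if f.startswith("q="):
                try:
                    q = float(f[2:])
                except ValueError:
                    q = 0.0
        # Sort by descending q, then by order of appearance in the header.
        prefs.append((-q, position, fields[0].lower()))
    return [media for _, _, media in sorted(prefs)]


def redirect_target(code, accept="text/html"):
    """Pick the URL that a request for BASE_URI + code would be redirected to."""
    for media in parse_accept(accept):
        if media in REPRESENTATIONS:
            return BASE_URI + code + REPRESENTATIONS[media]
    # Fall back to the human readable page for unknown or wildcard types.
    return BASE_URI + code + ".html"


print(redirect_target("H713"))
print(redirect_target("H713", "application/rdf+xml;q=0.9, text/html;q=0.5"))
```

The identifier itself (http://id.jacs.ac.uk/H713) stays the same whoever asks; only the document that describes it varies with the request.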

The main overhead is in setting up, maintaining and managing the data provided by the web service, but Southampton University have already set one up for their own use. (The only problem with the Southampton service–and I believe Oxford have done something similar–is a lack of authority, i.e. it isn’t clear to other users whether the data from this service is authoritative, up to date, used under a correct license, sustainable.)

JACS and SKOS

SKOS (Simple Knowledge Organization System) is a semantic web application of RDF which provides a model for expressing the basic structure and content of concept schemes such as thesauri, classification schemes, subject heading lists, taxonomies, folksonomies, and other similar types of controlled vocabulary. It allows for the description of a concept and the expression of the relationship between pairs of concepts. But first the concept must be identified as such, with a URI. For example:
jacs:H713 rdf:type skos:Concept
In this example jacs: is shorthand for the JACS base URI, http://id.jacs.ac.uk/ as suggested above; rdf: and skos: are shorthand for the base URIs for RDF and SKOS. This triple says “The thing identified by http://id.jacs.ac.uk/H713 is a resource of type (as defined by RDF) Concept (as defined by SKOS)”.

Other assertions can be made about the resource, e.g. the preferred label to be used for it and a scope note for it.
jacs:H713 skos:prefLabel “Production Processes”
jacs:H713 skos:scopeNote “The study of the principles of engineering as they apply to efficient application of production-line technology.”

Assuming the other JACS codes have been suitably identified, relationships between them can be described:
jacs:H713 skos:broader jacs:H710
jacs:H713 skos:related jacs:H830

Once JACS is on the semantic web relationships between the JACS subjects and things in other schemas can also be represented
http://example.org/123 dct:subject jacs:H713
(The resource identified by the URI http://example.org/123 is about the subject identified by jacs:H713).
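For anyone who wants to see what those prefixed names expand to, the following Python sketch writes some of the statements above out as N-Triples using nothing but the standard library. The jacs: base URI is the hypothetical one suggested earlier; the rdf:, skos: and dct: namespaces are the standard ones. Note that the SKOS class is written skos:Concept, with a capital C. In practice you would more likely use an RDF library; this is just to show there is no magic involved.

```python
PREFIXES = {
    "jacs": "http://id.jacs.ac.uk/",  # hypothetical base URI suggested in the post
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "dct": "http://purl.org/dc/terms/",
}


class Literal(str):
    """Marks an object as a plain string literal rather than a URI."""


# (subject, predicate, object) statements taken from the examples in the post.
TRIPLES = [
    ("jacs:H713", "rdf:type", "skos:Concept"),
    ("jacs:H713", "skos:prefLabel", Literal("Production Processes")),
    ("jacs:H713", "skos:broader", "jacs:H710"),
    ("jacs:H713", "skos:related", "jacs:H830"),
    ("http://example.org/123", "dct:subject", "jacs:H713"),
]


def expand(term):
    """Expand a prefixed name such as jacs:H713 to a full URI."""
    prefix, _, local = term.partition(":")
    if prefix in PREFIXES:
        return PREFIXES[prefix] + local
    return term  # not a known prefix, so assume it is already a full URI


def to_ntriples(triples):
    """Serialise the statements one per line in N-Triples syntax."""
    lines = []
    for s, p, o in triples:
        if isinstance(o, Literal):
            escaped = o.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        else:
            obj = f"<{expand(o)}>"
        lines.append(f"<{expand(s)}> <{expand(p)}> {obj} .")
    return "\n".join(lines)


print(to_ntriples(TRIPLES))
```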

Book now available. Into the Wild – Technology for Open Educational Resources http://blogs.cetis.org.uk/philb/2013/03/21/into-the-wild/ http://blogs.cetis.org.uk/philb/2013/03/21/into-the-wild/#comments Thu, 21 Mar 2013 11:04:30 +0000 http://blogs.cetis.org.uk/philb/?p=795
Into the Wild (Book cover)

With great pleasure and more relief I can now announce the availability of Into the wild – technology for open educational resources, a book of our reflections on the technology involved in three years of the UK OER Programmes.

From the blurb:

Between 2009 and 2012 the Higher Education Funding Council funded a series of programmes to encourage higher education institutions in the UK to release existing educational content as Open Educational Resources. The HEFCE-funded UK OER Programme was run and managed by the JISC and the Higher Education Academy. The JISC CETIS “OER Technology Support Project” provided support for technical innovation across this programme. This book synthesises and reflects on the approaches taken and lessons learnt across the Programme and by the Support Project.

This book is not intended as a beginners guide or a technical manual, instead it is an expert synthesis of the key technical issues arising from a national publicly-funded programme. It is intended for people working with technology to support the creation, management, dissemination and tracking of open educational resources, and particularly those who design digital infrastructure and services at institutional and national level.

You may remember Lorna writing back in August that Amber Thomas, Martin Hawksey, Lorna and I had written 90% of this book together in a Book Sprint. Well, the last 10% and the publication turned into a bit of a marathon-relay, something about which I might write some time, but now the book is available in a variety of formats:

  • If you want glossy-covered paperback, then you can order it print-on-demand from Lulu (£3.36); if you’re not so fussed about the glossy cover and binding then there is a print-quality pdf you can print yourself.
  • If you have an ePub reader, there is a free download of an epub2 file.
  • If you have a Kindle, you can download the .mobi file and transfer it, or if you prefer the convenience of Amazon’s distribution over whisper-net you can buy it from them (77p, they don’t seem to distribute for free unless you agree to give them exclusive rights for all electronic formats).
  • finally, if you prefer your ebook reading as PDFs, there is one of those too.

All varieties are free or at minimum cost for the distribution channel used; the content is cc-by licensed and editable versions are available if you wish to remix and fix what we’ve done.

Available via the Cetis publications site.

At the end of the JLeRN experiment http://blogs.cetis.org.uk/philb/2012/10/26/at-the-end-of-the-jlern-experiment/ http://blogs.cetis.org.uk/philb/2012/10/26/at-the-end-of-the-jlern-experiment/#comments Fri, 26 Oct 2012 15:53:18 +0000 http://blogs.cetis.org.uk/philb/?p=718 The JLeRN experiment was a toe dipped in the learning registry, a trial of a different approach to sharing information about learning resources and how they are used, one that focusses on getting the information out there and not on worrying over the schemas and formats in which the information is conveyed. That experiment (JLeRN, not the Learning Registry as a whole) is drawing to a close, so we had a meeting earlier this week to review what had been done, what had been learnt and what was left to do and learn.

Sarah Currier had arranged for projects that had worked with JLeRN to blog something about what they had done before the meeting; here’s the email with a summary of them. If you haven’t come across JLeRN before you might want to have a look through them before reading on. What I want to describe here is my own understanding of where the Learning Registry is and to report some of the issues about it raised at the meeting.

The Learning Registry: Nodes or a network?

The learning registry as a network from a presentation by Dan Rehak and others. © Copyright 2011 US Advanced Distributed Learning Initiative: CC-BY-3.0.

From the outset the Learning Registry was conceived as a network, the software created would be nodes that connected together to share data about resources. Some of the details have been put on the back burner since those early descriptions, for example the ideas of communities and gateway nodes haven’t been much developed.

The community map on the Learning Registry website shows three nodes (the red pins), including the JLeRN node; Steve Midgely told us via email “There are a few development nodes out there that we know of: Agilix, Illinois Dept of Commerce and California Dept of Ed. To my knowledge there are no production nodes beyond the ones we currently run. Several companies have expressed interest in taking over our production nodes including Dell, Cisco and Amazon.” To that tally I can add the EngRich node at Liverpool. Steve adds that the only network he knows of is the LR public network. Now, I’m not sure about the other nodes, but I do know that the JLeRN and EngRich nodes haven’t interacted with the public network in any meaningful way (yet).

So I think we have to say that, to date, there isn’t really much to prove the concept of the Learning Registry as a network. There are, however some developments in the works that I think will change that, for example the Learning Registry Index, see below.

Services
The other aspect of the development of the Learning Registry against the vision shown in the diagram above is that of services being built to interact with the data in the nodes (these are shown as squares in the diagram above). This is crucial since the Learning Registry is no more than plumbing to shift data around; it does nothing with that data that would interest a teacher or learner. It is left to others to develop services that meet user needs. Pat Lockley summed this up quite nicely in his presentation showing how the learning registry was targeted at developers and promoted relationships between developers, service managers and users more than was the case with traditional repository software.

“I think the major point of my slides was to suggest the learning registry is a “developer’s repository” – not that you need a developer to use it, more that you develop services around a node. Also, I feel there is a greater role for the developer in the ecosystems around a node than around a repository – the services on offer, and the scope of services you create seem richer – partially as any data can be stored.”

Well, there are some services for getting data in, there is the OAI-PMH to Learning Registry Publish Utility, and there is Pat’s RSS importer, Ramanathan, and his Google analytics data importer, Pliny. Also at least two projects–Scott Wilson’s SPAWS and Liverpool University’s EngRich–had involved the submission of data to Learning Registry nodes as part of the services they created.

But putting data in is meeting a service manager’s needs, it’s no good in itself since it doesn’t meet any user needs. There are a few user oriented services built off data in the Learning Registry. Pat showed us a couple of Chrome plugins, demos here and here. These are great as proofs of concept, and really important as such, they help show non-technical people what the learning registry is for. But there then follows some expectation management while you explain the limitations of the demonstrators. Other projects had embedded means of getting data out of the Learning Registry nodes into their project outputs, for example EngRich have an iLike widget for the Liverpool student portal that shows what resources students on specific courses have recommended based on data in their Learning Registry node.

Steve Midgely provided us with some very promising information, “the Gates foundation is funding several groups to build index and search services on top of Learning Registry (called Learning Registry Index) and that will require running nodes of some kind.”

Does it work?
One message that I picked up during the meeting and elsewhere is that the Learning Registry, as software, works. The people who set up nodes seem to have done so quickly, the people who used the APIs didn’t report problems in doing so. That’s a good place to be starting.

At a deeper level I guess we need to wait until there are more services built off the data in the Learning Registry to find out whether the Learning Registry works as a concept. Some known problems have been deliberately pushed out of scope in the development of the Learning Registry; one key one is not worrying about which formats and schemas are used for the data that goes in. This is good if you are submitting data, but unless some level of agreement is reached it does place the onus for making sense of the data on the people who are creating services that use the data. So far, the extent to which this (reaching agreement or making sense of arbitrary data) is possible in the context of the Learning Registry is untested.

Other questions remain over how the learning registry will function as a network, for example how duplicate and complementary records about the same resource will be dealt with when many people might be providing information about the same resource.

Why use it?
Owen Stephens and David Kay were at the meeting asking some very pertinent questions. Neither are particularly caught up in the education technology world, with more of a background in information systems for libraries, where of course there are different approaches to solving similar problems. So, why use the Learning Registry rather than raw CouchDB, or some other schemaless, NoSQL, document store (e.g. MongoDB, which is popular for research data management), or free text indexing and search software such as Lucene/Solr, or RDF triple stores, or just a traditional relational database with SQL? To some extent the aim at the moment is to try and answer some of those questions: we won’t know if we don’t try it. But it’s valid to ask how far we have got in answering them, and here is my appraisal.

RDF?
Schemaless sharing of data still appeals to me because I don’t think we know what schema we want to use to share some of the interesting information about the use of resources for teaching and learning. I think the RDF approach will influence the data that is submitted, for example there is interest in using the Learning Registry to store LRMI style metadata. LRMI is adding properties to schema.org so that educational characteristics of resources can be described, and schema.org is only a step or two away from semantic web approaches such as RDF. But some influences of RDF we don’t want. For example there is a tendency at times for RDF approaches to fixate on ontologies. That would stall us. So, for example in LRMI it is possible to say that a resource “aligns” with some point in an educational framework: i.e. it is useful for teaching some topic in a standard curriculum, or assessing some skill required by a competency framework. That’s really useful, but the vocabulary for the nature of the alignment has had to be left open (“teaches” and “assess” are two suggested terms, others are that the resource has a certain “text complexity” or requires a “reading level” or other “educational level”). The understanding of what education is about varies so much over the world and between settings that agreement on a closed ontology seems unattainable. Still, you could use RDF if you didn’t specify an ontology, and if you could make sense of the RDF without one.

Another weakness of RDF in this context, as I understand it, is how poorly it deals with subjective opinions. As soon as a teacher or learner sees an assertion that resource X is good for teaching topic Y (to continue the example used above) they should be asking “says who?”. Engineering students at Liverpool are more interested in what other Engineering students find useful, especially those at Liverpool, than they are in the opinions of physics students. Yes, you can have named graphs in RDF and provide information about who asserted which triples, but it goes beyond what is usual, whereas it is built in from the start in the Learning Registry concept of paradata.
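
The “says who” idea can be sketched in a few lines of Python. This is not the Learning Registry paradata format, just an illustration under my own assumptions: the Assertion class, its field names and the sample data are all hypothetical. What it shows is that once every assertion carries its asserter, filtering by who said it is trivial.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    """A claim about a resource that always records who made it."""
    resource: str
    relation: str
    topic: str
    asserted_by: str  # role of the person making the claim
    institution: str


def says_who(assertions, resource, topic):
    """Return (role, institution) for everyone asserting resource/topic."""
    return [
        (a.asserted_by, a.institution)
        for a in assertions
        if a.resource == resource and a.topic == topic
    ]


def from_peers(assertions, role, institution):
    """Keep only the assertions made by people like the reader."""
    return [
        a for a in assertions
        if a.asserted_by == role and a.institution == institution
    ]


# Hypothetical sample data.
assertions = [
    Assertion("resource X", "good for teaching", "topic Y",
              "engineering student", "Liverpool"),
    Assertion("resource X", "good for teaching", "topic Y",
              "physics student", "Liverpool"),
    Assertion("resource Z", "good for teaching", "topic Y",
              "engineering student", "Elsewhere"),
]

peers = from_peers(assertions, "engineering student", "Liverpool")
print([a.resource for a in peers])
```

With named graphs you would need to add and query this provenance separately; here it is simply part of every record.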

All of that is somewhat conjectural though, because as yet there is little in the Learning Registry that is not metadata that could be expressed in some standard schema such as LOM XML or DC RDF.

Other schemaless data stores
Why not use just CouchDB, without the Learning Registry API, or MongoDB, or Lucene? All of these would make sense for single-instance data stores, which is pretty much what we have now with single more-or-less isolated nodes rather than a network. And, yes, I am sure that some way of sharing data between them could be worked up if that is what you wanted. So again any advantages of the Learning Registry are still putative at this stage.

One advantage of the Learning Registry is that, as I mentioned above, it does seem to work: it does seem to come out of the package as a functional way of storing and sharing data that is tailored to education. So as an introduction to NoSQL databases it’s not a bad place for the education community to start.

In summary
In a post about the end of the JLeRN project David Kay has quoted Simon Schama on his not being sure whether the French Revolution was over. I’ll quote what Chairman Mao supposedly said when asked what he thought of the French Revolution: “it’s too early to tell”. The things to look out for are a functioning network of nodes and user-facing services being delivered from data in those nodes. Then we can ask whether that data could be shared in any other way. For the time being I think that the main achievement of JLeRN and the UK’s involvement in the Learning Registry is that it has started people thinking about alternatives to relational databases and they have taken first steps into working with these. Too often, I think, data has been squeezed into a relational database where the benefit of doing so is simply that it is what the developer happens to be familiar with. If all you have is a hammer then you can have real problems dealing with screws.

[updated to correct an attribution error as to who was comparing JLeRN to the French Revolution]

]]>
http://blogs.cetis.org.uk/philb/2012/10/26/at-the-end-of-the-jlern-experiment/feed/ 5