Phil Barker » metadata http://blogs.cetis.org.uk/philb Cetis Blog Fri, 06 Jun 2014 11:06:54 +0000

LRMI, Open badges and alignment objects
http://blogs.cetis.org.uk/philb/2014/04/03/lrmi-open-badges-and-alignment-objects/
Thu, 03 Apr 2014 11:59:55 +0000

I had the pleasure yesterday of talking on the Mozilla Open Badges community call about how LRMI and Open Badges may intersect. Open Badges are a means of displaying digital recognition of skills and achievements; there’s a technical framework behind the badges that offers the means of providing data in support of the claimed achievement. A particular part of this technical framework is the assertion specification, which includes a pointer from each badge to “the educational standards this badge aligns to, if any”. This parallels the LRMI alignment object very closely: in short, the educationalAlignment property that LRMI added to schema.org allows encoding of statements along the lines of “this resource [teaches|assesses|requires|has level] X” where X is some point in a shared educational framework, e.g. of attainment standards, topics or educational levels, or a shared curriculum. Diagrammatically:

The creative work aligns with a node in an educational framework. The alignment object identifies that node and the nature of the alignment.

The Mozilla badge alignment object is described thus:

Property      Expected Type   Description
name          Text            Name of the alignment.
url           URL             URL linking to the official description of the standard.
description   Text            Short description of the standard.

and an example is provided

{
  "name": "Awesome Robotics Badge",
...
  "alignment": [
    { "name": "CCSS.ELA-Literacy.RST.11-12.3", 
      "url": "http://www.corestandards.org/ELA-Literacy/RST/11-12/3", 
      "description": "Follow precisely a complex multistep procedure when carrying out experiments, taking measurements, or performing technical tasks; analyze the specific results based on explanations in the text."
    }]
...
}

Diagrammatically:

The badge information includes an assertion that the skill or achievement aligns with some point in an educational standard

Not only do the LRMI and Open Badge alignment objects both do the same thing, they seem to have the following semantically equivalent properties for identifying the thing that is aligned to:

  • OpenBadge alignment object url == LRMI alignment object targetUrl
  • OpenBadge alignment object name == LRMI alignment object targetName
  • OpenBadge alignment object description == LRMI alignment object targetDescription

(I like to think that this is not coincidence, but I don’t know how the similarity arose.)

The differences:

  • Open Badges do not identify the type of alignment. It has no need, I guess, since the alignment is always one of “asserts ability at” or something similar. LRMI currently recommends no relevant value for its alignmentType property.
  • Open Badges do not name the framework; I guess they assume that identifying the node will lead to knowledge of the framework. LRMI felt that this would not always be enough.
  • The LRMI alignment object can be used in conjunction with a property of schema.org/CreativeWork. I don’t think Mozilla open badge assertions are creative works in that sense; I think they are some type of schema.org/Intangible.
  • Syntactically, OpenBadge assertions are made using JSON, I don’t think they use microdata. Through schema.org, LRMI uses microdata and JSON-LD.
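
To make the correspondence above concrete, here is a minimal sketch in Python of how one Open Badges alignment entry might be mapped onto LRMI alignment object properties. The function name and the framework label are my own assumptions for illustration; neither spec defines such a conversion. Note that the alignment type and the framework name have to be supplied from outside, since the badge data carries neither.

```python
def badge_alignment_to_lrmi(badge_alignment, alignment_type=None, framework=None):
    """Map one Open Badges alignment entry onto LRMI AlignmentObject properties.

    Hypothetical helper: neither Open Badges nor LRMI defines this conversion.
    """
    lrmi = {"@type": "AlignmentObject"}
    # The three semantically equivalent properties.
    mapping = {
        "url": "targetUrl",
        "name": "targetName",
        "description": "targetDescription",
    }
    for badge_key, lrmi_key in mapping.items():
        if badge_key in badge_alignment:
            lrmi[lrmi_key] = badge_alignment[badge_key]
    # Open Badges carries neither of these, so the caller must provide them.
    if alignment_type is not None:
        lrmi["alignmentType"] = alignment_type
    if framework is not None:
        lrmi["educationalFramework"] = framework
    return lrmi


# The alignment entry from the Mozilla example.
badge_alignment = {
    "name": "CCSS.ELA-Literacy.RST.11-12.3",
    "url": "http://www.corestandards.org/ELA-Literacy/RST/11-12/3",
    "description": "Follow precisely a complex multistep procedure when carrying out "
                   "experiments, taking measurements, or performing technical tasks; "
                   "analyze the specific results based on explanations in the text.",
}

lrmi_alignment = badge_alignment_to_lrmi(
    badge_alignment,
    framework="Common Core State Standards",  # assumed label, not present in the badge
)
```

The alignmentType is deliberately left out in the usage above, because there is no obviously suitable value to give it.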

aligning the alignment objects

The discussion that I hope to kick off with the Mozilla Open Badge and LRMI communities is: should/could we make the similarities between the two alignment objects more explicit? This would give developers a two-for-one offer: understand the way Open Badges expresses alignment and you’ve understood what LRMI does, and vice versa. I don’t suppose either group wants to change a spec that is in productive use, but an informative statement about the similarities could be provided without changing either.

Beyond that I wonder if the Open Badge community have thought about use of schema.org when advertising badges, i.e. if you provide a webpage saying “we offer the following badges for X, Y and Z” would there be benefit in marking this up with schema.org microdata to improve discoverability by search engines? If there is benefit in doing so, then it would be worth thinking about what type of schema.org Thing badges are and how the LRMI alignment object might be attached to it.

The bigger picture is that someone working with the starting point of wanting to learn about something could find resources to help them learn it with the help of LRMI alignments and discover the means of showing that they had learnt it via Open Badge alignments.

Explaining the LRMI Alignment Object
http://blogs.cetis.org.uk/philb/2014/03/06/explaining-the-lrmi-alignment-object/
Thu, 06 Mar 2014 15:04:20 +0000

The educational alignment property and the associated alignment object that LRMI introduced into schema.org have been described as the “killer feature” for LRMI. However, I know from the number of questions asked about the alignment object and from examples I have seen of it being used wrongly that it is not the easiest construct to understand.

Perhaps the problems come from the nature of the alignment object as a conceptual abstraction, so maybe it will help to show some concrete examples of how it may be used. However, bear in mind that the abstraction was a deliberate design decision made so that the alignment object should be more widely applicable than the examples given here. So I will first discuss a little about why some simpler more direct approaches were considered and rejected (as were some approaches that would be even more abstract).

basic use case

The general use case that the alignment object was introduced to meet was, in brief,

“help people find resources that can be useful in teaching or learning in some specific scenario.”

That looks deceptively simple. The complications come when defining the “specific scenario” and unpacking the word “useful”.

enter “educational frameworks”

One practical approach to defining various aspects of the specific scenario involves reference to an educational framework of some sort.  By educational framework I mean a structured description of educational concepts such as a shared curriculum, syllabus or set of learning objectives, or a vocabulary for describing some other aspect of education such as educational levels or reading ability.

“Educational framework” is a deliberately broad concept as we wanted LRMI to be applicable globally and across many levels and modes of education. Some specific examples are school-level curricula or attainment standards such as:

Perhaps more relevant to higher education, many professional bodies define the competencies required to become a member of their profession, for example:

As well as having a role in  defining competences and outcomes, measures of academic level or difficulty may be useful independently as reference points, for example:

  • the US K12 grade levels, which are well understood in terms of school level,
  • the more formally defined Scottish Credit and Qualifications Framework (SCQF) level descriptors,
  • various empirical measures of reading difficulty, for example the general idea of “reading age” and the specific measures of reading ability and text level used by Lexile.

On the other hand you may just want to specify the subject being taught, or the educational discipline for which it is being taught. Various classification schemes for academic subjects are available, for example:

All of these frameworks (and many others) may be used to describe aspects of an educational scenario.

ways of being useful

Life isn’t simple enough for us to meet the use case described above by adding a single property to schema.org Creative Works to say that the resource “aligns with” (i.e. is useful in the context defined by) some entry or node in an educational framework. In prescribing a “useful” resource we would want to distinguish between resources that teach and assess a topic; we also want a resource that assumes suitable previous knowledge, or requires some specific reading level, or assumes a certain general academic level. There may be other forms of alignment. There isn’t agreement on a minimum core set of properties required to address that word “useful” in the use case, but there is agreement that a resource can “align” with an “educational framework” in several ways, some of which we can enumerate. Hence the birth of the alignment property and abstract Educational Alignment object.

the abstraction

I think of it like this:

We start with a creative work: [image: simple creative work]

and an educational framework: [image: educational framework]
(Note, there is no schema.org class of type EducationalFramework, but we assume that we can refer to some of the following properties pertaining to it: some text that identifies the framework as whole (let’s call it a name), and the URLs, names and/or descriptions of nodes within the framework.)

The alignment object [image: alignment object] was created to describe the relationship between the two. The following properties of alignment objects are defined: educationalFramework, which can be used to hold text that identifies the educational framework you are pointing to; targetDescription, targetName and targetUrl, which can hold the values that correspond to the properties we assumed that nodes in the educational framework would have. It also has an alignmentType property that I think of as switching the object to specify the different types of alignment that are possible. So we can put them together to express an alignment between a creative work and some node in an educational framework:

[image: educationalAlignment]

common mistakes

I have seen both of these mistakes in actual markup of webpages.

1. The alignment object on its own is fairly meaningless. Unless it is referenced by the educationalAlignment property of a creative work it’s as useful as half a link.

2. Since the alignment object is a proper schema.org Thing (to be specific a subtype of an Intangible Thing) it inherits the properties that every schema.org Thing has, e.g. a name, a URL, a description and an image. Some of these make some sense in some cases (see below) but importantly, none of them are used in expressing the alignment: the url of an alignment object is not the same as the url of the creative work or the node to which it aligns.

real-world examples of alignment assertions

I would like to use two real-world examples of where services provide information that can be seen as an assertion that a resource is useful in connection with (i.e. aligns with) an educational framework:

1. Kritikos, where students can tell other students what is useful for their course.

Screen shot of Kritikos information page about an MIT OCW lecture video. See it in kritikos.

Kritikos is a custom search engine for visual media relevant to teaching and learning engineering.  In part the customisation comes through the use of a Google CSE,  but more relevant to this post is the part that comes through allowing users to classify whether resources found on it are useful for specific courses [aside: this part of the kritikos service is built on a Learning Registry node].

The example shown here is the kritikos information page for a video of a lecture from MIT Open CourseWare. It includes “what others are saying about this resource” with the information from a year 3 MEng Aerospace Engineering student that it is relevant to “Flight Dynamics and Control”. The link from this assertion leads to other resources deemed useful by users for that module. “Flight Dynamics and Control” is a module at the University of Liverpool (code AERO317) that exists within the framework of Liverpool’s Aerospace Engineering programme. It is worth noting that kritikos can also be used to record when a resource is not relevant to a course; this is useful for weeding out false positives that get through the Google custom search engine. [Disclosure/bragging: I had an advisory role in the project that led to kritikos.]

So, there’s an expression of an educational alignment; how does it relate to the alignment object?

The creative work in question is the MIT lecture (to be precise it’s a http://schema.org/VideoObject), we could describe a few of its characteristics with schema.org properties:
name = “Lec 7 | MIT 16.885J Aircraft Systems Engineering, Fall 2005”
url=http://www.youtube.com/watch?v=2QRfkG7jOfY
duration = PT110M22S
I’m not guessing this, the YouTube page has Schema.org microdata in it.

The node in the educational framework is a bit less well defined, but we would be justified in calling the module description a node in a framework called “University of Liverpool Modules” and saying the name for this node is “AERO317”, its description is “Flight Dynamics and Control”. It has a page on the web which gives us a url, http://tulip.liv.ac.uk/mods/vital/vital_AERO317_200809.htm. So we can express the alignment:

item type=http://schema.org/VideoObject
    name = "Lec 7 | MIT 16.885J Aircraft Systems Engineering, Fall 2005"
    url = http://www.youtube.com/watch?v=2QRfkG7jOfY
    duration = PT110M22S
    educationalAlignment = item1

item1 type= http://schema.org/AlignmentObject
    alignmentType = "Teaches"
    educationalFramework = "University of Liverpool Modules"
    targetName = "AERO317"
    targetDescription = "Flight Dynamics and Control"
    targetUrl = http://tulip.liv.ac.uk/mods/vital/vital_AERO317_200809.htm
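
The same alignment could be serialised as JSON-LD. The following is only a sketch, built in Python so that the nesting is easy to see; the property names are those of schema.org, but the choice of JSON-LD and the exact shape of the document are my assumptions, not something kritikos or YouTube actually publishes.

```python
import json

# The creative work, with the alignment object nested as the value of
# its educationalAlignment property.
video = {
    "@context": "http://schema.org",
    "@type": "VideoObject",
    "name": "Lec 7 | MIT 16.885J Aircraft Systems Engineering, Fall 2005",
    "url": "http://www.youtube.com/watch?v=2QRfkG7jOfY",
    "duration": "PT110M22S",
    "educationalAlignment": {
        "@type": "AlignmentObject",
        "alignmentType": "Teaches",
        "educationalFramework": "University of Liverpool Modules",
        "targetName": "AERO317",
        "targetDescription": "Flight Dynamics and Control",
        "targetUrl": "http://tulip.liv.ac.uk/mods/vital/vital_AERO317_200809.htm",
    },
}

json_ld = json.dumps(video, indent=2)
print(json_ld)
```

Nesting the alignment object inside the creative work keeps it from floating free, and the target* properties stay clearly separate from any url or name the alignment object itself might be given.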

What about the other properties of the AlignmentObject, the ones it inherited by virtue of being an official Intangible Thing in the schema.org hierarchy? Well you could envisage the image property pointing to the screenshot above, and the url property being a url with a fragment identifier that points to the “what others are saying” part of the kritikos page. Sure, you can give it a name and description if you want to. Maybe these aren’t especially useful, but the point is that they are clearly different from the url, name and description of the University of Liverpool course to which the MITOCW video aligns.

2. OER Commons, aligning to US Common Core State Standards

I’ll cover this in less detail. The main problem with the example above is that the educational framework, while locally useful, is somewhat ad hoc: we had to kind of look at the course structure at Liverpool University in a certain way to see it as an educational framework. Better examples of more widely shared and more formally constructed educational frameworks are those of the US Common Core State Standards Initiative. OER Commons is a repository and search engine for Open Educational Resources that expresses alignment to these frameworks in its descriptions.

Screenshot from a resource description on OERCommons showing educational alignment information on the right.

The screenshot on the left shows such an alignment being displayed (the image links to the actual page in question, which is more legible). You see that in this case the creative work called “Chocolate Chocolate Chocolate” aligns with the Common Core Standard “CCSS.ELA-Literacy.RL.1.9 : Compare and contrast the adventures and experiences of characters in stories.”

Interestingly there is some other information given about the “degree of alignment”, i.e. how good a match that resource is to teaching that State Standard.

justification for the abstraction of the alignment object

In part the motivation for creating an alignment object class in schema.org was the issue mentioned above about not knowing what might be all the possible forms of alignment between a resource and an educational framework used to characterise some aspect of a teaching and learning scenario. However I hope the examples above go some way to showing that alignments are real (if intangible) things: you can give them URLs, and names if you want. Furthermore they do have properties. For example, they are asserted by someone: a student at Liverpool University in the kritikos example and a user of OER Commons in the other. In the OER Commons example there is other information about the degree of alignment. This goes some way to convincing me that the alignment object isn’t just some computer science trick of indirection.

Where to put your EPUB metadata
http://blogs.cetis.org.uk/philb/2014/01/15/where-to-put-your-epub-metadata/
Wed, 15 Jan 2014 10:18:03 +0000

Even in the knowledge that current mainstream EPUB readers and applications for managing eBooks will most likely ignore all but the most trivial metadata, we still have use cases that involve more sophisticated metadata. For example we would like to use the LRMI alignment object in schema.org to say that a particular subsection of a book can be useful in the context of a specific unit in a shared curriculum.

So, without evaluating pros and cons, starting from the most basic/most common, what are the options? This summary takes information from Garrish and Gylling, EPUB 3 Best Practices, O’Reilly 2013 (which I take to be authoritative and also as an example of best practice with regard to the metadata in the epub file), as well as the EPUB 3.0 Publications and Content Documents specifications. Any comments would be greatly appreciated.

1. Simple Dublin Core

Within the OEBPS directory of an unpacked EPUB3 is the content.opf file. It pretty much equates to the manifest of an IMS Content Package. The top-level element is <package> and <metadata> is a required first child of <package>.

The default metadata vocabulary is the Dublin Core Metadata Element Set (DCMES, simple DC), with prefix dc:. Three elements are mandatory (title, identifier and language); others are optional. For example, in /OEBPS/content.opf

<?xml version='1.0' encoding='UTF-8'?>
<package xmlns:dc="http://purl.org/dc/elements/1.1/" [...]>
    <metadata>
        <dc:identifier>urn:isbn:9781449325299</dc:identifier>
        <dc:title>EPUB 3 Best Practices</dc:title>
        <dc:language>en</dc:language>
        <dc:rights>Copyright © 2013 Matt Garrish and Markus Gylling</dc:rights>
[...]
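
As a rough illustration of how little is needed to read this simple Dublin Core, here is a sketch using Python's standard library. The sample package is cut down from the example above; the attributes elided there are filled in with the OPF namespace from the EPUB 3 specification, and the function name is just one I made up.

```python
import xml.etree.ElementTree as ET

# A cut-down content.opf, based on the example in this section.
OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0"
         unique-identifier="pub-identifier">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="pub-identifier">urn:isbn:9781449325299</dc:identifier>
    <dc:title>EPUB 3 Best Practices</dc:title>
    <dc:language>en</dc:language>
  </metadata>
</package>"""

NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def simple_dc(opf_xml):
    """Return the simple DC elements of a package as {local name: [values]}."""
    root = ET.fromstring(opf_xml)
    metadata = root.find("opf:metadata", NS)
    dc_prefix = "{" + NS["dc"] + "}"
    found = {}
    for element in metadata:
        if element.tag.startswith(dc_prefix):
            local_name = element.tag[len(dc_prefix):]
            found.setdefault(local_name, []).append((element.text or "").strip())
    return found


dc = simple_dc(OPF)
# The three mandatory elements should all be present.
missing = [name for name in ("title", "identifier", "language") if name not in dc]
print(dc, missing)
```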

2. Other metadata schemas

The package element has a prefix attribute that may be used to declare prefixes for metadata schemas other than DCMES. Four vocabularies are reserved, i.e. the prefix may be used without a declaration: dcterms, marc, onix and media (the vocabulary used for EPUB3 media overlays). Example

<dcterms:title>EPUB 3 Best Practices</dcterms:title>

Other vocabularies may be used by providing a prefix and a URL in a way so similar to xmlns that it makes you wonder why they didn’t just use xmlns.

<package prefix="prism: http://prismstandard.org/namespaces/basic/3.0/" [...]>
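
A sketch of how a value of the prefix attribute might be read and used to expand a prefixed term, assuming the value is a whitespace-separated run of "prefix: URL" pairs as in the example above. The function names are hypothetical, and only one of the four reserved prefixes (dcterms) is filled in; the others would need their URLs taken from the EPUB specification.

```python
# Reserved prefixes need no declaration. Only dcterms is listed here;
# the URLs for marc, onix and media are deliberately left out.
RESERVED = {"dcterms": "http://purl.org/dc/terms/"}


def parse_prefix_attribute(value):
    """Turn 'p1: url1 p2: url2' into {'p1': 'url1', 'p2': 'url2'}."""
    tokens = value.split()
    if len(tokens) % 2:
        raise ValueError("expected pairs of 'prefix:' and URL")
    declared = {}
    for prefix, url in zip(tokens[0::2], tokens[1::2]):
        if not prefix.endswith(":"):
            raise ValueError("prefix %r should end with a colon" % prefix)
        declared[prefix[:-1]] = url
    return declared


def expand(term, declared):
    """Expand a term such as 'prism:contentType' to a full URL."""
    prefix, _, reference = term.partition(":")
    prefixes = dict(RESERVED)
    prefixes.update(declared)
    return prefixes[prefix] + reference


declared = parse_prefix_attribute("prism: http://prismstandard.org/namespaces/basic/3.0/")
print(expand("prism:contentType", declared))
```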

3. the meta element

If used without the refines attribute (see below) the meta element can provide information about the package as a whole, e.g.

<meta property="dcterms:title">EPUB 3 Best Practices</meta>

I have no idea what would be the benefit of this over <dcterms:title>.

4. Refining metadata elements: id attribute and the meta element

The id attribute can be used to provide an identifier for any element in the metadata so that it may be refined. One example of this is mandatory, i.e. that one occurrence of the dc:identifier element must be the publication identifier:

<dc:identifier id="pub-identifier">urn:isbn:9781449325299</dc:identifier>

In general the refinements are described using the meta element with the refines attribute and a property attribute that specifies the nature of the refinement. It’s kind of like RDF reification. The default vocabulary for the property attribute includes “file-as” – an alternative string for a name to be used when filing, “identifier-type” – a way to distinguish between different identifiers, “meta-auth” – the authority for a given instance of metadata, “title-type” – which of the six forms of title is being provided.

<dc:creator id="1234">Matt Garrish</dc:creator>
<meta refines="#1234" property="file-as" id="5678">Garrish, Matt</meta>
<meta refines="#1234" property="role">Author</meta>

Terms from other vocabularies may be used for “property” so long as a prefix is declared.

Refinements may have ids and so may be refined.

<meta refines="#5678" property="meta-auth">Phil Barker</meta>

And so you can make statements about your metadata statements to your heart’s content (though including the whole of the linked data graph in each epub would be silly).

The scheme attribute may be used to identify the controlled vocabulary from which the meta element’s value is drawn. For example, if the identifier is a DOI (which in onix is apparently entry 06 of codelist 5) you can have

<dc:identifier id="pub-id">urn:doi:10.1016/j.iheduc.2008.03.001</dc:identifier>
<meta refines="#pub-id"
      property="identifier-type"
      scheme="onix:codelist5">06</meta>

Or, using the marc relator value Aut to specify author

<meta refines="#1234" property="role" scheme="marc:relators">Aut</meta>
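
To show how the refinements hang together, here is a sketch in Python that groups meta elements by the id they refine. The sample reuses the creator example from this section; the enclosing metadata element with its namespace declarations is added so that the fragment parses, and the function name is my own.

```python
import xml.etree.ElementTree as ET

METADATA = """<metadata xmlns="http://www.idpf.org/2007/opf"
          xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:creator id="1234">Matt Garrish</dc:creator>
  <meta refines="#1234" property="file-as" id="5678">Garrish, Matt</meta>
  <meta refines="#1234" property="role" scheme="marc:relators">Aut</meta>
  <meta refines="#5678" property="meta-auth">Phil Barker</meta>
</metadata>"""

OPF_META = "{http://www.idpf.org/2007/opf}meta"


def refinements(metadata_xml):
    """Group refinements as {refined id: [(property, value, scheme), ...]}."""
    root = ET.fromstring(metadata_xml)
    by_target = {}
    for meta in root.iter(OPF_META):
        target = meta.get("refines")
        if target is None:
            continue  # no refines attribute: the statement is about the whole package
        by_target.setdefault(target.lstrip("#"), []).append(
            (meta.get("property"), (meta.text or "").strip(), meta.get("scheme"))
        )
    return by_target


refined = refinements(METADATA)
for target, statements in refined.items():
    print(target, statements)
```

Because a refinement can carry its own id, the result includes statements about the creator (1234) and a statement about one of those statements (5678).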

5. Sub-package level metadata

The id attribute may be used to provide an identifier for any subelement of <package> or any element in the XHTML content documents, down to a span element around a phrase, word or character. So a chapter may have id="chap1", then we can use meta elements in the metadata to describe it separately from the rest of the epub

<meta refines="#chap1" property="prism:contentType">bookChapter</meta>

6. Links to metadata records

The link element is an optional, repeatable subelement of <metadata>, “used to associate resources with a publication, such as metadata records”. The metadata may be within the package or anywhere on the www.
Example

<link rel="marc21xml-record" href="pub/meta/nor-wood-marc21.xml" />
<link refines="#chap1" rel="ex:schema_org-record"
      media-type="application/ld+json"
      href="http://example.org/nor-wood-lrmi.json" />

Metadata embedded in the XHTML5 content

As far as I can see the EPUB3 specs are silent on metadata in the HTML of the content documents, e.g. as html:meta elements or as microdata or RDFa, but there doesn’t seem to be any reason why one should not put metadata here. I wouldn’t expect any EPUB system to look that deeply into the package but it would be a good approach to helping the metadata travel with the resource if the EPUB is disaggregated and passed into a non-EPUB3 CMS.

ePub metadata what gets shown?
http://blogs.cetis.org.uk/philb/2013/06/18/epub-metadata/
Tue, 18 Jun 2013 08:54:06 +0000

One of the issues around eTextBooks is how to describe them, specifically by way of educational metadata in ePub. That’s something that on the face of it shouldn’t be too difficult to address (at least to the extent that we know how to describe any educational resource). One thing that would be useful in demonstrating different choices for educational metadata is an app or tool that will display any metadata found in the ePub package in a sensible way. As a bit of a long shot I tried four eBook readers to see whether they would; they don’t. The details follow, if you’re interested, but do let me know if you know of any tool that might be useful.

The package metadata of an ePub can include a selection of Dublin Core elements and terms. These can be refined, for example you may have two dc:title elements with refinements to specify that one is the main title and the other the subtitle. You can also extend with elements from other XML namespaces, or if you prefer you can just link to a metadata record of your favourite flavour which can be either inside the ePub package or elsewhere on the web. Any of this metadata can relate to the eBook as a whole or some part of it, e.g. a single chapter or image. Without going into details there seems to be enough scope there to experiment with how educational characteristics of the eBook might be described.

But how to see the results? I took an ePub (a copy of O’Reilly’s EPUB 3 Best Practices, since it seemed likely to provide as good a starting point as I was going to find in a real book), made a copy, unzipped it and changed the values of the meta elements so that I could easily identify what elements were being displayed. For example I changed
<dc:title id="pub-title">EPUB 3 Best Practices</dc:title> to
<dc:title id="pub-title">dc:title</dc:title> and so on.

Here’s a list of the metadata elements in that file:

  • <dc:title id="pub-title">
  • <dc:creator id="..." >
  • <dc:publisher>
  • <dc:date>
  • <meta property="dcterms:modified">
  • <dc:identifier id="pub-identifier">
  • <dc:language id="pub-language">
  • <dc:contributor> (repeated)
  • <dc:rights>
  • <dc:subject>
  • <dc:description>
  • <meta id="meta-identifier" property="dcterms:identifier">
  • <meta property="dcterms:title" id="meta-title">
  • <meta property="dcterms:language" id="meta-language">
  • <meta property="dcterms:rights">
  • <meta property="dcterms:rightsHolder">
  • <meta property="dcterms:publisher">
  • <meta property="dcterms:subject">
  • <meta property="dcterms:description">
  • <meta id="...." property="dcterms:creator"> (repeated, different ids)
  • <meta name="cover" content="cover-image"/>
  • <meta property="ibooks:specified-fonts">
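
I made these changes by hand, but the relabelling could be scripted along the following lines. This is only a sketch under some assumptions: it works on the content.opf text alone (unzipping and re-zipping the ePub is left out), the function name is hypothetical, and the sample date is a placeholder rather than the real value from the book.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"
DC_NS = "http://purl.org/dc/elements/1.1/"


def label_metadata(opf_xml):
    """Replace each metadata value with the name of the element that holds it."""
    # Keep the usual prefixes when the tree is serialised again.
    ET.register_namespace("", OPF_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.fromstring(opf_xml)
    metadata = root.find("{%s}metadata" % OPF_NS)
    for element in metadata:
        namespace, _, local_name = element.tag[1:].partition("}")
        if namespace == DC_NS:
            element.text = "dc:" + local_name
        elif local_name == "meta" and element.get("property"):
            element.text = "meta " + element.get("property")
        # <meta name=... content=...> elements carry no text and are left alone.
    return ET.tostring(root, encoding="unicode")


SAMPLE = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title id="pub-title">EPUB 3 Best Practices</dc:title>
    <meta property="dcterms:modified">2013-01-01T00:00:00Z</meta>
    <meta name="cover" content="cover-image"/>
  </metadata>
</package>"""

labelled = label_metadata(SAMPLE)
print(labelled)
```

Note that this produces values that are not valid for elements such as dc:date and dc:language, which is worth remembering when looking at what the readers do with them.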

I then looked at this with various eBook readers:

Readium

I had hopes for Readium since it is pretty much the reference implementation of EPUB3. It displayed

in Readium

  • dc:title
  • dc:creator
  • dc:publisher
  • dc:date
  • meta dcterms:modified
  • dc:identifier

Note that it doesn’t even check for a valid value for dates.

Calibre

Calibre, while it doesn’t claim to support ePub3, is targeted at managing personal book libraries. It displays:

in Calibre

  • dc:title
  • dc:creator
  • dc:subject (for tags)
  • dc:description
  • dc:publisher

It probably uses dc:language and dc:date (for published) as well, but recognises that the dummy values “dc:language” and “dc:date” aren’t valid.

Ideal Reader for Android

The Ideal Reader for Android is the other ePub3 reader I use. It displays

In Ideal Android Reader

  • dc:title
  • meta dcterms:creator (just one of them)
  • dc:date
  • dc:publisher
  • dc:description
  • dc:subject
  • dc:rights

iTunes

Finally I gave a chunk of diskspace to Apple

in iTunes desktop for Windows 7

  • dc:title
  • dc:creator
  • dc:title (again)
  • dc:subject (in the info tab, as Genre)

Yep, title is there twice: the info tab shows dc:title in the Name and Album fields, so you can gauge the amount of effort that Apple have put into adapting iTunes for books.

What did I learn?

I learnt that none of the ePub reading/management apps or tools that I have show more than the bare minimum of metadata, even if it is there. None of them will be much good for trying out ideas for how educational characteristics can be described since I strongly suspect that none of it will be viewable. That’s not too surprising, especially when you consider that none of the tools I looked at are geared around resource discovery, but I can’t really go uploading dummy ePub files to book seller sites just to see what they look like. Maybe any meaningful exploration/demonstration of educational metadata in ePub is going to need a bespoke application, but if you know of a tool that might be helpful do drop me a line.

On Semantics and the Joint Academic Coding System
http://blogs.cetis.org.uk/philb/2013/04/17/on-semantics-and-the-joint-academic-coding-system/
Wed, 17 Apr 2013 11:22:38 +0000

Lorna and I recently contributed a study on possible reforms to JACS, a study which is part of a larger piece of work on Redesigning the HE data landscape. JACS, the Joint Academic Coding System, is maintained by HESA (the Higher Education Statistics Agency) and UCAS (Universities and Colleges Admissions Service) as a means of classifying UK University courses by subject; it is also used by a number of other organisations for classification of other resources, for example teaching and learning resources. The study to which we were contributing our thoughts had already identified a problem with different people using JACS in different ways, which prompted the first part of this post. We were keen to promote technical changes to the way that JACS is managed that would make it easier for other people to use (and incidentally might help solve some of the problems in further developing JACS for use by HESA and UCAS), which are outlined in the second part.

There’s nothing new here, I’m posting these thoughts here just so that they don’t get completely lost.

Subjects and disciplines in JACS

One of the issues identified with the use of JACS is that “although ostensibly believing themselves to be using a single system of classification, stakeholders are actually applying JACS for a variety of different purposes” including Universities who “often try to align JACS codes to their cost centres rather than adopting a strictly subject-based approach”. The cost centres in question are academic schools or departments, which are discipline based. This is problematic for the use of JACS to monitor which subjects are being learnt since the same subject may be taught in several departments. A good example of this is statistics, which is taught across many fields from Mathematics through to social sciences, but there are many other examples: languages taught in mediaeval studies and business translation courses, elements of computing taught in electronic engineering and computer science and so on. One approach would be to ignore the discipline dimension, to say the subject is the same regardless of the different disciplinary slants taken, that is to say statistics taught to mathematicians is the same as statistics taught to physicists is the same as statistics taught to social scientists. This may be true at a very superficial level, but obviously the relevance of theoretical versus practical elements will vary between those disciplines, as will the nature of the data to be analysed (typically a physicist will design an experiment to control each variable independently so as not to deal with multivariate data; this is not often possible in social sciences and so multivariate analysis is far more important). When it comes to teaching and learning resources something aimed at one discipline is likely to contain examples or use an approach not suited to others.

Perhaps more important is that academics identify with a discipline as being more than a collection of subjects being taught. It encapsulates a way of thinking, a framework for deciding on which problems are worth studying and a way of approaching these problems. A discipline is a community, and an academic who has grown up in a community will likely have acquired that community’s view of the subjects important to it. This should be taken into account when designing a coding scheme that is to be used by academics since any perception that the topic they teach is being placed under someone else’s discipline will be resisted as misrepresenting what is actually being taught, indeed as a threat to the integrity of the discipline.

More objectively, the case for different disciplinary slants on a problem space being important is demonstrated by the importance of multidisciplinary approaches to solving many problems. Both the reductionist approach of physics and the holistic approach of humanities and social sciences have their strengths, and it would be a shame if the distinction were lost.

The ideal coding scheme would be able to represent both the subject learnt and the discipline context in which it was learnt.

JACS and 5* data

Tim Berners-Lee suggested a 5 star deployment scheme for open data on the world wide web:
* make your stuff available on the Web (whatever format) under an open licence
** make it available as structured data (e.g., Excel instead of image scan of a table)
*** use non-proprietary formats (e.g., CSV instead of Excel)
**** use URIs to denote things, so that people can point at your stuff
***** link your data to other data to provide context

Currently JACS fails to meet the open licence requirement for 1-star data explicitly, but that seems to be a simple omission of a licensing statement that shows the intention that JACS should be freely available for others to use. It is important that this is fixed, but aside from this, JACS operates at about 3-star level. Assigning URIs to JACS subjects and providing useful information when someone accesses these URIs will allow JACS to be part of the web of linked open data. The benefits of linking data over the web include:

  • The identifiers are globally unique and unambiguous, they can be used in any system without fear of conflicting with other identifiers.
  • The subjects can be referenced globally by humans from websites, emails, and by computer systems in/from data feeds and web applications.
  • The subjects can be used for semantic web approaches to representing ontologies, such as RDF.
  • These allow relationships such as subject hierarchies and relationships with other concepts (e.g. academic discipline) to be represented independently of the coding scheme used. An example of this is SKOS, see below.

In practical terms, implementing this would mean:

  • Devising a URI scheme. This could be as simple as adding the JACS codes to a suitable base URI. For example H713 could become http://id.jacs.ac.uk/H713
  • Setting up a web service to provide suitable information. Anyone connecting to that URL would be redirected to information that matched parameters in their request. A simple web browser would request an HTML page and so be redirected to http://id.jacs.ac.uk/H713.html; web applications would request data in a machine readable form such as xml, rdf or json.
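The redirect behaviour described in the two steps above can be sketched roughly as follows. This is only an illustration, not a description of any deployed service: the base URI is the hypothetical one suggested above, and the function names and the table mapping media types to file suffixes are assumptions made for the example.

```python
# Sketch only: the base URI is the hypothetical one suggested in this post,
# and the suffix table is an assumption, not a deployed service.
BASE_URI = "http://id.jacs.ac.uk/"

# Media type -> suffix of the document the service might redirect to.
SUFFIXES = {
    "text/html": ".html",
    "application/rdf+xml": ".rdf",
    "application/json": ".json",
    "application/xml": ".xml",
}


def parse_accept(header):
    """Return media types from an Accept header, highest q-value first."""
    ranked = []
    for position, part in enumerate(header.split(",")):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        quality = 1.0
        for field in fields[1:]:
            name, _, value = field.partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:
            # Sort by descending quality, then by order of appearance.
            ranked.append((-quality, position, media_type))
    return [media_type for _, _, media_type in sorted(ranked)]


def redirect_target(code, accept_header="text/html"):
    """Pick the document URL a client asking for `code` would be sent to."""
    for media_type in parse_accept(accept_header):
        if media_type in SUFFIXES:
            return BASE_URI + code + SUFFIXES[media_type]
    # Fall back to HTML for wildcards or unrecognised types.
    return BASE_URI + code + ".html"


print(redirect_target("H713", "text/html,application/xhtml+xml"))
print(redirect_target("H713", "application/json;q=0.9, text/html;q=0.5"))
```

A real service would issue an HTTP redirect to the chosen URL rather than just returning a string, but the selection logic would be along these lines.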

The main overhead is in setting up, maintaining and managing the data provided by the web service, but Southampton University have already set one up for their own use. (The only problem with the Southampton service–and I believe Oxford have done something similar–is a lack of authority, i.e. it isn’t clear to other users whether the data from this service is authoritative, up to date, used under a correct license, sustainable.)

JACS and SKOS

SKOS (Simple Knowledge Organization System) is a semantic web application of RDF which provides a model for expressing the basic structure and content of concept schemes such as thesauri, classification schemes, subject heading lists, taxonomies, folksonomies, and other similar types of controlled vocabulary. It allows for the description of a concept and the expression of the relationships between pairs of concepts. But first the concept must be identified as such, with a URI. For example:
jacs:H713 rdf:type skos:Concept
In this example jacs: is shorthand for the JACS base URI, http://id.jacs.ac.uk/ as suggested above; rdf: and skos: are shorthand for the base URIs for RDF and SKOS. This triple says “The thing identified by http://id.jacs.ac.uk/H713 is a resource of type (as defined by RDF) concept (as defined by SKOS)”.

Other assertions can be made about the resource, e.g. the preferred label to be used for it and a scope note for it.
jacs:H713 skos:prefLabel “Production Processes”
jacs:H713 skos:scopeNote “The study of the principles of engineering as they apply to efficient application of production-line technology.”

Assuming the other JACS codes have been suitably identified, relationships between them can be described:
jacs:H713 skos:broader jacs:H710
jacs:H713 skos:related jacs:H830

Once JACS is on the semantic web, relationships between the JACS subjects and things in other schemas can also be represented:
http://example.org/123 dct:subject jacs:H713
(The resource identified by the URI http://example.org/123 is about the subject identified by jacs:H713).
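To make the shorthand concrete, here is a minimal sketch that expands the prefixed statements in this section into full N-Triples lines. The rdf:, skos: and dct: namespaces are the published ones; the jacs: base URI is the hypothetical one proposed above, and the helper names are invented for this example. A real application would more likely use an RDF library than hand-rolled string handling.

```python
# Prefixes: rdf, skos and dct are the published namespace URIs;
# the jacs base URI is the hypothetical one proposed in this post.
PREFIXES = {
    "jacs": "http://id.jacs.ac.uk/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "dct": "http://purl.org/dc/terms/",
}


def expand(term):
    """Expand prefix:name into a full URI; leave full URIs unchanged."""
    prefix, sep, local = term.partition(":")
    if sep and prefix in PREFIXES:
        return PREFIXES[prefix] + local
    return term


def to_ntriple(subject, predicate, obj, literal=False):
    """Serialise one statement as an N-Triples line."""
    if literal:
        # Escape backslashes and double quotes inside the literal.
        escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
        obj_text = '"' + escaped + '"'
    else:
        obj_text = "<" + expand(obj) + ">"
    return "<%s> <%s> %s ." % (expand(subject), expand(predicate), obj_text)


# The statements from this section: (subject, predicate, object, is_literal).
STATEMENTS = [
    ("jacs:H713", "rdf:type", "skos:Concept", False),
    ("jacs:H713", "skos:prefLabel", "Production Processes", True),
    ("jacs:H713", "skos:broader", "jacs:H710", False),
    ("jacs:H713", "skos:related", "jacs:H830", False),
    ("http://example.org/123", "dct:subject", "jacs:H713", False),
]

for statement in STATEMENTS:
    print(to_ntriple(*statement))
```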

Book now available. Into the Wild – Technology for Open Educational Resources http://blogs.cetis.org.uk/philb/2013/03/21/into-the-wild/ http://blogs.cetis.org.uk/philb/2013/03/21/into-the-wild/#comments Thu, 21 Mar 2013 11:04:30 +0000 http://blogs.cetis.org.uk/philb/?p=795
Into the Wild (Book cover)

With great pleasure and more relief I can now announce the availability of Into the wild – technology for open educational resources, a book of our reflections on the technology involved in three years of the UK OER Programmes.

From the blurb:

Between 2009 and 2012 the Higher Education Funding Council funded a series of programmes to encourage higher education institutions in the UK to release existing educational content as Open Educational Resources. The HEFCE-funded UK OER Programme was run and managed by the JISC and the Higher Education Academy. The JISC CETIS “OER Technology Support Project” provided support for technical innovation across this programme. This book synthesises and reflects on the approaches taken and lessons learnt across the Programme and by the Support Project.

This book is not intended as a beginner’s guide or a technical manual; instead it is an expert synthesis of the key technical issues arising from a national publicly-funded programme. It is intended for people working with technology to support the creation, management, dissemination and tracking of open educational resources, and particularly those who design digital infrastructure and services at institutional and national level.

You may remember Lorna writing back in August that Amber Thomas, Martin Hawksey, Lorna and I had written 90% of this book together in a Book Sprint. Well, the last 10% and the publication turned into a bit of a marathon-relay, something about which I might write some time, but now the book is available in a variety of formats:

  • If you want glossy-covered paperback, then you can order it print-on-demand from Lulu (£3.36); if you’re not so fussed about the glossy cover and binding then there is a print-quality pdf you can print yourself.
  • If you have an ePub reader, there is a free download of an epub2 file.
  • If you have a Kindle, you can download the .mobi file and transfer it, or if you prefer the convenience of Amazon’s distribution over whisper-net you can buy it from them (77p, they don’t seem to distribute for free unless you agree to give them exclusive rights for all electronic formats).
  • Finally, if you prefer your ebook reading as PDFs, there is one of those too.

All varieties are free or at minimum cost for the distribution channel used; the content is cc-by licensed and editable versions are available if you wish to remix and fix what we’ve done.

Available via the Cetis publications site.

Webinar: Learning resource metadata for schema.org http://blogs.cetis.org.uk/philb/2012/07/13/lrmi-webinar/ http://blogs.cetis.org.uk/philb/2012/07/13/lrmi-webinar/#comments Fri, 13 Jul 2012 10:39:55 +0000 http://blogs.cetis.org.uk/philb/?p=651 As you may know, I have been involved in the development of the Learning Resource Metadata Initiative’s extension of schema.org since about this time last year. Things are shaping up well for the inclusion of the LRMI properties in the main schema.org vocabulary, so this seems like a good time(*) to start explaining and promoting them. To that end, we will be running a webinar, hosted on JISC’s BlackBoard Collaborate service, on Fri 27 July starting at 15:00 UK time; it will run for up to 2 hours.

Update: the webinar happened, you can get the slides that were used from slideshare and you can view a full recording of the webinar (that’s a BlackBoard Collaborate recording, you need Java for it to play).

In this webinar we will explore the background, intent and output of the Learning Resource Metadata Initiative (LRMI). The LRMI has proposed extensions to the schema.org microdata vocabulary with the aim of facilitating the discovery of learning resources through major search engines and other discovery services. We will provide an introduction to schema.org and describe the specific approach taken by LRMI.

My first take at an outline programme is along the lines of:

  • Outline of schema.org as semantic tagging of HTML content (this isn’t intended to be a tutorial on how to add schema to a web page, but I think it will be useful to make sure everyone starts from the same understanding of schema’s place in the web)
  • Who is behind schema.org
  • Their motivation: “improve search services”–what that means
  • What schema.org (initial release) offers for Learning Resources and what it doesn’t.
  • Who is behind LRMI
  • How LRMI worked
  • Most importantly, what LRMI produced

I am delighted that helping me with this webinar will be two key players in LRMI and schema.org: Dan Brickley, who many of you will know from his years of activity on RDF and the semantic web and who is heavily involved in the outreach, standards and community work around schema.org, and Greg Grossmeier of Creative Commons, who is Co-chair of the LRMI technical working group and so has steered us from the collection of user requirements through to the development of new schema.org properties.

The target audience is staff from UK Further and Higher Education with an interest in the dissemination of learning resources (for example Open Educational Resources, OERs) and building services for their discovery, especially those people involved in JISC projects and services. If demand is high, priority will be given to this audience.

(* yeah, OK, Friday afternoon at the end of July isn’t really a good time for this, but it ended up as the best time for the people involved given their other constraints….)

Learning resource metadata initiative http://blogs.cetis.org.uk/philb/2011/09/08/learning-resource-metadata-initiative/ http://blogs.cetis.org.uk/philb/2011/09/08/learning-resource-metadata-initiative/#comments Thu, 08 Sep 2011 03:40:49 +0000 http://blogs.cetis.org.uk/philb/?p=535 In the spirit of Godwin’s law, I would propose that

“As any discussion about metadata grows longer the probability of a comparison to Google approaches one.”

Of course the comparison is usually that formal metadata is insignificant for the resource discovery needs of most people when compared to Google.

On one hand this is an oversimplification: metadata is important for resource management in general not just for resource discovery, the information contained in metadata can be exposed to Google and other search engines, and it helps resource discovery in other ways, for example in displaying relationships between resources that can be browsed and crawled. It remains, however, true that all the effort that has gone into formalising and standardising metadata schema has had little, if any, direct effect on how people find resources through the search engine of their choice. So it’s interesting that the big search engines are now taking an interest in metadata markup of web pages, first with Google’s rich snippets, and now the more extensive (in a number of ways) schema.org initiative. I guess that this approach (that is, marking up the human readable information on a web page to show its relationship to a formal metadata schema as opposed to holding it separately in a purely machine readable format) appeals to search engines because of their suspicion that any information not visible to the reader of a page (e.g. metadata elements in the HTML head element) might be there purely to spam search engine results.

Of course, my interest through CETIS is in educational metadata, and I have already dabbled in using rich snippets to mark up a description of an educational resource. So I was extremely interested to hear about the Learning Resource Metadata Initiative headed up by Creative Commons and the Association of Educational Publishers, aiming to apply the schema.org approach to educational resources (schema.org initially, with an RDFa expression planned as a secondary output derived from it). I was extremely pleased to be accepted on to the technical working group to help draw up the details. Tomorrow is the first face to face meeting of that technical group, which is why I am writing this on a plane on the way to San Francisco.

While this will be the first face to face meeting, the technical group has made a start on its work. The previous work in educational metadata has been surveyed; use cases for lrmi have been collected, including those which were submitted for the Dublin Core Education Application Profile; and we’ve had a couple of teleconference meetings. It’s early days, so a lot is still open, but this much I can say (but I say it as an individual, I’m not claiming to be reporting any consensus of the working group). The scope of lrmi is resource discovery, and for me it stands or falls on whether it helps discovery through search engines. With respect to this there does already seem to be some uncertainty (generally) over how search engines will use schema.org and how the governance of the main schema.org vocabulary allows for community-driven additions and usage profiles (there is an upcoming schema.org meeting that might help clarify this). However, I guess that in the end it will come down to Google and others using what they find useful and ignoring what they don’t: which isn’t a bad way of establishing an industry standard in this field (I see parallels with browser developers and HTML5). The use cases gathered include the usual discovery issues; so far I haven’t seen anything unexpected, so hopefully the lrmi output will align with other efforts to meet those same scenarios. There is one slight coda to that though: there is a lot of interest in expressing the usefulness of a resource for specific learning objectives as set out in standard curricula. This is largely with respect to showing the alignment of a resource with US state standard curricula, and the US national core K-12 curriculum.
I know very little about the US standard curriculum/a, but I do think it is important (and believe it would be useful) that any approach adopted by lrmi to showing this alignment should be usable more generally for, e.g., the English National Curriculum and possibly for wider competency frameworks as used in UK HE for some disciplines (e.g. medicine, Scottish law, engineering). I should stress that, while the level of interest in this is noteworthy, showing such alignments isn’t new: it’s achievable with the LOM (classification with purpose set to learning objective), Dublin Core has had the conformsTo term for showing alignment to an educational standard for a number of years, and it has been discussed for the conceptual model for ISO MLR part 5.

I’ll report more when I am home from the meeting and will, of course, be happy to feed forward any comments you have, but to be kept up to date on all developments and to have a more direct say join the LRMI discussion group.

Event: what metadata is really useful? http://blogs.cetis.org.uk/philb/2010/09/08/event-mdreqs/ http://blogs.cetis.org.uk/philb/2010/09/08/event-mdreqs/#comments Wed, 08 Sep 2010 15:30:56 +0000 http://blogs.cetis.org.uk/philb/?p=324 CETIS are organising an event “What metadata is really useful” at Brettenham House in London on Mon. 18 October.

This meeting will focus on looking at what data we have (or could acquire) to answer the question of what metadata is really required to support the discovery, selection, use and management of educational resources. The emphasis is on identifying data that demonstrates a real requirement from some party; this is in contrast to other approaches such as hypothetical, future-looking use cases. Future-looking use cases have their place–we would all like to see applications and services which allow us to do things that we cannot do now–but now seems to be a suitable point to reflect on what needs to be prioritised because it meets the needs of users today. Of those four functions (discovery, selection, use and management), it is likely that the meeting will deal mainly with the first two or three; I think we will be able to find more data for these, but it is important to keep all four functions in mind before we say that there is no demonstrated need to describe some characteristic of a resource. The data in question may come from various sources, for example, user surveys of how people look for educational resources, current practice in metadata production, or analysis of user search behaviour.

Here’s an example of what we can get from user surveys from David Davies. I hope that at this meeting we will be able to build on this and any other existing work people care to bring along. We might, for example, want to consider whether we can increase the scope and reach of such questionnaires in the future by suggesting some common questions they could include.

A second source of data can be found in the current cataloguing practice for existing repositories; this can be surfaced by examination of application profiles or cataloguing guidelines in use and examination of the records themselves. So we can find out whether people using the LOM do find it useful to have separate description elements for general and educational properties (not to mention all those that come in the classification category). This is especially interesting since it is perhaps the only source of data I have thought of that reflects metadata required internally to the repository for managing resources.

Finally, and this is where I think we will have most to discuss, data can be obtained from logging access and queries. This is what I have in mind by way of questions that could be answered this way:

  1. How do people find the site? Is it through search engines or direct referral? Do they land on a resource page (=> they were looking for a resource and found it directly with an external search) or on your home page (=> they were looking for a collection of resources)? Obviously the answers will depend on who your users are and why they are coming to your site; if you have an institutional repository or other local collection of your own resources (e.g. an OER site) you might find that members of your own institution, staff and students, behave differently from others from outwith your institution.
  2. What search terms do people use to find resources? We can divide this into two: people who search elsewhere, e.g. Google, with query terms discovered through referrer logs or other web analytics tools; and people who search using a site’s own search functionality. A lot of the search terms will be subject keywords and they’ll be of interest to cataloguers or thesaurus developers for a specific site, but there will be other search terms (e.g. ‘powerpoint’, ‘ppt’, and ‘slides’ all featured in the one set of logs I looked at recently), which lead us to …
  3. What do the search terms tell us about what characteristics of a resource people are searching for? And how do they conceptualize those characteristics? So a search for “powerpoint” suggests that they’re searching for a particular resource type, “introduction to…” would suggest a way of thinking about educational level. This would help us when making decisions about what metadata elements to use.
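As a rough illustration of questions 2 and 3, the sketch below pulls search terms out of referrer URLs and separates those that hint at a resource type from the rest. It is only a starting point under stated assumptions: the query parameter name "q", the example referrer URLs and the short list of resource-type terms are all invented for the example, and a real analysis would need a term list drawn from your own logs.

```python
from collections import Counter
from urllib.parse import urlparse, parse_qs

# Assumed, illustrative list of terms that hint at a resource type
# rather than a subject; a real list would come from your own logs.
RESOURCE_TYPE_TERMS = {"powerpoint", "ppt", "slides"}


def search_terms(referrer, param="q"):
    """Return the lower-cased words of the query in a referrer URL."""
    query = parse_qs(urlparse(referrer).query)
    words = []
    for value in query.get(param, []):
        words.extend(value.lower().split())
    return words


def summarise(referrers):
    """Count terms, split into resource-type hints and everything else."""
    type_hints, other = Counter(), Counter()
    for referrer in referrers:
        for word in search_terms(referrer):
            target = type_hints if word in RESOURCE_TYPE_TERMS else other
            target[word] += 1
    return type_hints, other


# Hypothetical referrer URLs standing in for lines from a web server log.
EXAMPLE_REFERRERS = [
    "http://www.google.com/search?q=statistics+powerpoint",
    "http://www.google.com/search?q=introduction+to+statistics",
    "http://www.example.org/search?q=thermodynamics+slides",
    "http://www.example.org/about",
]

type_hints, other = summarise(EXAMPLE_REFERRERS)
print("Resource-type terms:", type_hints.most_common())
print("Other terms:", other.most_common(3))
```

Counting single words is crude (it splits phrases such as “introduction to”), but it is enough to show which non-subject characteristics people are asking for.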

Interested?
If you are interested in this event, you can register online. More details about the programme etc. will become available on the event’s wiki page (at the time of writing there is very little there that isn’t also on this post). Most importantly, if you think you have something to present or contribute, please get in touch with me, phil.barker /at/ hw.ac.uk.

Views sought on ISO Metadata for Learning Resources http://blogs.cetis.org.uk/philb/2009/12/03/views-on-mlr/ http://blogs.cetis.org.uk/philb/2009/12/03/views-on-mlr/#comments Thu, 03 Dec 2009 12:14:05 +0000 http://blogs.cetis.org.uk/philb/?p=219 Work on the ISO standard Metadata for Learning Resources is reaching a critical point, with bodies such as BSI being asked to vote on whether the current draft text for part 1 (the framework) should be allowed to continue to the next stage of the ISO standardization process. The current draft is the final committee draft; approval by this ballot would indicate that those interested at this stage had reached consensus on the technical content, and the document could become a draft International Standard. There then follows a wider enquiry stage and further votes before the standard is fully ratified.

MLR is being drafted as a multi-part standard and the role of part 1, the framework, is to provide the overall principles, rules, and structures for how the other parts define data elements and how they should be used. One of the objectives is that MLR should be as compatible as possible with the LOM and the Dublin Core abstract model (and therefore with RDF though specific bindings are out of scope for this part).

CETIS have passed on comments about previous drafts to the ISO committee through various channels. The most important channel for us for this draft is BSI, who get a vote in the ballot, and they are looking for comments by the end of February. We would like to put together an agreed position on behalf of those involved in UK F&HE; if you are interested in contributing to this please get in touch (email philb@icbl.hw.ac.uk) and I will pass on the details (update: there is a copy of the draft text on the ISOTC website). We are of course interested in views from outwith UK F&HE, but there might be more appropriate routes for you to provide your feedback to BSI or your own national body.

Liddy Nevile is also asking for help in submitting comments from the Dublin Core Metadata Initiative. I would encourage people to help her with that.

Update: thanks to Erlend Øverby and Andy Heath for showing me where a copy of this draft can be found.

Update 2: There has been some discussion on the CETIS-Metadata email list about this. Please consider joining in.
