Reviewing the future for Leap2

JISC commissioned a Leap2A review report (PDF), carried out early in 2012, that has now been published. It is available along with other relevant materials from the e-Portfolio interoperability JISC page. For anyone following the fortunes of Leap2A, it is well worth reading. Naturally, not all possible questions were answered (or asked), and I’d like to take up some of these, with implications for the future direction of Leap2 more generally.

The summary recommendations were as follows — these are very welcome!

  1. JISC should continue to engage with vendors in HE who have not yet implemented Leap2A.
  2. Engagement should focus on communities of practice that are using or are likely to use e-portfolios, and situations where e-portfolio data transfer is likely to have a strong business case.
  3. JISC should continue to support small-scale tightly focused developments that are likely to show immediate impact.
  4. JISC should consider the production of case studies from PebblePad and Mahara that demonstrate the business case in favour of Leap2A.
  5. JISC should consider the best way of encouraging system vendors to provide seamless import services.
  6. JISC should consider constructing a standardisation roadmap via an appropriate BSI or CEN route.

That tallies reasonably with the outcome of the meeting back in November last year, where we reckoned that Leap2A needs: more adoption; more evidence of utility; to be taken more into the professional world; good governance; more examples; and for the practitioner community to build around it models of lifelong development that will justify its existence.

Working backwards up the list for the Leap2A review report, recommendation 6 is one for the long term. It could perhaps be read in the context of the newly formed CETIS position on the recent Government Open Standards Consultation. There we note:

Established public standards bodies (such as ISO, BSI and CEN), while doing valuable work, have some aspects that would benefit from modernisation to bring them more into line with organisations such as W3C and OASIS.

The point then elaborated is that the community really needs open standards that are freely available as well as royalty-free and unencumbered. The de jure standards bodies normally still charge for copies of their standards, as part of their business model, which we see as outdated. If we can circumvent that issue, then BSI and CEN would become more attractive options.

It is the previous recommendation, number 5 in the list above, that I will focus on more, though. Here is the fuller version of that recommendation (appearing as paragraph 81).

One of the challenges identified in this review is to increase the usability of data exchange with the Leap2A specification, by removing the current necessity for separate export and import. This report RECOMMENDS that JISC considers the best way of encouraging system vendors to provide seamless data exchange services between their products, perhaps based on converging practice in the use of interoperability and discovery technologies (for example future use of RDF). It is recognised that this type of data exchange may require co-ordinated agreement on interoperability approaches across HEIs, FECs and vendors, so that e-portfolio data can be made available through web services, stressing ease of access to the learner community. In an era of increasing quantities of open and linked data, this recommendation seems timely. The current initiatives around courses information — XCRI-CAP, Key Information Sets (KIS) and HEAR — may suggest some suitable technical approaches, even though a large scale and expensive initiative is not recommended in the current financially constrained circumstances.

As an ideal, that makes perfect sense from the point of view of an institution transferring a learner’s portfolio information to another institution. However, seamless transfer is inherently limited by the compatibility (or lack of it) between the information stored in each system. There is also a different scenario, one that has always been in people’s minds when working on Leap2A: learners themselves may want to download their own information and keep it for use, at an uncertain time in the future, in ways that the institutions hosting their information cannot necessarily predict. In any case, the predominant culture in the e-portfolio community is that all the information should be learner-ownable, if not actually learner-owned. This is reflected in the report’s paragraph 22, dealing with current usage from PebblePad.

The implication of the Leap2A functionality is that data transfer is a process of several steps under the learner’s control, so the learner has to be well-motivated to carry it out. In addition Leap2A is one of several different import/export possibilities, and it may be less well understood than other options. It should perhaps be stressed here that PebblePad supports extensive data transfer methods other than Leap2A, including zip archives, native PebblePad transfers of whole or partial data between accounts, and similarly full or partial export to HTML.

This is followed up in the report’s paragraph 36, part of the “Challenges and Issues” section.

There also appears to be a gap in promoting the usefulness of data transfer specifically to students. For example in the Mahara and PebblePad e-portfolios there is an option to export to a Leap2A zip file or to a website/HTML, without any explanation of what Leap2A is or why it might be valuable to export to that format. With a recognisable HTML format as the other option, it is reasonable to assume that students will pick the format that they understand. Similarly it was suggested that students are most likely to export into the default format, which in more than one case is not the Leap2A specification.

The obvious way to create a simpler interface for learners is to have just one format for export. What could that format be? It should be noted first that separate files that are attached to or included with a portfolio will always remain separate. The issue is the format of the core data, which in normal Leap2A exports is represented by a file named “leap2a.xml”.

  1. It could be plain HTML, but then the argument for Leap2A would be lost, as there is no easy way for plain HTML to be imported into another portfolio system without a complex and time-consuming process of choosing where each single piece of information should be put in the new system.
  2. It could be Leap2A as it is, but the question then would be, would this satisfy users’ needs? Users’ own requirements for the use of exports are not spelled out in the report, and they do not appear to have been systematically investigated anywhere, but it would be reasonable to expect that one use case would be that users want to display the information so that it can be cut and pasted elsewhere. Leap2A supports the display of media files within text, and formatting of text, only through the inclusion of XHTML within the content of entries, in just the same way as Atom does. It is not unreasonable to conclude that limiting exports to plain Leap2A would not fully serve user export needs, and therefore it is and will continue to be unreasonable to expect portfolio systems to limit users to Leap2A export only.
  3. If there were a format that fully met the requirements both for ease of viewing and cut-and-paste, and for relatively easy and straightforward importing to another portfolio system (comparable to Leap2A currently), it might then be reasonable to expect portfolio systems to have this as their only export format. Then, users would not have to choose, would not be confused, and the files which they could view easily and fully through a browser on their own computer system would also be able to be imported to another portfolio system to save the same time and effort that is currently saved through the use of Leap2A.

So, on to the question, what could that format be? What follows explains just what the options are for this, and how it would work.

The idea for microformats apparently originated in 2000. The first sentence of the Wikipedia article summarises nicely:

A microformat (sometimes abbreviated µF) is a web-based approach to semantic markup which seeks to re-use existing HTML/XHTML tags to convey metadata and other attributes in web pages and other contexts that support (X)HTML, such as RSS. This approach allows software to process information intended for end-users (such as contact information, geographic coordinates, calendar events, and the like) automatically.

In 2004, a more sophisticated approach to similar ends was proposed in RDFa. Wikipedia has “RDFa (or Resource Description Framework –in– attributes) is a W3C Recommendation that adds a set of attribute-level extensions to XHTML for embedding rich metadata within Web documents.”

In 2009 the WHATWG were developing Microdata towards its current form. The Microformats community sees Microdata as having grown out of Microformats ideas. Wikipedia writes “Microdata is a WHATWG HTML specification used to nest semantics within existing content on web pages. Search engines, web crawlers, and browsers can extract and process Microdata from a web page and use it to provide a richer browsing experience for users.”

Schema.org was launched on 2 June 2011 by Bing, Google and Yahoo!. Wikipedia quotes its originators as stating that it was launched to “create and support a common set of schemas for structured data markup on web pages”. It provides a hierarchical vocabulary, in some cases drawing on Microformats work, that can be used within RDFa as well as Microdata.

Is it possible to represent Leap2A information in this kind of way? Initial exploratory work on Leap2R has suggested that it is indeed possible to identify a set of classes and properties that could be used more or less as they are with RDFa, or could be correlated with the schema.org hierarchy for use with Microdata. However, the solution needs detail adding and working through.

In principle, using RDFa or Microdata, any portfolio information could be output as HTML, with the extra information currently represented by Leap2A added into the HTML attributes, which are not directly displayed, and so do not interfere with human reading of the HTML. Thus, this kind of representation could fully serve all the purposes currently served by HTML export. It seems highly likely that practical ways of doing this can be devised that can convey the complete structure currently given by Leap2A. The requirements currently satisfied by Leap2A would be satisfied by this new format, which might perhaps be called “Leap2H5”, for Leap2 information in HTML5, or maybe alternatively “Leap2XR”, for Leap2 information in XHTML+RDFa (in place of Leap2A, meaning Leap2 information in Atom).
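To make the idea concrete, here is a minimal sketch in Python of how a portfolio system might write out one entry as HTML carrying Microdata attributes. Everything specific in it is an assumption for illustration: the function name, the shape of the entry dictionary, and above all the choice of schema.org’s CreativeWork type with its name, dateCreated and description properties. No Leap2 mapping to schema.org has been agreed, so this shows only the mechanism, not a proposed vocabulary.

```python
from html import escape


def entry_to_microdata(entry):
    """Render one portfolio entry as an HTML fragment with Microdata attributes.

    The schema.org type and property names are illustrative stand-ins,
    not an agreed Leap2 mapping.
    """
    title = escape(entry["title"])
    created = escape(entry["created"])
    summary = escape(entry["summary"])
    return (
        '<article itemscope itemtype="https://schema.org/CreativeWork">\n'
        f'  <h2 itemprop="name">{title}</h2>\n'
        f'  <time itemprop="dateCreated" datetime="{created}">{created}</time>\n'
        f'  <p itemprop="description">{summary}</p>\n'
        '</article>'
    )


# Hypothetical entry, as a portfolio system might hold it internally.
entry = {
    "title": "Team project & report",
    "created": "2012-03-01",
    "summary": "Reflections on a group assignment.",
}
fragment = entry_to_microdata(entry)
print(fragment)
```

A browser shows this fragment as an ordinary heading, date and paragraph. The itemscope, itemtype and itemprop attributes are invisible to the reader but remain available to any software that reads the page.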

Thus, in principle it appears perfectly possible to have a single format that simultaneously does the job both of HTML and Leap2A, and so could serve as a plausible principal export and import format, removing that key obstacle identified in paragraph 36 of the Leap2A review report. The practical details may be worked out in due course.

There is another clear motivation in using schema.org metadata to mark up portfolio information. If a web page uses schema.org semantics, whether publicly displayed on a portfolio system or on a user’s own site, Google and others state that the major search engines will create rich snippets to appear under the search result, explaining the content of the page. This means, potentially, that portfolio presentations would be more easily recognised by, for instance, employers looking for potential employees. In time, it might also mean that the search process itself was made more accurate. If portfolio systems were to adopt export and import using schema.org in HTML, the same markup could also be used for all display of portfolio information through their systems. This would open the way to effective export of small amounts of portfolio information simply by saving a web page displayed through normal e-portfolio system operation; and could also serve as an even more effective and straightforward method for transferring small amounts of portfolio information between systems.
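The importing side of such a transfer can be sketched in the same spirit. The Python below uses only the standard library’s html.parser to pull the itemprop values back out of a saved page. The class name and the sample page are hypothetical, the schema.org type and properties are again illustrative rather than an agreed Leap2 mapping, and the reader handles only a single flat item, which is much less than a real Microdata processor would need to do.

```python
from html.parser import HTMLParser


class MicrodataReader(HTMLParser):
    """Collect itemprop values from one flat (non-nested) Microdata item."""

    def __init__(self):
        super().__init__()
        self.item_type = None
        self.properties = {}
        self._open_prop = None  # itemprop whose text is being collected
        self._open_tag = None   # tag that opened that itemprop
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemscope" in attrs and self.item_type is None:
            self.item_type = attrs.get("itemtype")
        prop = attrs.get("itemprop")
        if prop is None:
            return
        if tag == "time" and "datetime" in attrs:
            # For <time>, the machine-readable value is in the attribute.
            self.properties[prop] = attrs["datetime"]
        else:
            self._open_prop, self._open_tag, self._buffer = prop, tag, []

    def handle_data(self, data):
        if self._open_prop is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if self._open_prop is not None and tag == self._open_tag:
            self.properties[self._open_prop] = "".join(self._buffer).strip()
            self._open_prop = None


# Hypothetical saved page fragment, as a learner might have downloaded it.
saved_page = """
<article itemscope itemtype="https://schema.org/CreativeWork">
  <h2 itemprop="name">Team project &amp; report</h2>
  <time itemprop="dateCreated" datetime="2012-03-01">1 March 2012</time>
  <p itemprop="description">Reflections on a group assignment.</p>
</article>
"""

reader = MicrodataReader()
reader.feed(saved_page)
reader.close()
print(reader.item_type)
print(reader.properties)
```

The receiving system gets labelled values (a name, a creation date, a description) rather than undifferentiated HTML, which is what would spare the learner from deciding by hand where each piece of information belongs.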

I recently floated this idea of agreeing Leap2 semantics in schema.org with European collaborators, and it looks like gaining substantial support. This opens up yet another very promising possibility: existing European portfolio-related formats could be harmonised through this new format, which is not biased towards any of the existing ones. As well as Leap2A, there is the Dutch NTA 2035 (derived from IMS ePortfolio), and also the Europass CV format. (There is more about this strand of unfunded work through MELOI.) All of these are currently expressed using XML, but none has yet grasped the potential of schema.org in HTML through microdata or RDFa. To restate the main point here, this means having the semantics of portfolio information embedded in machine-processable ways, without interfering with the human-readable HTML.

I don’t want to be over-optimistic, as currently money tends only to go towards initiatives with a clear business case, but I am hopeful that in the medium term, people will recognise that this is an exciting and powerful potential development. When any development of Leap2 gets funded, I’m suggesting that this is what to go for, and if anyone has spare resource to work on Leap2 in the meantime, this is what I recommend.