achievement information | Wilbert Kraan

Last weekend, a motley crew of designers, students, developers, business and government people came together in Edinburgh to prototype designs and apps to help learners manage their journeys. With help, I built a prototype that showed how curriculum and course offering data can be combined with e-portfolios to help learners find their way.

The first official Scottish government data jam, facilitated by Snook and supported by TechCube, is part of a wider project to help people navigate the various education and employment options in life, particularly post 16. The jam was meant to provide a way to quickly prototype a wide range of ideas around the learner journey theme.

While many other teams at the jam built things like a prototype social network, or great visualisations to help guide learners through their options, we decided to use the data that was provided to help see what an infrastructure could look like that supported the apps the others were building.

In a nutshell, I wanted to see whether a mash-up of open data in open standard formats could help answer questions like:

Where is the learner in their journey?
Where can we suggest they go next?
What can help them get there?
Who can help or inspire them?

Here’s a slide deck that outlines the results. For those interested in the nuts and bolts read on to learn more about how we got there.

Where is the learner?

To show how you can map where someone is on their learning journey, I made up an e-portfolio. Following an excellent suggestion by Lizzy Brotherstone of the Scottish Government, I nicked a story about ‘Ryan’ from an Education Scotland website on learner journeys. I recorded his journey in a Mahara e-portfolio, because it outputs data in the standard LEAP2a format- I could have used PebblePad as well for the same reason.

I then transformed the LEAP2a XML into very rough but usable RDF using a basic stylesheet I made earlier. Why RDF? Because it makes it easy for me to mash up the portfolios with other datasets; other data formats would also work. The made-up curriculum identifiers were added manually to the RDF, but could easily have been taken from the LEAP2a XML with a bit more time.

Where can we suggest they go next?

I expected that the Curriculum for Excellence would provide the basic structure to guide Ryan from his school qualifications to a college course. Not so, or at least, not entirely. The Scottish Qualifications Framework gives a good idea of how courses relate in terms of levels (i.e. from basic to a PhD and everything in between), but there’s little to join subjects. After a day of head scratching, I decided to match courses to Ryan’s qualifications by level and comparing the text of titles. We ought to be able to do better than that!

The course data set was provided to us was a mixture of course descriptions from the Scottish Qualifications Authority, and actual running courses offered by Scottish colleges all in one CSV file. During the jam, Devon Walshe of TechCube made a very comprehensive data set of all courses that you should check out, but too late for me. I had a brief look at using XCRI feeds like the ones from Adam Smith college too, but went with the original CSV in the end. I tried using LOD Refine to convert the CSV to RDF, but it got stuck on editing the RDF harness for some reason. Fortunately, the main OpenRefine version of the same tool worked its usual magic, and four made-up SQA URIs later, we were in business.

This query takes the email of Ryan as a unique identifier, then finds his qualification subjects and level. That’s compared to all courses from the data jam course data set, and whittled down to those courses that match Ryan’s qualifications and are above the level he already has.

The result: too many hits, including ones that are in subjects that he’s unlikely to be interested in.

So let’s throw in his interests as well. Result: two courses that are ideal for Ryan’s skills, but are a little above his level. So we find out all the sensible courses that can take him to his goal.

What can help them get there?

One other quirk about the curriculum for excellence appears to be that there are subject taxonomies, but they differ per level. Intralect implemented a very nice one that can be used to tag resources up to level 3 (we think). So Intralect’s Janek exported the vocabulary in two CSV files, which I imported in my triple store. He then built a little web service in a few hours that takes the outcome of this query, and returns a list of all relevant resources in the Intralibrary digital repository for stuff that Ryan has already learned, but may want to revisit.

Who can help or inspire them?

It’s always easier to have someone along for the journey, or to ask someone who’s been before you. That’s why I made a second e-portfolio for Paula. Paula is a year older than Ryan, is from a different, but nearby school, and has done the same qualifications. She’s picked the same qualification as a goal that we suggested to Ryan, and has entered it as a goal on her e-portfolio. Ryan can get it touch with her over email.

This query takes the course suggested to Ryan, and matches it someone else’s stated academic goal, and reports on what she’s done, what school she’s from, and her contact details.

Conclusion

For those parts of the Curriculum for Excellence for which experiences and outcomes have been defined, it’d be very easy to be very precise about progression, future options, and what resources would be particularly helpful for a particular learner at a particular part of the journey. For the crucial post 16 years, this is not really possible in the same way right now, though it’s arguable that its all the more important to have solid guidance at that stage.

Some judicious information architecture would make a lot more possible without necessarily changing the syllabus across the board. Just a model that connects subject areas across the levels, and school and college tracks would make more robust learner journey guidance possible. Statements that clarify which course is an absolute pre-requisite for another, and which are suggested as likely or preferable would make it better still.

We have the beginnings of a map for learner journeys, but we’re not there yet.

Other than that, I think agreed identifiers and data formats for curriculum parts, electronic portfolios or transcripts and course offerings can enable a whole range of powerful apps of the type that others at the data jam built, and more. Thanks to standards, we can do that without having to rely on a single source of truth or a massive system that is a single point of failure.

Find out all about the other great hacks on the learner journey data jam website.

All the data and bits of code I used are available on github

After a good session with the folks from the Achievement Standards Network (ASN), and earlier discussions with Link Affiliates, I could see the potential of linking LEAP2a portfolios with ASN curriculum information and learning resources. So I implemented a proof of concept.

Fortunately, almost all the information required is already available as RDF: the ASN makes its machine readable curricula available in that format, and Zotero (my bibliography tool of choice) happily puts out its data in RDF too. What still needed to be done was the ability to turn LEAP2a eportfolios into RDF.

That took some doing, but since LEAP2a is built around the IETF Atom newsfeed format, there were at least some existing XSL transformations to build on. I settled on the one included in the open source OpenLink Virtuoso data management server, since that’s what I used for the subsequent Linked Data meshing too. Also, the OpenLink Virtuoso Atom-to-RDF XSLT came out of their ‘sponger’ middleware layer, which allows you to treat all kinds of structured data as if they were RDF datasources. That means that it ought to be possible to built a wee LEAP2a sponger cartridge around my leap2rdf.xslt, which then allows OpenLink Virtuoso to treat any LEAP2a portfolio as RDF.

The result still has limitations: the leap2rdf.xslt only works on LEAP2a records with the new, proper namespace, and it only works well on those records that use URIs, but not those that use Compact URIs (CURIEs). Fixing these things is perfectly possible, but would take two or three more days that I didn’t have.

So, having spotted my ponds of RDF triples and filled one up, it’s time to go fishing. But for what and why?

Nigel Ward and Nick Nicholas of Link Affiliates have done an excellent job in explaining the why of machine readable curriculum data, so I’ve taken the immediate advantages that they identified, and illustrated them with noddy proof-of-concept hows:

1. Learning resources can be easily and unambiguously tagged with relevant learning outcomes.
For this one, I made a query that looks up a work (Robinson Crusoe) in my Zotero bibliographic database and gets a download link for it, then checks whether the work supports any known learning outcomes (in my own 6-lines-of-RDF repository), and then gets a description of that learning outcome from the ASN. You can see the results in CSV.

It ought to have been possible to use a bookmarking service for the learning resource to learning outcome mapping, but hand writing the equivalent of

‘this book’ ‘aligns to’ ‘that learning outcome’

seemed easier

2. A student’s progress can be easily and unambiguously mapped to the curriculum.
To illustrate this one, I’ve taken Theophilus Thistledown’s LEAP2a example portfolio, and added some semi-appropriate Californian K-12 learning outcomes from the ASN against the activities Theophilus recorded in his portfolio. (Anyone can add such ASN statements very easily and legally within the scope of the LEAP2a specification, by the way) I then RDFised the lot with my leap2rdf XSLT.

I queried the resulting RDF portfolio to see what learning outcomes were supported by one particular learning activity, and I then got descriptions of each of these learning outcomes from the ASN, and also got a list of other learning outcomes that belong to the same curriculum standard. That is, related learning outcomes that Theophilus could still work on. This is what the SPARQL looks like, and the results can be seen here. Beware that a table is not the most helpful way of presenting this information- a line and a list would be better.

3. Lesson plans and learning paths to be easily and unambiguously mapped to the curriculum.
This is what I think of as the classic case: I’ve taken an RDFised, ASN enhanced LEAP2a eportfolio, and looked for the portfolio owner’s name, any relevant activities that had a learning outcome mapped against them, then fished out the identifier of that learning outcome and a description of same from the ASN. Here’s the SPARQL, and there’s the result in CSV.

Together, these give a fairly good of what Robinson Crusoe was up to, according to the Californian K-12 curriculum, and gives a springboard for further exploration of things like comparison of the learning outcomes he aimed for then with later statements of the same outcome or the links between the Californian outcomes with those of other jurisdictions.

4. The curriculum can drive content discovery: teachers and learners want to find online resources matching particular curriculum outcomes they are teaching.
While sitting behind his laptop, Robinson might be wondering whether he can get hold of some good learning resources for the learning activities he’s busy with. This query will look at his portfolio for activities with ASN learning outcomes and check those outcomes against the outcome-to-resource mapping repository I mentioned earlier. It will then look up some more information about the resources from the Zotero bibliographic database, including a download link- like so

The nice thing is that this approach should scale up nicely all the way from my six lines of RDF to a proper repository.

5. Other e-learning applications can be configured to use the curriculum structure to share information.
A nice and simple example could be a tool that lets you discover other learners with the same learning outcome as a goal in their portfolio. This sample query looks through both Theophilus and Robinson’s eportfolios, identifies any ASN learning outcomes they have in common, and then gets some descriptions of that outcome from the ASN, with this result.

Lessons learned

Of all the steps in this and other meshups, deriving decent RDF from XML is easily the hardest and most time consuming. Deriving RDF from spreadsheets or databases seems much easier, and once you have all your source data in RDF, the rest is easy.

Even using the distributed graph pattern I described in a previous post, querying across several datasets can still be a bit slow and cumbersome. As you may have noticed if you follow the sample query links, uriburner.com (the hosted version of OpenLink Virtuoso) will take it’s time in responding to a query if it hasn’t got a copy of all relevant datasets downloaded, parsed and stored. Using a SPARQL endpoint on your own machine clearly makes a lot of sense.

Perhaps more importantly, all the advantages of machine readable curricula that Nigel and Nick outlined are pretty easily achievable. The queries and the basic tables they produce took me one evening. The more long term advantages Nigel and Nick point out – persistence of curricula, mapping different curricula to each other, and dealing with differences in learning outcome scope – are all equally do-able using the linked data stack.

Most importantly, though, are the meshups that no-one has dreamed of yet.

What’s next

For other people to start coming up with those meshups, though, some further development needs to happen. For one, the leap2rdf.xslt needs to deal with a greater variety of LEAP2a eportfolios. A bookmark service that lets you assert simple triples with tags, and expose those triples as RDF with URIs (rather than just strings) would be great. The query results could look a bit nicer too.

The bigger deal is the data: we need more eportfolios to be available in either LEAP2a or LEAP2r formats as a matter of course, and more curricula need be described using the ASN.

Beyond that, the trickier question is who will do the SPARQL querying and how. My sense is that the likeliest solution is for people to interact with the results of pre-fabbed SPARQL queries, which they can manipulate a bit using one or two parameters via some nice menus. Perhaps all that the learners, teachers, employers and others will really notice is more relevant, comprehensive and precisely tailored information in convenient websites or eportfolio systems.

Resources

The leap2rdf.xslt is also available here. Please be patient with its many flaws- improvements are very welcome.

Wilbert Kraan

Cetis blog

Category Archives: achievement information

What could a GPS for learner journeys look like?

How to meshup eportfolios, learning outcomes and learning resources using Linked Data, and why