PROD’s Progress

Apart from the previous post about the OpenID implementation it has been a while since I’ve written about PROD so here is the “vision” and some details of what’s happening with the project.

Before we go on I’ve written an FAQ on the PROD wiki which you are all advised to have a look at…

The PROD Vision:

PROD is a dynamic directory of JISC projects providing an easy-to-use way to locate projects and get a view of their current status and activity. Through integration with the Standards Catalogue and e-Framework it will also provide an overview of interoperability standards used by projects and their rationale for doing so.

PROD draws information on projects from a number of sources including the JISC website, individual project sites and project RSS feeds. We have also developed import mechanisms for legacy spreadsheets and catalogues.

The data in PROD can be exported in standard formats (including RSS, Atom, DOAP and CSV) to facilitate re-use in other catalogues.

Progress report

People oriented activities:

We are currently looking at how this data can facilitate integration with efforts at OSS Watch and with the JISC PIM system. We had a meeting in London to discuss how we can leverage DOAP across the different systems to exchange data and avoid duplication of effort. Those present included Ross Gardler from OSS Watch with SIMAL, Yvonne Howard and Dave Millard from Southampton with their e-Framework Knowledge Base, Neil Chue Hong from OMII in Edinburgh, and Simone Spencer who is heading up the JISC PIM. It was pretty satisfying to feel we all agreed that with a bit of work on our respective DOAP implementations we would be able to share core project data and thus concentrate on the more individual value-adding aspects of our projects.

Here in Bolton we are holding a workshop tomorrow on how we plan to use PROD internally to help us with the process of “technical audits” of projects and how we can go about integrating PROD with the other JISC CETIS web offerings.

Ongoing development work:

DOAP, RSS & CSV export for collections of projects through the browse/query interface. We’re also thinking about making widgets to embed this in other places (like the main JISC CETIS site – or your own personal iGoogle or Dashboard if you like!)

OpenID associations for existing users – this is part of the general OpenID implementation across JISC CETIS sites. Currently it works to enable commenting.

Selectively elevated privileges for project staff and programme managers. This will happen automatically through existing data where available; we will also put in a “claim” button for users to assert a relationship to a project where a connection is not already held.

General review of data held, sanitisation particularly around people, organisations, themes. This will include a manual trawl for project sites, feeds etc where they haven’t been auto-discovered. Administrative interfaces may also see some improvement.

Integration with Standards Catalogue. Users (CETIS staff, projects, etc) will be able to associate projects with relevant standards and comment on the rationale for their use or implementation. The standards catalogue bit is working fine now.

Integration with main JISC CETIS sites – highlighting relevant projects within domain pages and other CETIS output (blogs, e-learning focus etc). This activity will be of particular relevance to ongoing comms work including the “technology & standards briefings”.

Highlights of completed development work to date:
(Roughly in order of implementation)

  • Core data model
  • Core interface
  • Old directory import
  • JISC spreadsheet import
  • DOAP export
  • Search interface
  • Funding status indicators
  • AJAX editing (administrators only at the moment)
  • JISC web-scraper
  • RSS feed-scraper
  • Data-sanitisation utilities (for admins)
  • Activity indicators
  • Comments
  • Browse & querying interface
  • OpenID authentication (for commenting)

Down and dirty with OpenID

I’ve spent the last few hours (after getting home from a swift pint in the pub admittedly) having one of those satisfying coding experiences where the dots just start joining up… I took the very nicely written OpenIDenabled PHP library and bolted it on to the authentication routines for PROD.

The technical principles behind OpenID are simple enough: the user tells your application their OpenID URL, the app asks the relevant provider if everything is ok, and the provider comes back and tells the app a whole bunch of stuff saying that the user is kosher (or halal or whatever it says in their profile).

The latest version of the toolkit made this a breeze – coming as it does with working examples and very well documented code. Most of the work was putting in a few new hooks in my authentication script to catch both ends of the transaction, copying and pasting some code from the example scripts to create the consumer object and set it flying and finally catching the response at the end and telling my application that the user is now logged in.

As with most quick work there is still quite a bit of tidying up to do – particularly around how I associate existing users in the LDAP directory with their OpenIDs… At the moment I’m just not bothering. Useful error messages would probably be a good idea too! Testing it with a few different providers is also a must.

One gotcha I discovered was that at some point the exact recipe for doing Delegation must have changed and that the library is more fussy about this than other implementations I’ve seen and used. When testing using my own domain’s delegation which I’ve had set up for years it was consistently failing. This is not good news as there are probably thousands of people who still have it set up exactly as I did…
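
For anyone wanting to check their own delegation page, here is a minimal Python sketch (not part of PROD, which is PHP) that lists the OpenID link tags a page advertises. OpenID 1.1 delegation uses the rel values openid.server and openid.delegate, while OpenID 2.0 uses openid2.provider and openid2.local_id. Whether a page carrying only the older pair is what tripped up the library in my case is a guess, not something I have confirmed. The function and class names are made up for illustration.

```python
from html.parser import HTMLParser

# rel values used for HTML-based discovery in OpenID 1.1 and OpenID 2.0
OPENID1_RELS = {"openid.server", "openid.delegate"}
OPENID2_RELS = {"openid2.provider", "openid2.local_id"}


class DelegationLinkParser(HTMLParser):
    """Collect OpenID-related <link> tags from an HTML page."""

    def __init__(self):
        super().__init__()
        self.links = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        href = attr.get("href")
        # A rel attribute may hold several space-separated values
        for rel in (attr.get("rel") or "").lower().split():
            if href and (rel in OPENID1_RELS or rel in OPENID2_RELS):
                self.links[rel] = href


def find_delegation(html):
    """Report which flavours of delegation markup the page contains."""
    parser = DelegationLinkParser()
    parser.feed(html)
    found = parser.links
    return {
        "links": found,
        "has_openid1": OPENID1_RELS.issubset(found),
        "has_openid2": OPENID2_RELS.issubset(found),
    }
```

Feeding it the source of your delegating page shows at a glance whether both sets of tags are present, which is a cheap first check before blaming the library or the provider.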

Another (Ubuntu-specific) issue was that it was failing to authenticate against Yahoo!’s service because I was missing some bits of OpenSSL… This was fixed with a quick: sudo apt-get install openssl ca-certificates

Now I’ve had a few brushes in recent months with OpenID mainly around the web provision for the XCRI project – where we got OpenID working across WordPress, Mediawiki, and (through some rather cheap hacking) BBpress. It was however reliant on plugins for said apps and never really a very satisfactory experience – generating a long string of complaints from users getting very variable results depending on which provider they were using. Upgrading any particular component of the site seemed to just lead to more chaos.

Sadly I think that these variable experiences do rather detract from the potential that OpenID has to help us all better manage our online identities. That and the insistence of so many “providers” like Yahoo! and WordPress.com that they are just that, providers and not consumers. I’ve already got about 6 OpenIDs on the go without really realising – useful for testing but the exact opposite of the single authentication service goal. Tsk tsk.

Anyway… Now that I’ve actually tackled the problem at a slightly deeper level I’m feeling confident that over time we can not only iron out XCRI’s woes but also introduce OpenID across the JISC CETIS (and IEC) services in a reasonably robust way. The future looks rosy, the sky is blue, thunderclouds? What thunderclouds?

See PROD run! Run PROD run!

As of a couple of days ago PROD has gone live. Not wishing to blow my own trumpet too much but it went out pretty much on schedule too. The project tracking utility (for that is what it is) is at the first of several milestones, resplendent with a new look and feel and a fair amount of the back-end plumbing sorted out. For anyone who is interested, the front end is being done in PHP and I’ve put my Ruby books on the shelf for the time being.

Essentially at this stage the structure consists of a list of projects, each of which may have a multiplicity of properties owing their syntax to the DOAP specification and other variables derived from discussions with JISC. The taxonomies for this are flexible and extra possibilities can be added very easily. Looking at the DOAP RDF schema it would not be hard to add multiple language support too – but perhaps we can keep that for some other time! I’d be interested to know if people might want such a thing.

PROD is designed to derive its data from a range of sources and produce a unified up-to-date view of projects and the activity that is taking place within them. So far there are two data import modules – one for the old e-Learning Framework project database and another for the Excel spreadsheets of projects currently in use at JISC. The former was woefully out-of-date but provided an effective proof that the system was functioning, and the latter refreshingly brings us a relatively fresh data set extending up to December 2007 as well as details of programme managers, funding, dates (rendered useless by Excel sadly) and themes.

This has thrown up a couple of other increments to work on over the next week or so – tighter validation and sanitisation of data for one, and some way of managing the precedence of properties. For example there is currently no way of saying that the data from one source is more authoritative than another, the most recent addition always wins…
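
To make the precedence idea concrete, here is a minimal Python sketch of one way it could work. It is not how PROD does it (PROD currently has no such mechanism, and is written in PHP). The source names, the ranking numbers and the PropertyValue structure are all invented for illustration.

```python
from dataclasses import dataclass

# Hypothetical ranking: a higher number means a more authoritative source.
SOURCE_RANK = {"old_directory": 1, "jisc_spreadsheet": 2, "manual_edit": 3}


@dataclass
class PropertyValue:
    value: str
    source: str
    added: int  # import sequence number; larger means more recent


def resolve(candidates):
    """Pick the winning value: authority first, recency only as a tie-break."""
    return max(
        candidates,
        key=lambda c: (SOURCE_RANK.get(c.source, 0), c.added),
    )
```

With this in place a stale value from the old directory would no longer overwrite a spreadsheet value just because it happened to be imported later; unknown sources get rank 0 and so lose to anything that has been ranked.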

The e-framework integration work (known internally as “development tables”) is also a major push at the moment. I’ve got this pretty much worked out conceptually and am hoping to have a first cut at functionality by the end of the month.

The next major iteration will also see the logging and display of new activity, starting with property updates and increasing in scope as more input sources are added. This will mean a corresponding change to the front page, transforming the list of “active projects” into an activity list or “mini-feed”. Keeping this well attenuated (i.e. relevant) may take some tweaking but it hopefully will make a good at-a-glance view of exactly what is going on in project world.

PROD me until I squeak

I’ve spent some time over the past weeks thinking about and writing up a new specification for the Project Directory – now known as PROD. As previously discussed in my post entitled Out of my mind on doap and ohloh (my god that was in March!) it’s all about drawing in project information from a range of sources, twisting it about a bit, analysing it and producing metrics and presenting it in a friendly, socially-enabled way.

The challenge of taking this sea of information (something that is inherently large and complex) and attenuating it until it is easily digestible chimes rather well with much of what we have been discussing in our newly materialised department, the Institute for Educational Cybernetics here at Bolton. By taking a modular approach to digesting the information produced by and about a project I’m envisaging “boiling it all down” to a series of activity indicators showing (for example) that a given project has made a high number of commits to its SVN repository over the last month, relative of course to how many commits all the other projects have made. Other metrics would include a buzz-o-meter to measure general web activity referring to the project (sourced from places such as Technorati and Delicious).
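
As a rough illustration of what “relative to all the other projects” might mean, here is a small Python sketch that turns raw monthly commit counts into a score between 0 and 1. The scoring rule (the fraction of projects a project matches or out-commits) is my assumption for the sake of example, not a decided design, and the function name is made up.

```python
def relative_activity(commits_by_project):
    """Score each project by the fraction of projects it matches or out-commits.

    The result is between 0 and 1 (the project itself counts, so the busiest
    project always scores 1.0).
    """
    counts = list(commits_by_project.values())
    total = len(counts)
    return {
        name: sum(1 for c in counts if c <= own) / total
        for name, own in commits_by_project.items()
    }
```

A rank-based score like this is less easily distorted by one hyperactive project than a simple share of total commits would be, which matters if the indicator is meant to be read at a glance.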

In terms of the project itself it’s going to be done in a rapid and sensible kind of way with regular monthly milestones for new functionality! There is a bit of a discussion going on about platforms (Rails or PHP? I’m desperate to learn Rails and this will be a good opportunity! On the other hand I code PHP in my sleep…)

PROD itself! (prod.cetis.org.uk)

Trac instance (trac.cetis.org.uk/trac.cgi/prod)
Including mockups, milestones, wikiness, tickets and all manner of trac goodness

Out of my mind on DOAP and OHLOH

One of my main projects at the moment is to devise and ultimately be part of implementing a new all-singing all-dancing project tracking system. The starting point for this is of course the one I prepared earlier which consists of a flat-ish database of the JISC-funded projects I’m interested in (not by any means all of them) mashed up with the MagpieRSS parser. So you get the projects, their recent blog posts, and aggregations of them.

The issues with it are around:

  • coverage – only a small subset are currently included
  • maintenance – new projects need adding, old projects need reviewing, there is no admin interface
  • added value – various kinds of information would add to the usefulness of the site
    • comments, ratings and reviews
    • more links and aggregations from blogs, wikis, repositories etc
    • relationships between projects – same developers, similar category
    • relationships with interop standards – it uses FOAF, it uses IMS QTI etc (this should be linking in with the e-framework service expression definitions)
    • indications of code quality and other metrics

So to some research – how might we go about developing this, and what exists out there in the same space?

DOAP

DOAP or Description Of A Project is a useful looking RDF-XML spec for describing projects. It has elements for descriptions, names, URLs (including those for source repositories) and person information by hooking in the FOAF spec.
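
To give a flavour of what a DOAP description looks like, here is a minimal Python sketch that builds one with the standard library. The element names (doap:Project, doap:name, doap:shortdesc, doap:homepage, doap:maintainer, foaf:Person, foaf:name) come from the DOAP and FOAF vocabularies; the function name and its choice of fields are just an illustration, and a real description would carry far more.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DOAP_NS = "http://usefulinc.com/ns/doap#"
FOAF_NS = "http://xmlns.com/foaf/0.1/"

# Use the conventional prefixes in the serialised output
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("doap", DOAP_NS)
ET.register_namespace("foaf", FOAF_NS)


def build_doap(name, homepage, shortdesc, maintainer_name):
    """Return a small DOAP description as an RDF/XML string."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    project = ET.SubElement(root, f"{{{DOAP_NS}}}Project")
    ET.SubElement(project, f"{{{DOAP_NS}}}name").text = name
    ET.SubElement(project, f"{{{DOAP_NS}}}shortdesc").text = shortdesc
    # URLs are expressed as rdf:resource attributes rather than element text
    ET.SubElement(
        project, f"{{{DOAP_NS}}}homepage", {f"{{{RDF_NS}}}resource": homepage}
    )
    # Person information is hooked in via FOAF
    maintainer = ET.SubElement(project, f"{{{DOAP_NS}}}maintainer")
    person = ET.SubElement(maintainer, f"{{{FOAF_NS}}}Person")
    ET.SubElement(person, f"{{{FOAF_NS}}}name").text = maintainer_name
    return ET.tostring(root, encoding="unicode")
```

Calling build_doap("Example Project", "http://example.org/", "A made-up project", "A. Developer") returns a single rdf:RDF document containing one doap:Project.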

There are a couple of models by which we could integrate this into a project tracker:

  1. Host the DOAPs: Projects and staff fill in a form on the tracker site – the tracker site produces (persistent) DOAP XML.
  2. Aggregate DOAPs: Projects host their own DOAP files – instead of filling out the form on the tracker site they can simply point it to their hosted file – the tracker then picks up the project details, feeds etc. The hosted files would be periodically spidered for updates. The files can be generated by a third party tool (doap-a-matic).
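
The aggregation model boils down to fetching each project’s hosted file and pulling out the fields the tracker cares about. Here is a minimal Python sketch of the parsing half, assuming the file is plain RDF/XML with a doap:Project element either at the root or directly under rdf:RDF. The function name and the choice of fields are illustrative only, and fetching the file (for example with urllib) and scheduling the periodic re-spider are left out.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DOAP = "http://usefulinc.com/ns/doap#"
NS = {"rdf": RDF, "doap": DOAP}


def read_doap(xml_text):
    """Extract a few core fields from a DOAP document, or None if no project."""
    root = ET.fromstring(xml_text)
    # doap:Project may be the document root or nested under rdf:RDF
    if root.tag == f"{{{DOAP}}}Project":
        project = root
    else:
        project = root.find("doap:Project", NS)
    if project is None:
        return None
    homepage = project.find("doap:homepage", NS)
    return {
        "name": project.findtext("doap:name", default="", namespaces=NS),
        "shortdesc": project.findtext("doap:shortdesc", default="", namespaces=NS),
        # The homepage URL lives in an rdf:resource attribute
        "homepage": homepage.get(f"{{{RDF}}}resource") if homepage is not None else None,
    }
```

A file that parses but contains no doap:Project returns None, which gives the tracker a simple signal for flagging projects whose DOAP has gone missing or stale.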

The aggregation approach is rather attractive from the point of view that projects become responsible for their own information. It is unattractive for the reason that projects may not bother to maintain such files properly. There is a further positive argument to say that if they don’t maintain their DOAP files, they should just be considered worthless and dead – such tough love might be just what they need.

As I have alluded to in earlier posts I’ve had a couple of discussions with Ross Gardler from OSS Watch who is also engaged in activities around tracking JISC’s projects. He is also interested in using DOAP to achieve this in combination with his beloved Apache Forrest.

DOAP me up: some useful DOAP resources

  • The DOAP home page
  • Doap-a-matic web-form to generate a DOAP
  • There SHOULD be a validator service; however, it doesn’t seem to exist these days. I suspect link-rot… which doesn’t exactly inspire confidence in the whole DOAP initiative :(

Ohloh

Then again there is always the question – why are we bothering at all with our own tracker when there are better solutions out there in the world? One such is Ohloh.net which does much of what we need: user comments, reviews, feed aggregation and general project information – and it really does the business when it comes to automated analysis of source code. I ran it over a little open source project I created and was delighted to learn that my efforts would be worth over 100 thousand dollars if I were being paid, that my code is mainly written in PHP and that there is a small development team of 2 people.

Samxom in Ohloh

This is just marvellous – and could even be used directly in combination with instructions to projects to employ a little careful tagging. The word JISC perhaps might do the job. While it might be very web-2 and very trendy, the con with this is that it is out of our control – and I’m not quite sure of the provenance and policy of Ohloh.