How on earth do I add OpenID to my LDAP schema

Okay – this is bugging me.

The scenario is as follows: I have an OpenLDAP directory with several hundred users in it. For the records I’m using the normal inetorgperson schema.

I want to add an openid attribute for my users (in a responsible and proper way) so that I can associate users with multiple arbitrary external OpenID providers.

All I’ve managed to find on the net about this was a blog post at Oracle discussing how this is an issue and how it would be a really good idea to do something about it.

I’m all at sea – how on earth am I supposed to do this? Do I create a new subclass of inetorgperson and migrate everyone on to it? Can I do this without breaking everything? Do I hackily use the “labeledURI” attribute and just shove things in there?
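For what it is worth, here is a rough sketch of what the “shove it in labeledURI” option might look like in practice. It only builds the text of an LDIF modify record; it does not talk to a directory. The DN, the example URLs and the "openid" label are all made up for illustration, and it assumes the URLs are plain ASCII with no spaces, so no base64 encoding of values is needed.

```python
def openid_ldif(dn, openid_urls, label="openid"):
    """Build an LDIF 'modify' record adding one labeledURI value per OpenID URL."""
    lines = [f"dn: {dn}", "changetype: modify", "add: labeledURI"]
    for url in openid_urls:
        # A labeledURI value is the URI, then a space, then a free-text label.
        # The "openid" label is an assumed convention, not a standard one.
        lines.append(f"labeledURI: {url} {label}")
    lines.append("-")  # terminates this modification block
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(openid_ldif(
        "uid=jdoe,ou=people,dc=example,dc=org",  # hypothetical entry
        ["https://jdoe.example.net/", "https://openid.example.com/jdoe"],
    ))
```

Because labeledURI is multi-valued, this copes with several providers per user, but any consumer has to filter on the label, which is exactly the hackiness in question.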

Come on lazyweb!

See PROD run! Run PROD run!

As of a couple of days ago PROD has gone live. Not wishing to blow my own trumpet too much, but it went out pretty much on time and on schedule too. The project tracking utility (for that is what it is) is at the first of several milestones, resplendent with a new look and feel and a fair amount of the back-end plumbing sorted out. For anyone who is interested, the front end is being done in PHP and I’ve put my Ruby books on the shelf for the time being.

Essentially at this stage the structure consists of a list of projects, each of which may have a multiplicity of properties owing their syntax to the DOAP specification and other variables derived from discussions with JISC. The taxonomies for this are flexible and extra possibilities can be added very easily. Looking at the DOAP RDF schema it would not be hard to add multiple language support too – but perhaps we can keep that for some other time! I’d be interested to know if people might want such a thing.

PROD is designed to derive its data from a range of sources and produce a unified up-to-date view of projects and the activity that is taking place within them. So far there are two data import modules – one for the old e-Learning Framework project database and another for the Excel spreadsheets of projects currently in use at JISC. The former was woefully out-of-date but provided an effective proof that the system was functioning, and the latter refreshingly brings us a relatively fresh data set extending up to December 2007 as well as details of programme managers, funding, dates (rendered useless by Excel sadly) and themes.

This has thrown up a couple of other increments to work on over the next week or so – tighter validation and sanitisation of data for one, and some way of managing the precedence of properties. For example there is currently no way of saying that the data from one source is more authoritative than another, the most recent addition always wins…
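One way the precedence problem could be approached is sketched below, in Python rather than the PHP the tracker is written in. It is not how PROD works: the source names, the ranking table and the tuple layout are all invented for the example. Each imported value carries its source and import date; the more authoritative source wins, and recency only breaks ties between equally ranked sources.

```python
# Hypothetical ranking: lower number = more authoritative source.
SOURCE_RANK = {"jisc_spreadsheet": 0, "elf_database": 1}


def resolve(claims, rank=SOURCE_RANK):
    """Pick one value per property from (source, property, value, imported_at) tuples."""
    best = {}
    for source, prop, value, imported_at in claims:
        # Unknown sources rank below every known one.
        # Tuples compare element by element: authority first, then recency.
        key = (-rank.get(source, len(rank)), imported_at)
        if prop not in best or key > best[prop][0]:
            best[prop] = (key, value)
    return {prop: value for prop, (key, value) in best.items()}


if __name__ == "__main__":
    claims = [
        ("jisc_spreadsheet", "manager", "A. Smith", "2007-12-01"),
        ("elf_database", "manager", "B. Jones", "2008-01-15"),
        ("elf_database", "homepage", "http://example.org/p", "2008-01-15"),
    ]
    print(resolve(claims))
```

Here the spreadsheet's value for "manager" survives even though the other import is newer, while "homepage" is taken from the only source that supplies it.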

The e-framework integration work (known internally as “development tables”) is also a major push at the moment. I’ve got this pretty much worked out conceptually and am hoping to have a first cut at functionality by the end of the month.

The next major iteration will also see the logging and display of new activity, starting with property updates and increasing in scope as more input sources are added. This will mean a corresponding change to the front page, transforming the list of “active projects” into an activity list or “mini-feed”. Keeping this well attenuated (i.e. relevant) may take some tweaking but it hopefully will make a good at-a-glance view of exactly what is going on in project world.

Out of my mind on DOAP and OHLOH

One of my main projects at the moment is to devise and ultimately be part of implementing a new all-singing all-dancing project tracking system. The starting point for this is of course the one I prepared earlier which consists of a flat-ish database of the JISC-funded projects I’m interested in (not by any means all of them) mashed up with the magpie rss parser. So you get the projects, their recent blog posts, and aggregations of them.

The issues with it are around:

  • coverage – only a small subset are currently included
  • maintenance – new projects need adding, old projects need reviewing, there is no admin interface
  • added value – various kinds of information would add to the usefulness of the site
    • comments, ratings and reviews
    • more links and aggregations from blogs, wikis, repositories etc
    • relationships between projects – same developers, similar category
    • relationships with interop standards – it uses FOAF, it uses IMS QTI etc (this should be linking in with the e-framework service expression definitions)
    • indications of code quality and other metrics

So to some research – how might we go about developing this, and what exists out there in the same space?


DOAP or Description Of A Project is a useful looking RDF-XML spec for describing projects. It has elements for descriptions, names, URLs (including those for source repositories) and person information by hooking in the FOAF spec.

There are a couple of models by which we could integrate this into a project tracker:

  1. Host the DOAPs: Projects and staff fill in a form on the tracker site – the tracker site produces (persistent) DOAP XML.
  2. Aggregate DOAPs: Projects host their own DOAP files – instead of filling out the form on the tracker site they can simply point it to their hosted file – the tracker then picks up the project details, feeds etc. They would be periodically spidered for updates. The files can be generated by a third party tool (doap-a-matic).
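To give a feel for the aggregation model, here is a minimal Python sketch of the "pick up the project details" step. The sample document and the read_doap helper are invented for illustration, and the fetching and periodic spidering are left out. It uses a plain XML parser and assumes the simple layout of a Project element wrapped in rdf:RDF; RDF/XML allows other serialisations, so a real aggregator would probably want a proper RDF library.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "doap": "http://usefulinc.com/ns/doap#",
}
RDF_RESOURCE = "{%s}resource" % NS["rdf"]

# A made-up DOAP file of the kind a project might host.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://usefulinc.com/ns/doap#">
  <Project>
    <name>Example Tracker</name>
    <shortdesc>A made-up project.</shortdesc>
    <homepage rdf:resource="http://example.org/tracker"/>
  </Project>
</rdf:RDF>"""


def read_doap(xml_text):
    """Extract a few basic fields from a DOAP document given as a string."""
    root = ET.fromstring(xml_text)
    # The Project element may be the root itself or a child of rdf:RDF.
    if root.tag == "{%s}Project" % NS["doap"]:
        project = root
    else:
        project = root.find("doap:Project", NS)
    if project is None:
        raise ValueError("no doap:Project element found")

    def text(tag):
        el = project.find("doap:" + tag, NS)
        return el.text.strip() if el is not None and el.text else None

    homepage = project.find("doap:homepage", NS)
    return {
        "name": text("name"),
        "shortdesc": text("shortdesc"),
        "homepage": homepage.get(RDF_RESOURCE) if homepage is not None else None,
    }


if __name__ == "__main__":
    print(read_doap(SAMPLE))
```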

The aggregation approach is rather attractive from the point of view that projects become responsible for their own information. It is unattractive for the reason that projects may not bother to maintain such files properly. There is a further positive argument to say that if they don’t maintain their DOAP files, they should just be considered worthless and dead – such tough love might be just what they need.

As I have alluded to in earlier posts I’ve had a couple of discussions with Ross Gardler from OSS Watch who is also engaged in activities around tracking JISC’s projects. He is also interested in using DOAP to achieve this in combination with his beloved Apache Forrest.

DOAP me up: some useful DOAP resources

  • The DOAP home page
  • Doap-a-matic web-form to generate a DOAP
  • There SHOULD be a validator service, however it doesn’t seem to exist these days. I suspect link-rot… which doesn’t exactly inspire confidence in the whole DOAP initiative :(


Then again there is always the question – why are we bothering at all with our own tracker when there are better solutions out there in the world? One such is Ohloh, which does much of what we need: user comments, reviews, feed aggregation and general project information – and it really does the business when it comes to automated analysis of source code. I ran it over a little open source project I created and was delighted to learn that my efforts would be worth over 100 thousand dollars if I were being paid, that my code is mainly written in PHP and that there is a small development team of 2 people.

Samxom in Ohloh

This is just marvellous – and could even be used directly in combination with instructions to projects to employ a little careful tagging. The word JISC perhaps might do the job. While it might be very web-2 and very trendy, the con with this is that it is out of our control – and I’m not quite sure of the provenance and policy of Ohloh.

Fixing feeds

Last week I managed to get round to doing several items on the Web Tasklist (private wikipage) including sorting out all the JISC CETIS site news feeds. This covers the main feed from the front page, feeds organised by tag, and feeds from the events system. Needless to say they are now all validating nicely and easily locatable by all your favourite aggregators.

The real stick in the mud with producing the feeds turned out to be the precise formatting of dates. The Atom 1.0 spec requires dates to be formatted according to RFC3339 and the various flavours of RSS require a variation of RFC822. All very well I think, I have the mighty Smarty templating engine running atop PHP. All I need to do is ask it to format the dates using the built-in date format conversion support, isn’t it. But no, that would be too easy.

Smarty has a useful modifier plugin called date_format which converts incoming dates (from PHP or MySQL native date formats) into anything you might want. It is essentially a wrapper for the PHP strftime function, taking the same format instructions as the C function of the same name. So I start concocting format strings for the two RFCs in question and trying to get them to validate.

I also tried using PHP’s date() function – this takes a completely different syntax to produce the desired output including useful constants for such standard dates. Not that they were any help either!

Atom (rfc3339)

PHP function | Format string       | Sample output             | Problem
strftime()   | %Y-%m-%dT%H:%M:%SZ  | 2007-02-12T17:01:07Z      | The Z is a fudge – the time might not actually be in the UTC timezone
strftime()   | %Y-%m-%dT%H:%M:%S%Z | 2007-02-12T17:01:07UTC    | No – UTC is not valid…
strftime()   | %Y-%m-%dT%H:%M:%S%z | 2007-02-12T17:01:07+0000  | Using lowercase %z is better, but the colon is missing from the time zone
date()       | DATE_ATOM           | 2007-02-12T17:01:07+00:00 | It’s right!

RSS (rfc822)

PHP function | Format string            | Sample output                   | Problem
strftime()   | %a, %d %b %Y %H:%M:%S %Z | Thu, 15 Feb 2007 17:12:23 UTC   | Produces ‘UTC’ as the time zone – this is not allowed in the RFC
strftime()   | %a, %d %b %Y %H:%M:%S %z | Thu, 15 Feb 2007 17:12:23 +0000 | Using the undocumented lowercase %z produces the right output!
date()       | DATE_RSS                 | Thu, 08 Feb 2007 13:05:37 UTC   | NO NO NO Not UTC! It should be right for goodness sake!
date()       | D, d M Y H:i:s O         | Thu, 08 Feb 2007 13:05:37 +0000 | It’s right!
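For comparison, the two target formats can be sketched in a few lines of Python. This is not the Smarty plugin, just an illustration of the output the validators expect; the function names atom_date and rss_date are made up. The sketch assumes timezone-aware datetimes, since a naive one gives no offset to print.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def atom_date(dt):
    """RFC 3339 timestamp with a colon in the offset, as Atom requires."""
    if dt.tzinfo is None:
        raise ValueError("need a timezone-aware datetime")
    return dt.isoformat(timespec="seconds")


def rss_date(dt):
    """RFC 822-style date with a numeric offset rather than 'UTC'."""
    if dt.tzinfo is None:
        raise ValueError("need a timezone-aware datetime")
    return format_datetime(dt)


if __name__ == "__main__":
    stamp = datetime(2007, 2, 12, 17, 1, 7, tzinfo=timezone.utc)
    print(atom_date(stamp))  # 2007-02-12T17:01:07+00:00
    print(rss_date(stamp))   # Mon, 12 Feb 2007 17:01:07 +0000
```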

I find this state of affairs pretty silly really – bunging some dates in standard formats into a feed should be a trivial nothing and not something that takes hours of faffing to get quite right. I ended up writing a new Smarty wrapper for the date() function with support for the useful constants and correction for RSS and RFC822 dates. The Smarty plugin is attached.

Download: Smarty plugin phpdate_format

And finally our feeds validate. Touch wood.

iCalendar Gotchas

Another development in progress here on the CETIS site is a calendaring and event-registration system. We’ve built a database – and various members of staff have provided some events to put in it… For our next trick we need to get the data out again.

The main player in interchange formats for calendaring is the iCalendar spec (aka rfc2445). The format neatly imports into iCal on the Mac or Google Calendar or most other modern calendaring systems…

HOWEVER when cranking the information from the database into iCalendar format I came across several important gotchas which I may otherwise not have noticed.

1) The calendars must be encoded as LATIN 1 with Windows-style CRLF line breaks. Not doing this causes iCal to barf on the file.
2) You need to watch out for carriage returns generally. For example an event description spread over several lines can cause problems – or missing one out between fields for that matter.
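
The two gotchas above can be illustrated with a short Python sketch. It is not the code behind the CETIS feed: the function names, the event dictionary keys and the PRODID value are made up for the example. It joins every content line with CRLF and escapes line breaks inside text values so that a multi-line description cannot split a field. It leaves out the folding of long lines that the spec also asks for, and it returns a string, so the choice of byte encoding is left to the caller.

```python
def escape_text(value):
    """Escape a text value: backslash, ';', ',' and embedded line breaks."""
    value = value.replace("\\", "\\\\")
    value = value.replace(";", "\\;").replace(",", "\\,")
    # Normalise every kind of line break, then write it as a literal \n.
    value = value.replace("\r\n", "\n").replace("\r", "\n")
    return value.replace("\n", "\\n")


def build_calendar(events, prodid="-//example.org//events sketch//EN"):
    """Return iCalendar text for a list of event dicts (hypothetical keys)."""
    lines = ["BEGIN:VCALENDAR", "VERSION:2.0", "PRODID:" + prodid]
    for ev in events:
        lines += [
            "BEGIN:VEVENT",
            "UID:" + ev["uid"],
            "DTSTAMP:" + ev["dtstamp"],
            "DTSTART:" + ev["dtstart"],
            "SUMMARY:" + escape_text(ev["summary"]),
            "DESCRIPTION:" + escape_text(ev["description"]),
            "END:VEVENT",
        ]
    lines.append("END:VCALENDAR")
    # Every content line, including the last, ends with CRLF.
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    print(build_calendar([{
        "uid": "event-1@example.org",
        "dtstamp": "20070215T171223Z",
        "dtstart": "20070301T100000Z",
        "summary": "Example meeting",
        "description": "Line one\nLine two, with a comma",
    }]))
```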

With these issues more-or-less sorted we now have a (hopefully) working cetis-events ical feed.