Installing Shibboleth
What and Why Is Shibboleth?
One of the major issues facing all of today’s Internet users is identity management: how to prove to a Web site that you are who you claim to be, and to do so securely enough to prevent anyone else from convincing the Web site that they are you. There are many initiatives attacking the problem, with approaches both technical and legal.
Shibboleth [1] is a relatively new piece of software which concentrates on one specific area: trust management within the Higher Education community and between that community and the academic publishers which service it. In this arena, there is one very important player: the institution. Each individual in the community is part of some institution (university, college, research centre etc.), and this institution knows a great deal about the individual. This means that it becomes possible to separate the information about the individual (i.e. that held by the institution) from the information about the resource (i.e. that held by the publisher); this in turn makes it possible to anonymise the information about the individual in many cases, as most resources only need to know that a user is a member of an institution which holds a valid licence for access. The purpose of Shibboleth is to provide a secure and verifiable channel for the transfer of information about the individual from the institution to the publisher.
The architecture of Shibboleth moves on from the clumsy or over-centralised models which have catered for the needs of the Higher Education community up to this point, and its widespread acceptance shows that this is the time to move forward.
Like many open source software packages, Shibboleth uses (and informs the development of) standards. The most relevant ones are Security Assertion Markup Language (SAML) [2], which Shibboleth uses to pass information about access; and eduPerson [3], an LDAP (Lightweight Directory Access Protocol) object class for describing individuals in Higher Education. eduPerson is, like Shibboleth, a project which is part of the massive Internet2 [4] development initiative. For those unfamiliar with Internet2, its overall brief is to fund development leading to the next generation of the Internet; Shibboleth is part of its Middleware Initiative [5], which focuses on developing the software that sits between the network and applications, and on related standards.
How Does Shibboleth Work?
Shibboleth has an extremely complex architecture, despite only implementing a selection of possible SAML assertions. It divides into two major components, the Identity Provider (known as the origin up until version 1.2) and the Service Provider (formerly the target). As the names suggest, the Identity Provider makes authenticated information about individuals available, while the Service Provider protects a resource, allowing access only to authorised users. Each of the two is in turn divided into several components, which deal first with authentication and then with the provision of user attributes, so that the resource can determine what the user is authorised to access.
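The sequence described above (authentication first, then attribute release, then an access decision) can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class and method names are invented for illustration and are not part of Shibboleth, the opaque handle stands in for the real SAML exchange, and passwords are held in plain text purely to keep the example short.

```python
import secrets


class IdentityProvider:
    """Toy stand-in for an Identity Provider (hypothetical, not Shibboleth code)."""

    def __init__(self, directory):
        # directory maps user ID -> (password, attribute dict); illustration only
        self._directory = directory
        self._handles = {}

    def authenticate(self, user_id, password):
        """Step 1: check credentials and return an opaque, random handle."""
        record = self._directory.get(user_id)
        if record is None or record[0] != password:
            raise PermissionError("authentication failed")
        handle = secrets.token_hex(16)
        self._handles[handle] = user_id
        return handle

    def attributes_for(self, handle):
        """Step 2: release attributes for a handle; the user ID itself is not sent."""
        user_id = self._handles[handle]
        return dict(self._directory[user_id][1])


class ServiceProvider:
    """Toy stand-in for a Service Provider protecting a single resource."""

    def __init__(self, idp, required_affiliation):
        self._idp = idp
        self._required = required_affiliation

    def allow(self, handle):
        """Step 3: fetch attributes and decide whether to grant access."""
        attrs = self._idp.attributes_for(handle)
        return attrs.get("eduPersonScopedAffiliation") == self._required


idp = IdentityProvider({
    "jsmith": ("s3cret", {"eduPersonScopedAffiliation": "member@institution.ac.uk"}),
})
sp = ServiceProvider(idp, "member@institution.ac.uk")
handle = idp.authenticate("jsmith", "s3cret")
print(sp.allow(handle))
```

The point of the design is that the Service Provider only ever sees the handle and the released attributes, which is what makes it possible to anonymise the individual.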
The trust architecture of Shibboleth is based on the idea of a federation. This is a grouping of identity and service providers which agree on such issues as the usage of attributes (e.g. a more precise shared idea of what something like ‘member’ means than is given in the eduPerson specification), and whose members provide metadata for distribution across the federation. Federations can be extremely informal, like the InQueue Federation [6] set up by Internet2 for test purposes, or they can have membership conditions which are strict and complex, like the InCommon Federation [7]. Federations can have a small number of members (the smallest containing one of each type of provider), or a large number (though there are scaling difficulties in finding the relevant identity provider for an individual, which is the most immediately obvious limitation of the Shibboleth software). Within a federation, fine-grained access control is, of course, possible: a Service Provider can deny access to a user even if they have been vouched for by an Identity Provider that it trusts, if the attributes sent to it by the Identity Provider do not show a right to use the resource. Both the Identity and Service Providers can be members of several federations at once.
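The two-stage decision described here (is the Identity Provider trusted through some federation, and do the attributes it sent show a right to use the resource?) might look something like the following. It is a minimal sketch: the `Federation` class, the `authorise` function and the provider identifiers are hypothetical names chosen for the example, and real federation metadata carries far more than a set of identifiers.

```python
from dataclasses import dataclass, field


@dataclass
class Federation:
    """Hypothetical, heavily simplified view of a federation's shared metadata."""
    name: str
    identity_providers: set = field(default_factory=set)


def authorise(idp_id, attributes, federations, accepted_affiliations):
    """Grant access only if the IdP is trusted AND the attributes show a right."""
    # Stage 1: trust. The IdP must appear in a federation the Service Provider belongs to.
    trusted = any(idp_id in fed.identity_providers for fed in federations)
    if not trusted:
        return False
    # Stage 2: fine-grained control. A trusted IdP is not sufficient on its own.
    affiliation = attributes.get("eduPersonScopedAffiliation")
    return affiliation in accepted_affiliations


# A Service Provider that is a member of two federations at once.
test_federation = Federation("test", {"idp.example.ac.uk"})
production_federation = Federation("production", {"idp.example.edu"})
memberships = [test_federation, production_federation]
licensed = {"member@example.ac.uk"}

print(authorise("idp.example.ac.uk",
                {"eduPersonScopedAffiliation": "member@example.ac.uk"},
                memberships, licensed))
print(authorise("idp.example.edu",
                {"eduPersonScopedAffiliation": "member@example.edu"},
                memberships, licensed))
```

The second call is refused even though the Identity Provider is trusted, because the affiliation it asserts is not one covered by a licence.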
Installing the Shibboleth Identity Provider
The Shibboleth Identity Provider requires a complex and up-to-date institutional infrastructure to be present prior to a full installation, and this needs to be planned properly before going ahead. For testing, it is possible to omit a fair amount of what is required in the longer term, but such an installation will not be suitable for serious use. The actual machine is reasonably standard, with a secure Web server and a servlet container (typically Apache and Tomcat, but it has also been tested with IIS for those irretrievably committed to Microsoft). One point that does need to be borne in mind when planning an installation is that time needs to be allocated to obtain a server certificate, which can take up to several weeks, depending on the supplier used and the state of the institutional relationship with them. Test and self-signed certificates may be used, but are not suitable for a production installation. Note that many federations will place limitations on the particular certificate suppliers which can be used for verification of communication between members (because certificate information, including root certificates, also needs to be shared); Shibboleth also uses certificates for other purposes (e.g. for securing HTTP connections), and these are not restricted.
The single requirement which may well lead to a significant amount of work is the need for an institutional repository which provides the data passed on by the identity provider as attributes: an institutional LDAP server is by a long way the most likely solution here. While many institutions own an LDAP server (because most email servers come equipped with one), they are not always activated and are frequently little used except for address look-up from email clients. Most Shibboleth Service Providers will require eduPerson attributes to be passed via the Identity Provider, and so the eduPerson schema needs to be loaded into the LDAP server and the attributes populated, something that in itself may require some debate within an institution (to determine what data fits best into each attribute). At the moment, most resources accessible through Shibboleth do not require anything complicated in terms of attributes, so it is possible to carry out this process in stages; but at the very least, an Identity Provider is going to need to be able to give out a user’s eduPersonScopedAffiliation, and even there the value ‘member@institution.ac.uk’ is likely to be satisfactory. Getting the eduPerson schema and data into the Windows Active Directory LDAP server was the biggest single task involved in implementing the Shibboleth Identity Provider at the London School of Economics; the difficulty of this task will depend on the LDAP server and the ease of making the data available to it from other systems.
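When populating the directory in stages, it can help to check candidate eduPersonScopedAffiliation values before they are loaded. The sketch below assumes the value has the form affiliation@scope, as in the ‘member@institution.ac.uk’ example above; the function name is invented, and the list of permitted affiliations is an assumption based on the eduPerson controlled vocabulary, which should be checked against the version of the specification a federation actually uses.

```python
# Assumed controlled vocabulary; verify against the eduPerson version in use.
PERMITTED_AFFILIATIONS = frozenset(
    {"faculty", "student", "staff", "alum", "member", "affiliate", "employee"}
)


def parse_scoped_affiliation(value):
    """Split 'affiliation@scope' and reject values that are obviously malformed."""
    affiliation, separator, scope = value.partition("@")
    if not separator or not affiliation or not scope:
        raise ValueError(f"expected affiliation@scope, got {value!r}")
    if "@" in scope:
        raise ValueError(f"more than one '@' in {value!r}")
    if affiliation not in PERMITTED_AFFILIATIONS:
        raise ValueError(f"unknown affiliation {affiliation!r}")
    return affiliation, scope


print(parse_scoped_affiliation("member@institution.ac.uk"))
```

A first stage might simply assign ‘member’ to every current user and leave the finer distinctions (staff, student and so on) until a resource actually requires them.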
The documentation suggests that a WebISO installation is also needed, to provide the authentication needed by Shibboleth. In fact, it is possible to get round this; I have installed an Identity Provider using mod_authz_ldap [8] to authenticate directly against the institution’s LDAP server, and this is probably an easier route to follow if the institution has no general need for an internal single sign-on system (such systems being complex software in their own right). If this is the case, it is important that the userID and password used to authenticate to the LDAP server should be ones which are familiar to the user; if it is possible to use the standard network credentials, this is ideal.
For the Identity Provider, there are currently two main official documents which describe the installation process. Both are useful, though both also have deficiencies. One of these, the deployment guide [9], is a general description, and can seem complicated and hard to follow. The other, the checklist [10], is more informal and was developed specifically for the Internet2 Shibboleth Installfest events, at which a roomful of system administrators are guided through the installation process. It describes a simplified installation without the complex institutional prerequisites listed above, and uses informal server certificates, which are easier to obtain but carry no assurance. This means that in places its instructions are not appropriate in the institutional context. There are also a few places where the two documents disagree; this may be confusing, but following one or the other will work in most situations. Neither document really tackles how to get things working if (as is almost certain) the installation doesn’t work first time; for this there is a (sketchy) FAQ [11], or installers can search the archive of the shib-users [12] mailing list - or indeed, post to it themselves. The Shibboleth user community is still small, and it’s a friendly mailing list where the newbie can expect sympathetic treatment.
Testing can be carried out through a federation which allows this (InQueue Federation [6] or SDSS [13] for UK institutions), or (if the installer feels brave) by simultaneous installation of a local Service Provider. Whatever you choose to do, the co-operation of the Service Provider to be used for testing is essential, as it is usually only by a comparison of the error messages at each end that problems can be diagnosed and fixed. Most problems with new installations seem to be related to the metadata that the two providers use in their respective configurations to describe each other, once obvious problems such as incorrect firewall settings are fixed. This is one reason why it is useful to test a new Identity Provider by connecting to a Service Provider which is known to work, and vice versa.
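Since so many problems come down to the two providers holding different descriptions of each other, one practical approach is to compare, field by field, what a provider says about itself with what its partner has configured for it. The following is a rough sketch of that comparison only: the field names (`provider_id`, `endpoint`, `certificate_subject`) and the example values are hypothetical and do not correspond to the actual Shibboleth metadata format.

```python
def metadata_mismatches(own_description, partner_view):
    """List the fields where a provider's self-description differs from
    what its partner has configured for it."""
    problems = []
    for key in sorted(set(own_description) | set(partner_view)):
        mine = own_description.get(key)
        theirs = partner_view.get(key)
        if mine != theirs:
            problems.append(f"{key}: provider says {mine!r}, partner has {theirs!r}")
    return problems


# Hypothetical values: what the Identity Provider believes about itself...
idp_own = {
    "provider_id": "urn:example:idp",
    "endpoint": "https://idp.example.ac.uk/shibboleth/HS",
    "certificate_subject": "CN=idp.example.ac.uk",
}
# ...and what the Service Provider has been told about it.
sp_view_of_idp = {
    "provider_id": "urn:example:idp",
    "endpoint": "https://idp.example.ac.uk/shibboleth/HS",
    "certificate_subject": "CN=shib.example.ac.uk",
}

for problem in metadata_mismatches(idp_own, sp_view_of_idp):
    print(problem)
```

In practice the same check needs to be run in both directions, which is why the co-operation of whoever runs the other provider matters so much.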
Installing the Shibboleth Service Provider
Installing the Service Provider is, for some institutions, likely to be the logical next step once the Identity Provider is working. (I say ‘some’ because not all will feel a need to do so. Possible reasons for such a requirement might be to protect and share resources with other institutions or, more likely in the future perhaps, to provide single sign-on to many resources via an institutional portal.) This is also a complex process, though much easier once the Identity Provider has been installed: many of the same pitfalls can be avoided as a result of this experience.
There are RPMs available, and on Fedora Core 3 (the system I installed it on) these need to be used as they contain a bug fix not yet in the main source code release. (This is something that I only discovered when a source code release failed to work and I searched the mailing list for a solution.) Those wanting to use the RPMs need to be cautious: there were reports on the mailing list of clashes with other RPMs already installed, but this was not a problem I encountered personally.
The current release is also not compatible with the SELinux [14] security policies implemented by default on Fedora Core 3. There is only a deployment guide for this installation [15], but it seems less complete than the companion Identity Provider document, so an installer is likely to need to spend a fair amount of time looking error messages up on the mailing list, particularly if they are not installing the Service Provider to use the InQueue Federation. Some parts of the XML configuration are not documented at all, and need to be inferred from the configuration files distributed with the software.
Conclusions
Shibboleth is a complex piece of software, and its installation is likely to stretch many institutions - particularly when it comes to setting up the infrastructure needed for it. It is essential to plan the process carefully, and to ensure that enough time is allocated. The skills required are at the advanced end of system administration; experience in installing software beyond just downloading RPM files (or the equivalent for your favourite Linux distribution) is necessary, and expertise in handling Web server configuration will also be needed. The difficulty of understanding Shibboleth is heightened by the out-of-date nature of the documentation of the architecture; it is hard to answer questions like ‘what is the role played by cookies?’ without recourse to the mailing list. This documentation is currently being updated ready for the 1.3 release.
The Shibboleth software generally seems to work fine, once configured correctly. Logging is reasonable, and when turned up to ‘debug’ level more than comprehensive enough to help locate a problem. (One issue I experienced with the logging is that, when the Identity Provider is sharing a Tomcat server with other applications, it tends to take over the logging for all of them, so that log messages for the other applications will confusingly also appear in the Shibboleth log files.) So far, there have not been any serious security concerns, and the issues that have been raised have been followed by quick updates fixing the problem. The scalability issue in respect of federations, as mentioned above, is being addressed by the development team. (However, the problem is one of interface design and is slightly out of their hands, as the information enabling a user to find their Identity Provider is likely to be displayed in a different fashion by each federation.)
Shibboleth has been widely accepted as the sensible way forward for authentication in the Higher Education context, being endorsed by national bodies elsewhere in Europe as well as in the UK and US. More publishers are making their material accessible via Shibboleth all the time; for example, an Identity Provider set up with the SDSS Federation can provide access to a number of resources hosted at Edina and MIMAS (depending on your institutional licences) - making a far better demonstration of the success of Shibboleth than the test sites that were previously available. This means that time spent working on a Shibboleth installation is unlikely to be wasted.
The only remaining impediment for institutions, then, is the difficulty of installing Shibboleth. This is a major hurdle, and has prompted eduServ, for example, to begin to write its own version of the software with the aim (in part) of making this a much simpler process. The difficulty is partly that the Identity Provider is software for an inherently complex task, but there are definitely issues that would be worth working on here.
The obvious area for improvement with respect to ease of software installation is the documentation. The existing documentation should be tidied up - though production of documents is notoriously a low priority of open source developers. The production of two install guides (even if intended for different purposes) for the Identity Provider which do not quite agree is even poorer than the average. There have been several independent attempts to produce installation guides, particularly ones which focus on specific varieties of Unix or on Windows, which are worth consulting and which are easily found via a search engine. These often contain details derived from the installation experience which could usefully be fed back into the general, official documentation, as this is seriously lacking in help with problem resolution.
It would not be fair to end without pointing out that work is currently under way to improve the documentation, using a new Wiki-based format (also accessible to members of the SDSS Federation), and this will, I hope, address many of these issues. The current problem-solving model, where members of the development team tend to field queries on the mailing list, cannot continue to be supported as the number of installations increases, so improvements in the documentation of problem resolution are vital for the continued success of the software.
Acknowledgements
Thanks to Masha Garibyan for the diagram, which is derived from one originally produced by SWITCH (Swiss Education and Research Network); to Ian Young, Masha Garibyan and John Paschoud for reading and commenting on a draft; to Ian Young and Fiona Culloch for help in joining the SDSS Federation; to Scott Cantor and Walter Hoehn for technical support.
References
- Shibboleth Project Web site http://shibboleth.internet2.edu/
- SAML Home Page http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=security
- eduPerson Home Page http://www.educause.edu/content.asp?PAGE_ID=949&bhcp=1
- Internet2 Web site http://www.internet2.edu/
- Internet2 Middleware Initiative Home Page http://middleware.internet2.edu/
- InQueue Federation Home Page http://inqueue.internet2.edu/
- InCommon Federation Web site http://www.incommonfederation.org/
- mod_authz_ldap Home Page http://authzldap.othello.ch/
- Shibboleth Identity Provider Deployment Guide http://shibboleth.internet2.edu/guides/deploy-guide-origin1.2.1.html
- Shibboleth Identity Provider Installation Checklist http://shibboleth.internet2.edu/guides/identity-provider-checklist.html The account of one such ‘Installfest’ can be read in Ariadne, issue 42, January 2005: Shibboleth Installation Workshop
- Shibboleth Frequently Asked Questions (FAQ) https://umdrive.memphis.edu/wassa/public/shib.faq/shibboleth-faq.html
- shib-users mailing list: archive https://mail.internet2.edu/wws/arc/shibboleth-users ; joining instructions http://shibboleth.internet2.edu/shib-misc.html#mailinglist
- Shibboleth Development and Support Services Federation http://sdss.ac.uk/
- SELinux Home Page http://www.nsa.gov/selinux/
- Shibboleth Service Provider Deployment Guide http://shibboleth.internet2.edu/guides/deploy-guide-target1.2.1.html