Controlling Access in the Electronic Library
Abstract
The growth of networking and the Internet has led to more and more information resources being funded centrally by JISC (the Joint Information Systems Committee of the Higher Education Funding Council) and provided from a single or limited number of locations to the whole of the academic community. Centralised networked services such as these have been a fact of life in commercial organisations for many years and this model is now being adopted by government agencies like the NHS.
The centralisation of services naturally offers both cost and organisational benefits; however, it brings with it a complex user authentication problem which stems from the need to control access to a service with a large number of users, across many sites all requiring access.
While registration of users on each host system has been an effective approach in the past, the inter-networked community is already millions strong and still growing. Registering every user separately on every host is likely to cause significant problems at this scale, and a more practical model is needed. Developing such a system is not trivial, but the problem must be solved in a coherent way before the benefits of distributed networked resources can be fully realised.
Introduction & Environment
The term "libraries without walls" has been used both within and outside of eLib (the Electronic Libraries Programme) for the development of electronic libraries with multiple network centric points of access. These 'wall-less' libraries offer unique opportunities to share and distribute knowledge and information widely and concurrently to many interested parties around the world; they also offer manifold challenges in terms of:
- copyright and intellectual property rights administration,
- usage monitoring (and user profiling),
- access restriction,
- charging.
All of these are at least partially resolved by a user-based authentication system, which in a conventional library is provided by the library card. In some libraries the card not only identifies the member to the librarian when a book is taken out on loan; it also admits the member to the library itself (where the entrance is policed by card readers) and to specific areas such as the copying room, video viewing rooms or computing facilities, and it confirms identity and status to staff when a member asks to view or remove 'restricted material'.
The bulk of these uses are administrative in function. However, such data as is available in the larger university or municipal library may now be used for profiling and usage monitoring, and thus for the improvement of service. The issue of 'restricting access' is often seen as a philosophical problem by those who hold that knowledge should be as free and widely available as the learning gained from it is desirable. Such people often see the advent of computing as the opening of the final door in their quest, because it makes high-fidelity, high-quality information resources available to an audience that need not be restricted by print run or by economic 'cost or print' considerations.
It is likely however that few would agree to the wide and unrestricted distribution of the clinical photographs of child abuse [1], the photographs of human dissection [2] or perhaps the detailed instructions for the manufacture of poisons or drugs. Some would go further or set wider briefs or definitions of restricted material. For the scope of this discussion it is simply necessary to confirm that any single such restriction may be a requirement.
The technical and practical need for authentication flows from the requirement for any restriction and remains largely unchanged by the scope or detail of such restrictions. It is however true that most libraries have books (or other material) which for some reason are of restricted availability - being lent only to those with adequate reason (usually defined by some local procedure/national legislation) to view them.
These issues, which are not restricted to academic computing, are just as relevant to the wider community. Internet commerce is struggling to provide a secure method for transmitting payments across the Internet, some of which require a confirmed identity for one or both of the parties involved in the transaction. Many other information providers are becoming increasingly interested in restricting portions of their services to certain pre-registered users, in order that they may provide information that is either confidential or commercially restricted; still more people are concerned about the lack of security with the current standards for e-mail, leading to technologies such as PGP and S/MIME becoming more prevalent as methods of authenticating sender and recipient identity and where necessary providing encryption. Many groups, including the W3C, are investigating the opportunities for progress in these areas and those of digital signatures [3] as well as other international standards such as X.509.
The Problem
Most authentication systems require some form of electronic user registration and subsequent access control and monitoring. The concept of username (identifier) and password (authenticator) in computing is as widespread as the use of the library card in libraries; however, it is seldom used in Internet information services. This arises for several reasons:
- Initially, many services were established to serve information with few or none of the above concerns - the information was meant to be freely and anonymously available.
- Where restriction was seen as necessary, it was often applied on a site/campus wide level (e.g. only on-site hosts were given access to some resources).
- There are several problems of scale and detail in registering and securing access on a public network, which may have thousands, if not hundreds of thousands, of users. To give a measure of the scale of the problem for the eLib programme, MIDRIB [4], a system aimed at medicine, typically a small faculty, has a potential 800,000 users. This is significantly larger than the task of registering users for a typical campus's computing facilities (e.g. the University of Bath, with 6000 undergraduates, 3000 postgraduate students and some 2000 staff) or for most corporate computing systems. Even some of the bigger access-controlled WWW sites do not have that many users at present: The Times newspaper, whose electronic site has been established for some two years, recently (Jan 1997) reached 500,000 users [5].
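The site-wide restriction described in the second point above (only on-site hosts are given access) can be illustrated with a minimal sketch. The network ranges and the function name below are assumptions made purely for illustration; they are not taken from any service discussed in this article.

```python
import ipaddress

# Hypothetical on-site address ranges (documentation-only networks);
# a real service would load these from its own local configuration.
ON_SITE_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_on_site(client_address: str) -> bool:
    """Return True if the client address lies inside any on-site network."""
    address = ipaddress.ip_address(client_address)
    return any(address in network for network in ON_SITE_NETWORKS)
```

A check of this kind says where a request comes from, not who is making it, which is why it cannot support per-user monitoring or charging.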
Most existing WWW servers offer a link to Operating System (OS) level security, and relatively small numbers of users (thousands as opposed to tens or hundreds of thousands) are recommended as the optimum operational size for the UNIX password file. The type of electronic reference service typified by the eLib programme may, either immediately or in the future, need to be 'mirrored', 'cached' or otherwise 'distributed'. Performing these duties with the existing OS-led authentication model poses significant problems. Indeed only two systems that distribute OS-level authentication are in common use: NIS and NIS+ from Sun are available for many flavours of UNIX, while NDS from Novell is available for their proprietary NetWare and UnixWare operating systems (UnixWare is now distributed and maintained by SCO and is converging with SCO OpenUnix). The latter is much more sophisticated than the former, and there is some evidence that even Sun believe so: they have recently licensed NDS from Novell for implementation in a future release of Solaris [6].
Consequently, many of the resulting resources have grown organically, driven by local need, into databases in the broadest sense of the word i.e. stored and structured data-sets with or without access control or monitoring. Some national services like Bath Information Data Services (BIDS) and National Information Services and Systems (NISS) have group usernames which are allocated at institution level. This type of system does not typically identify an individual, rather it locates a user at an institution, where the institution is known to the service. Commonly, users are left with one campus username, one or more usernames for national services and a larger number of 'partially' self selected usernames for other services, e.g. The Times [7] and New Scientist [8]. The problem is not confined to electronic reference works or journals; indeed the problem is not restricted to academia - most hospitals have more than one system (e.g. Administration and Pathology) as do many companies.
The Need
To summarise the requirements so far: there appears to be a need for a locally administered, centrally managed, distributed authentication system which need not be accessed through the OS, although it could be. Such a system would need to be secure enough to control access to the systems themselves as well as the information they contain, support a data-set rich enough to contain the majority of information required by any local system, and be fast enough not to militate against regular and widespread usage. Self replication in such a system would also be desirable as it could provide for increased performance and more localised network traffic (so reducing the strain on the wide area). Local administration in such a model would permit accurate and timely data collection while minimising duplication of central or departmental administrative overhead and data re-entry. Such a system, given sufficient import and export tools, could even become a master index used by any or all institutions. Indeed such a local 'directory' of information would offer a resource of additional value to both the local institution and the 'world' as a whole.
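To make these requirements more concrete, the following is a minimal in-memory sketch of a directory in which each institution administers only its own entries and the contents can be copied to replicas for local lookups. All class, method and attribute names are hypothetical; it does not describe NDS, LDAP or any other existing product, and it ignores security and network transport entirely.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One person's record; the attribute set is purely illustrative."""
    user_id: str
    institution: str
    attributes: dict = field(default_factory=dict)


class Directory:
    """Hypothetical directory: local administration, replicated reads."""

    def __init__(self):
        self._entries = {}

    def upsert(self, admin_institution: str, entry: Entry) -> None:
        # Local administration: an institution may only write its own entries.
        if entry.institution != admin_institution:
            raise PermissionError("institutions may only edit their own entries")
        self._entries[entry.user_id] = entry

    def lookup(self, user_id: str):
        # Returns the matching entry, or None if the user is unknown.
        return self._entries.get(user_id)

    def replicate_to(self, replica: "Directory") -> None:
        # Replication: push a full, independent copy to a replica so that
        # lookups can be answered close to the user.
        replica._entries = copy.deepcopy(self._entries)
```

In this sketch a campus would call `upsert` for its own members, and a service provider would call `lookup` against whichever replica is nearest; a real system would replicate incrementally rather than copying everything.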
Conclusions and a Way Forward
The academic community has made some inroads into investigating the issues surrounding mass and wide scale implementation of an 'open' authentication service. The work commissioned by the JISC on Technologies to support Authentication in Higher Education [9] is a valuable study of the available technology and gives a useful survey of providers' requirements in the academic community. It does not however offer firm recommendations as to a single coherent solution. Some existing academic information providers such as BIDS and NISS have developed their own infrastructures for the deployment of service wide authentication. The NISS system, ATHENS [10], has evolved from a system based around departmental rather than individual level authentication (though it will now handle the latter) and is centralised rather than distributed in concept, though some of the administration of user IDs can be devolved to libraries in other institutions. It does however appear to provide a useful interim solution to many of the problems raised above.
Authentication services on any significant scale will require the widespread introduction of directory systems. In addition there will be a requirement for the secure (probably encrypted) transfer of data between the various system components. Recent increased interest in directory systems, in the form of the Lightweight Directory Access Protocol (LDAP) and the success of Novell's 'Novell Directory Services' (NDS), offers a glimmer of hope that the future will provide the kind of integrated solutions that would be of benefit to all. Taken with the availability of both the PGP and X.509v3 key based encryption standards, these developments in the availability of stable and scaleable directory infrastructures pave the way for the development of a secure national authentication infrastructure.
Such a distributed system, if implemented, could offer many advantages; the reduction in administrative overhead on the service providers could result in increased spending (both in terms of effort and cash) on the provision of more, higher quality data resources, while the increase in load on local administrative authorities would be relatively low in comparison. The directory would therefore deliver both better value for money when assessed in terms of the resources delivered and reduced overheads, both of which are significant benefits to funding authorities.
It may not yet be too incredible to suggest that (given the resources and a commitment and a clear national agreement as to the specification of such a service) one username per person providing access to all sites of interest throughout the community could be possible.
Authentication in computing is the process of confirming the identity of a user by comparing a 'secret' (piece of information known only to the user) with one stored against their user ID in a database of those authorised to use a service.
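The comparison described in this definition can be sketched as follows. One deliberate departure from the wording above is an assumption of this sketch: instead of storing the secret itself, the database holds a salted PBKDF2 hash of it, so that a copy of the database does not reveal the secrets. The function names, the dictionary standing in for the database and the iteration count are all hypothetical.

```python
import hashlib
import hmac
import os

# Illustrative work factor; a deployment would choose its own value.
ITERATIONS = 100_000


def _derive(secret: str, salt: bytes) -> bytes:
    """Derive a fixed-length hash from the secret and a per-user salt."""
    return hashlib.pbkdf2_hmac("sha256", secret.encode("utf-8"), salt, ITERATIONS)


def register(database: dict, user_id: str, secret: str) -> None:
    """Store a random salt and the derived hash against the user ID."""
    salt = os.urandom(16)
    database[user_id] = (salt, _derive(secret, salt))


def authenticate(database: dict, user_id: str, secret: str) -> bool:
    """Return True only if the user ID is known and the secret matches."""
    record = database.get(user_id)
    if record is None:
        return False
    salt, stored = record
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(stored, _derive(secret, salt))
```

For example, after `register(db, "jsmith", "s3cret")`, a call to `authenticate(db, "jsmith", "s3cret")` succeeds, while a wrong secret or an unknown user ID is rejected.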
References
[1] The Children and Young Persons (Harmful Publications) Act, HMSO
[3] Digital Signatures
http://www.w3.org/pub/WWW/Security/DSig/DsigProj.html
[4] Medical Images Digitised Reference and Information Bank, MIDRIB
http://www.midrib.ac.uk/
[5] The Times Newspaper, Interface, Wednesday 8th Jan 1997
[6] Sun licence NDS from Novell
http://www.novell.com/strategy/sunqa13.html
[7] The Times Newspaper
http://www.the-times.co.uk/
[8] New Scientist
http://www.newscientist.com/
[9] Technologies to Support Authentication in Higher Education v5, A study for the UK Joint Information Systems Committee, August 21st, 1996: A. Young, P.T. Kirstein, A. Ibbetson
http://www.ukoln.ac.uk/elib/wk_papers/scoping/jisc5.html
[10] NISS Athens Web Pages
http://www.niss.ac.uk/authentication/index.html
Author Details
Andy Powell works for UKOLN on metadata and resource discovery projects
Email: a.powell@bath.ac.uk
Mark Gillett works in the Computer Unit of St Georges Medical School
Email: mgillett@sghms.ac.uk