Book Review: The Institutional Repository
This timely publication has arrived at a point where a number of UK Higher Education (HE) establishments have set up, have started to set up, or have at least considered setting up their own institutional repository (IR). This is a new area for all involved: many experiences so far have been ground-breaking, and there are few (if any) IRs which would describe themselves as mature. Not only is the technology developing rapidly, but user needs are still being determined and institutions are expanding the ways in which an IR can serve them.
Fortunately for all involved, "The institutional repository" has appeared, written by three members of the library and information community who have won well-earned respect for their contributions in this field. Other contributors include individuals with extensive and varied experience in IR circles, despite it being such a new area.
Who Should Read This Book?
According to the publisher's Web site, the intended readership is "implementers of institution repositories, digital librarians, academic librarians, library managers, librarians, information scientists and library studies students." A personal view would be that the first chapters should be compulsory reading for every vice chancellor and senior manager within a university, so that they understand the environment and the important role that IRs will, or could, play in information management.
Overall, this book is a text for the practitioner. Its main audience is likely to be those responsible for setting up and running the repository within an institution. No matter how far down the IR path the IR manager has ventured, there is something within the book for everyone, whatever their experience or lack of it. Because this is such a developing area, all those involved are pioneers.
The librarian responsible for the IR (and it is usually a librarian) needs a plethora of knowledge and skills, as well as support from others within their institution, in order to undertake the task of repository management. The skills required are a mixture of technical common sense and 'softer' customer service skills. "The institutional repository" addresses all the relevant topics including digital preservation, advocacy, persistent identifiers, metadata, technical issues and rights.
Why Should They Read It?
The book enables readers to learn from the experiences of people who have actually 'got their hands dirty.' The examples and case study draw on the Edinburgh experience, particularly of e-theses [1]. It is designed to offer ideas and suggestions without being prescriptive. This book acts as an inspiration for individuals at other institutions: managers can expect to encounter any number of variations on the theme as well as surprises at a local level. Managers should not be afraid to experiment either, and the text offers ideas for adapting and enhancing the general set-up to improve repository performance, functionality and provision.
As a manager of a fairly immature repository, I found that many of the situations described are familiar. The book inspires confidence even though the breadth of content is vast. The authors have elected not to go into great depth, or examine the minutiae of each topic, but refer the reader to other sources for the detail. This mainly works as it makes the book a particularly good starting point and keeps the text readable.
How Does the Book Tackle the Topic?
The book opens with the 'why' and then goes on to explain the 'how' of setting up a repository. It outlines the ideal situation, whilst being extremely pragmatic with advice. Take, for example, the description of establishing a repository:
"Acquiring the content is slow and laborious work, and at the present time we pay for it with the sweat of our brow, rather than by dipping into our materials budget."
One other example is the advice concerning risks associated with the IR. The advice is to be realistic and not worry about negligible risks such as plagiarism, but to be vigilant about more real concerns such as third party copyright and defamation.
The text is interspersed with clear and useful diagrams and tables. These enhance the text and enable the reader to comprehend a complex situation easily.
The appendix comprises descriptions of six major open source institutional repository platforms, each written by the platform's developers or user community. Those at the stage of selecting a platform will find these, coupled with the section on evaluating software, invaluable.
Setting the Scene
Reasons why an institution should spend resources on building and maintaining an IR are dealt with in some depth in the opening chapters. The main thrusts of the argument are:
- the concept of the digital library as an organised collection built for a purpose
- the co-existence of institutional and subject repositories
- the multifarious possibilities of the digital library
- use of repositories to store multiple content types ie research, learning objects (whose 'troublingly elastic definition' will evoke a knowing smile from many) and 'corporate assets'
- the role of libraries in the scholarly communication process (or risks of not being involved)
- the existing expertise within libraries to describe objects using standard schemes (ie cataloguing and metadata creation including the charming description of the metadata expert as an artisan)
- enhancing the impact of scholarly output
To some extent the authors consider differences of needs between disciplines. However, the answer to the authors' question of 'Why set up a repository?' is given as the immediacy and speed of publication. It is worth remembering that this is not the case in all disciplines: to many scientists and economists speed is crucial; to most philosophers and historians it is not.
It might have been preferable for the studies showing increased reading and citation of open access papers, mentioned in the first chapter, to be backed up by some references.
The issue of publisher profits and serials costs is discussed in the book, although it is arguable whether repository managers see this as a main reason for setting up their repository in the first instance.
The focus of the 'why' is neatly defined as making "more efficient use of the institution's resources; allows the digital content to be preserved over time; provides a comprehensive view of the institutional product; supports high-quality searching; and permits interoperability with similar repositories across the Web, so contributing to a global service." I suspect these reasons will be incorporated into many a document heading towards the offices of senior management in HE institutions.
How to Create Your Own IR
The central body of the work is given over to a discussion of methods by which an IR may be established, including the estimated resources required (both in terms of staff time and costs).
These chapters set out some of the traps to be aware of at the outset which, if not resolved, may come back to haunt the IR manager later; one example is the need to incorporate common metadata standards from the start.
As any IR manager will confirm, one of the main challenges is how to change the culture of scholarly communication and academic workflows so that authors/creators deposit their output into the IR. There is a substantive section devoted to explaining diffusion of innovations theory as a model for observing patterns of change within the institution. This model does not make the elusive culture change and widespread adoption of the IR any easier, but it will help any manager anticipate the rate of change and where the main sticking points may occur. As might be expected, the authors explain that mandates for deposit or, at the very least, an opinion leader with enough clout to force change are the most effective means of making use of the IR the norm.
Statistics are suggested as a vehicle for persuasion. It may be prudent to be somewhat cautious in their use, firstly because of any political backwash and secondly because of the fallibility of such figures: academics can be extremely scathing of figures used without clear explanation and interpretation. The statistics can be impressive and persuasive, but they should be used with care.
The book sets out some of the main concerns and barriers to persuading researchers to deposit (for example lack of time and subject differences). It does not address the problems involved with the often many and various versions of an item [2] and associated problems of what might be construed as the same object, not least including academics' concerns over which version is or is not included in the IR.
The advantages of federated systems are briefly described. For many this will be something of a 'holy grail' to which a manager will aspire but which may seem some distance away. Nonetheless, it is an important aspect to be considered, even if not resolved, when planning a repository.
Another problem which is discussed is that of encountering increasingly complex items and knowing how to deal with them. It is all very well being able to include single unrelated files within a repository, but the IR needs to be able to deal with the structure of items to some degree of granularity and also with multiple related items and this will become paramount as repositories expand. There is a neat, if brief, explanation of this issue including a description of the inclusion of 'wrappers' and METS (Metadata Encoding and Transmission Standard).
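To make the idea of a 'wrapper' a little more concrete, the following is a minimal Python sketch of how a METS document might bind two related files into one structured item. It is an illustration only and is not taken from the book: the function name build_wrapper, the labels, the file identifiers and the example.org URLs are all assumptions, and the output has not been validated against the METS schema.

```python
import xml.etree.ElementTree as ET

# Namespaces used by METS and by the XLink attributes on file locations.
METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def q(tag):
    """Return a namespace-qualified METS tag name."""
    return f"{{{METS_NS}}}{tag}"


def build_wrapper(item_label, files):
    """Build a minimal METS wrapper (hypothetical helper).

    files is a list of (file_id, label, mimetype, url) tuples.
    """
    mets = ET.Element(q("mets"), {"LABEL": item_label})

    # The file section lists every file belonging to the item.
    file_sec = ET.SubElement(mets, q("fileSec"))
    file_grp = ET.SubElement(file_sec, q("fileGrp"))

    # The structural map records how the parts relate to the whole.
    struct_map = ET.SubElement(mets, q("structMap"), {"TYPE": "LOGICAL"})
    top = ET.SubElement(struct_map, q("div"), {"LABEL": item_label})

    for file_id, label, mimetype, url in files:
        f = ET.SubElement(file_grp, q("file"),
                          {"ID": file_id, "MIMETYPE": mimetype})
        ET.SubElement(f, q("FLocat"),
                      {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": url})
        # Each part of the item points back to its file by ID.
        part = ET.SubElement(top, q("div"), {"LABEL": label})
        ET.SubElement(part, q("fptr"), {"FILEID": file_id})

    return mets


if __name__ == "__main__":
    item = build_wrapper("Example thesis", [
        ("F1", "Main text", "application/pdf", "https://example.org/thesis.pdf"),
        ("F2", "Dataset", "text/csv", "https://example.org/data.csv"),
    ])
    print(ET.tostring(item, encoding="unicode"))
```

The point of the sketch is the separation between the list of files and the structural map: the repository can treat the thesis as one item while still addressing the main text and the dataset individually.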
The excellent section on workflows will be of great use to repository and library managers. The explanations and diagrams will enable managers to work through requirements and processes at their own institution. The workflows are limited to those within the boundaries of the IR ie they do not include the workflows of the academic who is producing the material and how they might fit with deposit [3]. As mentioned above, IR managers could use this book as inspiration for their own situation. This chapter deals with workflows as a separate entity for the repository. In reality it is likely that, in time, they will fit with existing processes within the digital library.
Does It Provide All the Answers?
This book certainly goes most of the way towards addressing all the major issues in a succinct and readable way. It clearly indicates just how complex setting up, developing and running a repository is and what a large commitment it is for any institution. It also assures the reader that if one were to try to sort out every possible aspect before accepting items, the repository would never get going.
Conclusion: Three Good Reasons for Reading This Book
Reading this book confirms the feeling of this repository manager that we are on the brink of a major change in scholarly communication and management of research output. We are so tantalisingly close, but not yet in a position to state that the change has definitely taken place. In ten years' time the community may be wondering how life was before repositories were the norm. For example, the use of an IR as a means of doctoral thesis submission and retention seems so obvious when compared with the current situation.
Three reasons for any IR manager to read this book are:
- it is clearly written and explains a complex topic in easy-to-understand terms
- it covers most aspects of this subject and any IR manager would do well to work through all the issues covered
- it is the seminal monograph on the topic
It is also a pleasure to read [4].
The Edinburgh team states that "this book has been one of our contributions to the community in the hope that the hard-won lessons we have learned will make this process for other institutions a much more enriched and enlightened one." I think the team has achieved this aim.
References
1. Editor's note: Ariadne has covered work in Edinburgh:
Stephen Pinfield, Mike Gardner and John MacColl, Setting up an institutional e-print archive, April 2002, Ariadne Issue 31, http://www.ariadne.ac.uk/issue31/eprint-archives/
John MacColl, Electronic Theses and Dissertations: a Strategy for the UK, July 2002, Ariadne Issue 32, http://www.ariadne.ac.uk/issue32/theses-dissertations/
Theo Andrew, Trends in Self-Posting of Research Material Online by Academic Staff, October 2003, Ariadne Issue 37, http://www.ariadne.ac.uk/issue37/andrew/
Richard D Jones, DSpace vs. ETD-db: Choosing software to manage electronic theses and dissertations, January 2004, Ariadne Issue 38, http://www.ariadne.ac.uk/issue38/jones/
Richard Jones, The Tapir: Adding E-Theses Functionality to DSpace, October 2004, Ariadne Issue 41, http://www.ariadne.ac.uk/issue41/jones/
Ariadne has also covered work by William Nixon and Morag Greig at Glasgow, which may be of interest to readers.
2. These problems are being investigated by the VERSIONS Project. See http://www.lse.ac.uk/versions
3. There is interesting work being carried out in this area by the RepoMMan Project at the University of Hull; see http://www.hull.ac.uk/esig/repomman/
4. In fact, two chapters are online and available for free download from the Edinburgh Research Archive (ERA):
Chapter 1: The institutional repository in the digital library, http://hdl.handle.net/1842/858
Chapter 7: Case study: The Edinburgh Research Archive, http://hdl.handle.net/1842/859