At the Event: The ePrints UK Workshop
The workshop was aimed at those interested in setting up institutional e-print servers where the outputs of their organisation (journal articles, papers, reports etc.) could be published, stored and searched via a central institutional server. The event was fully booked, which perhaps indicates that universities, colleges, academics and librarians are increasingly recognising the value of the e-print publishing model.
The day was run by ePrints UK [1] (in conjunction with SOSIG), an RDN [2] project which aims to offer a new national e-print subject service by pulling together information from institutional servers and presenting it by subject discipline (via the RDN hubs).
The workshop looked at the main issues to be addressed in setting up institutional and national e-print services: the rationale, technologies and legal issues but also the people and political factors required to encourage widespread participation in this new and promising publishing model.
Introduction to ePrints UK
Marieke Guy, Project Manager, ePrints UK
The basic principle of the e-prints movement is self-publishing materials in digital format and then sharing them freely with others, using agreed standards. Marieke described how the ePrints UK project is one of the JISC [3] initiatives that aim to promote this vision of a new publishing model for scholarly communication.
The success of e-print initiatives depends on widespread adoption and use and this workshop aimed to support those working at spreading the word and starting to act.
A Whirlwind Guide to E-Prints
Philip Hunter, ePrints UK Project, UKOLN, University of Bath
Philip started with an overview of e-prints past and present, describing how, by implementing some relatively simple ideas and Internet technologies, institutions and individuals could significantly improve access to their scholarly publications. He outlined the rationale for institutions to start to address the crisis in the dissemination of scholarly communication caused by spiralling journal subscription costs and by the delays inherent in existing peer review processes.
The Internet enables academics and universities to self-publish their own papers and make them easily accessible, offering an alternative to the traditional model of publishing via a commercial publisher. This model was first introduced in the 1990s within the physics community, where a critical mass of scientists came to use the arXiv ePrint server in the USA both to deposit their own papers and to find papers by others. The advantages were quickly apparent to many, as papers could be published very quickly, with no fee for either retrieval or submission by users worldwide.
Philip went on to describe the Open Archives Initiative, the international movement that clearly identifies the potential advantages of e-print publishing and sharing, and offers practical models for implementation and use. The initiative has developed metadata standards and protocols to facilitate exchange of scholarly publications and Philip outlined these in his presentation.
To summarise the basic architecture: an institution can be either or both:
- A Data Provider: placing e-print metadata in a repository and making this available to others to harvest
- A Service Provider: harvesting and presenting metadata from repositories to provide an e-print service
Philip described how ePrints UK are currently building a service demonstrator where e-prints metadata harvested from a number of data providers is being pulled together to facilitate searching across archives. The demo currently contains over 11,000 records (of peer reviewed papers, theses, grey literature, collections, images etc) but it is hoped that as more institutions establish their own e-print servers over the next few years, this number will increase enough to develop a rich and valuable national e-prints service.
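The data provider and service provider roles described above can be sketched in a few lines of Python. This is only an illustration, not the ePrints UK implementation: it assumes the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) with unqualified Dublin Core (oai_dc) records, and the repository address, the sample record and the helper function names are all invented for the example. To keep it runnable without a network connection, the service provider side parses a canned response rather than fetching one.

```python
from collections import defaultdict
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces used by OAI-PMH 2.0 and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the request a service provider would send to a data provider."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


# Invented sample of what a data provider might return (hypothetical record).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:eprints.example.ac.uk:42</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An Invented Example Paper</dc:title>
          <dc:creator>Example, A.</dc:creator>
          <dc:subject>Physics</dc:subject>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def parse_records(xml_text):
    """Extract identifier, title, creators and subjects from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.findall("oai:ListRecords/oai:record", NS):
        dc = rec.find("oai:metadata/oai_dc:dc", NS)
        if dc is None:
            continue  # a record may carry no metadata, e.g. if it was deleted
        records.append({
            "identifier": rec.findtext("oai:header/oai:identifier", namespaces=NS),
            "title": dc.findtext("dc:title", default="", namespaces=NS),
            "creators": [e.text for e in dc.findall("dc:creator", NS)],
            "subjects": [e.text for e in dc.findall("dc:subject", NS)],
        })
    return records


def group_by_subject(records):
    """Arrange harvested records by subject, as a subject-based service might."""
    by_subject = defaultdict(list)
    for record in records:
        for subject in record["subjects"]:
            by_subject[subject].append(record["title"])
    return dict(by_subject)


if __name__ == "__main__":
    # Hypothetical repository address, used only to show the request format.
    print(list_records_url("http://eprints.example.ac.uk/oai"))
    print(group_by_subject(parse_records(SAMPLE_RESPONSE)))
```

A real harvester would also need to fetch the URL, follow resumption tokens for large result sets and cope with differences in metadata quality between repositories, all of which are omitted here.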
E-Prints: A librarian's perspective
Linda Humphreys, Faculty Librarian, University of Bath
Linda gave a persuasive presentation about the benefits of e-print services to users, libraries and academic institutions. She suggested that, for users, e-print services would offer wider and quicker dissemination of academic publications. For libraries and institutions, they help to keep a record of institutional output, raise the profile of institutional research and help with the Research Assessment Exercise (RAE). For libraries, the e-prints model also offers a better vision than the existing, highly costly journal publishing model.
Linda described her own work in setting up an ePrint server at Bath University. Bath now has a dwindling collection of print journals, relying instead on expensive inter-library loans. Having commercial e-journals has not made subscriptions cheaper and has made management of journal access and preservation more complex. The vision of a universal virtual library is appealing to librarians embroiled in the current serials crisis.
Institutional policy and metadata agreements are two key issues in setting up an e-print server and Linda gave a clear account of her views in both areas which will no doubt be of interest to other institutions.
What Technology is Involved?
Christopher Gutteridge, EPrints.org, University of Southampton
Christopher is the developer of the Web-based e-prints archive software GNU EPrints2, produced by the University of Southampton. He talked generally about the usual structure found in e-print archive software and briefly described the two main players in the field: his own EPrints2, which is based on Apache/Perl/MySQL, and DSpace, which is based on Java Tomcat/Postgres. Both provide general solutions, but he stressed that both require extra work to customise them to an organisation's particular requirements. There are a number of other software packages available, described in the Open Society Institute Guide to Institutional Repository Software (referenced from Christopher's slides).
Christopher suggested that institutions need to decide up front what the focus of the archive is to be and listed the goals to define before setting up an e-print archive. He mentioned four possible main goals: the dissemination of research; the preservation of research; generating lists of publications, for CVs for example; and collecting statistics.
It is important that an archive builds up a critical mass, and Christopher mentioned a few techniques to encourage use, such as importing as much material as possible to start with. He also suggested that the 'shame' technique of listing the number of articles available in an archive for each member of staff is a useful ploy!
Some mid-session questions raised the following views:
- PDF has become the favoured document standard, but everyone has their own way of doing things, so it's best not to be too prescriptive
- The general experience is that having users enter their own data is not a problem if good examples are provided
- EPrints software has a built-in ability to import from bibliographic databases, but this would require some work by a local programmer. Christopher is looking at writing code to create an interface for this
The next topic was cost: the software is free and open source, but you still need hardware (including backup facilities) and staff. The main cost is staff time, mainly in setting the archive up: you can install the software in a couple of days, but configuring it to how people want it can take some time.
Christopher then gave an overview of the EPrints2 software finishing with a statement of their design philosophy: that the supplied defaults are a good starting point, but that tweaks will be needed. EPrints is configurable, rather than perfect out of the box!
Setting Up Institutional and Subject (Research Funder) Repositories: Practical Experiences, Academics' Views
Neil Jacobs, Regard Information Officer, ILRT, Bristol
Neil manages the Economic and Social Research Council (ESRC)'s online research service Regard, and has been running an e-prints pilot study at ILRT (Institute for Learning and Research Technology) in Bristol (which hosts Regard) in association with IRIS (Integrated Research Information System), the University of Bristol's internal database of research outputs. Both Regard and IRIS are submission-based and have a history of having to persuade academics to submit their data. The main report of the pilot study is due in April 2004.
He wanted to look at the range of issues involved in setting up an archive, in particular the organisational, cultural, legal, and technical issues and presented the interim findings of the study.
With organisational issues, metadata quality was found to be a major concern, as were the problems of reconciling different metadata schemas, the lack of name authority lists, and the need to indicate the provenance of metadata.
In the cultural area there was concern from academics that peer review was still central to quality. Some of them also felt it best to stick with existing practices and try to work with existing archives rather than rocking the boat too much.
Legal issues that turned up included the fact that authors are generally uninformed about copyright issues but that there is a potential for the research councils to get involved in discussions between researchers and publishers.
There were also technical issues involved: setting up an archive is technically trivial, but linking it into legacy systems such as OPACs and institutional research databases is not.
The study also looked at the different approaches of institutional and subject-based archives. Both approaches can build on existing systems, but as yet neither is sufficiently involved in the creation of the standard metadata needed to sustain archives such as these.
The JISC FAIR Programme and Wider Scholarly Communications Activity
Chris Awre, FAIR Programme Manager, JISC
Chris Awre gave a broad overview of the FAIR (Focus on Access to Institutional Resources) Programme, of which the ePrints UK project is part. The programme represents a £2 million investment from JISC spread across 14 projects (involving over 40 institutions across the UK).
The overall aim of the programme is to investigate the issues around the deposit and disclosure of institutional resources as a means of increasing the availability of scholarly communication. It was also expected that the tools and experiences that came out of the projects would feed into the wider community.
Chris reiterated the point that cultural barriers were by far the greatest obstacle for the projects, as they are going against the long-established practice of the publishing industry. Copyright was also mentioned as a barrier, although the next speaker dealt with this issue fairly comprehensively. Again, technical issues were not seen as a major barrier, although Chris did note that a particular skill set was needed to maintain the repository, which institutions needed to take into account on a longer-term basis. He then went on to describe some of the specific projects and particular findings that had come out of the programme so far.
Finally Chris reported on the broader picture of the changes in scholarly communication, looking at some of the work that JISC, and FAIR in particular, have been doing in the area of Open Access (through self-archiving and open access journals). The current Government inquiry into the pricing and availability of academic journals, which is due to report in March 2004, will also consider some alternative models to the current situation, including institutional repositories.
Intellectual Property Right issues: Identification and management
Laurence Bebbington, Law Librarian, Information Services, The University of Nottingham
IPR and Copyright are topics that normally send a shiver down any information professional's spine, but Laurence started off his presentation by giving a very clear overview of the range of Intellectual Property Rights, from Patents through to Confidential Information. He went on to highlight some of the issues around e-prints and open access, such as:
- Who is retaining the rights - the institution or the academic?
- Encouraging academics to retain copyright if possible
- Need for guidance on negotiating licence changes and proper legal advice if amending publishers' licences
While he suggested that copyright is generally at the root of the issue with e-prints, there is also the concept of Database Right to take into consideration, which applies when the e-prints collection is viewed as a database. Database right can normally be asserted if it can be shown that "substantial investment" has been involved in obtaining, verifying or presenting the contents of the database. Typically database right would rest with the institution, which would need to consider the longer-term implications of maintaining it.
Laurence also highlighted some issues and risks around decentralised publishing, such as plagiarism, defamation and errors, and illustrated these with a recent case of plagiarism being spotted in a number of papers that had been submitted to the physics preprint server arXiv. The result was that 22 of the papers were withdrawn, with the threat of a possible lawsuit for defamation.
The workshop finished with a Question and Answer panel session involving all of the day's speakers. The questions ranged from queries about database rights to the possibilities of building links between Open Archives and established full-text databases using OpenURLs.
The feedback from the workshop was very positive and some of the actions that participants suggested that they would be taking as a result of the workshop included:
- Setting up an experimental repository or progressing their current repository
- Encouraging interest in e-prints at their library or raising awareness
- Considering joining Sherpa (in the case of some institutions)
This was the first in a series of 5 workshops [4] taking place around the UK in conjunction with RDN hubs.
References
[1] The ePrints UK Project http://www.rdn.ac.uk/projects/eprints-uk/
[2] The Resource Discovery Network http://www.rdn.ac.uk/
[3] The Joint Information Systems Committee http://www.jisc.ac.uk/
[4] ePrints UK Workshops http://www.rdn.ac.uk/projects/eprints-uk/workshops/