DRIVER: Seven Items on a European Agenda for Digital Repositories
What is the current state of digital repositories for research output in the European Union? What should be the next steps to stimulate an infrastructure for digital repositories at a European level? To address these key questions, an inventory study into the digital repositories for research output in the European Union was carried out as part of the EU-financed DRIVER Project [1]. In this article the main results of the inventory study [2] are presented and used to formulate a European agenda for the further establishment of an infrastructure for digital repositories for research output.
The DRIVER Inventory Study
The DRIVER inventory study aimed to provide a complete inventory of the current state of digital repositories in the 27 countries of the European Union. It is a follow-up to an earlier SURF study carried out in 2005, which included 10 European countries [3]. The study started in June 2006 and was completed in February 2007. A combination of a Web survey, publication of the results on a wiki and telephone interviews was used to make the inventory as complete as possible and to generate feedback from participants in the study. In total, 114 respondents with digital repositories participated in the Web survey. The study focused on OAI-compliant repositories containing research output.
Current State of the Repositories
In this section the most important results about repositories themselves are presented from the following perspectives: coverage, contents, access forms for the full-text records, work processes, software packages and accessibility by search engines, gateways or portals.
Coverage
The inventory focused on institutes that had already implemented a digital repository. Addresses of potential digital repositories were collected from various sources, and these institutes were invited to participate in the Web survey. A combination of a wiki, an additional Web survey and telephone interviews was used to gather further information. From the data collection process in this study it was estimated that there are approximately 230 institutes with one or more digital repositories with research output in the European Union (of which nearly 50% participated in this study). The situation differs per country:
- In 7 EU countries there appear to be no research institutes with a digital repository for research output [4].
- 5 EU countries seem to be in a start-up phase, where a few institutions have set up such a repository [5].
- In 15 EU countries a sizeable proportion of the research universities has implemented a digital repository for research output: in seven of these countries it is estimated that more than half of the research universities have done so [6].
Contents
What are the contents of those digital repositories? Based on figures given by 104 repositories, it appears that the average digital repository contained nearly 9,000 records (8,984, as assessed in the second half of 2006). The large majority of these records (90%) relate to textual research materials: these records can be divided into metadata-only records (61%) and full-text records (29%).
A further 5% of the records relate to non-textual materials such as images, video, music and primary datasets. The remaining 5% (‘other materials’) relate to learning materials, student papers etc.
What types of textual research materials are deposited? More than half of the textual materials are journal articles (53%), and a smaller share are books or book chapters (18%). Theses, proceedings and working papers, often labelled as grey literature, represent some 30%.
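As a rough illustration of what these shares mean for a typical repository, the sketch below converts the reported percentages into approximate record counts for the average repository of 8984 records. The category labels and function name are illustrative choices rather than terms from the survey, and the rounded counts are estimates only.

```python
# Average number of records per repository, as reported by 104 repositories.
AVERAGE_RECORDS = 8984

# Reported shares in whole percent; the labels are illustrative.
SHARES_PERCENT = {
    "textual, metadata-only": 61,
    "textual, full text": 29,
    "non-textual": 5,
    "other materials": 5,
}


def estimate_counts(total, shares_percent):
    """Convert percentage shares into rounded record counts."""
    return {label: round(total * pct / 100) for label, pct in shares_percent.items()}


estimated = estimate_counts(AVERAGE_RECORDS, SHARES_PERCENT)
for label, count in estimated.items():
    print(f"{label}: about {count} records")
```

On these figures, an average repository would hold roughly 2,600 full-text records against roughly 5,500 metadata-only records.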
Access Forms Offered by the Repositories
What forms of access for full-text records are offered by the repositories? Is Open Access the only form of access, or are other variants offered? The most important other variants are Open Access with an embargo for a certain time period, Campus Access, and no public access at all (archival purposes only). The results are presented in the bar diagram. Clearly, most repositories (95%) offer Open Access. Open Access with an embargo period for full-text records is offered by only 18% of the repositories. About a quarter of the repositories (26%) offer Campus Access, and 14% contain records with no access. Other forms of access, such as availability for a fee, in response to an e-mail request or restricted to members of a project team, are offered by 8% of the repositories.
Work Processes
Which statement best describes the work processes of depositing of materials in the repository? | N | % |
Self-depositing by academics, quality control by specialised staff members | 32 | 28.1 |
Delivery by academics, depositing by specialised staff members | 30 | 26.3 |
Collected by staff members independent of the academics | 8 | 7.0 |
A combination of the above | 32 | 28.1 |
Other | 12 | 10.5 |
Answers | 114 | 100.0 |
How is the material deposited in a digital repository? The results of a question about the work processes for depositing materials in the repository are presented in the table above. It appears that a procedure of self-depositing by the academics, with quality control by specialised staff members, is most common (28%), closely followed by a procedure of delivery by the academics and depositing by specialised staff members (26%). Only 7% of the repositories followed a procedure whereby the materials were collected by staff members independent of the academics. However, 28% of the digital repositories of the participating institutes followed a combination of the above-mentioned procedures.
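The percentages in the table follow directly from the answer counts. A minimal sketch of that calculation is given here; the shortened labels are illustrative stand-ins for the full answer options in the questionnaire.

```python
# Answer counts from the work-process question; labels are shortened for illustration.
COUNTS = {
    "self-depositing, quality control by staff": 32,
    "delivery by academics, depositing by staff": 30,
    "collected by staff independently": 8,
    "combination of the above": 32,
    "other": 12,
}

# Total number of answers and each option's share, rounded to one decimal.
total = sum(COUNTS.values())
percentages = {label: round(100 * n / total, 1) for label, n in COUNTS.items()}

for label, pct in percentages.items():
    print(f"{label}: {pct}%")
```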
Software Packages
Which software package is used for the digital repository? The results of a question about this are presented in the table below. The main results are:
- The top two of the most frequently used software packages are GNU Eprints (24%) and DSpace (20%).
- Locally developed software packages are also frequently used (17%).
- The OPUS software package is also quite frequently used (10.5%), but its usage is mainly restricted to Germany.
- 14 other software packages were mentioned by the respondents.
In total, 17 different software packages were mentioned, while 19 respondents reported a locally developed software package. Since the locally developed packages amount to at least one further package, and most likely to many different ones, the digital repositories in the European Union use at least 18 and probably more than 30 different software packages.
Which software package is used for the digital repository? | n | % |
GNU Eprints | 27 | 23.7 |
DSpace | 23 | 20.2 |
Locally developed software packages | 19 | 16.7 |
OPUS | 12 | 10.5 |
DIVA | 6 | 5.3 |
ARNO | 4 | 3.5 |
Fedora | 3 | 2.6 |
CDSWare | 1 | 0.9 |
iTOR | 1 | 0.9 |
EDT | 0 | 0.0 |
other | 18 | 15.8 |
Answers | 114 | 100.0 |
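The range of "at least 18 and probably more than 30" packages can be reproduced with a simple bound. The sketch below assumes that locally developed systems are distinct from the 17 named packages and that each of the 19 respondents reports at most one local system; both are assumptions made for illustration, and the function name is hypothetical.

```python
NAMED_PACKAGES = 17      # distinct named packages mentioned by respondents
LOCAL_RESPONDENTS = 19   # respondents reporting locally developed software


def package_bounds(named, local_respondents):
    """Return (lower, upper) bounds on the number of distinct packages.

    Lower bound: all local systems count as one shared package.
    Upper bound: every local system is a different package.
    """
    lower = named + (1 if local_respondents > 0 else 0)
    upper = named + local_respondents
    return lower, upper


lower, upper = package_bounds(NAMED_PACKAGES, LOCAL_RESPONDENTS)
print(f"Between {lower} and {upper} distinct software packages")
```

Because locally developed systems are unlikely to be shared between institutes, the true figure is probably close to the upper bound.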
Search Engines
The contents of your digital repository are searchable via which of the following general engines/gateways/portals: | n | % |
General internet search engines such as Google, Yahoo, MSN etc. | 74 | 64.9 |
OAIster | 66 | 57.9 |
Google Scholar | 59 | 51.8 |
OpenDOAR | 47 | 41.2 |
OAI Search | 39 | 34.2 |
Scirus | 21 | 18.4 |
BASE | 20 | 17.5 |
OPUS | 14 | 12.3 |
OASE | 8 | 7.0 |
MetaGer | 7 | 6.1 |
MEIND | 6 | 5.3 |
Citeseer: Computer Science | 5 | 4.4 |
PLEIADI | 5 | 4.4 |
Other | 21 | 18.4 |
114 Answers |
Via which channels is the digital repository searchable/accessible? The results of this question are presented in the table above. Over 50% of the participating digital repositories are searchable via general Internet search engines such as Google, Yahoo or MSN, via OAIster and via Google Scholar. All other search engines or portals access fewer than 50% of the participating digital repositories. It has to be emphasised that these findings reflect the answers of the respondents to the questionnaire and not actual searches using the search engines/gateways/portals mentioned. Therefore the results might reflect only the awareness of respondents about the searchability of their repositories. However, if their awareness is accurate, there appears to be no single search engine, portal or gateway that can access all participating digital repositories.
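The two observations above (three channels reach a majority of repositories, and none reaches all of them) can be checked against the counts in the table. The sketch below is illustrative only: it leaves out the "Other" row because that row does not describe a single channel, and the variable names are arbitrary.

```python
RESPONDENTS = 114

# Number of repositories reported as searchable via each channel (from the table).
CHANNELS = {
    "General Internet search engines": 74,
    "OAIster": 66,
    "Google Scholar": 59,
    "OpenDOAR": 47,
    "OAI Search": 39,
    "Scirus": 21,
    "BASE": 20,
    "OPUS": 14,
    "OASE": 8,
    "MetaGer": 7,
    "MEIND": 6,
    "Citeseer: Computer Science": 5,
    "PLEIADI": 5,
}

# Channels that reach more than half of the participating repositories.
majority = [name for name, n in CHANNELS.items() if n / RESPONDENTS > 0.5]

# Even the best-covered channel leaves some repositories unreached.
best_name, best_count = max(CHANNELS.items(), key=lambda item: item[1])
unreached = RESPONDENTS - best_count

print("Reach more than 50%:", ", ".join(majority))
print(f"{best_name} leaves {unreached} of {RESPONDENTS} repositories unreached")
```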
Services Desired at a European Level
After these factual questions on their repositories, the respondents were asked to give their opinion on a number of issues. Firstly, they were asked which services should have priority for further development at a European scale. The top four answers (selected by more than 25% of the respondents) were:
- general search engines, gateways and portals
- disciplinary and thematic search engines, gateways and portals
- citation index services
- preservation services
Factors Influencing Repositories
In two questions, the respondents were asked to select, out of 14 factors, the three most important stimuli for the development of their digital repository and the three most important inhibitors. The tables below list all stimuli and inhibitors selected by more than 25% of the respondents. The following factors are seen as most important [7]:
- The increased visibility for the publications of the academics
- A simple and user-friendly depositing process
- A mandatory policy for the deposit of research output by the institute
- An improvement in the situation with regard to the copyright of published materials
- Requirements by research funding organisations for the deposit of research output in repositories
- Awareness campaigns for academics
- Interest from decision-makers in the institute.
What do you see as the most important stimuli for the development of the digital repository and its contents in your institute? | N | % |
increased visibility and citations of the publications of the academics in our institute | 53 | 46.5 |
our simple and user-friendly depositing process | 50 | 43.9 |
awareness-raising efforts among the academics in our institute | 33 | 28.9 |
interest from the decision-makers within our institute | 30 | 26.3 |
What do you see as the most important inhibitors of the development of the digital repository and its contents in your institute? | N | % |
lack of an institutional policy of mandatory deposit | 57 | 50.0 |
situation with regard to copyright of (future) published materials and the knowledge about this among academics in our institute | 56 | 49.1 |
lack of requirements of research funding organisations in our country about depositing research output in Open Access repositories | 31 | 27.2 |
Towards a European Agenda
What is the current state of digital repositories for research output in the European Union? From this inventory study it is clear that digital research repositories are already well established throughout many countries in the European Union. In 2006 approximately 230 institutes had one or more digital repositories for research output implemented. In addition, from the contacts with respondents in various EU countries it appears there is a growing and active interest in implementing digital repositories at other institutes. Recent surveys in the USA show similar results [8]. Clearly, digital repositories for research output are on their way to becoming a permanent part of the scholarly communication and documentation infrastructure.
What should be the next steps in driving forward a connective and integrative infrastructure for digital repositories at a European level? The further deployment and development of the digital repositories will follow a two-tier approach:
- Deployment of digital repositories at research institutions that do not have one yet.
- Increasing the coverage of the existing digital repositories of published and unpublished textual research output, with a possible future expansion of the coverage of digital repositories to other, non-textual types of research output (e.g. images, video, and research datasets).
With regard to such a two-tier approach, an agenda for activity at the European level can be formulated. Based on the above results of the inventory study, such an agenda should include the following seven action points:
- Increased visibility by increasing retrievability: The increased visibility of academic publications is seen as a major factor in the development of digital repositories by the participants of this study. To increase visibility is to increase retrievability, which means, among other things, accessibility for search engines. Other results from this study show that no single search engine, portal or gateway can access all 200-plus digital repositories in the European Union. Indeed, the need for general and disciplinary/thematic search engines has the highest priority for services on a European scale according to the participants of this study. In addition, retrievability would be enhanced by better metadata, harmonised subject and/or keyword indexing etc.
- Best practice in the deposit processes: A simple and user-friendly depositing process is also seen as a major factor by the participants of this study. Other results from this study show that a number of different work procedures for the depositing process are in place. An effort to establish Best Practice for the deposit processes (possibly followed by a harmonisation effort) will facilitate an increase in the delivery of content to the digital repositories.
- Mandatory deposit: A mandatory policy for the deposit of research output by the institute and, in line with this, requirements by research funding organisations for the deposit of research output in repositories are seen by many respondents as very desirable in order to maintain and fill their digital repositories. However, institutional mandates are rather controversial, as some expect them to be counter-productive. Clearly, a nuanced approach to effective mandatory policies for institutes and for research funding organisations should be part of a European agenda.
- Flexibility in forms of access: The situation with regard to copyright of published materials is seen as a major inhibitor to the further development of digital repositories by the participants of this study. However, it also became apparent in this study that many digital repositories have no facilities for allowing other forms of access besides Open Access, such as Open Access with an embargo period or Campus Access. These variations in access forms might help to increase the coverage of published materials, in addition to further advocacy efforts with regard to the copyright policies of publishers. Again, such an approach, without watering down the Open Access vision, could be worked out in a European agenda.
- Awareness and interest among academics and decision-makers at research institutes: Other important goals for advocacy efforts, as seen from the perspective of this study, should be to create interest from decision-makers and to stimulate or support awareness campaigns among academics.
- Development of services: With regard to other possible services on top of the digital repositories, priority should be given, apart from the earlier-mentioned general and thematic search engines, to citation index services and preservation services.
- Development of further technical standards and possible close collaboration between the various software solutions: The need for technical harmonisation through the development of common standards is evident from the large number of software packages in use by the various digital repositories. Adherence to agreed standards, and possibly close collaboration between the various software developers, is seen as crucial to the development of new services on top of the digital repositories and should be part of any European agenda.
Acknowledgements
The author thanks Bill Hubbard (SHERPA), Leo Waaijers and Kasja Weenink (SURFfoundation) for their stimulating comments and suggestions.
References
- More information on the DRIVER project (Digital Repository Infrastructure Vision for European Research) can be found at the official DRIVER Website: http://www.driver-repository.eu/
- The full report is published as a white paper under the title ‘Inventory study into the present type and level of OAI compliant Digital Repository activities in the EU’, which can be downloaded at http://www.driver-support.eu/documents/DRIVER%20Inventory%20study%202007.pdf
- Academic Institutional Repositories, Deployment Status in 13 Nations as of Mid 2005; Gerard van Westrienen; Clifford A. Lynch; D-Lib Magazine, September 2005, Volume 11 Number 9.
- Bulgaria, Cyprus, Hungary, Latvia, Luxembourg, Malta, Romania.
- Estonia, Ireland, Poland, Slovakia, Slovenia.
- Austria, Belgium, Czech Republic, Finland, Greece, Lithuania, Portugal, Spain; half or more of the research universities: Denmark, France, Germany, Italy, Sweden, Netherlands, United Kingdom.
- The other 7 factors listed in the questionnaire were: integration/linking of the digital repository with other systems in the institute; the policy to safeguard the long-term preservation of the deposited material; search services as provided by national and international gateways; our institutional policy of accountability; coordination of a national body for digital repositories; clear guidelines for selection of material for inclusion.
- The Institutional Repositories SPEC Kit, Association of Research Libraries (2006, Charles W. Bailey, Jr.; Karen Coombs; Jill Emery; Anne Mitchell; Chris Morris; Spencer Simons; and Robert Wright; ISBN: 1-59407-708-8); Census of Institutional Repositories in the United States, MIRACLE Project Research Findings, Karen Markey, Soo Young Rieh, Beth St. Jean, Jihyun Kim, and Elizabeth Yakel, February 2007.