The DARE Chronicle: Open Access to Research Results and Teaching Material in the Netherlands
While Cream of Science (Keur der Wetenschap), Promise of Science and the HBO Knowledge Bank (HBO Kennisbank) are among the inspiring results of the DARE Programme for the period 2003-06, what is more important in the long run is the new infrastructure that enables Dutch Higher Education and research institutions to provide easy and reliable open access to research results and teaching material as quickly as possible. Such open access ought to be the standard in a knowledge-driven society, certainly if the material and data have been generated with public funding. Universities, scientists and scholars appear to agree entirely, given the success of the Open Access petition that the academic world has submitted to the European Commission.
The infrastructure achieved in the DARE Programme encompasses open source applications and protocols that are now being used across Europe, thanks to the DRIVER Project and other efforts. The new SURFshare 2007-10 programme focuses on scientists and scholars working in 'collaboratories' and their aim of disseminating their research results by means of 'enhanced' publications, i.e. publications that include research data, models and visualisations.
The Programme Mission
Looking back, the mission of the DARE Programme could have been defined by the Berlin Declaration [1]. But that is only in retrospect; the elegantly worded Berlin Declaration was only issued after SURF had already approved the DARE Programme on 14 June 2002. SURF is the ICT (Information and Communication Technology) collaboration organisation that unites the research universities, universities of applied sciences (hogescholen), the National Library (KB) and national research organisations in the Netherlands, i.e. the Royal Netherlands Academy of Arts and Sciences (KNAW), the Netherlands Organisation for Scientific Research (NWO) and the Netherlands Organisation for Applied Scientific Research (TNO). Founded in 1987, SURF has three platforms: Research, Education and Organisation. DARE (Digital Academic Repositories) was the programme pursued by the Research platform in the 2003-06 planning period. The out-of-pocket budget for DARE came to EUR 5.9m and the participating institutions were to contribute another EUR 3m or so, in the form of staff effort. With the exception of the universities of applied sciences, which were not yet ready to participate in 2003, all the SURF institutions took part in the DARE Programme. A number of hogescholen did eventually join the programme in its final year (2006), and are fully involved in the successor to DARE, the SURFshare Programme.
The OAI Standard
The OAI-PMH standard [2] was essential to the success of the DARE Programme. Defined in October 1999 as the 'Santa Fé Convention' [3] by Herbert van de Sompel and his team and tested internationally over the following few years, the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), version 2.0, was published on 14 June 2002. It was a symbolic coincidence that its publication took place on precisely the same day that the DARE Programme was approved in the Netherlands. OAI-PMH has turned out to be a very robust standard. Some one thousand repositories worldwide now base their interoperability on this protocol, and the original version is still being used.
OAI-PMH divides the world of information into two layers: a data layer (content grid) and a services layer. The idea is that every research institution has its own repository, i.e. a digital archive based on the OAI standard. The archive contains the research results of the relevant institution, i.e. articles, dissertations and other publications, but also, and increasingly, data collections, models, visualisations and teaching materials. The OAI standard makes these repositories 'interoperable'; in other words, the content can be harvested by anyone applying a fixed protocol and can be used to deliver added-value services in the field of research and education or for services of use to society as a whole.
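To make the harvesting step concrete, the following is a minimal Python sketch of how a service provider might build a ListRecords request and read the response. It is an illustration under stated assumptions, not the DARE harvester: the endpoint `repository.example.org` and the sample response are invented, and only the protocol's verb, argument and namespace names are taken from OAI-PMH 2.0.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for an OAI-PMH endpoint."""
    if resumption_token:
        # A resumptionToken is an exclusive argument: it replaces metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return ([(identifier, title), ...], next_token_or_None)."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        identifier = rec.findtext("oai:header/oai:identifier", default="", namespaces=NS)
        title = rec.findtext(".//dc:title", default="", namespaces=NS)
        records.append((identifier, title))
    token = root.findtext(".//oai:resumptionToken", default="", namespaces=NS)
    return records, (token.strip() or None)


# Invented sample response, trimmed to the elements the sketch reads.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example thesis</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("https://repository.example.org/oai"))
    print(parse_list_records(SAMPLE))
```

A real harvester would fetch each URL over HTTP and repeat the request with the returned token until no token remains; that loop is left out here so that the sketch runs without a network connection.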
The two layers are situated in two different cultures. The data layer is an international, public-domain infrastructure. It involves machine-to-machine interaction, with agreements on standards and protocols being crucial. The co-operating universities and research institutes are responsible for organising the data layer and for the quality of the material on offer. The services layer focuses on machine-human interaction. The services are demand-driven, scalable and tailor-made. They are provided in a competitive environment and must be exploited in one way or another. The data layer and services layer together form a symbiotic duo.
DARE Results
The first important milestone of the DARE Programme was achieved a year after it began with the launch of DAREnet [4]. On 27 January 2004, every Dutch university had its own repository based on the OAI standard, from which DAREnet harvested 17,000 open access publications. Today that number has grown to 130,000, with an average of 100 publications being added every day. Assuming that the growth can be attributed mainly to new publications, that means that more than half of new Dutch publications become available via open access.
Though prominent, the Dutch developments are not singular. When the DARE Programme began in 2003, there were 30 repositories worldwide, the result of experiments using the OAI protocol (which had been published six months earlier). Today, OAIster [5] reports that there are more than 850 OAI repositories worldwide containing 12 million records (although these are mainly bibliographical records and images). The OpenDOAR [6] register reports 926 repositories. Recent studies conclude that repositories are now widely recognised as essential infrastructure for scholarship in the digital world [7] [8] [9].
Issues Arising
Once the repositories had been established, scientists and scholars began to raise various questions. One related to the problem of duplicate effort:
"It's great that my institution has a repository, but now I have to enter my publication data twice: once in our Current Research Information System 'Metis', for my university's report on its activities, and then a second time in our repository. Can't this be fixed?" Since 2006, we've been able to give the following answer: "Yes, it's been fixed. You now only have to enter your publication into Metis because we've created a link between Metis and the repository. You only have to press the upload button on the new Personal Metis entry screen and your publication will automatically be sent to the repository as well."
Another question related to matters of digital preservation and access: "Paper will survive for centuries, and usually the only tool you require for access is a pair of glasses. What about our digital material in the repositories?" On 28 November 2006, the full complement of DARE participants signed an agreement with the National Library of the Netherlands (Koninklijke Bibliotheek) concerning the long-term storage of open access content in its repository. More important than the basic storage it offers is the fact that the National Library can guarantee permanent accessibility. The agreement was based on a thoroughly tested two-way connection between each repository and the National Library's e-Depot. The connection ensures not only that material can be removed securely from the repositories, but also that it will be returned upon request.
There was also a question about the long-term storage of researchers' work: "If I change university or work for a research school based on a partnership between multiple universities, where do I store my material?" We are still working on a solution. It will be ready in October 2007 and will be called DAI, the Digital Author Identifier [10]. Every scholarly or scientific author in the Netherlands will be assigned a unique author number that will be appended to his or her publications. It will then be possible to assemble all of a particular author's publications with a simple press of a key, regardless of the repository in which the material has been deposited or the name under which the author has published.
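The idea behind the author identifier can be sketched in a few lines of Python. The record layout, repository names and identifier values below are invented for illustration and do not reflect the actual DAI format; the point is only that grouping on a stable identifier, rather than on the printed name, collects an author's work across repositories and name variants.

```python
from collections import defaultdict


def publications_by_author(records):
    """Group (repository, title) pairs by author identifier, ignoring name spelling."""
    grouped = defaultdict(list)
    for rec in records:
        for author in rec["authors"]:
            grouped[author["dai"]].append((rec["repository"], rec["title"]))
    return dict(grouped)


# Hypothetical harvested records: the same person appears under two names
# and in two repositories, but always with the same identifier.
RECORDS = [
    {"repository": "repo-a", "title": "First paper",
     "authors": [{"name": "J. Jansen", "dai": "dai-0001"}]},
    {"repository": "repo-b", "title": "Second paper",
     "authors": [{"name": "Jansen-de Vries, J.", "dai": "dai-0001"},
                 {"name": "P. Bakker", "dai": "dai-0002"}]},
]

if __name__ == "__main__":
    for dai, pubs in publications_by_author(RECORDS).items():
        print(dai, pubs)
```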
Another question related to file formats: "My dissertation or book consists of various different files. Can it still be circulated as a complete unit?" That was naturally always possible by publishing a table of contents with links to the various files, but this was a non-standardised solution and it depended on the format (pdf, html, doc, rtf, etc.). Harvesters and even Google robots had trouble with this. The DARE Programme has produced a future-proof solution to this problem: a standard XML format, known as the DIDL container [11]. It is not only capable of assembling and maintaining text files in the right order, but it can also do the same for other objects, for example visualisations, models, and so forth. At the moment the format is used for files that are all located in the same repository, but in the future it can also be expanded to assemble objects distributed across various repositories. A visualiser was recently developed that makes it possible to read the contents of the DIDL container. What you see looks like an old-fashioned table of contents, but this time it is not only human readable but also machine readable, as it is based on a standardised XML solution that is being used, among other things, for the future-proof storage of material in the National Library's e-Depot.
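As a rough illustration of the container idea, the Python sketch below wraps an ordered list of files in a simplified MPEG-21 DIDL structure and reads the order back. It is not the DARE profile: only the Item, Component and Resource elements are used, and the file URLs are invented.

```python
import xml.etree.ElementTree as ET

# MPEG-21 DIDL namespace; the element selection below is a simplification.
DIDL_NS = "urn:mpeg:mpeg21:2002:02-DIDL-NS"
ET.register_namespace("didl", DIDL_NS)


def build_container(parts):
    """Wrap an ordered list of (url, mime_type) pairs in one DIDL document."""
    didl = ET.Element(f"{{{DIDL_NS}}}DIDL")
    top = ET.SubElement(didl, f"{{{DIDL_NS}}}Item")
    for url, mime in parts:
        item = ET.SubElement(top, f"{{{DIDL_NS}}}Item")
        comp = ET.SubElement(item, f"{{{DIDL_NS}}}Component")
        ET.SubElement(comp, f"{{{DIDL_NS}}}Resource", {"mimeType": mime, "ref": url})
    return ET.tostring(didl, encoding="unicode")


def list_parts(xml_text):
    """Read the parts back in document order, as a visualiser or harvester might."""
    root = ET.fromstring(xml_text)
    return [(r.get("ref"), r.get("mimeType")) for r in root.iter(f"{{{DIDL_NS}}}Resource")]


# Hypothetical dissertation made up of two text files and one visualisation.
PARTS = [
    ("https://repository.example.org/thesis/chapter1.pdf", "application/pdf"),
    ("https://repository.example.org/thesis/chapter2.pdf", "application/pdf"),
    ("https://repository.example.org/thesis/model.png", "image/png"),
]

if __name__ == "__main__":
    xml_text = build_container(PARTS)
    print(xml_text)
    print(list_parts(xml_text))
```

Because the order is carried by the XML itself rather than by a formatted page, the same document can serve both a human-facing table of contents and a machine harvester.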
Services
SURF encouraged universities in the Netherlands to develop their own services from the very beginning. In addition to the research results offered on university Web sites, these efforts have resulted in approximately fifteen operational and active services [16]. In general, scholars and scientists are primarily interested in services related to their own field, and a number of DARE projects have keyed into that fact. By now, some of these have developed into robust international services, for example Nereus [17] for the economists or Connecting Africa [18].
Other services have also been set up, for example signalling and annotation services. In addition to their own publications, archaeologists now also have access to the first data collections in the repository, a solid step towards the future. Another example is the first real commercial service based on repositories. The purpose of SORD (Selected Organic Reactions Database) [19] is to extract the recipes for substances produced within the context of doctoral research from the relevant dissertations. SORD makes the recipes for these chemical reactions accessible via its database. Institutions that make dissertations available are given free access to the database; the pharmaceutical industry is required to pay for access. Any older dissertations on paper that are relevant are scanned with the author's permission. The digital version can then be included in the repository. All in all, this is a good example of how repositories can help improve access to knowledge.
Obstacles
Copyright law and the state of technology have prevented a genuine breakthrough with respect to open access.
Copyright
The existing practice in the Netherlands is that it is the authors who own the copyright to their work, and not the university or institute for which they work. As soon as the author of an article 'pays' for its publication by transferring the copyright exclusively to one publisher, it is the publisher who sets the price to be charged for access to the publication and the conditions for its reuse. The consequences of this model are clearly enough understood, especially in libraries.
One of SURF's key activities is to promote awareness of this situation among authors and to offer them other options. Increasingly, the publishing world is itself offering such options via other business models. According to these models, publications will not be paid for in the form of rights, but in cash. After it has been reviewed and accepted, the article can then be published in an 'open-access journal'. The major commercial publishers are resisting this model since it produces lower profit margins than their traditional copyright monopoly model. In so far as authors are still unable to publish in an open access journal, partly for that reason, we would argue in favour of a licence whereby the author gives the publisher the right to publish the article but retains all other rights, such as the right to place the article in a repository and permit access to it after a maximum embargo of six months. This Licence to Publish [20] was drawn up by Dutch and British legal experts. Spanish and French versions have now been produced, and versions for other European countries are currently under development.
Knowledge Exchange [21], a partnership consisting of JISC (UK), DFG (Germany), SURF (Netherlands) and Deff (Denmark), will be launching a campaign this autumn to promote the use of the Licence to Publish. Knowledge Exchange and SPARC Europe were also the drivers behind a petition asking the European Commission to follow the recommendations of a study that it had itself ordered. The study advises the Commission to apply the principles underlying the Licence to Publish and to take steps to improve the publishing market for scientific and scholarly articles. Within a period of just one month, some 24,000 individuals from the academic world had signed the petition, more than 500 of whom did so on behalf of an organisation [22]. It is the powerful publishers' lobby that continues to prevent the Commission from taking specific steps. As soon as this topic is raised in the European Parliament, the petition-related activities will be resumed.
Another new development to emerge from the DARE Programme was the Licence to Deposit [23]. DAREnet was, fortunately, launched without complicated copyright discussions being necessary in advance. Now that authors have placed more than 130,000 open access articles in the repositories, it would be useful to let third parties know precisely how this material can be used. By issuing a Licence to Deposit, authors give the repository manager official permission to circulate their publication for free, non-commercial reuse with attribution of the work to the creator.
Various support services have been developed in this area, for example an informative Web site and a summer course for the staff of the local copyright information centres scheduled to be launched at a number of Dutch universities in September 2007. The point is to help authors to publish their material via high-quality channels without relinquishing their copyright. Queen Beatrix of the Netherlands offers a shining example. Her annual 'Speech from the Throne' is published in all the newspapers, but she never transfers her copyright to another party.
Technology
The quality of the knowledge services provided depends in part on the quality of the underlying data layer. The relevant technology is still under development. The DARE Programme produced a number of important applications, for example the SAHARA harvester, a fast search engine based on Lucene, and the DIDL container and visualiser mentioned previously. These are robust open source software programs [24] that are also used internationally, specifically in the European DRIVER [25] projects. In essence, the DRIVER Project is an expansion of DARE from fifteen repositories in the Netherlands to about fifty repositories around Europe and from one country to eight. The project was launched on 1 June 2006 and will be completed on 1 December 2007.
In addition to effective technology, efficient interoperability demands reliable agreements on the standards and protocols to be applied. The DARE 'Guidelines' were developed for this purpose; they will also be applied within DRIVER where they will serve as requirements for including a repository in the DRIVER network [26]. A validator will shortly be completed to check whether a repository complies with the Guidelines and report on those aspects that do not. Ideally, any failures detected would be corrected automatically. That is not yet possible, but developers are working on it.
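What such a validator does can be sketched as follows. The required fields listed here are an illustrative assumption and are not taken from the DARE or DRIVER Guidelines; the record layout is likewise hypothetical. The sketch only shows the general shape: check each harvested record against a rule set and report the aspects that fail.

```python
# Illustrative rule set only; the real Guidelines define their own requirements.
REQUIRED_FIELDS = ("title", "creator", "date", "identifier")


def validate_record(record):
    """Return a list of problems for one record; an empty list means it passes."""
    problems = []
    for field in REQUIRED_FIELDS:
        values = record.get(field) or []
        # A field counts as present only if at least one value is non-blank.
        if not any(v.strip() for v in values):
            problems.append(f"missing {field}")
    return problems


def validate_repository(records):
    """Map each failing record identifier to its list of problems."""
    report = {}
    for rec_id, record in records.items():
        problems = validate_record(record)
        if problems:
            report[rec_id] = problems
    return report


# Hypothetical harvested metadata, keyed by record identifier.
HARVESTED = {
    "oai:repository.example.org:1": {
        "title": ["A complete record"], "creator": ["A. Author"],
        "date": ["2007"], "identifier": ["https://repository.example.org/1"],
    },
    "oai:repository.example.org:2": {
        "title": [" "], "creator": ["B. Author"],
    },
}

if __name__ == "__main__":
    print(validate_repository(HARVESTED))
```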
DRIVER goes much further than DARE on a number of points. For example, it has a metadata store for storing all harvested metadata that have been subjected to a quality check. Service providers therefore no longer need to harvest all repositories themselves; they can start with the metadata store. The DRIVER consortium successfully submitted DRIVER II as a project under the EU's Seventh Framework Programme. It is expected to begin before the end of 2007. DRIVER II will go much further in terms of improving the infrastructure, which will be made suitable for enhanced publications (i.e. scientific or scholarly articles published along with the related research data, visualisations and models). The project will also demonstrate Web 2.0 services. The number of participating countries will increase from eight to perhaps fifteen, and the number of repositories will undoubtedly exceed 200.
Recognition
At the start of the DARE Programme, institutional repositories were often viewed as hobby projects for libraries. SURF took the decision to launch the programme on 14 June 2002 in part because it wanted to give a small, motivated and ambitious team the benefit of the doubt. Not all of SURF's participating institutions were by any means persuaded that this was the future. Holland Consulting Group, a private firm of consultants, evaluated the programme halfway through. By then the preconceptions had begun to crumble and approximately half of SURF's participating institutions understood the strategic importance of the new development. The remainder were still cautious, but they soon came round to the others' way of thinking. The final DARE evaluation demonstrated the complete conviction of the university managers.
It was also halfway through the programme, on 10 May 2005, that half the institutions signed the Berlin Declaration during the ceremony marking the launch of Cream of Science. By the close of the DARE Programme on 25 January 2007, the remaining institutions had also signed the declaration.
In short, at the moment all the participating Dutch institutions see easy and quick access to knowledge as part of their mission. Is that an important step? Yes, it is; ten years ago, knowledge communication was a blind spot at the universities. They felt responsible for generating knowledge, and therefore developed the necessary plans, attracted top researchers, set up laboratories, and so forth. But there was little interest in the results of their policy and efforts (articles, reports and contributions to conferences). Circulating the results was left to the individual author, despite the fact that a publication might very easily represent a value, in round terms, of EUR 100,000. Yet only in the instance of their dissertations did the universities take on the task of quality assessment and distribution.
That attitude has changed dramatically. The universities consider access to their scholarly and scientific research results and their reuse important, both as a means of accountability and for competition purposes [27]. They have adopted such a view both at mission statement level by signing the Berlin Declaration and at operational level by setting up repositories. The current position paper of the European University Association in response to the European Commission's Green Paper on the European Research Area reflects this responsibility in a broader policy context on behalf of its 800 members [28].
Use of DAREnet peaked around the launch of Cream of Science in May 2005 and surged once again after the opening of the Promise of Science site in September 2006. The decline in the autumn of 2005 was the result of a technological malfunction that lasted weeks. Use has grown steadily since then to approximately 30,000 visitors a month in the last three months of 2006; a recent report by the current manager, the Royal Academy, shows that the site became even more popular in 2007. Scientists and scholars have now begun referring to their publications in DAREnet in their list of references. They evidently have confidence in the future of DAREnet.
DAREnet has also received recognition internationally. Googling 'DAREnet' produces 2.5 million hits and the site www.darenet.nl has a Page Rank of 7, comparable with major cities and the leading Dutch online auction site. In other words, DAREnet is the Netherlands' most important showcase for research.
Cream of Science serves as a reference for comparable projects in Britain and Germany, and has even caught the attention of Japan [29]. An international demonstrator has been developed as a follow-up to the Promise of Science dissertation site that harvests dissertations from repositories in five countries. The project has been so successful [30] that it has now been decided to use it as a basis for offering dissertations from dozens of European repositories in DRIVER as a separate subsidiary collection, precisely as is done in DAREnet. It will be completed on 1 December 2007.
Success Factors
The timing of DARE could not have been better. The programme keyed into a growing awareness at universities and research institutes that they were not only responsible for generating knowledge but also for disseminating it. Scholars and scientists had come to much the same conclusion. As far back as 2000, some 34,000 scholars and scientists signed an open letter to their publishers urging them to make their articles freely accessible within six months - a plea that the publishers chose to ignore. Advances in technology had also come at just the right moment. The development of the OAI protocol and the metadata standard Dublin Core could not have been timelier.
Nevertheless, these factors do not explain precisely why DARE has been such a huge success; they are general in nature and could apply in many different countries. More specific in the case of the Netherlands is the unusual way in which Dutch Higher Education and research institutions banded together to tackle issues in the field of ICT under the name of SURF [31]. All sixty eligible institutions take part in SURF, and every four years they decide on the organisation's future. They do that by approving or rejecting its four-year plan. If it is approved, the institutions bear responsibility for the basic funding for that same period. SURF then goes in search of additional funds to finance its various programmes. This is an attractive formula for the government because the plans thus submitted are already supported on a national scale and their implementation is in experienced hands. The SURF formula is unique in the world, although there are organisations in the UK (JISC) and Scandinavia that bear a resemblance. One important difference, however, is that SURF was set up from the bottom up, whereas its sister organisations (and partners of SURF) are more or less top-down in character.
SURF assembled an enthusiastic and successful community for the DARE Programme, with a lively intranet and an intensive knowledge-sharing and decision-making structure, leading to a number of joint milestones such as DAREnet (25 January 2004), Cream of Science (10 May 2005), Promise of Science (13 September 2006) and the HBO Knowledge Bank (8 November 2006). The climax was the HonDAREduizend Project, which achieved the 100,000th open access Dutch publication in late December 2006.
Next Steps
SURF's strategic plan for the period 2007-10 was adopted unanimously in April 2006. The strategic plan provides the basis for the successor to the DARE Programme, SURFshare [32], which is being developed to reflect the research life cycle. The research life cycle describes the various steps of a research project, from funding to implementation to results and effects and (where possible) to new funding. It has been used to define those areas that have already been covered by the DARE Programme. SURFshare focuses on the areas that are still undeveloped, including the pre-publication phase. This is where the collaboratories or virtual research environments are situated, with scientists and scholars worldwide co-operating with one another on enhanced publications, regardless of time and place. An invitation to submit projects testing existing software for the scientific publication environment was issued in July 2007. The software packages involved are Sharepoint (business environment) and SAKAI (education environment). These pioneering projects are intended to produce specific insights and lessons that can be used to develop functional publication environments in which it is easy to work with different versions and that are characterised by refined access control, enhanced publications and interoperability.
The post-publication phase also falls within the remit of SURFshare, however. Repositories will have to be designed or re-designed to deal with complex or enhanced publications, making new demands on the structure and transportability of the metadata. SURFshare will be working closely with the OAI-ORE (Open Archives Initiative - Object Reuse and Exchange) initiative in the USA [33] on this aspect. One area of focus will be the long-term storage and accessibility of research data, involving new initiatives by the Royal Academy (DANS [34] for the social sciences and humanities) and the 3-TU consortium (for technical data). The structural relationship between the institutional repositories and the current research information systems will be improved; a strategic analysis was developed for this purpose in Knowledge Exchange and will be used as a guideline [35].
All these areas have important points in common with the new European DRIVER II Project. The participation of the Dutch universities has been organised in a Joint Research Unit under SURF's leadership. It is the first time that this structure - which was designed for research institutes that wish to co-operate with one another internationally as a unit in a European project - is being applied by a national team in a European infrastructure project, and as such it is a relatively high-profile move.
Publication initiatives will be launched or receive support without the quality assessment depending on authors' signing away their rights. An invitation has been issued to submit project proposals, including ones that involve experiments with open review processes or structured open annotation processes.
The second phase of the SURFshare Programme will also look at the effect of publication. The current impact factor is a one-dimensional and contested measure. The world of high-energy physics (CERN and Los Alamos) is working hard on improved and supplementary methods. The new approach taken in the UK's Research Assessment Exercise is also being followed closely. Where possible, the SURFshare programme will co-operate in such initiatives or develop its own.
Finally, SURFshare encompasses a crash approach to achieve parity in knowledge dissemination as quickly as possible between Dutch universities of applied sciences and Dutch research universities.
Acknowledgement
The author wishes to thank Annemiek van der Kuil, DARE's Community Manager, for her in-house reviewing of this article.
References
- Web site of Conference on Open Access to Knowledge in the Sciences and Humanities, Berlin, 20 - 22 Oct 2003 http://oa.mpg.de/openaccess-berlin/berlindeclaration.html
- Open Archives Initiative site http://www.openarchives.org/
- Herbert van de Sompel, Carl Lagoze, "The Santa Fe Convention of the Open Archives Initiative", D-Lib Magazine, February 2000 http://www.dlib.org/dlib/february00/vandesompel-oai/02vandesompel-oai.html
- Dutch research window DAREnet site http://www.darenet.nl/en/page/language.view/search.page
- Union catalogue of digital resources site http://www.oaister.org
- The Directory of Open Access Repositories site http://www.opendoar.org/index.html
- Gerard van Westrienen, Clifford A. Lynch, "Academic Institutional Repositories. Deployment Status in 13 Nations as of Mid 2005", D-Lib Magazine, September 2005 http://www.dlib.org/dlib/september05/westrienen/09westrienen.html
- Clifford A. Lynch, Joan K. Lippincott, "Institutional Repository Deployment in the United States as of Early 2005", D-Lib Magazine, September 2005 http://www.dlib.org/dlib/september05/lynch/09lynch.html
- Maurits van der Graaf, "DRIVER: Seven Items on a European Agenda for Digital Repositories", Ariadne, July 2007 http://www.ariadne.ac.uk/issue52/vandergraf/
- DAI (Digital Author Identification) Project site http://www.rug.nl/Bibliotheek/informatie/digitaleBibliotheek/dailang
- Jeroen Bekaert, Patrick Hochstenbach, Herbert Van de Sompel, "Using MPEG-21 DIDL to Represent Complex Digital Objects in the Los Alamos National Laboratory Digital Library", D-Lib Magazine, November 2003 http://www.dlib.org/dlib/november03/bekaert/11bekaert.html
- Martin Feijen, Annemiek van der Kuil, "A Recipe for Cream of Science: Special Content Recruitment for Dutch Institutional Repositories", Ariadne 45, October 2005 http://www.ariadne.ac.uk/issue45/vanderkuil/
- Promise of Science, doctoral e-theses in the Netherlands site http://www.darenet.nl/nl/page/language.view/promise.page
- Learning Objects Repository network site http://www.lorenet.nl/en/page/luzi/show?name=show&showcase=1
- HBO Knowledge Base, bachelor e-theses in the Netherlands site http://www.hbo-kennisbank.nl/en/page/page.view/hbo.page
- DARE-based services Web site http://www.darenet.nl/nl/page/language.view/diensten.diensten
- Nereus access to economics resources site http://www.nereus4economics.info/about_us.html
- Connecting-Africa, African Studies web portal http://www.connecting-africa.net/
- Selected Organic Reactions Database site http://www.sord.nl
- JISC-SURF copyright toolbox with, among others, publishing licences in English, Dutch, French and Spanish http://copyrighttoolbox.surf.nl/copyrighttoolbox/authors/licence/
- Knowledge Exchange site http://www.knowledge-exchange.info/Default.aspx?ID=1
- Petition for guaranteed public access to publicly-funded research results http://www.ec-petition.eu
- SURF's Web site with depositing licence (in Dutch only) http://www.surffoundation.nl/smartsite.dws?ch=AHO&id=12625
- Open source OAI-repository tools like harvester, metadata store, indexer, drill down, search, rss etc. http://www.cq2.nl/page/meresco.page
- DRIVER site http://www.driver-repository.eu/
- Guidelines for DRIVER repositories (also in Spanish) http://www.driver-support.eu/en/guidelines.html
- Richard Poynder interviews Leo Waaijers in his series of OA interviews http://www.richardpoynder.co.uk/Waaijers%20Interview.pdf
- EUA's response to Commission's "Green Paper" consultation on the European Research Area, September 2007 http://www.eua.be/fileadmin/user_upload/files/Policy_Positions/EUA_Response_to_ERA_Green_Paper.pdf
- Cream of Science in Japanese http://www.nii.ac.jp/irp/info/translation/feijen.html
- Drs. M.P.J.P. Vanderfeesten, "A portal for doctoral e-theses in Europe; Lessons learned from a demonstrator project", July 2007 http://www.surffoundation.nl/download/ETD_LessonsLearned_Annex.pdf
- SURF Web site http://www.surffoundation.nl/smartsite.dws?id=5289&ch=ENG
- "SURFshare Programme 2007-2010, Condensed version" http://www.surffoundation.nl/download/SURFshare%20programme%202007-2010%20Condensed%20version%20website.pdf
- Open Archives Initiative Object Reuse and Exchange site http://www.openarchives.org/ore/
- Data Archiving and Networked Services site http://www.dans.knaw.nl/en/
- Mathias Razum, Ed Simons, Wolfram Horstmann, "Institutional Repositories Workshop Strand Report. Strand title: Exchanging Research Information", February 2007 http://www.knowledge-exchange.info/Default.aspx?ID=164