From Nought to a Thousand: The HUSCAP Project
Hokkaido University launched its project to construct an institutional repository in early 2004. After a year of discussion, planning and preparation, we started soliciting content in July 2005. Within a year of that start, we had assembled a repository of approximately 9,000 documents. It is named the Hokkaido University Collection of Scholarly and Academic Papers (HUSCAP)[1]. Eight thousand of these documents are digitised back issues of faculty journals published over the many years of Hokkaido University's history. Most editorial boards of academic journals at Hokkaido University have been considering different ways of publishing their journals electronically. Consequently, the proposal to provide our repository as an electronic journal platform was both readily understood and welcomed. The HUSCAP Project scanned the back issues as PDF files.
This article will cover the remaining 1,000 documents, being the international and peer-reviewed journal articles written by researchers at Hokkaido University.
Hokkaido University is a national university with 20 faculties and 23 research centres. It employs 2,132 researchers with 6,357 graduate students and 11,640 undergraduates on roll. According to the Thomson Scientific database [2], researchers at Hokkaido University publish some 2,500 to 3,000 papers a year in the journals that are registered on that database. We have always considered these papers as one of the most valuable parts of HUSCAP. They have been central to our strategy on content recruitment. However, collecting them has been one of the most challenging aspects of this project.
Here we describe our efforts to recruit the first 1,000 papers to the HUSCAP repository, a number, we recognise, that is nonetheless still far lower than the number of papers produced every year.
The Open Access Landscape in Japan, Lessons from Existing Projects and Our Thinking
As of early 2004, the 'open access' movement and the concept of an institutional repository were little known in Japan. It was the National Institute of Informatics (NII)[3] that played an important role in promoting the ideas of open access and the construction of institutional repositories. The NII launched the NII Institutional Repository Portal (NII-IRP) Project [4] in June 2004, with six partner universities. The NII and its partners conducted a trial implementation of DSpace and Eprints software and investigated how to localise them for Japan. Under the project, promotional activities were conducted at universities in Japan. Hokkaido University, as a project partner, had discussions with Stevan Harnad [5], who came to Tokyo at the express invitation of the NII. Furthermore, colleagues at Hokkaido have read many papers on the philosophical, practical and technical issues involved in the development of institutional repositories, papers which were translated into Japanese under the NII-IRP Project. Among these, the article on DAEDALUS in Ariadne [6] and the article on the University of Rochester's experience in D-Lib Magazine [7] were very helpful to us in advancing the HUSCAP Project.
Our strategy on content recruitment can be summarised as: informing the community and demonstrating the usefulness of open access but without necessarily giving a full explanation or engaging in heavy marketing. We hoped our institutional repository would play a central role in supporting the accountability of Hokkaido University to its stakeholders by providing free access to the research produced by the University. However, it would not make much sense for us to promote this work without being able to point to any achievements. Consequently we decided that we had to begin developing an institutional repository as a pilot project by ourselves and seek to build up a certain degree of content so as to gain the approval of the researchers we hoped to recruit.
In promoting our activities to researchers, we avoided the use of phrases like 'open access', 'institutional repository', 'self-archiving' and the like. Instead, we described the project as another activity of library collection development in the digital age. We asked the researchers to contribute their research papers in electronic form to increase our library collections and promised that we would process them in just the same way as traditional library materials, through acquisition, physical processing, adding them to the library catalogue and making them available to users. Our thinking was that, even if their first submission to the institutional repository only arose as a direct result of the invitation by the Library, once their paper appeared in the repository and they learned that users from around the world were frequently downloading it, they would understand the significance of open access and the institutional repository. Consequently they might be more willing to contribute a second and a third paper, which would lead, we hoped, to spontaneous self-archiving.
Our first step was to raise awareness of HUSCAP. We did not expect all researchers to understand everything about it, but we did expect that this action would leave them with the impression that the library really wanted their papers in electronic form without the absolute necessity of their understanding exactly why.
Broad-brush but Conspicuous Advocacy
What led us to this strategy of informing the community and demonstrating the usefulness of open access without necessarily giving a full explanation boils down to our initial experience. At the beginning of the project, we tentatively named the repository the Hokkaido University Research Repository. We produced a leaflet describing the 'serials crisis' and the concept of open access and sent it to all the researchers at the University. Some researchers said that they were not familiar with the term 'repository' or with what the Library understood by the word. Others admitted they received so many leaflets that they tended to dismiss them as junk mail. It looked as if it was going to be an uphill climb to disseminate the notion of an institutional repository by means of a single sheet of paper.
So we gave up on the one-sheet explanation, deciding instead to devote ourselves just to making the project itself far better known. We decided to change the name of our repository to HUSCAP, after the blue honeysuckle (haskap in Japanese)[8]. This berry grows wild on Hokkaido, and the members of Hokkaido University are quite familiar with it. We designed a logo based on the haskap and put up posters with this symbol all over the campus, keeping the text down to a few explanatory sentences.
At the same time we also prepared a 12-page pocket-size guidebook to HUSCAP for those researchers who had expressed interest in the project having seen the posters. We made the guidebook available by leaving it near the posters. Furthermore we produced a flyer entitled "Someone's sure to be needing a paper of yours" and handed it out to users of the Interlibrary Document Delivery Service together with the guidebook.
In the first two months of the HUSCAP Project, we held over 30 presentation sessions for researchers. As with the poster, we did not go into too much detail, preferring only to notify them that the library was collecting e-prints. The researchers here are so busy that we limited all presentations to 15 minutes. We prepared a 10-page slide presentation based on the idea that this project aimed to collect research papers as a further library collection. This presentation confined itself to a minimum explanation of the open access movement and its benefits. For example only one slide was devoted to introducing Professor Harnad's research on the impact of open access [9]. When this slide was shown, the audience was heard to utter a collective and interested 'Hmm...!' It was just a brief mention of the open access movement but, we believe, a telling one.
To our regret, the audiences were smaller than expected. However, lively discussion arose every time after these 15-minute presentations, on copyright issues, the journal submission process and research life, all of which proved an invaluable opportunity for us to hear opinions on the institutional repository and scholarly communication in general.
Ultimately, not all the researchers who supported the idea of an institutional repository actually contributed papers, as was also the case with DAEDALUS and Rochester, described in the Ariadne and D-Lib articles mentioned earlier [6][7]. Paper submissions to HUSCAP did increase immediately after these presentations, but the increase was not sustained. We realised all too keenly that we needed sustainable tactics that would allow us to acquire content on a continuous basis.
Not Just Any Paper, But a Specific Paper
At the very outset of the project, we had asked researchers individually to submit their papers to the institutional repository. Using the Thomson Scientific database, we listed the papers published over the last two years for each author. We made inquiries about researchers' intentions as regards co-operating in the creation of an institutional repository. Furthermore, we contacted 60 researchers who were interested in our project, providing them with a list of their papers together with a polite letter requesting them to donate their papers to the repository. The largest number of papers listed was 27 and the smallest was 1. The overall total came to 226 and the average number requested was 3.77. However, only 25 papers were actually contributed, and these by 10 researchers.
As a consequence, we interviewed some of the 60 researchers contacted, and their responses can be summarised as follows:
- It is bothersome to look for manuscripts of past papers, which are often scattered or lost.
- The manuscript is sometimes scrapped once the paper is published in a journal.
- It was unclear which paper was wanted. Did the repository want every paper ever written by the researcher?
Researchers who accessed our repository system expressed a variety of opinions:
- The metadata required are too complicated and exhaustive. The researchers did not want to spend more than a very short time on the submission process.
- The process looked troublesome at first, but turned out to be unexpectedly easy.
- It took a long time to understand how to make the first submission, but after that it was easier.
In light of these comments, we revised our strategy. The most important challenge seemed to be making the researcher's first submission easier. Towards this goal, we decided to abandon making requests for just any paper and instead to focus on each researcher's most recent one.
The actual process is detailed here:
- Search databases every Monday to find the latest papers written by Hokkaido University researchers.
- Check the RoMEO/SHERPA list [10] and select papers published in "green" journals from the database search results.
- Send concise e-mails to the authors, saying "Would you be willing to contribute your manuscript '______________' as an e-mail attachment for HUSCAP's collection? We have permission from the journal."
- Load all contributed papers into the repository.
- Send an e-mail thanking the author and outlining, for the first time, the benefits of open access in terms of research impact and our wish to have any other papers written by that researcher.
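The filtering and request steps above lend themselves to partial automation. The following is only a minimal sketch of that idea, not the HUSCAP Project's actual tooling: the `Paper` record, its `romeo_colour` field (assumed here to have been filled in by hand from the SHERPA/RoMEO list) and all function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    """One hit from the weekly database search (hypothetical structure)."""
    title: str
    author_email: str
    journal: str
    romeo_colour: str  # assumption: looked up manually in SHERPA/RoMEO


def select_green(papers):
    """Keep only papers published in journals classed as "green"."""
    return [p for p in papers if p.romeo_colour.lower() == "green"]


def draft_request(paper):
    """Build the concise request e-mail, naming the specific paper wanted."""
    body = (
        f"Would you be willing to contribute your manuscript '{paper.title}' "
        "as an e-mail attachment for HUSCAP's collection? "
        "We have permission from the journal."
    )
    return {"to": paper.author_email, "body": body}


def weekly_requests(papers):
    """Filter by journal policy, then draft one e-mail per remaining paper."""
    return [draft_request(p) for p in select_green(papers)]
```

Keeping the journal-policy check separate from the e-mail drafting means the request text can be changed without touching the selection logic.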
The results of this approach are shown in Table 1 below.
Period | Request | Papers requested | Papers received | Rate |
Mar.-May 2005 | Papers published in the last two years | 226 | 25 | 11% |
Jan.-Mar. 2006 | Papers appearing in the database in the last week | 409 | 201 | 49% |
We were able to obtain 49% of the manuscripts we requested. While this figure is not yet high enough, we think it may represent a far more effective way to collect journal articles for an institutional repository than the retrospective method.
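The rates in Table 1 follow directly from the counts of papers requested and received. A quick check, with a hypothetical helper name:

```python
def response_rate(received, requested):
    """Share of requested papers actually received, as a whole percentage."""
    return round(100 * received / requested)


retrospective = response_rate(25, 226)   # Mar.-May 2005, last two years' papers
weekly = response_rate(201, 409)         # Jan.-Mar. 2006, last week's papers
print(retrospective, weekly)
```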
In addition, the request from the Library to the researchers has also made them more aware of self-archiving. Some of them therefore are starting to deposit every new article with us as it is published. Just as encouraging is the fact that a few other researchers also made inquiries about depositing all their past works.
It was instructive for us to have contact with so many researchers. Some of them now participate in the HUSCAP Project as early adopters. We also met a few researchers who held negative, skeptical or contrary views. Talking with them was stimulating and instructive, and encouraged us to reconsider our project and seek better policies.
Less Promotion, More Evidence
Now we are starting to show results. The HUSCAP Project has started to provide an e-mail service in which we notify authors of the monthly count of their articles downloaded from HUSCAP, based on the httpd access log. The e-mail contains the download count for each of that researcher's articles, and we offer the option of receiving the download count per domain (e.g. "xx downloads from .uk").
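A download count of this kind can be derived from a web server access log in a few lines. The sketch below is an illustration only, not the script used for HUSCAP. It assumes log lines in the Common Log Format, that client hostnames have already been resolved in the log, and that full-text files are served under a `/bitstream/<prefix>/<item>/` path; the function names are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Common Log Format prefix: host ident user [time] "GET path PROTO" status
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]+\] "GET (?P<path>\S+) [^"]*" (?P<status>\d{3})'
)
# Assumed URL layout for a full-text file: /bitstream/<prefix>/<item>/...
BITSTREAM = re.compile(r"/bitstream/(?P<handle>\d+/\d+)/")


def top_level_domain(host):
    """Return e.g. ".uk" for a hostname, or "unresolved" for a bare IP address."""
    if ":" in host or host.replace(".", "").isdigit():
        return "unresolved"
    return "." + host.rsplit(".", 1)[-1].lower()


def count_downloads(lines):
    """Return (downloads per item, per-item downloads by top-level domain)."""
    totals = Counter()
    by_domain = defaultdict(Counter)
    for line in lines:
        m = LOG_LINE.match(line)
        if not m or m.group("status") != "200":
            continue  # skip malformed lines and unsuccessful requests
        b = BITSTREAM.search(m.group("path"))
        if not b:
            continue  # not a full-text download
        handle = b.group("handle")
        totals[handle] += 1
        by_domain[handle][top_level_domain(m.group("host"))] += 1
    return totals, by_domain


def domain_report(by_domain, handle):
    """Format the detailed version of the report, one line per domain."""
    return [f"{n} downloads from {tld}" for tld, n in by_domain[handle].most_common()]
```

A production version would also need to exclude search-engine crawlers and repeated requests from the same client, which this sketch does not attempt.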
On 1 June 2006, the service made its debut with e-mails to the 303 researchers who had contributed articles to HUSCAP, enquiring whether they would like to receive the e-mail service. Forty per cent said yes, and half of those chose the version of the service giving the more detailed report.
Respondents' Preferences | Researchers | Ratio |
(a) Was sent an e-mail | 303 | 100% |
(b) Prefers brief e-mail | 60 | 20% |
(c) Prefers detailed e-mail | 61 | 20% |
(d) Prefers either of the e-mails (b+c) | 121 | 40% |
Though inevitably anecdotal, the responses from some of the users of the e-mail service are instructive:
- "I was surprised to hear of my paper being accessed so often."
- "Please send me the results in greater detail. I'm interested to know who's reading my papers."
- "This is a good way to understand who my readership tends to be."
- "As a researcher, it's a real encouragement for me to know how often my paper has been downloaded. From now on, I'll be sending my papers to HUSCAP whenever I have something valuable to contribute."
- "I felt gratified to hear someone was interested in our research and had read our paper. I had thought the research community in my discipline must be very small in Japan."
- "The download count gives us encouragement and enthusiasm. Thank you very much."
We expect the e-mail service to serve as a distinct incentive to researchers who are thinking of contributing other papers to HUSCAP. We do not have enough data as yet to be able to analyse the effect of the e-mail service on self-archiving; however, immediately after the first e-mails were sent, 22 papers were donated voluntarily to HUSCAP from researchers who had received them.
Wider Access
During the course of the 15-minute presentations, researchers often asked whether papers contributed to HUSCAP would be accessible through the databases they tended to use, such as PubMed, SCOPUS and Web of Science. We generally replied that HUSCAP was crawled by Google and other Web search engines. This answer persuaded many, but there was a significant remainder who felt this was not sufficient.
We are therefore preparing a new access path to HUSCAP in order to address this concern. The ways of accessing institutional repositories will multiply, as seen in Thomson's Web Citation Index Pilot Project [11], in which seven repositories are participating, and in Elsevier's SCIRUS [12], which includes T-Space at the University of Toronto and CURATOR at Chiba University as search targets. As a new means of accessing its institutional repository, Hokkaido University has launched a collaborative research project with the Openly Informatics Division of the Online Computer Library Center (OCLC), which has a link server named '1CATE' [13]. We have developed a framework whereby users can access self-archived content in institutional repositories via link servers.
In order to respond to OpenURL requests and provide a link to the content, institutional repositories have to have rich and accurate metadata. HUSCAP is based on DSpace 1.3.2. The DSpace default metadata schema is based on the Library Application Profile of Dublin Core, in which citation information is entered into a single element (bibliographicCitation) as unstructured free text [14]. We modified and extended the DSpace metadata specification to allow DOI, PubMed ID, ISSN, E-ISSN, journal title, volume, issue, spage and epage each to be described as a separate metadata element.
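The practical benefit of separate elements is that a record can be found by exact match on an identifier or on a journal, volume and start-page combination, which a free-text citation string does not allow reliably. The following sketch illustrates the idea under stated assumptions: the class and function names are hypothetical, the field names simply echo the elements listed above, and this is not the DSpace schema extension itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ArticleMetadata:
    """Citation fields held as separate elements (hypothetical structure)."""
    jtitle: str
    volume: str
    spage: str
    issue: Optional[str] = None
    epage: Optional[str] = None
    issn: Optional[str] = None
    eissn: Optional[str] = None
    doi: Optional[str] = None
    pmid: Optional[str] = None


def lookup_keys(record):
    """Yield every key under which a link server might ask for this record."""
    if record.doi:
        yield ("doi", record.doi.lower())  # DOIs are case-insensitive
    if record.pmid:
        yield ("pmid", record.pmid)
    for number in (record.issn, record.eissn):
        if number:
            # Normalise punctuation so "1234-5678" and "12345678" match.
            normalised = number.replace("-", "").upper()
            yield ("citation", normalised, record.volume, record.spage)


def build_index(records):
    """Map each lookup key to the repository location of the matching record."""
    index = {}
    for location, record in records.items():
        for key in lookup_keys(record):
            index[key] = location
    return index
```

With such an index, a request carrying only an ISSN, volume and start page can be answered as readily as one carrying a DOI.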
Hokkaido University and OCLC also defined an experimental XML schema named ir.xsd [15], in accordance with which institutional repositories might reply to link servers. In addition, Hokkaido University has developed a function for DSpace that resolves OpenURL 0.1 and OpenURL 1.0 requests and responds by sending the location information of the requested paper based on ir.xsd.
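Supporting both OpenURL versions largely comes down to reducing two key naming styles to one set of fields before looking the paper up. The sketch below shows that normalisation step only. It is not the DSpace function described above, the function name is hypothetical, and the ir.xsd response format is deliberately not reproduced. It relies on the usual conventions that OpenURL 1.0 key/value requests prefix referent fields with `rft.` and carry identifiers as `rft_id=info:doi/...`, while OpenURL 0.1 uses bare field names and `id=doi:...`.

```python
from urllib.parse import parse_qs

# Citation fields whose names differ only by the "rft." prefix in version 1.0.
FIELDS = ("issn", "eissn", "volume", "issue", "spage", "epage")


def parse_openurl(query):
    """Reduce an OpenURL 0.1 or 1.0 (key/value) query string to one flat dict."""
    params = parse_qs(query)
    is_v1 = "url_ver" in params or any(k.startswith("rft") for k in params)
    prefix = "rft." if is_v1 else ""
    citation = {}
    for field in FIELDS:
        values = params.get(prefix + field)
        if values:
            citation[field] = values[0]
    # The journal title key differs between versions: "title" vs "rft.jtitle".
    title = params.get("rft.jtitle" if is_v1 else "title")
    if title:
        citation["jtitle"] = title[0]
    # Identifiers: 1.0 uses rft_id=info:doi/..., 0.1 uses id=doi:...
    if is_v1:
        id_key, schemes = "rft_id", {"info:doi/": "doi", "info:pmid/": "pmid"}
    else:
        id_key, schemes = "id", {"doi:": "doi", "pmid:": "pmid"}
    for value in params.get(id_key, []):
        for scheme, name in schemes.items():
            if value.startswith(scheme):
                citation[name] = value[len(scheme):]
    return citation
```

For example, `parse_openurl("id=doi:10.1000/example&issn=1234-5678&volume=12&spage=34")` and the equivalent 1.0 request with `rft.`-prefixed keys reduce to the same dictionary, so a single lookup routine can serve both.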
The screenshot below shows the 1CATE test server navigation window, with links to the published version of the paper on Elsevier's ScienceDirect and to the author's version on HUSCAP.
Figure 2: 1CATE's navigation window including a link to HUSCAP
Navigation to the appropriate copy is an essential function of a link server. Collaboration between link servers and institutional repositories may provide considerable benefits to users whose institutions are unable to subscribe to electronic journals. Moreover, our on-campus promotion and content recruitment are expected to strengthen the incentive for potential contributors. We may soon be close to answering yes to the question about wider access.
Conclusion
This article summarises what Hokkaido University has done in the first year since it established an institutional repository. The lessons from the HUSCAP Project may be summarised as follows:
- It is more important and effective to secure fresh digital literature produced day by day before it is lost, than to focus only on compiling past works.
- So that researchers might understand the aim of an institutional repository, and to recruit repeat submitters, it may be effective to notify them how much their papers in the institutional repository are read. This will make them aware of the value of open access.
Our goal is to shift from content recruitment by database-based individual requests to voluntary deposit, by holding steadfastly to our strategy of making our aims understood. Co-ordination between link servers and institutional repositories will underpin our arguments.
For institutional repositories it will be valuable for researchers to be aware of the benefits of self-archiving and to undertake the latter voluntarily, whether or not we mandate self-archiving. We will continue to look for the best way to get that first research paper and to make it easier for authors to participate.
References
- HUSCAP http://eprints.lib.hokudai.ac.jp/
- Web of Science http://scientific.thomson.com/products/wos/
- NII http://www.nii.ac.jp/index.shtml.en
- The NII-IRP Project http://www.nii.ac.jp/metadata/irp/ (in Japanese)
- Professor Stevan Harnad is currently Canada Research Chair in Cognitive Science at Université du Québec à Montréal (UQAM) and Professor of Cognitive Science at the University of Southampton. He is also an External Member of the Hungarian Academy of Sciences. He is moderator of the American Scientist Open Access Forum http://amsci-forum.amsci.org/archives/American-Scientist-Open-Access-Forum.html
- Morag Mackie, "Filling Institutional Repositories: Practical strategies from the DAEDALUS Project", April 2004, Ariadne Issue 39 http://www.ariadne.ac.uk/issue39/mackie/ and http://www.nii.ac.jp/metadata/irp/mackie/ (Japanese Translation)
- Nancy Fried Foster and Susan Gibbons, "Understanding Faculty to Improve Content Recruitment for Institutional Repositories". D-Lib Magazine, 11(1), January 2005 http://www.dlib.org/dlib/january05/foster/01foster.html and http://www.nii.ac.jp/metadata/irp/foster/ (Japanese Translation)
- Lonicera caerulea http://en.wikipedia.org/wiki/Lonicera_caerulea
- Stevan Harnad, "Comparing the Impact of Open Access (OA) vs. Non-OA Articles in the Same Journals". D-Lib Magazine, 10(6), June 2004 http://www.dlib.org/dlib/june04/harnad/06harnad.html and http://www.nii.ac.jp/metadata/irp/harnad/ (Japanese Translation)
- SHERPA/RoMEO publisher copyright policies and self-archiving http://www.sherpa.ac.uk/romeo.php
- Web Citation Index http://scientific.thomson.com/press/2005/8298416/
- SCIRUS http://www.scirus.com/
- Openly Informatics Division, OCLC http://www.openly.com/ and its 1CATE http://www.openly.com/1cate/
- DSpace's metadata http://dspace.org/technology/metadata.html
- XML Schema for the project (ir.xsd) http://eprints.lib.hokudai.ac.jp/ir.xsd