Supporting Digital Preservation and Asset Management in Institutions
In the early days of the shift from paper-based to digital means of holding administrative records, research data, publications and other academic resources, those responsible for the safety of this material tended to breathe a sigh of relief once they had got a category of material into digital form. With the material reduced to bits and bytes, all they would have to do was make regular backups, perhaps keeping a copy off-site in case of disaster, and all would be well. Increasingly, material of value to Further and Higher Education is produced and held only in digital form. Increasingly, those with responsibility for the care of this material are becoming aware that sound backup procedures are only the beginning of care. Physical carriers of digital material deteriorate; digital data can become corrupted; the hardware that reads particular carriers wears out and cannot be replaced when it has become obsolete; file formats become obsolete in the course of software evolution, as backward compatibility is lost over a succession of versions; older versions of software, even when these are available, may not work on new hardware or operating systems. Valuable digital assets of institutions are at risk of loss, in the medium term as well as the long term.
Challenges to preserving access to these assets are (at least) as much related to organisational process, policy and culture issues as to technical issues. The ways of working that worked in the pre-digital era may not transfer well or easily to a time when a high proportion of the information assets of institutions exist (and indeed are meaningful) only in digital form. Senior managers with finance responsibilities need support in assessing the costs and benefits of digital preservation. Institutions need processes in place to support decision making about what material requires active intervention and when. These decisions must be based not only on selection and retention policies, sometimes related to legal compliance requirements, but also on judgements of levels of risk to material, and on levels of risk from the loss of particular material or classes of material. These challenges are common across Further and Higher Education institutions, and institutions can benefit from collaboration and sharing of experience in meeting the challenges. They need support in developing organisational processes and technical systems for digital preservation and access management.
The Joint Information Systems Committee (JISC) aims to raise awareness of digital preservation issues across Further and Higher Education institutions, and to set in motion a process of integrating digital preservation and asset management into institutional strategies and operations. JISC is doing this through a programme of initiatives supporting digital preservation and asset management in institutions. £700,000 was set aside for this initiative, and proposals for projects were invited through JISC Circular 4/04 in the spring of 2004. An extremely strong field of proposals was received, so much so that further funding was allocated in order to fund more of the projects than originally planned. Eleven projects were funded, involving nineteen institutions, including those in Further and Higher Education, archives, and other bodies. Varying in length from six to twenty-four months, the projects began in late 2004 or early 2005, and all will have been completed by early 2007 at the latest. The focus of the projects is on providing practical support to institutions in ensuring ongoing availability of and future access to digital information of value to the Further and Higher Education community.
The Projects
Three themes for development were set out in the call for proposals.
1st Theme: Institutional Management Support and Collaboration
Theme one, institutional management support and collaboration, attracted six successful proposals. Two of the projects are producing and delivering training, while the other four will provide exemplars for developing institutional digital preservation strategies and the business processes underpinning long-term digital preservation or digital asset management.
The aim of the University of Glasgow project, An effective Strategic model for the Preservation and disposal of Institutional Digital Assets (eSPIDA) [1], is to develop and implement a sustainable business-focussed model for digital preservation, as part of a knowledge management agenda in Higher Education institutions. The model will include the relationships, roles and responsibilities, costs, benefits and risks inherent in institutional digital preservation, building on the experience of the DAEDALUS [2] and Effective Records Management (ERM) [3] projects, and on engagement with a range of stakeholders. A key to success is the building of trust between those who promote digital preservation and the University's decision makers, such that the latter can judge whether digital preservation is simply a cost, or whether it provides tangible and intangible benefits. Librarians, archivists, and information technology specialists at Glasgow will be engaged, alongside those with responsibility for business processes and strategic and financial planning. External expert advisors will be consulted, as well. The digital assets under consideration are materials related to research inputs and outputs, to institutional record keeping, and, to a lesser extent, teaching materials. The modelling will be implemented at the University of Glasgow, and this work will provide guidelines and a documented implemented case study for dissemination across the UK Further and Higher Education community.
Lifecycle Information for E-Literature (LIFE) [4] is a project that aims to explore and develop a life cycle approach to costing digital archiving for e-journals. University College London in partnership with the British Library will investigate financial and other issues around long-term digital archiving, including cost and risk comparisons of preserving a paper and digital copy of the same publication, and of a library in Higher or Further Education undertaking preservation in collaboration with another institution. The project will attempt to identify the point at which there will be sufficient confidence in the stability and maturity of digital preservation to switch from paper to digital archives for publications available in parallel formats. Based on a review of the existing state of knowledge, various methodologies will be implemented to provide case studies. A lifecycle approach takes a long-term view of stewardship of collections, starting at the point of acquisition (whether by creation, deposit or purchase), and defining the relationships between stages of an item's existence over time, identifying the costs of each stage. The extent to which the costing and risk models used for print format material can be applied to digital format material will be considered. Although the focus here is on journal articles, it is assumed that the lifecycle approach developed could be applied to other categories of material, such as e-learning objects and the products of digitisation projects. The findings will be evaluated and validated in the context of an international conference of practitioners.
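The idea of identifying the cost of each stage and summing over an item's existence can be illustrated with a minimal sketch. This is not the LIFE project's costing model: apart from acquisition, the stage names are assumptions made for illustration, as are the split into one-off and annual costs and every figure shown.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    """One stage in an item's lifecycle (names and figures are hypothetical)."""
    name: str
    one_off_cost: float  # cost incurred once, when the stage is entered
    annual_cost: float   # recurring cost for each year spent in the stage
    years: float         # time the item spends in the stage


def stage_cost(stage):
    """Cost of a single stage: one-off cost plus recurring cost over time."""
    return stage.one_off_cost + stage.annual_cost * stage.years


def cost_by_stage(stages):
    """Per-stage breakdown, so that the expensive stages can be identified."""
    return {s.name: stage_cost(s) for s in stages}


def lifecycle_cost(stages):
    """Total cost over the whole lifecycle: the sum of the stage costs."""
    return sum(stage_cost(s) for s in stages)


# Hypothetical figures for one e-journal title held for ten years.
example = [
    Stage("acquisition", one_off_cost=120.0, annual_cost=0.0, years=0),
    Stage("ingest", one_off_cost=40.0, annual_cost=0.0, years=0),
    Stage("storage", one_off_cost=0.0, annual_cost=15.0, years=10),
    Stage("preservation action", one_off_cost=60.0, annual_cost=5.0, years=10),
]

breakdown = cost_by_stage(example)
total = lifecycle_cost(example)
```

The same breakdown could be produced for a paper copy of the same title, which would allow the kind of paper-versus-digital comparison the project describes, provided that the stages of the two lifecycles can be matched.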
John Wheatley College (JWC), in partnership with the Centre for Digital Library Research (CDLR) at Strathclyde University and the Scottish Library and Information Council, is undertaking a project focussed on curriculum-related material. Managing Digital Assets in Tertiary Education (MANDATE) [5] aims to develop a toolkit for digital asset management and preservation. In this case, the material to be preserved is that created in the development of a new programme of study, and includes learning/teaching materials, assessment materials and procedural documents relating to a comprehensive approval process. The toolkit will support a college-wide strategy for development and storage of this material. It will take into account issues of workflow within an OAIS (Open Archival Information System) framework, and the creation, storage, retention, retrieval, and preservation of material. The requirement for Freedom of Information compliance will be taken into consideration. Templates and software components to support the course development and approval process will be selected, created or further developed where required, and integrated. Based on a needs analysis, training will be provided to enable JWC staff to implement the workflow. The management toolkit, based on a research pilot in JWC, will be tuned to the Further Education environment, but might also have application within Higher Education. It builds on previous work undertaken by JWC within the JISC Records Management Programme, and will draw on the outcomes of the JISC-funded Metadata Workflow Investigation [6] of metadata creation and management processes within the JISC Information Environment (currently being undertaken by CDLR).
The aim of Managing Risk - A Model Business Strategy For Corporate Digital Assets [7] is to address the requirement for a digital asset strategy which combines academic and learning resources and corporate information at King's College London (KCL), and in so doing to provide a case study and model business preservation strategy which will address this common institutional need. The project will focus on the digital assets of the KCL Registry, Estates, and Facilities and Services divisions, the School of Nursing, and the School of Social Science and Public Policy. This will result in a strategy that bridges institutional requirements for business continuity and for digital preservation of academic resources. Examples of the range of material under consideration include records of the training of nurses, and commissioned teaching material for an e-degree in War Studies. As well as addressing the basic issues of 'who should do what, when and how', the project will evaluate the costs, benefits and risks of using external contractors to manage corporate digital assets and the use of consultants for asset survey work. This case study will be published during the current academic year. It will be of value to Further and Higher Education institutions, all of which must manage the risks of losing valuable digital assets through degradation of the physical medium and content corruption, of loss of reputation and income through breaks in business continuity, and of breaches of Data Protection, Freedom of Information and a wide range of equality law through poor coordination of the handling of assets required in respect of legislation.
Oxford University Library Services' METS Awareness Training project [8] aims to raise awareness of the Metadata Encoding & Transmission Standard (METS) [9], particularly with regard to its potential usefulness in the context of digital asset management. METS provides a schema for encoding descriptive, administrative, and structural metadata about objects in a digital library. It is designed not only to meet the needs of an individual digital library, but also to support interoperability across digital libraries. The project will revise an Oxford Digital Library-developed course, both to update it and to make it relevant beyond the Oxford Digital Library context. Revision will be followed by the delivery of the course in a series of seminars at six separate locations around the UK. The course seminars will include a brainstorming session on the possible usefulness of METS in institutions' own digital object management. Not intended as technical training, the course will provide advice on sources for those who need to know more about METS, and prepare them to pursue further training as needed, for example the two-day METS tutorial workshops which the METS Editorial Board will run. The training materials developed within the project will be made freely available for use by libraries.
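To give a sense of how METS brings the three kinds of metadata together in one document, the following minimal sketch builds a skeleton METS record using only the Python standard library. The element names (dmdSec, amdSec, fileSec, structMap) and the namespace come from the METS schema; the identifiers, the single Dublin Core title and the function name build_mets are assumptions made for illustration, and the output has not been validated against the schema.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)
ET.register_namespace("dc", DC_NS)


def q(tag):
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def build_mets(object_id, file_url, mimetype, title):
    """Build a skeleton METS document for one object with one file."""
    mets = ET.Element(q("mets"), {"OBJID": object_id})

    # Descriptive metadata: here a single Dublin Core title, wrapped inline.
    dmd = ET.SubElement(mets, q("dmdSec"), {"ID": "DMD1"})
    wrap = ET.SubElement(dmd, q("mdWrap"), {"MDTYPE": "DC"})
    xml_data = ET.SubElement(wrap, q("xmlData"))
    ET.SubElement(xml_data, f"{{{DC_NS}}}title").text = title

    # Administrative metadata: an empty technical metadata placeholder.
    amd = ET.SubElement(mets, q("amdSec"))
    ET.SubElement(amd, q("techMD"), {"ID": "TECH1"})

    # File section: lists the files that make up the object.
    file_sec = ET.SubElement(mets, q("fileSec"))
    group = ET.SubElement(file_sec, q("fileGrp"), {"USE": "master"})
    file_el = ET.SubElement(
        group, q("file"),
        {"ID": "FILE1", "MIMETYPE": mimetype, "ADMID": "TECH1"},
    )
    ET.SubElement(
        file_el, q("FLocat"),
        {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": file_url},
    )

    # Structural map: ties the files and the descriptive metadata together.
    struct_map = ET.SubElement(mets, q("structMap"), {"TYPE": "physical"})
    div = ET.SubElement(struct_map, q("div"), {"TYPE": "item", "DMDID": "DMD1"})
    ET.SubElement(div, q("fptr"), {"FILEID": "FILE1"})
    return mets


# Hypothetical object identifier and location.
record = build_mets("example-0001", "http://example.org/files/report.pdf",
                    "application/pdf", "An example report")
xml_text = ET.tostring(record, encoding="unicode")
```

The cross-references by ID (DMDID, ADMID, FILEID) are what let a receiving system navigate between description, administration and structure, which is part of what makes the format useful for exchange between digital libraries.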
A much more broadly-based training project, the Digital Preservation Training Programme [10], has the backing of the Digital Preservation Coalition (DPC), the Digital Curation Centre and Cornell University. With this backing, the University of London in partnership with the British Library aims to develop a modular training programme in digital preservation, with class-taught, online and off-line components. Training will be available to the widest possible audience through this mix of self-paced material, taught components and group exercises. The project builds on existing training initiatives and material exemplars. These include the Cornell University digital preservation course, the DPC travelling one-day workshop, the Preservation Management of Digital Material Handbook [11], and training from existing JISC-funded services such as the Arts and Humanities Data Service (AHDS). Modules will be provided that are appropriate to meet the requirements of senior managers as well as practitioners and staff who are newcomers in the field of digital preservation.
2nd Theme: Digital Preservation Assessment Tools
One project was funded under theme two, digital preservation assessment tools. The Digital Asset Assessment Tool (DAAT) Project [12] will develop a tool to assess the preservation needs of digital holdings, allowing resources to be focussed on assets where risk of loss and cost of loss are greatest. The feasibility of building directly on the existing National Preservation Office Preservation Assessment Survey (NPO PAS) methodology and tool will be assessed. The development is led by the University of London Computer Centre in partnership with the AHDS, and with piloting, testing and evaluation by the NPO, The National Archives, the British Library and the School of Advanced Study of the University of London, with further input from the DPC and the Digital Curation Centre (DCC). The piloting phase will be followed by revision and then beta testing followed by final revision. Guidance in its use will be provided when the final version of the tool is made available across the UK FE/HE community. Release of the DAAT will be supported by dissemination and training activities. The tool will be appropriate for use in a broad range of settings, including archives, libraries, data centres, computer services, and by research groups.
3rd Theme: Institutional Repository Infrastructure Development
Institutional repository infrastructure development is the third theme. There are few UK implementations of the Open Archival Information System (OAIS) Reference Model [13]. In spite of this, the OAIS model is a point of common reference and source of shared vocabulary among those concerned with digital archives, including institutional repositories. Funded projects will explore implementations of the OAIS model, and also the use of the Metadata Encoding & Transmission Standard (METS) in the context of preservation. Most currently available open source repository software applications do not have long-term digital preservation as a key goal of their design. In order to facilitate the incorporation of preservation planning and management into repository development, some projects are exploring the integration of preservation functionality in current open source repository software. Three projects were funded under this theme.
Assessment of UK Data Archive and The National Archives compliance with Open Archival Information System and Metadata Encoding and Transmission Standard (OAIS/METS) [14] is being carried out by the UK Data Archive (UKDA) at the University of Essex in partnership with The National Archives (TNA). Both organisations have mapped their systems and metadata to the OAIS Reference Model and METS to test the assumption that both broadly comply with the standards. Because UKDA and TNA have had responsibilities for digital preservation of material created in electronic form for some years, their systems and procedures were put in place before and during the development of these standards. The systems and procedures of the two organisations are similar, but there are differences. The assessment will provide a case study of how such institutions' operational structures can be informed by such standards, and can in turn inform the application of these standards in organisations generally, particularly with regard to their relevance to institutional goals and practical needs. The project is about to report at time of publication of this article.
The Preservation Eprint Services (PRESERV) Project [15] aims to implement an ingest service, based on the OAIS Reference Model, for archives built using Eprints.org software. (In OAIS, ingest is the process by which an item is brought into an OAIS-modelled repository.) The University of Southampton, with The National Archives, will provide modular tools for metadata capture and file format identification and verification, the latter by linking Eprints through a Web service to PRONOM software. While these tools will be automated to the extent that this is feasible, it is recognised that fully automated file format recognition requires a higher level of up-to-date coverage and breadth of coverage than is likely to be possible. For the purposes of evaluation, the ingest service will be integrated into the deposit process of two existing institutional archives, at Southampton and Oxford Universities, subject to prior satisfactory testing on pilot archives. The British Library and Southampton University will build and test an exemplar OAI-based preservation service using preservation metadata collected using Eprints.org software. This service could in principle be used with any OAI-compatible preservation archive to create a software-independent preservation archive. The archives at Southampton and Oxford Universities will be used as the testbed for this service to provide additional distributed archival capability. In the longer term, this approach could be used to build an OAIS implementation over a network of distributed and cooperating services.
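File format identification and verification at ingest can be illustrated with a minimal sketch based on leading byte signatures. This is not the PRESERV ingest service and does not use PRONOM or its Web service: the short signature table, the function names and the three outcome labels are assumptions made for illustration only.

```python
# A deliberately small table of well-known leading byte signatures.
# A real registry covers far more formats, versions and signature positions.
SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"PK\x03\x04", "application/zip"),
]


def identify(data):
    """Return the format whose signature starts the data, or None if unknown."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return None


def verify(data, claimed_format):
    """Compare the format claimed at deposit with the identified format.

    Returns "match", "mismatch", or "unknown" (no signature recognised,
    so the item would need to be referred for manual review).
    """
    found = identify(data)
    if found is None:
        return "unknown"
    return "match" if found == claimed_format else "mismatch"


# Hypothetical deposits: (first bytes of the file, format claimed by depositor).
deposits = [
    (b"%PDF-1.4\n%...", "application/pdf"),
    (b"GIF89a\x01\x00\x01\x00", "image/png"),
    (b"Plain text with no signature", "text/plain"),
]
results = [verify(data, claimed) for data, claimed in deposits]
```

The "unknown" outcome reflects the limit on automation noted above: any format missing from the table cannot be recognised, and container formats that share a ZIP signature cannot be told apart by their leading bytes alone.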
SHERPA Digital Preservation: Creating a Persistent Environment for Institutional Repositories [16] is a project that aims to create a collaborative, shared preservation environment for the SHERPA Project [17] framed around the OAIS Reference Model. The AHDS with the University of Nottingham will carry out this work, with additional support from the Consortium of University Research Libraries (CURL). The JISC- and CURL-funded SHERPA project set up open access e-print repositories in twenty partner institutions, and investigated key issues in creating, populating and maintaining e-print collections. Extending this collaboration into a full preservation service, by bringing together the SHERPA institutional repository systems with the AHDS preservation repository, will remove from each individual institutional repository the burden of adding its own preservation layer. The project will investigate the business case for this model and seek to establish an economic cost model that could be used to ensure its long-term sustainability. The technical challenges, metadata requirements, administrative and workflow processes of the preservation environment will be investigated. The model and working processes that the project will develop and implement are intended to be transferable to other repositories and services.
Cross-theme Project - Themes 1 and 3
One cross-theme project was funded, addressing both the institutional management support and collaboration theme and the institutional repository infrastructure development theme. The Personal Archives Accessible in Digital Media (PARADIGM) Project [18] aims to provide a best-practice template for establishing long-term access to private papers in digital form. (Initially, this project was named Digital Archival Exemplars for Private Papers.) As an exemplar, it will enable long-term access to the private papers of at least two contemporary politicians, one Conservative and one Labour. Project partners are Manchester University and Oxford University, whose libraries have strong archival collections of personal papers. Oxford administers the Conservative Party Archives, while Manchester administers the papers of the Labour Party on behalf of the People's History Museum. The strategies developed will be of use to any institution which collects, preserves and maintains access to private papers, and the results of the project will be made available as best-practice guidelines in the form of a workbook. In addition, the project will report on the experience of testing the application of relevant software such as DSpace or Fedora. The PARADIGM Project will address issues relating to various content types, organisational problems, and compliance with the Freedom of Information, Data Protection and Intellectual Property legislation, as well as comparing OAIS Reference Model and traditional archival accessioning workflows.
Conclusion
This initiative to support digital preservation and asset management in institutions is set within the broader context of the JISC Digital Preservation and Records Management Programme [19]. The Digital Curation Centre (DCC) [20], launched in November 2004, is a major outcome of that programme. In addition, a number of studies have been commissioned, including the eScience Curation Report [21]. JISC also supports the Digital Preservation Coalition (DPC) [22], and actively seeks collaboration with appropriate initiatives and organisations beyond the UK. An example of the latter is its participation in the Europe-based Task Force Permanent Access. In the near future, further studies will be commissioned and projects may be funded in the specialist areas of preservation of learning objects, images, and moving images and sound.
Another related programme is the Digital Repositories Programme [23] in which innovative development projects are moving forward with issues in setting up and using repositories. It brings together people and practices from across various domains, including research, learning, information services, institutional policy, management and administration, and records management. It aims to ensure coordination in the development of digital repositories, in both their technical and organisational aspects.
Taken together, the projects in this programme and related programmes have the potential to make a significant contribution to embedding within UK Higher and Further Education the digital preservation and asset management processes and the technical and human resources that make these processes possible.
References
- An effective Strategic model for the Preservation and disposal of Institutional Digital Assets (eSPIDA) http://www.gla.ac.uk/espida/dp.shtml
- DAEDALUS http://www.lib.gla.ac.uk/daedalus/
- Effective Records Management (ERM) http://www.gla.ac.uk/infostrat/ERM/
- Lifecycle Information for E-Literature (LIFE) http://www.jisc.ac.uk/index.cfm?name=project_life&src=alpha
- Managing Digital Assets in Tertiary Education (MANDATE) http://www.jwheatley.ac.uk/mandate/
- Metadata Workflow Investigation http://mwi.cdlr.strath.ac.uk/
- Managing Risk - a model business strategy for corporate digital assets http://www.jisc.ac.uk/index.cfm?name=project_managingrisk&src=alpha
- METS Awareness Training http://www.jisc.ac.uk/index.cfm?name=project_mets&src=alpha
- METS Official Web Site http://www.loc.gov/standards/mets/
- Digital Preservation Training Programme http://www.jisc.ac.uk/index.cfm?name=project_dptp&src=alpha
- Preservation Management of Digital Materials http://www.dpconline.org/graphics/handbook/
Editor's note: see also the article in this issue by Neil Beagrie on the development and use of the handbook entitled Digital Preservation: Best Practice and its Dissemination
- DAAT (Digital Asset Assessment Tool) http://ahds.ac.uk/about/projects/daat/
- Open Archival Information System (OAIS) Reference Model http://ssdoo.gsfc.nasa.gov/nost/isoas/ref_model.html
- Assessment of UK Data Archive and The National Archives compliance with Open Archival Information System and Metadata Encoding and Transmission Standard (OAIS/METS) http://www.data-archive.ac.uk/home/oaismets.asp
- Preservation Eprint Services (PRESERV) http://www.jisc.ac.uk/index.cfm?name=project_preserv
- SHERPA Digital Preservation: Creating a persistent environment for institutional repositories http://ahds.ac.uk/about/projects/sherpa-dp/
- SHERPA http://www.sherpa.ac.uk/
- Personal Archives Accessible in Digital Media (PARADIGM) http://www.paradigm.ac.uk/
- Digital Preservation and Records Management Programme http://www.jisc.ac.uk/index.cfm?name=programme_preservation
- Digital Curation Centre http://www.dcc.ac.uk/
- Lord, P., and Macdonald, A., "eScience Curation Report: Data curation for e-Science in the UK - an audit to establish requirements for future curation and provision", prepared for the JISC Support of Research Committee (JCSR), JISC, 2003 [final report, appendices and summary briefing paper linked from] http://www.jisc.ac.uk/index.cfm?name=project_escience
- Digital Preservation Coalition http://www.dpconline.org/
- Digital Repositories Programme http://www.jisc.ac.uk/index.cfm?name=programme_digital_repositories