Digital Preservation Planning: Principles, Examples and the Future With Planets
The aim of this one-day event was to provide an informal, interactive workshop that allowed delegates to share knowledge and experience in digital preservation planning, strategy and policy setting and of Planets [1] tools and technology. The event was an opportunity for DPC [2] members as well as other organisations with an interest in digital preservation to learn about the approach of colleagues some way down the road with the process and to share experiences and learn about the tools and services which are being developed by Planets to support the process.
A mix of presentations, exercises and discussion facilitated the sharing of challenges and solutions. Ahead of the event, delegates were sent a questionnaire and some pre-workshop reading. The outcomes of the questionnaire will be used to inform both projects.
Scene Setting and Overview of the Day
Frances Boyle, DPC
Called upon to present at the last minute, I had not planned on delivering opening comments and chairing at the same time! This session considered the reasons for planning to preserve digital content. Key messages were:
- Planning is the cornerstone of preserving digital objects
- Planning should take into account the needs of an organisation, its collections and users
- Planning is an extension of local business strategies and priorities (IT, teaching and learning etc.)
- The process of preservation planning should be driven by an organisation’s needs and priorities rather than technology
The detail of the black art of preservation planning would be dealt with in the following presentations.
An Introduction to Planets
Andreas Rauber, Vienna University of Technology
In the opening session, Andreas Rauber explained that while Planets aims to provide a technological and architectural solution to the process of preserving digital content, at the heart of it, it takes account of the needs of a wide range of European cultural heritage institutions.
Planets is a four-year EC-funded project with the aim of assuring long-term access to Europe’s cultural and scientific heritage. Its technology platform and suite of tools and services will help to automate and execute digital preservation processes and aid the decision-making process. Planets will provide a framework that will help people engaged in digital preservation to answer practical questions such as:
- What are the organisational, users’ and content needs now and in the future?
- What strategies are available and what are the strengths and weaknesses of each?
- How do we test and evaluate a range of potential solutions?
- Why has a decision been taken?
Planets will extend the already established and trusted decision-making tools that support archiving and access in the analogue world.
This outline, which presented preservation planning as a process that supports decisions to meet the local needs of an organisation and its collections, resonated with the practical concerns of the audience.
Constructing a Preservation Policy: The Case of the UK Data Archive
Matthew Woollard, UK Data Archive
This presentation was the first of the case studies from organisations which were actively engaged in digital preservation. This was an insightful and measured talk which outlined the work of the UK Data Archive (UKDA) and raised some fundamental questions about preservation planning pertinent to all organisations, particularly those aspiring to be ‘joined-up’.
The UKDA is a mature service that has had a preservation policy in place for some time. A recent and refreshingly probing review considered whether current policies and practice based on archival considerations and the Open Archival Information System (OAIS) framework [3] were the most suitable. The review recognised that change would inevitably affect other policies.
UKDA singled out the OAIS framework requirement that an archive should ensure
‘that the information to be preserved is independently understandable to the designated community’
that is, without the need for the experts who produced the information in the first place. Preservation plans must be tailored to the needs and means of the communities they serve.
Matthew highlighted familiar tensions between the tenets of sound archival practice (authenticity, integrity, reliability) and that other Holy Grail, usability. He drew on the example of a 1970s social science survey stored on punched cards and how that data could be interpreted and rendered usable 30 years later.
A wide range of external influences on UKDA’s planning processes were detailed. They included standards (BS ISO 15489), legal issues, information security (BS ISO 27001, 27002), and the policies and guidelines of other agencies. In his conclusion he pointed to the exemplars of good practice and existing tools within the community that make the development of local policies possible - and to the fact that we are now at a juncture where words must be backed up with action.
The lifecycle approach taken by UKDA has resulted in changes to some of its policies. They include monitoring of assessment, preservation of resource discovery metadata, verification of Archival Information Packages (AIPs), improved data management and edition control of implemented changes.
Going Digital: The Case of the Wellcome Library
Natalie Walters, Wellcome Library
Natalie offered a second glimpse at how practitioners in the real world must deal with the management of digital assets. She also shared interesting points about the Wellcome Institute and the ambitions of the archive sector [4].
The Wellcome Library already deals with digital material, which is reflected in the Library’s digital strategy. The changing profile of the Library’s collections has resulted in a digital curatorial post. Its role is to adapt current archival practice to suit digital needs and provide support on technical issues to colleagues. Other archival staff are being trained to meet these new challenges. The ‘Digital Continuity in Action’ Project addresses issues related to the transition to the digital arena including: adapting archival practice, outreach and support work for their donor community and new digital curation pages on the Wellcome Library’s Web site.
This year, the Library issued an Invitation to Tender for a digital preservation system. It is a process other institutions in the audience may well be contemplating. Natalie highlighted a fundamental difference in understanding and expectations between customer and vendor, based on the disparity in terminology used and a lack both of consensus on how to approach the subject and indeed of an understanding of the needs. The responses received suggested a fundamental choice needed to be made between the ability to access content and the ability to preserve it. Negotiations continue with one preferred vendor of the five. The audience’s interest in Natalie’s comments was reflected in their subsequent questions.
The Wellcome Library is following a pragmatic approach based on sound archival practice. It is not waiting on a perfect solution, just working towards one that will be good enough to meet key requirements. However, Natalie concluded with a word of warning: the digital world is far less forgiving than that of print!
Preservation Planning (Part 1): Workflow and the Plato Tool
Christoph Becker, Vienna University of Technology
And so to tools that may provide practical support with implementation. Christoph Becker opened the main Planets sessions with an outline of the principles behind and an introduction to the preservation planning tool, Plato [5], and how practitioners may apply them.
Starting with thoughts about organisations’ needs Christoph outlined considerations such as:
- Why are we in this place?
- Are there legal constraints?
- What type of growth rate do we expect from this collection?
- What influences should we consider?
- How will users access material?
Add to this a contextual layer which examines the interests of a range of stakeholders – domain experts, content creators, IT staff, administrators - whose needs must also be taken into account. The result is a mind-map [6] - not oblique doodlings as may have first been thought by the gathered throng but a decision tree - whether constructed in the digital world of Plato or the analogue world of post-it notes.
He went on to review stages in the workflow: define the requirements; identify and assess possible actions; analyse results; create the plan. Plato (short for Planning Tool) is the Planets tool that has been developed to automate the process.
To conclude the session, Christoph considered the example of Web archiving using Plato to create a decision-tree.
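The tree that comes out of the requirements step can be made concrete with a small sketch. The report does not describe how Plato scores alternatives internally, so everything below is an illustrative assumption: the `Node` class, the weights, the 0-5 scoring scale, the criteria names and the two candidate actions are all hypothetical, and a simple weighted sum stands in for whatever aggregation the real tool applies.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """One requirement in a (hypothetical) objective tree."""
    name: str
    weight: float                      # relative importance among siblings
    children: List["Node"] = field(default_factory=list)

    def score(self, leaf_scores: Dict[str, float]) -> float:
        """Aggregate leaf scores upwards as a weighted average."""
        if not self.children:
            return leaf_scores[self.name]
        total = sum(child.weight for child in self.children)
        return sum(
            child.weight / total * child.score(leaf_scores)
            for child in self.children
        )


# Step 1: define the requirements (assumed criteria and weights).
tree = Node("root", 1.0, [
    Node("content", 0.6, [
        Node("text", 0.7),
        Node("footnotes", 0.3),
    ]),
    Node("cost", 0.4),
])

# Step 2: identify and assess possible actions (assumed scores, 0 to 5).
alternatives = {
    "migrate_to_pdfa": {"text": 5, "footnotes": 4, "cost": 3},
    "keep_original": {"text": 5, "footnotes": 5, "cost": 2},
}


def rank(root: Node, candidates: Dict[str, Dict[str, float]]) -> List[str]:
    """Step 3: analyse results by ordering candidates from best to worst."""
    return sorted(candidates, key=lambda n: root.score(candidates[n]), reverse=True)


if __name__ == "__main__":
    for name in rank(tree, alternatives):
        print(name, round(tree.score(alternatives[name]), 2))
```

The point of the sketch is that the ranking is traceable: each figure can be followed back to a stated requirement and weight, which is what allows a plan to record why a decision was taken.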
Preservation Planning (Part 2): Simulation and Practical Exercise
Andreas Rauber & Christoph Becker, Vienna University of Technology
Note to readers (and self!): take from this a tip on event management from yours truly. It is unwise to split a session across lunch. Lunch-time is a long time after a demanding morning and full and frank discussions with colleagues whilst partaking of culinary delights.
This session should have been run after lunch. It should have been longer. And, it should have been delivered in a classroom or laboratory where software could be downloaded in advance. And, there should have been a reprieve or some cue cards provided for the delegates prior to them getting up close and personal with the software.
Still, for delegates this session proved to be one of the highlights of the workshop. After a little to-ing and fro-ing we had the opportunity to sample – if only briefly – practical tools that may make digital preservation a reality.
There were four preservation scenarios:
- Word documents in a governmental archive
- Word documents in an enterprise archive
- Word documents in an e-Learning environment
- PowerPoint presentations in an e-Learning environment
Delegates worked in small groups – and engaged in lively discussion – to consider the assumptions and decisions that should be made to support preservation of a particular collection and for a particular organisation.
Practical Session Roundup
Christoph Becker, Vienna University of Technology
This session was a demonstration of the Plato tool, which would be the next logical step after the creation of objective trees in the earlier session. Christoph concluded the session by outlining the roadmap of steps which should be followed:
- Defining and creating the workflow.
- Objective tree templates - going through the methodical step-by-step exercise to frame the issues.
- Characterisation of the collection - e.g. format identification, risk assessment, object comparisons.
- Discovering applicable actions.
- Building a preservation plan.
A revised version of the Plato tool is scheduled for October 2008.
Characterisation
Manfred Thaller, University of Cologne
Characterisation is a complex subject which Manfred reduced to one sentence.
Deciding that his plane ticket and the assembled audience merited a few more words he continued on an entertaining and illuminating journey.
- What is a file format?
- Why do we need to know about file formats?
- Which format should we choose (with reference to the work of the Florida Center for Library Automation [7])?
- How do we identify file formats (through file extensions or internal characteristics)?
- How do we automate file recognition by drawing on established format registries?
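The difference between identifying a format by file extension and by internal characteristics can be illustrated with a minimal sketch. It is not taken from the presentation or from any Planets tool: the function names and the tiny signature table are hypothetical, and a real format registry would hold far more signatures and far richer format descriptions.

```python
from pathlib import PurePath
from typing import Dict, Optional

# Leading byte signatures ("magic numbers") for a handful of common formats.
SIGNATURES = [
    (b"\xff\xd8\xff", "JPEG"),
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2"),  # e.g. legacy Word .doc
    (b"PK\x03\x04", "ZIP"),                          # e.g. .docx, .odt
]

# What the file name alone claims the format to be.
EXTENSIONS = {
    ".jpg": "JPEG", ".jpeg": "JPEG", ".pdf": "PDF",
    ".png": "PNG", ".doc": "OLE2", ".docx": "ZIP",
}


def identify_by_signature(data: bytes) -> Optional[str]:
    """Identify a format from the first bytes of the file content."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return None


def identify_by_extension(filename: str) -> Optional[str]:
    """Identify a format from the file name only."""
    return EXTENSIONS.get(PurePath(filename).suffix.lower())


def identify(filename: str, data: bytes) -> Dict[str, object]:
    """Compare both methods and flag files where they disagree."""
    by_ext = identify_by_extension(filename)
    by_sig = identify_by_signature(data)
    return {
        "extension": by_ext,
        "signature": by_sig,
        "conflict": by_ext is not None and by_sig is not None and by_ext != by_sig,
    }


if __name__ == "__main__":
    # A PDF that has been given a misleading .doc extension.
    print(identify("report.doc", b"%PDF-1.4\n..."))
```

An extension is only a claim made by the file name, whereas a signature is evidence from the content itself, so a mismatch between the two is a useful flag for closer inspection.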
Beginning with what appeared to be trivial changes to a simple JPEG image file, he highlighted the resulting damage. He then walked through a set of scenarios to identify the most appropriate format to use to preserve content for a simple Word document with a footnote. Outcomes conflicted, depending on whether the focus was on the rendering or the structure of the documents.
The prescient message was: characterising digital objects will never be viable if conducted manually. It can only be realised if automated to a point where the process can be condensed from months or years to seconds. Sensitive and confidential documents remain a challenge.
Testbed: A Walk-through
Matthew Barr, HATII, University of Glasgow
Matthew started the presentation with a definition of what a testbed was - which was useful to the non-technical members of the audience.
‘A controlled environment for experimentation and evaluation, with metrics and benchmark content that allow comparison of tools and strategies’
Matthew provided an overview of the Planets testbed - a key component in relation to identifying appropriate strategies and actions to preserve formats with particular characteristics. Incorporating a testbed into Planets supports digital preservation decisions. The testbed provides facts about the utility of tools and services regardless of institutional setting. He outlined how the testbed fitted into the Planets architecture as part of the characterisation activity stream.
A key reason for adopting the testbed approach would be to avoid duplication of effort amongst partners, to share results and to ensure a common understanding amongst the players. All of which are particularly pertinent in the area of preservation planning where there is still a requirement for a dedicated research environment to allow systematic execution of experiments by the participating partners.
Planets’ testbed was released in prototype in September 2007 and across the Planets project in March 2008. The current iteration (v0.6-07/08) includes file format converters, migration services, simple characterisations and an image identification service. Software can be downloaded [8]. There followed a walk-through of the testbed and the routes and decisions a user could follow.
The testbed will be open to external institutions to run experiments from April 2009.
Interactive Discussion Session
Professor Kevin Schürer, UKDA
And so over to the audience. This session invited the audience to ask questions relating to the needs within their organisations. To close the day, the delegates determined a number of issues that needed resolution:
- Strong messages enabling those in the front-line to convince managers within their organisations of the need to invest in preservation of digital content
- Continued learning in the digital realm from the archival tradition
- A solution to how Planets will support the challenges of legal admissibility of records
- A convincing sustainable business model for Planets which will extend and promote the value of the project’s work beyond its lifespan
- Education about, and involvement of, developers outside the project in Planets
The final part of the day was an opportunity for debate and discussion between the presenters and the audience. The session was ably led by Kevin Schürer. After much lively debate and exchange Kevin wrapped up by asking the delegates what key messages they would like the Planets team to take away from the day.
A recurrent theme throughout the discussions was sustainability and ongoing support for Planets once the project finishes.
Conclusion
Did the workshop meet its aims?
Feedback says: substantially yes. There was a high degree of interaction. There was consideration of policy and strategy. There was some – if limited – experience of tools. And a sense that solutions are beginning to be developed. But, there is also, among those closest to the problem, a keenness to understand best practice and anticipation of tools to support the process.
It will be interesting to revisit the topic in a year’s time and to gauge how far those tools have been developed, in particular with regard to some of the questions raised in the discussion session. We hope that another event will follow in Spring 2009 – so watch this space – well, a DPC or Planets space!
Please note that all the presentations are available at the DPC Web site [9].
References
- Planets Web site http://www.planets-project.eu/
- Digital Preservation Coalition (DPC) Web site http://www.dpconline.org
- OAIS on the ISO Web site http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=24683
- Natalie Walters’ presentation on the DPC Web site http://www.dpconline.org/docs/events/080729walters.pdf
- Strodl, S., Becker, C., Neumayer, R., Rauber, A. “How to Choose a Digital Preservation Strategy: Evaluating a Preservation Planning Procedure” http://www.ifs.tuwien.ac.at/~strodl/paper/FP060-strodl.pdf
- FreeMind software http://freemind.sourceforge.net/wiki/index.php/Main_Page
- Florida Digital Archive Web site http://www.fcla.edu/digitalArchive/pdfs/recFormats.pdf
- Testbed software http://gforge.planets-project.eu/gf/project/ptb
- Digital Preservation Planning: Principles, Examples and the Future with Planets, July 2008 Workshop http://www.dpconline.org/graphics/events/080729PlanetsBriefing.html