Editorial Introduction to Issue 64: Supporting the Power of Research Data
In these cash-strapped times among all the admonitions to save money here, and resources there, I rather hope to hear much about the necessity of protecting and building the knowledge economy if the UK is to make its way in the globalised world, since we cannot pretend to compete easily in other areas of endeavour. Hence research has to be regarded as one of the aces remaining to us, and thus I hope the importance of gathering, managing and preserving for long-term access research outcomes will be widely appreciated and supported. Consequently I am pleased to see Ariadne's small contribution to this end in the form of articles from Dorothea Salo and Brian Westra on how the role of data stewardship can be adopted by institutions and the support that data needs assessment can render to e-Science researchers.
In contributing her views on Retooling Libraries for the Data Challenge, Dorothea Salo accepts that digital libraries and digital repositories should seem well placed to be stewards of digital research data. However, she points out that the characteristics of research data do not automatically suit them to library workflow and infrastructure. Library administrators will find that achieving a viable fit between the data and their systems is far from straightforward. Dorothea provides an overview of the salient characteristics of research data for such purposes in terms of their usage and scope, pointing out that small volumes of data can present as many difficulties as the large volumes. Unless there has been some unanticipated conferring in preparation of their Ariadne articles, Dorothea and Brian Westra [1] seem to be thinking on much the same lines when it comes to the demands on projects made by large and small volumes of data. Not least are the difficulties that arise in terms of human resources where big data projects at least benefit rapidly from a relatively small amount of expert intervention unlike their smaller counterparts. She also describes the difficulties created by the wide variety of research data forms and the attendant variety of file formats thus generated. The data may be neither interoperable nor preservable but the way in which researchers interact with data must also be carefully understood. Dorothea identifies a further difficulty for libraries in that they cannot count on a 'clean slate'; the data in question may have been organised inexpertly or saved in formats ill-suited to long-term reuse, not forgetting that there is still a significant amount of research data in analogue form which remains expensive to digitise.
The foregoing problems are then compounded by the culture of project organisation which does little to encourage the accumulation of good data practice, or encourage the curation of intermediate data unlikely to make it into final results and publications. The absence of widespread use of common data standards can only exacerbate the situation. Having provided an overview of the nature of data on the one side of the equation, Dorothea furnishes a portrait of the characteristics of digital libraries and how their technical and organisational infrastructure is likely to match a data stewardship role. A primary characteristic - that is far from promising in terms of handling the backlog - is the high-volume, labour-intensive nature of their digitisation process. Their curatorial approach is unlikely to cope readily with the diversity of format and (she would contend, sometimes dubious) quality of much of researchers' data. Dorothea then turns her attention to the characteristics of institutional repositories and identifies the reality of their being best suited to standard research publications, while other kinds of data struggle in such an environment as do their depositors. She also points out that IRs' responsibilities frequently extend beyond the boundaries inside which they were designed to operate. For example, a researcher has only to move on from their institution to create real difficulties. Meanwhile from the end-users' point of view, the interaction with IR software is frequently less polished than in the case of dedicated platforms such as SlideShare and lacks a flexible architecture. Having described so cogently the characteristics of the various candidates for the data stewardship role, the author goes on to offer recommendations for them all.
I am indebted to Brian Westra for his article Data Services for the Sciences: A Needs Assessment in which he points out that 'Academic libraries are beginning to examine what the expansion of data-intensive e-science means to scholarly communication and information services, and some are reshaping their own programmes to support the digital curation needs of research staff'. He states that where libraries are looking to support e-science work in the management of research data, 'A needs assessment can help to characterise scientists' research methods and data management practices, highlighting gaps and barriers, and thereby improve the odds for libraries to plan appropriately and effectively implement services in the local setting.' Brian describes the approach adopted by Oregon's Science Data Services Librarian in the conduct of a science data needs assessment. Notable was the decision to include a wide range of staff when recruiting participants to develop a project plan. The literature review provided useful resources and the Data Asset Framework (DAF) methodology was identified as a tested approach. The author points to the value of one of its recommendations, namely the benefit to be derived from a business case set in the local context as a convincing approach to obtaining managerial buy-in. It was interesting to note the effective use of face-to-face interviews combined with the nimble development environment offered by Drupal to create a clear understanding of context and needs on the human front supported by the opportunity to review and add information on scientists' data assets after the face-to-face interview, though uptake of the latter opportunity was underwhelming. Brian provides a summary of the data management issues most frequently identified by the interviewees. They included anxieties surrounding back-up and storage of data, data management tools, access issues, among others. 
Not for the first time, as does Dorothea Salo, Brian identifies the need for a common understanding of terminology. It is also interesting to note that the issue of 'big' and 'small data' surfaced in these interviews. Brian would be gratified to learn that respondents, while asserting that data management needs arose only in 'big data' projects, also admitted ultimately that small-scale datasets could be significant in their work, and, as Dorothea Salo would contend [2], in need of proper management. Brian reports that the needs assessment has shown itself to be useful in discovering the data curation requirements of research scientists - and that the employment of the DAF assessment tool was 'critical to our ability to implement an assessment in a timely and successful manner.'
Ariadne can perhaps claim something of a passable record in its efforts to keep track of the developments in digital repository software [3][4][5][6] over the years and so it is pleasing to be able to publish the contribution by Ed Fay entitled Repository Software Comparison: Building Digital Library Infrastructure at LSE. LSE Library collects materials for both research and teaching across a range of boundaries and its Digital Library Management and Infrastructure Development initiative has been engaged in the comparison of repository software, a summary of which Ed kindly provides in his article. He begins by providing an overview of the project's methodology, explaining how they reached the capability to conduct an effective analysis of different repository software. By early 2010, the project was in a position to understand the core functional criteria necessary to carry out a high-level comparison of the repository software options. Ed goes on to describe the various materials held in the LSE's digital collections, how they are currently stored and managed and their related functional requirements in terms of operation and preservation. Ed then goes on to describe LSE Libraries' digital technical aims for these diverse holdings and the project's decision to adopt a modular approach to the development of the new functionality and the associated benefits, such as, for example, the manner in which a modular architecture facilitates interfaces with existing systems such as catalogues. In reporting the testers' findings, Ed rightly emphasises the fact that repository software will prove more or less useful in the particular context of the testers. Ed goes on to detail the testers' findings in terms of functional area, testing criteria and a summary of findings and also supplies notes on the resource implications for the project's choice in terms of the human effort as well as the anticipated effort required to set up the chosen repository software. 
Ed completes his contribution with some down-to-earth conclusions about the development and maintenance of digital library infrastructure at the local level.
In Rewriting the Book: On the Move with the Library of Birmingham, Brian Gambles introduces the motivation behind the Library of Birmingham project entitled 'Rewriting the Book' which seeks to address the digitally driven user-oriented future the Library envisages. Brian outlines for us the explosion in mobile device usage with the advent of the iPhone and the enormous take-up of apps in the cultural heritage sector with advances in GPS applications offering users considerably more than the apps delivering static content. The increasing influence of mobile usage has brought about a change in the traditional library-library user relationship where greater emphasis is placed upon collaborative working and the contribution the public can make. In the light of the increasing acceptance of the omnipresence of mobile connectivity, Brian details for us the strategy that has evolved at LoB by way of response. He sets out the principal characteristics of mobile connectivity as they relate to the changes libraries and archives must contemplate. To emphasise the point, he describes the effects of just a few of the disruptive technologies that have emerged in the past few years. Brian then provides an overview of LoB's thinking aloud about the mobile systems under current consideration such as iPad, iTunes U, etc., the latter for example representing much potential as a means of disseminating better expert library and archives knowledge. He also emphasises the importance his library places upon information on users' experience and mobility. The difficulty, he perceives, lies in how to discover exactly what that is.
In Public Library 2.0: Do We Need a Culture Change?, Sarah Hammond looks into the state of UK public libraries' involvement in social media, taking forward previous research which examined the engagement of public libraries with, in particular, blogging. Her findings are not altogether encouraging. A significant number of blogs has become inactive though few are actually defunct. Findings from the USA were not dissimilar. The public library-based blogs did not compare very favourably with HEI counterparts. Neither were the causes of the relatively low level of public library engagement with blogging particularly surprising: technological and organisational barriers, together with lack of knowledge, engagement and resources. While the emphasis on the worst difficulties besetting intending bloggers seemed to differ between US and UK respondents, a theme of operational resistance to professional adoption of blogging nonetheless emerged either side of the Atlantic. Notably the author is able to discern green buds of recognition among senior managers that were less than apparent as recently as 2008. Moreover, whereas some practitioners in public libraries and elsewhere have come to regard Web 2.0 applications as a catalyst, the re-structuring of public library organisations does not provide an environment in which staff will naturally tend to experiment, let alone adopt. Managers hungry for justificatory usage statistics would be misguided in laying too much reliance on the quantity of users' comments, Sarah points out, but she does recommend that organisations using blogs should actively manage links to and from their home page in order to sustain their blogs effectively. She likewise highlights the importance of content feeds in keeping material fresh and up to date. 
She points to research that indicates, whatever managers may feel, that library staff derive considerable benefit from reading other public library staff's blogs, not just in terms of information and improvements in practice, but from the standpoint of staff morale. Even more compelling are the benefits that emerge when the users are able to read their public library's blog. Researchers have discovered notable and increased usage of resources that have been publicised in a library blog. Arguably, where public libraries' adoption of blogging and other Library 2.0 applications can have most telling effect is among young people. Given the low floor of usage by that age group, any increase in their uptake will be measurable to the delight of any hard-pressed manager. Equally significant, of course, is that it is an age group that, as early adopters, will embrace Library 2.0 opportunities much more than average. She goes on to describe success factors that have been identified in thriving blogs. One, very importantly (and close to my heart), identifies the importance of attention to content rather than vehicle, and equally the value of integrated promotion based on widely linked content. Given she had favourable things to say about the 23 Things programme, I feel certain Sarah will be pleased to see the contribution from the following public library colleague.
While accepting that some of the statistics for use of social networks were doubtless open to debate, Helen Leech argues in 23 Things in Public Libraries that there is little doubt that social media and Web 2.0 applications have changed many users' behaviour radically. As a consequence, so have practices within public libraries. Helen explains the drivers behind greater public library response to increases in use of social media and Web 2.0 applications as diverse as the push to involve the public in Race Online, the shift away from email to more collaborative methods of communication, the impetus from government in the form of Communities in Control, as well as professional and economic drivers. Helen consequently regarded Helene Blowers' 23 Things as a welcome answer to the training needs of many colleagues in public libraries. That it was gratis and unrestricted in blog form was, for the time, revolutionary. Since 2006, some of Blowers' material had aged and so moves were made to update it and also orientate it towards UK users in the public library domain. Helen describes efforts to create the new material more appropriate to the needs of public library staff. This work was not without its own difficulties in terms of technology use, but it also benefitted from feedback from their supporters. Areas of effort were also identified that may prove common to many public library authorities, eg, bench-marking. 23 Things does strike the interested bystander as something of a welcome development.
In her article Trove: Innovation in Access to Information in Australia, Rose Holley describes a major development in Australian digital cultural heritage. In fact she even welcomes the disruptive technologies that are emerging, of which Trove arguably is representative. She sees them as an advantage since they create a role for the interested public. Rose maintains that the public's increasing involvement is to be welcomed since it emphasises society's interest in creating cultural content. She makes the case for libraries as a vanguard of innovation in terms of their greater sustainability in comparison with commercial organisations such as Google. She also maintains that their aim to provide universal access sits well with current trends in innovation in this direction and points to the National Library of Australia's strategic objectives as evidence; and Rose sees Trove as an outcome of this approach. Trove differs from most search engines since it explores resources held on the deep web as in collection databases. However the Trove team wishes to see more commonly used engines harvesting their resources as well.
In their description of the work and impact of Intute, Angela Joyce, Linda Kerr, Tim Machin, Paul Meehan and Caroline Williams remind us in their article Intute: Reflections at the End of an Era that in the 1990s there were not many services that brought together academic content that was easily accessible. The authors point to the enormous influence that Google and subsequently Google Scholar had upon their field as they reached deeper into the hidden web, but they also point to the failings they exposed in inexperienced hands as indicated by Brabazon [7]. A key aim of Intute has been to support the development of critical analysis of burgeoning Web content. They also point to the research-related work that has derived from initiatives such as the ViM (Value for money in automatic Metadata generation) Project, and now the PERTAINS and EASTER projects. They accept that communicating to researchers the developments in other specialist Web content had not been straightforward. The authors describe the effect of reduced funding on service capabilities and the project's efforts to explore other business models than grant-funded operation. The project's unique selling point of human selection and description of Web sites appeared to lose out to the current trend towards Web 2.0-driven contributions, yet questions that arise in respect of Web 2.0-driven contribution patterns have been asked before, principally: will such a model jeopardise public trust [8] in the service?
In FRBR in Practice: The Celia Library for the Blind, Finland, Wendy Taylor and Kathy Teague introduce their work by describing the services offered by the RNIB which include the need to catalogue all RNIB holdings on its accessible formats catalogue. It was to see whether FRBR might offer them a means of investigating the 'possibility of cataloguing the accessible format, e.g. Braille at the manifestation level rather than as a holding attached to the bibliographic record describing the print book' that they sought and won the Ulverscroft/IFLA Best Practice Award. Wendy and Kathy also provide a brief comparison of the two libraries, there being significant differences between the charity, RNIB, and the state-funded Celia Library. They also describe the particular nature of cataloguing material in multiple formats for people with visual impairment, which is further complicated in Finland by the fact that the country is relatively rare in that many of its citizens are readily conversant with the language of their neighbour Sweden and so the need to describe the correct language format is a frequent one, facilitated by the nature of FRBR.
As usual, we offer our At the Event section, which has seen what I suspect may be a record number of contributions from across many fields of endeavour, combining fresh instances of events Ariadne frequently covers as well as reports from new areas of interest which I hope you will also find interesting. This issue also offers reviews on a work on building and supporting online communities, a Festschrift celebrating the work of Professor Peter Brophy, founder of the Centre for Research in Library and Information Management, a manual to help support your use of an iPad, and a work which examines the future of digital information and emerging patterns of scholarly communication. In this last instance, readers who know or have read reviews by Muhammad Rafiq in the past will doubtless join with me in relief at the news that none of his family in Pakistan has been affected by the appalling floods that continue to ravage the country at the moment. There seems little sign of a let-up and the numbers of people seriously affected are on an unprecedented scale [9][10][11].
I will no longer, however, enjoin you to peruse the news and events section Ariadne has offered; in recognition of the changing times, we are discontinuing the Newsline and have already instituted an Ariadne twitter channel [12] which will serve you with news items relating to Ariadne's output, authors and other related matters.
But as ever, I do hope you will enjoy the bumper Issue 64.
References
- "Data Services for the Sciences: A Needs Assessment" Brian Westra, July 2010, Ariadne Issue 64 http://www.ariadne.ac.uk/issue64/westra/
- "Retooling Libraries for the Data Challenge" Dorothea Salo, July 2010, Ariadne Issue 64 http://www.ariadne.ac.uk/issue64/salo/
- "Fedora UK & Ireland / EU Joint User Group Meeting" Chris Awre, January 2010, Ariadne Issue 62 http://www.ariadne.ac.uk/issue62/fedora-eu-rpt/
- "E-Archiving: An Overview of Some Repository Management Software Tools" Marion Prudlo, April 2005, Ariadne Issue 43 http://www.ariadne.ac.uk/issue43/prudlo/
- "Versioning in Repositories: Implementing Best Practice" Jenny Brace, July 2008, Ariadne Issue 56 http://www.ariadne.ac.uk/issue56/brace/
- "DSpace vs. ETD-db: Choosing software to manage electronic theses and dissertations" Richard D Jones, January 2004, Ariadne Issue 38 http://www.ariadne.ac.uk/issue38/jones/
- "The University of Google: Education in the (Post) Information Age" (Review), Judy Reading, January 2008, Ariadne Issue 54 http://www.ariadne.ac.uk/issue54/reading-rvw/
- "Cautionary Tales: Archives 2.0 and the Diplomatic Historian" Michael Kennedy, October 2009, Ariadne Issue 61 http://www.ariadne.ac.uk/issue61/kennedy/
- Bereaved Pakistanis speak of flood horrors, BBC News, 5 August 2010 http://www.bbc.co.uk/news/uk-10879273
- DEC Pakistan floods appeal: UK donations reach £10.5m, BBC News, 12 August 2010 http://www.bbc.co.uk/news/uk-10952844
- Disasters Emergency Committee (DEC) http://www.dec.org.uk/
- Ariadne twitter channel http://twitter.com/ariadne_ukoln