Web Magazine for Information Professionals

OCLC-SCURL: Collaboration, Integration and Recombinant Potential

Pete Johnston reports on the New Directions in Metadata conference, 15-16 August, in Edinburgh

The problem of "navigating a rich and complex information landscape" took on a new dimension as I traversed Edinburgh's High Street on a bright Thursday morning at the height of the Festival. Fielding a barrage of enthusiastic invitations to attend a bewildering range of performances, I headed across town to the University for the "New Directions in Metadata" conference [1], organised jointly by OCLC [2] and SCURL [3].

Michael Anderson (Pro-Vice Chancellor, University of Edinburgh) welcomed delegates to Edinburgh, and made an appeal for us to bear in mind that the true value of the services we build around metadata will be measured by how well they meet the requirements of the user. The Research Support Libraries Programme, for example, had been driven by the needs expressed by the academic research community.
In his welcoming address, Jay Jordan (President and CEO, OCLC) emphasised that OCLC's work was informed by a global perspective, and that its activities were underpinned by a co-operative and collaborative approach.

The keynote paper, "Metadata in a distributed environment: interoperability as recombinant potential", was presented by Lorcan Dempsey (Vice President, Research, OCLC). Lorcan argued that metadata provides the vital "intelligence" to allow both software tools and human users to "behave smarter" and to deliver richer, more useful information services. In the past the emphasis had been on creating and using metadata to support resource discovery. The attention now given to specifications such as the Metadata Encoding and Transmission Standard (METS) [4] showed that we had moved firmly into a phase of exploring the metadata requirements for a wide range of operations associated with the full "life cycle" of a resource, with a recognition that those operations may require richer, more complex metadata.

Lorcan sketched three scenarios:

· the provision of "utility" services providing access to commonly useful metadata to support discovery of, access to and use of distributed resources - as exemplified in the shared services planned for the JISC Information Environment [5], a context to which several presenters would return in the course of the conference;
· the lifecycle approach to digital content management, with a recognition that metadata was critical both in minimising the long-term costs associated with maintaining and preserving digital assets and in maximising the return on investment in digital content creation by ensuring assets were usable;
· exploiting distributed metadata to enhance the value of digital resources through collaborative efforts such as the Eprints-UK project [6].

In each case, metadata was creating a new potential for adding value to resources and delivering more useful services. Further, that metadata may be provided from diverse and distributed sources, with the enhanced value delivered through infrastructural components that enabled the effective integration of those sources: the whole could be very much more than the sum of the parts.

The three presentations given during the remainder of the afternoon developed various elements of this introduction.

John MacColl (Edinburgh University Library) explored the challenges presented to librarians by the use of Virtual Learning Environments, and particularly how the present approaches to VLE implementation and use might threaten the role of librarians as skilled mediators of information. The challenges are both organisational and technological. The ANGEL project, funded under the JISC DNER development programme, seeks to address the latter by building middleware services to improve integration between VLEs and library systems - with metadata (about resources, licenses, and access rights) at the core of those services [7].

Chris Rusbridge (University of Glasgow) emphasised the great potential of the Open Archives Initiative Protocol for Metadata Harvesting [8] for resource disclosure. Obstacles to implementing an institutional repository were typically organisational and cultural rather than technical, including the "chicken-and-egg" problem of achieving the critical mass of contributions that led academic authors to recognise the usefulness of submitting their work to the repository. The Daedalus project at Glasgow (funded under the JISC FAIR programme) will explore some of these issues in developing a network of institutional repositories [9]. Chris also suggested that the capacity for metadata to provide the "intelligence" in the system was critically dependent on the quality of that metadata, and in a context where metadata was created by authors rather than expert cataloguers that quality may be lower.
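
As a rough illustration of the kind of exchange the harvesting protocol involves, the following minimal sketch builds a ListRecords request and pulls Dublin Core titles out of a response. The repository address, the sample record and the function names are hypothetical and chosen only for illustration; the verb, the oai_dc metadata prefix and the XML namespaces are those of OAI-PMH 2.0. No network request is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and unqualified Dublin Core.
OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build the URL a harvester would request to list a repository's records."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def extract_titles(response_xml):
    """Return the first dc:title of each record in a ListRecords response."""
    root = ET.fromstring(response_xml)
    titles = []
    for record in root.iter("{%s}record" % OAI_NS):
        title = record.find(".//{%s}title" % DC_NS)
        if title is not None and title.text:
            titles.append(title.text.strip())
    return titles


# Hypothetical, heavily trimmed response from an institutional repository.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:1</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A hypothetical eprint</dc:title>
          <dc:creator>Author, A.</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""

if __name__ == "__main__":
    # Hypothetical base URL; a real harvester would fetch this address.
    print(list_records_url("http://repository.example.org/oai"))
    print(extract_titles(SAMPLE_RESPONSE))
```

The sketch also hints at the quality concern raised above: a harvester can only pass on what the record contains, so a missing or poorly formed title from an author-created record is simply absent from the aggregated service.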

In the final presentation of the first day, Stuart Hunt (OCLC PICA) presented a critique of the "application profile" model [10] of metadata schema implementation from the perspective of structural linguistics. Stuart presented the view that, if metadata schemas were languages, and individual terms or elements were "signs" within those languages, the value of an individual sign derived at least in part from its context and its relationship with other signs. The premise that a sign can be placed in a new context without changing its value ignores, or at least undervalues, this structural aspect of language.

The second day opened with a session on collection-level description. I gave an overview of the area from the perspective of the Collection Description Focus, looking at how the idea of the collection had been used by different information management traditions, the work done within the Research Support Libraries Programme on modelling collections and their description [11], and examining briefly how collection-level metadata was being used to support resource disclosure and discovery in various contexts, particularly in the JISC Information Environment.

Dennis Nicholson (Centre for Digital Library Research, University of Strathclyde) and Gordon Dunsire (Napier University) reflected on the application of the RSLP model in the SCONE project [12]. Gordon explored the adoption of the "functional granularity" approach to defining a "collection", and highlighted similarities and differences in the nature of collection-level and item-level metadata. Dennis stressed the human and community aspects of the work, particularly in mediating areas such as the description of collection strength, where the potential for subjective judgement is high.

Marie-Pierre Détraz (CURL) reported on CURL's experience of using OCLC/Lacey's iCAS automated collection analysis service for the holdings of six UK libraries, to explore the usefulness of this tool in supporting collaborative collection management [13]. Marie-Pierre focused particularly on the challenges of using automated methods to classify records that did not include standard classification numbers. This process had increased the percentage of records analysed by iCAS, but the results suggested that its use for specialist libraries may require some refinement.

The following session, titled "Making Common Sense", included three presentations under the broad area of classification and ontologies.

Peter Burnhill and David Medyckyj-Scott (EDINA) presented an overview of the value of referencing by place and the enhanced potential for location-based searching if referencing is based on a co-ordinate system rather than on a "controlled vocabulary" of place names. The Geo-Crosswalk project, also funded under the JISC DNER development programme, is exploring the usefulness of the approach in the context of a prototype digital gazetteer service within the JISC Information Environment [14].
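
The difference between the two kinds of referencing can be sketched in a few lines. This is a toy example under stated assumptions, not a description of the Geo-Crosswalk service: the resources, their titles and the function names are hypothetical, and the coordinates are only approximate. A search on a place name finds only resources labelled with exactly that name, whereas a search on a bounding box finds everything located within the area, whatever name was used to describe it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    title: str
    place_name: str  # label taken from a controlled vocabulary of place names
    lat: float       # approximate latitude in decimal degrees
    lon: float       # approximate longitude in decimal degrees


# Hypothetical resources with approximate coordinates.
RESOURCES = [
    Resource("Festival programme", "Edinburgh", 55.95, -3.19),
    Resource("Harbour survey", "Leith", 55.975, -3.17),
    Resource("Shipyard records", "Glasgow", 55.86, -4.25),
]


def search_by_name(resources, name):
    """Match only resources whose place label equals the search term."""
    return [r.title for r in resources if r.place_name.lower() == name.lower()]


def search_by_bbox(resources, south, west, north, east):
    """Match every resource whose coordinates fall inside the bounding box."""
    return [
        r.title
        for r in resources
        if south <= r.lat <= north and west <= r.lon <= east
    ]


if __name__ == "__main__":
    # The name search misses the resource catalogued under "Leith".
    print(search_by_name(RESOURCES, "Edinburgh"))
    # A box drawn roughly around the city finds both resources.
    print(search_by_bbox(RESOURCES, 55.9, -3.3, 56.0, -3.1))
```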

Diane Vizine-Goetz (Office of Research, OCLC) argued that until now many valuable services built on classification schemes and ontologies had been embedded within particular metadata systems or applications. Web services technologies provide a basis for making such services available in more flexible modular forms. Both Google and Amazon have recently provided SOAP interfaces to their metadata databases, and these are already being exploited by third parties to build new services [15]. The potential of existing rich bibliographic metadata resources might be exploited through the provision of services such as automatic classification (or verification of existing classification) and classification mapping to enhance the value of metadata records.

Paul Miller (UKOLN) argued that the capacity for "personalisation" was an increasingly important factor in the interfaces that service providers offer to users. To deliver such features effectively, systems require more information on individual users and their preferences. Taking an example from the context of describing educational resources, Paul described the challenge of categorising the "level" at which a learning resource is aimed, with, in the UK alone, several overlapping schemes in operation [16]. Storing data for personalisation may also raise significant issues of privacy.

Deborah Woodyard (British Library) surveyed current activity in the area of metadata for digital preservation, and the efforts to create practical implementations to test high-level models such as the Open Archival Information System (OAIS) [17]. Deborah argued that the automated generation of preservation metadata was an area which required further research.

Richard Ovenden (University of Edinburgh) presented an entertaining reflection on the themes of the seminar and suggested that the emphasis on describing collections as well as items, on a "lifecycle" approach which recognised the long-term value of the resource, and on curatorship as well as access, were all evidence of a (belated?) recognition of the archival approach to resource management. Organisational change is both reflecting and encouraging an end to the historical compartmentalisation between the information management professions.

In his closing comments, Ian Mowatt (University of Edinburgh) traced a number of tensions which had emerged during the conference, while sounding a note of pragmatism. He noted the impossibility of predicting everything the researcher might want from a system, and the importance of balancing what may be "best" in theory against what is "achievable" in practice: from the user viewpoint, something may be better than nothing if it is available when required. Ian returned to the opening theme of collaboration and co-operation and suggested that the potential for truly valuable services would be realised on the basis of partners sharing their separate human and informational resources through collaborative ventures.

The conference covered a broad range of topics within a relatively short time. It revealed a sense of cautious optimism that specifications like the OAI harvesting protocol and the various Web Services specifications were beginning to form the basis of useful services. Many of the more difficult challenges to be faced were cultural and organisational, with agents either reasserting the importance of their "traditional" roles in new contexts or adapting their approaches to new circumstances. The argument for collaborative approaches was a compelling one, and one which the participants appeared to welcome.

References

[1] The conference programme is available at <http://www.oclc.org/events/ifla/preconference/>
[2] Online Computer Library Center (OCLC) <http://www.oclc.org/>
[3] Scottish Confederation of University and Research Libraries (SCURL) <http://scurl.ac.uk/>
[4] Metadata Encoding and Transmission Standard (METS) <http://www.loc.gov/standards/mets/>
[5] Documents on the Information Architecture for the JISC Information Environment are available at: <http://www.ukoln.ac.uk/distributed-systems/jisc-ie/arch/>
[6] Eprints-UK <http://www.rdn.ac.uk/projects/eprints-uk/>
[7] ANGEL <http://www.angel.ac.uk/>
[8] Open Archives Initiative (OAI) <http://www.openarchives.org/>
[9] The Daedalus project <http://www.lib.gla.ac.uk/daedalus/index.html>
[10] Rachel Heery and Manjula Patel, "Application profiles: mixing and matching metadata schemas", Ariadne Issue 25 (Sep 2000) <http://www.ariadne.ac.uk/issue25/app-profiles/>
[11] RSLP Collection Description project <http://www.ukoln.ac.uk/metadata/rslp/>
[12] The SCONE project <http://scone.strath.ac.uk/>
[13] The CURL iCAS Collection Analysis project <http://www.curl.ac.uk/projects/icas.html>
[14] The Geo-Crosswalk project <http://edina.ac.uk/projects/crosswalk.html>
[15] Details of the Google Web services API are available at <http://www.google.com/apis/>
Details of the Amazon.com Web Services API are available at <http://associates.amazon.com/exec/panama/associates/join/developer/resources.html>
For an example of a service using these APIs, see <http://mockerybird.com/index.cgi?node=book+watch>
[16] MEG Educational Levels <http://www.ukoln.ac.uk/metadata/education/documents/ed-level.html>
[17] Open Archival Information System (OAIS) Reference Model <http://www.ccsds.org/RP9905/RP9905.html>

Author details

Pete Johnston
Research Officer
UKOLN
University of Bath

p.johnston@ukoln.ac.uk