Down Under With the Dublin Core

Tony Gill; Paul Miller

Paul Miller and Tony Gill offer a view of the recent Dublin Core metadata workshop in the Australian capital, Canberra.

Continuing a long and glorious tradition, the 4th Dublin Core Workshop [1] last month went to a really nice country and picked one of the least lively settlements in which to meet. Admittedly, in the company of such as Dublin (Ohio, USA, rather than the somewhat more picturesque capital of Eire) and Coventry, Canberra did rather manage to shine.

Nobly sacrificing sleep, wintry weather and the monotony of their offices for the higher cause that is metadata, the authors and two other UK representatives (Dave Beckett from the University of Kent at Canterbury and Rachel Heery from the UK Office of Library & Information Networking, UKOLN) descended upon an unsuspecting Australia. This article offers a light-hearted and personal view of the proceedings, Australia, and airline cuisine. To ensure maximum jealousy on the part of you, dear reader, sunny photographs are spread liberally throughout.

As regular readers of Ariadne will be aware [2], [3], [4], the Dublin Metadata Core Element Set, or Dublin Core, has been under development since the original workshop in Dublin, Ohio, back in 1995 [5] and is now increasingly to be found in the web pages of projects such as the Arts & Humanities Data Service, as well as in a number of metadata creation tools. Not content with merely embedding Dublin Core in web pages, a number of projects in the UK and elsewhere are also experimenting with the use of Dublin Core-like records as a means of providing uniform access to more detailed holdings maintained in a variety of more complex formats.

However, despite widespread adoption of Dublin Core around the world, the precise semantics of implementation have never actually been formally agreed, and consequently several ‘flavours’ of Dublin Core have begun to emerge. Part of the aim of the Canberra workshop was to explore the syntax behind Dublin Core implementation, and to draw up a single set of recommendations for its implementation within the HyperText Markup Language (HTML), probably the greatest single deployment medium at present.

Figure 1: Canberra, viewed from Mount Ainslie

We’re off to Australia

The workshop was organised by OCLC, the National Library of Australia and DSTC, and was held at the National Library in Canberra. Perhaps best known to most people only from the last chilling words in Jeff Wayne’s War of the Worlds, Canberra is actually the Federal capital of Australia and lies between the rival cities of Melbourne and Sydney in its own little territory - the ACT, or Australian Capital Territory, nibbled out of New South Wales.

Imagine Milton Keynes. Take all the people away. Add offices for every Federal government department, High Commissions for the Commonwealth countries and embassies for everyone else. Slice the top off a hill and stick Parliament in it (not a bad idea, we hear you cry!). Throw in a big artificial lake, marsupials, some hills and a space-age telecommunications tower and you’ve just about got Canberra. It’s not really as bad as it sounds, and was certainly nicer to visit than our disparaging contacts in Sydney implied (they made it sound like Milton Keynes, Coventry and Dublin, Ohio, all rolled into one)!

Figure 2: The Australian Federal Parliament, Canberra, ACT

However, your diligent roving reporters didn’t want to arrive at the workshop suffering from the ill-effects of a 22-hour plane journey (where did Wednesday go?), so felt duty-bound to spend some time researching the amenities of Sydney whilst acclimatising. Notable highlights include the Art Gallery of New South Wales, established in 1874 and now housing eclectic collections from around the world; the immaculate and tranquil Botanical Gardens; the remarkable Opera House, designed by Jørn Utzon and completed in 1973; the revolving restaurant at the top of a very tall tower; the surfer’s paradise of Bondi Beach; and Sydney Harbour Bridge, a miracle of civil engineering and Commonwealth co-operation in its day, and still the widest bridge in the world.

Figure 3: The Art Gallery of New South Wales - and breakfast!

Figure 4: The view over Sydney from Centre Point tower

Figure 5: The Sydney Opera House

Figure 6: The Sydney Harbour Bridge

In the same spirit of international collaboration, the Archaeological Computing Lab of the University of Sydney [6] were treated (or should that be subjected?) to a lecture on the archaeology of York (the one in Yorkshire), including a number of slides showing the city under several feet of water. At no other time did Britain feel so far away!

The acclimatisation period also coincided with the Mardi Gras, an annual gay pride carnival, and the biggest event in Sydney’s entertainment calendar. Unfortunately the suggestion from the Archaeology Department that step-ladders would be helpful in order to see the supposedly-colourful parade turned out to be less tongue in cheek than anticipated!

Although Canberra could not hope to compete with Sydney in the fun stakes, it did nonetheless offer the chance to see some of the countryside - the workshop organisers arranged an excursion to the Tidbinbilla Nature Reserve, allowing participants to see some of the more exotic wildlife such as koalas, kangaroos, cockatoos and emus in their natural habitat. Canberra also had a very passable curry house.

Figure 7: Fancy cataloguing for ADAM? Tony finds some cheap labour

The workshop

The workshop itself took place over three days, and drew together more than 60 participants from several countries who between them represented much of the current thinking on Dublin Core from the libraries, computer science and implementation communities (not to mention a combined total distance travelled of more than a million air miles).

The fifteen elements [7] finally agreed upon in December of 1996 (Title, Creator, Subject, Description, Publisher, Contributors, Date, Type, Format, Identifier, Source, Language, Relation, Coverage and Rights) remain unaltered following the workshop, but the syntax for implementation and notions of ‘appropriate’ element use both came under close scrutiny. Differences of approach that have been apparent for some time on the mailing lists associated with Dublin Core came to the fore, with the formation of two definite extremes of opinion, from the vanilla ‘qualifiers over my dead body’ Dublin Core of fifteen elements and nothing else to the opposite ‘if it moves, slap a qualifier on it’ camp advocating an (optionally) much extended Dublin Core. It is interesting to note that the minimalist group is almost entirely made up of librarians and others with experience of large and extremely complex cataloguing standards such as the MAchine Readable Cataloguing (MARC) used in the libraries world, whilst the extenders tended to be ‘techies’ or implementers from non-traditional cataloguing backgrounds.

Prior to the workshop, many implementers had been making use of the fifteen core elements, as well as the optional qualification provided by SCHEME and TYPE sub-qualifiers. Under this system, SCHEME was used to denote that values were drawn from a recognised standard, such as the Library of Congress Subject Headings (LCSH), and TYPE was used to sub-divide existing Dublin Core elements by, for example, dividing the Creator element into email address, postal address, name, etc. Syntax, although not formally agreed, was often of a format similar to

<META NAME = DC.creator CONTENT = "(TYPE = email) apm9@york.ac.uk">

or

<META NAME = DC.date CONTENT = "(TYPE = created) (SCHEME = ISO31) 1997-04-02">

This somewhat clumsy format was a major discussion topic in Canberra, although the proposed solutions also presented problems for those wishing to implement Dublin Core in the short term.

Dublin Core syntax wars

And then there were three…

Despite continued concern as to the extent to which qualifiers might be used, the workshop participants agreed in principle that some means of qualifying Dublin Core elements was useful. It became plain that the functional distinction between SCHEME and TYPE was less than clear to a significant body of users, and this was one reason for suggesting a slight change to the informal syntax outlined above. Rather than continue to embed both SCHEME and TYPE information within the CONTENT area of HTML’s <META> tag, it was proposed that TYPE information instead be appended to the element name, giving

<META NAME = DC.creator.email CONTENT = "apm9@york.ac.uk">

or, with a SCHEME,

<META NAME = DC.date.created CONTENT = "(SCHEME = ISO31) 1997-04-02">

Misha Wolf of Reuters made a vocal case in favour of internationalisation of Dublin Core, and advocated the embedding of language information with the metadata. The fifteen elements currently include a Language field, which pertains to the language of the resource being described, but the Canberra workshop also recommended the addition of a LANG qualifier to the existing SCHEME and TYPE. This qualifier relates to the language in which the metadata is expressed, and may well be different from the contents of the Language element itself. Language information might be embedded, thus:

<META NAME = DC.creator CONTENT = "(LANG = en) Miller, Paul">

The three qualifiers (TYPE, SCHEME and LANG) together are now - despite the existence of two of them pre-Canberra - to be known as Canberra Qualifiers and remain, like the elements themselves, optional; where you don’t need or want one, simply don’t use it.
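For the tool-builders among you, tags in this interim syntax can be generated mechanically. What follows is a minimal Python sketch rather than anything agreed in Canberra: the function name render_interim_meta and its parameters are hypothetical, and the ordering of SCHEME before LANG inside CONTENT is an assumption, since no ordering is laid down above.

```python
from html import escape
from typing import Optional


def render_interim_meta(element: str, value: str,
                        type_: Optional[str] = None,
                        scheme: Optional[str] = None,
                        lang: Optional[str] = None) -> str:
    """Render one Dublin Core element as a META tag (hypothetical helper)."""
    # TYPE is appended to the element name, e.g. DC.date.created.
    name = "DC." + element + ("." + type_ if type_ else "")
    # SCHEME and LANG stay inside CONTENT as bracketed prefixes.
    # Assumption: SCHEME is written before LANG when both are present.
    prefixes = []
    if scheme:
        prefixes.append("(SCHEME = " + scheme + ")")
    if lang:
        prefixes.append("(LANG = " + lang + ")")
    content = " ".join(prefixes + [value])
    # Escape characters that would otherwise end the quoted attribute early.
    return '<META NAME = {} CONTENT = "{}">'.format(name, escape(content, quote=True))


print(render_interim_meta("creator", "apm9@york.ac.uk", type_="email"))
print(render_interim_meta("date", "1997-04-02", type_="created", scheme="ISO31"))
print(render_interim_meta("creator", "Miller, Paul", lang="en"))
```

All three qualifiers are optional arguments, mirroring the rule that where you don’t need one, you simply leave it out.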

Standards be damned…

Even with the movement of TYPE out of brackets and over next to the element name, this syntax undeniably remains somewhat unwieldy, and the inclusion of bracketed element qualifiers within the CONTENT area is rumoured to make extraction of the actual metadata value (‘Miller, Paul’, for example) tricky for some tool developers.
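To give a flavour of the extraction problem, here is one way a tool might peel the bracketed qualifiers off the front of a CONTENT value. It is a Python sketch under stated assumptions: the function name split_content is hypothetical, qualifiers are assumed to appear only at the start of the value, and only TYPE, SCHEME and LANG are recognised.

```python
import re
from typing import Dict, Tuple

# One leading "(KEY = value)" group, where KEY is TYPE, SCHEME or LANG.
_QUALIFIER = re.compile(
    r"\s*\(\s*(TYPE|SCHEME|LANG)\s*=\s*([^)]*?)\s*\)", re.IGNORECASE
)


def split_content(content: str) -> Tuple[Dict[str, str], str]:
    """Split a CONTENT string into (qualifiers, bare metadata value)."""
    qualifiers: Dict[str, str] = {}
    pos = 0
    while True:
        match = _QUALIFIER.match(content, pos)
        if match is None:
            break  # no further bracketed qualifier at this position
        qualifiers[match.group(1).upper()] = match.group(2)
        pos = match.end()
    # Whatever follows the last qualifier is the value itself.
    return qualifiers, content[pos:].strip()


print(split_content("(TYPE = created) (SCHEME = ISO31) 1997-04-02"))
print(split_content("(LANG = en) Miller, Paul"))
```

Because the pattern insists on one of the three qualifier names followed by an equals sign, a value that merely happens to begin with a bracket, such as "(Untitled) sketch", is left alone. A value that legitimately began with something like "(TYPE = ...)" would still be misread, which is exactly the sort of ambiguity that makes this syntax awkward.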

However, this syntax neither breaks the current Document Type Definition (DTD) for HTML, nor appears to cause problems for any of the existing HTML validation tools and, for a community hoping to set new metadata standards, compliance with existing standards must surely be no bad thing…?

In an attempt to increase the ease of parsing and interpretation, a second syntax was put forward in Canberra, which appears to be in line with draft proposals before the World Wide Web Consortium, W3C. The W3C proposals, however, are at least six months away from implementation, and if used at present, the second Dublin Core syntax will cause most HTML validation tools to report errors. Nevertheless, it is neater than the first, and certainly easier for both computers and humans to handle. As such, the workshop recommended this as a syntax to aim towards in the medium to long term, assuming that the W3C proposals go as expected.

With the proposed syntax, the existing

<META NAME = xyz CONTENT = "xyz">

becomes

<META NAME = xyz
TYPE = xyz
SCHEME = xyz
LANG = xyz
CONTENT = "xyz">

For example:

<META NAME = DC.date
TYPE = created
SCHEME = ISO31
CONTENT = 1997-03-07>

Or, where TYPE is appended to the element name:

<META NAME = DC.date.created
SCHEME = ISO31
CONTENT = 1997-03-07>

Note that, unlike the earlier examples, the value of CONTENT (1997-03-07) is not enclosed inside quotation marks. As with much of HTML, quotation marks are only required where spaces are included as part of the value; 1997 03 07 would need quotation marks ("1997 03 07"), whilst 1997-03-07 does not.
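With qualifiers carried as separate attributes, a stock HTML parser can do nearly all of the work. The sketch below uses the HTMLParser class from Python’s standard library; the class name DublinCoreParser and the sample markup are illustrative assumptions, and the attribute syntax itself remains a proposal rather than valid HTML.

```python
from html.parser import HTMLParser


class DublinCoreParser(HTMLParser):
    """Collect META tags whose NAME begins with 'DC.' (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.records = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lower case.
        if tag != "meta":
            return
        record = {key: value for key, value in attrs}
        if (record.get("name") or "").lower().startswith("dc."):
            self.records.append(record)


# Sample markup in the proposed attribute syntax (quoted and unquoted values).
SAMPLE = """
<META NAME = DC.date
TYPE = created
SCHEME = ISO31
CONTENT = 1997-03-07>
<META NAME = DC.creator LANG = en CONTENT = "Miller, Paul">
"""

parser = DublinCoreParser()
parser.feed(SAMPLE)
parser.close()
for record in parser.records:
    print(record)
```

Each record arrives as a plain dictionary, with the metadata value under the content key and any qualifiers alongside it, so no further string surgery on CONTENT is needed.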

Other syntaxes were also explored, including an interesting proposal to embed Dublin Core metadata within a PICS-NG (the Next Generation of the Platform for Internet Content Selection format) header, but these may be even further off than the W3C-compliant syntax. Future Ariadne articles will discuss these options in more detail, and keep readers up to date with developments as and when they happen.

The journey home

After three long days, the fourth Dublin Core meeting was drawn to a close and the participants, exhausted by both mental exertion and jetlag in many cases, started to drift away. However, conscious of the taxpayers’ right to expect value for money, your selfless correspondents changed their flight details in order to attend the Australian National Metadata Seminar, held the following day and also at the National Library of Australia.

The large audience received presentations both from Dublin Core 4 participants (e.g. Stuart Weibel, Carl Lagoze, Rebecca Guenther and our very own Rachel Heery), and from others active in the field of resource discovery (for example John Perkins of CIMI, who was kind enough to insert a URL plug for the AHDS/UKOLN Visual Arts, Museums & Cultural Heritage Metadata Workshop into his PowerPoint slides). The strategic significance of metadata for networked information resource discovery has certainly not been underestimated Down Under!

[Ed: If you want to see what a large collection of metadata people look like, then take a look at the picture taken at this conference in the Ariadne Caption Competition section.]

Then it was a quick dash to the airport, a short flight to Sydney, then a somewhat longer flight to Hong Kong, for a two night stop-over (rest assured that this was not at the taxpayers’ expense!). Landing at Kai Tak airport was as interesting as we’d been promised, and no doubt the residents of the numerous tower blocks could see the whites of the pilot’s eyes!

Claimed as a Crown colony in 1842, Hong Kong will pass to China at midnight on 30 June 1997, due to the expiry of the 99-year lease on the New Territories. With a population of just under 6 million people squeezed into just 415 square miles (14,457 people per square mile), Hong Kong is one of the most densely populated areas of the world - although this didn’t prevent your correspondents from investigating Victoria Peak, the Buddhist Monastery at Lantau, the cuisine (apparently some of the best Chinese food in the world), the shopping and of course the lively expatriate nightlife.

Figure 8: The Buddha on Lantau, and a (rather small) Tony

Finally the time came to spend all our remaining currency on duty-free and head back for Blighty. We landed at a dark, cold and foggy Heathrow at some obscenely early time of the morning, aching and tired but with a renewed resolve to work towards an international standard for resource description.

A veritable feast…

Many of the important conclusions from Canberra are still being mulled over with, for example, small groups tackling the refinement of some of the less well understood elements, and a number of reports being compiled. As well as the findings of these working groups, Stu Weibel from OCLC is tackling the official workshop report and there are at least two documents being carried forward for entry into the RFC process. Whilst citation details for these (very, at the moment!) draft documents are not yet available, future issues of Ariadne will hopefully carry details as they become available.

Also, Dublin Core metadata tools in the UK [8] and elsewhere [9] will soon be adapted to reflect the altered syntax - pay them a visit, and start stuffing your pages with Dublin Core!

Figure 9: Sunset over Hong Kong Harbour

Acknowledgements

Thanks are due to Lorcan Dempsey of UKOLN for locating the funds to enable UK participation in Canberra, and to Chris Rusbridge at eLib whose money it was. On behalf of all who were there, thanks once more to the people at the National Library of Australia (particularly Rachel Jakimow and Tonya Beeton, who managed to deal with all the problems we could throw at them with ease) and DSTC for organising and running an extremely slick meeting, and to Stu Weibel for staving off his jet lag sufficiently to keep control of the (occasionally fractious) rabble…

Both authors are, of course, open to invitations to attend similar events anywhere in the world… [Ed: Not much chance of that after this article…]

References

[1] The 4th Dublin Core Metadata Workshop, http://www.dstc.edu.au/DC4/
[2] Miller, P., 1996, Metadata for the Masses, http://www.ukoln.ac.uk/ariadne/issue5/metadata/
[3] Knight, J., 1997, Will Dublin form the Apple Core? http://www.ukoln.ac.uk/ariadne/issue7/mcf/
[4] Knight, J., 1997, Making a MARC with Dublin Core, http://www.ukoln.ac.uk/ariadne/issue8/marc/
[5] Weibel, S., Godby, J., Miller, E. & Daniel, R., 1995, OCLC/NCSA Metadata Workshop, http://www.oclc.org:5047/oclc/research/publications/
[6] Department of Archaeology, University of Sydney, http://felix.antiquity.arts.su.edu.au/
[7] Dublin Core Element Set reference definition, http://purl.org/metadata/dublin_core/elements.html
[8] The Dublin Core metadata creator, http://www.ncl.ac.uk/~napm1/dublin_core/
[9] The Nordic Metadata Projects Dublin Core metadata template, http://www.ub.lu.se/metadata/DC_creator.html

Author Details

Paul Miller,
Collections Manager
Archaeology Data Service,
Email: apm9@york.ac.uk
Web page: http://ads.ahds.ac.uk/ahds/
Tel: 01904 433954
Address: Archaeology Data Service, Kings Manor, York, YO1 2EP

Tony Gill,
International Liaison Officer,
ADAM eLib Project,
Email: tony@adam.ac.uk
Web page: http://www.adam.ac.uk/
Tel: 01252 722441
Address: Surrey Institute of Art & Design, Farnham, Surrey, GU9 7DS