r·cade: Resource Centre for Access to Data on Europe
Despite closer European integration and significant, rising interest in European information and data from users in all sectors, especially higher education, the supply of information is still sporadic, and access is often difficult. Some social science research funds are now increasingly targeted at comparative projects covering at least three European countries, so the availability of suitable statistical data and other information has become essential. The r·cade project was set up to improve access to these European (and other international) data, and to address the surrounding issues, for example: the negotiation of contracts with data providers; the acquisition, documentation and preservation of data; and the development of suitable documentation for data users. The lack of cohesion within the field of European data had previously been noted at an international level, and this helped form the rationale behind the establishment of r·cade:
“The European data base is not well integrated; large scale research is hardly co-ordinated; measurement instruments and data representation lack compatibility; data access and data protection regulations differ, and even information about the availability of information is not easy to obtain. In short, the criteria for efficient organisation of data bases have hitherto been defined from a national perspective, and, even within nations, there is little co-ordinated data resource management.”
European Science Foundation, 1992
Setting up the r·cade project
The r·cade project was established in January 1995. It is funded by the ESRC, and is a joint venture between two centres of expertise: The Data Archive [1], at the University of Essex, and the Centre for European Studies, based at the University of Durham.
The Data Archive acquires, stores and disseminates computer-readable copies of social science and humanities datasets for further analysis by the research community. It currently holds over 5000 datasets, and is committed to improving the quality of data and their secondary analysis, via the provision of information and the various activities (such as workshops and user groups) it sponsors.
The Centre for European Studies [2] is a new interdisciplinary research centre. It is involved in substantive research into the European labour market and the European economy. Linked to the Centre is the National On-line Manpower Information System (Nomis) [3], which disseminates considerable quantities of labour market statistics on-line to over 600 sites in the UK. Nomis users span central and local government, academia, and the private sector.
The r·cade project directors are: Professor Denise Lievesley, of The Data Archive, Professor Ray Hudson of the Centre for European Studies, and Mr. Michael Blakemore, director of NOMIS. An Advisory Board has been established for r·cade, which meets twice yearly, under the chairmanship of Prof. The Hon. David Sieff, of Marks & Spencer PLC. Membership of the Board is drawn from academia, research, and the business sector.
r·cade’s objectives
The objectives of the r·cade project are as follows:
- to establish a central collection of European statistical data drawn from a wide range of different sources;
- to provide researchers, teachers and other users in government and the private sector, with an easily accessible and rich source of data and accompanying documentation;
- to work towards preservation of the increasing number of data sets which are being created in this field, so that users can access historical material;
- to supplement data documentation particularly in relation to data quality;
- to keep track of and document changing definitions and geographies.
Help for data users
The r·cade service can help users to find out whether required data exist and where they are located; we negotiate on behalf of data users to obtain access to data collected by a wide range of organisations throughout Europe and further afield, and aim to provide access to the data in common formats over the electronic network and on other media. So far, r·cade have made significant achievements with data acquisitions, and negotiations for data are continuing with major international organisations. Consultancy work has been undertaken by leading academics on behalf of r·cade in the areas of crime and law, health and the environment to verify the needs of data users and to locate relevant data sources. Suggestions for suitable data from researchers in all major subject areas across economics and the social sciences are welcome, and will assist r·cade to set priorities for acquisition. r·cade aims to be a user-led organization, so please contact the r·cade office (details below) with your comments.
Current r·cade data holdings
At present, r·cade hold data from three major providers: Eurostat (the Statistical Office of the European Communities) [4], UNESCO (United Nations Educational, Scientific and Cultural Organization) [5], and ILO (International Labour Organization) [6].
Eurostat have provided r·cade with their NewCronos database, which comprises statistical information over a wide range of subjects - economy and finance, population and demographic data, energy and industry, agriculture, employment and labour, transport and research & development. This is a popular database, well-used by researchers in government, academic and business sectors in many European countries.
UNESCO have provided their Education Database, which covers education at all levels from pre-primary to higher, educational expenditure, and ‘foreign’ students. Again, this is an important and popular database, provided by the world leader in the field of education.
ILO data includes part of their Laborsta database, covering a wide range of employment and labour-related issues. The ILO have developed and promoted international standards for the collection of labour statistics, which are widely used by other statistical organizations. This database is a comprehensive and high-quality time-series.
Screenshot of the r·cade site
Data processing and documentation
Once r·cade receive data from a provider, each data set undergoes extensive checking and ‘data cleaning’ where necessary, to prepare it for addition to the r·cade on-line system, and any queries may be raised with data providers at this stage. Data are then added to the internal development database, and are thoroughly checked and tested by the whole team. Data notes and warnings are added at this stage. Documentation is then prepared and added to the on-line system, linked to the data set, and again tested for functionality and suitability. The primary source for data set documentation is always material produced by the data provider to accompany that particular data set, and this may be supplemented where necessary for further detail, usually from the data provider’s related publications. Primary documentation standards vary widely between data providers, but every effort is made to provide as much detail as possible about the data added to the r·cade on-line system, and to present it in a ‘user-friendly’ manner. Once checking and documentation are complete and the team are satisfied that the data set is of acceptable quality, it is released to the r·cade on-line system and becomes available to outside users. r·cade do everything possible to ensure that the data released are of good quality, and generally succeed, but sometimes circumstances are beyond the team’s control - for example, the original data may be very sparse, or the original data providers may have experienced problems inherent in the collection of cross-national data, such as lack of international comparability or a lengthy delay between data collection and data set production.
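As an illustration only, the kind of checking described above might be sketched as follows. This is not r·cade's actual procedure: the function name `check_dataset`, the row layout (country, year, value) and the three checks are all assumptions made for the example.

```python
def check_dataset(rows, expected_years):
    """Return a list of data notes for rows shaped as (country, year, value).

    Hypothetical checks, chosen for illustration: duplicate observations,
    missing values, and gaps in the expected time series.
    """
    notes = []
    seen = set()
    for country, year, value in rows:
        key = (country, year)
        if key in seen:
            notes.append(f"duplicate observation for {country} {year}")
        seen.add(key)
        if value is None:
            notes.append(f"missing value for {country} {year}")
    # Sparse series are a typical problem with cross-national data,
    # so flag every country/year combination that has no row at all.
    for country in sorted({c for c, _, _ in rows}):
        for year in expected_years:
            if (country, year) not in seen:
                notes.append(f"no observation for {country} {year}")
    return notes


# Invented sample values, not real statistics.
rows = [("UK", 1994, 10.2), ("UK", 1995, None), ("FR", 1994, 9.8)]
notes = check_dataset(rows, expected_years=[1994, 1995])
for note in notes:
    print(note)
```

In a sketch like this, the resulting notes would be attached to the data set as warnings for users rather than used to block release, since sparse source data cannot always be repaired.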
Data supply to users - on-line and off-line
Using the process noted above, r·cade have made much of the data so far acquired accessible via a custom-built on-line system. However, we are also able to supply data from holdings not yet available on-line. These data may be provided either as whole datasets or by subset, via a range of media - CD-ROM, or floppy disk, for example, or even hard copy in some circumstances. Whilst users are encouraged to become registered subscribers to the r·cade on-line service, which allows them to access the system when and where they need (see next section), r·cade do supply data to off-line users on an ad hoc, one-time basis and are able to quote a price before supply, taking into account the nature of individual requirements and circumstances. r·cade welcome enquiries from all types of data user, whether they wish to subscribe or not. Data will also be available via The Data Archive [1], for those users who wish to obtain bulk data and process it at their convenience.
Potential users should note that charges are made for all data supplied by r·cade, but efforts are made to keep costs as reasonable as possible - depending on user status, the amount of data supplied, whether the user is a registered r·cade on-line system subscriber and charges imposed by the original data provider. Please contact the r·cade office for further details.
The r·cade on-line system and user registration
r·cade’s software development team have built an on-line system for the service using Oracle RDBMS technology. The interface is designed to be easy to use, even for the novice computer user, and is available via JANET, over the Internet, or via modem using BT’s global network switching (GNS) system. Interested parties may apply to become registered subscribers to the on-line service - whilst this is not free of charge, r·cade aims to keep costs to the minimum. Registration has many benefits for those who may use a lot of data - they may use the on-line system at their convenience rather than having to make repeated requests for data to the r·cade office, and additional benefits such as detailed user manuals and data set guides are supplied automatically on registration. These documents are updated regularly, and provide much more information about the data and data providers than may be practically included on-line or via the r·cade web site. The on-line service was officially launched on October 1st 1996, and the number of subscribers is currently rising weekly. This has enabled r·cade to maintain a healthy regular customer base, and to build a rapport with them - user feedback on all aspects of the system is encouraged, and means the r·cade team can plan and provide an effective service, which responds to user needs and concerns, both technically and with regard to broader issues.
The r·cade World Wide Web site
The r·cade web site has undergone considerable development since it was established - it is a major access point for the service, and has produced a greater volume of enquiries about data and the more general aspects of the project than other publicity material or conference presentations, etc. In this sense it is an extremely important marketing tool for the r·cade project, and every effort is made to keep the information held there as current as possible. Interested parties can learn a lot about the data r·cade hold from this web site, without the expense of subscription to the service. All on-line documentation which accompanies data sets available via the r·cade system is held on the web pages, including details of data set coverage and available variables, though of course not data itself at this stage. Enough information about each data set is available via the web site to enable potential data users to make a decision about suitability of a particular data set for them, though (as mentioned above) more detailed documentation is available to registered subscribers. Links to other international organizations and data providers are also to be found at the web site, along with background information about establishment of the project and its progress. The r·cade web site is at: http://rcade.essex.ac.uk [7]
The r·cade team
Besides the three project directors, the r·cade team is divided between Essex and Durham - three software developers and the r·cade secretary complete the Durham team, based with Michael Blakemore at the Mountjoy Research Centre, University of Durham, whilst the Essex team includes a project manager, web site/software developer and information officer, based with Professor Lievesley at The Data Archive, ensuring a spread of expertise between the twin project sites.
References
[1] The Data Archive Web site,
http://dawww.essex.ac.uk/
[2] Centre for European Studies Web site,
http://www.dur.ac.uk/~dcm0www/eurostud.htm
[3] NOMIS Web site (currently under construction),
http://www-nomis.dur.ac.uk/
[4] Eurostat Web site; this is a huge site with plenty of information about Eurostat’s activities, currently searchable in English, French or German,
http://europa.eu.int/en/comm/eurostat/
[5] UNESCO Web site; a huge site with plenty of information about UNESCO’s activities, not just their statistical collections. Mostly English, some pages available in French or Spanish.
http://www.unesco.org/
[6] ILO Web site; lots of information about ILO’s activities within the field of labour and employment. Includes a searchable database and is available in English, French or Spanish.
http://www.ilo.org/
[7] r·cade Web site,
http://rcade.essex.ac.uk/
Author Details
Sharon Bolton, r·cade Information Officer,
The Data Archive,
University of Essex.
Email: sharonb@essex.ac.uk
Tel: 01206 872569