Anoraks and cardigans (2): New Text Search Engines, Bath, April 1996
The 1996 Bath meeting raised some interesting questions - some explicit and others less so - such as "Do we really need librarians?" and "Is traditional information work such a backwater that it takes 30 years for technology to transfer from the research laboratory into professional practice?"
On display - an impressive array of computer-based systems for the retrieval - and analysis - of information. All offer natural language input with best match output - based on statistical techniques pioneered by the late great Gerard Salton. Salton was very much present in spirit - a warm tribute being paid by conference organiser Ev Brenner in his opening remarks, and many of the speakers referring to the important influence of Salton's work on the new breed of commercial systems.
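For readers who have not met the "best match" idea, the following is a minimal sketch of term-weighted ranking in the spirit of the vector space approach associated with Salton. It is not the method of any system shown at Bath: the function names (`tokenize`, `rank`), the particular tf-idf weighting and the sample documents are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and keep alphabetic words only (a simplifying assumption)."""
    return re.findall(r"[a-z]+", text.lower())


def rank(query, docs):
    """Return (score, doc_index) pairs, best match first.

    Terms are weighted by tf * idf and documents are compared with the
    query by cosine similarity. The "+ 1.0" in the idf is one common
    smoothing choice, assumed here; other weightings are equally valid.
    """
    doc_tokens = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for tokens in doc_tokens for term in set(tokens))
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}

    def vector(tokens):
        tf = Counter(tokens)
        # Query terms unknown to the collection are ignored.
        return {t: tf[t] * idf[t] for t in tf if t in idf}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    q = vector(tokenize(query))
    scores = [(cosine(q, vector(t)), i) for i, t in enumerate(doc_tokens)]
    # Highest score first; ties broken by original document order.
    return sorted(scores, key=lambda s: (-s[0], s[1]))


if __name__ == "__main__":
    sample_docs = [
        "automatic indexing of text documents",
        "boolean search for librarians",
        "statistical ranking of text retrieval results",
    ]
    for score, index in rank("ranking text retrieval", sample_docs):
        print(f"{score:.3f}  {sample_docs[index]}")
```

Unlike a strict Boolean AND, which would reject any document lacking one of the query words, this kind of ranking still returns partial matches, simply placing them lower in the list.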
The systems on view offered much. In summary, they represent great advances in our attempts to:
- Save the time and/or expertise demanded of the end user: Natural language querying; transparent access to heterogeneous information sources from a single request; automatic repeated searching; merging of results.
- Be less narrowly literal in handling information requests: Fuzzy matching to compensate for errors, e.g. those resulting from OCR conversion; automatically handling semantic relationships in order to retrieve more than was literally asked for; retrieving information requested in one language from sources in other languages.
- Learn: "Discovering" concepts; automatically learning to achieve a better match between retrieval and the user's conception of what is and is not relevant.
- Pre-process information for easier retrieval: Automatic thesaurus generation; automatic indexing.
- Integrate retrieval across media: E.g. indexing and retrieving information at bit level, allowing fuzzy retrieval of patterns in text, sounds and pictures via one unified mechanism.
- Get to grips with the content of information: Not only searching for and retrieving information for the busy end user, but also using natural language techniques to extract and summarise relevant content.
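The fuzzy matching mentioned in the list above can be illustrated with a small sketch based on edit distance (Levenshtein distance). This is one well-known way of tolerating OCR-style errors, offered here as an assumption; the systems on show may well have used other techniques. The names `edit_distance` and `fuzzy_find`, the threshold and the sample vocabulary are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions and substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or keep if equal
            ))
        prev = curr
    return prev[-1]


def fuzzy_find(term, vocabulary, max_distance=1):
    """Return the vocabulary words within max_distance edits of term."""
    return [w for w in vocabulary if edit_distance(term, w) <= max_distance]


if __name__ == "__main__":
    # Hypothetical index terms, two of them damaged by OCR.
    vocabulary = ["retrieval", "retrieva1", "retneval", "librarian"]
    print(fuzzy_find("retrieval", vocabulary, max_distance=1))
    print(fuzzy_find("retrieval", vocabulary, max_distance=2))
```

Raising `max_distance` catches more badly damaged words at the cost of more false matches, which is the usual trade-off with this kind of tolerance.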
Whilst the new generation of commercial systems generally offers statistical techniques in conjunction with Boolean searching, a number of speakers injected a more competitive note in their remarks. We were reminded that research consistently tells us that statistical, natural language approaches out-perform Boolean - the mainstay of the traditional (human) librarian.
So have the anoraks - having finally attracted real money from the sharp suits - really de-skilled the cardigans? Well, yes - but there again, no.
As Peter Schauble from Eurospider noted, casual end users using statistically-based systems can retrieve information as effectively as trained intermediaries using Boolean. However, statistically-based systems can also boost the effectiveness of the trained intermediary by allowing the formulation of much more complex queries - less accessible to the casual user - entailing possibly hundreds of search terms.
And as Neil Infield, Information Manager at Hermes Investment Management Ltd., observes in a recent article (Infield, 1996), the advent of powerful end user searching (a) releases professional intermediaries from less enjoyable routine searching, creaming off for them more interesting, complex searches; and (b) creates an appetite amongst end users resulting in their appreciation of, and demand for, the more sophisticated searching skills of the professional.
Interesting in this context were the models - at least embryonic models - proposed by a number of speakers at the conference, who attempted to map Boolean and other IR approaches onto different types of need. I feel sure that it is at this level of sophistication that the debate about the role of professional intermediaries, and the relative merits of Boolean vis-a-vis statistical approaches, must advance. I have a number of friends who possess a cardigan. They sometimes come and go in anoraks. Some have occasionally been observed in a suit. Appropriate dress for different conditions and occasions.
A final word of congratulation to the conference organiser, Ev Brenner. He achieved a rare and enjoyable blend of the academic and commercial, somehow persuading suppliers to talk about ideas rather than give us the hard sell. I was not alone in noting that, paradoxically, the products on offer gleamed and impressed all the more because of this.
Reference
INFIELD, Neil (1996) Dealing with disintermediators. Library Manager, 18, 5.