Web Magazine for Information Professionals

INFOMINE

Steve Mitchell describes INFOMINE, an impressive attempt to build a Web-based virtual library for the academic community.

 

The original need and context for the development of INFOMINE and the academic virtual library

Immense potential for communicating important information, immense chaos in finding useful scholarly and educational tools, and what promised to be immense user interest and acceptance were the conditions that characterized the Web in 1993. INFOMINE [1], a virtual library (VL) currently providing organised and annotated links to over 8,500 librarian-selected scholarly and educational Internet resources, was created in January of 1994 as a response to this situation. It grew out of the realisation that librarians and librarian-designed finding tools could play significant roles in making the Web a more useful environment for large numbers of researchers and students. It was apparent to those participating in INFOMINE that this information environment was our business. At the same time, Gopher, paramount at the time in terms of user interest (and still useful), gave the impression of an information version of Kansas in contrast to the Web’s promise of the Emerald City. Now, more than three years later, much of the Web’s early promise has been borne out and our efforts appear to have been well spent. We and other librarians have been participating in developing the newest form of mass media, one used by millions of people on a daily basis.

Today, after personally gaining thousands of hours of experience with robotic search engines and other finding tools, and after spending hundreds of hours helping and instructing patrons in the use of these tools [2], I feel as strongly as when INFOMINE began that, for the majority of Web users and the majority of skill levels among them, it and similar tools play a continuing and growing role as a major, crucial type of Internet finding tool. When VLs are used appropriately, most Web users experience significant benefits in locating relevant resources.

A virtual library, for our purposes, is an Internet finding tool that features a collection of selected, organized and generally enhanced links to useful Internet resources. The common bond among VLs is that a subject expert or some knowledgeable person has made active choices regarding the inclusion of each resource, its description and how it will be presented or ordered in the VL. Beyond sharing this approach, academic virtual libraries come in many different flavors ranging, on one side of the spectrum, from simple lists of titles (actually deluxe bookmarks) to, on the other side, attempts to do almost full MARC cataloging and situate this work in powerful database management systems.

The roles of virtual libraries and search engines

It is noteworthy that some of the major search engines (e.g., Lycos with its a2z service) now employ virtual library approaches. Sometimes these represent themselves as rating or reviewing services. Conversely, a number of virtual libraries like Yahoo and INFOMINE now provide convenient, well integrated access to many of the larger search engines, such as AltaVista, to augment their service. What’s obvious to me about this convergence is that the niche of the virtual library is being carried forward and strengthened, despite the widespread expectation three years ago that AI/smart search/fuzzy logic augmented super search engines would by now be effectively covering the majority of our needs. Not that we don’t remain hopeful and optimistic that this may someday occur - see AltaVista’s LiveTopics for an interesting, practical development.

These days successful searching is not a question, at least for those line or reference librarians who frequently work with patrons using the Internet, of preferring one type of finding tool to the exclusion of others. Rather, it’s a question of knowing when to employ which type of tool for which type of search. At the same time, it’s also a question of matching user skill levels with the appropriate type of finding tool. For instance, the use of AltaVista or HotBot might best not be encouraged among users who are overly challenged by your online public access catalogue and/or who are looking for broad search concepts. Similarly, you wouldn’t usually challenge most general VLs to ring up good results in searching for an Arctic insect species, rare chemical or arcane poem.

Striking a Balance

The choice of flavour for the site designer depends on the designer’s notion of, for lack of a better phrase, ‘striking a balance’ between the numbers of important scholarly and educational resources proliferating on the Web and the resources available for handling them. For, from the outset, most realise that it usually won’t be the virtual library type tool that will be able to maintain a comprehensive view of all relevant sites on the Web (considering that even tools like AltaVista and HotBot can’t do this). This balance, where the line is drawn in the sand, is in turn completely dependent on the designer’s gut conception of roughly how many relevant resources are out there and on projected user needs. These are often a matter of intuition, hopefully bolstered by a considerable number of hours of experience in using and working with the Web and working with the patrons who use it.

Flavours of virtual library are further dependent on the amount of value that is assigned to the variables in the ‘value added’ equation. All virtual libraries are attempting to add value to the links they collect if only by grouping similar, well-selected titles together. Of course, one person’s conception of significant value added can differ substantially from another’s.

Half of the value added equation concerns the amount and type of service to be provided or level of value to add depending on assumed user needs. This service level to the user is comprised of a number of possible factors involving designer decisions including: depth of description needed (whether to do one or more indexing or even cataloguing schemes; whether to do annotations/key words/title enrichment and so on); comprehensiveness needed (number of resources to reasonably include - does one stay at some conception of a ‘reference level’ collection or go beyond this); the utility/ease-of-use/number-of-access-points inherent in the VL system needed, among other factors.

The other half of the value added equation is of course the ‘bottom line’: what monetary and personnel resources are available. Relatedly, those involved in VLs will often try to determine whether collaborative effort is possible to reduce personnel and/or systems costs. Of course, the more sharing that can occur the less expensive the VL is to develop and maintain and/or the greater the value/coverage that can be added. For academic VLs, increasingly, it is the degree of effective collaboration that determines the scope, range and depth and, ultimately, service value of the VL. In this regard, I’ve seen a number of VLs paring down their scope. Often this comes in the form of either choosing some level of a ‘reference collection’ approach and/or pursuing greater subject specialization. A smaller number are remaining general and/or expanding in scope.

With the vehicle so created in weighing the above considerations and striking a balance among them, one sets sail. Fortunately, maybe the crucial factor here is in the expertise and commitment of those sailing as much as in the vessel they’re crewing (libraries not generally being known for having access to large pools of resources). And so, I’ve seen extremely useful, relatively short but impeccably well chosen link collections containing only titles while I’ve also seen well-financed efforts with lots of participants that were much less useful. I’ve also traversed VLs that were all content and no organization and, conversely, sites that were all organization and no content. Various balances can be struck with the value added equation being solved in different ways and resultingly (usually dependent on the designer’s correct intuitions/experience with user needs coupled with actively seeking user input, commitment to making the tool a success, and success in garnering resources and collaborative effort) you either have a useful project or not. I’ve personally been surprised by the variety of approaches found in apparently thriving and successful VLs.

INFOMINE’s description and the balances struck

INFOMINE has been relatively successful. It now records over 100,000 accesses each week. More importantly, as an indicator of its utility, it has 3,000 - 5,000 other Web pages linking to it (funny how those AltaVista link searches can vary). INFOMINE has received a Point Top 5% of the Web award and a Magellan 4 Star Rating. It is included in several subject areas in the Argus Clearinghouse, was mentioned in a2z (the Lycos-associated index) as one of the top 25 science and technology sites, and is in PC Computing’s Map to Navigating the Web. INFOMINE has also been noted in CyberHound (Gale Research), the InterNIC Scout Report, and the Los Angeles Times (2/3/97), among others. Incidentally, in regard to such services, have you ever had the rare pleasure of justifying your efforts to supervisors by intimating slyly that your project has just received the highly coveted Four Bones rating – ‘got the bones, boss…’, or shown them a complimentary magazine rating that looked like a kid’s treasure map, or pointed them to rating service icons that look like trading cards for sports stars? What follows is a brief description of INFOMINE (for a fuller one with technical details, see reference [3]) along with thoughts about the balances we’ve struck in charting INFOMINE’s course.

INFOMINE provides numerous access points to the information it contains. These include several indexes and a search engine. It provides annotated and Library of Congress Subject Heading (LCSH) indexed links to resources relevant to the University of California (UC) and to the entire academic community. INFOMINE is divided into 8 major subject discipline areas, two examples of which are the Biological, Agricultural, and Medical INFOMINE and the Government Information (U.S. Federal level) INFOMINE. With approximately 2,500 resource links each, these represent INFOMINE’s most comprehensive subject collections. Given the subject breadth of many of our files, such as Bio/Ag/Med, we are able to supply a very interdisciplinary focus.

Balances: While hardly the most natural language for users and with its flaws well-noted by many, LCSH remains THE U.S. academic library descriptive language. Users are aware of it. Librarians are familiar with it. Its usage implies that INFOMINE will, with some modification, be compatible with library LCSH-based online catalogs of print resources at some point. In order to accommodate LCSH with the Web, we have adopted our own core approach [4]. This approach is pragmatic and has meant that we can use LCSH and do so while spending less than 20 minutes per record added. This is crucial. Many indexing/cataloguing schemes in operation or being promoted would result in much longer indexing times which, given the Internet numbers, would restrict them at the start from including more than a very limited set of the possible good, solid resources available. Currently, we’re again looking at the Dublin Core and are very excited about its prospects but only in so far as it will work, in practice, within the time limit mentioned.

Librarian experience in selecting and describing resources is crucial. Experienced bibliographers and selectors apply their expertise in evaluating Internet resources. Resources are added to INFOMINE utilising a simplified input form. No knowledge of HTML is necessary, which frees contributors to spend their time finding and evaluating resources rather than manipulating HTML. INFOMINE was designed to provide ease of use for contributing librarians with varied microcomputer skill levels. This approach assures that all generations of librarians, representing many different levels of personal or professional interest in information technology, can contribute their subject expertise.

Balances: No one knows the information content and value of scholarly or educational resources better than the academic librarian. This is expensive time compared, say, to that of a student employed to surf for Yahoo, but it is our belief that the high value and solidity of many of the resources out on the net warrant this investment. This is especially true if one believes that we can do this kind of work and, moreover, exercise professional self-reliance and leadership instead of simply letting commercial, usually non-library focused operations, which are free for now, do this work for us, do it less well and later charge handsomely for it. As a friend pointed out: ‘Do you click on the ads?’ And what does not clicking ultimately mean for commercial services needing to recoup costs?

Though INFOMINE was at first exclusively a UC Riverside project, it now includes librarian participation from all 9 UC campuses and Stanford University. One of the major benefits of the INFOMINE project is that, for many librarians, it is acting as a model for multi-campus shared Internet collecting activities and has stimulated important dialogue on the issues, opportunities and challenges involved in this [5], [6].

Balances: The more people participating, of course, the more disciplines that can be reasonably well covered. In addition, while different campuses and colleges often do have somewhat different collecting needs, it has been our impression in looking at the VLs or subject guides of similar campuses and systems that there is a great deal of redundant collecting effort being expended. The downside here in trying to reduce this through collaborative effort is the considerable amount of time required to do the organising. Many libraries seem bound by time honoured, print world based conceptions of ‘areas of influence’ and cooperation. More generally, in addition, most are understandably trying to get a grip on what the Internet revolution means overall for shared collecting and cooperatively sponsored access to both free and subscription based resources.

We feel that there exist a great number of Internet resources of very high quality which are at least as useful as their print counterparts. As a result, we have chosen to annotate and use LCSH to index those pointed to via INFOMINE. Librarians who add resource sites to INFOMINE add significant value by providing a customized annotated paragraph in addition to linkable subject, keyword, and title words. Most importantly, subjects applied average about 6 per record while key words average over 6. These numbers of terms for retrieval are much greater than in many library-related databases, which traditionally average around three subject headings. It is the annotation and indexing sophistication which assist users, allowing them to find a resource and then quickly evaluate its relevance, in relation to other related resources, before attempting to access the site.

For users, INFOMINE is a value added Internet finding tool not only because of its enriched content but also due to enriched access resulting from the multiplicity of access points through which users can easily find the information contained. These include:

  1. Boolean searching;
  2. Browsing through our Table of Contents (titles interfiling under their subject terms), Title, Subject and Key Word indexes; and,
  3. Browsing hyper-linked Subject and Key Word Indexing Terms embedded within each record’s long display.

Moreover, as mentioned above, few other virtual libraries provide so many indexing terms per record. This alone has been a major contributor to INFOMINE’s use value.
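To make the access points above concrete, here is a minimal sketch of Boolean searching over records that carry titles, subject headings, key words and annotations. It is an illustration only, not INFOMINE's actual software: the `Record` fields, the `boolean_search` function, the sample records and the example.org URLs are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """One hypothetical virtual-library entry (field names are assumptions)."""
    title: str
    url: str
    subjects: list[str] = field(default_factory=list)
    keywords: list[str] = field(default_factory=list)
    annotation: str = ""

    def terms(self) -> set[str]:
        """Lower-cased words drawn from every descriptive field of the record."""
        text = " ".join([self.title, *self.subjects, *self.keywords, self.annotation])
        return {word.strip(".,;:()").lower() for word in text.split()} - {""}


def boolean_search(records, all_of=(), any_of=(), none_of=()):
    """Return records matching AND (all_of), OR (any_of) and NOT (none_of) terms."""
    hits = []
    for record in records:
        terms = record.terms()
        if not all(t.lower() in terms for t in all_of):
            continue  # an AND term is missing
        if any_of and not any(t.lower() in terms for t in any_of):
            continue  # none of the OR terms is present
        if any(t.lower() in terms for t in none_of):
            continue  # a NOT term is present
        hits.append(record)
    return hits


# Invented sample records, used only to exercise the search.
RECORDS = [
    Record("Arctic Insects Database", "http://example.org/insects",
           subjects=["Insects -- Arctic regions"], keywords=["entomology", "tundra"]),
    Record("Federal Agriculture Statistics", "http://example.org/agstats",
           subjects=["Agriculture -- United States -- Statistics"],
           keywords=["crops", "farming"]),
    Record("Crop Pest Guide", "http://example.org/pests",
           subjects=["Agricultural pests", "Insects"], keywords=["crops", "entomology"]),
]

# "entomology AND NOT arctic" matches only the Crop Pest Guide.
for hit in boolean_search(RECORDS, all_of=["entomology"], none_of=["arctic"]):
    print(hit.title)
```

Because every subject heading and key word feeds the same pool of terms, each extra indexing term applied to a record gives a search one more way to find it.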

INFOMINE is based on a hypertext database management system. It was one of the first Web sites to combine the power of the Web with that of a database management system. This allows participants to add, edit, provide access to and generally manage several thousand records easily. For example, all indexes (e.g., The Table of Contents) are automatically generated by the database management system rather than manually. This saves time for contributors (e.g., instead of going to several HTML index pages in order to effect changes in a record, you simply access the editor form and make the changes on this single form while the database management system then automatically changes the indexes specified). For users, unique searches yield dynamically created, unique HTML results documents customised to their interests.
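The idea of generating indexes from the records, rather than maintaining index pages by hand, can be sketched as follows. This is a simplified illustration under stated assumptions, not a description of INFOMINE's database management system: the in-memory `RECORDS` dictionary, its field names and the functions `build_subject_index`, `render_index` and `edit_record` are hypothetical.

```python
from collections import defaultdict

# Hypothetical record store keyed by record id; titles, subjects and the
# field names are illustrative assumptions, not INFOMINE's real data.
RECORDS = {
    1: {"title": "Crop Pest Guide", "subjects": ["Agricultural pests", "Insects"]},
    2: {"title": "Arctic Insects Database", "subjects": ["Insects"]},
}


def build_subject_index(records):
    """Derive the subject index from the records instead of hand-editing it."""
    index = defaultdict(list)
    for record in records.values():
        for subject in record["subjects"]:
            index[subject].append(record["title"])
    # Sort headings and the titles filed under each heading alphabetically.
    return {subject: sorted(index[subject]) for subject in sorted(index)}


def render_index(index):
    """Format the index as plain text: each heading, then its indented titles."""
    lines = []
    for subject, titles in index.items():
        lines.append(subject)
        lines.extend("  " + title for title in titles)
    return "\n".join(lines)


def edit_record(records, record_id, **changes):
    """Apply an edit to one record, then return the regenerated index."""
    records[record_id].update(changes)
    return build_subject_index(records)


print(render_index(build_subject_index(RECORDS)))
```

A contributor edits a single record, for example with `edit_record(RECORDS, 2, subjects=["Insects", "Arctic regions"])`, and every index built from the store reflects the change on the next rebuild, with no separate index pages to update.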

A great number of VLs, in comparison, rely primarily on browse access (though a number of these are now adding simple text search capabilities) offered through a specific set of already created, static indexes ordered via a subject organisation or classification scheme, often pyramidal. This approach has its difficulties. The hierarchical browsing systems may or may not be familiar to the user. Resources with multiple subjects can be hard to find if they haven’t been laboriously placed within the many relevant subjects they may cover. It can also be difficult to arrive at a useful balance between generality and detail. Nevertheless, such tools are often very useful.

INFOMINE too features a structured, browse approach to finding (and faces some of the same types of challenges mentioned above) as one good way to present its contents (visible in our LCSH organised Table of Contents and Subject Indexes).

In addition, though, INFOMINE goes beyond browsing by emphasising our search engine, which has always been a central focus of our resource. Relatedly, we also try to go beyond LCSH by applying key words (which often include common, specialist and natural language terms that aren’t in LCSH) to adjust for some of its shortcomings as well as to simply apply more handles through which a resource can be found. Overall, it is interesting to note that the last year especially has seen a convergence among VLs where those that were browse oriented are now offering text or other forms of searching while those that were placing the emphasis on searching via database management systems are now augmenting their service by creating static or static-like indexes. INFOMINE, for example, does something like this with its General Reference Resources feature (see the top of the INFOMINE home page).

Balances: There has been much discussion of the advantages of various organising schemes for VLs. Just as we use an LCSH core concept, others are gainfully employing LC Classification Numbers, the Dewey System and other traditional general and specialized (e.g., MeSH headings in the medical area) organising schemes. Some of these are based in hyper-text enhanced relational databases. Others are hierarchically arranged in hyper-text augmented tiers of static HTML pages. Some are oriented towards more graphical, almost shelf-based, organizing principles. Many utilise a number of these concepts in one tool.

Those VLs that work best seem to generally share the following traits: They add more rather than less value in regard to the value added equation; they usually are hybrids bundling more than one of the approaches mentioned above; the access points they provide are accessible via both search and browse modes; regardless of the specific approach or mode, they provide numerous access points through which the information contained can be accessed/discovered; they are intended to handle larger rather than fewer numbers of resources and cover more rather than fewer numbers of disciplines; they link conveniently at numerous points to other VLs and search engines; and, maybe crucially, because this makes all the above possible, individual entries or records can be added and be well-described in a minimum of time.

Our collecting goal has been to include items of greatest use to UC faculty and students. Generally this has meant selection of the largest, most comprehensive and highest quality resources. Occasionally it has meant collecting other less central resources in order to meet specific research needs. This mirrors our print format collecting efforts and policy at UC Riverside and most UC campuses.

Balances: There has been much discussion of the need to rigorously formalise collection development policy in regard to collecting for VLs (and other access tools) and to provide in-depth training for this. This may be worthwhile. In the meantime, we’ve found that the transfer of Internet collecting skills to print-savvy librarians has been generally pretty effortless and has almost always resulted in well-chosen selections without a great deal of extra training. The content is THE thing and that is what most of us know instinctively well. Still, for some it is more difficult than for others and generally this difficulty breaks down along generational lines. An interesting development to me is that some seem tempted to apply more stringency to Internet collecting standards than they sometimes do to their print collections. And, while most of us would agree that the poster child for grey literature is the Web, others can overemphasize quality criteria. The tonic to this is to have such people take an honest browse through their print stacks.

INFOMINE’s Drawing Board

Education:
Not a new challenge to many of you, our biggest and continuing job, like yours, will remain that of educating librarians and information specialists as to the value of Internet information provision and of the great contributions that many libraries are making in both creating and providing enhanced, easy access to important research and educational tools. Librarians can play a major, proactive role in guiding and shaping the Internet revolution if we so choose. And, of course, as we educate our own, so must we continue to educate our researchers and students. In addition, we need to continue to address the need for education around and participation in cooperative, shared efforts on the part of libraries.

Systems:
We are looking at means to incorporate full-text indexing of sites in our selected INFOMINE domain of resources as a complement to our current approach. With this development, expanded INFOMINE searching techniques (such as the use of adjacency operators) would be provided. This would be our “power search” option and a good addition to our current service.

References

  1. INFOMINE Web Site,
    http://lib-www.ucr.edu/
  2. General Internet resource finding tools: a review and list of those used to build INFOMINE,
    http://lib-www.ucr.edu/pubs/navigato.html
  3. Mitchell, S. and Mooney, M., 1996, INFOMINE: A Model Web-Based Academic Virtual Library. Information Technology and Libraries, 15 (1),
    http://lib-www.ucr.edu/pubs/italmine.html
  4. Mitchell, S., 1996, Library of Congress subject headings as subject terminology in a virtual library: the INFOMINE Example. In: Proceedings of “Untangling the Web” Conference, University of California, Santa Barbara, 1996,
    http://www.library.ucsb.edu/untangle/smitch.html
  5. Mooney, M., 1996, Linking Users to Internet Government Information Resources through INFOMINE. DLA Bulletin, 16 (1),
    ftp://ftp.dla.ucop.edu/pub/dlabulletin/issue35/infomine.txt
  6. Baldwin, C. and Mitchell, S. 1996, Collection Development Tools/Methods for Virtual Libraries and Subject Lists in Selected Major Subject Areas: A Panel Discussion and Presentation. In: Proceedings of “Untangling the Web” Conference, University of California, Santa Barbara, 1996,
    http://www.library.ucsb.edu/untangle/index.html

Author Details

Steve Mitchell
Science Reference Librarian,
Email: smitch@ucrac1.ucr.edu
INFOMINE Web Site: http://lib-www.ucr.edu/
Address: Bio-Agricultural Library, University of California, Riverside, Riverside, CA 92521, US