Search Engines: Using the Right Search Engine at the Right Time
My recent articles for Ariadne have tended to focus on specific developments in the field of Internet search, so I thought that it was about time to get back to some basics and have a look at which search engine to use for particular types of query you may have. The idea for this column arose out of my own bookmarks, which I arrange according to particular criteria, and I created a simple Web page on my own site to list them [1]. This obviously struck something of a chord, as I have days when this particular page has been viewed over 7,000 times. Consequently I thought I would expand on this a little, and go into slightly more detail on some of them. I'm often asked, 'Which is the best search engine to use?' and I always try and answer this tactfully by saying that there isn't a 'best' search engine - that's like saying 'what is the best reference book to use?', which is of course a meaningless question. Effective researchers or information professionals use the most appropriate resource for the particular circumstances in which they find themselves, but of course in order to do this, it's necessary to appreciate fully all the various options that are available.
The following is therefore a description of some of the search engines that I use on a regular basis; not necessarily the best (although I tend to think that they are), and by no means complete.
Precision Searching
Most of the time people know exactly what they want to find; they will have a clear idea of the subject matter and appropriate keywords to use. One tip before we continue, however: it's always a good idea to concentrate slightly less on your question, and slightly more on how it might be answered on a Web page. Don't look for the question - look for the answer! A quick example: if you wanted to find out how many companies around the world have Web sites, that's actually quite difficult to find (believe me, I've tried!), so rather than think of keywords that might narrow your search down, think how that answer might be expressed on a Web page - with this example it might well be '95% of companies worldwide have Web sites'. So what you actually want to search for is the phrase 'companies worldwide have Web sites'. Of course, you'll get some false hits, but you may well quickly strike lucky and find the answer to your question. If you don't, try rephrasing the 'answer' slightly differently.
However, if we take as a given that you know the information that you want to find and you can define that clearly, a good type of search engine to use in this instance is what I refer to as a 'free text' search engine - that is to say, a search engine that leaves you free to type in any text you like. The obvious search engines to look at here are Google [2], Yahoo! [3] and the up and coming MSN Search [4]. All of these engines trawl the Internet looking for Web pages and then index the content, down to the individual words on the pages. Consequently they are very good for precision searching. Type in your keywords and off you go! However, you'll almost certainly get too many responses - this is one of the disadvantages of this type of engine - so you can quickly narrow your search by putting your keywords into a phrase, surrounding it with double quotes. This should give you a much more precise answer. It sometimes falls down slightly though - I was recently reading an article that complained that a Google search for "is there a God" didn't work, because Google viewed 'is' and 'a' as stopwords which it ignored, leaving the search strategy as 'there God' which didn't make much sense. An easy way around this problem is simply to tell Google that you want those search terms included, and this can be done by adding a + symbol immediately preceding each of those stopwords. As a result, "+is there +a god" pulls up the desired pages.
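The stopword workaround is mechanical enough to sketch in a few lines of Python. This is only an illustration of the syntax described above, not anything Google provides: the function name is hypothetical, and the stopword set contains just the two words named in the example (a real engine's list would be longer and is not published here).

```python
# Illustrative stopword set: only the two words mentioned in the example.
STOPWORDS = {"is", "a"}


def force_stopwords(phrase, stopwords=STOPWORDS):
    """Return a quoted phrase query with '+' in front of each stopword."""
    marked = [
        "+" + word if word.lower() in stopwords else word
        for word in phrase.split()
    ]
    # Wrap the whole thing in double quotes to make it a phrase search.
    return '"' + " ".join(marked) + '"'


print(force_stopwords("is there a God"))
```

Words that are not in the stopword set pass through unchanged, so the same helper can be used on any phrase without harm.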
Quick Answers to Quick Questions
This is fine as far as it goes, but sometimes you don't want to wade through page after page of results to get the particular piece of information that you need. Luckily, it's becoming less and less necessary to do this, since search engines are being improved to the extent that they can quickly provide you with a simple answer to your question. MSN Search has incorporated the Microsoft product Encarta into its search interface, which is a very sensible idea - not only does it give them more leverage, but it allows us, the searching public, to freely use a commercial product. Simply type in the question that you have, such as 'what is the population of India' and click on Encarta, and the search engine will pull the data directly from that source. Google has also recently jumped on this bandwagon (and what a surprise that is!), and you will find that in some, though not all, cases, if you type in a question you will get a quick answer at the top of the screen ahead of the results that you normally get.
However, there are other options available to you, and it's worth trying these out, just to see if one or other of them tends on the whole to give you better results. Brainboost [5] is a new search engine, but it seems to work reasonably well, and usually gave me exactly the answer that I was looking for. Factbites [6] is another engine worth trying, although with this one I found that a more general question worked rather better than something very specific, so rather than asking 'when was such and such born?' I discovered that just typing in their name gave me all the details that I needed, including place and date of birth. Another good standby is Answers.com [7] which is viewed by many as the best resource of all for factual information.
Overviews of Subjects
You may have frowned slightly earlier when I said that if you know what you're looking for, use a free text search engine, and I wouldn't blame you if you thought 'what else would I want to do with a search engine?' There may be times when you want a wider variety of options available to you, or you may not know your subject in enough depth to know what keywords to choose, or it may be difficult to decide on the best ones to use in certain situations.
It's at times like these that you may wish to consider using an Index, or Directory search engine. These search engines list Web sites (rather than pages) in different categories, from the broad to the specific. They allow you to start your search from the broadest possible base and the sub-headings will help guide you into exactly the right area, finally giving you a listing of Web sites that you can then visit to take your search to the next stage, of trying to find the exact piece of information you need. The classic example here is Yahoo! which in the last few months has been hiding what was once its pride and joy, the directory section. Yahoo! seems to be concentrating more on rivalling Google, and attacking its main competitor on the free text front, and spending less time on the directory approach. Consequently it's a good idea to have a couple of other examples to use, and the Open Directory Project [8] is another possible option (which is also used by Google to provide their own Directory). There are some rather more specialised engines you can use however, and one that I often fall back on is the Librarian's Index to the Internet [9], or various virtual libraries such as SOSIG [10], which actively promote good quality sites; anything that doesn't match up to their strict criteria for inclusion gets dropped in the bit bucket. You can find a good listing of virtual libraries at Pinakes [11], and these are also useful if you have an enquirer who wants 'the best Web sites in the world' on a particular subject. While you can never be sure that the sites listed really are the best, of course, it's a quick, easy and very reliable way of taking a good stab at pointing them towards high quality material.
Alternatively, use search engines to suggest categories for you, created 'on the fly' in accordance with your search. Teoma [12], Wisenut [13] and the slightly more recent Clusty [14] will all suggest ways of narrowing your search, based on the initial search string that you input. If you're completely stuck for ideas, it's always worth seeing what one of these can throw up by way of suggestion.
Re-sorting Results
Free text search engines work on the basis of relevance ranking; that is to say, they will give you results in the order that they think is most appropriate. Unfortunately this doesn't always work very well - they're only computers after all! Increasingly, search engine providers have realised that searchers like to have a little more control over the way that results appear on the screen in front of them, and there is a slow but growing trend to allow results to be manipulated. Although Google itself doesn't allow re-ranking, Google Personalized [15] does; simply set up a profile of general subject areas that particularly interest you, and run the search. The slider bar that you'll see above the results can be moved to the right, and the results, while still being the same set, will be re-ordered to focus more closely on your particular interests. It doesn't always work (it is in beta after all), but I've been reasonably impressed with the results. MSN Search Builder (an option from the main search box) also allows you to re-run a search to pay more attention to 3 specific elements - precision, currency and popularity. Simply move the slider bars up and down, and this adds extra syntax to the search which can then be re-run according to those preferences.
It's not only the big names who are working in this field either. Some of the smaller engines are also providing options in this area - Exalead [16] re-ranks by date, and the multi- or meta-search engine ZapMeta [17] re-ranks on popularity, title, source and domain.
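To make the idea of re-sorting concrete, here is a minimal Python sketch of re-ordering one fixed set of results by different criteria. It does not reproduce how Exalead, ZapMeta or any other engine works internally; the result records, the RESULTS and SORT_KEYS names and the rerank function are all invented for illustration, and only three of the criteria mentioned above (date, title and domain) are shown.

```python
from datetime import date
from urllib.parse import urlparse

# Hypothetical result records; real engines return far richer data.
RESULTS = [
    {"title": "Search tips", "url": "http://www.example.org/tips",
     "date": date(2005, 1, 10)},
    {"title": "Advanced searching", "url": "http://www.example.com/advanced",
     "date": date(2005, 3, 2)},
    {"title": "Directory guide", "url": "http://www.example.net/guide",
     "date": date(2004, 11, 25)},
]

# One sort key per re-ranking option.
SORT_KEYS = {
    "date": lambda r: r["date"],
    "title": lambda r: r["title"].lower(),
    "domain": lambda r: urlparse(r["url"]).netloc,
}


def rerank(results, by):
    """Return the same results in a new order; newest first for 'date'."""
    return sorted(results, key=SORT_KEYS[by], reverse=(by == "date"))


for record in rerank(RESULTS, "date"):
    print(record["date"], record["title"])
```

The point to notice is that the set of results never changes; only the order in which they are presented does, which is exactly what the slider bars and re-ranking options are doing for you.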
Multi-search Engines
Having already mentioned one or two of these in passing, it's worth pausing for a moment to consider them in a little more detail. Using a single search engine is like using one reference book, or asking for one person's opinion; can you ever be sure that they're right? Running a search across a number of search engines helps a lot here - if all the search engines are pointing you towards certain sites then you can be reasonably assured that you've found the right material. Doing this individually is however time-consuming. A multi-search engine will take your query and pass it out to a number of other search engines, get the results, collate them and then display them to you. Nothing particularly new with this of course; they've been around almost as long as there have been individual search engines. My particular favourites are eZ2Find [18], which gives a host of informative categories (and not just Web pages) and, for example, links to specialised databases; Ixquick [19], which tends to have a slight UK bias to the engines that it uses; and Fazzle [20], which offers a useful percentage indicator of how appropriate it thinks the results are.
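The collating step can be sketched in Python as follows. This is an assumption about one reasonable way to merge results, not a description of how eZ2Find, Ixquick or Fazzle actually do it: the engine names and URLs are placeholders, the lists stand in for results that would really come back from live engines, and the ordering rule (pages found by more engines first, then by best position) is my own choice.

```python
from collections import defaultdict


def collate(results_by_engine):
    """Merge ranked URL lists; URLs found by more engines come first."""
    found_by = defaultdict(set)   # url -> engines that returned it
    best_rank = {}                # url -> best (lowest) position seen
    for engine, urls in results_by_engine.items():
        for rank, url in enumerate(urls, start=1):
            found_by[url].add(engine)
            best_rank[url] = min(rank, best_rank.get(url, rank))
    ordered = sorted(
        found_by,
        key=lambda u: (-len(found_by[u]), best_rank[u], u),
    )
    return [(url, len(found_by[url])) for url in ordered]


# Placeholder data standing in for three engines' answers to one query.
sample = {
    "engine_a": ["http://example.org/a", "http://example.org/b",
                 "http://example.org/c"],
    "engine_b": ["http://example.org/b", "http://example.org/d"],
    "engine_c": ["http://example.org/b", "http://example.org/a"],
}

for url, votes in collate(sample):
    print(votes, url)
```

A page returned by all three engines rises to the top, which mirrors the reasoning above: agreement between engines is a fair sign that you have found the right material.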
Catching What You've Missed
Multi-search engines are also useful for something else - showing you what you're missing! TurboScout [21] allows searchers to input their search term and then run that search across 23 different engines quickly and easily. Jux2 [22] has a useful interface that lets searchers view results from Google, Yahoo! and Ask Jeeves, and Thumbshots Ranking [23] shows in a graphical format the position of different Web pages between different search engines, and also shows what pages are found by one engine and not by another. A quick exploration with any of these tools clearly demonstrates that one search engine is not suitable for all needs!
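The kind of comparison these tools make can be sketched in Python with two ranked lists. This is only an illustration under stated assumptions: the function name and the short placeholder strings standing in for URLs are invented, and it assumes each list contains a page only once.

```python
def compare_rankings(first, second):
    """Compare two ranked result lists from different engines.

    Returns the shared pages with their position in each list, plus the
    pages that only one of the two engines found.
    """
    pos_first = {url: i for i, url in enumerate(first, start=1)}
    pos_second = {url: i for i, url in enumerate(second, start=1)}
    shared = {u: (pos_first[u], pos_second[u]) for u in first if u in pos_second}
    only_first = [u for u in first if u not in pos_second]
    only_second = [u for u in second if u not in pos_first]
    return shared, only_first, only_second


# Placeholder results for the same query on two engines.
shared, only_first, only_second = compare_rankings(
    ["page-a", "page-b", "page-c"],
    ["page-b", "page-d", "page-a"],
)
print("Found by both (position in each):", shared)
print("Only the first engine:", only_first)
print("Only the second engine:", only_second)
```

The two 'only' lists are the interesting part: they are the pages you would never have seen had you stuck with a single engine.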
Other Notable Mentions
I've already mentioned over 20 different search engines and I've hardly even scratched the surface! Rather than simply finish at this point I thought I'd quickly refer you to a few other engines that I cannot do without, and hope that you have 5 minutes to spare in order to check some of them out. IceRocket [24] is a fairly recent addition to the search engine flock, and what particularly impresses me is that they are constantly innovating and adding new resources; they really are racing to catch up with the top names, and while they're not there yet, take a peek - you might be pleasantly surprised. Mooter [25] and WebBrain [26] provide information in a graphic format; it doesn't particularly appeal to me, but I know that a number of people (particularly students) really enjoy the unusual presentation format. Turbo10 [27] is a good engine for searching over 700 'deep Web' engines to locate material that the main engines have probably missed (or more likely been unable to index).
Conclusion: So Many Engines, So Little Time
That's just a very brief outline of some of the engines that I use on a regular basis; I'm sure that you'll be familiar with some, or perhaps even most of them, but I hope that I've pointed you towards some that you've not explored yourself, and intrigued you enough to pay them a visit. I've not even had a chance to touch on finding people, finding images, multi-media or search engines for children but perhaps I'll have an opportunity in another column.
References
1. Which search engine when? http://www.philb.com/whichengine.htm
2. Google http://www.google.com
3. Yahoo! http://www.yahoo.com
4. MSN Search http://search.msn.co.uk/
5. Brainboost http://www.brainboost.com/
6. Factbites http://www.factbites.com/
7. Answers.com http://www.answers.com/
8. Open Directory Project http://www.dmoz.org
9. Librarian's Index to the Internet http://lii.org/
10. SOSIG http://www.sosig.ac.uk
11. Pinakes http://www.hw.ac.uk/libWWW/irn/pinakes/pinakes.html
12. Teoma http://www.teoma.com
13. Wisenut http://www.wisenut.com
14. Clusty http://clusty.com/
15. Google Personalized http://labs.google.com/personalized
16. Exalead http://beta.exalead.com/search/C=0/2p=1
17. ZapMeta http://zapmeta.com/
18. eZ2Find http://ez2find.com/
19. Ixquick http://www.ixquick.com/
20. Fazzle http://www.fazzle.com/log.jsp
21. TurboScout http://www.turboscout.com/
22. Jux2 http://www.jux2.com/
23. Thumbshots Ranking http://ranking.thumbshots.com/
24. IceRocket http://www.icerocket.com/
25. Mooter http://www.mooter.com/moot
26. WebBrain http://www.webbrain.com/html/default_win.html
27. Turbo10 http://turbo10.com/