Since the inception of the Internet, one of the most widely discussed problems has been how to find needed information effectively, and in how much depth. This thinking is reflected in a work published by the University of Sheffield stating "One of the main problems encountered when using the World Wide Web is finding useful information/things quickly." (www.shef.ac.uk 1995, p.1) The complexity and growth of the WWW can be confusing to new users, and the amount of information often seems to overwhelm the novice user. This paper discusses the current literature on discovering what the Internet holds and on efforts to reach that information efficiently.
This paper was compiled by searching on First Search (library literature), Proquest periodical abstracts, Infotrac, library collection catalogs at Drexel University, as well as by searching online within the Internet, primarily using the LYCOS search engine. I also browsed articles in the periodical collection at Drexel when I found a promising title.
One of the major inventions used today to try to solve the problem described above is the "search engine." Another option when searching for information on the Internet is a listing in which a human has assigned sites and/or articles to categories. These listings are called subject directories.
Brandt (1997, p.47) chooses to categorize the directories and search engines as one. He states that "Almost any grouping of computerized information that can be searched electronically could qualify as a search engine." Notess (1997a) and Minkel (1997) are the only authors who distinguish the two types of search tools. Most of the other references used to write this paper group search engines and directories together in their writings, and while usually mentioning both, do not distinguish their differences.
When a person uses a search engine, the result is often too many or too few hits (responses that meet the requested search parameters) and, in some cases, inappropriate returns. The cause could be an inappropriate choice of search engine, search parameters that are too wide or too narrow, search terms that are incorrect or poorly stated, or the fact that the information is not actually on the Internet.
A not unexpected response from service providers to the Web community's demand for information searching has been to produce more and more search engines. There are at least 200 available now. See The Searcher (Lacombe, 1997), Global Searches (O'Shea, 1997), and Tools for Web Exploration (Waterluv, 1997) for lists.
Many books exist as well with listings of search engines, and there are also many "web yellow pages" being produced, which list sites by category.
Cost also becomes a factor when searching for some piece of information. Speaking as an information specialist, Bates (1997, p.47) asks "Is my time worth more than the $4.00 I save when I retrieve that Washington Post article from its Web page, or is it a net savings for me to download it from Dow Jones News/retrieval?"
The user must choose the correct search words and know how to run the search. To help eliminate some of the problems associated with poor or incorrectly worded search parameters, Notess (1997a, p.65) states "Unique keywords, especially ones that uniquely identify a topic, are much better suited to large search engine queries." A unique keyword may, however, produce too few or no responses or, in some cases, an overwhelming number, as he illustrates: "A search for "drosophila" finds thousands of hits due to the vast amount of information on the fly, so you may want to try that one in a subject directory." (Notess, 1997a, p.66) The searcher obviously cannot afford to search through that number of responses.
Another cause of too few responses (or none at all) is noted by only one author. In a sidebar entitled "The number one reason your searches don't find anything," Levine (1997, p.107) writes "Well, maybe not your number-one reason, but our number-one reason: We spelled one of the search words wrong. Check carefully." This, of course, can cause great frustration.
Study the way that search engines work, and one finds that the logic of the search engine is similar to that of a reference librarian answering a question. The patron, or user, inserts the search term or terms, and then the search engine goes out onto the Web and finds the word or words specified. Similarly, a reference librarian has to translate a question into a few major words (usually nouns or noun phrases) with which to search the available resources.
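The translation step described above, reducing a question to a few major words, can be sketched in a few lines of Python. This is only an illustration: the function name, the sample question, and the short stop-word list are assumptions made for this example and do not come from any of the authors reviewed.

```python
import re

# Hypothetical stop-word list, kept short for illustration only.
STOP_WORDS = {
    "a", "an", "the", "of", "in", "on", "for", "to", "is", "are",
    "what", "how", "where", "can", "do", "does", "i", "find",
    "about", "and", "or", "information",
}

def extract_search_terms(question):
    """Reduce a natural-language question to its content-bearing words."""
    words = re.findall(r"[a-z0-9]+", question.lower())
    return [w for w in words if w not in STOP_WORDS]

terms = extract_search_terms(
    "Where can I find information about the life cycle of Drosophila?"
)
print(terms)
```

A reference librarian does far more than strip common words, of course; the sketch only shows why a question must be boiled down before a keyword search can use it.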
Authors in recent publications widely agree that locating needed information is sometimes a difficult and time consuming task. "No search engine can find everything all of the time, even using the best tips and tricks." (Brandt, 1997, p. 47) This statement conveys the need to use more than one search engine when searching for information on a topic, a point discussed below. Minkel (1997, p. 102) warns that "jumping into one of the many search engines armed with only a search term or two-can be time-consuming: success is far from guaranteed." One author writes "Tracking down relevant information on the Internet remains an elusive quest." (Notess, 1997a, p.63)
Scisco exclaims "My browser has more bookmarks than Barnes and Noble. I'm connected to so many newsgroups people think I'm Rupert Murdoch. I've got more plug-ins than Circuit City. So how come I still can't find what I want, protect my children, or send secure messages on the Net?" (1997, p.110) Mendelson asks "Who has time to hang around the Web, even for important information searches?" (1997, p.227)
Clearly, the authors quoted above agree that current search techniques on the Web need some kind of improvement.
The problem of too many search engines has been answered, in part, by the invention of the "mega" or "meta" search engine. These "mega" or "meta" search engines search multiple single search engine sites simultaneously and give the number of responses from each site. Thus, in most cases, the user receives a list showing each search engine queried and the number of hits it returned.
In some cases this may help save time by eliminating the need to search each search engine site individually. Some search engines search more deeply, or into parts of the Internet that others do not. Some only search the Web, some search gopher sites, some search FTP sites, and still others search newsgroups. Many search engines search all of these categories and some can be manipulated by the user to search only one part of the Internet. Others allow the user to search only for a specific type of document, e.g. "pictures" or "top 5% sites."
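The way a "mega" search works can be sketched as follows. This is a minimal illustration under stated assumptions, not a description of any real service: the engines are in-memory stand-ins, the pages they "index" are invented, and the engines are queried one after another rather than simultaneously. Merging the results into one de-duplicated list is also an assumption of this sketch; the literature reviewed here mentions only the per-site hit counts.

```python
# Hypothetical in-memory "engines": each maps a keyword to the pages it indexes.
# The engine names appear in the literature; the page lists are invented.
ENGINE_INDEXES = {
    "AltaVista": {"drosophila": ["page1", "page2", "page3"]},
    "Lycos": {"drosophila": ["page2", "page4"]},
    "Excite": {},
}

def search_engine(name, term):
    """Return the pages one engine holds for a term (empty list if none)."""
    return ENGINE_INDEXES[name].get(term.lower(), [])

def meta_search(term):
    """Query every engine; return hit counts per engine and the merged results."""
    counts = {}
    merged = []
    for name in ENGINE_INDEXES:
        hits = search_engine(name, term)
        counts[name] = len(hits)
        for page in hits:
            if page not in merged:  # skip pages another engine already returned
                merged.append(page)
    return counts, merged

counts, merged = meta_search("Drosophila")
for name, n in counts.items():
    print(f"{name}: {n} hits")
```

Even in this toy version, no single engine holds every page, which is the situation that makes searching several engines at once attractive.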
Regarding the merits of these types of "mega" search sites, Lange (1996, p.1) applauds the "All In One Search Page," stating that it is "awesome, and should provide you with everything you need in terms of web searching... a definite must for bookmarking..."
Brandt (1997, p.50) seems to concur that "mega" search tools are necessary. "When you use a given search engine, you are only searching what it has in its database. And even between the big general purpose ones- AltaVista, Lycos, Excite, Opentext-there are differences, omissions, and exclusions. Thus, using a search engine that searches several search engines makes sense."
However, other authors are not nearly so positive about "mega" search sites. "Searching multiple systems that use different search syntax presents numerous difficulties," states Notess (1997b, p.89). Scisco agrees: "Online search engines like Alta Vista are hit and miss, and using a whole bunch of them at once is like juggling balls, pins, and torches." (1997, p.114) Another author states "I've never been crazy about the spotty results I've received from them." (Minkel, 1997, p. 104)
Will the trend of "mega" searching continue? Brandt (1997) thinks this is probably the case. He states that "mega" search engines will become more common and even necessary. "... As the number of search engines you want to query increases from something like nine or 20 to something like 30 or 50, meta-search engines will become essential" (1997, p.50). Given that there are now many search engines (over 200, as previously noted), it seems clear that these mega sites can ease the problem by cutting down on the time spent on individual search engines. Syntax problems do exist, however, and the user may need to alter the search parameters to accommodate each engine's idiosyncrasies. Some search engines support full Boolean searches, others need a "+" in front of the search words, some need to have a phrase enclosed in quotes, and others have the ability to limit or restrict some words.
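The syntax differences just described can be made concrete with a short sketch that renders one set of search terms in three styles. The style names ("boolean", "plus", "plain") and the function are hypothetical labels chosen for this example; they stand in for the kinds of conventions mentioned above and do not document the syntax of any particular search engine.

```python
def format_query(terms, style):
    """Render the same search terms in one of three hypothetical syntax styles."""
    def quote(term):
        # Enclose multi-word phrases in quotes so they are searched as a unit.
        return f'"{term}"' if " " in term else term

    if style == "boolean":   # engines supporting Boolean operators
        return " AND ".join(quote(t) for t in terms)
    if style == "plus":      # engines requiring "+" before required words
        return " ".join("+" + quote(t) for t in terms)
    if style == "plain":     # engines taking bare words and quoted phrases
        return " ".join(quote(t) for t in terms)
    raise ValueError(f"unknown syntax style: {style}")

terms = ["search engine", "syntax"]
for style in ("boolean", "plus", "plain"):
    print(style, "->", format_query(terms, style))
```

A "mega" search site has to perform some translation of this kind for every engine it queries, which is one reason its results can be uneven.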
I believe that the Web needs the major search engine providers to come to some standard method of searching or indexing. This problem is not unlike the problem that exists with cataloging books and assigning subject headings. Of course, if all the search engine providers came to some agreement about syntax and indexing, they would need to sell their service on something other than the selling points they now use. Easy access, speed of response, and the breadth and depth of what each search engine covers would become more important to the user. Analogous to niche marketing, it may make sense that search engine providers become very specialized and not try to handle everything that is on the Web.
One other group of search tools is becoming better known as the number of information providers and users on the Web increases (and as response times slow correspondingly): those that have been categorized as "off line browsers." These tools go into the Web and find sites related to the topics specified by the search parameters. They group these files and save them for perusal at a later time, while the user is off line. The main advantage is that the user does not have to sit and wait for a response. The main disadvantage is that the user may not have what is needed when the results are looked at later, since the search parameters cannot be changed right away. These off line browsers are the topic for another paper, however.
In conclusion, feelings are mixed about the merit and efficiency of current search engine technology. None of the authors spoke to the issue of training users, which I feel would be of major help with the problem of too few or too many responses.
Authors also feel differently about the usefulness and efficiency of the "mega" search. I feel that these tools have some merit in regard to saving time in searching. However, until search engines and their indexing techniques become more standard, the results will remain mixed. The authors of the current literature reviewed here all agree to some extent that changes are needed in the field of search technology on the Web. As far as new technology goes, other than the mega search and off line browser search, I found no references to anything more innovative in the current literature.
Submitted by Pete Llewellyn
INFO 500 Review of Literature
Summer 1997 Drexel University
