I’m an unabashed Google user. I think Google has long provided the best search results on the Web and I don’t see any sign that’s going to change any time soon. The reason I think Google has so totally eclipsed its competitors like Yahoo!, Ask and Excite (remember them?) is that Google is the search engine that follows what I like to call the “smart firehose” principle.
Google spends millions and millions of dollars every year on tweaking their search results to make them better. Engineers at Google constantly ask themselves: How can we give people the information they want in as few clicks as possible? Can we add context-relevant information such as a map, movie showtimes or images in order to make the search results more useful? In other words, Google would rather just give you the information you need if it possibly can, instead of sending you somewhere else.
When you search for something using a standard Google search (that is, at Google.com or through your browser’s search box), the search engine doesn’t split the relevant results into separate sections and force you to click through each one. Instead, Google just gives you a list of the best results, depending on what you’re searching for and regardless of what type of result it is; consequently, a Google search results list will include links to web pages, maps, images, videos and more, all in one list. This “smart firehose” model works well for Google because it gives good results and then trusts people to make the right choice.
Libraries, in comparison, are woefully behind in search. Catalog searches are almost always totally separate from research information, so in order to find good information about, say, diabetes, a user will need to do multiple searches: one for the library’s catalog and at least one for the research databases. Often, users will need to go into several different research databases and perform individual searches.
Library users need a smart firehose.
Let’s start with how things are done right now by two libraries: Chicago Public Library and New York Public Library.
Fig. 1 shows the results we get when we use the search box in the top-right corner of the front page of Chicago Public Library’s website to search for the term “diabetes”.
The complete lack of results pertaining to research databases leaves the user totally ignorant of the top-notch information they could be getting. As it stands, no user will ever know, based on what they see here, that Chicago Public Library has access to lots of high-quality, authoritative information in the research databases.
It is also worth noting that even if a user chooses to use CPL’s Advanced Search page, no results from the research databases are shown. None.
Fig. 2 is a bit deceptive. While it looks like a lot of good information, it’s actually the same problem–from the opposite end–that we saw in Fig. 1. If the user does discover the link from CPL’s front page to online research (buried “below the fold” on the page for most users in a link labelled “Choose a Research Topic”) and follows that to the page for Health & Medicine research databases, they can then use the Quick Search box to do another search for “diabetes”, the results of which are seen here. Note, by the way, that the Quick Search does not include the Health and Medicine Reference Collection (which has several extremely good articles on diabetes) or Medline, both of which would be extremely useful to someone researching the topic.
The results here are good, though, even if it did take two difficult-to-find clicks to get to the right place from which to search.
New York Public Library does even worse. Using the search field on New York Public Library’s website and leaving “Everything” as our search type, we see this. Instead of getting the results we want, we get no results whatsoever, just a tally of the number of hits in each of the different collections. Users will have to click each link to see what results they’ve gotten.
These segregated results serve only to confuse the user. Why separate results? The business of a library is providing good information, so we shouldn’t be hiding (or worse, giving no clues to the existence of) authoritative information.
Now, let’s look at search done right by the reigning champion: Google.
By going to Google.com and searching for “diabetes”, we are presented with these results, and this is what search done right looks like. I would like to point out several things:
- The number of results is off to the right, just above the advertisements, because it is almost incidental to the actual results.
- The results are not broken down by how many of each type were located. To Google, a good hit is a good hit.
- The results include information from several types of sources, all presented in ways that are immediately apparent to users.
Google, of course, has a number of proprietary tools to create that list of results, the best known of which is PageRank. No one outside of Google knows precisely what is involved in determining its rankings today–something Google is, for various reasons, very anxious to keep secret–but from the company’s history, we have a rough idea.
Larry Page and Sergey Brin’s rough idea for determining how valuable a page on the Internet is was this: the more sites that link to a particular page, the more valuable that page must be, and hence the higher its PageRank. Of course, PageRank has evolved considerably since its days as Larry and Sergey’s project at Stanford (then called “BackRub”).
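To make that rough idea concrete, here is a minimal sketch of the textbook version of PageRank, which is certainly not what Google runs today. The function name, the damping factor of 0.85, the fixed iteration count and the tiny four-page "web" are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by repeated redistribution of rank.

    `links` maps each page to the list of pages it links to.
    Returns a dict of page -> score; the scores sum to 1.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with every page equally valuable.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page keeps a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its value on to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its value evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# A made-up web of four pages: three of them link to "c".
toy_web = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(toy_web)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 3))
```

In this toy web, the page that the most other pages link to comes out on top, which is the whole intuition in miniature.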
Integrated library systems (ILSes) and vendor-supplied research databases have comparable signals to work with, too: the number of times a title has circulated and the number of times users have viewed a database article. Sure, truly intuitive ranking systems are more complex, but it’s a start.
But this brings us back to the original problems that this essay is determined to discuss. Why can’t results from the catalog and the research databases be merged into one list that gives users what they want, regardless of source? At the very least, why can’t a catalog search results page offer a link to relevant research databases when a user performs a search? (E.g., “You appear to have been searching for ‘diabetes’. Have you tried our health and medicine research databases?”)
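As a rough sketch of what such a merged list might look like in code, the example below combines catalog hits and research database hits into one ranking and falls back to a suggestion link. Everything in it is hypothetical: the `Hit` record, the use counts, the topic table and the scoring rule are assumptions made for illustration, not any vendor's actual API. Because circulation counts and article views live on very different scales, the sketch scales each count against the busiest item from the same source before comparing.

```python
import math
from dataclasses import dataclass


@dataclass
class Hit:
    title: str
    source: str      # "catalog" or "database" (hypothetical labels)
    use_count: int   # circulations for a title, views for an article


def popularity(hit, max_by_source):
    """Scale a use count to the range 0..1 within its own source."""
    top = max_by_source[hit.source]
    if top <= 0:
        return 0.0
    # log1p dampens the gap between very popular and modestly popular items.
    return math.log1p(hit.use_count) / math.log1p(top)


def merge_results(catalog_hits, database_hits):
    """Return one list of hits, best first, regardless of source."""
    hits = list(catalog_hits) + list(database_hits)
    max_by_source = {}
    for hit in hits:
        current = max_by_source.get(hit.source, 0)
        max_by_source[hit.source] = max(current, hit.use_count)
    return sorted(hits, key=lambda h: popularity(h, max_by_source), reverse=True)


# Hypothetical table mapping search terms to research database subjects.
TOPIC_DATABASES = {"diabetes": "health and medicine"}


def suggest_databases(query):
    """The 'at the very least' option: a pointer to relevant databases."""
    topic = TOPIC_DATABASES.get(query.strip().lower())
    if topic is None:
        return None
    return (f"You appear to have been searching for '{query}'. "
            f"Have you tried our {topic} research databases?")


# Made-up sample data for a search on "diabetes".
catalog = [Hit("Diabetes Cookbook", "catalog", 120),
           Hit("Living with Diabetes", "catalog", 40)]
databases = [Hit("Diabetes mellitus overview", "database", 9000),
             Hit("Insulin therapy", "database", 300)]

for hit in merge_results(catalog, databases):
    print(f"[{hit.source}] {hit.title}")
print(suggest_databases("diabetes"))
```

Scaling within each source is only one possible design choice; a real system would presumably also weigh how well each item matches the search terms, which this sketch ignores entirely.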
Here is one last figure to consider:
The three default icons under “Are You Looking For?” have been supplemented with a fourth: “research databases”. (Excuse my oversimplified illustration; the icon is supposed to be a mortarboard hat, if you can’t tell.) Also, the first result is a hit from a research database, along with a paragraph taken straight from an article, describing what diabetes is and offering a link to more information there. Users can immediately grasp that the first result is a relevant one and, entirely without requiring any extra effort of the user, more high-quality information is presented to them.
Integrating all of the information a library can offer in one easy-to-use list is a difficult task, but it is one that must be undertaken if libraries are to remain relevant and useful to patrons. This change would be a momentous one, since companies that make all different types of software–catalogs, databases and ILSes–would have to hammer out some sort of standardized API for these sorts of things. (This, though, is an expansive topic that is probably best left for a future essay.) It is, unfortunately, something which I fear may not happen for some years, by which time more of our “mind share” will have been taken away from us by Google. If libraries are to remain relevant, we have to put all our information at the users’ fingertips, regardless of source and with one simple search. We need our own smart firehoses.
Tags: Catalogs, Google, Libraries, OPACs, Search, User Experience, Users





