Cornell University

Catherwood Library

Catherwood Library, Ives Hall, 607-255-5435

Question of the Month

From the Catherwood Library reference librarians

October 2006

PLEASE NOTE: The Reference Question of the Month is kept current only during the month for which it was written. Archived questions will not be updated, and over time may contain inaccurate information or broken web links. We provide archived questions as a service, since much of the information will remain accurate and of continued interest to the ILR community.

Question: Why doesn't Google always find what I need?  What is the deep web?

Answer: Search engines such as Google rely on web crawlers, programs that follow links from page to page, and these crawlers do not index every page on the Web. The pages that they recognize and index are often referred to as the surface or visible web. Pages that are invisible to the search engines, whether by design or because of how they are built, are referred to as the deep or invisible web. These include pages generated dynamically by database software, often from databases of important scholarly materials.
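To make the distinction concrete, here is a minimal sketch in Python of a link-following crawler run against a tiny made-up site. Everything in it is an assumption for illustration: the PAGES dictionary, its URLs, and the crawl function are hypothetical, and real search engine crawlers are far more sophisticated. The point is only that pages produced in response to a search form have no links leading to them, so a crawler that follows links never sees them.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical in-memory "web site" (URL -> HTML), invented for this example.
PAGES = {
    "/": '<a href="/about">About</a> <a href="/search">Search the catalog</a>',
    "/about": '<a href="/">Home</a>',
    "/search": '<form action="/results"><input name="q"></form>',
    # The next two pages are generated by database software only when a
    # visitor submits a query; no static page links to them.
    "/results?q=labor": '<a href="/record?id=101">Labor history article</a>',
    "/record?id=101": "Full text of a scholarly article",
}


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start):
    """Follow links breadth-first from start; return every URL reached."""
    seen = {start}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        parser = LinkCollector()
        parser.feed(PAGES.get(url, ""))
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return seen


surface = crawl("/")          # pages the crawler can index
deep = set(PAGES) - surface   # pages that exist but were never reached

print("surface web:", sorted(surface))
print("deep web:", sorted(deep))
```

In this sketch the crawler reaches the home, about, and search pages, but never the results or record pages, because it does not fill in the search form. Those unreached pages play the role of the deep web.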

Tools are being developed to crawl and index the deep web, and webmasters and developers are also working to expose their information to search engines. However, as part of the research process, users must still recognize that the deep web exists and take steps to find the information they need.

How can you find this information?

Remember that the search results you see are not complete; you may be missing valuable information.

  • Think databases. Remember that databases and other non-static HTML pages, as well as many image and video files, are often not crawled and indexed. Some of the most important tools are the databases available through your library. At Cornell, patrons can use the Find Databases tool. Catherwood reference staff also provide a listing of Article Databases.
  • In some cases, you can search on broad terms to locate an individual web site, navigate to it, and then use that site's own search tools to expose hidden pages.
  • Most importantly, ask the reference librarians. In addition to helping you discover resources, they can provide advice on how and what to search. See Catherwood's Ask a Librarian web page.

For additional information:
 
Wikipedia’s entry for Deep web.

The University at Albany Library’s Internet tutorial, The Deep Web.
 
The 'Deep' Web: Surfacing Hidden Value, White Paper from BrightPlanet.

Deep Web Research, an information blog created by Marcus P. Zillman for monitoring deep web research resources and sites on the Internet. The site includes Academic and Scholar Search Engines and Sources.

— Researched by Mary Newhart