Search Tips Report: Information You Can Use

A monthly newsletter from Eipert Information Services

January 2004                                Number 1

Search Tips Report was a free email newsletter from Eipert Information Services, featuring practical tips about business and sci/tech information sources and research strategy for you to apply in your own business. 

Search strategy tip: a few good business reasons to search archived web sites

Tracking websites back in time through Internet Archive's WayBack Machine and/or search engine caches can provide valuable clues when trying to learn more about a company or person, or when looking for anything that has otherwise vanished from the current Web.

Why might a look at an older version of a web site be of value?

Older versions of a company's web site can be useful for gathering competitive intelligence, addressing such issues as:
a) how a company presented itself in the past as opposed to now,
b) what a company did before being merged or acquired,
c) how a product was marketed in the past, and
d) what jobs a company advertised last year.

Other examples:
a) Technical information or product details useful in verifying prior art for patents may be available only on a company's earlier sites.
b) A web site that was formerly used for illegal activities on the Internet, and has since been removed from the current Web, might still be found in its archived version and serve as evidence for investigators.

Biographical information about an executive who no longer works for the company might be found on a company's old site. Sometimes information about people can be gleaned from their old student or personal home pages.

Other uses are limited only by the searcher's imagination. Business professionals could produce case studies of failed companies based on information from vanished web sites. Web site designers could study how the design of web sites has changed. Public policy professionals could analyze how an organization has changed its opinion or bias over the years.

WayBack Machine

The WayBack Machine (http://www.archive.org/web/web.php) is based on a permanent archive of the web that was begun in 1996 by Internet Archive, a public nonprofit organization with the goal of universal access to our cultural heritage. Internet Archive has crawled the Web every six months or so since then to build a huge database of archived web sites. Enter the URL of a particular web site in the WayBack search box to see a set of links to dated former versions of that site. This is an incredible archive, but don't expect it to contain every web page ever published.

Example: Enter "yahoo.com" (without the quotation marks) in the WayBack Machine search box at http://www.archive.org/web/web.php and click "Take Me Back" to see old versions of Yahoo.
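The same lookup can also be written as a URL, which is handy when checking many sites. The Python sketch below is an illustration added to this tip, not a tool from Internet Archive. It assumes the web.archive.org address pattern in which either an asterisk or a 14-digit timestamp (YYYYMMDDhhmmss) comes before the target address. The function names are hypothetical.

```python
from datetime import datetime

# Assumed address pattern for archived pages; not taken from the newsletter.
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_listing_url(site):
    """Build the address of the page that lists all dated captures of `site`."""
    return f"{WAYBACK_PREFIX}/*/{site}"


def wayback_snapshot_url(site, when):
    """Build an address that asks for the capture nearest to the datetime `when`."""
    timestamp = when.strftime("%Y%m%d%H%M%S")  # 14 digits: YYYYMMDDhhmmss
    return f"{WAYBACK_PREFIX}/{timestamp}/{site}"


print(wayback_listing_url("yahoo.com"))
print(wayback_snapshot_url("yahoo.com", datetime(2004, 1, 15)))
```

Pasting either printed address into a browser should lead to the list of captures or to the capture closest to the requested date, if the site was archived at all.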

Searching for archived sites is most reliable and comprehensive when you already know the URL of a site. The Internet Archive has recently offered the beta version of a new full-text keyword search covering part of the archive. This offers the promise of finding information by topic, or of finding a site when the URL is unknown.

Search engine caches

The cache feature of search engines, primarily Google, can be used to see a fairly recent version of a web page. Take advantage of the fact that search engines actually search their own cached versions of web sites rather than the actual current sites. If a "cached" link is present in Google's search results, it will lead to the version of the page Google used in its search, not the current page. Typing "cache:" in the search line, and then the URL directly after the colon, is an alternate way to see the cached page. Because of the way that Google crawls the web, you might even see cached pages from two different dates using both methods.

The date of any cached page can be difficult to predict, since Google re-indexes some sites much more often than others. Google does not list the date that any particular page was cached, and cached pages will disappear once Google has indexed the site again.

Example: Type "cache:www.yahoo.com" in the search box at www.google.com, or type "yahoo" in the search box, run the search, and then click the "cached" link in the first search result. Note the date and time near "In The News" on the right-hand side of the resulting Yahoo page. Compare this to the current page at www.yahoo.com.
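For anyone checking several pages, the "cache:" query can be assembled in a short script. This Python sketch is an added illustration with hypothetical function names. It assumes the common https://www.google.com/search?q=... address form, and it only builds the address without contacting Google. Search engines change their features over time, so the "cache:" operator described here may no longer be supported.

```python
from urllib.parse import urlencode


def google_cache_query(page_url):
    """Return the 'cache:' query text for a page, dropping any http(s):// prefix."""
    for scheme in ("http://", "https://"):
        if page_url.startswith(scheme):
            page_url = page_url[len(scheme):]
            break
    return f"cache:{page_url}"


def google_search_url(query):
    """Return a search address with the query safely percent-encoded (assumed form)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


query = google_cache_query("http://www.yahoo.com")
print(query)
print(google_search_url(query))
```

The first printed line is the text to type in the search box; the second is the equivalent address for a browser.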

A few other search engines also have cache features. Check out the "cached" or "archived copy" links in the search results from www.gigablast.com, www.incywincy.com, www.yuntis.com, and www.daypop.com.

*** Contact Sue Eipert (seipert@eipertinfo.com) at Eipert Information Services for customized research of proprietary business and sci/tech databases, as well as the Internet, for marketing, R&D, strategic planning or litigation support. ***


Eipert Information Services
seipert@eipertinfo.com



Copyright © 2000-2008 Sue Eipert