How the Web Was Won: Google’s Technology
In the Web’s early days, full-text searches ranked their results according to information contained on Web sites themselves, such as the prominence of a certain word. If, for example, you wanted to learn about buying a small dog and you searched for dachshund, your list of sites was likely to be organized by which ones had the most instances of the word “dachshund.” The top results might well have been a site set up by a woman in Boise who painted cartoon dogs onto sweatshirts, the schedule for a group of people in Sacramento who have dachshunds with ingrown toenails, and the Daytona Dachshunds Little League roster. You could search through thousands of pages before you hit any useful information.
Even if you narrowed your search to something like “dachshund breeders,” you might still have gotten sites run by pet food conglomerates or veterinarians or any company that set up its Web pages to draw people with an interest in dogs. In short, it was maddeningly hard to get relevant search results.
Enter Google. In 1995, Sergey Brin and Larry Page met in the graduate computer science program at Stanford University. Their idea was to create a search engine that would rank search results not by data that Web masters could manipulate, but by drawing on the strength of the Internet itself: community input. Their technology evaluated a site primarily on how many other sites linked to it, and ranked search results accordingly. Thus, their searches tended to return results that lots of other people found useful, resulting in a surprisingly valuable system.
By 1998, Brin and Page had dropped out of Stanford to start Google. In its first year, the company, run by four employees out of a garage in Menlo Park, California, answered about 10,000 search requests per day. Today, the Web is home to about a dozen very popular search sites and likely thousands of less well-known ones, but Google’s computers handle more search requests than anyone else’s: over 250 million per day.
Google is the reigning search champ not because the company has clever marketing (it doesn’t) or a killer online dating service (again, no dice), but because the site is easy to use and effective.
Tip: Wonder what all those people are searching for? Google provides snapshots of its search activity, by month and by year, at Google Zeitgeist, www.google.com/press/zeitgeist.html. This is the perfect place to find out if anyone still cares about Martha Stewart or whether The Apprentice is declining in popularity.
How the Ranking Works
Google uses a number of elements to decide whether a Web page is a good match for a particular search. First, it looks at links. Links from one Web page to another don’t appear spontaneously; people have to make them, in effect saying “Look here and here and here.” Because each link thus represents a decision, Google infers that a link from one page to another is tantamount to a vote for the second page. Pages with lots of votes are considered more important than other pages. For example, if a million baseball-fan Web sites all have links to MLB.com (home of Major League Baseball), Google’s logic is, “Hey, that’s an important site for people searching for the word baseball.”
In addition, Google ranks the pages that cast the votes, based on their own popularity, and gives more weight to the votes from heavily linked-to pages. Finally, Google uses this information to assign Web pages an appropriate PageRank (Google’s term for status), which it calculates on a scale from one to ten.
Note: The term PageRank is actually based on the name of one of Google’s founders, Larry Page, not on the idea of Web pages.
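If you’re curious how link “votes” weighted by the voter’s own status can be turned into numbers, here’s a minimal Python sketch. It is not Google’s actual code: the function name pagerank, the damping value of 0.85, the fixed number of passes, and the toy site names are all assumptions made for illustration, and the raw scores it produces are fractions that add up to 1 rather than the one-to-ten scale described above.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate a status score for each page from who links to whom.

    links maps a page name to the list of pages it links to.
    damping and iterations are illustrative assumptions, not Google's settings.
    """
    # Collect every page that appears as a source or a target of a link.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with every page having an equal share of the total status.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page keeps a small baseline share regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly among the pages it links to,
                # so votes from high-status pages carry more weight.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its vote over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# A toy Web: three hypothetical fan sites and one heavily linked-to site.
toy_web = {
    "fan-site-a": ["mlb"],
    "fan-site-b": ["mlb"],
    "fan-site-c": ["mlb", "fan-site-a"],
    "mlb": ["fan-site-a"],
}

ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: item[1], reverse=True):
    print(f"{page}: {score:.3f}")
```

In this toy Web, the page that collects the most votes ends up on top, and fan-site-a outranks the other fan sites because one of its votes comes from that heavily linked-to page.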
But all that jazz would lead to nothing more than an interesting hierarchy of Web popularity if it didn’t take into account the words you’re searching for. So when you query Google, it combines PageRank with an additional system for matching text, which looks not only at the content on a first layer of pages but also at the content on pages linking to them, to produce a list of pages that is, more often than not, relevant.
In all, the Google equation, or algorithm, incorporates 500 million variables, looking at everything from links to the position of your search terms on a page. And most searches run in much less than a second.
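To get a feel for how a link-based status score and text matching might work together, consider this toy Python sketch. Everything in it is an assumption for illustration: the function names, the link_rank numbers, the half weight given to words on linking pages, and the choice to multiply the two scores. The real system weighs vastly more factors, and how it combines them isn’t described here.

```python
def text_score(query, page_text, linking_texts):
    """Count query words on the page itself and on the pages linking to it."""
    terms = query.lower().split()
    own_words = page_text.lower().split()
    own_hits = sum(own_words.count(term) for term in terms)
    link_hits = sum(
        text.lower().split().count(term)
        for text in linking_texts
        for term in terms
    )
    # Assumption: words on linking pages count half as much as the page's own.
    return own_hits + 0.5 * link_hits


def rank_results(query, pages):
    """Return matching page names, best first, by text score times link rank."""
    scored = []
    for name, info in pages.items():
        match = text_score(query, info["text"], info["linking_texts"])
        if match > 0:  # pages that never mention the query are left out
            scored.append((match * info["link_rank"], name))
    scored.sort(reverse=True)
    return [name for _, name in scored]


# Hypothetical pages; link_rank stands in for a status score computed elsewhere.
pages = {
    "breeder-directory": {
        "link_rank": 0.6,
        "text": "dachshund breeders listed by state",
        "linking_texts": ["find dachshund breeders here"],
    },
    "sweatshirt-shop": {
        "link_rank": 0.05,
        "text": "dachshund dachshund dachshund sweatshirts dachshund",
        "linking_texts": [],
    },
    "baseball-scores": {
        "link_rank": 0.9,
        "text": "baseball scores and schedules",
        "linking_texts": [],
    },
}

results = rank_results("dachshund", pages)
print(results)
```

The sweatshirt page repeats the search word most often, but its low status keeps it below the well-linked directory, and the popular baseball page never appears because it doesn’t match the words in the search.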
Because the site’s methods are so complex, it’s tough (though not impossible) to jigger a page in order to improve its rank in a Google search.
Comparing Google with Other Searches
Most of the time, you’ll probably decide which search site to use based on the relevance of its results. But these days, many search sites return similar results, which means you might want to make your choice based on factors like speed and site design. It’s akin to buying a car today: most automobiles will get you where you want to go, but they differ in reliability, smoothness, and style.
Google: The Missing Manual, 2nd Edition
By J.D. Biersdorfer, Rael Dornfest, Matthew MacDonald, Sarah Milstein
Pub Date: March 2006
Print ISBN-10: 0-596-10019-1
Print ISBN-13: 978-0-59-610019-3