Monday, June 6, 2011

Rank tools

This web ranking tool accepts a URL for a web page and fetches two numbers.

The first is Google's PageRank. Without a doubt, the PageRank number that Google makes available is increasingly worthless. This may be part of their grand design to put pressure on marketers who trade and sell links in order to improve rankings. Prior to 2003, a PageRank number made it easy to evaluate how much a link was worth. But by now, of course, Google prefers that all e-marketers buy ads instead, because this is 99 percent of Google's revenue.

We used to also fetch the Alexa traffic rank for a domain. Alexa's numbers come from the "phone home" feature on Alexa's toolbar. This ranking reflects the tastes of those who tend to install toolbars in general, and Alexa's toolbar in particular. Alexa only ranks by domain, which means that the traffic to any page on that domain counts toward that domain's total. However, Alexa refines their counts further so that in the end, the ranking is a reflection of the unique visitors to a domain, calculated once per day and averaged on a moving basis.

The second number reported by this web ranking tool is a count of total external backlinks from Yahoo. An external backlink is a link to a specific page from outside of that page's own domain. Yahoo has much better reporting for backlinks than Google. It's possible that Google uses a better index of backlinks for its internal ranking calculations, but their public reporting of backlinks is basically worthless. If the site that interests you is a blog, don't be surprised if Yahoo's backlink count seems extremely high. Blog software specializes in facile linking, which is another way of saying that bloggers spend all day swapping blogrolls and linking to each other's trivia. It doesn't mean much. [UPDATE: As of August 2010, it appears that Yahoo has disabled the technique we used to show backlinks.]

Web ranking is important to millions of webmasters because it determines how well their pages show up in search engines. Very few engines bother to carefully analyze the actual content of a page. Instead, they take the easy way out and rely on external factors such as linking, combined with the anchor text in links. Titles and headlines are also important. If a web page doesn't show up in the first 20 results of most popular engines for a webmaster's chosen keywords, then that page cannot be considered competitive.

Web traffic and linking follow a power law distribution curve. In simple terms, this means that something like the top 10 percent of all websites get 90 percent of all web traffic and all external links. The bottom 90 percent of all sites share the 10 percent of web traffic and external links that remain. If you have a sense of what people mean when they say that "the rich get richer," then you know what it's like to be a poor webmaster who is told by his boss to increase traffic to a site.
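To get a feel for how lopsided such a curve is, here is a minimal Python sketch. It assumes a simple Zipf-style model in which the site at rank r receives traffic proportional to r raised to a negative exponent. The function name, the number of sites, and the exponent are all illustrative assumptions, not measurements of the real web.

```python
def top_share(n_sites: int, top_fraction: float, exponent: float = 1.0) -> float:
    """Share of total traffic captured by the top `top_fraction` of sites.

    Assumed model (hypothetical): the site at rank r gets traffic
    proportional to r ** -exponent.
    """
    # Traffic weight for each rank, from the busiest site (rank 1) downward.
    weights = [rank ** -exponent for rank in range(1, n_sites + 1)]
    # Number of sites that count as "the top", at least one.
    top_n = max(1, round(n_sites * top_fraction))
    return sum(weights[:top_n]) / sum(weights)


if __name__ == "__main__":
    # Compare how concentrated traffic is for two assumed exponents.
    for exponent in (1.0, 1.5):
        share = top_share(1000, 0.10, exponent)
        print(f"exponent {exponent}: top 10% of sites get {share:.0%} of traffic")
```

The steeper the assumed exponent, the larger the share that goes to the top sites, which is the "rich get richer" effect in numbers.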

A power law distribution becomes a straight line when graphed on a log scale. For this reason, both the PageRank scale and Alexa's graphs are logarithmic. Up until mid-2003, the 0 to 10 PageRank scale appeared to have a log base that averaged between 4 and 6. This meant that you needed about five times the number of links to increase your PageRank by one digit, if everything else was equal. But since 2003, it has become clear that high PageRank pages now have amazing power to confer PageRank on others. Today it seems that with only a few links from pages that average, let's say, a PageRank of 7, the target page can already expect a PageRank of 6.
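The pre-2003 arithmetic can be written out as a short Python sketch. It assumes a log base of 5, the midpoint of the 4 to 6 range mentioned above, and it assumes that all links are of equal strength. The function name is hypothetical, and this is not Google's calculation.

```python
def link_multiplier(current_pr: int, target_pr: int, log_base: float = 5.0) -> float:
    """Rough factor by which the link count must grow to move from
    current_pr to target_pr on a logarithmic 0 to 10 scale.

    Assumption: every PageRank step costs `log_base` times more links,
    with everything else being equal.
    """
    return log_base ** (target_pr - current_pr)


if __name__ == "__main__":
    # One step up the scale costs about five times the links under this model.
    print(link_multiplier(3, 4))
    # Two steps up compound: five times five.
    print(link_multiplier(3, 5))
```

Under this model each extra digit compounds, so two digits cost about 25 times the links, not 10 times.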

This recent phenomenon is not the classic way of computing PageRank. Some suspect that the classic calculation required so much overhead that Google has been faking it since mid-2003. In any event, PageRank is one of dozens of factors that affect how a page will rank in Google's index. If the anchor text in links to a page correlates with the search terms entered by the user, then this is more important than the actual PageRank of that page. The classic PageRank calculation had nothing to do with words or content; it was precomputed on only the number and strength of links.

On Alexa, high-traffic domains get ranked with lower numbers, so that the number one domain on the web has the highest traffic. Reports on actual traffic from various webmasters have resulted in a plot of Alexa rank against daily unique visitors (Alexa itself does not offer this information). The formula for the curve that best fits these plots takes Alexa's traffic rank number, raises it to the power of -0.732, and multiplies the result by 7 million. This tool used to fetch the Alexa rank and make this calculation for you, but our confidence in Alexa's rank is now so low that we don't bother.
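For anyone who wants to run the numbers by hand, the curve fit described above comes down to one line of arithmetic. Here is a minimal Python sketch of the formula as stated. The function name is made up for illustration, and this is not the tool's original code.

```python
def estimated_daily_visitors(alexa_rank: int) -> float:
    """Estimate daily unique visitors from an Alexa traffic rank.

    Uses the curve fit quoted in the text:
        visitors = 7,000,000 * rank ** -0.732
    """
    if alexa_rank < 1:
        raise ValueError("Alexa ranks start at 1")
    return 7_000_000 * alexa_rank ** -0.732


if __name__ == "__main__":
    # Print the estimate for a few ranks across the range Alexa covers.
    for rank in (1, 100, 10_000, 100_000):
        print(f"rank {rank:>7}: about {estimated_daily_visitors(rank):,.0f} visitors per day")
```

Treat the result as a rough estimate at best, for all the reasons given about Alexa's sample.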

Remember that those who use Alexa's toolbar do not reflect the average web surfer. Search engine optimizers, for example, are more aware of Alexa than the general population, so SEO domains do much better in Alexa than other domains. One of our domains has nothing to do with marketing, search engines, or traffic. Alexa has always reported a ranking for it that translates into one-fifth as much eyeball traffic as it actually gets, according to our logs. Our Google scraper is credited with only one-tenth of its actual traffic. The reason is simple: searchers use a Google scraper to protect their privacy, and therefore are not likely to install the Alexa toolbar and let Alexa see everything they do. At the other extreme, this domain shows ten times more traffic in Alexa than it actually gets. (Marketers are attracted to this ranking tool, and more of them have the Alexa toolbar installed.)

Alexa's numbers are best for domains that rank under 10,000. They're useful up to about 100,000, but mostly to see a time-line for a single domain, rather than comparing one domain against another (graphs going back two years are available at Alexa). With ranks over 100,000 the sample is so unreliable that Alexa often won't show graphs. Finally, any site that requires a subdomain cannot get useful statistics from Alexa, since the traffic ranks are typically based on the domain as a whole.

Now that we've qualified everything to death, have fun with our web ranking tool!