What's wrong with Google's numbers?

Tags: google, search, numbers, rant, what the chuff

I noticed recently that Google reports a massive number of results, even for searches that you’d think are fairly uncommon. I also noticed that the number of results seems to change a lot within the same search. Are they up to something? Are these numbers made up? Is there a difference between a result of the search and a result that’s shown to the user?

The first search

For the first test, I used something very similar to a real search I made today. I entered the term SharePoint InfoPath “form not visible” (including the quotation marks) – I had created a custom form for a SharePoint list using InfoPath, but the page was showing nothing at all for certain users. Now, remember: we’re not here to judge how badly crafted my search terms may or may not be!

The first page of results popped up and, unsurprisingly, boasted an enormous number of results – “about” 60,700 here.

Search1a

But, as I started going through the results page by page, I noticed there were only 9 pages of results. Clicking on the link to page 9 actually took me to page 3. At the bottom of this page was the message below, saying that Google was not showing results “very similar” to the 28 already displayed.

Search1b

28 is a considerably smaller number than “about” 60,700, no matter what school you went to. Perhaps these omitted results account for the remaining 60,672 results? Let’s see – I clicked on the link to show the omitted results, and got this:

Search1c

Not only did it make up for the missing 60,672 results, it added another 52,300 results. There should be a lot more than 9 (or 3) pages this time, then. Wrong – only 12.

Search1d

And, what’s more (or less, in this case), there are now only 118 results.

Search1e

To recap, we started off with “about” 60,700 results, which turned out to be 28, then re-ran the search with omitted results to get 113,000 results, which turned out to be 118.

The second search

Just to look into this a bit more, I’ve done two more searches for this post. For the second search, I entered "IIS 7.5" "server error in application" PHP. The first page of results told me there were about 103,000 results.

With similar results omitted, page 22 turned out to be the final page, ending at 211 results. After re-running the search with the omitted results included, the estimate stayed at 103,000. The final page this time was page 57, and it still claimed about 103,000 results. However, 57 * 10 (results per page) is 570 which, again, is nowhere near 103,000.

The third search

For the final search I entered "iPhone 4S" "battery life" "when using GPS" to match a variety of sentences. The first page showed there were 50,900 results. On page 10, I reached the end of the results – 98 in total, apparently.

I re-ran the search with omitted results included, and this time there were an extra 100 results, bringing the total up to 51,000. Page 13 was the end of the results this time, showing just 130 results. That leaves about 50,900 results unaccounted for.
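For anyone who wants to check my sums, here is a rough Python sketch that tallies the three searches. The figures are the ones quoted in this post; the labels and function names are just my own for illustration, and the 570 for the second search assumes 10 results on every one of the 57 pages, so it is an upper bound.

```python
# Figures quoted in this post: (label, Google's "about" estimate, results actually reachable)
searches = [
    ("SharePoint/InfoPath, similar omitted", 60_700, 28),
    ("SharePoint/InfoPath, omitted included", 113_000, 118),
    ("IIS 7.5, similar omitted", 103_000, 211),
    ("IIS 7.5, omitted included", 103_000, 570),  # assumed: 57 pages * 10 per page
    ("iPhone 4S, similar omitted", 50_900, 98),
    ("iPhone 4S, omitted included", 51_000, 130),
]


def unaccounted(estimated, shown):
    """Results Google claims to have but never lets you page through."""
    return estimated - shown


def shown_percent(estimated, shown):
    """Share of the estimate that is actually reachable, as a percentage."""
    return 100 * shown / estimated


for label, estimated, shown in searches:
    print(
        f"{label}: {shown} of about {estimated:,} shown "
        f"({shown_percent(estimated, shown):.2f}%), "
        f"{unaccounted(estimated, shown):,} unaccounted for"
    )
```

In every case the reachable results come to well under 1% of the number on the first page.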

So, is there something wrong with Google’s calculator?

I know some results have links to “show more results from xyz.com”, but, using the first example above (113,000 claimed against the 118 actually shown), this would mean that each result, on average, would need to have almost 1000 extra results from the same site for this to add up – not likely.
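The "almost 1000" figure is simple division, sketched below in Python. The function name is my own invention, and the inputs are the numbers from the first search (both with and without the omitted results).

```python
def extra_per_result(estimated, shown):
    """Average number of hidden same-site results each shown result would need
    to stand in for, if "more results from this site" explained the whole gap."""
    return (estimated - shown) / shown


# First search with omitted results included: 113,000 claimed, 118 shown
print(round(extra_per_result(113_000, 118)))  # 957

# First search with similar results omitted: 60,700 claimed, 28 shown
print(round(extra_per_result(60_700, 28)))  # 2167
```

So the version with omitted results included is the generous case; going by the original 28 results, each one would need over 2000 siblings.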

So what is with these massive differences in numbers? My only theory is that a “result” is just a match to a word in the search terms, but the ones Google shows you are only the “relevant” results. For example, this blog post should match "iPhone 4S" "battery life" "when using GPS" but it’s completely irrelevant to someone wondering why their iPhone 4S has battery life problems when using GPS (I’d be more worried about it burning up, but I digress).

Yes, they use the word “about” before every number, but I don’t go round telling people I’m “about 30,000 years old.”
