Fact-checking Canwest’s Don Martin

Yesterday in the National Post, Don Martin wrote a column condemning the Conservatives’ “definition” ads on Michael Ignatieff, trying to find scandal where there is none.

The Conservative Web site attacking the new Liberal leader is www.Ignatieff.me. Here endeth the federal party’s free publicity.

The .me Internet domain name is registered to the tiny European country of Montenegro, incidentally governed by a coalition, and its Web administrator is based in Arizona. It is, I’m told by experienced Web surfers, often used to showcase pornography.

Columnists and (more unfortunately) reporters often use terms such as “critics say”, “experts agree”, “some suggest” in place of “in my opinion”, “I think”, and “according to me”. Perhaps “I’m told by experienced Web surfers” could be “the Liberal party told me that” but “questions surround” Don’s true source on top level domains and pornography.

I decided to get to the bottom of this by checking Google for the number of results that come up when one searches for the term “porn”. If Mr. Martin is correct, we’ll see the Montenegro top level domain (TLD) populated with porn results, but I wanted to look at the numbers to be sure.

I checked 251 top level domains: those of the 180 or so countries recognized at the UN, the semi-autonomous regions, and the other TLDs such as .com, .org and .net.

Don’t worry, I didn’t do these searches one at a time, one browser-based Google search after another; I wrote a script that used the Google AJAX API to get the results and crunch them.

For a quick tutorial on Google searches, a search for

site:.fr fromage

will return all sites with the keyword “fromage” in the French top level domain.

Using the Google AJAX API, I decided to check every country code for the number of results with the keyword “porn”.

http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=site:.me%20porn
where “%20” is the URL encoded character for a space
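For anyone who wants to build that request URL programmatically, here is a minimal Python sketch. It is not the PHP script mentioned in this post; the function name `build_search_url` is made up for illustration, and the endpoint is simply the one shown above. It only constructs the URL and does not send any request.

```python
from urllib.parse import quote

# Endpoint copied from the example URL above.
BASE_URL = "http://ajax.googleapis.com/ajax/services/search/web"


def build_search_url(tld, keyword):
    """Build the query URL for `site:.<tld> <keyword>` (hypothetical helper)."""
    query = "site:.{} {}".format(tld, keyword)
    # quote() turns the space into %20; ':' is kept readable via `safe`.
    return "{}?v=1.0&q={}".format(BASE_URL, quote(query, safe=":"))


url_me = build_search_url("me", "porn")
url_fr = build_search_url("fr", "fromage")
```

Here `url_me` is the same string as the example URL above, and `url_fr` is the equivalent of the “fromage” search restricted to the French top level domain.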

I wrote a PHP script that checks each TLD for “porn” and then parses the results to extract the number of search results from the output of the Google AJAX API call.
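The parsing step might look roughly like the Python sketch below. This is an illustration, not the PHP source: it assumes the API returns JSON with the estimated number of hits at `responseData.cursor.estimatedResultCount`, which is how I recall the response being laid out, so treat those field names as an assumption. The sample response and its number are invented purely to exercise the function.

```python
import json


def extract_result_count(response_text):
    """Return the estimated hit count from one API response (0 if absent).

    Assumes the count lives at responseData.cursor.estimatedResultCount;
    that layout is an assumption, not something stated in the post.
    """
    payload = json.loads(response_text)
    data = payload.get("responseData") or {}
    cursor = data.get("cursor") or {}
    return int(cursor.get("estimatedResultCount", 0))


# Invented sample response; the number is a placeholder, not real data.
sample_response = json.dumps({
    "responseData": {
        "results": [],
        "cursor": {"estimatedResultCount": "12345"},
    },
    "responseStatus": 200,
})

sample_count = extract_result_count(sample_response)
```

Running this once per TLD and storing each count against its TLD gives the table that the rankings are built from.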

Here are the results (you’ll want to click each graph to enlarge it).

If you click on the image to enlarge it, you’ll see that the most populated TLD for the word “porn” (as indexed by Google) is .com. Montenegro ranks 61st for propensity of porn. In fact, you’d be more likely to find porn on the .cx domain for… yes, Christmas Island.

If we want to measure the proportion of porn sites in a TLD, .me is 15th, behind the “porn-showcasing” countries of the Bahamas, Norfolk Island, Azerbaijan, Grenada, Zimbabwe, Malawi, Western Samoa, Bhutan, Congo, Togo, Tokelau, Georgia, Seychelles Islands, and the Virgin Islands. Just over 1% of .com domain names contain the keyword “porn”. Montenegro clocks in at just half a percent more.

So, I think we can put Don Martin’s irresponsible words to rest. In terms of “showcased porn”, you’re much more likely to see it on German, Polish and Russian servers. Further, the term “showcase” is deceiving, since most domains are open to registration by all local residents (if not all global residents) and there is generally no stipulation that a registrant must feature pornography.
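The difference between ranking by raw count and ranking by proportion can be sketched in a few lines of Python. This assumes the proportion is the “porn” result count divided by the total result count for the same TLD; the post does not spell out the denominator, so that is a guess. The TLD labels and numbers below are placeholders, not the real results.

```python
def rank_by_count(keyword_counts):
    """TLDs ordered from most to fewest keyword results."""
    return sorted(keyword_counts, key=keyword_counts.get, reverse=True)


def rank_by_proportion(keyword_counts, total_counts):
    """TLDs ordered by keyword results as a share of all results."""
    shares = {
        tld: keyword_counts[tld] / total_counts[tld]
        for tld in keyword_counts
        if total_counts.get(tld)  # skip TLDs with no (or zero) total
    }
    return sorted(shares, key=shares.get, reverse=True)


# Placeholder labels and numbers, for illustration only.
keyword_counts = {"aa": 9000, "bb": 300, "cc": 50}
total_counts = {"aa": 1000000, "bb": 10000, "cc": 1000}

count_ranking = rank_by_count(keyword_counts)
proportion_ranking = rank_by_proportion(keyword_counts, total_counts)
```

With these placeholder figures the two orderings are exact opposites, which is why a huge TLD like .com can top the raw count while tiny TLDs lead the proportional ranking.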

But where did Don Martin’s assumption come from? Given the Liberal Party’s embarrassment and lack of foresight in launching a campaign at onprobation.ca when onprobation.com turned out to be a hardcore pornographic website, perhaps the Liberals were floating some bogus talking points over to Martin to get them published as fact.

Now that we’ve resolved the Montenegro issue, and now that we have a great program that sorts TLDs by search terms, let’s make sure that we’re still #1 for what matters (even though all Canadian teams have been eliminated).

Take that, Sweden.

If you’re interested in the PHP program, here’s the source code.