Why Compete Sucks

by cat on September 24, 2008


I know it sounds linkbaity, but I couldn’t come up with a title for this post that better reflects my assessment of Compete. On the surface, Compete seems like a good idea — they have a five-year daily clickstream history for 2 million consumers, and 80 million page views per day. Compete’s tagline claims that it “helps you benefit from click-sharing by providing free services that create a more trusted, transparent, and valuable Internet.” However, when digging into the methodologies behind these services, I found Compete to be anything but transparent, and of questionable value. (OK, so it’s a huge improvement over Alexa, but that’s a whole other rant.)

Compete has developed four primary products: Site Analytics, Search Analytics, Referral Analytics, and Ranked Lists. 

  • Site Analytics measures a homegrown metric called Attention, which is defined as the total time spent on a domain as a percentage of the total time spent online by all U.S. Internet users. Site Analytics also provides a metric called People, which it uses interchangeably with unique visitors. Beware of this People metric: it’s based on Compete’s “consumer-based community” rather than on technical measurement methods such as log files or page tags.
  • Search Analytics purports to rank a site’s keywords as a percentage of the site’s total search traffic, plus something called Average Time Index. When I plugged my own sites into the Compete interface and compared the results against the search metrics reported by my web analytics tool for the same timeframe, the numbers weren’t even close. The top keyword identified by Compete was actually my third most popular, and the share of my site’s search traffic that Compete reported for that keyword was more than double what my analytics package reported. The Average Time Index ranks the average time per visit on a site on a scale from 1 to 100, so it doesn’t represent actual time measurements.
  • Referral Analytics claims to provide information on where any given website gets its traffic. It ranks referrers for the current month and previous month, and labels the month-to-month difference as “change in share.” While Compete correctly identified my site’s top referrer, it underreported the site traffic that referrer provided by 30% compared with my web analytics tool.
  • Ranked Lists provide Compete’s top 200, 1,000, 15,000, 100,000, or 500,000 sites based on various metrics — unique visitors, visits, page views, time spent, and Attention (there’s that homegrown metric again). I checked Compete’s UV numbers against Google Trends for Websites, and in general the Compete numbers are a tiny fraction (about 5%) of what Google reports. While neither source is inherently more trustworthy, Compete’s methodologies (described below) and the issues with all of its other metrics lead me to trust Google’s numbers a teensy bit more. 
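
To make the arithmetic behind these comparisons concrete, here is a minimal Python sketch of the Attention definition and of the percent difference between a Compete figure and a reference figure from a web analytics tool. The function names and all of the numbers are made up for illustration; this is not Compete's code, only the definitions above written out.

```python
def attention_share(seconds_on_domain: float, total_seconds_online: float) -> float:
    """Time spent on one domain as a percentage of all time spent online."""
    if total_seconds_online <= 0:
        raise ValueError("total_seconds_online must be positive")
    return 100.0 * seconds_on_domain / total_seconds_online


def relative_difference(estimate: float, reference: float) -> float:
    """Percent by which an estimate deviates from a reference value.

    Negative results mean the estimate underreports the reference.
    """
    if reference == 0:
        raise ValueError("reference must be non-zero")
    return 100.0 * (estimate - reference) / reference


# Hypothetical inputs, chosen only to show the calculations.
attention = attention_share(seconds_on_domain=50, total_seconds_online=1000)
referrer_gap = relative_difference(estimate=700, reference=1000)  # underreported
keyword_gap = relative_difference(estimate=24, reference=12)      # overreported

print(attention, referrer_gap, keyword_gap)
```

With these inputs, a referrer credited with 700 visits against 1,000 in the analytics tool is a 30% underreport, and a keyword share of 24% against 12% is a 100% overreport.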

How Compete Works
Compete’s claim that “we use rigorous statistics” and their assertion that Compete metrics are often quoted in the mainstream press are not enough of an explanation for me, so I did some digging, and here’s what I uncovered: In a nutshell, Compete triangulates between ISP data, homegrown Internet usage estimates, opt-in panels, surveys, Compete toolbar users, and various indices. 

Compete receives consumer data from US-only ISPs and ASPs, who remain unidentified due to contractual obligations. Compete also recruits consumers to join its member community and use its toolbar through targeted opt-in email lists, search advertising, co-registration programs and online promotions. These consumers currently represent less than 10% of the overall panel, but Compete will not disclose the exact number. Compete clickstream data is collected through a toolbar application downloaded by the consumer. Data collected via the Compete toolbar is then merged with participating ISP and ASP clickstream data. Compete claims to employ a “rigorous panel balancing and projection methodology,” but does not specify what this methodology is. To spice up this secret sauce, traditional industry data is thrown in from sources such as Monitor Plus, R. L. Polk, the Mortgage Bankers Association, and Ward’s. Company-specific data from clients may also be incorporated.

Compete also conducts a monthly random-digit-dial (RDD) survey to track the U.S. Internet population. The results of the RDD survey are used to establish universe estimates for household income, age of head of household, and gender of head of household, as well as ISP and location of Internet usage. Weights are adjusted each month to align the sample with the universe estimates produced by the RDD survey. Other weighted inputs are injected from syndicated data providers and mysterious proprietary sources. In addition, Compete has the ability to serve behaviorally-targeted surveys to the 2 million consumers in its database.
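
Since Compete does not publish its weighting method, the following is only a textbook post-stratification sketch in Python, showing what "aligning the sample with universe estimates" can look like in its simplest form. The function name, the single gender dimension, and all of the numbers are assumptions made for illustration, not Compete's actual procedure.

```python
from collections import Counter


def poststratification_weights(panel_strata, universe_shares):
    """Return one weight per stratum so the weighted panel matches the universe.

    panel_strata: list with one stratum label per panelist (hypothetical input).
    universe_shares: dict mapping stratum label -> share of the target population.
    """
    counts = Counter(panel_strata)
    panel_size = len(panel_strata)
    weights = {}
    for stratum, target_share in universe_shares.items():
        if counts[stratum] == 0:
            # A stratum with no panelists cannot be weighted into existence.
            raise ValueError(f"no panelists in stratum {stratum!r}")
        panel_share = counts[stratum] / panel_size
        weights[stratum] = target_share / panel_share
    return weights


# Made-up panel: 70 male-headed and 30 female-headed households.
panel = ["male"] * 70 + ["female"] * 30
# Made-up universe estimate, standing in for an RDD survey result.
universe = {"male": 0.5, "female": 0.5}

weights = poststratification_weights(panel, universe)
print(weights)
```

Each over-represented panelist is counted as less than one person and each under-represented panelist as more than one, so the weighted panel adds back up to the universe proportions. Real panel balancing typically adjusts several dimensions at once, which is exactly the part that would need to be disclosed to evaluate it.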

Issues With Compete’s Methodology

  1. Compete says its data balancing and projection process de-biases the sample along critical behavioral and demographic dimensions, across their clickstream data sources, and that it provides the basis to project results from their database to the complete U.S. Internet population. However, Compete will not disclose the statistical processes that are being used to de-bias the sample.
  2. Compete will not reveal the participating ISPs and ASPs, so we have no way of knowing the composition of the sample before weighting.
  3. Even after weighting, population segments that are excluded by the sample design (such as customers of non-participating ISPs and ASPs) can never be represented in the results.
  4. Compete will not disclose the composition of the sample by region (many ISPs have regional concentrations) or connection speed/method (dial-up, cable, DSL), so it’s impossible to evaluate how the sample is distributed.
  5. The consumers directly recruited to Compete’s panel are self-selected, and their specific origin is not known. Inherent bias, anyone?
  6. The cutoff value for their confidence interval is about 25k UVs/month – in other words, accuracy drops significantly for smaller sites. Compete states explicitly that they are most confident in their data on the top 1 million domains.
  7. The Compete toolbar only works with IE6+ and Firefox, so users of browsers like Safari and Chrome are not represented except through ISP data. Granted, Chrome users currently represent a tiny portion of the population, but that’s not the case for Safari.
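
The third issue in the list above, that weighting cannot recover segments the panel never observes, can be illustrated with a small Python sketch. All of the visit counts here are invented, and the two groups are hypothetical stand-ins for users on participating and non-participating ISPs.

```python
def weighted_mean(values, weights):
    """Weighted average of values; weights must sum to a positive number."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical monthly visits per user to some site.
covered = [1, 2, 3]   # users the panel can observe
excluded = [8, 10]    # users on non-participating ISPs, never observed

# The figure we would like to estimate: the mean over everyone.
true_mean = weighted_mean(covered + excluded, [1] * 5)

# Three different weighting schemes applied to the observable users only.
panel_estimates = [
    weighted_mean(covered, w) for w in ([1, 1, 1], [1, 2, 5], [0, 0, 1])
]

print(true_mean, panel_estimates)
```

However the observed users are weighted, the estimate can never exceed the largest value in the covered group, so if the excluded users behave differently, the panel figure stays biased.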

One thing I will give Compete is that their site offers numerous disclaimers about how their numbers are not to be compared with logfile or cookie-based web analytics or other measurements. Of course they aren’t: Compete numbers are US-only, and the sample is relatively small (around 2 million). Overall, I have difficulty believing Compete metrics are truly representative of the Internet as a whole. Ultimately, relying on numbers I know to be inaccurate, having tested them against my own analytics, would be a career-limiting move (CLM) at best. The only way I can see Compete data being helpful is as a trend measured against itself, and even then, I would use it only directionally and not as a basis for making business decisions.
