I’ve been reading a ton of articles with commentators’ takes on whether a merger between Sprint and T-Mobile will be good or bad for consumers. Almost everything I’ve read has taken a strong position one way or the other. I don’t think I’ve seen a single article that expressed substantial uncertainty about whether a merger would be good or bad.
It could be that everyone is hugely biased on both sides of the argument. Or maybe the deal is so bad that only incredibly biased people would consider making an argument that the merger will be good for consumers. I’m not sure.
I like to look at how markets handle situations I’m uncertain about. In the last few years, I’ve regularly seen liberal politicians and liberal news agencies arguing that we’re about to see the end of Trump’s presidency because of some supposedly impeachable action that just came to light. I’m not Trump’s biggest fan, but I’ve found a lot of arguments about how he’s about to be impeached too far-fetched. I have a habit of going to the political betting market PredictIt when I see new arguments of this sort. PredictIt has markets on lots of topics, including whether or not Trump will be impeached.
Politicians and newspapers have an incentive to say things that will generate attention. A lot of the time, doing what gets attention is at odds with saying what’s true. People putting money in markets have incentives that are better aligned with truth.
Most of the time I’ve seen articles about Trump’s impending impeachment, political betting markets haven’t moved much. On the rare occasions when markets moved significantly, I’ve had a good indication that something major actually happened.
Wall Street investors have a strong incentive to understand how the merger will actually affect network operators’ success. Unsurprisingly, T-Mobile’s stock increased substantially when key information indicating likely approval of a merger came out. Sprint’s stock also increased in value.
What’s much weirder is that neither Verizon’s stock nor AT&T’s stock seemed to take a negative hit on the days when important information about the merger’s likelihood came out. In fact, it actually looks like the stocks may have increased slightly in value.
You could tell complicated stories to explain why a merger could be good for competing companies’ stock prices and also good for consumers. I think the simpler story is much more plausible: Wall Street is betting the merger will be bad for consumers.
Maybe none of this should be surprising. There were other honest signals earlier on in the approval process. As far as I can tell, neither Verizon nor AT&T seriously resisted the merger:
Disclosure: At the time of writing, I have financial relationships with a bunch of telecommunications companies, including all of the major U.S. network operators except T-Mobile.
A lot of numbers have been thrown out about how many jobs will be created or destroyed if the proposed merger between Sprint and T-Mobile goes through. Here are a few examples I found on the New T-Mobile website:
When dealing with complex systems full of feedback loops, nothing happens in isolation. This fact is often overlooked. I’ll occasionally see an article where a nutrition journalist explains that there are 3,500 calories in a pound. Using some simple math, they’ll explain that since a bag of chips has 200 calories, falling into the temptation to eat a bag of chips every day will lead a person to gain about ten pounds over six months. This is of course silly. By the same logic, eating a bag of chips each day could lead a person to gain 800 pounds over 40 years.
People can’t eat a bag of chips without affecting anything else about their body or their behavior. The body is a complex system. There are lots of feedback loops that regulate things like hunger and energy expenditure.
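To make the naive arithmetic concrete, here's a minimal sketch of the journalist's model. It assumes every extra calorie turns into body weight with no feedback at all, which is exactly the assumption I'm criticizing. The function name and the 182-day figure for six months are my own choices for illustration.

```python
CALORIES_PER_POUND = 3500  # rule of thumb quoted by the journalist
CHIPS_CALORIES = 200       # one bag of chips per day


def naive_weight_gain(days):
    """Pounds gained if extra calories became weight with no feedback loops."""
    return CHIPS_CALORIES * days / CALORIES_PER_POUND


six_months = naive_weight_gain(182)        # roughly six months of daily chips
forty_years = naive_weight_gain(365 * 40)  # same habit kept up for 40 years

print(round(six_months, 1))  # about ten pounds
print(round(forty_years))    # over 800 pounds
```

The model is linear in time, so it happily extrapolates to absurd numbers. A model with feedback loops would level off instead.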
The economy works in a similar manner. Jobs cannot be created out of thin air. Jobs can’t be eliminated in isolation. If the New T-Mobile hires 10,000 people, it’s not reducing the number of unemployed people by 10,000. The emphasis on job creation is especially strange right now. The U.S. unemployment rate is currently lower than it’s been at any other point in my life.
Since then, I’ve noticed more forms of bogus website endorsements. For example, Comodo Group’s trusted site seals:
A Comodo SSL trust seal indicates that the website owner has made customer security a top priority by securely encrypting all their transactions. This helps build confidence in the site and increases customer conversion rates…For a site seal to be effective, customers have to have confidence in the ‘endorsement brands’ that are on your site. If visitors are to trust you, they must trust the companies behind the logos on your site…Comodo is now the world’s largest SSL certificate authority and over 80 million PC’s and mobile devices are protected using Comodo desktop security solutions. That adds up to a lot of online visitors trusting you because they trust us.
You can get these seals for free here. You don’t even have to verify that you’re using any kind of security! I indicated that I have a UCC SSL certificate. I don’t have one of those, but look at the cool seal I got!
SiteLock also offers cool security seals. They look like this:
That’s just an image for illustrative purposes. It’s not a real, verified seal. Getting an actual seal costs money and involves verification. The verification component is interesting. If SiteLock realizes a site is not safe for visitors, will the seal make that clear?
If a scan fails site visitors will not be alerted to any problem. The SiteLock Trust Seal will simply continue to display the date of the last good scan of the website. If the site owner fails to rectify the problem SiteLock will remove the seal from the site and replace it with a single pixel transparent image within a few days. At no point will SiteLock display any indication to visitors that a website has failed a scan.
All this got me thinking. What if I offered free, honest endorsement seals?
This idea had an obvious flaw: a total lack of credibility or credentials on my part. I decided it was time I got myself some credentials. I went to the Universal Life Church (ULC) website and began the arduous process of becoming an ordained minister. After painstakingly entering my personal details and clicking the “Get Ordained Instantly” button, I had my first credential:
A few days later, I had physical proof:
A lot of people have been ordained by the ULC. To make sure people could know I’m really trustworthy, I went ahead and got a few less common credentials:
After acquiring my credentials, I spent an intense eight minutes creating a professional endorsement seal:
Warning: This post is a rant and involves some foul language. Enjoy!
Tons of research suggests that people engage in deception and self-deception all the damn time. People are biased. People respond to the incentives they face. Everyone who has ever interacted with another human being knows these things.
Despite this, pretty much every website offering reviews makes claims of objectivity and independence. These websites don’t claim that they try to minimize bias. They claim to actually be unbiased.
Let’s take TopTenReviews, a site I criticized in a previous post. TopTenReviews says things like:
To be clear, these methods of monetization in no way affect the rankings of the products, services or companies we review. Period.
Bullshit. Total bullshit.
I’ve ranted enough in the past about run-of-the-mill websites offering bogus evaluations. What about the websites that have reasonably good reputations?
NerdWallet publishes reviews and recommendations related to financial services.
Looking through NerdWallet’s website, I find this (emphasis mine):
The guidance we offer, info we provide, and tools we create are objective, independent, and straightforward. So how do we make money? In some cases, we receive compensation when someone clicks to apply, or gets approved for a financial product through our site. However, this in no way affects our recommendations or advice. We’re on your side, even if it means we don’t make a cent.
NerdWallet meets Vanguard
Stock brokerages are one of the types of services that NerdWallet evaluates.
One of the most orthodox pieces of financial advice—with widespread support from financial advisors, economists, and the like—is that typical individuals who invest in stocks shouldn’t actively pick and trade individual stocks. This position is often expressed with advice like: “Buy and hold low-cost index funds from Vanguard.”
Vanguard has optimized for keeping fees low and giving its clients a rate of return very close to the market’s rate of return. Since Vanguard keeps costs low, it cannot pay NerdWallet the kind of referral commissions that high-fee investment platforms offer.
What happens when NerdWallet evaluates brokers? Vanguard gets 3 out of 5 stars. It’s the worst rating for a broker I’ve seen on the site.
NerdWallet slams Vanguard for not offering the sort of stuff Vanguard’s target audience doesn’t want. Vanguard gets the worst-possible ratings in the “Promotions” and “Trading platform” categories. Why? Vanguard doesn’t offer those things.
Imagine a friend who went to a nice restaurant and came back complaining that her steak didn’t come with cake frosting. NerdWallet is doing something similar.
The following excerpt is found on NerdWallet’s Vanguard review under the heading, “Is Vanguard right for you?” (emphasis mine):
Ask yourself this question: Are you part of Vanguard’s target audience of retirement investors with a relatively high account balance? If so, you’ll likely find no better home. You really can’t beat the company’s robust array of low-cost funds.
Investors who fall outside of that audience — those who can’t meet the fund minimums or want to regularly trade stocks — should look for a broker that better caters to those needs.
This is silly. Vanguard’s minimum is $1,000. You shouldn’t buy stocks if you have less than $1,000 to put into stocks! If you invest in stocks, you shouldn’t regularly trade individual stocks!
From my perspective, NerdWallet is saying that if (a) you are the typical kind of person who should be buying stocks and (b) you don’t use a stupid strategy, then “you really can’t beat the company’s [Vanguard’s] robust array of low-cost funds.”
So there we have it. Despite the lousy review, NerdWallet correctly recognizes that Vanguard is awesome.
NerdWallet didn’t really lie, but it’s biased.
To be clear, I’m being hard on NerdWallet. NerdWallet does a good job aggregating information about financial services and offers decent financial advice in some areas. The evaluation methodology I’m criticizing may not have been maliciously engineered. NerdWallet may have stumbled into the current methodology. Still, there’s a big problem. Since NerdWallet’s current methodology is good for the company’s bottom line, NerdWallet has a strong incentive not to correct the obvious issues.
Sometimes evaluators aim to create divisions between editorial content (e.g., review writing) and revenue generation. I think divisions of this sort are a good idea, but they are not magic bullets.
WireCutter is one of my favorite review sites, but it makes the mistake of overemphasizing how much divisions can do to reduce bias:
We pride ourselves on following rigorous journalistic standards and ethics, and we maintain editorial independence from our business operations. Our recommendations are always made entirely by our editorial team without input from our revenue team, and our writers and editors are never made aware of any business relationships.
I believe WireCutter takes actions to encourage editorial independence. However, I’m skeptical of how the commitment to editorial integrity is described. Absent extreme precautions, people talk. Information flows between coworkers. Even if editors aren’t explicitly informed about financial arrangements, it’s easy for editors to make educated guesses.
Bias is sneaky
Running Coverage Critic, I face all sorts of decisions unrelated to accuracy or honesty where bias still has the potential to creep in. For example, in what order should cell phone plans I recommend be displayed? Alphabetically? Randomly? One of those options will be better for my bottom line than the other.
I don’t have perfect introspective access to what happens in my head. A minute ago, I scratched my nose. I can’t precisely explain exactly how or why I chose to do that. It just happened. Similarly, I don’t always know when and how biases affect my decisions.
I have conflicts of interest. Companies I recommend sometimes pay me commissions. You can take a look at the arrangements here.
I’ve tried to align my incentives with consumers by building my brand around commitments to transparency and rigor. I didn’t make these commitments for purely altruistic reasons. If the branding strategy succeeds, I stand to benefit a lot.
Even with my branding strategy, my alignment with consumers will never be perfect. I’ll still be biased. If you ever think I could be doing better, please let me know.
Forbes argues that most college rankings (e.g., U.S. News) fail to focus on what “students care about most.” Forbes’ rankings are based on what it calls “outputs” (e.g., salaries after graduation) rather than “inputs” (e.g., acceptance rates or SAT scores of admitted applicants).
Colleges are ranked based on weighted scores in five categories, illustrated in this infographic from Forbes:
This methodology requires drawing on data to create scores for each category. That doesn’t mean the methodology is good (or unbiased).
Some students are masochists who care almost exclusively about academics. Others barely care about academics and are more interested in the social experiences they’ll have.
Trying to collapse all aspects of the college experience into a single metric is silly—as is the case for most other products, services, and experiences. If I created a rubric to rank foods based on a weighted average of tastiness, nutritional value, and cost, most people would rightfully ignore the results of my evaluation. Sometimes people want salad. Sometimes they want ice cream.
To be clear, my point isn’t that Forbes’ list is totally useless—just that it’s practically useless. My food rubric would come out giving salads a better score than rotten steak. That’s the correct conclusion, but it’s an obvious one. No one needed my help to figure that out. Ranking systems are only useful if they can help people make good decisions when they’re uncertain about their options.
Where do the weights for each category even come from? Forbes doesn’t explain.
Choices like what weights to use are sometimes called researcher degrees of freedom. The choice of what set of weights to use is important to the final results, but an alternative set of reasonable weights could have been used.
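Here's a small sketch of why the choice of weights matters. The schools, category names, scores, and both sets of weights are made up for illustration. They are not Forbes' actual categories or numbers, which I don't have access to.

```python
# Hypothetical category scores (0-100) for two made-up schools.
scores = {
    "School A": {"salary": 90, "debt": 60, "graduation": 70},
    "School B": {"salary": 70, "debt": 85, "graduation": 75},
}


def rank(weights):
    """Return school names ordered by weighted total score, best first."""
    totals = {
        name: sum(weights[cat] * cats[cat] for cat in weights)
        for name, cats in scores.items()
    }
    return sorted(totals, key=totals.get, reverse=True)


# Two sets of weights that both look reasonable on their face.
weights_1 = {"salary": 0.6, "debt": 0.2, "graduation": 0.2}
weights_2 = {"salary": 0.2, "debt": 0.5, "graduation": 0.3}

print(rank(weights_1))  # School A comes out on top
print(rank(weights_2))  # School B comes out on top
```

Nothing about the underlying data changed between the two runs. Only the evaluator's choice of weights did, and that was enough to flip the ranking.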
Creating scores for each category introduces additional researcher degrees of freedom into Forbes’ analysis. Should 4-year or 6-year graduation rate be used? What data sources should be drawn on? Should debt be assessed based on raw debt sizes or loan default rates? None of these questions have clear-cut answers.
Additional issues show up in the methods used to create category-level scores.
A college ranking method could assess any one of many possible questions. For example:
How impressive is the typical student who attends a given school?
How valuable will a given school be for the typical student who attends?
How valuable will a school be for a given student if she attends?
It’s important which question is being answered. Depending on the question, selection bias may become an issue. Kids who go to Harvard would probably end up as smart high-achievers even if they went to a different school. If you’re trying to figure out how much attending Harvard benefits students, it’s important to account for students’ aptitudes before entering. Initial aptitudes will be less important if you’re only trying to assess how prestigious Harvard is.
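A toy example can show how selection bias muddles these questions. Everything below is hypothetical: the aptitude numbers, the two schools, and the assumption that outcomes are simply incoming aptitude plus whatever the school adds.

```python
from statistics import mean

# Hypothetical incoming aptitudes (arbitrary units) for two made-up schools.
selective_intake = [90, 92, 95, 97]
open_intake = [60, 65, 70, 75]

# Assumed value added by each school. In this toy setup the selective
# school adds nothing, while the open-admission school adds 5 points.
VALUE_ADDED = {"selective": 0, "open": 5}


def outcomes(intake, school):
    """Outcome for each student: incoming aptitude plus the school's value added."""
    return [aptitude + VALUE_ADDED[school] for aptitude in intake]


selective_out = mean(outcomes(selective_intake, "selective"))
open_out = mean(outcomes(open_intake, "open"))

# Comparing raw outcomes rewards the school with the stronger intake.
print(selective_out > open_out)

# Subtracting incoming aptitude recovers what each school actually added.
selective_gain = selective_out - mean(selective_intake)
open_gain = open_out - mean(open_intake)
print(selective_gain, open_gain)
```

A ranking built on raw outcomes would put the selective school first even though, in this sketch, it contributes nothing. Which school "wins" depends entirely on which question the ranking is trying to answer.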
Forbes’ methodological choices suggest it doesn’t have a clear sense of what question its rankings are intended to answer.
Alumni salaries get 20% of the overall weight. This suggests that Forbes is measuring something like the prestige of graduates (rather than the value added from attending a school).
Forbes also places a lot of weight on the number of impressive awards received by graduates and faculty members. This again suggests that Forbes is measuring prestige rather than value added.
When coming up with scores for the debt category, Forbes considers default rates and the average level of federal student debt for each student. This suggests Forbes is assessing how a given school affects the typical student that chooses to attend that school. Selection bias is introduced. The typical level of student debt is not just a function of a college’s price and financial aid. It also matters how wealthy students who attend are. Colleges that attract students with rich families will tend to do well in this category.
Forbes switches to assessing something else in the graduation rates category. Graduation rates for Pell Grant recipients receive extra weight. Forbes explains:
Pell grants go to economically disadvantaged students, and we believe schools deserve credit for supporting these students.
Forbes doubles down on its initial error. First, Forbes makes the mistake of aggregating a lot of different aspects of college life into a single metric. Next, Forbes makes a similar mistake by mashing together several different purposes college rankings could serve.
Many evaluators using scoring systems with multiple categories handle the aggregation from category scores to overall scores poorly. Forbes’ methodology web page doesn’t explain how Forbes handled this process, so I reached out asking if it would be possible to see the math behind the rankings. Forbes responded telling me that although most of the raw data is public, the exact process used to churn out the rankings is proprietary. Bummer.
Why does Forbes produce such a useless list? It might be that Forbes or its audience doesn’t recognize how silly the list is. However, I think a more sinister explanation is plausible. Forbes has a web page where schools can request to license a logo showing the Forbes endorsement. I’ve blogged before about how third-party evaluation can involve conflicts of interest and lead to situations where everything under the sun gets an endorsement from at least one evaluator. Is it possible that Forbes publishes a list using an atypical methodology because that list will lead to licensing agreements with schools that don’t get good ratings from better-known evaluators?
I reached out to the licensing contact at Forbes with a few questions. One was whether any details could be shared about the typical financial arrangement between Forbes and colleges licensing the endorsement logo. My first email received a response, but the question about financial arrangements was not addressed. My follow-up email did not get a response.
While most students probably don’t care about how many Nobel Prizes graduates have won, measures of prestige work as pretty good proxies for one another. Schools with lots of prize-winning graduates probably have smart faculty and high-earning graduates. Accordingly, it’s possible to come up with a reasonable, rough ranking of colleges based on prestige.
While Forbes correctly recognizes that students care about things other than prestige, it fails to provide a useful resource about the non-prestige aspects of colleges.
The old College Prowler website did what Forbes couldn’t. On that site, students rated different aspects of schools. Each school had a “report card” displaying its rating in diverse categories like “academics,” “safety,” and “girls.” You could even dive into sub-categories. There were separate scores for how hot guys at a school were and how creative they were.
Forbes’ college rankings were the first college rankings I looked into in depth. While writing this post, I realized that rankings published by U.S. News & World Report and Wall Street Journal/Times Higher Education both use weighted scoring systems and have a lot of the same methodological issues.
To its credit, Forbes is less obnoxious and heavy-handed than U.S. News. In the materials I’ve seen, Forbes doesn’t make unreasonable claims about being unbiased or exclusively data-driven. This is in sharp contrast to U.S. News & World Report. Here’s an excerpt from the U.S. News website under the heading “How the Methodology Works:”
Hard objective data alone determine each school’s rank. We do not tour residence halls, chat with recruiters or conduct unscientific student polls for use in our computations.
The rankings formula uses exclusively statistical quantitative and qualitative measures that education experts have proposed as reliable indicators of academic quality. To calculate the overall rank for each school within each category, up to 16 metrics of academic excellence below are assigned weights that reflect U.S. News’ researched judgment about how much they matter.
As a general rule, I suggest running like hell anytime someone says they’re objective because they rely on data.
U.S. News’ dogmatic insistence that there’s a clear dichotomy separating useful data from unscientific, subjective data is misguided. The excerpt also contradicts itself. “Hard objective data alone” do not determine the schools’ ranks. Like Forbes, U.S. News uses category weights. Weights “reflect U.S. News’ researched judgment about how much they matter.” Researched judgments are absolutely not hard data.
It’s good to be skeptical of third-party evaluations that are based on evaluators’ whims or opinions. Caution is especially important when those opinions come from an evaluator who is not an expert about the products or services being considered. However, skepticism should still be exercised when evaluation methodologies are data-heavy and math-intensive.
Coming up with scoring systems that look rigorous is easy. Designing good scoring systems is hard.
TopTenReviews ranks products and services in a huge number of industries. Stock trading platforms, home appliances, audio editing software, and hot tubs are all covered.
TopTenReviews’ parent company, Purch, describes TopTenReviews as a service that offers, “Expert reviews and comparisons.”
Many of TopTenReviews’ evaluations open with lines like this:
We spent over 60 hours researching dozens of cell phone service providers to find the best ones.
I’ve seen numbers between 40 and 80 hours in a handful of articles. It takes a hell of a lot more time to understand an industry at an expert level.
I’m unimpressed by TopTenReviews’ rankings in industries I’m knowledgeable about. This is especially frustrating since TopTenReviews often ranks well in Google.
A particularly bad example: indoor bike trainers. These devices can turn regular bikes into stationary bikes that can be ridden indoors.
I love biking and used to ride indoor trainers a fair amount. I suspect the editor who came up with the trainer rankings at TopTenReviews couldn’t say the same.
The following paragraph is found under the heading “How we tested” on the page for bike trainers:
We’ve researched and evaluated the best roller, magnetic, fluid, wind and direct-drivebike [sic] trainers for the past two years and found the features that make the best ride for your indoor training. Our reviewers dug into manufacturers’ websites and engineering documents, asked questions of expert riders on cycling forums, and evaluated the pros and cons of features on the various models we chose for our product lineup. From there, we compared and evaluated the top models of each style to reach our conclusions. 
There’s no mention of using physical products.
The top overall trainer is the Kinetic Road Machine. It’s expensive but probably a good recommendation. I know lots of people with either that model or similar models who really like their trainers.
However, I don’t trust TopTenReviews’ credibility. TopTenReviews has a list of pros and cons for the Kinetic Road Machine. One con is: “Not designed to handle 700c wheels.” It is.
It’s a big error. 700c is an incredibly common wheel size for road bikes. I’d bet the majority of people using trainers have 700c wheels. If the trainer wasn’t compatible with 700c wheels, it wouldn’t deserve the “best overall” designation.
TopTenReviews even states, “The trainer’s frame fits 22-inch to 29-inch bike wheels.” 700c wheels fall within that range. A bike expert would know that.
TopTenReviews’ website has concerning statements about its approach and methodology. An excerpt from their about page (emphasis mine):
Our tests gather data on features, ease of use, durability and the level of customer support provided by the manufacturer. Using a proprietary weighted system (i.e., a complicated algorithm), the data is scored and the rankings laid out, and we award the three top-ranked products with our Gold, Silver and Bronze Awards.
Maybe TopTenReviews came up with an awesome algorithm no one else has thought of. I find it much more plausible that—if a single algorithm exists—the algorithm is private because it’s silly and easy to find flaws in.
TopTenReviews receives compensation from many of the companies it recommends. While this is a serious conflict of interest, it doesn’t mean all of TopTenReviews’ work is bullshit. However, I see this line on the about page as a red flag:
Methods of monetization in no way affect the rankings of the products, services or companies we review. Period.
Avoiding bias is difficult. Totally eliminating it is almost always unrealistic.
Employees doing evaluations will sometimes have a sense of how lucrative it will be for certain products to receive top recommendations. These employees would probably be correct to bet that they’ll sometimes be indirectly rewarded for creating content that’s good for the company’s bottom line.
Even if the company is being careful, bias can creep in insidiously. Someone has to decide what the company’s priorities will be. Even if reviewers don’t do anything dishonest, the company strategy will probably entail doing evaluations in industries where high-paying affiliate programs are common.
Reviews will need occasional updates. Won’t updates in industries where the updates could shift high-commission products to higher rankings take priority?
TopTenReviews has a page on foam mattresses that can be ordered online. I’ve bought two extremely cheap Zinus mattresses on Amazon. I’ve recommended these mattresses to a bunch of people. They’re super popular on Amazon. TopTenReviews doesn’t list Zinus.
R-Tools Technology Inc. has a great article discussing their software’s position in TopTenReviews’ rankings, misleading information communicated by TopTenReviews, and conflicts of interest.
The article suggests that TopTenReviews may have declined in quality over the years:
In 2013, changes started to happen. The two principals that had made TopTenReviews a household name moved on to other endeavors at precisely the same time. Jerry Ropelato became CEO of WhiteClouds, a startup in the 3D printing industry. That same year, Stan Bassett moved on to Alliance Health Networks. Then, in 2014, the parent company of TopTenReviews rebranded itself from TechMediaNetwork to Purch.
Purch has quite a different business model than TopTenReviews did when it first started. Purch, which boasted revenues of $100 million in 2014, has been steadily acquiring numerous review sites over the years, including TopTenReviews, Tom’s Guide, Tom’s Hardware, Laptop magazine, HowtoGeek, MobileNations, Anandtech, WonderHowTo and many, many more.
I don’t think I would have loved the pre-2013 website, but I think I’d have more respect for it than today’s version of TopTenReviews.
I’m not surprised TopTenReviews can’t cover hundreds of product types and consistently provide good information. I wish Google didn’t let it rank so well.
Sturgeon’s law: Ninety percent of everything is crap.
Rankings & reviews online
The internet is full of websites that ostensibly rank, rate, and/or review companies within a given industry. Most of these websites are crappy. Generally, these ranking websites cover industries where affiliate programs offering website owners large commissions are common.
Here are a few examples of industries and product categories where useless review websites are especially common:
Web hosting services
Online fax services
If you Google a query along the lines of “Best [item from the list above]” you’ll likely receive a page of search results with a number of “top 10 list” type sites. At the top of your search results you will probably see ads like these:
Lack of in-depth evaluation methodologies
Generally, these “review” sites don’t go into any kind of depth to assess companies. As far as I can tell, rankings tend to be driven primarily by a combination of randomness and the size of commissions offered.
Admittedly, it’s silly to think that the evaluation websites found via Google’s ads would be reliable. Unfortunately, the regular (non-ad) search results often include a lot of garbage “review” websites. From the same query above:
Most of these websites don’t offer evaluation methodologies that deserve to be taken seriously.
Even the somewhat reputable names on the list (i.e., CNET & PCMag) don’t offer a whole lot. Neither CNET nor PCMag clearly explains its methodology, and the written content doesn’t lead me to believe either entity went in depth to evaluate the services considered.
If consumers easily recognized these bogus evaluation websites for what they are, the websites would just be annoyances. Unfortunately, it looks like a substantial portion of consumers don’t realize these websites lack legitimacy.
Google offers a tool that presents prices that “advertisers have historically paid for a keyword’s top of page bid.” According to this tool, advertisers are frequently paying several dollars per click on the kind of queries that return ads for bogus evaluation websites:
We should expect that advertisers will only be willing to pay for ads when the expected revenue per click is greater than the cost per click. The significant costs paid per click suggest that a non-trivial portion of visitors to bogus ranking websites end up purchasing from one of the suggested companies.
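The break-even reasoning can be written out as a quick calculation. The dollar figures below are hypothetical. I don't know what any particular site actually pays per click or earns per referred sale.

```python
def break_even_conversion_rate(cost_per_click, commission_per_sale):
    """Fraction of ad clicks that must end in a commissioned sale to cover the ad cost."""
    return cost_per_click / commission_per_sale


# Hypothetical numbers: $5 per click, $100 commission per referred sale.
rate = break_even_conversion_rate(5.00, 100.00)
print(f"{rate:.0%} of clicks need to convert just to break even")
```

Under those assumed numbers, one in twenty people who click the ad would need to buy through the site before the advertiser earns back what it spent. The higher the cost per click relative to the commission, the larger the share of visitors who must be following the site's recommendations.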
How biased are evaluation websites found via Google?
Let’s turn to another industry. The VPN industry shares a lot of features with the web hosting industry. Both VPN and web hosting services tend to be sold online with reasonably low prices and recurring billing cycles. Affiliate programs are very common in both industries. There’s an awesome third-party website, ThatOnePrivacySite.net, that assesses VPN services and refuses to accept commissions. ThatOnePrivacySite has reviewed over thirty VPN services. At the time of writing, only one, Mullvad, has received a “TOPG Choice” award, indicating an excellent review. Mullvad is interesting in that it has no affiliate program, an unusual characteristic for a VPN company.
Mullvad’s lack of an affiliate program allowed me to perform a little experiment. I Googled the query “Best VPN service”. I received 16 results, 7 of which were ads. 15 of the results provided rankings of VPN services (one result was an ad for a specific VPN company).
Three of the ranking websites listed Mullvad:
3 out of the 9 non-ad ranking websites isn’t great, but it’s not terrible either. None of the 6 ranking websites that I received ads for listed Mullvad.
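For anyone who wants to check the tallies from my search, here's the bookkeeping spelled out. The counts come straight from the results described above. The variable names are just mine.

```python
total_results = 16
ad_results = 7
single_company_ads = 1  # an ad for one specific VPN company, not a ranking site

# Ranking websites, split by whether they showed up as an ad or a regular result.
ad_ranking_sites = ad_results - single_company_ads
organic_ranking_sites = total_results - ad_results
ranking_sites = ad_ranking_sites + organic_ranking_sites

# How many ranking sites in each group listed Mullvad.
organic_listing_mullvad = 3
ad_listing_mullvad = 0

organic_share = organic_listing_mullvad / organic_ranking_sites
ad_share = ad_listing_mullvad / ad_ranking_sites

print(ranking_sites, organic_ranking_sites, ad_ranking_sites)
print(f"regular results listing Mullvad: {organic_share:.0%}")
print(f"ad results listing Mullvad: {ad_share:.0%}")
```

It's a tiny sample from a single query, so I wouldn't read much into the exact percentages. The gap between the two groups is the interesting part.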