Lots of people make up data for lots of reasons. Maybe they’re guessing at numbers to make a point. Perhaps they’re estimating numbers based on personal experience or work expertise. Often they’re simply exaggerating for comic effect.
Whatever the reason someone fabricates data, most of the time they’ll admit the data’s not true — if you know how to read their claims.
For instance, YouTuber Chef Tyler made a 10-layer grilled cheese sandwich. He then cut that enormous sandwich into slices of “bread” and used them to make another 10-layer grilled cheese, and then cut that sandwich into slices and used those to make one final grilled cheese sandwich. That’s a lot of cheese. (Seriously, go watch the video.) But when Tyler claimed that the slices he used for that final grilled cheese were “basically 80% cheese,” it was obvious he had made that number up.
How can you tell when someone’s fabricating data? Here are three common clues:
They use qualifiers like “basically” or “at least.” When someone prefaces a number with “basically,” they’re either admitting they don’t know or trying to be funny. Chef Tyler was trying to be funny. (It seems to work for him: He’s got millions of followers. And probably coronary disease.) But when the commissioner of Major League Baseball said steroid use in his sport was “basically over” and “virtually nonexistent,” he made it clear he didn’t know for sure how many players were doping.
Likewise, the words “at least” should raise red flags. James Beard Award-winning chef Michael Solomonov was obviously joking when he said “My body is at least 33 percent kebab.” (What is it with chefs and fake data?) But when the phrase is used in a serious context, like “at least 1000 birds died” or “quiet quitters make up at least 50% of the workforce,” it means the writer simply doesn’t have a reliable number to report.
They quote data ranges. A data range is just a numerical way of saying “basically” or “at least” — and lacks credibility for the same reasons.
So when someone says “20 to 30% of office stock is obsolete” or “pricing is down 30 to 40%” (or both in the same interview) you can be pretty sure they’re guessing. And when someone says “It’s estimated 60% to 90% of non-fiction books are ghostwritten,” they’re either guessing or they’re repeating guesses from two different sources. (Though as my old editor Josh Bernoff pointed out, this writer didn’t bother to list sources.)
They cite round numbers. Your antennae should also perk up when you see round numbers. Because while some data just works out to, say, 40% or 75%, it’s somewhat rare: Only about one in seven of the whole numbers from 0 to 100 either end in zero or break the data into quartiles or thirds. It’s no coincidence that all the fake numbers quoted above are round: 80%, 33%, 1000, 50%, 20-30%, 30-40%, 60-90%.
Whenever someone cites a round number, you should pay more attention to their data’s source and accuracy. And if they don’t list a source? That number is very likely an exaggeration or an estimate, not an actual data point.
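The one-in-seven figure for round numbers is easy to check. Here’s a quick Python sketch, assuming “round” means any whole number from 0 to 100 that ends in zero, plus the quartiles (25, 50, 75) and the thirds rounded to 33 and 67. Those rounding choices are an assumption for illustration; pick 66 instead of 67 and the count stays the same.

```python
# Count the "round" whole-number percentages from 0 to 100.
# Assumption: thirds are rounded to 33 and 67; quartiles are 25, 50 and 75.
ends_in_zero = {n for n in range(101) if n % 10 == 0}
quartiles = {25, 50, 75}
thirds = {33, 67}

# Union the sets so that overlaps (like 50) are only counted once.
round_numbers = ends_in_zero | quartiles | thirds
share = len(round_numbers) / 101  # 101 whole numbers from 0 to 100 inclusive

print(sorted(round_numbers))
print(f"{len(round_numbers)} of 101 values, about 1 in {1 / share:.1f}")
# prints: 15 of 101 values, about 1 in 6.7
```

So under these assumptions, 15 of the 101 possible values count as round, which is a little more than one in seven.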
If you encounter any one of these tricks, think twice before believing the numbers or repeating them yourself. And if someone uses several of these tricks at once, or if you spot other problems like the lack of a credible source, you can be pretty sure they’re making stuff up.