Burton Malkiel, author of the famous investment guide A Random Walk Down Wall Street, devised a method to test the theory that movements in stock prices are essentially random and unpredictable. He set an initial $50 price for a made-up stock and had each of his students flip a coin every day. If the coin came up heads, the fictitious stock price went up half a point; if it landed on tails, the price went down by the same amount. Malkiel had the students keep charts of their “stock prices” over the course of his class.
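Malkiel’s exercise is easy to reproduce. The minimal Python sketch below (parameters mirror the ones described above; the function name is my own) generates one such coin-flip price chart:

```python
import random

def simulate_chart(days=100, start=50.0, step=0.5, seed=1):
    """Simulate Malkiel's classroom exercise: a fair coin moves a
    fictitious stock price up or down half a point each day."""
    rng = random.Random(seed)
    prices = [start]
    for _ in range(days):
        move = step if rng.random() < 0.5 else -step
        prices.append(prices[-1] + move)
    return prices

prices = simulate_chart()
# Plotting this series often reveals convincing-looking "trends",
# "support levels" and "breakouts" -- all of it pure noise.
```

Running the simulation with different seeds produces charts that look strikingly like real price histories, which is exactly the point of Malkiel’s demonstration.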
Near the end of the term, Malkiel took these time series of fake stock prices to chartists for analysis. Chartists, practitioners of technical analysis, claimed that they could use the past history of a stock’s price movements to determine where its price was headed. These chartists gave Malkiel confident opinions that ran the gamut from “Buy right now!” to “Sell right now!” and everything in between. Interestingly, not one of the chartists Malkiel consulted realized that the graphics, generated by a pure-noise process of coin flips, were random noise devoid of any meaningful pattern. They all assumed that a legible system must have been at work.
Malkiel suggested that psychology was the most likely culprit in explaining why humans need to read patterns into random data. People struggle to cope with a lack of order and therefore often see patterns in data where there are none.
In this way, visualizing data becomes a tricky proposition. At its best, a data visualization can isolate and make clear true trends in the underlying data. Unfortunately, it can also lead to a mirage of understanding. Random processes typically do not appear random, especially to the naked eye. The unfortunate byproduct of increasingly easy access to visualized information is an explosion of analytical dilettantism. Individuals’ beliefs about the stories they “see” in graphical data can lead to wildly incorrect assessments about complex, random processes.
An example of this fallacy, as Malkiel’s unfortunate chartists eventually discovered, is called spurious correlation. It occurs when two variables that are causally unrelated nonetheless show a high correlation and appear to move in lockstep on a graph. Tyler Vigen runs a website dedicated to visualizing the most ridiculous examples of such data. For instance, there is no logical connection between the amount of money the U.S. spends on science, space and technology initiatives and the total number of suicides by hanging, strangulation and suffocation. Yet when Vigen charted both values over the 10-year period from 1999 to 2009, the two series tracked each other closely, with a correlation coefficient of 99.8 percent for good measure. The visualization certainly makes it seem as if there should be some underlying reason why these two metrics are so closely related.
Vigen’s website (and book) demonstrates that spurious correlation can arise whenever random processes are compared with each other. The ease with which anyone can now create data visualizations, and the ever-expanding range of options available for data presentation, can lead researchers to cherry-pick the most appealing charts to make their cases.
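The phenomenon Vigen exploits is easy to generate from scratch. In the sketch below (an illustrative simulation using only the Python standard library; thresholds and lengths are arbitrary choices of mine), pairs of completely independent random walks routinely produce large correlation coefficients:

```python
import random

def random_walk(n, rng):
    """Cumulative sum of independent Gaussian steps."""
    x, out = 0.0, []
    for _ in range(n):
        x += rng.gauss(0, 1)
        out.append(x)
    return out

def pearson(a, b):
    """Pearson correlation coefficient, computed from scratch."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = sum((x - ma) ** 2 for x in a) ** 0.5
    sb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (sa * sb)

rng = random.Random(0)
# Count how many of 1,000 pairs of INDEPENDENT random walks
# show a correlation stronger than |r| = 0.7 purely by chance.
strong = sum(
    abs(pearson(random_walk(50, rng), random_walk(50, rng))) > 0.7
    for _ in range(1000)
)
```

A nontrivial share of pairs clears the 0.7 bar despite having no relationship whatsoever, which is why trending series plotted against each other should always arouse suspicion.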
The Whole Picture
The incentive, then, if a researcher’s goal is to convince people that her conclusion is correct, is to only present the most convincing graphical evidence to the public. It is impossible for observers to see what visuals were not included. If all relevant visualizations were published, however, a compelling chart could be benchmarked against all of the different graphical choices, scales and transformations that it went through prior to its final form. Doing so would allow the public to evaluate any claims with a hermeneutic of healthy skepticism.
Such a framework wouldn’t be totally out of the blue, either. Most researchers understand and are rightly wary of these “researcher degrees of freedom” and consequently apply a skeptical eye toward statistical estimates from mathematical models. Often, though, they fail to treat visualizations with this same skepticism. Even more damning is that data visualizations are generally presented to a much wider and less questioning audience than formal statistical results are. Beautiful graphics that tell compelling stories typically do not face harsh pushback, especially by lay audiences.
Not Enough Information
Another issue arising from visualized data is the information that figures leave out. Abraham Wald was a famous mathematician in the early 20th century. During World War II, Wald worked with Columbia University’s Statistical Research Group, a team of researchers assigned to work on questions that arose during warfare. They were tasked with solving a unique problem faced by air crews during bombing raids. The military wanted to reduce the number of bombers shot down. To accomplish this, they floated the idea of beefing up protection on the areas of the plane that showed the most damage upon return.
Wald realized that what mattered was not where the returning planes had been hit, but where they had not. He reasoned that every part of a bomber was roughly equally likely to take a hit during aerial combat. This insight led him to conclude that the military should do the exact opposite and allocate more armor to the parts of returning bombers that showed no damage. The reason was that the bombers that never made it back had likely taken hits in exactly the spots that showed no damage on the returning planes. Thankfully, Wald used a deep understanding of statistical distributions to correctly identify what the visual evidence on the returning planes was indicating. But for Wald’s insight, the visual evidence would have led the military to over-armor the least vulnerable parts of the aircraft, wasting resources and likely costing the lives of more Allied aircrews.
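Wald’s logic can be illustrated with a toy simulation. In the sketch below, the aircraft sections and their survival probabilities are entirely hypothetical, chosen only to make the effect visible: hits land uniformly across the airframe, yet the tally of hits on returning planes systematically understates damage to the deadliest location.

```python
import random

# Hypothetical model: hits land uniformly over four sections, but a
# hit to the engine downs the bomber far more often than other hits.
SECTIONS = ["engine", "cockpit", "fuselage", "wings"]
DOWN_PROB = {"engine": 0.8, "cockpit": 0.5, "fuselage": 0.1, "wings": 0.1}

def fly_mission(rng):
    """One sortie: 1-4 uniformly placed hits; survival requires
    surviving every individual hit."""
    hits = [rng.choice(SECTIONS) for _ in range(rng.randint(1, 4))]
    survived = all(rng.random() > DOWN_PROB[h] for h in hits)
    return hits, survived

rng = random.Random(42)
returned = {s: 0 for s in SECTIONS}
for _ in range(10_000):
    hits, survived = fly_mission(rng)
    if survived:            # we only ever observe the survivors
        for h in hits:
            returned[h] += 1

# Returning planes show the FEWEST engine hits precisely because
# engine hits are the deadliest -- Wald's survivorship insight.
```

Even though engine hits are dealt out as often as wing hits, the surviving sample makes the engine look like the safest place on the plane, which is exactly the inference the military was about to draw.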
Forty years after World War II ended, aerial tragedy did strike, and it was largely due to data left off of a visualization. In 1986, the space shuttle Challenger exploded 73 seconds after lift-off. All seven crew members were killed in the blast. The cause of the explosion was that an O-ring, a seal between segments of one of the shuttle’s solid rocket boosters, failed to expand enough to properly seal its joint. The subsequent hole in the motor caused a fireball that blew the vehicle apart.
After the disaster, an investigation found that, in the lead-up to the launch, a graphic had plotted the number of distressed O-rings per previous flight against the ambient air temperature on the date of that flight, but only for flights that came back with at least one distressed O-ring. The pattern on the graphic showed no visual evidence that air temperature had any relationship with O-rings experiencing distress during launch. By charting only the flights with at least one distressed O-ring, however, the graphic left out the majority of flights, which showed no signs of O-ring distress at all. Every one of those flights occurred on days with temperatures in a range from 65 to 81 degrees Fahrenheit. The flight exhibiting the most O-ring-distress occurrences happened during a morning with an outside temperature of 51 degrees. The ambient air temperature on the morning of the explosion was 31 degrees. As the commission investigating the explosion found, the data missing from the graphic fostered a belief that the association between air temperature and the probability of O-ring failure was small to non-existent. This, in turn, led to the disastrous call to go ahead with the launch on an unseasonably cold morning, a call that cost seven people their lives.
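The effect of that omission can be reproduced with a small sketch. The numbers below are illustrative stand-ins, not the actual flight records, but they show how dropping the zero-incident flights weakens the apparent temperature relationship:

```python
# Illustrative (NOT actual) records: (temperature in deg F, distressed O-rings).
flights = [
    (53, 3), (57, 1), (63, 1), (70, 1), (75, 2),   # flights with distress
    (66, 0), (67, 0), (68, 0), (70, 0), (72, 0),   # incident-free flights,
    (73, 0), (76, 0), (79, 0), (81, 0),            # all on warm days
]

def pearson(pairs):
    """Pearson correlation of (x, y) pairs, computed from scratch."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    sx = sum((x - mx) ** 2 for x, _ in pairs) ** 0.5
    sy = sum((y - my) ** 2 for _, y in pairs) ** 0.5
    return cov / (sx * sy)

full = pearson(flights)                               # clear negative trend
filtered = pearson([f for f in flights if f[1] > 0])  # trend largely fades
```

With every flight included, the cold-weather association is unmistakable; restrict the chart to flights that had at least one incident, as the pre-launch graphic did, and the same data look close to patternless.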
How, then, in a world increasingly dominated by data visualizations, many of them updated in real time, can a person avoid drawing mistaken or unjustified conclusions from these graphics? Amanda Makulec offered ten suggestions for doing so in a March 2020 article and followed up on those recommendations at the end of last year. Her pointers are framed around information concerning COVID-19, but several of them go beyond pandemic-related data and provide the best means of guarding against being misled by visuals.
The first rule is the most imperative: Do more to understand the data than simply downloading it and throwing it into a dashboard or graphic. Failure to account for the intricacies of data collection, including what the numbers actually count, how they were reported, who collected them and other considerations, can mislead just as easily as illuminate. If you don’t know your data, your audience almost certainly won’t either.
Makulec’s second tip is to avoid overaggregation and meaningless or spurious comparisons. This is especially true of rate calculations, owing to the underappreciated difficulty of identifying the correct base groups for comparison. Makulec argues that data can be highly misleading if the base groups for comparing metrics are different or not readily identifiable. The problem is widespread across disciplines, affecting how countries count COVID cases, how different municipalities compile information on crime and even how different companies compute the proportion of employees who have left over the past year.
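The base-group problem is, at bottom, a question of denominators. A tiny, entirely hypothetical example: the same raw count of departures yields noticeably different “turnover rates” depending on which headcount a workplace divides by.

```python
# Hypothetical firm: 20 departures in a year at a company that grew
# from 100 to 150 employees. Which denominator defines "turnover"?
departures = 20
start_headcount = 100   # headcount on January 1
avg_headcount = 125     # average headcount across the year

rate_vs_start = departures / start_headcount   # 20% turnover
rate_vs_avg = departures / avg_headcount       # 16% turnover
```

Two companies reporting these two figures side by side would look meaningfully different even with identical workforces, which is exactly the kind of apples-to-oranges comparison Makulec warns against.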
Another important pointer Makulec offers is that visualizations should be honest about what is not being represented. It’s crucial, then, to visualize uncertainty in addition to the known data. This is especially important when presenting model estimates in graphical format and bleeds into another of Makulec’s tips: Be clear about what the data are, what model estimates are and where the uncertainty lies.
Don’t Be Deceived by Data Visualization
At its best, data visualization is a powerful tool for identifying patterns in noisy data. Graphical evidence can also be an excellent pedagogical device for illustrating important aspects of what collected information can tell us. When mocked up hastily, however, with little thought given to the blind spots in how humans perceive and can be misled by graphics, data visuals can be just as fraught as traditional statistics, if not more so. By presenting viewers with a façade of certainty (“a picture is worth a thousand words,” as the cliché goes), misleading data visualization can lead to spurious conclusions or a failure to recognize the real risks underlying critical decisions. Asking questions of the visuals and the numbers behind them is just as crucial a form of skeptical inquiry as at any other level of statistical analysis.