
Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy

by Cathy O'Neil

In "Weapons of Math Destruction," Cathy O'Neil critiques the pervasive use of big data and algorithmic models that, despite their mathematical precision, perpetuate inequality and threaten democratic principles. Central to her argument is the idea that these models are not neutral; they reflect the biases and ideologies of their creators, often reinforcing systemic injustices. O'Neil highlights that data processes are inherently retrospective, encoding past prejudices rather than predicting fair futures, and that human values must be explicitly integrated into algorithmic design to prioritize fairness over profit. She illustrates how these models, termed "WMDs," tend to penalize marginalized groups while benefiting the affluent, creating a feedback loop of disadvantage. For instance, algorithms used in criminal justice or credit scoring often rely on proxies that unfairly incriminate individuals based on socio-economic factors rather than their actions. O'Neil also discusses the implications of opaque algorithms that make unjust decisions without accountability. Ultimately, she calls for a more equitable approach to data-driven decision-making, urging society to harness the power of data to support rather than punish those in need. The book serves as a warning against blind faith in mathematical models, advocating for moral responsibility in the development and application of algorithms to ensure justice and equality in a data-driven world.


Key Insights & Memorable Quotes

Below are the most popular and impactful highlights and quotes from Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy:

Big Data processes codify the past. They do not invent the future. Doing that requires moral imagination, and that’s something only humans can provide. We have to explicitly embed better values into our algorithms, creating Big Data models that follow our ethical lead. Sometimes that will mean putting fairness ahead of profit.
Here we see that models, despite their reputation for impartiality, reflect goals and ideology. When I removed the possibility of eating Pop-Tarts at every meal, I was imposing my ideology on the meals model. It’s something we do without a second thought. Our own values and desires influence our choices, from the data we choose to collect to the questions we ask. Models are opinions embedded in mathematics.
At the federal level, this problem could be greatly alleviated by abolishing the Electoral College system. It's the winner-take-all mathematics from state to state that delivers so much power to a relative handful of voters. It's as if in politics, as in economics, we have a privileged 1 percent. And the money from the financial 1 percent underwrites the microtargeting to secure the votes of the political 1 percent. Without the Electoral College, by contrast, every vote would be worth exactly the same. That would be a step toward democracy.
In a system in which cheating is the norm, following the rules amounts to a handicap.
Racism, at the individual level, can be seen as a predictive model whirring away in billions of human minds around the world. It is built from faulty, incomplete, or generalized data. Whether it comes from experience or hearsay, the data indicates that certain types of people have behaved badly. That generates a binary prediction that all people of that race will behave that same way. Needless to say, racists don’t spend a lot of time hunting down reliable data to train their twisted models. And once their model morphs into a belief, it becomes hardwired. It generates poisonous assumptions, yet rarely tests them, settling instead for data that seems to confirm and fortify them. Consequently, racism is the most slovenly of predictive models. It is powered by haphazard data gathering and spurious correlations, reinforced by institutional inequities, and polluted by confirmation bias.
However, when you create a model from proxies, it is far simpler for people to game it. This is because proxies are easier to manipulate than the complicated reality they represent.
Just imagine if police enforced their zero-tolerance strategy in finance. They would arrest people for even the slightest infraction, whether it was chiseling investors on 401ks, providing misleading guidance, or committing petty frauds. Perhaps SWAT teams would descend on Greenwich, Connecticut. They’d go undercover in the taverns around Chicago’s Mercantile Exchange.
The math-powered applications powering the data economy were based on choices made by fallible human beings. Some of these choices were no doubt made with the best intentions. Nevertheless, many of these models encoded human prejudice, misunderstanding, and bias into the software systems that increasingly managed our lives. Like gods, these mathematical models were opaque, their workings invisible to all but the highest priests in their domain: mathematicians and computer scientists. Their verdicts, even when wrong or harmful, were beyond dispute or appeal. And they tended to punish the poor and the oppressed in our society, while making the rich richer.
The human victims of WMDs, we’ll see time and again, are held to a far higher standard of evidence than the algorithms themselves.
I was forced to confront the ugly truth: people had deliberately wielded formulas to impress rather than clarify.
The government regulates them, or chooses not to, approves or blocks their mergers and acquisitions, and sets their tax policies (often turning a blind eye to the billions parked in offshore tax havens). This is why tech companies, like the rest of corporate America, inundate Washington with lobbyists and quietly pour hundreds of millions of dollars in contributions into the political system. Now they’re gaining the wherewithal to fine-tune our political behavior—and with it the shape of American government—just by tweaking their algorithms.
Simpson’s Paradox: when a whole body of data displays one trend, yet when broken into subgroups, the opposite trend comes into view for each of those subgroups.
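To see the paradox in numbers, here is a minimal Python sketch using illustrative counts (not data from the book): treatment A wins inside each subgroup, yet the pooled totals favor treatment B.

```python
# A minimal numeric sketch of Simpson's Paradox. The success counts
# below are hypothetical, chosen only so that every subgroup favors
# treatment A while the aggregate favors treatment B.

def rate(successes, total):
    return successes / total

# Subgroup 1 (easier cases): A wins, 93% vs 87%
a1, b1 = (81, 87), (234, 270)
# Subgroup 2 (harder cases): A wins again, 73% vs 69%
a2, b2 = (192, 263), (55, 80)

for name, (s, n) in [("A, subgroup 1", a1), ("B, subgroup 1", b1),
                     ("A, subgroup 2", a2), ("B, subgroup 2", b2)]:
    print(f"{name}: {rate(s, n):.0%}")

# Pooled across subgroups, the trend reverses: B appears better.
print(f"A overall: {rate(a1[0] + a2[0], a1[1] + a2[1]):.0%}")  # 78%
print(f"B overall: {rate(b1[0] + b2[0], b1[1] + b2[1]):.0%}")  # 83%
```

The reversal happens because the subgroups are unevenly sized: most of treatment A's cases fall in the harder subgroup, dragging its overall rate down even though it does better within each group.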
If you look at this development from the perspective of a university president, it’s actually quite sad. Most of these people no doubt cherished their own college experience—that’s part of what motivated them to climb the academic ladder. Yet here they were at the summit of their careers dedicating enormous energy toward boosting performance in fifteen areas defined by a group of journalists at a second-tier newsmagazine. They were almost like students again, angling for good grades from a taskmaster. In fact, they were trapped by a rigid model, a WMD.
To create a model, then, we make choices about what’s important enough to include, simplifying the world into a toy version that can be easily understood and from which we can infer important facts and actions. We expect it to handle only one job and accept that it will occasionally act like a clueless machine, one with enormous blind spots.
Thanks in part to the resulting high score on the evaluation, he gets a longer sentence, locking him away for more years in a prison where he’s surrounded by fellow criminals—which raises the likelihood that he’ll return to prison. He is finally released into the same poor neighborhood, this time with a criminal record, which makes it that much harder to find a job. If he commits another crime, the recidivism model can claim another success. But in fact the model itself contributes to a toxic cycle and helps to sustain it. That’s a signature quality of a WMD.
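A deliberately toy Python sketch of that feedback loop (every number and function here is hypothetical, not from the book) shows how the score, the sentence, and the re-offense risk can ratchet each other upward:

```python
# Hypothetical sketch of the toxic cycle described above: a high risk
# score lengthens the sentence, the longer sentence raises the odds of
# re-offending, and a re-offense "confirms" the score.

def sentence_years(risk_score):
    # Assumption: sentence length grows with the questionnaire's score.
    return 1 + round(4 * risk_score)

def reoffense_prob(years_inside):
    # Assumption: each year inside raises the chance of returning.
    return min(0.9, 0.25 + 0.1 * years_inside)

risk = 0.3  # initial score, driven by proxies like neighborhood and family
for cycle in range(4):
    years = sentence_years(risk)
    p = reoffense_prob(years)
    print(f"cycle {cycle}: risk={risk:.2f}  sentence={years}y  P(reoffend)={p:.2f}")
    # If he does re-offend, the model counts it as a success, and the
    # next score rises, even though the model helped cause the outcome.
    risk = min(1.0, risk + 0.2)
```

Each pass through the loop raises both the score and the probability the score claims to predict, which is exactly why the model can keep declaring itself accurate while sustaining the cycle.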
My point is that police make choices about where they direct their attention. Today they focus almost exclusively on the poor. That’s their heritage, and their mission, as they understand it.
[A] crucial part of justice is equality, and that means, among other things, experiencing criminal justice equally. People who favor policies like Stop and Frisk should experience it themselves. Justice cannot just be something that one part of society inflicts upon the other.
This is a point I’ll be returning to in future chapters: we’ve seen time and again that mathematical models can sift through data to locate people who are likely to face great challenges, whether from crime, poverty, or education. It’s up to society whether to use that intelligence to reject and punish them—or to reach out to them with the resources they need. We can use the scale and efficiency that make WMDs so pernicious in order to help people. It all depends on the objective we choose.
This is unjust. The questionnaire includes circumstances of a criminal’s birth and upbringing, including his or her family, neighborhood, and friends. These details should not be relevant to a criminal case or to the sentencing. Indeed, if a prosecutor attempted to tar a defendant by mentioning his brother’s criminal record or the high crime rate in his neighborhood, a decent defense attorney would roar, “Objection, Your Honor!” And a serious judge would sustain it. This is the basis of our legal system. We are judged by what we do, not by who we are. And although we don’t know the exact weights that are attached to these parts of the test, any weight above zero is unreasonable.
Baseball also has statistical rigor. Its gurus have an immense data set at hand, almost all of it directly related to the performance of players in the game. Moreover, their data is highly relevant to the outcomes they are trying to predict. This may sound obvious, but as we’ll see throughout this book, the folks building WMDs routinely lack data for the behaviors they’re most interested in. So they substitute stand-in data, or proxies. They draw statistical correlations between a person’s zip code or language patterns and her potential to pay back a loan or handle a job. These correlations are discriminatory, and some of them are illegal.
Opaque and invisible models are the rule, and clear ones very much the exception. We’re modeled as shoppers and couch potatoes, as patients and loan applicants, and very little of this do we see—even in applications we happily sign up for. Even when such models behave themselves, opacity can lead to a feeling of unfairness.
[…] qualifies as a WMD. The people putting it together in the 1990s no doubt saw it as a tool to bring evenhandedness and efficiency to the criminal justice system. It could also help nonthreatening criminals land lighter sentences. This would translate into more years of freedom for them and enormous savings for American taxpayers, who are footing a $70 billion annual prison bill. However, because the questionnaire judges the prisoner by details that would not be admissible in court, it is unfair. While many may benefit from it, it leads to suffering for others. A key component of this suffering is the pernicious feedback loop. As we’ve seen, sentencing models that profile a person by his or her circumstances help to create the environment that justifies their assumptions. This destructive loop goes round and round, and in the process the model becomes more and more unfair. The third question is whether a model has the capacity to grow exponentially. As a statistician would put it, can it scale? This might sound like the nerdy quibble of a mathematician. But scale is what turns WMDs from local nuisances into tsunami forces, ones that define and […]
Once companies amass troves of data on employees’ health, what will stop them from developing health scores and wielding them to sift through job candidates? Much of the proxy data collected, whether step counts or sleeping patterns, is not protected by law, so it would theoretically be perfectly legal. And it would make sense. As we’ve seen, they routinely reject applicants on the basis of credit scores and personality tests. Health scores represent a natural—and frightening—next step.
If we’re going to be equal before the law, or be treated equally as voters, we cannot stand for systems that drop us into different castes and treat us differently.
And in Florida, adults with clean driving records and poor credit scores paid an average of $1,552 more than the same drivers with excellent credit and a drunk driving conviction.
This underscores another common feature of WMDs. They tend to punish the poor. This is, in part, because they are engineered to evaluate large numbers of people. They specialize in bulk, and they’re cheap. That’s part of their appeal. The wealthy, by contrast, often benefit from personal input. A white-shoe law firm or an exclusive prep school will lean far more on recommendations and face-to-face interviews than will a fast-food chain or a cash-strapped urban school district. The privileged, we’ll see time and again, are processed more by people, the masses by machines.
