Can We Explain Online Toxicity When We Don’t Know What It Is?

A new paper claims to explain online toxicity with a simple model. But a lot depends on what we mean by “toxic.”

J. Nathan Matias
10 min read · Feb 12, 2024
Onondaga Lake, just north of where I live, has been one of the most toxic, polluted waterways in the US. The cleanup has cost over a billion dollars, and every few years a new article declares it sort of cleaner. Image: SYR

Last week, when Cory Doctorow shared an academic article on Mastodon about online toxicity, I had to click. It’s a topic I care about, as a scientist who has studied solutions to online harassment and written about environmental conservation and food safety as metaphors for Internet governance for over a decade.

UPDATE April 6: One of the authors has responded, and so have I, in a post entitled: “Disagreements, Fast and Slow.” If you read this post, I encourage you to also read my follow-up.

The anonymous article had a grand title and offered a simple model to make a big claim: “Bias, Skew and Search Engines Are Sufficient to Explain Online Toxicity.” It also seemed incomplete and waiting for peer review. When I expressed my skepticism and asked for more details, I got a response from Henry Farrell, a colleague I deeply appreciate who turned out to be one of the co-authors. Knowing how much Henry values the respect expressed in thoughtful disagreement, I decided to take a closer look and write up more of a response — in the form of this post.

In this viewpoints article for the Association for Computing Machinery, Henry and Cosma Shalizi critique (what I agree to be) an overly simplistic techno-determinist view that “toxicity” in political discourse is the result of algorithms that optimize for engagement and profit (you can read my view on this problem in a Nature article last year). They point out that “if social media algorithms are primarily to blame for fractured discourse, then curbing them might make the Internet safer for democracy.” They then carry out a thought experiment to imagine and model a world without these algorithms.

In their thought experiment, they take a few assumptions about human psychology, make some assumptions about how search engines work, and then write a simulation for “preferential attachment” on the basis of those assumptions. Their outcome variable for this simulation is the number of groups of people who would form in a world that only had search engines — on the assumption that an Internet with many smaller groups would have greater toxicity and a world with fewer, larger groups would be less toxic. Since their simulation can imagine an Internet with many small groups, they conclude that a toxic internet does not depend on social media algorithms.
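To make the mechanism concrete, here is a minimal toy sketch of preferential attachment into groups. This is my own illustration of the general technique, not the authors’ actual model, and the parameter values are arbitrary: each new arrival either founds a new group (with some small probability) or joins an existing group chosen with probability proportional to its current size.

```python
import random

def simulate_groups(n_people=10_000, p_new=0.01, seed=42):
    """Toy preferential-attachment sketch: each newcomer founds a new
    group with probability p_new, otherwise joins an existing group
    chosen with probability proportional to its current size."""
    rng = random.Random(seed)
    groups = [1]  # group sizes; the first person founds the first group
    for _ in range(n_people - 1):
        if rng.random() < p_new:
            groups.append(1)  # a brand-new group of one
        else:
            # weighted choice by size = "the rich get richer"
            idx = rng.choices(range(len(groups)), weights=groups)[0]
            groups[idx] += 1
    return groups

sizes = sorted(simulate_groups(), reverse=True)
print(f"{len(sizes)} groups; five largest: {sizes[:5]}")
```

Even this crude version reproduces the qualitative pattern such thought experiments rely on: many small groups alongside a few giant ones, with no recommender system in sight.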

What is the value of such an over-simplification? Models like this one are tools to think with — they force people to specify what they mean when sharing ideas, creating a basis for dialog and new science. After all, the very idea of “pollution” is itself such a model. Consider the Streeter-Phelps equation, which defined pollution with a simple model for the critical oxygen deficit created by pollutants in a river. With the simple idea and a single formula, the public can imagine what it means for a river to die, and scientists can calculate the point at which a waterway becomes so polluted that it can no longer sustain life. Individual waterways vary widely, but the idea of pollution itself has given people a tool to think with for water management.
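As a concrete illustration, the classic form of the Streeter-Phelps deficit equation can be computed in a few lines. The parameter values below are illustrative placeholders, not measurements from the Ohio River or any real waterway:

```python
import math

def oxygen_deficit(t, L0=10.0, D0=1.0, kd=0.35, kr=0.55):
    """Streeter-Phelps dissolved-oxygen deficit D(t) (mg/L) at travel
    time t (days) downstream of a pollutant discharge.
    L0: initial biochemical oxygen demand, D0: initial deficit,
    kd: deoxygenation rate, kr: reaeration rate (both per day)."""
    return (kd * L0 / (kr - kd)) * (math.exp(-kd * t) - math.exp(-kr * t)) \
        + D0 * math.exp(-kr * t)

# The "sag" point: the travel time where the deficit peaks and the
# river is closest to being unable to sustain life.
times = [i / 10 for i in range(201)]
t_crit = max(times, key=oxygen_deficit)
print(f"critical deficit {oxygen_deficit(t_crit):.2f} mg/L at t = {t_crit} days")
```

The single curve captures the whole story: oxygen drops as microbes consume the pollutant, then recovers as the river re-aerates, and the critical point in between is what managers must keep above the survival threshold.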

Streeter and Phelps’s model of pollution works because it’s based on actual observations of the Ohio River. Unfortunately, Henry and Cosma’s article is not so well grounded. They make an even more fundamental move in their viewpoint article that puts their argument on uncertain ground: building it around the idea of toxicity, which has no stable or agreed-upon definition.

Talk about internet toxicity is a dodge — a way to obfuscate and sidestep hard debates about democratic governance

What do people mean when they talk about “toxicity” and “health” online? That’s the question taken up by an excellent new article in Information, Communication & Society by Anna Gibson, Niall Docherty, and Tarleton Gillespie. Writing about this metaphor, they argue that talk of toxicity is a dodge — a way to obfuscate and sidestep precision in order to avoid hard debates about democratic governance.

As Anna and colleagues tell it, policymakers, researchers, and journalists use this language when they want to talk about social problems without having to mention any actual people. Imagine you’re Mark Zuckerberg facing scrutiny from Congress for your decisions. Do you point to a specific group of people and name their history of misinformation and hatred? Or do you just wave your hands and say you’re working on the problem of “toxicity” and “health”? It’s the Internet policy equivalent of the passive voice: announcing “a gun was discharged” rather than saying “Bonnie & Clyde shot him in cold blood.”

Vague terms like toxicity offer safeguards to CEOs and politicians, but this vagueness also hinders science and the search for solutions. And on the issue of online harms, the details of your definitions make a huge difference in how you think about them. To illustrate this, I want to summarize three different, complementary definitions of toxicity, grounded in empirical research, that would lead scientists to completely different models: Selection, Competition, and Hatred. These are not the only three, but they are the most well studied.

The Selection Model of Online Toxicity

In the selection theory of online toxicity, problems with political discourse are the result of small interest groups that don’t understand each other and instead compete. This is the view that Henry and Cosma advance, and it has a long pedigree, going all the way back to Gordon Allport’s work on inter-group relations in 1954.

In this view (which I am over-simplifying), selection occurs when people sort into in-groups and out-groups that form prejudices and lose the capacity to listen, understand, and negotiate with each other. Ever since personalized algorithms were first invented in the 1990s, largely white male thought leaders including Nicholas Negroponte, Cass Sunstein, and Eli Pariser have spread the concern (without evidence) that recommender systems online might come to reinforce this sorting, divide people from each other, and create a threat to democracy.

Every four years, someone floats this idea as an explanation for the bitter struggle of another U.S. presidential election and a new group of organizations works on the “contact hypothesis” proposed by Allport as a way to solve the problem — creating more respectful connections between groups in conflict. Unfortunately, the evidence for this is very weak, even if you accept this as the central problem.

Many scholars have attempted to study the filter bubble idea, with the most reliable research finding that “the echo chambers narrative captures, at most, the experience of a minority of the public.”

I tend to agree — though the one area that does concern me is recruitment to violent extremism. In that area, even small effects can have severe consequences, especially when a small number of people commit mass murder. A recent study published in Science Advances found that 3% of the 250 million Americans on YouTube who don’t subscribe to extremist channels are nonetheless drawn to extremist and terrorist videos by YouTube’s algorithm — that’s 7.5 million people who were introduced to extremism by the platform. As the authors write, “low levels of algorithmic amplification can have damaging consequences when extrapolated over YouTube’s vast user base and across time.”

Even if (only) roughly 7.5 million Americans have been introduced to extremism by YouTube’s algorithm, it’s still possible that Henry and Cosma are right that this is a drop in the bucket compared to the effects of people joining different, opposing teams. But to assess this view, we need to understand the two other leading definitions of toxicity.

The Attention-Competition Model of Online Toxicity

If Henry and Cosma define toxicity in terms of the size of communities — with smaller communities being more toxic and larger ones less so — the opposite might also be true. In my research on “online toxicity,” I tend to define online harms in terms of online harassment — slurs, threats of violence, and denigrating language that people fling at each other. And I’ll be honest — I’ve seen a *lot* of it in big, diverse online spaces.

Consider r/science, one of my collaborators. At 31.5 million subscribers, it’s one of the world’s largest conversation spaces for discussing scientific research. And without moderation it would be utterly unlivable — in a study on online harassment I published in PNAS, over half of all newcomer comments in this community were so unspeakable that a team of over a thousand volunteer moderators had to remove them (we went on to find effective prevention interventions). In another case that William Frey and I just wrote about for Data & Society, some of the most overwhelming moments of online racism occur when White Americans swarm Black communities with offers of help after Black people are murdered by the police.

Inspired by conversations with communities and scholarship by Benjamin Mako Hill, Aaron Shaw, and other Wikipedia scholars, I’m coming to believe that conversations online attract greater harassment and vitriol the larger they are. As Mako puts it, attention is one of the critical resources on the Internet, something that many actors want to exploit. With too little attention, a community cannot be self-sustaining. But when a platform, community, or conversation reaches a certain threshold of attention, it becomes an attractive target for hate groups, scammers, peddlers of disinformation, and political actors that compete for our attention.

In this view, recommender systems both shape and are shaped by this problem. Because they are designed to redirect people to highly popular topics, they pour fuel onto sites of active conflict, making the conflict even greater. Inspired by Mako’s work, my lab has been monitoring tens of thousands of online conversations and communities to study this dynamic. In time, we hope to develop a pollution model not unlike the Streeter-Phelps equation: computing and predicting the threshold at which a community has so much attention that the conversation will become too full of hatred, conflict, and self-promotion to sustain.

The Hatred and Prejudice Model of Online Toxicity

While the attention-competition model is grounded in observed reality, neither of the models I’ve listed can actually explain the majority of terrible things that people experience online. That’s because both of these models explain problems arising from the relations of people who don’t know each other. But researchers have known for at least a decade that at least half of the horrible things people experience online come from people they know or have a connection to.

A huge amount of online violence arises from two things: identity-based hatred and how it intersects with relationships, both real and imagined. This is the detail that people politely set aside when they make the issue abstract with terms like “toxicity.”

According to surveys, half of all online harassment actually comes from someone known to the recipient. This could be stalking, death threats, rape threats, sustained harassment, or revenge porn (termed non-consensual imagery). When scholars define small communities as less “toxic,” they exclude these interpersonal risks as irrelevant and potentially advance a social structure where these risks might be even more likely. My work on the attention-competition model also fails to explain interpersonal violence, which is widely experienced by people who aren’t famous at all.

Racism and sexism form the second part of the prejudice-hatred model of toxicity. Numerous books and studies analyzing the public social media profiles of politicians and journalists have found that, in parallel with attention/competition dynamics, women, people of color, and queer public figures experience higher rates of harassment and hatred. Over the years, many minoritized politicians, journalists, and academics have disengaged from social media to protect themselves.

Proponents of the prejudice/hatred model propose it as an alternative to the selection model. In the view of scholars such as Daniel Kreiss and Shannon McGregor, academics and politicians talk about group size and polarization in order to avoid talking about racism, sexism, and inequality. In this view, neither larger nor smaller social clusters would solve the deeper problem of identity-based hatred and discrimination that continue to maintain underclasses in American wealth and power — even among people who are neighbors and live in the same family.

Grounding Models in Reality and Scoping Them Well

The best models are simple, grounded in reality, and explain an understandable, consequential range of phenomena. This paper does about one and a half of these things. This matters, because computer scientists (who are its audience) have a very long track record of buying into over-simple models (like algorithmic polarization) that seem to explain everything — without checking the model against reality.

I agree with Henry and Cosma that the filter bubble hypothesis is not a good explanation of population-average political dynamics. And I applaud their attempt to find some way through the tricky problem of mutual causality. Their model is compellingly simple, but like proponents of the filter-bubble hypothesis, they over-claim how much of civic discourse their model explains, and the model is not well grounded in the current state of empirical evidence. An even stronger paper would have defined its limitations more precisely and looked at more empirical work beyond just famous papers published in Science and Nature, which by definition do not provide a reliable distribution of empirical results.

It won’t surprise anyone that as a practitioner of community/citizen science, I think the strongest possible work would be informed by the lived experiences of the people most affected — those who experience inequality, harassment, threats, and democratic gridlock over their urgent needs.

That said, this paper has me writing a long blog post in response, and one other job of models is to provoke constructive dialog, so it’s definitely achieving that <grin>.

Interestingly, we may already live in the world that Henry and Cosma have tried to simulate. Over the past few years, Meta and other platforms have decided to dramatically reduce the promotion of news and de-prioritize accounts that share politics, just like municipalities that closed public pools to avoid having to desegregate them in the 1970s. If toxicity persists, it might lend support to the selection model they propose — or to any of the other models.



J. Nathan Matias

Citizen social science to improve digital life & hold tech accountable. Assistant Prof, Cornell. Prev: Princeton, MIT. Guatemalan-American