Can We Crash-Test AI with Artificial Humans?
Proxies are a promising but potentially risky way to test the safety of AI systems
How can scientists (and communities) make general claims about the effects of AI systems that adapt to humans in different ways? A new article in PNAS shows how answering this important question could require us to develop new methods involving artificial humans.
This recent paper by Homa Hosseinmardi, Amir Ghasemian, Miguel Rivera-Lanas, Manoel Horta Ribeiro, Robert West, and Duncan J. Watts combines methods from experiments and simulations to examine the effect of YouTube’s recommendations on viewer preferences.
As I wrote in Nature last year, scientists and technologists don’t yet have the tools to reliably predict whether a given adaptive algorithm (like YouTube’s recommendations, algorithmic stock traders, or ChatGPT) will cause or reduce harm. That’s a huge problem for society and for tech firms, and it’s one that Hosseinmardi and colleagues are trying to solve.
Here’s their idea: researchers need to show how YouTube’s algorithms would respond differently to different viewer actions. That typically requires an experiment with treatment and control groups. In this study, the researchers used bots as the “treatment” group, similar in some ways to the crash-test dummies used in automotive safety testing (formally, anthropomorphic test devices, or ATDs).
The researchers worked with 44 “focal users” and a similar number of bots that mimicked those users’ viewing behavior. Then, when the experiment started, the bots switched to following the recommendations made by YouTube, and the researchers compared what the bots watched to what the humans they were based on watched.
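To make the design concrete, here is a minimal sketch (in Python) of how a counterfactual-bot comparison might be structured. To be clear, this is my illustration rather than the authors’ code, and the functions `get_recommendations` and `slant_score` are hypothetical placeholders for the steps of querying the platform and scoring the content that gets watched.

```python
# A minimal sketch of a counterfactual-bot design (not the authors' code).
# Each bot first replays a real user's watch history, then diverges to follow
# the platform's recommendations, so the two trajectories can be compared.

from statistics import mean

def run_counterfactual_bot(user_history, get_recommendations, slant_score, n_steps=50):
    """Replay a focal user's history, then always follow the top recommendation."""
    # Phase 1 ("copy" phase): the bot watches exactly what the human watched,
    # so the recommender builds the same profile for the bot as for the human.
    trajectory = list(user_history)

    # Phase 2 ("counterfactual" phase): the bot hands control to the recommender
    # and always watches the first recommended video.
    for _ in range(n_steps):
        recommended = get_recommendations(trajectory)  # hypothetical platform query
        if not recommended:
            break
        trajectory.append(recommended[0])

    # Summarize the content the bot watched after diverging from the human,
    # e.g. an average ideological-slant score for those videos.
    counterfactual_videos = trajectory[len(user_history):]
    if not counterfactual_videos:
        return None  # the recommender returned nothing to follow
    return mean(slant_score(v) for v in counterfactual_videos)
```

Computing the same kind of score over what the focal user actually watched, and comparing the two, is what lets researchers estimate the effect of always following the recommender.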
Dr. Hosseinmardi and colleagues found that, on average, recommendations pushed users toward more moderate rather than more extreme content. Is this the last word on YouTube’s recommendations? Definitely not. But it *is* an exciting proof of concept for a possible path forward.
The Downsides of Testing AI with Artificial Humans
There’s a caveat, though: these kinds of results are only believable if you agree that the “counterfactual bots” experienced something close enough to what humans experience, or at least served as a good enough proxy for humans, that the results are meaningful. Lisa Messeri and Molly Crockett took up that question in a Perspective article for Nature in March, outlining the risks this approach poses to science: they warned that using bots in human research could create an “illusion of understanding” that leads scientists to “produce more but understand less.”
Learning from Failures in Safety Testing with Proxies
Fortunately, this isn’t the first time that scientists have used proxies to stand in for humans in important research. *Proxies*, a 2021 book by Dylan Mulvin, looks at the history, policy, and cultural work of testing technologies with proxies. Mulvin shows how a tendency toward white, abled, male proxies in safety testing has made products and systems more dangerous for many people.
How much should we worry about bad proxies? In 2019, Consumer Reports CEO Dr. Marta Tellado reported that in the United States, “the odds for women being seriously injured in a frontal crash are 73 percent greater than they are for men,” largely because automakers failed to test their cars with proxies shaped like women’s bodies.
In short, proxies and bots are a promising approach that scientists need to test further. Hosseinmardi and colleagues have made a significant contribution in that direction with this new paper. As scientists make progress on these AI proxies, we should also remember that if we’re not careful, the cure could be worse than the disease.