Go Ahead, Try to Sneak Bad Words Past AI Filters—for Research

A new Facebook project pits humans against algorithms, to expose the systems’ weaknesses and help make them better.

Facebook’s artificial intelligence researchers have a plan to make algorithms smarter by exposing them to human cunning. They want your help to supply the trickery.

On Thursday, Facebook’s AI lab launched a project called Dynabench that creates a kind of gladiatorial arena in which humans try to trip up AI systems. Challenges include crafting sentences that cause a sentiment-scoring system to misfire, for example by reading a comment as negative when it is actually positive. Another involves tricking a hate speech filter, a potential draw for teens and trolls. The project initially focuses on text-processing software, although it could later be extended to other areas such as speech, images, or interactive games.

Subjecting AI to provocations from people is intended to give a truer measure of the intelligence (and stupidity) of artificial intelligence, and provide data that can improve it. Researchers typically compare algorithms by scoring how accurately they label images or answer multiple-choice questions on standard collections of data, known as benchmarks.

Facebook researcher Douwe Kiela says those tests don’t really measure what he and others in the field care about. “The thing we’re really interested in is how often it makes mistakes when it interacts with a person,” he says. “With current benchmarks, it looks like we’re amazing at doing language in AI and that’s very misleading because we still have a lot to do.”

The researchers hope analyzing cases where AI was snookered by people will make algorithms less dupable.

Kiela hopes AI experts and ordinary netizens alike will find it fun to log on to spar with AI and earn virtual badges, but the platform will also let researchers pay for contributions through Amazon’s crowdsourcing service Mechanical Turk. AI labs at Stanford, the University of North Carolina, and University College London will all maintain artificial intelligence tests on the Dynabench platform.

Facebook’s project comes as more AI researchers, including the social network’s VP of artificial intelligence, say the field needs to broaden its horizons if computers are to become capable of handling complex, real-world situations.

In the last eight years, breakthroughs in an AI technique called deep learning have brought consumers speech recognition that mostly works, phones that auto-sort dog photos, and some hilarious Snapchat filters. Algorithms can unspool eerily limpid text.

Yet deep learning software stumbles in situations outside its narrow training. The best text-processing algorithms can still be tripped up by the nuances of language, such as sarcasm, or how cultural context can shift the meaning of words. Those are major challenges for Facebook’s hate speech detectors. Text generators often spew nonsensical sentences adrift from reality.

Those limitations can be hard to see if you look at the standard benchmarks used in AI research. Some tests of AI reading comprehension have had to be redesigned and made more challenging in recent years because algorithms figured out how to score so highly, even surpassing humans.

Yejin Choi, a professor at the University of Washington and research manager at the Allen Institute for AI, says such results are deceptive. The statistical might of machine learning algorithms can discover tiny correlations in test datasets, undetectable by people, that reveal correct answers without requiring a human’s wider understanding of the world. “We are seeing a Clever Hans situation,” she says, referring to the horse who faked numeracy by reading human body language.

More AI researchers are now seeking alternate ways to measure and spur progress. Choi has tested some of her own, including one that scores text-generation algorithms by how well their responses to Reddit posts rank against those from people. Other researchers have experimented with having humans try to trick text algorithms, and shown how examples collected this way can make AI systems improve.

Algorithms tend to look less smart when pitted against those more challenging tests, and Choi expects to see a similar pattern on Facebook’s new Dynabench platform. Projects that strip away AI emperors’ clothes could jolt researchers into exploring fresher ideas that lead to breakthroughs. “It will challenge the community to think harder about how learning should really take place with AI,” Choi says. “We need to be more creative.”

