Every day, life seems to get closer and closer to becoming one big episode of “Black Mirror.” The MIT Technology Review recently warned that Google’s new hate speech removal software seems to be failing miserably at its stated purpose and is letting plenty of offensive messages slip through the cracks.
Last week, Google unveiled a service called Perspective, which uses machine learning to ferret out toxic comments on websites and eradicate them in an effort to take the bite out of using the internet.
Media outlets would theoretically use Perspective to make sure that their comments sections don’t devolve into hate-filled cesspools, but there’s just one problem: it’s almost trivial to sneak a doozy of a comment past Perspective’s filters.
The MIT Technology Review tested comments against Perspective’s rating system, which scores each comment from 1 to 100 percent for “toxicity,” defined as “a rude, disrespectful, or unreasonable comment that is likely to make you leave a discussion.”
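For readers curious what querying such a scoring service looks like, here is a minimal sketch in Python. The endpoint URL and request shape follow Perspective’s publicly documented AnalyzeComment API, but treat them as assumptions to verify against the current docs; the helper function names are our own, and the response at the end is a fabricated stand-in, not real API output.

```python
import json

# Assumed Perspective AnalyzeComment endpoint (check current docs; an API
# key is required on real requests).
ANALYZE_URL = "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"


def build_analyze_request(comment_text: str) -> dict:
    """Build the JSON body asking Perspective to score TOXICITY."""
    return {
        "comment": {"text": comment_text},
        "requestedAttributes": {"TOXICITY": {}},
    }


def extract_toxicity_percent(response: dict) -> float:
    """Pull the 0-1 summary score out of a response and scale it to 0-100."""
    score = response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]
    return round(score * 100, 1)


if __name__ == "__main__":
    body = build_analyze_request("Trump sucks")
    print(json.dumps(body))
    # A real call would POST `body` to ANALYZE_URL with an API key, e.g.:
    #   requests.post(f"{ANALYZE_URL}?key=API_KEY", json=body)
    # Below is a mocked response shaped like Perspective's documented output.
    fake_response = {
        "attributeScores": {"TOXICITY": {"summaryScore": {"value": 0.96}}}
    }
    print(extract_toxicity_percent(fake_response))  # 96.0
```

The takeaway of the article is precisely that this single summary score is easy to game: the number comes back the same whether the caller is a newsroom moderation tool or a troll probing for phrasings that slip under the threshold.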
“‘Trump sucks’ scored a colossal 96 percent, yet neo-Nazi codeword ‘14/88’ only scored 5 percent,” MIT noted. “‘Few Muslims are a terrorist threat’ was 79 percent toxic, while ‘race war now’ scored 24 percent. ‘Hitler was an anti-Semite’ scored 70 percent, but ‘Hitler was not an anti Semite’ scored only 53 percent, and ‘The Holocaust never happened’ scored only 21 percent.”
Hilariously, couching hate speech in polite phrases seems to trick Perspective into lowering its guard: “And while ‘gas the joos’ scored 29 percent, rephrasing it to ‘Please gas the joos. Thank you,’ lowered the score to a mere 7 percent.”
The MIT Technology Review uncovered a wealth of similar problems with Perspective’s algorithms, which seem to despise profane words no matter the context and to be oblivious to code phrases and innuendo.
Will our innocent comments on social media soon be targeted for removal by an especially tone-deaf AI? Tune in next week to find out.