Lessons to be learned from Meta’s controversial BlenderBot 3

On August 5, the social media giant introduced BlenderBot 3 in a blog post. The chatbot was built on the Open Pretrained Transformer, or OPT, language model. With OPT-175B, the group’s researchers aimed to match or even surpass another autoregressive language model: OpenAI’s famous GPT-3 and its 175 billion parameters.

While it has the same number of parameters, BlenderBot 3 relies on the architecture of SeeKeR, a language model capable of using a search engine.

Meta claims that the chatbot can search the internet for data in order to converse on any topic. Meta also mentioned in its blog post that as people interact with the system, it uses that data to improve the virtual assistant.

Internet users quickly pushed BlenderBot 3 to its limits. Press reports described the chatbot as not only anti-Semitic but also pro-Trump, spouting conspiratorial rants about the 2020 US presidential election. Other articles showed the chatbot attacking Meta and its CEO.

The wave of bad press led Meta to update its blog post on August 8, insisting that BlenderBot 3’s flaws are part of its strategy.

“While it is painful to see some of these offensive responses, public demonstrations like this are important for building truly robust conversational AI systems and for closing the clear gap that exists today before such systems can be put into production”, writes Joëlle Pineau, managing director of fundamental AI research at Meta.

Meta did not respond to a request for comment from TechTarget, the owner of MagIT.

The dangers of training AI on public data

Meta’s statement on public demos is both right and wrong, argues Forrester analyst Will McKeon-White.

“People are particularly inventive when it comes to using language,” he notes. “It’s very hard for bots to understand things like metaphor and mockery, and that may partly excuse Meta. It takes a lot of data to train a chatbot, and it’s not easy.”

However, “Meta should have applied terms of service or filters to prevent people from misusing the chatbot,” he continues.

“If you know what’s going on, then you should have taken extra steps to prevent it,” says Will McKeon-White. “Social media doesn’t provide a good training dataset, and making the bot publicly available doesn’t train it well either.”

Meta’s BlenderBot 3 is reminiscent of Tay, an AI-powered chatbot launched by Microsoft in 2016. Like BlenderBot 3, Tay was called out for being misogynistic, racist, and anti-Semitic. The controversy surrounding Tay prompted Microsoft to shut it down within days of its release on social media.

Find other training data

Given that AI chatbots like BlenderBot 3 and Tay are often trained on publicly available information and data, “it should come as no surprise that they spout toxic information,” says Mike Bennett, curriculum director and AI lead at the Institute for Experiential AI at Northeastern University.

“I just don’t know how the big tech companies investing in these chatbots are going to, in a cost-effective way, train this software quickly and efficiently to do anything other than talk in the mode of the sources that were used to train them,” says Mike Bennett.

Smaller companies and enterprises can find other training data, but the investment in curating a select dataset to train chatbots – and the time involved – would be costly.

A less expensive solution would be for smaller companies to pool resources to create a dataset to train chatbots. However, this would cause friction as the organizations would work with competitors, and it would take time to figure out who owns what, Bennett said.

Should we be wary of NLG?

Another solution is to avoid prematurely launching such systems.

Brands and companies that leverage automatic natural language generation (NLG) need to keep a close eye on their system. They must maintain it, understand its tendencies, and adjust the dataset if necessary before making it public, explains the Forrester analyst.

If companies choose to source their training data from the internet, there are multiple ways to do so responsibly, he adds. A terms-of-use policy can deter users from abusing the technology. Another option is to set up filters in the background, or to maintain a list of forbidden words that the system must not generate.
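A forbidden-word filter of the kind the analyst describes can be very simple. The sketch below is a hypothetical illustration, not Meta’s implementation: the blocklist terms, the fallback message, and the `filter_reply` function are all invented for this example.

```python
import re

# Hypothetical blocklist; a production system would load a maintained,
# regularly reviewed list rather than hard-coding placeholder terms.
BLOCKLIST = {"conspiracy", "hoax"}

FALLBACK = "Sorry, I'd rather not talk about that. Can we change the subject?"

def filter_reply(reply: str) -> str:
    """Return the generated reply unchanged, or a safe fallback if it
    contains any blocklisted word."""
    tokens = set(re.findall(r"[a-z']+", reply.lower()))
    if BLOCKLIST & tokens:
        return FALLBACK
    return reply

print(filter_reply("Here is a harmless answer."))          # passes through
print(filter_reply("Let me tell you about a conspiracy"))  # replaced by FALLBACK
```

A real deployment would pair such a filter with a classifier, since word lists alone are easy to evade with misspellings and paraphrases.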

Given BlenderBot 3’s performance, caution will likely be in order when it comes to NLG systems, predicts Will McKeon-White.

“It’s probably going to impair experimentation with these systems for a while,” he adds. “That will last until vendors can provide filters or shields for systems like these.”

Meta’s researchers want to learn from the trolls

For their part, Meta’s researchers are well aware of the chatbot’s potential toxicity. While OPT-175B is better at recognizing racist, misogynistic, or otherwise harmful bias in text, it already tended to generate more toxic content than comparable models. The BlenderBot 3 project leads therefore implemented a whole set of mechanisms during training and inference, including an external classifier meant to “inhibit risky generations.”

A list of keywords has also been introduced to avoid deleterious dialogues. A final check is made before issuing a response if the application has detected a “malicious” interaction on the part of the user. Despite all this, BlenderBot 3 “still generates toxic content in small percentages”, acknowledge Meta’s researchers.
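The layered safety checks described above can be pictured as a short gating function. This is a minimal sketch under stated assumptions, not Meta’s actual components: the stand-in classifier, the thresholds, and the keyword list are all invented for illustration.

```python
RISKY_KEYWORDS = {"stolen election", "hoax"}  # hypothetical keyword list
TOXICITY_THRESHOLD = 0.5                      # hypothetical cutoff

def toxicity_score(text: str) -> float:
    """Stand-in for the external safety classifier (returns 0..1).
    A real system would call a trained model here."""
    return 0.9 if "hate" in text.lower() else 0.1

def is_safe(reply: str, user_flagged: bool) -> bool:
    # 1. External classifier inhibits risky generations.
    if toxicity_score(reply) >= TOXICITY_THRESHOLD:
        return False
    # 2. Keyword list screens out deleterious topics.
    lowered = reply.lower()
    if any(kw in lowered for kw in RISKY_KEYWORDS):
        return False
    # 3. Final, stricter check when the user was flagged as malicious:
    #    here modeled as halving the classifier threshold.
    if user_flagged and toxicity_score(reply) >= TOXICITY_THRESHOLD / 2:
        return False
    return True
```

Only replies that clear all three gates would be sent; anything else would be replaced by a fallback message or regenerated.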

Mike Bennett’s and Will McKeon-White’s remarks are therefore both relevant and partly mistaken.

“If our security mechanisms fail to prevent our bot from saying something inappropriate, rude, or offensive, our UI has feedback mechanisms for users to report its messages,” write Meta researchers.

“Data collected will be shared with the community to improve existing systems and make our models more accountable over time.”

Meta is not in the same league as typical buyers of AI solutions. The group prefers to explore advanced capabilities that it will later exploit for its own needs or for commercial purposes.

Clearly, even if the BlenderBot 3 project fails, it will serve to better train the next version of the chatbot. Meta’s researchers mention a new architecture called DIRECTOR that they plan to implement later.

DIRECTOR is based on a standard decoder to which the project leads have added a classification head for each generated token.

The resulting deep learning model can be “trained on unlabeled data (like most NLG models, editor’s note) and on labeled data indicating whether a generated sequence is desirable or not”. DIRECTOR has already been shown to outperform other techniques at detecting and avoiding toxic, inconsistent, or repetitive text, according to Meta’s researchers.
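The intuition behind a per-token classification head can be sketched with toy numbers: at each decoding step, the language-model score and the classifier’s “desirability” score are combined before picking the next token. The distributions, the weight `GAMMA`, and the function names below are illustrative assumptions, not DIRECTOR’s actual values.

```python
import math

# Toy next-token distributions; in DIRECTOR both would come from the same
# decoder, via its language-model head and its per-token classification head.
LM_PROBS = {"friendly": 0.2, "neutral": 0.3, "toxic_word": 0.5}       # hypothetical
DESIRABILITY = {"friendly": 0.95, "neutral": 0.9, "toxic_word": 0.05}  # hypothetical

GAMMA = 1.0  # weight on the classifier term (an assumed, tunable value)

def director_score(token: str) -> float:
    """Combine the LM log-probability with the classifier's log-probability
    that continuing with this token keeps the sequence desirable."""
    return math.log(LM_PROBS[token]) + GAMMA * math.log(DESIRABILITY[token])

def pick_next_token() -> str:
    return max(LM_PROBS, key=director_score)

print(pick_next_token())  # → "neutral"
```

Note how the LM alone would pick "toxic_word" (its most probable token), while the combined score steers decoding toward "neutral" — the mechanism by which such a model avoids undesirable generations without retraining the base decoder.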
