Meta’s prototype moderation AI needs only a few examples of bad behavior to take action

Moderating content on today’s internet is akin to a game of Whack-A-Mole, with human moderators constantly forced to react in real time to shifting trends, such as vaccine mis- and disinformation or malicious actors deliberately seeking ways around established codes of conduct. Machine learning systems can help alleviate some of this burden by automating the policy enforcement process, but modern AI systems often require months of lead time to properly train and deploy (most of that time spent collecting and annotating the thousands, if not millions, of necessary examples). To cut that response time down to weeks rather than months, Meta’s AI research group (formerly FAIR) has developed a more general technology, dubbed Few-Shot Learner (FSL), that requires only a handful of specific examples in order to respond to new and emerging forms of harmful content.

Few-shot learning is a relatively new development in AI that essentially teaches a system to make accurate predictions from a limited number of training examples, in contrast to traditional supervised learning (SL) methods. For example, to teach a standard SL model to recognize pictures of rabbits, you feed it a few hundred thousand rabbit pictures; afterwards, you can present it with two images and ask whether they both show the same animal. The thing is, the model doesn’t know whether the two pictures are of rabbits, because it doesn’t really know what a rabbit is. Its purpose isn’t to spot rabbits but to look for similarities and differences between the presented images and predict whether they show the same thing. There is no broader context for the model to work within, which makes it useful only for telling “rabbits” apart: it can’t tell you whether a picture shows a rabbit, a lion, or John Cougar Mellencamp, only whether or not those things are the same.
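
To make that distinction concrete, here is a minimal toy sketch of this kind of similarity matching in Python. It is not Meta’s model or any real image system; the trigram embed() function, the looks_same() helper, and the 0.5 threshold are invented purely to show a model that can say “same or different” without ever knowing what a rabbit is.

```python
# Toy sketch of similarity matching: the "model" can only say whether two
# inputs look alike; it has no concept of what a rabbit (or anything else) is.
# embed(), looks_same() and the 0.5 threshold are invented for this illustration.
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    # Stand-in for a learned image/text encoder: a bag of character trigrams.
    return Counter(text[i:i + 3] for i in range(len(text) - 2))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[k] * b[k] for k in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def looks_same(x: str, y: str, threshold: float = 0.5) -> bool:
    # Answers only "same or not": it never says *what* either input shows.
    return cosine(embed(x), embed(y)) >= threshold

print(looks_same("a white rabbit in a field", "a white rabbit in a garden"))       # similar descriptions
print(looks_same("a white rabbit in a field", "john cougar mellencamp on stage"))  # unrelated descriptions
```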

FSL relies far less on labeled data (i.e. pictures of rabbits) in favor of a generalized system that learns more like a human than conventional AIs do. “It is first trained on billions of generic and open-source language examples,” a Meta blog post published Wednesday reads. “Then the AI system is trained with integrity-specific data we’ve labeled over the years. Finally, it is trained on condensed text explaining a new policy.” And, unlike the rabbit-matching model above, FSL is “pre-trained on both general language and integrity-specific language so it can learn the policy text implicitly.”
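
As a rough illustration of that staged training recipe, here is a heavily simplified Python sketch. The Model class, function names, and toy datasets are all placeholders invented for this example; only the three-stage ordering (generic pretraining, then labeled integrity data, then the condensed policy text) comes from Meta’s description.

```python
# Deliberately simplified sketch of the staged training described above.
# Model, the dataset names, and the example policy text are placeholders;
# only the three-stage ordering reflects Meta's description of FSL.

class Model:
    """Stand-in for a large language model; update() fakes a training step."""
    def __init__(self):
        self.training_log = []

    def update(self, example, stage):
        self.training_log.append((stage, example))


def train_fsl(model, generic_corpus, labeled_violations, new_policy_text):
    # Stage 1: billions of generic, open-source language examples (self-supervised).
    for text in generic_corpus:
        model.update(text, stage="generic-pretraining")

    # Stage 2: historical policy-violation examples labeled over the years.
    for text, label in labeled_violations:
        model.update((text, label), stage="integrity-finetuning")

    # Stage 3: the condensed text describing the brand-new policy itself,
    # the part that lets the system react with only a handful of examples.
    model.update(new_policy_text, stage="policy-finetuning")
    return model


model = train_fsl(
    Model(),
    generic_corpus=["example sentence one", "example sentence two"],
    labeled_violations=[("buy followers cheap!!!", "spam")],
    new_policy_text="Posts that discourage COVID-19 vaccination are not allowed.",
)
print(len(model.training_log), "training steps recorded")
```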

Recent tests of the FSL system have been encouraging. Meta researchers measured how the prevalence of harmful content shown to Facebook and Instagram users changed before and after FSL was switched on across the sites. The system caught harmful content that traditional SL models had overlooked and helped curb the spread of that content overall. The FSL system also reportedly outperformed other few-shot models by as much as 55 percent (though by 12 percent on average).

FSL’s improved performance is due in part to its use of entailment, the logical relationship in which one statement necessarily follows from another: if sentence A is true, sentence B must also be true. For example, if sentence A reads “The president was murdered,” it entails that sentence B, “The president is dead,” is also true. By using entailment in the FSL system, the team is able to “convert the class label into a natural language sentence which can be used to describe the label, and determine if the example entails the label description,” Meta AI researchers explained. So rather than trying to generalize what a conventional SL model knows from its training set (hundreds of thousands of rabbit pictures) to the test set (“are these two pictures of rabbits?”), the FSL model can spot harmful content more broadly when it sees it, because it understands the policy the content is violating.
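
To see the label-as-hypothesis idea in action, here is a short sketch using Hugging Face’s off-the-shelf zero-shot-classification pipeline, which relies on the same entailment trick. This is an illustration of the general technique rather than Meta’s FSL system, and the example post, candidate labels, and hypothesis template are made up.

```python
# Entailment-based labeling with an off-the-shelf NLI model via the Hugging Face
# zero-shot-classification pipeline. This shows the general technique only; it is
# not Meta's FSL, and the post text and labels below are invented for the example.
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

post = "Share this with 10 friends and you'll win a free phone!"

# Each candidate label is rewritten as a natural-language hypothesis via the
# template; the NLI model then scores whether the post entails that sentence.
result = classifier(
    post,
    candidate_labels=["engagement bait", "ordinary conversation"],
    hypothesis_template="This post is an example of {}.",
)
print(result["labels"][0], round(result["scores"][0], 3))
```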

The added flexibility of a “single shared knowledge base and backbone” could one day allow AI moderation systems to identify and respond to new forms of harmful content far more quickly, catch more content that only barely skirts current policies, and even help Meta develop and better define future policies.
