To create accurate assessments of the models’ answers, Meta’s “Self-Taught Evaluator” uses the same “chain of thought” method as OpenAI’s o1 models.
Meta said on Friday that it was releasing several new AI models from its research division, including a “Self-Taught Evaluator” that could pave the way for less human involvement in the AI development process.
The tool’s release comes after Meta first described it in a paper published in August. In that work, Meta explained how the tool uses the same “chain of thought” method as OpenAI’s recently released o1 models to produce accurate assessments of the models’ replies.
This method, which divides difficult issues into more manageable logical steps, seems to increase the precision of answers to difficult problems in disciplines like math, physics, and coding.
Meta’s researchers trained the evaluator model using only AI-generated data, removing human input at that stage as well.
Two of the Meta researchers working on the project told Reuters that the ability to use AI to evaluate AI reliably provides a glimpse of a potential road toward creating autonomous AI entities that can learn from their own mistakes.
Such agents are envisioned by many in the AI industry as digital assistants that possess the intelligence to do a wide range of activities without the need for human participation.
Self-improving models could eliminate the need for Reinforcement Learning from Human Feedback, a widely used but often costly and inefficient process that depends on human annotators with specialized knowledge to label data correctly and confirm the accuracy of responses to challenging writing and math problems.
One of the researchers, Jason Weston, stated, “As AI becomes more and more super-human, we hope that it will get better and better at checking its work, so that it will actually be better than the average human.”
According to him, “the idea of being self-taught and able to self-evaluate is basically crucial to the idea of getting to this sort of super-human level of AI.”
Other companies, such as Google and Anthropic, have also published research on Reinforcement Learning from AI Feedback, or RLAIF. Unlike Meta, however, those companies tend not to make their models available to the general public.
An upgrade to Meta’s image-identification Segment Anything model, a tool that expedites LLM response production times, and datasets that can help with the development of novel inorganic materials were among the other AI capabilities the business unveiled on Friday.
The goal of many AI specialists is to build digital assistants that are capable of handling a variety of activities without assistance from humans. Meta intends to increase the effectiveness of AI training procedures, which now call for a great deal of human supervision and knowledge, by utilizing self-learning models.
One of the researchers, Jason Weston, expressed hope that as AI develops, it would grow better at checking its own work and may even outperform humans in some situations. He emphasized that achieving a greater degree of AI capabilities requires the ability to learn and assess itself.
Other companies, such as Google and Anthropic, are investigating similar ideas, although they typically do not make their models publicly accessible.
In addition to the Self-Taught Evaluator, Meta also made available resources to assist scientists in finding novel materials and an improved version of their image-recognition software.
Meanwhile, Meta is adjusting its Facebook monetization program by combining its three creator monetization programs into one. The goal of the new approach is to make it easier for creators to earn money on the platform.
Creators can already make money through in-stream advertisements, ads on reels, and performance bonuses, but each program has its own qualifying requirements and application process. The updated scheme will streamline onboarding into a single, cohesive experience by requiring creators to apply just once.