Generative AI LLMs set to enter the Jurassic-2 era with AI21 Labs

March 9, 2023

While OpenAI is grabbing a lot of the hype in the generative AI world, it isn’t the only vendor building a large language model (LLM).

Today, Israeli startup AI21 Labs announced the release of its latest generative AI model, known as Jurassic-2. AI21 Labs was founded in 2017 and released its Jurassic-1 Jumbo LLM in 2021, boasting that the model had 178 billion parameters. The company raised $64 million in a series B funding round in July 2022 and is focused on text generation use cases.

With Jurassic-2, AI21 Labs has updated the model's training data and aims to speed up generation response times by up to 30%. The company is also adding capabilities that support more advanced instructions, so users can get highly customized results.

AI21 will make Jurassic-2 available through its natural language processing (NLP)-as-a-service platform, AI21 Studio, as well as through a series of APIs that developers can integrate into their own custom applications.

“Large language models are magical and they’re very broadly applicable; it’s continuously surprising to see what can be achieved with them,” Ori Goshen, co-CEO and cofounder of AI21 Labs, told VentureBeat. “At the same time, we see some limitations and that’s why we started the company: to try and bring more reasoning and more semantics into the statistical approach.”

How AI21 Labs is taking a semantic approach to generative AI

Many LLMs take a statistical approach: the model infers its outputs from patterns learned during machine learning training.

Goshen explained that some types of tasks do not tend to work well with a statistical approach to AI. For example, basic mathematics is not learned just by training on examples and then generalizing from them. Rather, he noted, humans learn basic mathematics by being taught rules, such as how to perform addition or subtraction. The goal for AI21 Labs with Jurassic-2 is to integrate semantic reasoning along with statistical representation.

The aim is to provide what Goshen referred to as a more guided and precise response to a user’s intent with generative AI. For instance, he noted that if a user asks the system to generate a statistical or historical fact, it will generate text that is not only coherent but also factual, and it will cite the source of the information.

In general, Goshen said, applying LLMs productively in work environments requires more reliability.

“We’re trying to focus on reading and writing use cases like summarizing text and generating text that is highly guided and reliable,” Goshen said.

You can teach a ‘dinosaur’ new tricks

The term “Jurassic” refers to a geological period in Earth’s history in which dinosaurs were very much active. With Jurassic-2, AI21 Labs is teaching its dinosaur-era-named LLM new techniques.

Goshen explained that AI21 Labs took a multiphase approach to building the Jurassic-2 LLM. The first phase involved a self-supervised approach in which the model was trained on a very large corpus of unstructured and unlabeled data. The next phase used a large volume of labeled data to teach the LLM to follow instructions.

With Jurassic-2, AI21 Labs also focused on being more selective about the data the model is trained on.

“There’s a lot of text out there and there’s a lot of repetitiveness on the web,” he said. “So one of the key things we worked on was how to selectively pick examples for the model that actually boost its learning, [which] obviously improves efficiency of training and the performance of the model in general.”
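To make the idea of filtering repetitive web text concrete, here is a minimal Python sketch. It is not AI21 Labs' method, which Goshen did not detail; it assumes the simplest possible form of selection, dropping exact duplicates after light normalization. The function names and the sample corpus are hypothetical.

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return re.sub(r"\s+", " ", text.strip().lower())


def deduplicate(examples: list[str]) -> list[str]:
    """Keep the first occurrence of each normalized example, preserving order."""
    seen: set[str] = set()
    kept: list[str] = []
    for example in examples:
        digest = hashlib.sha256(normalize(example).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            kept.append(example)
    return kept


# Hypothetical sample corpus: the second entry repeats the first
# apart from letter case and spacing.
corpus = [
    "Dinosaurs lived in the Jurassic period.",
    "dinosaurs   lived in the Jurassic period.",
    "Large language models generate text.",
]
unique_examples = deduplicate(corpus)
```

Real training pipelines typically go further, for example with near-duplicate detection or quality scoring, but the principle of spending training effort only on examples that add something new is the same.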

It’s not a MRKL (miracle), it’s just AI

One key approach that isn’t yet in the Jurassic-2 LLM is AI21 Labs’ Modular Reasoning, Knowledge and Language system, or MRKL (pronounced “miracle”).

The promise of MRKL is an advanced form of reasoning to help better infer results from an LLM. The company has been talking about its MRKL technology since at least May 2022, when it first demonstrated its Jurassic-X architecture. Goshen said that Jurassic-2 is not implementing MRKL into its architecture at launch, but he hinted that AI21 Labs has some future model releases that will carry forward the spirit of MRKL.
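As a rough illustration of what "modular" reasoning can mean, the Python sketch below routes a query either to a rule-based arithmetic module or to a language model. It is a toy under stated assumptions, not AI21 Labs' MRKL implementation: the router, the calculator module and the stub language model are all hypothetical names invented for this example.

```python
import operator
import re
from typing import Callable, Optional

# Rule-based module: exact arithmetic on two integers, the kind of task
# that is taught by rules rather than learned from examples.
_OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
_ARITHMETIC = re.compile(r"^\s*(-?\d+)\s*([+\-*])\s*(-?\d+)\s*$")


def calculator(query: str) -> Optional[str]:
    """Return the exact result if the query is simple arithmetic, else None."""
    match = _ARITHMETIC.match(query)
    if match is None:
        return None
    left, op, right = match.groups()
    return str(_OPS[op](int(left), int(right)))


def route(query: str, language_model: Callable[[str], str]) -> str:
    """Try the rule-based module first; fall back to the language model."""
    answer = calculator(query)
    return answer if answer is not None else language_model(query)


def stub_language_model(query: str) -> str:
    """Hypothetical stand-in for a call to a real LLM."""
    return f"[generated text for: {query}]"
```

Calling `route("12 + 30", stub_language_model)` returns the exact answer from the calculator, while a query such as "Summarize this article" falls through to the language model stub.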

The Jurassic-2 LLM is available to developers via APIs that they can implement, and it’s also part of AI21’s products, including the Wordtune suite of services.

“We don’t just develop our own models. We also serve our applications that are built on top of these models,” Goshen said.

Author: Sean Michael Kerner
Source: Venturebeat
