Building out generative AI models: Insights from MosaicML

There’s still much naivete in the enterprise around building large language models (LLMs) and other generative AI systems — which is not surprising, as they’re only just emerging in the mainstream. 

As described by Naveen Rao, founder and CEO of MosaicML, there is a whole span of options for enterprises to consider. They can use OpenAI and other existing models, fine-tune those models for specific use cases, or build models from scratch. The most forward-thinking companies often use many tools together while orchestrating customized models for particular domains and use cases.

This concept of blending models or mixing and matching is not yet well understood, Rao pointed out.

“Everyone’s starting to get their heads around it,” he said in a fireside chat with VentureBeat founder Matt Marshall at this week’s VentureBeat Transform 2023. “Everything is so new. Most people didn’t even know what a large language model or GPT was 9 months ago. It’s probably one of the fastest transitions I’ve ever seen in my career.” 

Customize, no need to spend millions

MosaicML, which helps enterprises train and deploy LLMs and other generative AI models, was acquired in late June by data lakehouse and AI company Databricks for an incredible $1.3 billion. In May, the startup released its MPT-7B model, which cost $200,000 to build.

“It’s not $100 million,” the laid-back and low-key Rao emphasized of the price tag. “Everyone needs to get that out of their mind.”

As he put it, models don’t need to have the capability to philosophize about such topics as how Rome fell. Organizations just need to ensure general capabilities and correctness for their particular use cases. “That’s not necessarily what OpenAI has built,” he said. 

In many cases, enterprises are still gathering data, he noted, and the next stage is “How do I activate that data with AI?”

Taking that to the next level, in building a model and maintaining control over it, enterprises should pre-train and layer in their own data with existing data, he said. He also emphasized that it’s difficult for one model provider to build for every domain, so organizations must put the capability of model building into the hands of those with expertise in their fields. 

MosaicML is seeing early adopters put models into production, solicit feedback from users, then modify the models and build a pipeline and feedback loop around them.

“It’s this continuous cycle of innovation and improvement,” he said. 

Generative AI has ‘massive value’

MosaicML, for its part, set out to create a stable, cross-cloud interface to simplify the training of large models. The company has spent only $35 million since its founding in 2021 and just hit 50 customers, Rao said. He explained that the company is selective about whom it works with: Customers must be organizations with strong teams in place and data in reasonable shape.

From the outset, the company saw AI as a whole, and generative AI in particular, as having “massive value.”

“ChatGPT is new to a lot of people, it was not new to us,” he said. He called the chatbot “entertaining” and admitted that he initially thought it would be a “one-off” (until his teenage kids began talking about it).

By their very nature, startups have the unique ability to take bets, jump on things quickly, work on them like mad and carve out niches for themselves, he noted.

Co-pilots for everything

Looking ahead, traditional enterprises will take a few more years to reach peak use of generative AI. Fintech is always an early adopter of new technologies, Rao said, and use in healthcare is also ticking up, while big pharma has “promise.”

The most common use cases will be around consumer experiences and “new ways to manipulate your own data” for bespoke search and to provide context and personalization. Support automation and co-pilots will also serve as important tools, he said. 

“The pace of change is very high right now, it’s scary to me, to anybody,” said Rao. “It’s not going to be replacing jobs, it’s really going to enhance people’s jobs. There will be co-pilots for lawyers, co-pilots for doctors, co-pilots for everything.”

As for the Databricks acquisition, Rao said he was not looking for his company to be bought, but MosaicML found a “strong synergy” with the enterprise software company, which serves 10,000 customers. As he described it, MosaicML can “bolt on” to what Databricks built.

“Enterprises are hungry for this,” said Rao. “We want to win. We want to be there first.”

Author: Taryn Plumb
Source: VentureBeat