Galileo, a trailblazer in enterprise generative AI, has unveiled Galileo Luna, a groundbreaking suite of Evaluation Foundation Models (EFMs) that promises to transform how enterprises evaluate their GenAI systems. With Luna, Galileo aims to address the critical challenges of speed, cost, and accuracy that have hindered the widespread adoption of generative AI in production environments.
“Galileo created Luna to address the limitations of current GenAI evaluation methods, which were slow, expensive, and often inaccurate,” said Vikram Chatterji, Co-Founder and CEO of Galileo, in an interview with VentureBeat. “The motivation stemmed from the need for ultra-low-latency, cost-effective, and high-accuracy evaluations in production environments.”
The development of Luna marks a significant milestone for Galileo, which has been at the forefront of enterprise GenAI since its inception in early 2021. The company’s dedication to pushing the boundaries of AI evaluation is evident in the nearly year-long intensive R&D process that led to Luna’s creation.
Purpose-built models redefine speed, cost, and accuracy
At the heart of Luna are purpose-built small language models, each tailored to a specific evaluation task such as hallucination detection, context quality assessment, data leakage prevention, or malicious prompt identification. According to Galileo, this specialization is what lets Luna improve on general-purpose models across three key metrics: speed, cost, and accuracy.
“Luna surpasses GPT-3.5 in speed, cost, and accuracy through several innovations,” Chatterji explained. “Luna utilizes purpose-built small language models that are tailored for specific evaluation tasks, significantly reducing computational overhead and cost. This design choice allows for evaluations that are 97% cheaper and 11x faster than those performed with GPT-3.5.”
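To put the quoted figures in concrete terms, "97% cheaper" means paying 3% of the baseline cost, and "11x faster" means roughly 9% of the baseline latency. The short sketch below applies those two ratios to a baseline; the function name and the baseline numbers are hypothetical and chosen only for illustration, not figures reported by Galileo.

```python
def apply_quoted_ratios(baseline_cost, baseline_latency_s,
                        cost_reduction=0.97, speedup=11.0):
    """Apply the quoted '97% cheaper' and '11x faster' ratios to a baseline.

    Returns (cost, latency_s) after the reduction. The default ratios come
    from the quote; the baseline values are whatever the caller assumes.
    """
    cost = baseline_cost * (1.0 - cost_reduction)  # 97% cheaper -> 3% of baseline
    latency_s = baseline_latency_s / speedup       # 11x faster -> 1/11 of baseline
    return cost, latency_s


# Hypothetical baseline: $1,000 per million evaluations, 2.2 s per evaluation.
cost, latency_s = apply_quoted_ratios(1000.0, 2.2)
print(f"cost: ${cost:.2f} per million, latency: {latency_s:.2f} s")
```

With that assumed baseline, the same workload would cost about $30 per million evaluations and take about 0.2 seconds per evaluation.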
But it’s not just about speed and cost. Galileo says Luna is also more accurate, outperforming previous evaluation methods by up to 20% in detecting hallucinations, prompt injections, personally identifiable information (PII), and more. “Multi-headed small language models and advanced techniques like intelligent chunking ensure that Luna models maintain context better and provide more accurate evaluations,” Chatterji added.
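The article does not describe how Luna's "intelligent chunking" works. As a rough illustration of the general idea of chunked evaluation, the sketch below splits a long context into fixed-size overlapping windows, scores a response against each window, and keeps the best score. Everything here is an assumption made for illustration: the function names, the window sizes, and especially the toy word-overlap scorer, which stands in for a real evaluation model and is not Galileo's method.

```python
from typing import Callable, List


def chunk_tokens(tokens: List[str], size: int, overlap: int) -> List[List[str]]:
    """Split tokens into windows of `size` that share `overlap` tokens."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):  # last window already reaches the end
            break
    return chunks


def toy_overlap_scorer(chunk: str, response: str) -> float:
    """Placeholder scorer: fraction of response words found in the chunk."""
    response_words = set(response.lower().split())
    if not response_words:
        return 0.0
    return len(response_words & set(chunk.lower().split())) / len(response_words)


def max_chunk_score(context: str, response: str,
                    scorer: Callable[[str, str], float],
                    size: int = 200, overlap: int = 50) -> float:
    """Score the response against every chunk and return the highest score."""
    chunks = chunk_tokens(context.split(), size, overlap)
    return max((scorer(" ".join(c), response) for c in chunks), default=0.0)


context = "the cat sat on the mat near the door"
print(max_chunk_score(context, "cat sat", toy_overlap_scorer, size=4, overlap=1))
```

Taking the maximum treats a response as supported if any single chunk supports it. The overlap keeps a sentence that straddles a window boundary intact in at least one chunk.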
Revolutionizing evaluation without ground truth datasets
One of the most remarkable aspects of Luna is its ability to operate without the need for traditional ground truth datasets. By leveraging pre-trained evaluation models fine-tuned on diverse, domain-specific datasets, Luna eliminates the time-consuming and costly process of creating custom test sets. This innovation streamlines the evaluation process and reduces dependence on extensive human-generated data.
The potential applications of Luna are vast, with Chatterji highlighting its relevance in industries that demand high reliability and speed in AI evaluations. “Luna is especially powerful in large-scale enterprise applications where volume and throughput are necessary (i.e. millions of queries per month). We’re seeing Fortune 100 enterprises in healthcare, finance, and telecom find Luna particularly useful,” he said.
Customization and continuous evolution in the face of rapid GenAI advancements
Use cases range from real-time monitoring of AI outputs and detecting hallucinations in AI-generated content to ensuring the safety and quality of chatbot interactions. And with Galileo’s Fine Tune product, Luna can be customized to meet specific customer requirements, achieving accuracy levels of 95% or higher for critical tasks in industries such as pharmaceuticals and financial services.
As the generative AI landscape continues to evolve rapidly, Galileo remains committed to staying at the forefront of innovation. Chatterji emphasized that Luna will scale in three key ways: expanding support for more evaluation task types, continually improving accuracy, and further reducing cost and latency.
“Galileo is committed to pushing the boundaries of what’s possible in AI evaluation and helping organizations bring trustworthy AI to production,” Chatterji said. “As the landscape of generative AI continues to evolve, Galileo remains dedicated to providing its clients with cutting-edge evaluation capabilities that make AI practical for businesses to deploy and inspire confidence and trust amongst consumers.”
With the launch of Luna, Galileo has solidified its position as a leader in enterprise GenAI evaluation. As more organizations seek to harness the power of generative AI, Luna’s ability to deliver fast, cost-effective, and accurate evaluations will be a critical factor in driving widespread adoption and unlocking the full potential of this transformative technology.
Author: Michael Nuñez
Source: VentureBeat
Reviewed By: Editorial Team