
OctoAI launches OctoStack for enterprises to customize, deploy private AI models

OctoStack for Enterprises with Private Generative AI Models

Seattle-based OctoAI has a new offering called OctoStack, designed to help enterprises deploy private generative AI models. Companies can run this “turn-key production platform” in a virtual private cloud or on-premises, with access to highly optimized inference, model customization and asset management. In doing so, OctoAI wants to give companies the freedom to build and run gen AI applications in the way they see fit.

“Enabling customers to build viable and future-proof Generative AI applications requires more than just affordable cloud inference,” Luis Ceze, OctoAI’s chief executive, said in a statement. “Hardware portability, model onboarding, fine-tuning, optimization, load-balancing — these are full-stack problems that require full-stack solutions.”

OctoStack supports fine-tuning and deployment of a range of open source and commercial AI models, such as Meta’s Llama family, Mistral’s Mixtral 8x7B and Stable Diffusion models. However, it doesn’t include Anthropic’s Claude, because that model is only offered in the cloud via Anthropic. “But we offer a lot of these super capable open source models that you can fully control and customize for,” Ceze said.

How the OctoStack platform works in the enterprise. Image credit: OctoAI

From Fully Managed to Do-It-Yourself

This isn’t the first attempt by the startup to provide companies with a packaged AI offering. Last year, OctoAI released its self-optimizing infrastructure service. As Ceze explains, the difference is that the feature introduced back then is now a fully managed solution. “That means that you call our APIs, offers highly efficient inference, and we have support for customizing the model,” he told VentureBeat. “We have support for building model cocktails and so on, all with the enterprise and production in mind.”

In contrast, OctoAI’s OctoStack is a self-managed offering. Ceze said that once the company’s customers started to “do many billions of tokens a day” and there were “millions of images” generated on the platform daily, it became clear there was a need “for more private deployments of our technology.” It’s comparable to having your blog hosted on a managed platform versus running it on your own private server, an analogy Ceze didn’t dispute.

“As enterprises start getting serious about deploying AI, they’re nervous about sending data over an API outside their control,” Ceze said. “What we do with OctoStack is they can choose their model, customize their models, and offer that as a totally private API. And we provide all of the infrastructure for it. That means we take care of how the model becomes reliable and efficient across their GPUs.”

Hundreds of customers currently use OctoAI’s fully managed solution, but Ceze declined to share how many have signed up for OctoStack. Instead, he referred me to those listed in the company’s press release, including Otherside AI, Latitude Games and CapitalAI. However, I’m told the companies being targeted are those that are already experimenting with gen AI tools and are now looking to deploy these models into a production environment.

A Wide Open Market for Enterprise AI

There is a tremendous opportunity for generative AI adoption within the enterprise. A Menlo Ventures report highlighted that $400 billion was spent on cloud software in this space last year. Seventy billion of that investment went to AI (18 percent). Gen AI made up $2.5 billion, less than 1 percent.
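As a quick check on those shares, the arithmetic can be sketched as follows. The dollar amounts are the ones quoted above from the Menlo Ventures report; the variable and function names are illustrative only, and the last ratio (gen AI as a share of AI spending) is derived here rather than stated in the report as quoted.

```python
# Figures as quoted above from the Menlo Ventures report, in billions of US dollars.
cloud_software_spend = 400.0
ai_spend = 70.0
gen_ai_spend = 2.5


def share_percent(part: float, whole: float) -> float:
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


# AI as a share of cloud software spending (the article rounds this to 18 percent).
ai_share = share_percent(ai_spend, cloud_software_spend)

# Gen AI as a share of cloud software spending (the article says "less than 1 percent").
gen_ai_share = share_percent(gen_ai_spend, cloud_software_spend)

# Derived here, not quoted in the article: gen AI as a share of AI spending.
gen_ai_share_of_ai = share_percent(gen_ai_spend, ai_spend)

print(f"AI share of cloud software spend:     {ai_share:.3f}%")
print(f"Gen AI share of cloud software spend: {gen_ai_share:.3f}%")
print(f"Gen AI share of AI spend:             {gen_ai_share_of_ai:.3f}%")
```

On these numbers, gen AI is a small slice even of the AI budget, which is the gap the article describes as the opportunity.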

Enterprise investment in generative AI is small compared to enterprise budgets for traditional AI and cloud software. Image credit: Menlo Ventures

“The current usage and availability of Generative AI in the enterprise is technically high, with over half of CIOs having some plans to formally deploy Generative AI and the popularity of services and models such as Microsoft Copilot, ChatGPT, Midjourney and many others,” Amalgam Insights chief executive and analyst Hyoun Park shared with me. “But the capabilities associated with customization, fine-tuning and augmenting models is still low…”


Constellation Research founder and principal analyst Ray Wang shares that right now, “most organizations are trying to optimize for a multi-vendor world, hence there have not been any pure-gen AI stacks. Bringing your own app frameworks, models and data is the predominant approach.” He describes OctoStack as a good thing because “it’s easier to have the stack in one place.”

OctoAI may not have long to rest on its laurels. It faces stiff competition not only from fellow startups, but also from enterprise incumbents, including Nvidia, Databricks and SambaNova Systems. Ceze says he’s not worried: “I’m pretty sure this is a hot space and we have to expect that others who will have offerings that compete with this and the way we continue to differentiate is again using our unique expertise in doing cross-tech optimizations. That’s the DNA of our company. That’s how we started.”

Author: Ken Yeung
Source: VentureBeat
Reviewed By: Editorial Team
