Officially dubbed Hailo-10, the processor is designed to deploy gen AI applications across edge devices, like cars and commercial robots, without relying on cloud data centers. This, Hailo claims, will not only help maximize the performance of models on the devices but also deliver significant benefits in terms of cost and energy savings.
In addition to the new gen AI-focused chip, which is set to ship in the second quarter of 2024, Hailo also announced it has extended its Series C round with an additional investment of $120 million. The round, led by new and existing investors from different industries, values the company at $1.2 billion, Hailo confirmed to VentureBeat.
“The closing of our new funding round enables us to leverage all the exciting opportunities in our pipeline while setting the stage for our long-term future growth. Together with the introduction of our Hailo-10 GenAI accelerator, it strategically positions us to bring classic and generative AI to edge devices in ways that will significantly expand the reach and impact of this remarkable new technology,” Hailo co-founder and CEO Orr Danon said in a statement.
“We designed Hailo-10 to seamlessly integrate GenAI capabilities into users’ daily lives, freeing users from cloud network constraints. This empowers them to utilize chatbots, copilots, and other emerging content generation tools with unparalleled flexibility and immediacy, enhancing productivity and enriching lives,” he emphasized.
What to expect from the Hailo-10 gen AI accelerator?
From generating marketing content to engaging in full-fledged conversations with customers, generative AI is already making a major impact at different levels of the enterprise. The technology is evolving, but one area where its full potential is yet to be achieved is at the edge. Imagine a robot talking like a human and doing everything one asks without dedicated programming — Hailo wants to bring capabilities like this to life with its new gen AI processor.
While AI hardware running in cloud data centers can power edge use cases, cloud computing requires data to be sent to and processed on a remote server, which can make such use cases difficult to execute and leave apps suffering from latency issues at times. Hailo-10 bridges this gap by running gen AI services directly on targeted devices, supercharging their current and future CPUs.
“Whether users employ gen AI to automate real-time translation or summarization services, generate software code, or images and videos from text prompts, Hailo-10 lets them do it directly on their PCs or other edge systems, without straining the CPU or draining the battery,” Danon noted.
According to the CEO, the biggest highlight is Hailo-10's ability to run edge gen AI workloads at the best performance-to-cost and performance-to-power-consumption ratios. The new chip, which uses the same software suite bundled with Hailo-8 and Hailo-15, can run Llama2-7B at up to 10 tokens per second while drawing under 5W of power. When working with Stable Diffusion 2.1, it takes under 5 seconds to produce an image while consuming nearly the same amount of power.
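To put those figures in energy terms, the short Python sketch below converts them into joules per token and joules per image. It is a back-of-the-envelope illustration, not a Hailo benchmark: the function names are hypothetical, and it assumes the chip draws the full 5W while hitting the quoted peak of 10 tokens per second and the full 5 seconds per image, so the results are rough bounds rather than measurements.

```python
def joules_per_token(power_watts: float, tokens_per_second: float) -> float:
    """Energy per generated token: watts are joules per second, so divide by tokens per second."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return power_watts / tokens_per_second


def joules_per_image(power_watts: float, seconds_per_image: float) -> float:
    """Energy per generated image: power multiplied by generation time."""
    if seconds_per_image < 0:
        raise ValueError("seconds_per_image must not be negative")
    return power_watts * seconds_per_image


# Assumed inputs, taken from the figures quoted in the article (all are upper bounds or peaks).
POWER_WATTS = 5.0          # "under 5W"
LLAMA2_TOKENS_PER_S = 10.0  # "up to 10 tokens per second"
SD_SECONDS_PER_IMAGE = 5.0  # "under 5 seconds"

llama_energy = joules_per_token(POWER_WATTS, LLAMA2_TOKENS_PER_S)
image_energy = joules_per_image(POWER_WATTS, SD_SECONDS_PER_IMAGE)

print(f"Llama2-7B: about {llama_energy} J per token at the quoted peak rate")
print(f"Stable Diffusion 2.1: under about {image_energy} J per image")
```

Under these assumptions, a token costs roughly half a joule and an image costs less than about 25 joules. Real figures would depend on the sustained token rate and actual power draw, neither of which the article reports.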
“As GenAI on the edge becomes immersive, the focus turns to handling large LLMs in the smallest possible power envelope — essentially less than five watts,” Danon added. Overall, the chip is capable of delivering up to 40 TOPS (tera operations per second), which Hailo claims is a new performance standard for edge AI accelerators.
Notably, Nvidia’s Jetson line of modules also handles edge AI workloads, with performance going up to 275 TOPS, but that higher performance comes with higher power consumption. A comparable offering from Nvidia that targets entry-level edge AI apps is Jetson Orin Nano, which delivers up to 40 TOPS of AI performance with power options between 5W and 15W.
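One rough way to compare such figures is peak TOPS per watt. The Python sketch below computes that ratio for the numbers quoted here. It rests on loud assumptions: the article does not say at what power Hailo-10 reaches 40 TOPS, so the 5W figure is borrowed from Danon's "less than five watts" remark, and the Jetson Orin Nano's peak of 40 TOPS is unlikely to be available in its lowest 5W power mode, so the upper end of its range is optimistic. The helper name is hypothetical.

```python
def tops_per_watt(peak_tops: float, power_watts: float) -> float:
    """Peak tera-operations per second delivered per watt of power."""
    if power_watts <= 0:
        raise ValueError("power_watts must be positive")
    return peak_tops / power_watts


# Figures quoted in the article; pairing peak TOPS with a given wattage is an assumption.
hailo_10 = tops_per_watt(40, 5)             # assumes 40 TOPS is reached at about 5W
orin_nano_at_15w = tops_per_watt(40, 15)    # peak TOPS at the highest power option
orin_nano_at_5w = tops_per_watt(40, 5)      # optimistic bound: peak TOPS at the lowest option

print(f"Hailo-10 (assumed 5W): {hailo_10:.2f} TOPS/W")
print(f"Jetson Orin Nano: {orin_nano_at_15w:.2f} to {orin_nano_at_5w:.2f} TOPS/W")
```

Peak TOPS per watt says little about real inference throughput, which also depends on memory bandwidth, numeric precision, and software support, so this ratio is only a first-order comparison.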
Danon didn’t say much when VentureBeat reached out about performance gains over comparable Nvidia offerings. However, he did note that Hailo-10 is significantly more effective for inferencing than a GPU-based solution, both in terms of cost and in terms of power consumption, allowing it to be integrated into more compact systems at the edge. As for Intel, Hailo claims that the new processor delivers at least two times more performance at half the power of Intel’s Core Ultra NPU.
“We believe that the unique advantages of our dataflow-based architecture, which delivers high performance at lower power consumption and lower cost, together with the breadth of support to a variety of AI-based applications across different modalities, puts us in a unique position to serve the emerging edge AI market. We are the only vendor in the market that offers a mature hardware and software stack that can support multiple edge platforms with high-performance AI-based analytics, generative AI and AI-based image and sound enhancement, all for a few Watts of power consumption,” he told VentureBeat.
Initial industries to be targeted
When Hailo-10 begins shipping in Q2 2024, it will be targeted at the PC and auto infotainment categories. It remains unclear when it will be adopted for broader applications like gen AI-powered robots – something that Nvidia is already pursuing with its GR00T project.
Currently, Hailo works with 300 global customers across the compute, automotive, security, industry 4.0, retail, medical and other sectors. This includes some big players such as NEC, Bosch, Schneider Electric, Dell, ABB and Foxconn.
Author: Shubham Sharma
Source: VentureBeat
Reviewed By: Editorial Team