
Spell, Graphcore partner to build next-gen AI infrastructure

February 8, 2022


Two 5-year-old startups — one each from the U.K. and the U.S. — today announced a partnership to design and build what they describe as “the next generation of AI infrastructure.”

New York City-based Spell, which operationalizes deep learning at scale for natural language processing (NLP), machine vision, and speech recognition applications, is partnering with Bristol-based Graphcore, developer of a microprocessor designed for next-generation AI computing.

Graphcore is the inventor of what it calls the intelligence processing unit (IPU), a sophisticated microprocessor specifically designed for current and next-generation artificial intelligence workloads. Graphcore’s IPU-POD data center systems, for scale-up and scale-out AI computing, offer the ability to run large models across multiple IPUs or to share the computing resources between different users and workloads. An IPU-POD enables all the IPUs in servers to communicate and synchronize their connections.

Spell will bring on-demand access to IPUs and help unify the ever-expanding AI ecosystem, which is reaching into applications not dreamed of only a few years ago. Graphcore is one of the most prominent AI chip makers in the world and is backed by Sequoia Capital, Microsoft, Dell, Samsung, BMW iVentures, and Robert Bosch Venture Capital, among others.

Central processing units (CPUs) were designed for office apps, GPUs (vector processors) for graphics, and IPUs (graph processors) for machine intelligence. IPUs have a structure that provides efficient massive compute parallelism hand in hand with huge memory bandwidth.

Making AI development easier to do

The two companies will build a new AI hardware-software package that integrates Graphcore’s IPU-POD scale-out systems with Spell’s eponymous, hardware-agnostic MLOps software platform for deep learning (DLOps) to make advanced AI development faster, easier, and less expensive.

“There are lots of businesses that can’t really take advantage of deploying ML and AI very well, and a lot of the infrastructure that exists for deployment are in the possession of big companies, such as Facebook and Microsoft,” Matt Fyles, SVP of software at Graphcore, told VentureBeat. “There are not a lot of products that smaller companies can take advantage of in those types of deployments and for that use of hardware, and that’s where Spell comes in.”

With the emergence of AI centered on large neural nets, transformers, and other advanced AI models, there is a growing demand for more specialized AI computing instances. These include Graphcore IPU systems, which accelerate model training, help lower cost, and enable new breakthroughs in ML, Fyles said. Conventional MLOps tools for traditional machine learning do not provide the abstraction and automation needed to enable and assure compliance, reproducibility, and resource management, both human and computational, for deep learning AI, Fyles said.

Graphcore and Spell are offering a free trial of this powerful hardware and software combination to commercial AI practitioners and academic researchers, Spell CEO and cofounder Serkan Piantino told VentureBeat.

“Graphcore’s IPU has consistently demonstrated its ability to accelerate the most widely used AI models and workloads,” Piantino said.

Graphcore said it will use Spell Workspaces to give developers free access (limited to six hours of total compute time) to both interactive Jupyter notebooks and direct remote execution on IPUs in the cloud. The offering is supported by a range of quickstart code tutorials built on popular AI models, including natural language models (e.g., BERT), computer vision models (e.g., EfficientNet, ResNet), and graph neural networks (e.g., temporal graph networks).

Analysis from Gartner Research

Gartner predicts that by 2025, AI will be the top category driving infrastructure decisions due to the maturation of the AI market, resulting in a 10-fold growth in compute requirements, Gartner analyst Chirag Dekate told VentureBeat, citing the firm’s report “Predicts 2021: Operational AI Infrastructure and Enabling AI Orchestration Platforms.”

Where does Graphcore/Spell play in a category dominated by Nvidia, which owns more than 80% of the market?

“We had the opportunity to highlight Spell.ml’s capabilities in a recent Cool Vendor research note, ‘Cool Vendors in Enterprise AI Operationalization and Engineering,’” Dekate said. “Spell seeks to transform the entire end-to-end infrastructure-agnostic MLOps platform for experiment orchestration, comparing multiple experiments, hyperparameter optimization, model catalog, deployment, and governance of models.

“Spell’s approach is infrastructure-agnostic, enabling data and analytics leaders to unlock innovation across hybrid multicloud contexts. Spell is currently differentiated from its peers in part because it can operate across any on-premises environment or critical cloud service providers, including Amazon Web Services, Google Cloud Platform, and Microsoft Azure Cloud,” Dekate said.

Spell’s platform comprises powerful command-line tools that automate the packaging of data and models and orchestrate execution across a hybrid multicloud context, Dekate said.

“Further, Spell enables data scientists to leverage existing Python notebooks and exposes a notebook-friendly environment that is easy to utilize,” Dekate said. “Spell offers feature-rich dashboards that allow data and analytics leaders to manage deep learning models across a diversified hybrid multicloud estate to orchestrate and productionalize AI pipelines.”

More details from Dekate’s analysis of the upcoming Graphcore-Spell solution:

Open source: Kubeflow/MLflow-like environments enable orchestration of end-to-end ML pipelines.
Managing the end-to-end ML pipeline: These ecosystems enable easier integration and utilization of GPUs using standard Kubernetes stacks that can be provisioned quickly in any cloud or on-premises environment.
Cloud-based environment: The easiest and possibly most effective way to utilize GPUs. Here GPUs are deeply integrated across major cloud MLOps environments, including Amazon SageMaker, Google Vertex AI, and Azure. Container management across these environments is deeply integrated and optimized for their container-native stacks (AKS, EKS, and GKE).

“As the maturity in the AI infrastructure landscape accelerates, technologies including DNN ASICs (Graphcore, Cerebras, SambaNova, etc.) are increasingly positioned to deliver a differentiated approach to value capture from AI. Spell’s support for Graphcore integration enables enterprises to focus their energies on value capture from AI by enabling all of the above capabilities into a Graphcore user experience,” Dekate said.


Author: Chris J. Preimesberger
Source: VentureBeat
