A new way to optimize and prioritize AI projects for the GPU shortage

September 3, 2023

Generative AI, enabled by large language models (LLMs) like GPT-4, has caused shockwaves in the tech world. ChatGPT’s meteoric rise has triggered the global tech industry to reassess and prioritize gen AI, reshaping product strategies in real time.

Integration of LLMs has given product developers an easy way to incorporate AI-powered features into their products. But it’s not all smooth sailing. A glaring challenge looms large for product leaders: the GPU shortage and spiraling costs.

Rise of LLMs and GPU shortage

The increasing number of AI startups and services has led to high demand for high-end GPUs such as A100s and H100s, overwhelming Nvidia and its manufacturing partner TSMC, both of which are struggling to keep up with demand. Online forums like Reddit are abuzz with frustrations over GPU availability, echoing the sentiment across the tech community. It’s become so dire that both AWS and Azure have had no choice but to implement quota systems.

This bottleneck doesn’t just squeeze startups; it’s a stumbling block for tech giants like OpenAI. At a recent off-the-record meeting in London, OpenAI’s CEO Sam Altman candidly acknowledged that the computer chip shortage is stymieing ChatGPT’s advancement. Altman reportedly lamented that the dearth of computing power has resulted in subpar API availability and has obstructed OpenAI from rolling out larger “context windows” for ChatGPT.

Prioritizing AI features

On the one hand, product leaders are caught in a relentless push to innovate and are expected to deliver cutting-edge features that leverage the power of gen AI. On the other hand, they grapple with the harsh realities of GPU capacity constraints. It’s a complex juggling act, where ruthless prioritization becomes not just a strategic decision but a necessity.

Given that GPU availability is poised to remain a challenge for the foreseeable future, product leaders must think strategically about GPU allocation. Traditionally, product leaders have leaned on prioritization techniques like the Customer Value/Need vs. Effort Matrix. This method, however logical in a world where computational resources were abundant, now demands a bit of reevaluation.

In our current paradigm, where compute is the constraint and not software talent, product leaders must redefine how they prioritize various products or features, bringing GPU limitations to the forefront of strategic decision-making.

Planning around capacity constraints might seem unusual for the tech industry, but it’s a commonplace strategy in other industries. The underlying concept is straightforward: The most valuable factor is the time spent on the constrained resource, and the objective is to optimize the value per unit of time spent on that constraint.

Technology success metrics

As a former consultant, I’ve successfully applied this framework across various industries. I believe that tech product leaders can also use a similar approach to prioritize products or features while GPU constraints exist. When applying this framework, the most straightforward measure of value is profitability.

However, in tech, profitability might not always be the appropriate metric, particularly when venturing into a new market or product. Thus, I have adapted the framework to align with the success metrics generally used in tech, outlining a simple four-step process:

1. Contribution

First and foremost, identify your North Star metric. This is the contribution of each product or feature, something that encapsulates the essence of its worth. Some concrete examples might include:

An increase in revenue and profit
Gains in market share
Growth in the number of daily/monthly active users

2. Number of GPUs required

Gauge the number of GPUs needed for each product or feature. Focus on key factors including:

Number of queries per user per day
Number of daily active users
Complexity of the query (how many tokens each query consumes)
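The three factors above can be combined into a rough, back-of-the-envelope GPU estimate. The sketch below is illustrative only: the function name, the `tokens_per_gpu_per_second` throughput figure and the `peak_to_average` headroom factor are assumptions rather than part of the framework, and real sizing would depend on the model, hardware and latency targets.

```python
import math


def estimate_gpus(daily_active_users, queries_per_user_per_day,
                  tokens_per_query, tokens_per_gpu_per_second,
                  peak_to_average=1.0):
    """Rough GPU count needed to serve a feature's token load.

    tokens_per_gpu_per_second and peak_to_average are assumed inputs;
    measure them for your own model and traffic pattern.
    """
    tokens_per_day = (daily_active_users
                      * queries_per_user_per_day
                      * tokens_per_query)
    avg_tokens_per_second = tokens_per_day / 86_400  # seconds in a day
    required_throughput = avg_tokens_per_second * peak_to_average
    # Round up: a fraction of a GPU still means provisioning a whole one.
    return math.ceil(required_throughput / tokens_per_gpu_per_second)


# Hypothetical feature: 1M daily active users, 10 queries per user per day,
# 500 tokens per query, and an assumed 1,000 tokens/second per GPU.
baseline = estimate_gpus(1_000_000, 10, 500, 1_000)
with_headroom = estimate_gpus(1_000_000, 10, 500, 1_000, peak_to_average=2.0)
print(baseline, with_headroom)
```

Doubling any one of the three factors roughly doubles the estimate, which is why query complexity matters as much as user counts.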

3. Calculate contribution per GPU

Break it down to the specifics. How does each GPU contribute to the overall goal? Understanding this will give you a clear picture of where your GPUs are best allocated.

4. Prioritize products based on contribution per GPU

Now, it’s time to make the tough decisions. Rank your products by their Contribution per GPU, and then line them up accordingly. Focus on the products with the highest Contribution per GPU first, ensuring that your limited resources are channeled into the areas where they’ll make the most impact.

With GPU constraints no longer a blind spot but a quantifiable factor in the decision-making process, your company can more strategically navigate the GPU shortage. To bring this framework to life, let’s visualize a scenario where you, as a product leader, are grappling with the challenge of prioritizing among four different products:

	Product A	Product B	Product C	Product D
Revenue Potential (Contribution)	$100M	$80M	$50M	$25M
Number of GPUs Required	1,000	450	500	50
Contribution Per GPU	$0.1M/GPU	$0.18M/GPU	$0.1M/GPU	$0.5M/GPU
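The per-GPU figures in the table are simply each product's contribution divided by its GPU requirement. A minimal sketch of that calculation, using the table's numbers (the dictionary layout and names are illustrative choices, not part of the framework):

```python
# Revenue potential in $M and GPUs required, taken from the table above.
products = {
    "Product A": {"revenue_musd": 100, "gpus": 1000},
    "Product B": {"revenue_musd": 80, "gpus": 450},
    "Product C": {"revenue_musd": 50, "gpus": 500},
    "Product D": {"revenue_musd": 25, "gpus": 50},
}

# Contribution per GPU = contribution / number of GPUs required.
per_gpu = {
    name: p["revenue_musd"] / p["gpus"]
    for name, p in products.items()
}

# Print from highest to lowest contribution per GPU.
for name, value in sorted(per_gpu.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: ${value:.2f}M per GPU")
```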

Although Product A has the highest revenue potential, it doesn’t yield the highest contribution per GPU. Surprisingly, Product D, with the least revenue potential, offers the most substantial return per GPU. By prioritizing based on this metric, you could maximize total potential revenue.

Let’s say you have a total of 1,000 GPUs at your disposal. A straightforward choice might have you opting for Product A, generating a revenue potential of $100 million. However, by applying the prioritization strategy described above, you could achieve $155 million in revenue:

Priority Order	Product	Revenue Gain	GPUs
1	Product D	$25M	50
2	Product B	$80M	450
3	Product C	$50M	500
Total		$155M	1,000
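The allocation above can be reproduced with a simple greedy pass: rank products by contribution per GPU, then fund each one in turn if it still fits within the GPU budget. The sketch below assumes products are funded all-or-nothing (no partial allocations); the function and field names are hypothetical.

```python
def prioritize(products, gpu_budget):
    """Fund products in descending contribution-per-GPU order while they fit.

    Assumes all-or-nothing funding: a product is skipped if its full GPU
    requirement does not fit in the remaining budget.
    """
    ranked = sorted(products,
                    key=lambda p: p["contribution"] / p["gpus"],
                    reverse=True)
    funded, total, remaining = [], 0, gpu_budget
    for product in ranked:
        if product["gpus"] <= remaining:
            funded.append(product["name"])
            total += product["contribution"]
            remaining -= product["gpus"]
    return funded, total, remaining


# Revenue potential in $M and GPU requirements from the scenario above.
portfolio = [
    {"name": "Product A", "contribution": 100, "gpus": 1000},
    {"name": "Product B", "contribution": 80, "gpus": 450},
    {"name": "Product C", "contribution": 50, "gpus": 500},
    {"name": "Product D", "contribution": 25, "gpus": 50},
]

funded, total, remaining = prioritize(portfolio, gpu_budget=1000)
print(funded, total, remaining)
# → ['Product D', 'Product B', 'Product C'] 155 0
```

A greedy ranking like this is a heuristic. It works cleanly here because the top-ranked products exactly fill the budget, but with awkwardly sized products it can leave GPUs idle, so the result is worth a sanity check against a few alternative combinations.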

The same method can be applied to other contribution metrics, such as market share gain:

	Product A	Product B	Product C	Product D
Market Share Gain (Contribution)	5%	4%	2.5%	1.25%
Number of GPUs Required	1,000	450	500	50
Contribution Per GPU	0.005%/GPU	0.0089%/GPU	0.005%/GPU	0.025%/GPU

Similarly, selecting Product A would have led to a market share gain of 5%. However, applying the prioritization strategy described above, you could achieve 7.75% in market share gain:

Priority Order	Product	Market Share Gain	GPUs
1	Product D	1.25%	50
2	Product B	4%	450
3	Product C	2.5%	500
Total		7.75%	1,000

Benefits and limitations

This alternative prioritization framework introduces a more nuanced and strategic approach. By zeroing in on the Contribution Per GPU, you’re strategically aligning resources where they can make the most substantial difference, whether in terms of revenue, market share or any other defining metric.

But the advantages don’t stop there. This method also fosters a greater sense of clarity and objectivity across product teams. In my experience, including my early days leading digital transformation at a healthcare company and later while working with various McKinsey clients, this approach has been a game-changer in scenarios where capacity constraints are a critical factor. It’s enabled us to prioritize initiatives in a more data-driven and rational way, sidelining the traditional politics where decisions might otherwise fall to the loudest voice in the room.

However, no one-size-fits-all solution exists, and it’s worth acknowledging the potential limitations of this method. For instance, this approach may not always capture the strategic importance of certain investments. Thus, while exceptions to the framework can and should be made, they ought to be carefully considered and remain exceptions rather than the norm. This maintains the integrity of the process and ensures that any deviations are made with a broader strategic context in mind.

Conclusion

Product leaders are facing an unprecedented situation with the GPU shortage, and they need new ways of managing resources. In the words of the great strategist Sun Tzu, “In the midst of chaos, there is also opportunity.”

The GPU shortage is indeed a challenge, but with the right approach, it may also be a catalyst for differentiation and success. By ranking products and features on Contribution Per GPU, companies can maximize their return on investment, aligning resources where they’ll make the most impact and focusing on what matters the most to the long-term success of their company.

Prerak Garg is senior director of cloud and AI corporate strategy at Microsoft and a former McKinsey and Company engagement manager.

Author: Prerak Garg, Microsoft
Source: Venturebeat
Reviewed By: Editorial Team