Published on January 4, 2024, 9:26 am
The demand for generative AI initiatives is driving chipmakers, hyperscalers, and other IT providers to rush to deliver the processors enterprises need. However, a potential barrier is emerging this year: a scarcity of graphics processing units (GPUs). Last year's enthusiasm for generative AI adoption has created demand for GPUs that could outpace supply.
During a Senate hearing in May, OpenAI CEO Sam Altman acknowledged the shortage of GPUs, stating that "we're so short on GPUs" and even suggesting that "the less people [that] use our products, the better." Cloud providers such as Microsoft and AWS have also raised concerns about GPU supply chain shortages.
For CIOs, preparing for AI adoption goes beyond securing GPUs. It means gathering the data, talent, and tools required to fine-tune models. CIOs must also navigate a range of cloud-based AI services and software-as-a-service add-ons to establish a solid foundation. Tom Reuner from HFS Research stresses that while GPU constraints are real, the bigger challenge lies in building a strong business case for AI adoption.
Fortunately, most CIOs are shielded from short-term chip availability fluctuations by their cloud providers. Companies looking to purchase GPUs for on-premises capabilities might face waitlists, but the major concern lies with vendors that train models, such as cloud service providers. Even so, these supply chain bottlenecks can still affect enterprises if they cannot acquire the servers they need or receive cloud services in a timely manner.
Although CIOs typically do not purchase chips directly from manufacturers like Nvidia or Intel, they obtain processing power indirectly through partnerships with hardware vendors such as Dell, HP, and Microsoft. Given the surge in demand for GPU-powered generative AI tools, it is crucial for CIOs planning broad deployments to discuss capacity forecasts with their cloud providers.
Despite concerns about chip shortages and limited manufacturing capacity among the industry's key players, such as TSMC, Samsung, and GlobalFoundries, enterprise customers and CIOs have not yet experienced significant issues. Erik Brown from West Monroe emphasizes that for most of his clients there is no GPU shortage, as they focus on using cloud providers effectively and exploring GPU-tuned cloud offerings.
Analysts believe that smaller, more contained models can help reduce GPU consumption. Microsoft's Phi suite, released in December, exemplifies this approach. Gartner envisions the development of small language models, restricted to certain users, that may not depend heavily on public cloud resources. This trend enables models to run in distributed clouds or on-premises applications with lower compute requirements.
As CIOs become more knowledgeable about sourcing GPUs and implementing LLM tools, the industry is likely to rely less on GPU resources. Skilled prompt engineering and a clear understanding of what off-the-shelf models can do are critical in many use cases. And while fine-tuning a model's parameters can be challenging, it has a far narrower scope than training a model from scratch.
In the event of a GPU shortage, CIOs will need to choose between waiting for improved supply or finding substitute products. There may also be a need to adjust project ambitions and focus on a smaller number of AI initiatives due to constrained resources.
Overall, while GPU scarcity poses challenges for generative AI initiatives, enterprises can navigate it by leveraging partnerships with cloud providers, optimizing workloads, and exploring alternative models that reduce dependence on GPUs. With strategic planning and adjustments to resource allocation, CIOs can continue driving innovation in their organizations' AI projects even amid potential shortages.