Published on January 24, 2024, 10:05 am

Artificial Intelligence (AI) has become an integral part of many businesses, but training and deploying generative AI models can be costly, especially when scaling them to millions of customers. Microsoft, in its continuous effort to improve efficiency, has been investing in smaller and more affordable AI models.

According to The Information, Microsoft has formed a new team called GenAI dedicated to developing these smaller conversational AI models. These “small language models” (SLMs) are designed to replicate the quality of larger language models like OpenAI’s GPT-4 while using significantly less computing power. Microsoft aims to reduce computing costs by having SLMs handle simple queries for chatbots like Bing and Windows Copilot.

To bolster its GenAI team, Microsoft has reassigned several top AI developers from its research group. Sebastien Bubeck, who contributed to Microsoft’s flagship SLM Phi-2 (released as open source), is among those leading the charge. Phi-2 was even reported to outperform Google’s commercial SLM Gemini Nano.

Misha Bilenko, corporate vice president, leads the GenAI team under the guidance of CTO Kevin Scott. However, this is not Microsoft’s sole endeavor in language models; the company also maintains a Turing team focused on large-scale language models. These Turing models are integrated into Copilot products, often in conjunction with OpenAI models. Microsoft’s strategy is to use its own models for less complex tasks, keeping costs down while still handling those tasks effectively.
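The idea of sending simple requests to a small model and everything else to a large one can be illustrated with a minimal sketch. Everything in it is hypothetical: the model labels, the relative cost figures, and the word-count heuristic are assumptions chosen for illustration, and nothing here describes how Microsoft actually routes queries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    cost_per_query: float  # hypothetical relative cost units, not real prices


# Placeholder labels, not real product or model names.
SMALL = Model("small-model", 1.0)
LARGE = Model("large-model", 20.0)


def is_simple(query: str, max_words: int = 12) -> bool:
    """Crude stand-in for a complexity check: short queries count as simple."""
    return len(query.split()) <= max_words


def route(query: str) -> Model:
    """Send simple queries to the small model and the rest to the large one."""
    return SMALL if is_simple(query) else LARGE


def total_cost(queries: list[str]) -> float:
    """Sum the hypothetical cost of answering every query after routing."""
    return sum(route(q).cost_per_query for q in queries)


if __name__ == "__main__":
    queries = [
        "What time is it in Tokyo?",
        "Compare three database designs for a multi-tenant billing system "
        "and explain the trade-offs of each for latency and cost.",
    ]
    for q in queries:
        print(route(q).name, "<-", q[:40])
    print("routed cost:", total_cost(queries))
    print("all-large cost:", LARGE.cost_per_query * len(queries))
```

A production system would likely replace the word-count check with a learned classifier or a confidence signal from the small model, but the cost logic stays the same: the larger the share of traffic the small model can handle, the lower the total bill.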

Scaling is central to AI development and deployment. As new models become more sophisticated, training them becomes more computationally intensive and more expensive. At the same time, companies like Microsoft strive to make AI technology accessible to a broader user base in order to gain market share rapidly.

Without efficiency gains, the costs of AI development will only escalate further. The Wall Street Journal, citing an anonymous source, reported that in early 2023 Microsoft lost more than $20 per user per month on its generative code AI GitHub Copilot. Some users cost the company as much as $80 per month, even though individual subscribers pay only $10 per month.
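The arithmetic behind these figures can be made explicit with a short sketch. It assumes that the monthly loss per user is simply the serving cost minus the subscription fee, and that the $80 figure is a serving cost rather than a loss; the article states neither explicitly. The function names are illustrative, and the $20 loss is reported as "over $20", so the derived values are lower bounds.

```python
MONTHLY_FEE_USD = 10.0       # individual subscription price cited in the article
AVG_LOSS_USD = 20.0          # reported as "over $20", so a lower bound
HEAVY_USER_COST_USD = 80.0   # assumed to be a serving cost, not a loss


def implied_cost(fee: float, loss: float) -> float:
    """Serving cost implied by a fee and a loss, assuming loss = cost - fee."""
    return fee + loss


def loss_for_cost(fee: float, cost: float) -> float:
    """Monthly loss on a user whose serving cost is known."""
    return cost - fee


def cost_cut_to_break_even(fee: float, cost: float) -> float:
    """Fraction by which the serving cost must fall for the fee to cover it."""
    return (cost - fee) / cost


if __name__ == "__main__":
    avg_cost = implied_cost(MONTHLY_FEE_USD, AVG_LOSS_USD)
    heavy_loss = loss_for_cost(MONTHLY_FEE_USD, HEAVY_USER_COST_USD)
    cut = cost_cut_to_break_even(MONTHLY_FEE_USD, avg_cost)
    print(f"implied average serving cost: ${avg_cost:.2f} per user per month")
    print(f"loss on the heaviest users:   ${heavy_loss:.2f} per user per month")
    print(f"cost reduction to break even: {cut:.0%}")
```

Under these assumptions, the average user costs at least about $30 per month to serve, so serving costs would have to fall by roughly two thirds before the $10 fee covers them. That gap is the kind of saving smaller models are meant to deliver.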

To overcome such challenges, Microsoft’s research chief, Peter Lee, reportedly enlisted a significant portion of the company’s 1,500 researchers to develop smaller and more affordable conversational AI systems by fall 2023.

AI providers are also exploring methods to reduce their reliance on expensive AI chips like Nvidia’s by developing cheaper and more efficient alternatives. The high cost of Nvidia processors is partly due to difficulties in sourcing them. In response to potential chip shortages, OpenAI CEO Sam Altman has engaged in discussions with TSMC about establishing a chip company that would cater primarily to OpenAI’s needs. However, it may take several years for such initiatives to yield noticeable cost benefits.

In pursuit of efficiency gains without compromising quality, Microsoft’s Peter Lee instructed his team to devote the research unit’s 2,000 available Nvidia graphics cards to developing more efficient AI models. Striking the right balance between cost efficiency and model quality is crucial, since any loss of quality could reduce the models’ usefulness and slow down AI adoption.

OpenAI also appears focused on efficiency improvements with its latest model release, GPT-4 Turbo. Reports suggest that efficiency is the driving goal of OpenAI’s new generation of models, which is reportedly why the prototype models are named after deserts. ChatGPT users have occasionally complained about reduced performance since GPT-4’s launch in March 2023, possibly as a result of measures taken to improve efficiency, but there is currently little evidence to support that explanation.

In conclusion, Microsoft and other AI providers are actively pursuing strategies to create smaller and more cost-effective AI models while ensuring performance and quality are not compromised. Their efforts aim to minimize computing costs and make AI technology more accessible to a wider audience. Efficiency and scalability will remain key considerations as AI continues to evolve and shape industries across the globe.

