Published on March 27, 2024, 8:32 am

If you had $10 million to raise your tech company's profile, how would you spend it? A Super Bowl ad? An F1 sponsorship? Or something more unconventional, like training a generative AI model?

Generative AI models are not typical marketing tools, but they have a way of capturing attention. Databricks recently unveiled DBRX, a new generative AI model in the vein of OpenAI's GPT series and Google's Gemini. The model is available on GitHub and Hugging Face for both research and commercial use.

Naveen Rao, VP of generative AI at Databricks, said DBRX was trained to provide useful information across a wide range of topics. While optimized for English, it can also hold conversations and translate in languages such as French, Spanish, and German.

Databricks says DBRX surpasses existing open-source models on standard benchmarks, a result of roughly $10 million and eight months of training. But there are practical barriers to using it: running DBRX efficiently demands serious hardware, such as Nvidia H100 GPUs, putting it largely out of reach for individual developers and small businesses.

To address these challenges, Databricks offers the Mosaic AI Foundation Model as a managed solution. The package includes running DBRX alongside other models and provides tools for fine-tuning on custom data. Customers can host DBRX privately through Databricks' services or work with the company to deploy the model on their preferred hardware.

Rao emphasized Databricks' commitment to making the platform better for building customized models, with performance that it claims compares favorably with competitors. DBRX's mixture-of-experts (MoE) architecture, which uses 16 experts rather than the conventional eight, also makes it faster than earlier models such as Llama 2. Even so, DBRX still falls short of leading generative AI models like OpenAI's GPT-4 in some areas.
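To see why an MoE model can be fast despite its size, it helps to sketch the routing idea: a small gating network scores all experts for each input, and only the top few actually run. The toy code below is a hypothetical illustration of top-k expert routing in general, not DBRX's actual implementation; the dimensions, gating scheme, and linear "experts" are all made-up for clarity.

```python
import numpy as np

def moe_layer(x, gate_w, experts, k=4):
    """Toy mixture-of-experts layer: score all experts, run only the
    top-k, and combine their outputs weighted by a softmax over the
    selected scores. Illustrative only, not DBRX's implementation."""
    scores = x @ gate_w                    # one gating score per expert
    top = np.argsort(scores)[-k:]          # indices of the k best experts
    weights = np.exp(scores[top])
    weights /= weights.sum()               # softmax over the chosen experts
    # Only k of the experts execute, which is the source of the speedup:
    # total parameters are large, but active parameters per input are small.
    return sum(w * experts[i](x) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 16                       # 16 experts, as in DBRX
x = rng.standard_normal(d)
gate_w = rng.standard_normal((d, n_experts))
# Hypothetical experts: each is just a small linear map here.
expert_mats = [rng.standard_normal((d, d)) for _ in range(n_experts)]
experts = [lambda v, m=m: m @ v for m in expert_mats]

y = moe_layer(x, gate_w, experts, k=4)
print(y.shape)  # (8,)
```

With 16 experts and 4 active per input, only a quarter of the expert parameters do work on any given token, which is the design choice behind MoE models' inference-speed advantage over dense models of similar total size.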

Rao acknowledged the model's limitations: DBRX can hallucinate answers, and unlike multimodal models such as Gemini, it cannot process images in addition to text. Still, he said red-teaming exercises were conducted during training to improve accuracy while maintaining data integrity.

While DBRX offers promising capabilities and should improve through continued refinement by Databricks' Mosaic Labs research team, competing generative AI solutions may appeal more broadly on price or accessibility. Nevertheless, Databricks remains committed to advancing open-source models as a foundation on which customers can build tailored solutions with its tools.
