Published on November 28, 2023, 7:18 pm
Nvidia Corp. has announced the launch of a new generative artificial intelligence microservice called NeMo Retriever. The service is designed to let enterprises connect custom chatbots, copilots, and AI summarization tools to real-time proprietary company data, thereby delivering more accurate results.
NeMo Retriever is part of Nvidia NeMo, a cloud-native family of frameworks and tools for building, customizing, and deploying generative AI models. With the new service, enterprise organizations can incorporate retrieval-augmented generation capabilities into their generative AI applications.
Retrieval-augmented generation (RAG) is a method aimed at enhancing the accuracy and safety of generative AI models by filling in gaps in their knowledge with facts and data retrieved from external sources. Large language models (LLMs) are initially trained with general capabilities, such as understanding conversational prompts and answering questions, but after deployment they lack real-time information and up-to-date domain-specific expertise. This can result in inaccuracies or “hallucinations,” where an LLM answers confidently but incorrectly.
NeMo Retriever addresses this challenge by allowing up-to-date data from various sources like databases, HTML, PDFs, images, videos, and other modalities to be fed into an LLM. Enterprise customers can leverage their proprietary sources of information to keep the model updated with relevant facts. The data can be securely accessed from clouds, data centers, or on-premises locations.
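The RAG pattern described above can be sketched in a few lines of Python. This is an illustrative toy, not Nvidia's NeMo Retriever API: the retriever here scores documents by simple word overlap with the query, and the “generation” step is stubbed out as prompt construction for a downstream LLM call. The sample documents are invented.

```python
# Toy sketch of retrieval-augmented generation (RAG), for illustration only.
# Real systems use embedding-based vector search rather than word overlap.
import re

def tokenize(text: str) -> set[str]:
    """Lowercase and split into word tokens, dropping punctuation."""
    return set(re.findall(r"\w+", text.lower()))

def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k documents sharing the most word tokens with the query."""
    query_words = tokenize(query)
    return sorted(documents,
                  key=lambda doc: len(query_words & tokenize(doc)),
                  reverse=True)[:top_k]

def augmented_prompt(query: str, documents: list[str]) -> str:
    """Prepend retrieved passages so the model answers from facts, not memory."""
    context = "\n".join(f"- {p}" for p in retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Invented proprietary documents standing in for an enterprise data source.
company_docs = [
    "Q3 revenue target was raised to $12M after the product launch.",
    "Office plants are watered every Tuesday by the facilities team.",
    "Q3 revenue came in at $13.1M, beating the raised target.",
]
print(augmented_prompt("What was Q3 revenue?", company_docs))
```

Because the retrieved context is injected at query time, the model can answer from current proprietary facts without being retrained on them.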
Ian Buck, vice president of hyperscale and high-performance computing at Nvidia, believes that combining AI with a customer’s database boosts productivity and accuracy while optimizing the capabilities of models.
Incorporating proprietary data through Retriever’s RAG capability minimizes inaccurate answers, because the LLM gains access to contextual information that helps it produce better results. Much as research papers cite sources to establish credibility, NeMo Retriever supplies extra sources of expert information based on a company’s internal domain knowledge. This better informs the LLM, enabling it to deliver more accurate answers.
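The citation analogy can be made concrete with a short sketch. Everything below is hypothetical and not NeMo Retriever's actual interface: the file names and passages are invented, and the point is simply that each retrieved passage keeps a source tag so the model can cite the internal document a fact came from.

```python
# Hypothetical sketch: retrieved passages carry citation tags, so answers
# can point back to internal sources much as a paper cites references.
# All source names and text below are made up for illustration.

passages = [
    {"source": "hr-handbook.pdf",
     "text": "Employees accrue 1.5 vacation days per month."},
    {"source": "benefits-faq.html",
     "text": "Unused vacation days roll over for one year."},
]

def prompt_with_citations(question: str, passages: list[dict]) -> str:
    """Tag each passage with its source so the model can cite it in brackets."""
    cited = "\n".join(f"[{p['source']}] {p['text']}" for p in passages)
    return ("Use only the sources below, and cite them in brackets.\n"
            f"{cited}\n\nQuestion: {question}")

print(prompt_with_citations("How many vacation days do I accrue per month?",
                            passages))
```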
Unlike community-led open-source RAG toolkits, Nvidia designed Retriever specifically to support commercial and production-ready generative AI models that are already optimized for RAG capabilities, enterprise support, and managed security patches.
Several enterprise customers like Cadence Design Systems Inc., Dropbox Inc., SAP SE, and ServiceNow Inc. are already collaborating with Nvidia to integrate RAG into their custom generative AI tools, applications, and services.
Anirudh Devgan, president and chief executive of Cadence, mentioned that researchers at the company are working with Nvidia to utilize Retriever for producing higher-quality electronics by improving accuracy. This demonstrates how generative AI can introduce innovative approaches to address customer needs.
Buck further highlights that Retriever lets customers achieve more accurate results with less training effort spent on generative AI models. Enterprise customers can now deploy off-the-shelf models and leverage their own internal data, without extensive retraining to keep the models up to date.
NeMo Retriever will be available as part of Nvidia AI Enterprise, a cloud-native software platform that streamlines the development of AI applications. Developers can sign up for early access to NeMo Retriever starting today.