Published on December 19, 2023, 5:20 pm

In recent years, there has been a growing focus on the untapped potential of unstructured data, which includes text, graphics, documents, and IoT streams. These forms of data hold significant value, yet many organizations struggle to fully extract and utilize them. According to an IDC survey, only 46% of businesses have made efforts to extract value from unstructured data, which is estimated to make up 90% of enterprise data.

However, the rise of generative artificial intelligence (AI) is providing another reason for businesses to explore and surface unstructured data. Companies and IT professionals who have already embraced unstructured data are now in a better position to leverage generative AI and delve deeper into their data stores. Matt Labovich, US Data, Analytics, and AI Leader at PwC, emphasizes that it’s time for enterprises to step up their management of unstructured data from sources such as IoT and knowledge documents like PowerPoints, text files, and Excel spreadsheets. These resources contain invaluable institutional knowledge about business operations that can be harnessed using generative AI.

While structured data strategies have traditionally received more attention, it’s essential to recognize the significant role that unstructured data plays in advancing generative AI. Labovich encourages businesses to shift their focus toward harnessing the power of unstructured data in driving generative AI forward.

Previous AI initiatives were mostly limited to use cases with readily available structured data. Collecting diverse datasets in varying formats proved too complex for wider AI adoption. However, according to a global survey underwritten by Databricks and published by MIT Technology Review Insights, generative AI can now unlock once-hidden data for extraordinary advances across organizations.

Capturing value from this kind of data is now considered more critical than ever. A majority (almost 70%) of technology executives surveyed agree that data problems pose a significant risk to their AI and machine learning goals. Text-generating AI systems, such as ChatGPT, rely on large language models (LLMs) that are trained on extensive datasets to answer questions and perform tasks based on statistical likelihoods.

In 2022, businesses began to realize the business applications of generative AI, and nearly 70% of survey respondents recognized the importance of a unified data platform for analytics and AI. With the advent of generative AI, a flexible, scalable, and efficient data infrastructure becomes crucial. Democratizing access to data and analytics, enhancing security measures, and combining low-cost storage with high-performance querying are key considerations.

However, gathering unstructured data for today’s AI is not an overnight task. Fragmented IT architectures resulting from mergers and acquisitions have made it challenging to access important documents stored in offline proprietary file types. Andrew Blyton, Vice President and CIO of Incyte and former VP of DuPont Water & Protection, highlights the potential value that language models can bring to these documents. By interrogating them with LLMs, organizations can uncover insights that were previously hidden within vast amounts of documentation.

To succeed with generative AI, it’s essential to involve data owners, analysts, and users from across the business. This collaboration is key to maximizing the value derived from unstructured data. Labovich emphasizes that while the CIO enables and supports the process, it’s the responsibility of business leaders to take charge. Operational readiness and change management are crucial factors that require executives throughout the organization to actively participate in identifying critical data, embedding it into workflows, and championing change to foster widespread adoption.

In conclusion, as businesses embrace generative AI technologies, they must recognize the immense value hidden within unstructured data sources like text files and IoT streams. By implementing an effective strategy for managing unstructured data alongside structured data assets, organizations can leverage generative AI to unlock new insights and drive innovation across their operations.

