In an ever-changing business climate, companies have begun to shift their focus to unstructured data. In the past, unstructured data was challenging to deal with, given its volume and the governance and compliance demands it brings, so organisations mainly focused on structured datasets.
However, with the rise of generative AI and large language models (LLMs), Reece Williams Griffiths, Field CTO of Collibra, says that we can no longer overlook 80 per cent of enterprise content—from transcripts and PDFs to emails and images.
In this episode of the Don't Panic It's Just Data podcast, host John Santaferraro, CEO and Head Research Analyst at Ferraro Consulting, talks with Griffiths, who is also Co-Founder and CEO of Deasy Labs (acquired by Collibra). They also discuss what the acquisition of Deasy Labs has changed at Collibra.
Governing Structured & Unstructured Data
Griffiths explains how Collibra’s acquisition of his firm, Deasy Labs, is making AI truly achievable for businesses. Deasy became renowned for its goal of simplifying data preparation. As part of Collibra, it is leading the development of the tools needed to create order from the chaos and build a unified AI enterprise.
Together, they created the first unified governance and catalogue platform for both structured and unstructured data. This single-hub approach is vital for a future where AI agents treat all data types equally.
Griffiths tells Santaferraro that, historically, Collibra, like others, focused only on structured data. Now, by incorporating Deasy’s capabilities, the platform provides a single entry point and a smooth experience for all data assets.
One outcome of a unified data strategy is simplified AI use cases. Since AI applications often need to access both tabular data (structured) and documents (unstructured) to give complete answers, unification offers the necessary routing and flexibility, the Field CTO explains.
Preparing Unstructured Data for AI
Unstructured content must be prepared before it can be used effectively at scale. Griffiths describes a four-layer data preparation funnel that goes beyond simple classification to deep semantic embedding, ultimately creating a Knowledge Product.
The talk of the moment is the knowledge product. Griffiths says the underlying idea is familiar from data products in the structured data world, but much less so for unstructured data. “We define a knowledge product with four elements – sensitivity, unstructured data quality, metadata for humans, and metadata for AI tools.”
"One key difference to note about them is that in the structured data world, data products are typically consumed by analytics, data and AI teams. Knowledge products, conversely, I think will be consumed by everyone."
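As a rough illustration of the four elements in that definition, a knowledge product could be modelled as a small record. This is only a sketch: the class name, field names, sensitivity labels, the 0 to 1 quality scale and the `is_complete` check are all assumptions made for the example, not Collibra's schema or API.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeProduct:
    """Hypothetical record holding the four elements named in the quote."""

    name: str
    sensitivity: str        # assumed labels, e.g. "public", "internal", "confidential"
    quality_score: float    # assumed 0.0-1.0 measure of unstructured data quality
    human_metadata: dict[str, str] = field(default_factory=dict)  # metadata for humans
    ai_metadata: dict[str, str] = field(default_factory=dict)     # metadata for AI tools

    def is_complete(self) -> bool:
        """Return True only if all four elements have been filled in."""
        return (
            bool(self.sensitivity)
            and 0.0 <= self.quality_score <= 1.0
            and bool(self.human_metadata)
            and bool(self.ai_metadata)
        )


# Usage: a product with every element populated passes the check.
contracts = KnowledgeProduct(
    name="supplier-contracts",
    sensitivity="confidential",
    quality_score=0.9,
    human_metadata={"summary": "Signed supplier agreements"},
    ai_metadata={"topic": "procurement"},
)
print(contracts.is_complete())
```

Keeping the human-facing and AI-facing metadata in separate fields mirrors the distinction drawn in the quote; a real system would likely use richer types than plain string dictionaries.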
Overall, Collibra’s system offers a range of capabilities, including AI-based taxonomy generation, which automatically creates meaningful taxonomies and segments directly from the data. This dramatically cuts down on the lengthy manual mapping otherwise required from subject matter experts.
This shift lets companies focus on areas where effort brings real value. For example, businesses can scan supplier contracts to identify auto-renewal clauses that trigger in 30 days. Enterprises can prioritise high-value, validated items that need human review. As technology evolves, this entire system is moving toward AI-managed workflows, representing a significant advance toward an autonomous enterprise.
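The supplier-contract example above can be sketched in a few lines, assuming an earlier tagging step has already extracted an auto-renewal flag and a renewal date from each document. The `ContractTags` record and the `renewals_due` helper are hypothetical names used for illustration, and "trigger in 30 days" is read here as "within the next 30 days".

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class ContractTags:
    """Hypothetical metadata extracted from one supplier contract."""

    supplier: str
    auto_renews: bool
    renewal_date: date


def renewals_due(contracts, today, window_days=30):
    """Return auto-renewing contracts whose renewal falls within the window."""
    cutoff = today + timedelta(days=window_days)
    return [
        c for c in contracts
        if c.auto_renews and today <= c.renewal_date <= cutoff
    ]


# Usage with made-up sample data.
sample = [
    ContractTags("Acme", True, date(2025, 1, 20)),     # inside the window
    ContractTags("Globex", True, date(2025, 6, 1)),    # too far out
    ContractTags("Initech", False, date(2025, 1, 15)), # no auto-renewal
]
for contract in renewals_due(sample, today=date(2025, 1, 1)):
    print(contract.supplier, contract.renewal_date)
```

The filter itself is trivial; the effort in practice lies in producing reliable tags from the raw documents, which is the step the article describes as being automated.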
Key Takeaways
- 80 per cent of enterprise content is unstructured (documents, transcripts).
- Unstructured data is the new foundation for scalable AI.
- Governance must be unified (structured + unstructured) to simplify tools and serve AI agents effectively.
- Manual data labelling is impractical at enterprise scale.
- AI/LLMs must automate metadata generation via a Continuous Tagging System.
- Leaders should adopt the Knowledge Product idea—a governed, AI-ready asset for unstructured data consumed by the entire enterprise.
Chapters
0:00: Intro: The Exciting Moment in Tech
1:42: Unified Governance via Deasy Acquisition
4:27: The 4-Layer Unstructured Data Funnel
7:35: Automation: Continuous AI Tagging
12:55: The Value of Knowledge Products
Collibra Bio
Collibra frees your data from the constraints of silos by unifying data and AI governance across your entire ecosystem, regardless of source or compute engine, for ultimate flexibility in how you manage data. Our Collibra Platform gives you automated visibility, control and tracing from input through output, and it automates documentation and data traceability for AI use cases to power speed, data observability and safety. Our enterprise metadata graph enriches data context with every use, and our intuitive UX brings technical and business users into the fold to access and steward data.