The Evolution of the Business Glossary with Gal Ziton, CTO at Octopai
Industry terminology is part and parcel of working in technology. However, while you may think that the terms used are uniform across all of your business units – or perhaps you've never stopped to query it – you should formulate a business glossary if you haven't done so already.
Organisations easily assume that teams share a universal understanding of terms, but, until you enforce clear definitions in your business, you can't be too sure that your terminology is not subject to ambiguity. Especially within larger companies, semantics may vary given the different backgrounds and experience levels of employees.
With a business glossary, you can rest assured that your team is using terminology consistently across your organisation. It also reinforces certainty over terms, eliminating the kind of confusion that could otherwise have a knock-on effect on your business. However, to get the most out of your business glossary, you should consider automation.
We spoke with Gal Ziton, Chief Technology Officer at Octopai, to review traditional business glossaries, explore the potential of automation, and learn how these are especially relevant to business intelligence (BI), databases, data warehouses, and reporting systems.
What can you tell us about why business glossaries are important for organisations and their value in BI and data?
A business glossary is a critical resource that every enterprise should have in order to ensure successful data governance. It's an important reference that helps business analysts, data analysts, data scientists, data governance managers, and everyone else on the BI team to speak the same business and technical language. Why is this important? Because fields that mean the same thing could have completely different names. But a business glossary isn't just a convenient listing of terms to keep everyone on the same page.
Businesses use a Business Glossary to ensure consistency in their data assets, reports, and dashboards. In addition, having a common understanding of business terms makes it easier to create, maintain, and integrate new sources of data in the environment. Finally, when developing and delivering new reports and dashboards, standardization that comes with a Business Glossary enables data professionals to make the connections needed between data elements that mean the same thing but have different names.
How are traditional business glossaries built and utilised?
Building a Business Glossary can take months of work for the entire BI team, consuming enormous amounts of time and money. When an organization doesn't have enough resources to commit to a project of this scale, the alternative is to use an Excel spreadsheet for a specific business domain that includes items/objects for that specific domain. This spreadsheet needs to be filled in manually, and mistakes can happen along the way.
Other than the uses I previously mentioned, I'll give you a specific example of how a Business Glossary is used in day-to-day operations. A data analyst wants to create a new report and wants to drag a column that defines the business value of a customer in the organization into this report. The data analyst can't know the exact name of this column without a Business Glossary.
With an Automated Business Glossary, the analyst could find the column by searching for part of its description, and could also discover more of its properties, such as: technical description, business description, calculation description, links to other columns with different names but the same meaning, the owner of the column, the source system the column was generated from, and more.
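To make the analyst's search concrete, here is a minimal sketch in Python of a glossary lookup by partial description. It is an illustration only, not how Octopai or any other product works: the `GlossaryEntry` fields simply mirror the properties listed above, and every column name, owner, and source system in it is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class GlossaryEntry:
    """One glossary term; the fields mirror the properties named in the text."""
    column: str
    business_description: str
    technical_description: str = ""
    calculation: str = ""
    owner: str = ""
    source_system: str = ""
    same_meaning_as: list = field(default_factory=list)


def search_glossary(entries, text):
    """Return entries whose business description contains `text` (case-insensitive)."""
    needle = text.casefold()
    return [e for e in entries if needle in e.business_description.casefold()]


# Hypothetical sample data, invented for illustration.
GLOSSARY = [
    GlossaryEntry(
        column="CUST_LTV",
        business_description="Business value of a customer over the whole relationship",
        technical_description="DECIMAL(18,2), refreshed nightly",
        calculation="sum of order totals minus cost to serve",
        owner="Finance BI",
        source_system="CRM",
        same_meaning_as=["CustomerLifetimeValue"],
    ),
    GlossaryEntry(
        column="ORD_TOT",
        business_description="Total amount of a single order",
        owner="Sales Ops",
        source_system="ERP",
    ),
]

matches = search_glossary(GLOSSARY, "value of a customer")
for entry in matches:
    print(entry.column, entry.owner, entry.source_system, entry.same_meaning_as)
```

The analyst types the part of the description they remember and gets back the exact column name along with its owner, source system, and equivalent columns.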
Building traditional business glossaries sounds like it can be quite a large, and perhaps tedious, task. How might automation alleviate some of the burden associated with this?
Using automation to build a business glossary reduces or eliminates tedious data entry work. There is no need to enter the metadata terms manually and no worry that entry errors will be made because the glossary is generated automatically using your existing metadata.
An automated business glossary also centralizes data layers (physical, semantic and presentation) from all reporting systems that the enterprise is using. It also standardizes data layers across the different reporting systems, making sure that the terms used in SSRS and PowerBI, for example, are one and the same, or are at least linked in the glossary.
In addition, logical and calculated data items must be accounted for when building a business glossary. But logical data items aren't represented in a single column, table or data source. They are composed of data items that have been transformed and combined from other sources. Calculated data items are calculated from multiple other numerical data items using a defined formula. Accounting for both of these manually is almost mission impossible. In an automated business glossary, they're accounted for, well, automatically.
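To show why calculated items are hard to track by hand, here is a small sketch that works out which underlying data items a calculated item depends on. It assumes, purely for illustration, that formulas are written as Python-style expressions and that no formula refers back to itself; real reporting tools use their own expression languages, and the item names below are hypothetical.

```python
import ast


def referenced_items(formula):
    """Return the names of the data items that appear in a formula."""
    tree = ast.parse(formula, mode="eval")
    return {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}


def base_items(name, calculated):
    """Resolve a data item down to the non-calculated items it is built from.

    Assumes the formulas contain no circular references.
    """
    if name not in calculated:
        return {name}
    result = set()
    for dependency in referenced_items(calculated[name]):
        result |= base_items(dependency, calculated)
    return result


# Hypothetical calculated items, each defined by a formula over other items.
CALCULATED = {
    "gross_margin": "(revenue - cost_of_goods) / revenue",
    "revenue": "unit_price * units_sold",
}

print(sorted(base_items("gross_margin", CALCULATED)))
```

Even in this two-formula example, `gross_margin` turns out to rest on three physical items through an intermediate calculation. Tracing that by hand across hundreds of reports is where the manual approach breaks down.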
Finally, what does automation mean for the day-to-day usage of the business glossary?
You mean beyond the huge money and time savings? Beyond freeing up your BI team to work on the projects and tasks they should be focusing on?
The things that a Business Glossary is used for in a business (one source of truth, common business language, improving self-service BI, increasing company-wide consistency) are even more reliable and trustworthy when that glossary is created automatically.
Using automation to generate a business glossary eliminates errors that tend to happen when data is entered manually. And when errors like that are made, they affect the reports that are built based on the terms in the glossary. These reports can negatively affect important business decisions, and well, you get the idea.
Using automation also means that if a data asset is added to the reporting system while the glossary is being built or even after, the glossary will be updated automatically. Ongoing maintenance in order to keep the business glossary up-to-date is simple and quick.
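The idea of a glossary that keeps itself current can be sketched as a simple synchronisation step. This is an assumption-laden illustration rather than a description of any product: it supposes that a metadata scan yields a dictionary of columns, and the function name `sync_glossary` and all sample data are hypothetical.

```python
def sync_glossary(glossary, scanned_columns):
    """Add any newly scanned column to the glossary and return the names added.

    Existing entries are left untouched, so curated descriptions are preserved.
    """
    added = []
    for name, metadata in scanned_columns.items():
        if name not in glossary:
            # New assets start with an empty business description to be filled in.
            glossary[name] = {**metadata, "business_description": ""}
            added.append(name)
    return added


# Hypothetical glossary and a later metadata scan that found one new column.
glossary = {
    "CUST_LTV": {"source_system": "CRM", "business_description": "Business value of a customer"},
}
scan = {
    "CUST_LTV": {"source_system": "CRM"},
    "CHURN_RISK": {"source_system": "Data warehouse"},
}

first_run = sync_glossary(glossary, scan)
second_run = sync_glossary(glossary, scan)
print(first_run, second_run)
```

Running the step again with the same scan adds nothing, which is what makes it safe to run on a schedule.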