Enterprises today are focused on ensuring they have robust data management tools in place to enable them to find and understand their data. Some organizations know exactly what they need, while others can be overwhelmed or confused by all the different solutions out there.
Let’s take “Data Dictionary” and “Business Glossary” for example. They might sound similar, but are they? Yes and no. They’re actually quite different in what they are and how they’re used. When differentiating between the two, it is important to note that both are mainly silo- or project-based, and therefore the value they provide across a multi-system business intelligence infrastructure is limited.
Let’s review each and clarify their uses.
What is a Data Dictionary?
As the name suggests, Data Dictionaries provide information about your data. Descriptions can include data attributes, fields, or other properties such as data type, length, valid values, default values, relations with other data fields, business definitions, transformation rules, business rules, constraints, etc. In short, they cover anything you need to define each physical data element inside operational data sources and data warehouses. This is also relevant for logical BI data objects, and the descriptions should have a business flavor, not just a technical one.
A Data Dictionary should be a one-stop-shop for IT system analysts, designers and developers to understand everything about their metadata. It is used to help translate data-level business requirements into technical requirements, and should ideally provide this information in an easily understood, structured and organized way. IT teams should be able to tell within a few seconds exactly which inputs should be included in order to meet project goals, from attribute type to field requirements to default values.
Data Dictionaries are often presented in spreadsheet format, with rows and columns defining each attribute or metadata category that needs to be addressed in a system. Sometimes someone enters the content by hand and has to annotate and refresh it manually. Other Data Dictionaries read the system catalog of a database and pull specific objects into the dictionary. For a column, this may include:
- Column Name
- Column Datatype
- Column Null Rule
- If the Column is a Primary Key
- If the Column is a Foreign Key
- Descriptive information that a user has entered and that is available in the System Catalog
Information within a Data Dictionary helps both BI developers and business owners who are looking to pursue analytics on their own. The Data Dictionary is essentially a one-stop-shop that shows which types of tables and columns exist.
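To make the column attributes listed above concrete, here is a minimal sketch of a catalog-driven Data Dictionary. It assumes a SQLite database purely for illustration; the `customer` and `orders` tables and the `build_data_dictionary` helper are hypothetical names, not part of any product mentioned in this article. SQLite does not store user-entered column comments, so the descriptive-information attribute is left out here.

```python
import sqlite3


def build_data_dictionary(conn):
    """Return one dictionary entry per column, read from SQLite's catalog."""
    entries = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # Column 3 of foreign_key_list is the referencing ("from") column.
        fk_columns = {
            row[3] for row in conn.execute(f'PRAGMA foreign_key_list("{table}")')
        }
        # table_info yields: cid, name, type, notnull, dflt_value, pk
        for _cid, name, col_type, notnull, default, pk in conn.execute(
            f'PRAGMA table_info("{table}")'
        ):
            entries.append(
                {
                    "table": table,
                    "column": name,
                    "datatype": col_type,
                    "nullable": not notnull,
                    "default": default,
                    "primary_key": pk > 0,
                    "foreign_key": name in fk_columns,
                }
            )
    return entries


# Hypothetical schema, used only to demonstrate the helper.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        email TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        total NUMERIC DEFAULT 0
    );
    """
)

dictionary = build_data_dictionary(conn)
for entry in dictionary:
    print(entry)
```

Other database engines expose the same kind of information through their own catalog views, so the queries would differ while the shape of each entry could stay the same.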
What is a Business Glossary?
Business Glossaries help define terminology across business units. They offer clear definitions across the entire enterprise with the goal of keeping terms consistent and helping everyone stay on the same page.
A quality Business Glossary is an important part of collaboration, particularly in larger businesses that span numerous departments. You’d be surprised at how differently each business unit defines data elements relevant to its own operations, even in related departments (such as sales and marketing). Because organizations define the logical meaning of data elements and can create their own calculated columns, there is a lot of room for inconsistency.
Here’s a really great chart that clearly lays out the differences between a business glossary, a data dictionary and a data catalog:
A Business Glossary should be more. Much more.
Having an accurate understanding of what’s going on in your BI systems is a must, but that understanding is out of reach when your Business Glossary standardizes terminology only within single silo systems and not across the whole landscape.
Business Glossaries build clear expectations by pulling data from reporting tools, standardizing all terms, and making it clear which terms are associated with which rules, policies, and reports. This enables people to make connections between all the elements with the same meaning but with different names, and then helps the BI team deliver BI outputs such as reports or processes with a complete understanding of the internal customer side.
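As a rough illustration of connecting elements that share a meaning but carry different names, the sketch below models a glossary as canonical terms plus aliases. Everything in it (the `GlossaryTerm` and `BusinessGlossary` classes, the sample term and its aliases) is a hypothetical example and not how any particular tool stores its glossary.

```python
from dataclasses import dataclass, field


@dataclass
class GlossaryTerm:
    name: str                 # the agreed, company-wide term
    definition: str           # the single shared definition
    aliases: set = field(default_factory=set)  # names used in silo tools


class BusinessGlossary:
    def __init__(self):
        self._terms = {}   # canonical name -> GlossaryTerm
        self._index = {}   # normalized label -> canonical name

    @staticmethod
    def _normalize(label):
        # Treat "Account_Holder", "account holder" and "ACCOUNT HOLDER" alike.
        return " ".join(label.lower().replace("_", " ").split())

    def add_term(self, term):
        self._terms[term.name] = term
        for label in {term.name, *term.aliases}:
            self._index[self._normalize(label)] = term.name

    def resolve(self, label):
        """Return the canonical term for a report label, or None if unknown."""
        name = self._index.get(self._normalize(label))
        return self._terms[name] if name else None


glossary = BusinessGlossary()
glossary.add_term(
    GlossaryTerm(
        name="Customer",
        definition="A person or organization that has placed at least one order.",
        aliases={"Client", "Account Holder"},
    )
)

for label in ["CLIENT", "account_holder", "Vendor"]:
    term = glossary.resolve(label)
    print(label, "->", term.name if term else "not in glossary")
```

Labels that resolve to nothing, such as "Vendor" here, are the ones that still need a definition agreed across departments.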
You can build a Business Glossary AUTOMATICALLY
No, seriously. We use your metadata to generate the Business Glossary on the spot.
Differences in Application
Clearly, every business needs both Data Dictionaries and Business Glossaries, but there’s still plenty of confusion out there about the application of each. What is clear, though, is that both require a lot of time and manual setup to get going and are generally difficult to implement.
Data dictionaries provide IT frameworks
Since Data Dictionaries deal with the specifications of each database and system, they’re used more by IT teams. Data Dictionaries are used primarily by the designers and engineers who build/change the processes, and as such, they’re fairly technical. Most departments outside of IT won’t deal with Data Dictionaries too often.
Business glossaries offer more company-wide consistency
Business Glossaries are a bit more accessible. As Business Glossaries standardize definitions, they’re often used by just about everyone in the organization—especially business analysts and BI teams. Unlike Data Dictionaries (which are more technical), Business Glossaries are more logic-based; their purpose is to clarify terminology and help each department tie unique data into the overall system.
Because Business Glossaries exist in silo tools, the definition of terms is not always standardized – which leads to multiple truths. Ideally, business intelligence teams will keep these resources close at hand and integrate both into their decision-making: Business Glossaries provide the business language, and data dictionaries provide the technical details. Together, these aspects influence how communication flows across a company and how teams collaborate on each project.
The Major Difference Between a Data Dictionary and a Business Glossary
The main issue with Data Dictionaries is that they typically only display a database’s physical structure, which isn’t usually enough information for a BI developer to understand what each column contains.
This is a significant issue that both Data Dictionaries and Business Glossaries face. A Business Glossary contains business terms, but exactly which database columns relate to these business terms? A Data Dictionary contains database columns, but which business terms relate to these database columns?
The first way to reconcile the discrepancy between a Data Dictionary and a Business Glossary is to do it manually. Many organizations have attempted this, but the task can be extremely expensive and time-consuming, and the results are prone to errors. It is typically performed by analyzing data values in the physical columns, and determining what each column translates to in business terms this way can be unreliable. Additionally, the budget required relative to the results produced is not feasible for many companies.
Data profiling is another approach implemented by many organizations. In this context, “data profiling” means automatically scanning and classifying the content of a column. This solution also poses some problems. For example, profiling may discover that every value in a column contains an “@” sign followed by a domain name. As we all know, this signifies an email address. But what type of email address is it exactly? Profiling alone cannot say, which limits our understanding of what the column actually represents and how it can be applied in a business context. So the only way to remedy this situation involves manual effort, which is costly and may result in many mistakes.
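The limitation is easy to see in a small sketch. The profiler below is a deliberately simple, hypothetical example (the regular expression, the 95% threshold, and the sample columns are all assumptions): it classifies a column as "email" when nearly all values look like an address, and it gives the same answer for two columns that mean different things to the business.

```python
import re

# Loose pattern: something, an "@", then a domain containing a dot.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def profile_column(values, threshold=0.95):
    """Classify a column as 'email' or 'unknown' from its content alone."""
    non_empty = [v.strip() for v in values if v and v.strip()]
    if not non_empty:
        return "unknown"
    matches = sum(1 for v in non_empty if EMAIL_PATTERN.match(v))
    return "email" if matches / len(non_empty) >= threshold else "unknown"


# Hypothetical sample data for three columns.
columns = {
    "billing_contact": ["ap@acme.example", "finance@globex.example"],
    "support_contact": ["help@acme.example", "care@globex.example"],
    "customer_name": ["Acme Corp", "Globex Ltd"],
}

profile = {name: profile_column(values) for name, values in columns.items()}
print(profile)
```

Both contact columns come back as "email". Nothing in the values says which one holds billing contacts and which holds support contacts, and that is the gap that still has to be closed by some other means.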
Instead of wasting time trying to connect the dots, you can use a section of the Business Glossary called the BI Catalog, which helps fix the discrepancy between the Data Dictionary and the Business Glossary. It does so by linking a report label to each column, which helps explain where the data originates and what type of data you are viewing.
The BI Catalog is automated. To capture all there is to know about a column, it associates each individual column with a specific report and with the specific label (or business term) that the report gives it.
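The linking idea can be sketched as a two-way index between physical columns and report labels. This is only an illustration under assumed inputs: the report names, labels, table names, and the `build_catalog` helper are hypothetical, and the sketch does not describe how Octopai builds its BI Catalog internally.

```python
from collections import defaultdict

# Hypothetical report metadata: (report, label shown in report, table, column).
REPORT_FIELDS = [
    ("Monthly Sales", "Customer Name", "dw.dim_customer", "cust_nm"),
    ("Monthly Sales", "Net Revenue", "dw.fact_sales", "net_amt"),
    ("Churn Overview", "Customer Name", "dw.dim_customer", "cust_nm"),
    ("Churn Overview", "Last Order Date", "dw.fact_sales", "ord_dt"),
]


def build_catalog(report_fields):
    """Index report fields by physical column and by business label."""
    by_column = defaultdict(set)  # (table, column) -> {(report, label)}
    by_label = defaultdict(set)   # label -> {(table, column)}
    for report, label, table, column in report_fields:
        by_column[(table, column)].add((report, label))
        by_label[label].add((table, column))
    return by_column, by_label


by_column, by_label = build_catalog(REPORT_FIELDS)

# Which business terms relate to this database column?
print(sorted(by_column[("dw.dim_customer", "cust_nm")]))

# Which database columns relate to this business term?
print(sorted(by_label["Net Revenue"]))
```

With both lookups available, a cryptic column name can be read through the labels that reports already give it, and a business term can be traced back to the columns behind it.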
Once the BI Catalog is established, Data Discovery can be utilized. With the comprehensive list of all data assets found within a Data Dictionary, BI developers can now locate the specific data that they need. Informed by the Data Dictionary, they can pinpoint the specific columns that may be applicable. This capability, paired with the Business Glossary, helps streamline the process of analyzing data sets by cutting down on both costs and time.
So how do you boost your data governance efforts?
How metadata management automation can help organizations implement a business glossary or data dictionary
For organizations working on implementing a Data Dictionary, having a full view of metadata across the entire BI infrastructure is critical. With metadata management automation, all metadata from each individual silo tool throughout the BI landscape is centralized in one place and easily accessible.
Likewise, organizations looking at implementing a Business Glossary would be able to get a full description and full path for any searched element if they leveraged metadata management automation. Octopai, for example, takes all the descriptions from reporting tools (from the semantic/logic layers) and correlates logical columns to physical columns, drastically improving accuracy and streamlining the implementation of a Business Glossary.
Happy Boss, Happy BI Team: Automate Your Business Glossary
Building a Business Glossary can be an incredibly time-consuming and costly project – especially since it demands lots of manual data entry. This kind of project can cause your BI team to burn out and keep them from focusing on the tasks they were hired to do. We don’t want that now, do we? Actually, since building a Business Glossary demands so much costly manpower and dedicated time, most enterprises opt to keep putting it off, despite how critical it is to the organization.
Using automation to build a Business Glossary nips both those problems in the bud as the majority of the manual work is done, you guessed it, automatically. Seriously – your Business Glossary is generated on the spot with your own BI metadata.
Using automation to generate your Business Glossary provides other benefits as well. If Joe from IT adds a data asset in the reporting system, you don’t have to worry about him updating you about it while you build the Business Glossary. It will update automatically. In addition, when any process is done manually, the chances of errors are pretty high. Automation removes most of that manual error.
First understand what it is that you have, and then get organized.
Many companies choose to work with Octopai when they embark on a Business Glossary or Data Dictionary project because they know just how cumbersome such a project can be, and they understand the added value of metadata management automation, specifically when it comes to cutting down setup time, reducing the amount of manual tracing, and boosting overall project accuracy.
Moreover, it is important to realize that there’s no real use in getting organized before you are able to see everything you have. Organizations use Octopai to discover and understand all their metadata, and then they move to the next step of getting organized with Data Dictionaries, Business Glossaries, Data Catalogs etc.
Updated July 2020