Data Catalog Implementation

The business wants to take control of its data landscape by implementing a data catalog. But where to start?

The Business Challenge

Over time, a business has found that it has multiple versions of customer information
scattered throughout several systems—CRM systems, ERP systems, marketing databases,
warranty service systems, and more.

The result:
• Some customers are represented in different ways across systems
• Some customers are represented in some systems but not others
• Some customers are represented multiple times in the same system

All these issues make it difficult to construct a single version of the truth.

The BI Challenge

The BI team would like to consolidate the various customer datasets into a single customer
master. This starts by assembling a data catalog, which stores information about an
organization’s data assets. An effective data catalog relies on the metadata of those data assets.

Unfortunately, different systems have different metadata for their customer databases. Not
only do different systems use different field names, but each system represents names in a
different way. Some represent first names and last names in different fields, and others store
the “full name” in a single field, which may or may not have been concatenated using
first-name and last-name fields from some other system. Of these “full name” fields, some
store names in the format “John Smith” and others, “Smith, John.”
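To make the reconciliation problem concrete, here is a minimal Python sketch that splits a “full name” value in either of the two formats above into first and last name. The function name split_full_name is hypothetical. Treating the last space-separated word as the last name is a simplifying assumption that will misread multi-word surnames.

```python
def split_full_name(full_name: str) -> tuple[str, str]:
    """Return (first, last) from 'John Smith' or 'Smith, John'.

    Assumption: without a comma, the last word is the last name.
    """
    # Collapse repeated whitespace and trim the ends.
    text = " ".join(full_name.split())
    if "," in text:
        # "Smith, John" style: the last name comes before the comma.
        last, first = (part.strip() for part in text.split(",", 1))
    else:
        # "John Smith" style: split on the final space.
        # A single word such as "Smith" yields an empty first name.
        first, _, last = text.rpartition(" ")
    return first, last


for raw in ("John Smith", "Smith, John"):
    print(raw, "->", split_full_name(raw))
```

Both inputs resolve to the same (first, last) pair, which is the precondition for comparing records across systems.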

Addresses, phone numbers, and other data fields pose similar difficulties. And that’s to say
nothing of the actual data, which can have misspellings, duplicates, and other issues that
make it difficult to match the same customers across different systems.
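One common way to cope with misspellings when matching records is approximate string comparison. The sketch below uses SequenceMatcher from Python's standard difflib module. The helper names and the 0.85 threshold are illustrative assumptions, not values taken from any particular system. A real matching process would usually compare several fields, not names alone.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio between 0.0 and 1.0, ignoring case."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


def likely_same_customer(a: str, b: str, threshold: float = 0.85) -> bool:
    """Flag two names as a probable match (threshold is an assumed value)."""
    return similarity(a, b) >= threshold


# A misspelled name still scores close to the correctly spelled one.
print(likely_same_customer("Jon Smith", "John Smith"))
print(likely_same_customer("John Smith", "Alice Wong"))
```

Pairs flagged this way are candidates for human review, not confirmed duplicates.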

How BI Worked Before Octopai

In the past, reconciling all of these permutations would have required painstaking,
time-consuming, and tedious examination of each system’s databases. This exercise can be
more difficult in systems with convoluted entity relationships in their databases or multiple
tables storing the same customer data.

For data warehouses, the ETL process for each system that feeds the warehouse must be
examined (and perhaps corrected). For large organizations with complex data landscapes,
going through the process and validating the results could take months of work for the BI team.

BI Groups Are Empowered by Octopai Automation

Octopai can significantly reduce the manual overhead of this process by automating
metadata discovery and data lineage.

By finding all the fields in each system that appear to be, for example, name-related,
Octopai reduces the BI team’s effort to examining its output and taking appropriate
action. This is a much more productive use of their time than looking through every
table’s data dictionary manually.
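To illustrate the general idea of scanning metadata for name-related fields, here is a minimal Python sketch. It is not Octopai's implementation. The schema layout (system, then table, then column names), the NAME_HINTS list, and the function find_name_fields are all assumptions made for the example.

```python
# Substrings that suggest a column holds a person's name (assumed list).
NAME_HINTS = ("name", "fname", "lname", "surname")


def find_name_fields(schemas: dict) -> list:
    """Return (system, table, column) for columns that look name-related.

    `schemas` maps system -> table -> list of column names (hypothetical layout).
    """
    hits = []
    for system, tables in schemas.items():
        for table, columns in tables.items():
            for column in columns:
                lowered = column.lower()
                if any(hint in lowered for hint in NAME_HINTS):
                    hits.append((system, table, column))
    return hits


example_schemas = {
    "crm": {"contacts": ["id", "FirstName", "LastName", "email"]},
    "erp": {"customer": ["cust_no", "full_name", "addr1"]},
}
for hit in find_name_fields(example_schemas):
    print(hit)
```

Substring matching is deliberately loose, so it will also flag columns such as “username” or “filename”. The output is a shortlist for the BI team to review, not a final answer.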

Octopai empowers BI groups to:
• Discover all ETL processes and tables that deal with the type of data being consolidated
• Automatically trace each field back to its source
• Visualize the relationships through graphical tools
• Quickly assemble the metadata needed to create an effective data catalog

Value to the Organization

• Significantly reduced data catalog project costs
• Better control of data and metadata in disparate systems
• Enhanced ability to consolidate to a single version of the truth
• Better experience for customers who interact with different parts of the organization
(sales, marketing, customer service, technical support, field service, training…)
