Establishing a single, enterprise-wide source of truth?
Streamlining a cloud migration?
Increasing data quality and accuracy?
Why are data catalog use cases so downright… predictable?
If you can rattle off the top five or ten enterprise data catalog use cases in your sleep, this post is an attempt to add a little more color and variety to your data life.
Here are three ways enterprises can leverage their data catalogs that don’t make the standard lists.
New Employee Onboarding and Training
What? The data catalog as an HR tool?
If your company is (or has visions of itself as) a data-driven organization, then getting new employees comfortable in your data environment is just as important as making them feel comfortable at your company happy hour.
And with remote work becoming standard across the globe, bonding effortlessly with your company’s data may be even more important than becoming buds with your coworkers.
A comprehensive enterprise data catalog is a user-friendly environment for getting to know your company’s data landscape. It’s a place where a new employee can:
- explore and get the lay of the land
- connect with and ask questions of those in the know
- get a feel for your company’s standards
In order to contribute significantly to new employee onboarding, a metadata catalog must:
- Be user-friendly, with a search function and results that feel as natural as using Amazon or any other online marketplace
- Feature built-in collaboration tools, where a user can easily identify and communicate with the data owner, steward or subject matter expert
Integrate your enterprise data catalog into your new hire onboarding process with an official introduction and a walk-through of the main functionalities that will help your employee settle into their new role (e.g. search, collaborative tools, data preview and quality assessment features). Then let your new hire loose to explore, discover, connect… and develop into a meaningful contributor to your enterprise.
Fraud Detection and Prevention (and Other Model Training)
Money makes your enterprise go ‘round. Which means that anyone attempting to defraud your enterprise must be stopped in their tracks.
But fraud doesn’t look like a holdup at gunpoint. Fraud is insidious. Fraud sneaks in the backdoor and sneaks out with the loot. When fraud is discovered after it happens, it’s usually too late.
Whoever said, “Those who do not learn from history are doomed to repeat it,” was never more right than when it comes to fraud. Those who learn from historical data what fraud looks like, and use that data to build and train models to identify fraud, are likely to succeed in preventing future cases. Those who don’t… well, we feel kind of bad for them, in a head-shaking and tongue-tsking kind of way.
Medical services management company Prime Therapeutics is on the right track. They partnered with analytics company SAS to build out a Fraud Framework with over 1,000 models designed to identify the risk of healthcare fraud.
Building effective fraud detection models – and any other kind of model building and training – depends on selecting the right data.
A comprehensive data catalog that serves as a portal to every data asset in your enterprise’s landscape is critical to this selection. Searching for relevant data assets, evaluating asset quality, etc. – it can all be done quickly and easily from within the catalog.
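To make that selection step concrete, here is a minimal sketch in Python. It assumes the catalog's metadata can be exported as simple records; the `CatalogEntry` fields, the tag names, the sample datasets and the 0.8 quality cutoff are all hypothetical and not taken from any particular catalog product.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class CatalogEntry:
    """Hypothetical metadata record for one data asset in the catalog."""
    name: str
    tags: Set[str] = field(default_factory=set)
    quality_score: float = 0.0  # assumed scale: 0.0 (worst) to 1.0 (best)
    owner: str = ""


def find_training_candidates(catalog: List[CatalogEntry],
                             required_tags: List[str],
                             min_quality: float = 0.8) -> List[CatalogEntry]:
    """Return assets that carry every required tag and meet the quality bar,
    best quality first."""
    required = set(required_tags)
    matches = [
        entry for entry in catalog
        if required <= entry.tags and entry.quality_score >= min_quality
    ]
    return sorted(matches, key=lambda entry: entry.quality_score, reverse=True)


# Illustrative sample data only.
catalog = [
    CatalogEntry("claims_2019_2023", {"claims", "fraud-labels"}, 0.93, "claims-team"),
    CatalogEntry("claims_legacy_export", {"claims", "fraud-labels"}, 0.55, "unknown"),
    CatalogEntry("provider_directory", {"providers"}, 0.97, "network-ops"),
    CatalogEntry("claims_2024_sample", {"claims", "fraud-labels"}, 0.88, "claims-team"),
]

candidates = find_training_candidates(catalog, ["claims", "fraud-labels"])
candidate_names = [entry.name for entry in candidates]
print(candidate_names)
```

In this sketch the low-quality legacy export and the unrelated provider directory are filtered out, leaving only the labeled claims datasets that clear the quality bar. A real catalog would expose search and quality metrics through its own interface, so treat the field names here as placeholders.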
With the right data on hand, you can begin building and training your data-based guard dogs.
Fraud, beware of the data catalog.
Saving Lives
Saving lives?! What, the data catalog is going to sit on a tall chair at the edge of your data lake and save people who fall in?
Well, no. Although that may not be too far from the truth.
In industries like healthcare, a firm handle on your data can mean the difference between a life saved and a life lost. More than 195,000 deaths occur in US hospitals per year because of medical error, with 59% of those deaths due to “wrong patient errors.” In a study of 55 hospital CIOs, 17% said their institution had experienced incidents where patients had suffered harm due to mismatched patient records.
A comprehensive data catalog creates a single source of truth for your enterprise, and is indispensable when it comes to identifying duplicate datasets, assets or processes that might create multiple or erroneous records. When a data catalog has integrated data lineage, it makes it even simpler to track down the source of the duplicate or erroneous assets and correct them at the root.
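As a rough illustration of how duplicate detection and lineage can work together, the Python sketch below groups datasets whose column sets match and then walks lineage links back to the original source. The dataset names, column lists and `derived_from` mapping are invented for the example. Matching column names only flags candidates for review; it does not prove that two datasets hold the same records.

```python
from collections import defaultdict

# Hypothetical catalog metadata: dataset name -> column names.
schemas = {
    "patients_main": ["patient_id", "name", "dob", "mrn"],
    "patients_copy_2021": ["mrn", "dob", "name", "patient_id"],
    "lab_results": ["patient_id", "test_code", "result", "taken_at"],
    "patients_er_intake": ["patient_id", "name", "dob", "mrn"],
}

# Hypothetical lineage: dataset -> the dataset it was derived from
# (None means it comes straight from a source system).
derived_from = {
    "patients_main": None,
    "patients_copy_2021": "patients_main",
    "patients_er_intake": "patients_copy_2021",
    "lab_results": None,
}


def find_schema_duplicates(schemas):
    """Group datasets that share the same set of column names,
    ignoring column order and letter case."""
    groups = defaultdict(list)
    for name, columns in schemas.items():
        fingerprint = frozenset(column.lower() for column in columns)
        groups[fingerprint].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


def trace_to_root(dataset, derived_from):
    """Follow lineage links upstream until reaching a dataset with no parent.
    The `seen` set guards against accidental cycles in the lineage data."""
    seen = set()
    while derived_from.get(dataset) is not None and dataset not in seen:
        seen.add(dataset)
        dataset = derived_from[dataset]
    return dataset


duplicates = find_schema_duplicates(schemas)
root = trace_to_root("patients_er_intake", derived_from)
print(duplicates)
print(root)
```

Here the three patient tables are flagged as likely duplicates, and the lineage walk points back to `patients_main` as the place to correct records at the root.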
Aside from guarding against costly errors, a data catalog can support proactive efforts for saving lives. Personalized, precision medicine is about combining the wealth of data we have about how disease prevention and treatments work with data about individuals and their genetic, environmental and lifestyle differences. But precision medicine can only be as powerful as our ability to precisely pick out and manipulate the right data.
This example of using a data catalog to design a cancer risk-scoring system based on external risk models and internal healthcare organization data is a perfect illustration of the data catalog’s ability to enable and smooth the process of proactively saving lives.
Think Outside the Box
Of course, data catalogs should be leveraged for their standard benefits: achieving enterprise-wide data asset consistency, improving data quality and accuracy, streamlining data migrations, reducing redundancy, increasing trust in data and the like.
But don’t stop there. An enterprise data catalog is a powerful, flexible tool that can support any number of aims in a seriously data-driven enterprise.
So put the data catalog key in the ignition – and start driving.