What is Purview Data Lineage?
Purview Data Lineage is a feature of Microsoft Purview, Microsoft’s suite of data governance solutions. Microsoft Purview was formed in 2022 by merging Azure Purview with Microsoft 365 Compliance, and its offerings include a data map, a data catalog and a data access policy creator. Purview Data Lineage (formerly known as Azure Purview Data Lineage, or Azure Data Lineage) is a component of the Purview Data Catalog and tracks how data flows through the Purview user’s data environment.
How can Purview Data Lineage help your data management?
The picture of data flow provided by Purview Data Lineage enables your data teams to:
- Quickly track down the root cause of any errors found in data pipelines or products
- Conduct fast, accurate impact analysis before making changes to data-related business processes
- Apply sensitivity labels automatically and consistently to data assets across your data landscape
- Avoid time-consuming manual mapping of data flow
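To make the first two points concrete, here is a minimal sketch of how root-cause and impact analysis reduce to walking a lineage graph in opposite directions. It is an illustration only, not Purview's implementation or API: the asset names, the `EDGES` list and the helper functions are all hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical lineage edges (source asset, target asset); not real Purview data.
EDGES = [
    ("crm_db.customers", "staging.customers"),
    ("erp_db.orders", "staging.orders"),
    ("staging.customers", "warehouse.sales_fact"),
    ("staging.orders", "warehouse.sales_fact"),
    ("warehouse.sales_fact", "reports.revenue_dashboard"),
]


def build_graph(edges):
    """Index the edges in both directions for upstream and downstream walks."""
    upstream = defaultdict(set)
    downstream = defaultdict(set)
    for src, dst in edges:
        downstream[src].add(dst)
        upstream[dst].add(src)
    return upstream, downstream


def reachable(start, neighbors):
    """Breadth-first search: every asset reachable from `start` via `neighbors`."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


UPSTREAM, DOWNSTREAM = build_graph(EDGES)


def root_cause_candidates(asset):
    """Assets that feed into `asset`, where an error could have originated."""
    return reachable(asset, UPSTREAM)


def impact_of_change(asset):
    """Assets that depend on `asset` and could be affected by changing it."""
    return reachable(asset, DOWNSTREAM)
```

In this sketch, `root_cause_candidates("reports.revenue_dashboard")` returns every upstream table that feeds the dashboard, while `impact_of_change("staging.orders")` returns the fact table and the dashboard that depend on it.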
Microsoft Purview’s solutions connect the business users of data with those responsible for managing its risk, enabling more holistic, informed, 360-degree treatment of data.
How do you implement Purview Data Lineage?
Purview Data Lineage is a built-in part of the Purview Data Catalog. As Purview scans your systems, it records the static and operational metadata that describes both what a given data asset is and what is happening, or has happened, to it. It stores that metadata and uses it to draw a map of how the data has moved whenever you request one.
To access Purview Data Lineage, you must have the Microsoft Purview Data Catalog set up. From within any given asset entry, select the Lineage tab. There you may need to select the check box next to each column you want to display in your lineage view.
How comprehensive is Purview Data Lineage?
When an organization runs its entire data ecosystem on Microsoft Azure, Purview Data Lineage can track and represent the end-to-end lineage of any data asset in the ecosystem, from source to target. When non-Azure-based data solutions are part of an organization’s data stack, Purview sometimes falls short, and external data lineage solutions are needed to fill in the complete picture.
In addition, column-level lineage is available for some data assets, such as process nodes, but not for all types. For the rest, lineage is available only at the entity level, showing the source, the process and the target.
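The difference between the two levels can be pictured with a small sketch. This is a hypothetical data model for illustration, not Purview's actual schema: the `LineageEdge` class, its fields and the example asset names are all assumptions. An entity-level record carries only source, process and target, while a column-level record also maps each target column to the source columns it came from.

```python
from dataclasses import dataclass, field


@dataclass
class LineageEdge:
    """One lineage hop: source -> process -> target (hypothetical model)."""
    source: str
    process: str
    target: str
    # Target column -> list of source columns. Left empty when only
    # entity-level lineage is known for this hop.
    column_map: dict = field(default_factory=dict)

    def has_column_lineage(self):
        """True when a column mapping was captured for this hop."""
        return bool(self.column_map)

    def describe(self):
        """Return display lines: the entity-level hop, then any column mappings."""
        lines = [f"{self.source} -> [{self.process}] -> {self.target}"]
        for tgt_col, src_cols in sorted(self.column_map.items()):
            lines.append(f"  {', '.join(src_cols)} -> {tgt_col}")
        return lines


# Entity-level only: the hop is known, but not which columns fed which.
entity_only = LineageEdge("staging.orders", "nightly_load", "warehouse.sales_fact")

# Column-level: the same kind of hop, with a column mapping attached.
with_columns = LineageEdge(
    "staging.customers",
    "nightly_load",
    "warehouse.sales_fact",
    column_map={"customer_name": ["first_name", "last_name"]},
)
```

Calling `describe()` on `entity_only` yields a single line for the hop, whereas `with_columns` adds one indented line per mapped target column.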
Want a demo of how an end-to-end, holistic picture of data flow through Purview, Azure and your entire data stack would look? Request one here.