Data governance refers to the rules and processes a company uses to maintain its data. Data lineage is the part of data governance that records the movement of data from its original source, through every intermediate system, to its destination. Lineage processes document that the data comes from an authoritative source and that adequate systems are in place to monitor its transfer.
There are different tiers of data lineage to serve the needs of different businesses. A lower tier can be a simple data flow visualization. This type of lineage shows how business processes occur within a company and where their control points are, but does not include specific details on data transformations or system information.
The highest of these tiers is attribute-level lineage, which collects a high level of detail about the data. These details not only help document that the data has been well monitored and unaltered, but can also give analysts valuable insight for optimizing the flow of data and improving the data platform.
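To make the idea concrete, here is a minimal Python sketch of what attribute-level lineage records might look like and how they can be walked back to an original source. Everything in it is an assumption made for illustration: the `AttributeHop` record, the `trace_to_origin` and `is_unaltered` helpers, the "system.table.column" naming, and the sample systems are hypothetical, and the sketch assumes each attribute has a single upstream source. Real lineage tools capture far more detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttributeHop:
    """One recorded movement of a single attribute between two systems (hypothetical shape)."""
    source: str          # "system.table.column" the value was read from
    target: str          # "system.table.column" the value was written to
    transformation: str  # what was done to the value; "copy" means unaltered


def trace_to_origin(attribute, hops):
    """Walk lineage records backwards from an attribute to its original source.

    Assumes each attribute has at most one upstream source. Returns the hops
    ordered from the original source to the given attribute.
    """
    by_target = {hop.target: hop for hop in hops}
    path = []
    seen = set()  # guards against cycles in malformed lineage records
    current = attribute
    while current in by_target and current not in seen:
        seen.add(current)
        hop = by_target[current]
        path.append(hop)
        current = hop.source
    path.reverse()
    return path


def is_unaltered(path):
    """True when every hop on the path moved the value without changing it."""
    return all(hop.transformation == "copy" for hop in path)


# Hypothetical lineage records for a single attribute.
HOPS = [
    AttributeHop("crm.customers.email", "staging.customers.email", "copy"),
    AttributeHop("staging.customers.email", "warehouse.dim_customer.email", "lowercase"),
]

if __name__ == "__main__":
    path = trace_to_origin("warehouse.dim_customer.email", HOPS)
    for hop in path:
        print(f"{hop.source} -> {hop.target} ({hop.transformation})")
    print("unaltered:", is_unaltered(path))
```

Storing one record per attribute per hop is what makes this tier detailed enough to show both where a value came from and whether it was changed along the way. It is also why the record count grows quickly as more attributes and systems are covered.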
However, attribute-level lineage can be costly to implement and monitor. For this reason, many companies apply this level of lineage only to critical data: data that raises regulatory concerns, has high business value, or could have an organizational impact.
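The selection rule described above could be expressed as a small policy check. The sketch below is only an illustration: the `Dataset` fields, the `lineage_tier` function, and the tier labels "attribute-level" and "data-flow" are hypothetical names, and a real policy would likely weigh these criteria rather than treat them as simple flags.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical description of a dataset for lineage planning."""
    name: str
    regulated: bool                   # subject to regulatory concerns
    high_business_value: bool         # important to revenue or key decisions
    high_organizational_impact: bool  # errors would affect the organization


def lineage_tier(dataset):
    """Pick a lineage tier: the costly attribute-level tier only for critical data."""
    is_critical = (
        dataset.regulated
        or dataset.high_business_value
        or dataset.high_organizational_impact
    )
    return "attribute-level" if is_critical else "data-flow"


if __name__ == "__main__":
    # Hypothetical datasets used only to exercise the rule.
    payments = Dataset("payments", regulated=True,
                       high_business_value=True, high_organizational_impact=True)
    web_logs = Dataset("web_logs", regulated=False,
                       high_business_value=False, high_organizational_impact=False)
    for ds in (payments, web_logs):
        print(ds.name, "->", lineage_tier(ds))
```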