I have been saying for quite some time now that every IT organization needs a resurgence in metadata management. In the 1980s, metadata management, data dictionaries, and repositories were all the rage for a while. But the complexity of keeping the data current, along with management's difficulty in understanding the return on the investment, caused many implementations to wither on the vine and become obsolete. That is unfortunate.
Oh, I know that for some of you, managing your company’s metadata continues to be a concern. And that is great… it should put you ahead of the curve in managing the onslaught of data needed to support governance and compliance projects, as well as big data analytics efforts. Nevertheless, it seems to me that over the course of the past few decades companies have lessened the importance of data administration and metadata management.
Some of you might even be asking “what is metadata?” But I don’t really want to get into that in this blog post. Instead, I refer you to an earlier column of mine titled The Growing Importance of Metadata, which offers a good explanation.
OK, is everybody back, with at least a high-level understanding of what metadata is? Good… let’s move on. So to summarize, you cannot really use data appropriately without good metadata. The metadata helps us to understand our data, which in turn helps us to know what data we have, to ensure that we are using it properly, and to confirm that we are in compliance with any industry or governmental regulations that apply to the data.
I think most people reading this will be in violent agreement that metadata is important. But how can we best manage metadata? The promise of metadata is that it will, if managed properly, provide corporate IT departments with a well-documented, up-to-date view of their data assets. But corporate data environments are riddled with redundancy and undocumented data, which can easily lead to questionable data integrity and countless hours of wasted time. Much of a knowledge worker’s or business analyst’s time is spent chasing down data lineage to answer questions like, “which customer record is most accurate, the one in the CRM or in the finance application?” And even if organizations manage to document the origins of their data, new development projects take off with no regard for standards adherence, leaving data managers right back where they started.
So how do you ensure that you are exploiting the metadata you are collecting to the fullest possible extent? How do you make sure that your metadata is easily accessible and effectively used across your organization? Well, this is where data modeling comes into play. Modeling is important to metadata management.
Effective communication is at the heart of the metadata value proposition. Data managers must be able to interpret the data coming into their organization and then provide a roadmap to everyone else so that they too can reach their destination. Modeling adds value to metadata management much the same way it does for data itself — by serving as a lingua franca, a standardized language, easily understood by everyone from business users to application developers to DBAs.
Proper modeling requires an integrated system incorporating tools, process, and people. Attacking the proper setup of your data architecture is beyond the scope of this blog entry, but we will discuss this in future postings. For now, let’s just focus on the top five advantages of using modeling to enhance metadata management. Starting with number 5 and counting down “Letterman”-style:
#5 Data Structure Quality. Models ensure that the business design of a data architecture is appropriately mapped to the logical design, providing comprehensive documentation on both sides.
#4 Data Consistency. By having standardized nomenclature for all data — including domains, sizing, and documentation formats — the risk of data redundancy or misalignment is greatly reduced.
#3 Data Advocacy. Models help to emphasize the critical nature of data within the organization, indicating direction of data strategy and tying data architecture to overall enterprise architecture plans, and ultimately to the business’s objectives.
#2 Data Reuse. Models, and encapsulation of the metadata underpinning data structures, ensure that data is easily identified and is leveraged correctly in the first place, speeding incremental tasks through reuse and minimizing the accidental building of redundant structures to manage the same content.
#1 Data Knowledge. Models, combined with an efficient modeling practice, enable the effective communication of metadata throughout an organization, and ensure all stakeholders are in agreement on the most fundamental requirement: the data.
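To make #4 a little more concrete, here is a minimal sketch in Python of what checking columns against standardized domains might look like. Everything in it is an assumption for illustration only: the domain names, the table and column names, and the `find_inconsistencies` helper are hypothetical and do not come from any particular modeling tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    """A hypothetical standard domain: one agreed data type and size."""
    name: str
    data_type: str
    length: int


@dataclass(frozen=True)
class Column:
    """A column as it is actually defined in some application's database."""
    table: str
    name: str
    domain: str      # the standard domain this column is supposed to follow
    data_type: str
    length: int


# Hypothetical domain dictionary (illustrative values, not a real standard).
DOMAINS = {
    "customer_id": Domain("customer_id", "CHAR", 10),
    "email": Domain("email", "VARCHAR", 254),
}

# Hypothetical column definitions gathered from two applications.
COLUMNS = [
    Column("crm_customer", "cust_id", "customer_id", "CHAR", 10),
    Column("fin_customer", "customer_no", "customer_id", "VARCHAR", 12),
    Column("crm_customer", "email_addr", "email", "VARCHAR", 254),
]


def find_inconsistencies(columns, domains):
    """Return a message for each column that deviates from its declared domain."""
    problems = []
    for col in columns:
        dom = domains.get(col.domain)
        if dom is None:
            problems.append(f"{col.table}.{col.name}: unknown domain '{col.domain}'")
        elif (col.data_type, col.length) != (dom.data_type, dom.length):
            problems.append(
                f"{col.table}.{col.name}: {col.data_type}({col.length}) "
                f"does not match domain {dom.name} {dom.data_type}({dom.length})"
            )
    return problems


for message in find_inconsistencies(COLUMNS, DOMAINS):
    print(message)
```

In this made-up example, the finance application's customer number is flagged because its type and size differ from the agreed domain. That is exactly the kind of misalignment that leads to the "which customer record is most accurate?" question mentioned earlier.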
Good luck tackling your company’s data and metadata – and be sure to share any thoughts, stories or feedback on the topic with us here on the blog.