Data quality can be a big problem to overcome, as I’ve opined on this blog before, most recently in November 2012 when I posted Thoughts on Data Quality. Indeed, if you work with your organization’s data you know how important it is for that data to be accurate, and you know how often data inaccuracies conspire to make your job more difficult.
But there are several quality books on the topic that you should read if you are interested in improving data quality at your shop. With that in mind, I want to briefly write about four interesting data quality books and how they can help.
First up is Data Quality: The Accuracy Dimension by Jack E. Olson. The book details aspects of assessing the quality of corporate data and improving its accuracy using the data profiling method. Olson pioneered the field of data profiling at Evoke Software (which was acquired by Similarity, and subsequently Informatica). Data profiling is a technology that supports and enhances the accuracy of database data. It is a process used to discover the existence of inaccurate data within the database. Using analytical algorithms and techniques, data profiling can uncover data that is valid, but not accurate. In this book, Jack Olson explains data profiling and shows how it fits into the larger picture of data quality.
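To make the idea of data that is valid but not accurate a little more concrete, here is a minimal Python sketch of one common profiling check: counting how often each value appears in a column and flagging values that show up suspiciously often. This is not taken from Olson's book. The function names, the sample birth dates, and the 20% threshold are all assumptions made up for illustration. The placeholder date 1900-01-01 passes any date-format validation, yet its frequency suggests it is a default that was keyed in rather than a real birth date.

```python
from collections import Counter


def profile_column(values):
    """Summarize one column: row count, nulls, distinct values, value frequencies."""
    # Treat None and empty strings as missing (an assumption for this sketch).
    non_null = [v for v in values if v is not None and v != ""]
    freq = Counter(non_null)
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(freq),
        "frequencies": freq,
    }


def suspicious_values(profile, max_share=0.2):
    """Return values that account for more than max_share of the non-null rows."""
    total = sum(profile["frequencies"].values())
    if total == 0:
        return []
    return sorted(
        value
        for value, count in profile["frequencies"].items()
        if count / total > max_share
    )


# Hypothetical sample data: every non-null entry is a well-formed date.
birth_dates = [
    "1971-03-14", "1900-01-01", "1985-11-02", "1900-01-01",
    "1990-06-30", "1900-01-01", None, "1968-09-21",
]

profile = profile_column(birth_dates)
print(suspicious_values(profile))  # prints ['1900-01-01']
```

A check like this only points at candidates. Someone who knows the data still has to decide whether a frequent value is a real pattern or a placeholder.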
Next up is Michael H. Brackett’s Data Resource Quality: Turning Bad Habits Into Good Practices. Data Resource Quality features the ten most fundamental and frequently exhibited bad habits that contribute to poor data quality, and presents the strategies and best practices for effective solutions. Brackett clearly outlines the impact of poor data practices and shows how to implement more effective approaches.
Larry English’s Improving Data Warehouse and Business Information Quality is also a worthy text that shows how better data quality can reduce costs and improve profits. The book is oriented around English’s Total Quality data Management (TQdM). According to the book, TQdM is not a program or a process, but instead is a mindset, a belief system, a value system, and a culture that breeds good data habits. If you are familiar with Dr. W. Edwards Deming’s contributions to the world of quality you will certainly appreciate Chapter 11 of this book, in which English translates Deming’s 14 points of quality into 14 points of information quality.
And finally, no discussion of data quality is complete without mentioning Thomas C. Redman. His most recent book, Data Quality: The Field Guide, provides practical guidance on building a successful data quality program. It describes the most important data quality problems facing the typical organization and outlines what needs to be done to correct the problems. Each of its 36 chapters describes a single data quality issue and how to address it; together they form a field guide for improving data quality.
And while you’re at it, consider seeking out Redman’s other books on data quality. One of my favorites is Data Quality Management & Technology. Even though this book is now 15 years old, it offers a slew of useful information on managing data quality. I particularly enjoyed the descriptions of the three dimensions of data quality and the statistics Redman presents on the percentage of data that is inaccurate. And even better, if you act fast you can pick up a used copy for under $1 (just follow the link above to Amazon).
The bottom line here is that data quality is a pervasive problem within modern organizations, and these four (okay, five) books provide useful knowledge that can help you begin to address the problem in your company.