Back in April 2013 I introduced a new, recurring feature of my blog in which I discuss and review the books that I am reading and using as a data professional. The series is titled Inside the Data Reading Room, and the goal of these posts is to introduce you to new and classic books on data technology and database systems.
First up for this Summer edition of Inside the Data Reading Room, we have Patterns of Information Management by Mandy Chessell and Harald Smith (IBM Press, ISBN 978-0-12-315550-1). The authors are both long-time IBMers who have worked with data for decades. Mandy Chessell is an IBM Distinguished Engineer, IBM Master Inventor, and the chief architect for InfoSphere Solutions. Harald Smith has 30 years of experience working with data quality products and solutions and is currently a software architect at IBM specializing in information quality, integration, and governance products.
I enjoyed this book quite a bit. The authors discuss the lifecycle of data and what can be done with it, using the supply chain as an analogy. I have personally encouraged organizations to use this exact metaphor (with varying degrees of success). Along the way the authors touch on information architecture, information in motion and at rest, distribution of data, and the means by which data and information can be used to deliver benefit and ROI.
True to its title, the book offers up patterns that can be used by the reader to integrate information management techniques into their organization.
There is a lot of ground covered in the book, and that should not be surprising given the complexity and nature of data in this day and age. Managers (both technical and business), DBAs, data scientists, and anyone who manages and deals with information on a regular basis would benefit from reading this book… particularly from its framing of data management as a supply chain.
Next up in the data reading room is a slim, but informative tome from Peter Loshin titled Simple Steps to Data Encryption (Syngress, ISBN: 978-0-12-411483-8). Loshin is an independent consultant with experience in Internet protocols and open source network technologies, as well as being regularly published in journals such as Computerworld and Information Security Magazine.
Simple Steps to Data Encryption contains just under 100 pages, but those pages are easy to read and contain a lot of useful information. The book focuses primarily on GNU Privacy Guard (GnuPG) and how to use it to protect data in motion. But it also teaches about cryptography and encryption, addressing issues such as generating and managing public keys, key servers, signatures, and even encrypting data at rest using full disk encryption (FDE) on modern operating systems.
The book is framed as a story about Bob, who lives in a mythical country named Sylvania and has reasons to keep sensitive data from prying eyes. As the story unfolds, Bob learns how to use GnuPG to encrypt and protect his data.
Although it is not really for those who already possess a firm grasp of cryptography, Simple Steps to Data Encryption offers a nice introduction to the topic in an approachable and easy-to-digest manner.
And finally, we have Measuring Data Quality for Ongoing Improvement: A Data Quality Assessment Framework by Laura Sebastian-Coleman (Morgan Kaufmann, ISBN: 978-0-12-397033-6). Sebastian-Coleman is a professional data quality architect who has worked on data quality in large health care data warehouses for over a decade. She holds the IQCP (Information Quality Certified Professional) designation from IAIDQ, as well as a Certificate in Information Quality from MIT.
Her book offers a ready-to-use framework for data quality measurement. Using the information in this book you can establish meaningful data quality measurements that will work across data storage systems and products. It helps to define appropriate controls that will contribute to improving the quality of data at any organization.
The book is divided into six sections. The first focuses on the concepts and definitions necessary to set the stage for the remainder of the topics. Section two introduces the DQAF (Data Quality Assessment Framework) and section three walks through data assessment scenarios. Section four of the book applies the DQAF to data requirements and section five discusses data strategy.
It is in section six where the DQAF is defined in depth. Functions and features of the DQAF are presented, and then the final chapter offers the coup de grâce, defining the six facets and 48 measurement types that comprise the DQAF.
If you are intent on improving the quality of the data at your organization you would do well to read Measuring Data Quality for Ongoing Improvement and adopt the DQAF offered up in this fine book.
Which brings us to the end of today’s look Inside the Data Reading Room… but never fear… I still have a stack of books on my desk that I am reading and will highlight in upcoming blog entries here… so keep checking back for more!