Who Owns Data?

Who owns data?

This is a complex question that can’t be answered quickly or easily. It requires some thought and you have to break things down in order to even attempt to answer it.

First of all, there is public data and private data. One could say that public data is in the open domain and available to everyone. But what makes data public? Is data you post on Facebook now public because it is available to anyone with a browser or the Facebook app? Well, probably not. It is available only to those that you have shared the data with. But when you put it up on Facebook then Facebook likely owns it.

What about governmental data that is available freely online like that available at USA.gov and data.gov? Well, you can grab that data and use it, but that doesn’t mean you own it, does it?

Then there are all the data governance and privacy laws and regulations that impact who owns what and how it can be used. It can be difficult to fully understand what all of these laws mean and how and when they apply to you and your organization. This is especially important with GDPR compliance looming before us.

But let’s back it up a minute and think just about corporate data. It is not an uncommon question, when working on a new project or application, to ask “who owns this data?” That is an important question to have an answer for! But “owns” is probably not the correct word.

In my humble opinion, data belongs to the company and thus, the COMPANY is the owner. Each department within an organization ought to be the custodian of the data it generates and uses to conduct its business. Departments are the custodians because they are the ones who decide who has access to their data, maintain the integrity of the data they use, and ensure that it is viable for making decisions and influencing executives.

Nevertheless, this is only part of the answer to the question. You really need named individuals as custodians. These can be from the business unit or the IT group supporting the business unit. Generally speaking, if custodians are appointed in IT, they should probably not be application developers or DBAs, but perhaps data analysts or higher-level IT managers.

Application developers are responsible for writing code and DBAs are responsible for the physical database structures and performance. There needs to be a data professional in charge of the accuracy of the actual data in the databases.

Here are some things to consider as you approach your data ownership/custodian planning:

  • Understand the data requirements of all current systems, those developed in-house and those you bought. Be sure that you know all of the data interdependencies of your applications and how one app can impact another.
  • Assess the quality of your existing data in all of your existing systems. It is probably worse than you think it is. Then work on methods and approaches to improve that quality. There are tools and services that can help here.
  • Redesign and re-engineer your systems if you uncover poor data quality in your current applications and databases. You might choose to change vendors, replatform or rehost apps with poor data quality, but if the old data is still required it must be cleansed before using it in the new system.
  • Work on methods to score the quality of data in your systems and tie the performance and bonuses of custodians to the scores.
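One way to approach the scoring idea in the last bullet is to measure how many records pass a handful of simple checks. The sketch below is only an illustration: the customer records, the field names, and the two rules (completeness and a crude e-mail validity check) are all hypothetical, and a real scoring method would be driven by your own business rules.

```python
# Hypothetical sample records; field names are assumptions for illustration only.
customers = [
    {"id": 1, "name": "Ann Lee", "email": "ann@example.com"},
    {"id": 2, "name": "", "email": "bob@example.com"},
    {"id": 3, "name": "Cy Young", "email": "not-an-address"},
    {"id": 4, "name": "Di Park", "email": None},
]


def is_complete(record, fields):
    """True when every listed field is present and non-empty."""
    return all(record.get(field) not in (None, "") for field in fields)


def has_plausible_email(record):
    """Crude validity check: text before a single '@' and a dot in the domain."""
    email = record.get("email") or ""
    local, sep, domain = email.partition("@")
    return bool(local) and sep == "@" and "." in domain and "@" not in domain


def quality_score(records, checks):
    """Percentage of records that pass every check (0.0 for an empty list)."""
    if not records:
        return 0.0
    passed = sum(1 for r in records if all(check(r) for check in checks))
    return 100.0 * passed / len(records)


checks = [lambda r: is_complete(r, ["name", "email"]), has_plausible_email]
score = quality_score(customers, checks)
print(f"Data quality score: {score:.1f}%")
```

A single percentage like this is easy to track over time, which is what you need if custodian bonuses are going to be tied to it.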

What do you think? Does any of this make sense? How does your organization approach data ownership and custodians?

Posted in data, Data Quality

A Look at Data Professionals’ Salaries

The annual ComputerWorld IT Salary Survey for 2017 was recently published and it contains a great wealth of interesting data. So, as I’ve done in the past, this post will summarize its findings and report on what is going on with the data-related positions mentioned in the survey. Of course, please click on the link above to go to ComputerWorld for the nitty-gritty details on other positions, as well as a lot of additional salary and professional information.

Overall, the survey reports 3 percent growth in IT pay, with 50 percent of respondents indicating that they are satisfied or very satisfied with their current compensation. That is down from last year, when the number was 54 percent. Clearly, though, IT as a profession seems to be a sound choice: 43 percent expect their organization’s IT headcount to increase and 49 percent expect it to remain the same, while only 7 percent expect a decrease in their company’s headcount.

But all is not rosy. When looking at the amount of work that needs to be done, 56 percent expect IT workload to increase over the next year. If headcount does not rise commensurate with the additional workload, organizations will be expecting more work from their IT staff than they did last year.

Nevertheless, 85 percent say they are satisfied or very satisfied with their career in IT.

Now let’s get to the interesting part for data professionals… and that is the salary outlook for specific data jobs.

If you are the manager of a database or data warehousing group, your total compensation increased more than the norm last year, at 4.1 percent. Average compensation grew from $110,173 to $114,635.

DBA compensation grew 2.9 percent, which was just about the average. Average compensation for DBAs was $104,860, up from $101,907 in 2016.

Database developer/modeler, which is an interesting grouping, grew 2.5 percent from $96,771 in 2016 to $99,235 in 2017.
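The growth percentages above can be checked directly from the average compensation figures quoted in this post. The short sketch below does that arithmetic; the dictionary layout and the function name are mine, not the survey’s. Because the inputs are rounded averages, the results may differ from the published percentages by a tenth of a point or so.

```python
# Average total compensation (2016, 2017) as quoted in this post.
compensation = {
    "Database/data warehousing manager": (110_173, 114_635),
    "DBA": (101_907, 104_860),
    "Database developer/modeler": (96_771, 99_235),
}


def pct_growth(previous, current):
    """Year-over-year growth as a percentage of the previous value."""
    return (current - previous) / previous * 100


for title, (pay_2016, pay_2017) in compensation.items():
    print(f"{title}: {pct_growth(pay_2016, pay_2017):.2f}% growth")
```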

So things are looking OK, but not stellar for data professionals. Which IT positions grew their salary at the highest percentage? Well, the top of the heap, somewhat surprisingly, was Web Developer which grew at 6.7 percent (to an average total compensation of $76,446). The next highest growth makes a lot of sense, Chief Security Officer, which grew 6.4 percent year over year.

The common career worries looked familiar with keeping skills up-to-date being the most worrisome, followed by flat salaries and matching skills to a good position. And the biggest workplace woe? Not surprisingly, increased IT workload. But stress levels are about the same with 61 percent of respondents indicating that their level of job stress was the same as last year.

What can you do to help grow your salary this year? Well, you might consider aligning your career with one of the hot specialties called out in the survey. The top three tech functions with the highest average compensation in 2017 are cloud computing, ERP and security.

Overall, though, it looks like an IT career is a good thing to pursue… and working with data in some capacity still makes a lot of sense!

Posted in data, DBA, salary

Why isn’t e-mail more easily queryable?

Today’s blog post is just a short rumination on finding stuff in my vast archive of email…

Do you ever wonder why e-mail systems don’t use database management technology?  Don’t you store some of your e-mails for long periods of time?  Do you group them into folders?  But then, isn’t it hard to find anything later?

Anybody who uses email like I do needs to know which folder is which and which e-mail has the information you need in it.  And it isn’t usually obvious from the folder name you gave it (which made sense at the time) or the subject of the e-mail (which might not have anything to do with the actual content of the e-mail you’re looking for).  And sometimes emails get stored in the wrong folder…

I’d sure love to be able to use SQL against my e-mail system, writing something like:

SELECT  TEXT
FROM  ALL OF MY RIDICULOUS E-MAIL FOLDERS
WHERE TEXT CONTAINS 'NEW DB2 PRODUCT ROLLOUT';

Or something like that.

Wouldn’t you?
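For what it’s worth, the wish is not far-fetched once the messages are sitting in a table. Here is a minimal sketch, assuming the e-mail has already been exported into a database; it uses Python’s built-in sqlite3 module, and the table name, column names, and sample messages are all made up for illustration. It is not how any particular e-mail system stores its data.

```python
import sqlite3

# Hypothetical schema and sample data; a real mail store would need an export step.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE email (folder TEXT, subject TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO email (folder, subject, body) VALUES (?, ?, ?)",
    [
        ("Misc", "Lunch?", "Are you free on Friday?"),
        ("Vendors", "Re: Re: Fwd: update", "Details on the new DB2 product rollout are attached."),
        ("Projects", "Status", "Nothing new to report this week."),
    ],
)

# Search every folder at once; LIKE is case-insensitive for ASCII text in SQLite.
rows = conn.execute(
    "SELECT folder, subject FROM email WHERE body LIKE ? ORDER BY folder",
    ("%NEW DB2 PRODUCT ROLLOUT%",),
).fetchall()

for folder, subject in rows:
    print(f"{folder}: {subject}")
```

Note that the matching message is found even though its subject line says nothing useful and it was filed in a folder you might never think to look in.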

Posted in e-mail, SQL

One of the Top Database Blogs on the Web

Very proud to announce that the Data & Technology Today blog was selected as one of the top 60 database blogs on the web by Feedspot.

You can read all about it here – as well as learn about 59 other great database and data-related blogs that you might want to follow.

Posted in data

News from IBM InterConnect 2017

This week I am in Las Vegas for the annual IBM InterConnect conference. IBM touts the event as a way to tap into the most advanced cloud technology in the market today. And that has merit, but there is much more going on here.

If I had to summarize the theme of InterConnect I would say that it is all about cloud, IBM’s Watson, and DevOps. This is evident in terms of the number of sessions being delivered on these topics, as well as the number of vendors and IBM booths in the concourse area devoted to these topics.

But the highlight of the conference for me, so far, was Ginni Rometty’s keynote address on Tuesday. She was engaging and entertaining as she interacted with IBM customers and partners to weave the story of IBM’s cloud and cognitive computing achievements. The session is available for replay on IBMGO and it is well worth your time to watch it if you are at all interested in how some of the biggest and most innovative organizations are using IBM technology to gain competitive advantage.

And let’s not forget that Will Smith – yes, that Will Smith – was part of the general session on Monday. Not surprisingly, he was intelligent and amusing, calling himself an African-American Watson as he described how he used primitive data analytics to review the types of movies that were most successful as he planned his acting career. My favorite piece of advice he offered was something that he learned as he moved from music to acting. When he was asked if he had ever acted before (he hadn’t) he said “Of course,” and it led to him getting cast in the mega-hit sitcom The Fresh Prince of Bel-Air. His advice? “If someone asks if you have ever done something just say ‘yes’ and figure it out later.” He had a lot more to say, but let me send you here if you are interested in reading more about Will.

Of course, there is a lot more going on here than just what is happening in the keynote and general sessions. Things I’ve learned this week include:

  • DevOps is as much about business change as technology change
  • The largest area of growth for DevOps is now on the mainframe (according to Forrester Research)
  • Some companies are bringing college grads up to proficiency in mainframe COBOL in less than a month using a modern IDE
  • Networking is the hidden lurking problem in many cloud implementations
  • The mainframe is not going away (I knew this, but it was good to hear a Forrester analyst say it)
  • And a lot more

But that is enough for now. So to conclude, I’d like to end with a quote from Ginni Rometty that I think all of us in IT should embrace: “Technology is never good or bad; it is what you do with it that makes a difference.”

Let’s all get to work and do good things with technology!

Posted in cloud, DevOps, IBM, Watson

Inside the Data Reading Room – Analytics Edition

If you are a regular reader of this blog you know that, from time to time, I review data-related books. It has been over a year since the last book review post, so this post is long overdue.

Today, I will take a quick look at a couple of recent books on analytics that might pique your interest. First up is The Data and Analytics Playbook by Lowell Fryman, Gregory Lampshire and Dan Meers (2017, Morgan Kaufmann, ISBN 978-0-802307-5).

This book is written as a guide to proper implementation of data management methods and procedures for modern data usage and exploitation.

The first few chapters lay the groundwork and delve into the need for a new approach to data management that embraces analytics. Then, in Chapter 3, the authors guide the reader through steps to assess their current conditions, controls and capabilities with regard to their data. The material here can be quite helpful in gauging where your organization falls in terms of data maturity. Chapter 4, which chronicles the detailed activities involved in building a data and analytics framework, comprises about a quarter of the book, and this chapter alone can give you a good ROI on your book purchase.

Chapter 8 is also well done. It addresses data governance as an operations process, giving advice and a framework for successful data governance. If you are at all involved in your organization’s data management and analytics practice, do yourself a favor and grab a copy of this book today.

The second book I will cover today is a product-focused book on IBM’s Watson Analytics product. Most people have heard of IBM’s Watson because of the Jeopardy challenge. But if your only knowledge of Watson is how it beat Jeopardy champions at the game several years ago, then you need to update what you know!

So what book can help? How about Learning IBM Watson Analytics by James D. Miller (2016, Packt Publishing, ISBN 978-1-78588-077-3)?

This short book can help you to understand what Watson can do for your organization’s analytics. It shows how to access and configure Watson and to develop use cases to create solutions for your business problems.

If you are a nascent user of Watson, or are just looking to learn more about what Watson can do, then this is a superb place to start. Actually, if you learn best through books, then this is the only place to start because it is currently the only book available on IBM Watson Analytics.

As with any technology book that walks you through examples and screen shots, as the product matures over time, things may look different when you actually use Watson. But that is a small issue that usually won’t cause distraction. And with all of the advice and guidance this book offers in terms of designing solutions with Watson, integrating it with other IBM solutions, and more, the book is a good place to start your voyage with Watson.

Hopefully, you’ll find these two books as interesting and worthwhile as I did!

Posted in analytics, book review, books, DBA, Watson

Time to Plan Your Trip to IBM InterConnect 2017

I am looking forward to attending this year’s IBM InterConnect conference in Las Vegas, NV the week of March 19-23, 2017. And after reading my blog post today I bet you will be interested in attending, too!

The first thing you will notice is that IBM InterConnect covers a plethora of technical topics, including some of the hottest and most important for your business. If you attend the conference you can learn about Hybrid Cloud, Process Transformation, Integration, Internet of Things, DevOps, IT Service Management, Security, Data Management, and more. There are educational presentations as well as hands-on sessions that allow you to build your conference experience how you’d like, using the learning techniques that best suit you.

And there are a lot of learning opportunities! IBM InterConnect has over 2,000 sessions, 200 exhibitors, hundreds of labs, certification opportunities, as well as the ability to network with other IT professionals from all around the world.

For me, there are several sessions that I’m very much looking forward to attending. The Continuous Delivery keynote on March 20th promises to inform and educate on DevOps best practices, including a roadmap for IBM’s UrbanCode. On Tuesday I’m excited about the session on “How Watson Really Works”… I mean, who wouldn’t be interested in learning more about the AI, natural language and machine learning capabilities of IBM Watson? And Wednesday offers an intriguing session for mainframers like me – “Why z/OS is a Great Platform for Developing and Hosting APIs.”

Of course, there are a lot of additional sessions that I plan to attend, but I doubt anybody is interested in a rundown of my entire agenda. Especially with so much variety and choice available to attendees this year. And if you get stuck choosing from all the great sessions that are available, this year you can solicit Watson’s help recommending sessions as you build your agenda. I tried it and it was interesting and helpful to see what Watson chose for me.

So take a look at what IBM InterConnect has to offer this year. And if you plan on attending I hope we get a chance to meet and discuss our experiences at the conference. See you in Vegas!

Posted in certification, cloud, education, enterprise computing, IoT

Data Technology Today’s 2016 Year in Review

Well, another year has come and gone and I thought it might be interesting to share a bit about this blog’s activity in 2016. It was an active year that saw 17 new posts, down a bit from 2015, but still averaging more than a post a month.

Posts on the blog were viewed 47,264 times by 36,870 visitors, meaning each visitor averaged 1.28 views.

The most popular post in 2016 was actually first posted in 2011: An Introduction to Database Design: From Logical to Physical was viewed 10,575 times in 2016. Obviously database design is an interesting topic – at least for the readers of this blog!

The second most popular post in 2016 was On The Importance of Database Backup and Recovery, which was first posted in 2014.  The most popular post actually posted in 2016 was published in December, late in the year to lead the year, but evidently people are interested in A Useful Guide to Data Fundamentals from Fabian Pascal. As well they should be!

And the blog gets read all over the world, as shown in the Top Ten Countries visiting in 2016 below:

[Image: Top Ten Countries visiting in 2016]

Yes, most of my readers are from the United States, but I’m proud of the following I have in India (and across the world).

So to end this brief synopsis of 2016, thank you to all of my regular readers – please keep visiting and suggesting more topics for 2017 and beyond. And if this is your first visit to the blog, welcome. Take some time to view the historical content – there are several informative posts that are popular every year… and keep checking back for new content on data, database, and related topics!

Posted in backup & recovery, DBA, review

A Good Start for Your SQL Library

Every professional programmer (and DBA) should have a library of books on SQL fundamentals. There are many SQL titles to choose from, and a lot of them are very good. But you can’t buy them all unless you are independently wealthy. So this blog post will highlight the first four SQL books that should be on every database professional’s bookshelf.

The first SQL book is SQL Performance Tuning by Peter Gulutzan and Trudy Pelzer. This well-written book provides a treasure trove of tips for improving SQL performance on all of the major database systems. It does not teach SQL syntax, but instead helps the reader to understand the differences between the most popular DBMS products, including Oracle, DB2, SQL Server, Sybase ASE, MySQL, Informix, Ingres, and even InterBase.

Throughout this book the authors present and test techniques for improving SQL performance, and grade each technique for its usefulness on each of the major DBMSs. If you deal with heterogeneous database implementations this book will be a great assistance, whether you are a programmer, consultant, DBA, or technical end user. The contents of this book can help you to decide which tuning techniques will work for which DBMS.

My next SQL book recommendation is altogether different in purpose than the first. It is SQL in a Nutshell, 3rd edition by Kevin Kline, Daniel Kline, and Brand Hunt. This book offers a great cross-platform syntax reference for SQL. It probably is not the easiest reference to use for finding the exact syntax for one particular DBMS; but it is absolutely the best reference for those who work with multiple DBMSs.  Be sure to get the 3rd edition, which is up-to-date and offers more depth than the previous editions.

Next up is The Art of SQL by Stephane Faroult, which is a guide to SQL written using the approach of “The Art of War” by Sun Tzu. The author actually uses the exact same chapter titles for The Art of SQL that Sun Tzu used in The Art of War. Amazingly enough, the tactic works.

Consider, for example, the chapter titled “Laying Plans,” in which Faroult examines how to design databases for performance. As anyone who has ever built database applications knows, an improperly designed database can be the biggest impediment to flawless application performance. The chapter titled “Tactical Dispositions” covers the topic of indexing, and in “The Nine Situations” the author examines several classic SQL patterns and how best to approach them.

This book is not for a novice who wants to learn SQL from scratch. The author assumes the reader is conversant with SQL as he describes how to apply SQL in a practical manner. If you can’t code an outer join or don’t know what a nested table expression or in-line view is, then this is not the book for you.
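As a quick self-check on those prerequisites, here is a small sketch of both constructs, run through Python’s built-in sqlite3 module. It is my own example, not one from the book, and the dept and emp tables and their contents are hypothetical. The query uses a nested table expression (a SELECT in the FROM clause, also called an in-line view) to count employees per department, and a left outer join so that departments with no employees still appear in the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables for illustration only.
conn.executescript("""
    CREATE TABLE dept (deptno INTEGER PRIMARY KEY, deptname TEXT);
    CREATE TABLE emp  (empno INTEGER PRIMARY KEY, deptno INTEGER);
    INSERT INTO dept VALUES (10, 'Sales'), (20, 'Research'), (30, 'Audit');
    INSERT INTO emp  VALUES (1, 10), (2, 10), (3, 20);
""")

query = """
    SELECT d.deptname, COALESCE(c.headcount, 0) AS headcount
    FROM dept AS d
    LEFT OUTER JOIN (
        -- nested table expression (in-line view): one row per department with employees
        SELECT deptno, COUNT(*) AS headcount
        FROM emp
        GROUP BY deptno
    ) AS c ON c.deptno = d.deptno
    ORDER BY d.deptno
"""
rows = conn.execute(query).fetchall()
for deptname, headcount in rows:
    print(deptname, headcount)
```

With an inner join the Audit department would silently drop out of the result; the outer join keeps it, and COALESCE turns the missing count into a zero.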

Neither is the book a list of SQL scripts that you can pluck out and use. Instead, the book skillfully manages to explain how to properly attack the job of coding SQL to effectively and efficiently access your data.

The book offers best practices that teach experienced SQL users to focus on strategy rather than specifics. You know, if Sun Tzu coded SQL, he might have written a book like “The Art of SQL”. But since Sun Tzu is dead, I’m glad Stephane Faroult was around to author this.

The final SQL book recommendation is Joe Celko’s SQL for Smarties, Fifth Edition: Advanced SQL Programming. Celko was a member of the ANSI SQL standards committee for ten years, and is highly qualified to write such a text. The fifth edition, which was completely revised (in 2015), boasts over 800 pages of advanced SQL programming techniques. If you have any of the past editions of this book, you owe it to yourself to get the newly revised fifth edition.

This book offers tips, techniques, and guidance on writing effective, sometimes complex, SQL statements  using ANSI standard SQL. It touches on topics ranging from database design and normalization to using proper data types to grouping and set operations, optimization, data scaling, and more. Every developer who codes SQL statements for a living will find something useful in SQL for Smarties!

These four books, properly used, can turn a fledgling SQL developer into an expert – and they can help even the expert become a better user of SQL.

Posted in books, DBA, performance, SQL

A Useful Guide to Data Fundamentals from Fabian Pascal

Today’s blog post is a quick review of Fabian Pascal’s latest book The DBDEBUNK Guide to Misconceptions and Data Fundamentals. Those of you who know Pascal will know what to expect – a no-holds-barred treatise on the fundamentals of data and database management focusing on the relational model and its benefits. But you will also get a true understanding of what relational means. You see, what is commonly called a relational DBMS more accurately should be referred to as a SQL DBMS. If that doesn’t make sense to you then you are exactly the type of reader who should buy and read this book.

The book is self-published, so it is only available at Mr. Pascal’s website, dbdebunk.com. But the material is well-written and laid out in an easy-to-consume fashion. This comes as no surprise given Pascal’s extensive history and background writing about relational theory and databases (multiple decades of books, articles, blogs, etc. to his credit).

What makes this book special? Well, the material is culled from the DATABASE DEBUNKINGS website. There are 50 chapters, each exposing a common misconception or misunderstanding about data and relational fundamentals. Pascal shows the misconception and then clearly explains the problems and implications that arise because of it. Furthermore, he goes on to provide an explanation of the correct way. This approach, once you get comfortable with it, offers a sound format for exposing fallacies and correcting people’s misunderstanding of the issues.

As the foreword notes, there is an Index of Misconceptions rather than a Table of Contents. You can use that index to seek out sections of the book that focus on specific misconceptions, such as on keys or unstructured data.

Also nice is the list of further reading provided in each section. If you read (and understand) not only this book, but also all of the referenced materials you will probably know more about data and database systems than most people working as data analysts and DBAs.

If your job requires you to manage, access or manipulate data, you would do well to read Pascal’s Guide. Even if you think you are an expert on relational theory and data fundamentals I am sure that the information in this powerful tome will offer up useful information. And for most people, who think they know more than they actually do, this book will deliver a wealth of knowledge that will serve you well as you progress with your IT career.

Posted in book review, DBA, relational