One of the most common questions I get when talking to folks about database administration is how to measure the effectiveness and quality of a DBA staff. This is not an easy question to answer, for a number of reasons. The most important reason is that the role of the DBA is constantly changing, and measuring something that is always in flux is challenging. One example of this constant change is that DBAs need to go beyond relational, to manage more than just relational/SQL database systems. But there are more examples, about which I have written and spoken extensively.
One popular post of mine, titled How Many DBAs?, discusses the difficulty of determining the appropriate staffing level for a DBA group. Basically, it boils down to the techies usually thinking that more DBAs are needed, and management saying that there are already enough (or, even worse, too many) DBAs on staff. The humorous reply from the DBA manager when asked how many DBAs they need is always the same: “one more, please!”
And that brings me to today’s entry, in which we look at what types of metrics are useful for measuring the quality of a DBA’s work. So, what is a good way to measure how effective your DBA group is?
A good DBA has to be a jack of all trades. And each of these “trades” can have multiple metrics for measuring success. For example, a metric suggested by one reader was to measure the number of SQL statements that are processed successfully. But what does “successfully” mean? Does it mean that the statement is syntactically correct, that it returned the correct results, or that it returned the correct results in a reasonable time?
And what is a “reasonable” time? Two seconds? One minute? Half an hour? Unless you have established service level agreements (SLAs), it is unfair to measure the DBA on response time. And the DBA must participate in establishing reasonable SLAs (in terms of cost and response time), lest he or she be handed a task that cannot be achieved.
Measuring the number of incident reports is another oft-cited potential metric. Well, this is fine if it is limited to only true problems that might have been caused by the DBA. But not all database problems are legitimately under the control of the DBA. Should the DBA be held accountable for bugs in the DBMS (caused by the DBMS vendor); for poor SQL written by developers when no time is given to the DBA for review; or for design elements forced on him or her by an overzealous development team (which happens all the time)?
I like the idea of using an availability metric, but it should be tempered against your specific environment and your organization’s up-time requirements. In other words, what is the availability required? Once again, back to SLAs. And the DBA should not be judged harshly for failing to achieve availability if the DBMS does not deliver the capabilities needed for high availability (e.g., online reorganization and online change management), or if the organization does not purchase reasonable availability solutions from a third-party vendor. Many times the DBA is hired well after the DBMS has been selected. Should the DBA be held accountable for deficiencies in the DBMS itself if he or she had no input at all into the DBMS purchase decision?
And what about those DBA tools that can turn downtime into up-time and ease administrative tasks? Well, most DBAs want every such tool they can get their hands on. But if the organization has no (or little) budget, then the tools will not be bought. And should the DBA be held responsible for downtime when he or she is not given the proper tools to manage the problem?
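Before an availability metric can be fair, both sides need to agree on how availability is computed against the SLA target. A minimal sketch of that arithmetic is below; the 99.9% target and the 90 minutes of downtime are hypothetical numbers, not figures from any particular SLA.

```python
from datetime import timedelta

def availability_pct(period: timedelta, downtime: timedelta) -> float:
    """Availability as a percentage of the measurement period."""
    uptime = period - downtime
    return 100.0 * (uptime / period)  # timedelta / timedelta yields a float

# Hypothetical example: 90 minutes of downtime in a 30-day month
month = timedelta(days=30)
observed = availability_pct(month, timedelta(minutes=90))

sla_target = 99.9  # hypothetical "three nines" SLA
meets_sla = observed >= sla_target
```

Even 90 minutes of downtime in a month, which may sound acceptable, falls short of a three-nines target; that is exactly why the required availability must be negotiated up front rather than assumed.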
OK then, what about a metric based on response to problems? This metric would not necessarily mean that the problem was resolved, but that the DBA has responded to the “complaining” entity and is working on a resolution. Such a metric would lean toward treating database administration as a service or help desk type of function. This sounds more reasonable, at least from the perspective of the DBA, but I actually think this is much too narrow a metric for measuring DBAs.
Any fair DBA evaluation metric must be developed with an understanding of the environment in which the DBA works. This requires an in-depth analysis of things like:
- number of applications that must be supported,
- number of databases and size of those databases,
- number of database servers,
- types of database systems (pre-relational, relational, NoSQL, etc.),
- use of the databases (OLTP, OLAP, web-enabled, analytics, ad hoc, etc.),
- number of different DBMSs (that is, Oracle, Db2, Sybase, MySQL, IMS, etc.),
- number of OS platforms to be supported (Windows, UNIX, Linux, z/OS, iSeries, etc.),
- on-premises versus cloud implementations and workloads,
- special consideration for ERP applications due to their non-standard DBMS usage,
- number of users and number of concurrent users,
- type of Service Level Agreements in effect or planned,
- availability required (24/7 or something less),
- the impact of database downtime on the business ($$$),
- backup and recovery requirements, including disaster recovery planning and recovery time objectives (RTOs),
- performance requirements (subsecond or longer; this gets back to the SLA issue),
- type of applications (mission-critical vs. non-mission-critical),
- frequency of change requests.
This is probably an incomplete list, but it accurately represents the complexity and challenges faced by DBAs on a daily basis. Of course, the best way to measure DBA effectiveness is to judge the quality of all the tasks that they perform. But many aspects of such measurements will be subjective. Keep in mind that a DBA performs many tasks to ensure that the organization’s data and databases are useful, usable, available, and correct. These tasks include data modeling, logical and physical database design, database change management, performance monitoring and tuning, assuring availability, administering security and authorization, backup and recovery, ensuring data integrity, and, really, anything that interfaces with the company’s databases. Developing a consistent metric for measuring these tasks in a non-subjective way is challenging.
You’ll probably need to come up with a complex formula encompassing all of the above (and more) to do the job correctly. And that is probably why I’ve never seen a fair, non-subjective, metric-based measurement program put together for DBAs. If any of my readers have a measurement program that they think works well, I’d like to hear the details of the program, and how it has been accepted by the DBA group and management.
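For readers who want to experiment, one common shape for such a formula is a weighted scorecard: normalize each metric to a 0-to-1 score and weight it by its importance in your environment. The metrics, weights, and scores below are all hypothetical placeholders; the point is the structure, not the numbers, and choosing fair weights is precisely the hard, environment-specific part discussed above.

```python
# Hypothetical weighted scorecard for DBA effectiveness.
# Each metric is normalized to 0.0 (worst) .. 1.0 (best), then
# weighted by its importance; weights must sum to 1.0.
weights = {
    "sla_attainment":    0.35,  # fraction of SLA targets met
    "availability":      0.25,  # measured uptime vs. required uptime
    "incident_response": 0.20,  # timely acknowledgement of reported problems
    "change_success":    0.20,  # changes applied without rollback
}

scores = {  # illustrative numbers only
    "sla_attainment":    0.92,
    "availability":      0.999,
    "incident_response": 0.85,
    "change_success":    0.95,
}

overall = sum(weights[k] * scores[k] for k in weights)
```

Note that every input here (what counts as a "met" SLA, what an incident response is worth) still has to be defined per the environmental factors listed earlier, which is why a purely objective program remains elusive.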