The Problem with Prediction

Predicting the future is a messy business. I try to avoid making predictions about the future of technology for many reasons. First off, nobody can see into the future, no matter what some fortune tellers say. Secondly, technology changes quickly and sometimes something revolutionary will show up and render your prediction (guess) irrelevant. As an example, few people in the late 1980s foresaw the Internet revolution. Or blogs like this one, for that matter!

Every now and then I may try to alert readers about some technology that I think will be disruptive, but I don’t usually try to predict what software or companies will succeed or perish.

Now as my regular readers know, I undertook an office cleaning project late last year — and I wrote a blog post about some of the material I encountered organizing my office closet (see On Being Skeptical in the Face of Hype).

Well, that closet is as clean as it is going to get for the time being, but I periodically attempt to reorganize my many bookshelves, too. While doing so, I happened across an old book written by John C. Dvorak called Dvorak Predicts. Dvorak is an industry pundit and popular technology writer. I readily admit to being a fan of his work. He is entertaining. But his powers of prediction are poor. And why wouldn’t they be? No one knows what tomorrow will bring…

Now, what I am about to do is not meant to discredit Dvorak specifically, but to alert everyone that even the most talented writers and pundits have a hard time predicting the future. Here, direct from his book, Dvorak Predicts (published in 1994 by McGraw-Hill, ISBN 0-07-881981-4), are some real stinker predictions:

“We can expect IBM to someday quit the mainframe business just as it quit the scientific computer business.” (page 20)

Although this prediction may come true some day, 20 years later the mainframe is still viable and still a core offering from IBM. Why didn’t he predict that IBM would quit the PC business (which it did in 2004)? – Craig

“Voice recognition will be the killer application of the 1990s.” (page 21)

Didn’t happen. Oh, IBM and Dragon had some voice recognition applications that sold in so-so amounts, but killer app? Nope. Still not even a semi-killer app today. — Craig

“Microsoft will open stores.” (page 64)

Never happened; good thing for Microsoft, too since some of the other companies that opened stores didn’t fare all that well (e.g. Gateway). — Craig

Furthermore, Dvorak predicted that Unicode would lead to the death of ASCII by 1995. Well, in the long term I’m sure that prediction will come true, but the “by 1995” part didn’t! Did he ever really believe that? I mean, even though Unicode is important today, ASCII is still very much alive and kicking.

And a few things that Dvorak does not even mention include the World Wide Web, Java, XML, and spam. One would think that a prescient prognosticator would foresee these facts of everyday life in the world of IT. But no…

Of course, Dvorak did get some things absolutely correct. For example: “Piracy will increase despite efforts to stop it.” He wrote that prediction about software piracy but it is absolutely applicable today with regard to media, especially music and movies. But predicting that people will continue to do bad stuff is simplistic and easy to do.

Furthermore, Dvorak said that “Gerstner will be good for IBM,” which he undoubtedly was. And he predicted a rosy future for recordable CD and optical media (then again, back then, who wasn’t predicting that?).

A lot of the predictions (and there are many more than are mentioned here) strike me, here in the future, as of the “who cares” variety. By that I mean, I would expect a useful book of predictions to predict things that matter in the future, and not things that are just dust bunnies from the past (e.g. Apple Newton, OS/2). And he never foresees the Palm or Windows Pocket PCs; he just calls the basic idea behind the Newton “stupid” (page 83)… nor does he anticipate the tablet or smartphone.

So, what good is a book of predictions if the majority of them don’t come true? Exactly! No dang good at all – other than (maybe) as an amusing read. The book is out of print, but if you find a copy you might shell out the dollar or two it takes to buy it these days. And here in the future, 20 years later, you might get a chuckle out of some of it… or at least enjoy reading it as an historical piece on an interesting time in the history of computers – that being, the timeframe right before the Web exploded.

What does any of this have to do with database management? Well, not much, at least directly. But the next time someone tells you that some new database technology is going to render everything you know to be obsolete (hmmm, NoSQL? Hadoop?), slow down and think about the problem with prediction before you make any hasty moves.


The Ethical DBA?

Today’s posting is a re-blog of a post I wrote several years ago for another blog (that has since been discontinued). But I think the subject matter is important enough that it warrants re-blogging… So I touched it up and made a few changes here and there to make sure links still worked… and here it is…

Reading Stephen Wynkoop’s thoughts on a code of ethics for DBAs over at SSWUG recently got me to thinking. He brings up a lot of good points. His basic premise is that no code of ethical behavior for DBAs exists, but one could be useful. I think he is on to something here.

In fact, I liked his idea so much that I e-mailed him some of my thoughts that he shared with his readers. I will share the contents of that e-mail with you here:

“I applaud your notion of creating a DBA code of ethics. However, it is not going to be a very easy task to complete. The problem is one of degree. By that I mean, at what level should the ethical code be created? Consider the approach taken by Google, for example, whose much-ballyhooed notion of “do no evil” is, in my opinion, worthless. I mean, who would disagree with “do no evil”? But what does that mean at the detailed level? For example, is it “evil” for Google to agree to censor its results in China in order to do business there? From the perspective of the free market, it might be perceived as such. From the perspective of the Google stockholder, though, it might be viewed as “evil” not to try to expand its business into one of the world’s largest markets. So what is evil?

Okay, I hope I’ve made my point that a high-level ethical code is too nebulous to be useful. What next? Who can envision all of the potential uses for data and what might be done with it? And then, once done with this impossible task, who can correctly indicate what the ethical approach is in each situation? And wouldn’t the ethics of the situation differ (even slightly) based on who is observing? Yes, there would be clear-cut cases, such as the one your reader pointed out about tracking employees’ positions by cell phone. But what about areas that are more gray? Once again, let me use Google as an example. They have announced their intention to digitize the world’s books to make it possible to search their contents. Is this ethical? To the many people out there who would like to be able to search for a book based on a phrase or portion of its content, this probably seems like the furthest thing from evil. But to many authors it seems like a copyright violation. Is this fair use? Who gets to decide? That may be easier than deciding whether it is “ethical.”

In many cases, what is ethical can come down to a religious issue. But this makes it an individual issue. There is no global religion…  What is right for a Muslim is not necessarily right for a Christian, Jew, or Shintoist (to name but four of the world’s numerous religions). And then there are always the agnostics and atheists, who usually subscribe to a humanist ethical code… but not always. So no help there.

I laud your thinking, but remain skeptical on the practicality of the matter. Perhaps, at least at present, we are best served by government regulation – at least here in the good old US of A. When I see regulations like HIPAA and Sarbanes-Oxley I applaud – not because of all the work they generate, but because they seem (to me at least) to be written such that they make companies do what they should be doing anyway – if they were “ethical”.”

OK, so where does that leave us? I think most people would agree that a code of ethics would be helpful, but we need to set some guidelines on what those ethics would cover. A reader of one of my recent blog posts (Massive Data Sweep) posted a comment raising some of the pertinent issues. Things like personal ethics versus legality, whistleblowing, job security, and the like.

Perhaps a useful starting point is offered by the USENIX System Administrator’s Code of Ethics. Feel free to post your thoughts on this issue below. I think this is something that the profession will have to deal with soon, but it is not something that we’ve given much thought to yet.


Ten Steps to Minimize Data Risk During an Acquisition

Merger and acquisition activity continues to be a constant in today’s fast-paced, ever-changing world… and those of us who work in the field of IT and data management are likely to be involved in an acquisition, perhaps multiple times during our careers.

What will you do when you find out you’re about to acquire or consolidate with another firm or division? Are you aware of the risks you may be inheriting? What data is going to demand the highest availability? What IT regulations will you have to address and how do you know if existing controls already address them?

Below are 10 “data health” checks you can conduct to answer these questions before giving a green light to an M&A or consolidation.

Step One: Assess your data – From a data perspective, the first step needs to be an assessment of the independent data assets of each organization participating in the merger. If you do not know what data exists before the acquisition, gaining this understanding after combining the data, if it can be combined at all, will be extremely difficult. The task at hand will be simpler if both organizations practiced strong data governance. This is rarely the case though.

Step Two: Plug the governance gaps – After completing an honest assessment of where each organization stands in terms of data governance, the next step needs to be plugging the gaps. Work toward defining any data that is not well understood or not yet documented. Do not turn this into a long process; define what data you have and where it is stored. Consider using tools like data dictionaries and repositories and consult the subject matter experts (business users, programmers, data architects, etc.) at each organization for this information.

Step Three: Leverage the M&A for governance improvements – Use the acquisition as a springboard for instituting new or stronger data governance policies and procedures. Lack of insight into important business data, such as can occur during an acquisition, can be a strong motivational tool for implementing improved data management practices.

Step Four: Plan for increased workload and capacity – The basic premise behind mergers is that the combined companies can conduct the same, or more, business more efficiently than two separate companies. Will you need more powerful hardware? Will the new combined business require more uptime, which translates to higher data availability? Existing hardware, transactions, applications, databases, and data maintenance processes may need to be overhauled to meet the new requirements. For mainframe shops, perhaps you can better utilize cheaper specialty engines like zIIP, zAAP, and IFL processors. And look into more efficient software and utilities for performance management, change control, and backup and recovery.

Step Five: Evaluate backup and recovery plans – A simple question, such as “Is everything backed up?”, can be a crucial starting point. In the clamor of an acquisition, evaluating recoverability is often forgotten or ignored. The on-going health of those backups must be checked periodically.

A systematic method of reviewing recovery health is helpful, including examination of backups, evaluation of IT hardware and software specifications, and matching objectives to reality. Assuring recoverability is made more difficult by the hectic nature of a merger or acquisition, and recoverability will certainly be impacted if it is ignored.

Step Six: Determine your new exposure to IT regulations – If the acquisition brings new types of business, even if it is related to your existing business, be sure to allot time and resources to examining and ensuring compliance with required IT regulations, such as PCI DSS, HIPAA, and Sarbanes-Oxley. Tools are available to help reduce the cost and complexity of compliance programs and processes.

Step Seven: Implement a database auditing solution – Because of the hectic nature of mergers and acquisitions, with their many last-minute requests and requirements, things can fall through the cracks. A database auditing solution will routinely monitor data activity and keep an audit trail of users and changes in content.

Additionally, database auditing can be used to keep an eye on your privileged users, such as DBAs and system administrators, who have unfettered access to critical data. Some can even track data access patterns to look for anomalous activity and raise alerts when it occurs.

Step Eight: Plan personnel to cover combined systems –  During most acquisitions, dual systems and applications need to be supported. Plan for the proper personnel and support to protect and manage all of the data while redundant systems are required. In other words, don’t pull the switch too soon – on systems or personnel.

Step Nine: Identify and eliminate the redundant – Inventory and eliminate redundant system software and utilities to reduce cost post-acquisition. For example, if most of your applications use DB2 as the DBMS but a few use a different DBMS, choosing a somewhat less functional application that uses DB2 instead may allow you to eliminate a costly DBMS license. However, such decisions need to be made in an informed, business-driven manner and not simply with an eye toward reducing system software cost.

Step Ten: Intelligently automate your data management solutions – The more maintenance tasks you can automate, the fewer problems will occur to stall the integration of your systems. Using software to automate and conduct on-going health checks on your data and data management activities can improve the overall responsiveness of IT to business needs.

The Bottom Line

Getting a handle on all of the data in your post-acquisition organization requires time and effort. But it need not be fraught with risk. With proper planning, tools, and resources, along with realistic expectations and these 10 steps, you can reduce the risk to your valuable business data.

 


How to Measure the DBA’s Effectiveness?

Every now and then when I am talking to clients (and database professionals of all types) somebody raises the age-old question of how to measure the effectiveness and quality of their DBA staff. I tackled a related question on this blog a while ago (How Many DBAs?), but it can be difficult to effectively determine the appropriate staffing level for a DBA group. Basically, it boils down to the techies usually thinking that more DBAs are needed, and management saying that there are already enough (or, even worse, too many) DBAs on staff.

Actually, it seems like the answer to “How many DBAs do we need?” is always “One more!” no matter how many are on staff… but I digress.

So let’s get back to today’s entry, in which we look at what types of metrics are useful for measuring the DBA’s quality of work. So, what is a good way to measure how effective your DBA group is?

This is not a very easy question to answer because a DBA has to be a Jack of all Trades… and to quote a retired friend of mine (Hello Roger!), “DBAs should be a Jack of all Trades and a master of some.” The important consideration here is that each of these “Trades” can have multiple metrics for measuring success. For example, a metric suggested by some DBAs is to measure the number of SQL statements that are processed successfully. But what does “successfully” mean? Does it mean simply that the statement returned the correct results, or does it mean it returned the correct results in a reasonable time? And what is a “reasonable” time? Two seconds? One minute? A half hour? Unless you have established service level agreements it is unfair to measure the DBA on response time. And the DBA must participate in establishing reasonable SLAs (in terms of cost and response time) lest he be handed a task that cannot be achieved. Of course, service levels are not just a good idea for SQL performance; you should also develop SLAs for change management, recovery, and so on.
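
If you do have SLAs in place, an SLA-based metric becomes easy to express. The following is a minimal sketch, assuming a hypothetical monitoring table (stmt_metrics) that records the elapsed time of each captured SQL statement; the table, its columns, and the two-second target are illustrative assumptions, and the date arithmetic syntax will vary by DBMS:

  -- Hypothetical: percentage of captured SQL statements meeting a 2-second SLA
  SELECT 100.0 * SUM(CASE WHEN elapsed_seconds <= 2.0 THEN 1 ELSE 0 END)
               / COUNT(*)  AS pct_within_sla,
         COUNT(*)          AS total_statements
    FROM stmt_metrics
   WHERE run_timestamp >= CURRENT TIMESTAMP - 30 DAYS;  -- date arithmetic varies by DBMS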

Measuring the number of incident reports is another oft-cited potential metric. Well, this is fine if it is limited to only true problems that might have been caused by the DBA. But not all database problems are legitimately under the control of the DBA. Should the DBA be held accountable for bugs in the DBMS (caused by the DBMS vendor), or for design elements forced on him or her by an overzealous development team (which happens all the time these days with agile development and e-rushing around)?

I like the idea of using an availability metric, but it should be tempered against your specific environment and your organization’s up-time requirements. In other words, what is the availability required? Once again, back to SLAs. And the DBA should not be judged harshly for not achieving availability if the DBMS does not deliver the possibility of availability (e.g. online reorg and change management) or the organization does not purchase reasonable availability solutions from a third party vendor. Many times the DBA is hired well after the DBMS has been selected. Should the DBA be held accountable for deficiencies in the DBMS itself if he or she had no input at all into the DBMS purchase decision?

And what about those DBA tools that can turn downtime into up-time and ease administrative tasks? Well, most DBAs want every such tool they can get their hands on. But if the organization has no (or little) budget, then the tools will not be bought. And should the DBA be held responsible for downtime when s/he is not given the proper tools to manage the problem?

OK then, what about a metric based on response to problems? This metric would not necessarily mean that the problem was resolved, but that the DBA has responded to the “complaining” entity and is working on a resolution. Such a metric would lean toward treating database administration as a service or help desk type of function. This sounds more reasonable, at least from the perspective of the DBA, but I actually think this is much too narrow a metric for measuring DBAs.

Any fair DBA evaluation metric must be developed with an understanding of the environment in which the DBA works. This requires in-depth analysis of things like:

  • number of applications that must be supported,
  • number of databases and size of those databases,
  • number of database servers,
  • use of the databases (OLTP, OLAP, web-enabled, data mining, ad hoc, etc.),
  • number of different DBMSs (that is, Oracle, DB2, Sybase, MySQL, IMS, etc.),
  • number of OS platforms to be supported (Windows 2000, UNIX, OS/390, AS/400, etc.),
  • special consideration for ERP applications due to their non-standard DBMS usage,
  • number of users and number of concurrent users,
  • type of Service Level Agreements in effect or planned,
  • availability required (24/7 or something less),
  • the impact of database downtime on the business ($$$),
  • performance requirements (subsecond or longer – gets back to the SLA issue),
  • type of applications (mission critical vs. non-mission critical),
  • frequency of change requests.

This is probably an incomplete list, but it accurately represents the complexity and challenges faced by DBAs on a daily basis. Of course, the best way to measure DBA effectiveness is to judge the quality of all the tasks that they perform. But many aspects of such measurement will be subjective. Keep in mind that a DBA performs many tasks to ensure that the organization’s data and databases are useful, usable, available, and correct. These tasks include data modeling, logical and physical database design, database change management, performance monitoring and tuning, assuring availability, administering security and authorization, backup and recovery, ensuring data integrity, and, really, anything that interfaces with the company’s databases. Developing a consistent metric for measuring these tasks in a non-subjective way is challenging.

You’ll probably need to come up with a complex formula of all of the above — and more — to do the job correctly. Which is probably why I’ve never seen a fair, non-subjective, metric-based measurement program put together for DBAs. If any of my readers have a measurement program that they think works well, I’d like to hear the details of the program — and how it has been accepted by the DBA group and management.


Auditing Mainframe Database Data Access and Modification

Tracking who did what to which piece of data, and when they did it, is important because there are many threats to data security. External agents trying to compromise security and access company data are rightly viewed as a threat to security. But industry studies have shown that many security threats are internal – within an organization.

Some organizations are extra vigilant in guarding against intruders from outside the organization but forget about the security threat that comes from a disgruntled or malevolent current or ex-employee who has valid access to the mainframe. Auditing is the best way to find unauthorized access emanating from an authorized user.

Tactics for Compliance

So how can financial institutions ensure they are in compliance with industry and governmental regulations… and protect their data against threats both internal and external? Data access auditing, sometimes simply called database auditing, can track the use of database resources and authority.

Each audited database operation produces an audit trail of information. The audit trail will show which database objects were impacted, what the operation was, who performed the operation, and when it occurred. This comprehensive audit trail can be maintained over time to support in-depth analysis of access and modification patterns against data on the mainframe.

But as with any technology, there are multiple considerations to understand and deliberate upon before implementation. The first step is to make a list of the regulations you must comply with, based on the types of data your institution holds. After you have created a compliance roadmap, determine what level of data access auditing is required, with input from an internal auditing group. A good database access auditing solution should answer at least the following questions (a sketch of the kind of query such a solution should be able to answer follows the list):

  1. Who accessed the data?
  2. At what date and time was the access?
  3. What program or client software was used to access the data?
  4. From what location was the request issued?
  5. What command was issued to access the data?
  6. Was the request successful, and if so, how many rows of data were retrieved?
  7. If the request was a modification, what data was changed? (A before and after image of the change should be accessible)
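
As a rough illustration, here is the kind of query an auditor should be able to run against the audit trail. The audit_trail table and its columns are hypothetical (real products use their own schemas and reporting interfaces), but each column maps directly to one of the questions above:

  -- Hypothetical audit-trail query; table and column names are assumptions
  SELECT audit_ts,         -- when the access occurred (question 2)
         user_id,          -- who accessed the data (question 1)
         program_name,     -- program or client software used (question 3)
         client_location,  -- where the request was issued from (question 4)
         sql_text,         -- the command issued (question 5)
         rows_affected,    -- how many rows were retrieved or changed (question 6)
         before_image,     -- prior data value, for modifications (question 7)
         after_image       -- new data value, for modifications (question 7)
    FROM audit_trail
   WHERE table_name = 'PATIENT_DATA'
     AND audit_ts >= CURRENT TIMESTAMP - 7 DAYS  -- last week; syntax varies by DBMS
   ORDER BY audit_ts;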

When choosing a solution, consider one that delivers pre-canned compliance reports. For example, if you are looking to comply with PCI DSS, a database auditing solution that delivers out-of-the-box PCI reports will shorten your implementation timeline.

Furthermore, the method by which the audit trail is produced is a significant consideration. This is especially important for mainframe data, as we will soon see.

There are basically four options for auditing database access and modification:

  1. deploying native DBMS traces
  2. scanning database transaction logs
  3. sniffing network traffic
  4. tapping requests at the source.

Of these methods, only the last provides 100% visibility into database activities without becoming a performance drain. Native traces can be cost-prohibitive in terms of performance and often do not collect all of the information needed to show compliance. Scanning transaction logs can capture database changes, but not read access (which some regulations, such as HIPAA, require you to track). Sniffing network traffic works well when all database requests must go across the network (as is common for Linux, Unix, and Windows workloads), but what about when the request stays on one box (such as a CICS transaction that accesses DB2)? The only sure method of gathering all required auditing information, especially for mainframe auditing, is for data access requests (both read and modification) to be tapped at the DBMS level.

There are auditing solutions on the market that deploy all or a combination of these methods, so be sure to investigate how auditing is accomplished by any product you are considering before making a purchase decision. For each product, examine the methodology being used to ensure that it is comprehensive, performs well, and is usable.

Synopsis

Ensuring compliance with tedious government and industry regulations is a daunting task. This, along with the growing need to protect databases from increasing online and internal threats to sensitive data, has resulted in business executives being asked to be more personally responsible for the safety of corporate data. Data access auditing solutions can help your organization meet its growing requirements safely and proactively.


Are DBAs Obsolete?

Before we go any further, let me briefly answer the question posed in this column’s title: “No Way!” OK … with that out of the way, let’s discuss the issue …

Every so often, some industry pundit gets his opinions published by declaring that “Database administrators are obsolete” or that “we no longer need DBAs.” Every time I hear this, it makes me shake my head sadly as I regard just how gullible IT publications can be.

At one time, these types of proclamations were based on the increasing autonomics within database systems. And yes, database management system (DBMS) software is becoming easier to use; some of the things that used to require manual effort are now automated. But that does not eliminate the need for DBAs!

Today, the DBA is supposed to be dying off because of NoSQL database systems and appliances like Exadata and IBM DB2 Analytics Accelerator. But that is absurd, too. Neither of these solutions will encompass all of the data management needs of an organization, and even these appliances still need a DBA to perform some tasks.

It is simply not possible to entirely automate all of the duties performed by DBAs. For example, how can a DBMS make changes to its database structures (such as adding new columns to support a new business requirement)? How would it know what to add, and what data types to choose? Just as importantly, how would it know where to add the new columns?

Database design, both logical and physical, will always be needed (along with knowledge of the business) to create and maintain properly running database systems – and that is the domain of the DBA.

What about backup and recovery? Would you trust a DBMS that had corrupted data to recover itself? If it was that darn smart, why did it become corrupted in the first place? Yes, the job of the DBA will morph and change – as it has already during its 40 or so years of existence. But that doesn’t mean it will become obsolete … just different.

Database administration needs to be practiced in a more rigorous manner. Too often the DBA is viewed as a fireman. This is not meant to disparage firemen, of course, but the fireman is called after the fire has started. Database administration practiced as a management discipline should be focused on prevention first, cure second. This requires a set of best practices for database implementation and management to be designed and followed.

There are several layers of misunderstanding and poor DBA practices running rampant across the industry that need to be addressed. The most pervasive is probably faced by DBAs working for organizations predominantly focused on the internet. As companies move from traditional development to web development, there is the inevitable change in mindset that creates “mad rushes.” This “get it done NOW!” mentality is bad news for DBAs because it is detrimental to good database design practices. Poorly planned or unplanned changes get made to the database in order to speed development. These mistakes persist for years; they can cause poor performance and data integrity problems, and they make future changes more difficult.

Another problem is the “we don’t need no stinking DBAs” attitude. Sometimes smaller organizations attempt to implement database applications without DBAs. Well, not really – usually one of the application developers acts as a pseudo-DBA and performs just the basics to get the application delivered. Meaning that database design, performance, availability, and maintenance will suffer. Whenever you hear of an enterprise application being built and managed without a DBA, I can guarantee that the system will be riddled with problems.

Perhaps the greatest issue facing DBAs these days is the whole “proactive versus reactive” argument. Many so-called mature organizations approach database administration only as a “reactive” chore. This gets back to my fireman metaphor. Oh yes, everyone says they are “proactive” but that is usually a great big lie. Many DBAs are up to their ears in paperwork, design tasks, and performance tuning – with a line of folks out the door of their cubicles looking for help. Now, how many of these DBAs do you think are being proactive and looking for more “potential” problems so they can fix them before they occur? None! They are all trying to put out the fires that are on their desks.

The bottom line is this: As portions of the DBA job are automated, the DBA can focus on those things that have been ignored or poorly addressed. They can better implement practices and procedures for the things that cannot become autonomic: backup/recovery and change management, to name two.  And, they can begin to tackle bigger problems … such as business issues, compliance and governance, data quality issues, metadata management, and so on.

Long live the DBA!


The Log Is The Database!

The database log, sometimes referred to as the transaction log, is a fundamental component of a database management system. All changes to application data in the database are recorded serially in the database log. Using this information, the DBMS can track which transaction made which changes to the database. Furthermore, rollback and recovery operations utilize the database log to reset the database to a particular point in time.

The manner in which the database log is created varies depending on the DBMS being used. Some DBMSs specify the log at the database system level; others define a database log for each database that is created within the database system. Depending on the DBMS, the database log may consist of several files. For example, Oracle uses a transaction (redo) log and rollback segments.

Some DBMSs provide a parameter to enable and disable logging. In general, though, you should avoid disabling database logging for any database or database system where the data being managed is valuable. In other words, you might consider turning off database logging only for test databases (and be careful even then).

During normal database application processing, SQL inserts, updates, and deletes will be issued to modify data in the database. As these database modifications are made, the transaction log for the database will grow. And because each database change is logged and data is constantly changing, the log will grow continuously, so the DBA will need to actively monitor the size of the transaction log files.

Really, the database log is the most important component of the DBMS. Another way to think about it is that the log is the database. There is actually more information in the log because it contains a record of every change made to the database. The database itself can be thought of as just an optimized access path to the most recent version of the log.

How the Log Works

A database transaction log is a write-ahead log. This means that changes are made to the transaction log before they are actually made to the data in the database tables. When the database modification has been fully recorded on the log, recovery of the transaction is guaranteed.

Typically a system checkpoint is taken by the DBMS to guarantee that all log records and all modified database pages are written safely to disk. The frequency of database system checkpoints can be set up by the DBA using database configuration parameters – usually checkpoint frequency is set either as a predetermined time interval or as a preset number of log records written.
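
As one concrete illustration (using PostgreSQL simply because its checkpoint parameters are easy to show; the equivalent knobs in other DBMSs have different names), checkpoint frequency can be adjusted along exactly these lines, with the second trigger expressed as a volume of log written rather than a record count:

  -- PostgreSQL example: a checkpoint is triggered by elapsed time or by the
  -- amount of write-ahead log generated, whichever limit is reached first.
  ALTER SYSTEM SET checkpoint_timeout = '10min';  -- time-based trigger
  ALTER SYSTEM SET max_wal_size = '2GB';          -- log-volume-based trigger
  SELECT pg_reload_conf();                        -- apply without a restart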

Generally, the following type of information is recorded on the database log:

  • the beginning and ending time of each transaction
  • the actual changes made to the data and enough information to undo the modifications made during each transaction (accomplished using before and after images of the data)
  • the allocation and deallocation of database pages
  • the actual commit or rollback of each transaction

Using this data the DBMS can accomplish data integrity operations to ensure consistent data is maintained in the database. The transaction log is used when the DBMS is restarted, when transactions are rolled back, and to restore a database to a prior state. Let’s examine these scenarios.

When the DBMS is restarted, each database goes through a recovery process. During restart processing, the DBMS will check to determine which transactions must be rolled forward. This occurs for transactions where it is unknown if all the modifications were actually written from the cache to disk. A checkpoint forces all modified pages to disk. Therefore, it represents the point at which the startup recovery must start to roll transactions forward. Because all pages modified before the checkpoint are guaranteed to be written accurately to disk, there is no need to roll forward anything done before the checkpoint.

When a transaction is rolled back the DBMS copies before images to the database for every modification made since the transaction began.

During a recovery scenario, the DBA can use the transaction log to restore a database. First, a backup copy of the database must be restored, and then subsequent transaction log backups can be restored. This causes a roll forward of the transaction log. During a roll forward, the DBMS will copy the after image of each modification to the database. Using the logged data, the DBMS ensures that each modification is applied in the same order in which it originally occurred.
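
To make the before image / after image distinction concrete, here is a minimal sketch of a single transaction and the kind of log records a DBMS writes for it. The account table, the values, and the record descriptions in the comments are illustrative assumptions; actual log record formats are internal to each DBMS, and transaction-control syntax varies by product:

  -- Illustrative only: the comments describe the log records typically written
  BEGIN TRANSACTION;            -- log: BEGIN record for transaction T1

  UPDATE account
     SET balance = balance - 100
   WHERE account_id = 12345;    -- log: T1 changed the row for account 12345
                                --      before image: balance = 500
                                --      after image:  balance = 400

  COMMIT;                       -- log: COMMIT record for T1

  -- If the application issued ROLLBACK instead of COMMIT, the DBMS would
  -- re-apply the before image (balance = 500). If the system crashes after
  -- the COMMIT record reaches the log but before the changed page reaches
  -- disk, restart recovery rolls the change forward using the after image.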

You can see why the transaction log is a useful item to have around in the event of database errors and transaction errors, and for ensuring data integrity. And effective setup and management of your database logs is a critically important aspect of database administration.


The Change Management Perspective of the DBA

The DBA is the custodian of database changes. Usually, the DBA is not the one to request a change; that is typically done by the application owner or business user. But there are times, too, when the DBA will request changes, for example, to address performance issues or to utilize new features or technologies. At any rate, regardless of who requests the change, the DBA is charged with carrying out the database changes.

To effectively make those changes, the DBA needs to ensure that the database change management discipline incorporates proactivity, intelligence, analyses (planning and impact), automation, standardization, reliability, predictability, and quick and efficient delivery. Without a robust, time-tested product that is designed to effect database changes, the DBA faces a very difficult job. Why?

Well, today’s major RDBMS products do not always support fast and efficient database structure changes. Each RDBMS provides differing levels of support for making changes to its databases. But none easily supports every type of change that might be required. One quick example: most RDBMSs today do not enable a column to be added to the middle of an existing table. To do so, the table must be dropped and recreated with the new column in the middle. But what about the data? When the table is dropped the data is deleted, unless the DBA was wise enough to first unload the data. But what about the indexes on the table? Well, they were dropped when the table was dropped, so unless the DBA knows this and recreates the indexes too, performance will suffer. The same is true for database security – when the table was dropped all security for the table was also dropped. (A sketch of this drop-and-recreate dance follows the list below.) And this is but one simple example. Other types of database change that are required from time to time include:

  • changing the name of a database, table, view, or column
  • modifying a stored procedure, trigger, or user-defined function
  • changing or adding relationships using referential integrity features
  • changing or adding database partitioning
  • moving a table from one database, dbspace, or tablespace to another
  • rearranging the order of columns in a table
  • changing a column’s data type or length
  • adding or removing columns from a table
  • changing the primary key without dropping and adding the primary key
  • adding or removing columns from a view
  • changing the SELECT statement on which a view is based
  • changing the columns used for indexing
  • changing the uniqueness specification of an index
  • clustering the table data by a different index
  • changing the order of an index (ascending or descending)
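
And here is the promised sketch of what adding a column to the middle of a table can entail when the RDBMS does not support it directly. All table, column, and index names are made up, and the unload/reload step is shown with plain SQL for simplicity (a real shop would typically use unload and load utilities):

  -- Hypothetical: add MIDDLE_INITIAL between FIRST_NAME and LAST_NAME

  -- 1) Preserve the existing data (syntax varies by DBMS).
  CREATE TABLE emp_save AS (SELECT * FROM emp) WITH DATA;

  -- 2) Drop the table; its indexes and authorizations are silently dropped too.
  DROP TABLE emp;

  -- 3) Recreate the table with the new column in the desired position.
  CREATE TABLE emp
        (empno          INTEGER     NOT NULL,
         first_name     VARCHAR(40) NOT NULL,
         middle_initial CHAR(1),
         last_name      VARCHAR(40) NOT NULL,
         PRIMARY KEY (empno));

  -- 4) Reload the preserved data; the new column defaults to NULL.
  INSERT INTO emp (empno, first_name, last_name)
         SELECT empno, first_name, last_name
           FROM emp_save;

  -- 5) Recreate every index that was dropped along with the table...
  CREATE INDEX emp_lastname_ix ON emp (last_name);

  -- 6) ...and re-grant every authorization that was dropped as well.
  GRANT SELECT ON emp TO payroll_app;

  -- 7) Clean up the temporary copy.
  DROP TABLE emp_save;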

And that list is incomplete. Adding to the dilemma is the fact that most organizations have at least two, and sometimes more, copies of each database. At the very least, a test and a production version will exist. But there may be multiple testing environments – for example, to support simultaneous development, quality assurance, unit testing, and integration testing. And each database change will need to be made to each of these copies, as well as, eventually, the production copy. So, you can see how database change can quickly monopolize a DBA’s time.

The solution is to use an automated product to implement database change. The product will enable the DBA to focus on all of the issues of change management because it is built to understand not just the discipline of change management, but also the RDBMS in which the changes are to be made. This built-in intelligence shifts the burden of ensuring that a change to a database object does not cause unwanted implicit changes from the DBA to the tool. And once the change has been identified and implemented for one system, it can easily be deployed on other database copies with minimal, or perhaps no, changes.

Another required feature of a database change management product is support for analysis and planning. When the impact of changes can be examined prior to implementing any change, it becomes easier to ensure safe and efficient database changes. This type of tool also uses automation to minimize the resources required to implement database change. Instead of writing a new, complex change script from scratch for each database change, the DBA can rely on the change management tool to accomplish this. And application and database availability will be enhanced because the product will implement the change in the least intrusive, quickest manner possible.

All in all, a database change management product will improve availability, minimize errors, and speed up your time to market. And I think we all can relate to that!


Treating DBA as a Management Discipline

An oft-repeated story about database administration underscores both the necessity for database administration and the lack of understanding of a DBA’s function:

The CIO of Acme Corporation hires a consulting company to streamline its IT operations. The consultant, determined to understand the way Acme works, begins by interviewing the CIO. One of his first questions is: “So, I see that you have a DBA on staff. What does he do?”

The CIO replies, “Well, I’m told that we need the DBA to make sure our production databases stay online. I know that some of our critical business processes like order entry and inventory use Oracle, but I really don’t know what the DBA does. Now please don’t tell me I need another one, because we can barely afford to pay the one we have!”

This is a sad, but too often true, commentary on the state of database administration in many organizations. DBMS software is complex and few people understand more than just the basics (like SQL). DBAs understand the complexities of the DBMS, making them a valuable resource — and sometimes the DBA is the only reliable source of database management and development knowledge within the organization.

The DBA, often respected as a database guru, is just as frequently criticized as a curmudgeon with technical knowledge but limited people skills. Most programmers have their favorite DBA story. You know, those anecdotes that begin with “I had a problem…” and end with “…and then he told me to stop bothering him and read the manual.”

DBAs simply do not have a warm and fuzzy image. This may have something to do with the nature and scope of the job. The DBMS spans the enterprise, effectively placing the DBA on call for the applications of the entire organization. Database issues can require periods of quiet reflection and analysis to resolve, so DBAs generally do not want to be disturbed. But their quiet time is usually less than quiet; constant interruptions to answer questions and solve problems are a daily fact of life.

DBAs need to acquire exceptional communication skills. The DBA is at the center of the development life cycle — ensuring that application programs have efficient, accurate access to the corporation’s data. So DBAs frequently interface with many different types of people: technicians, programmers, end users, managers, and executives. However, many DBAs are so caught up in the minutiae of the inner workings of the DBMS that they never develop the skills required to relate appropriately to their coworkers and customers.

So what is a DBA? The short answer is simple: A DBA is the information technician responsible for ensuring the ongoing operational functionality and efficiency of an organization’s databases and the applications that access those databases. The long answer requires a book. But let’s start by endorsing the need to tackle database administration as a management discipline: not art, not magic, but a craft that ensures the ongoing operational functionality and efficiency of an organization’s databases and applications.

Database administration is rarely approached as a management discipline. The term discipline implies a plan, and implementation according to that plan. When database administration is treated as a management discipline, the treatment of data within your organization will improve. It is the difference between being reactive and proactive.

All too frequently, the DBA group is overwhelmed by requests and problems. This ensues for many reasons, such as understaffing, over-commitment to supporting new (and even legacy) application development projects, lack of repeatable processes, or lack of budget. The reactive DBA functions more like a firefighter than an administrator; he attempts to resolve problems only after problems occur. The reactive DBA is focused on resolving the biggest problem confronting him.

In contrast, the proactive DBA implements practices and procedures to avoid problems before they occur. A proactive database administrator develops and implements a strategic blueprint for deploying databases within the organization that spans every phase of the application development life cycle.

The DBA is responsible for managing the overall database environment. A good DBA is integral to the entire life cycle of database support, implementation, and maintenance and will treat the job in a formal and disciplined manner… anything less should be unacceptable.
