Minimizing Database Outages with Replication

A primary goal of most organizations is to minimize downtime and improve the availability of their IT systems and applications. According to a recent study on achieving 2018 database goals by Unisphere Research, “Enterprises seek to base most, if not all, decision making on insights drawn from data, which means there is no margin for error if the flow of that data is ever disrupted.” This means that DBAs are charged with keeping databases up and running to meet the business need for non-stop access to data.

If the DBMS is down, data cannot be accessed. If the data is not available, applications cannot run. And if your applications cannot run, your company is not making crucial decisions or, worse yet, is making decisions without the data to back them up. The Unisphere study indicates that customer service will suffer the most when a database goes down (cited by 74% of data managers).

All of these things can translate to lost business, which means lower earnings and perhaps even a lower stock valuation for your company. These are all detrimental to the business, and therefore the DBA is called upon to do everything in his or her power to ensure that databases are kept online and operational. One respondent to the Unisphere study said, “Management already considers databases to be an integral part of the infrastructure that is always up. Any downtime reflects poorly on the DBA team and time (however much) must be put in to restore the systems until they are up again.” Does that sound familiar?

Assuring availability becomes that much more difficult as organizations expand the amount of data that is generated, captured, and managed. Although putting an exact figure on how fast data is growing can be difficult, there are surveys that show the growth of data on premises to range between 10% and 50% for most organizations. And that does not even take into account cloud data, which is growing rapidly and still must be managed.

Further complicating the issue is that fewer DBAs are being asked to manage more data. Although data continues to expand, that growth is not translating into additional DBAs being hired. According to research firm Computer Economics, data management staff as a percentage of IT staff has risen only 0.5% in four years, and IT spending per user continues to decline.

Nevertheless, DBAs are required to provide uninterrupted availability for many, probably most, of their systems. There are many tactics at the DBA’s disposal to improve data availability and performance. One of the most powerful tools for supporting availability is data replication.

When data is replicated, it is stored in more than one site or node, thereby improving the availability of the data. With full replication, where a copy of the whole database is stored at every site, rapid failover from a failing system to an operational one can be achieved to avoid outages. You can also implement partial replication, whereby some fragments of the database are replicated and others are not.
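To make the idea of full replication with failover concrete, here is a minimal in-memory sketch. It is only an illustration, not how any particular DBMS implements replication: the class names Node and ReplicatedStore, the site names, and the sample key are all hypothetical, and the sketch ignores the hard parts (a node that is down simply misses writes and is never resynchronized).

```python
class Node:
    """A toy database node that holds a full copy of the data (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.up = True


class ReplicatedStore:
    """Toy full replication: every write goes to every available node."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def write(self, key, value):
        # Apply the write to each node that is currently up.
        # Simplification: a node that is down misses the write entirely.
        for node in self.nodes:
            if node.up:
                node.data[key] = value

    def read(self, key):
        # Failover: serve the read from the first node that is still up.
        for node in self.nodes:
            if node.up:
                return node.name, node.data[key]
        raise RuntimeError("no replica available")


store = ReplicatedStore([Node("site_a"), Node("site_b")])
store.write("order:1", "shipped")

before = store.read("order:1")   # served by site_a
store.nodes[0].up = False        # simulate an outage at site_a
after = store.read("order:1")    # served by site_b, no data lost

print(before, after)
```

Because every site holds the whole database, the read after the simulated outage is answered by the surviving site. With partial replication, only the replicated fragments would survive this kind of failure.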

Replication improves the reliability of your applications because your data will be replicated across multiple machines, instead of on a single machine (aka single point of failure). With replicated data you may be able to achieve improved performance with parallel access. And by locating replicas closer to where transactions are executing, data replication may be able to decrease the need for data movement.

Nevertheless, implementing data replication is not free. Concurrency control and recovery techniques tend to be more complicated and therefore more expensive. And in some cases you may need to buy additional hardware to support it properly. That said, replication can be cost-effective because the cost of an outage on mission-critical data and applications is much more likely to dwarf the cost of data replication than the other way around.

Of course, data replication is not a panacea. You cannot stop taking backups or cease performance management efforts just because you are replicating data. For example, consider a human error that pushes the wrong data to the database. The replication system will dutifully replicate the wrong data out to all replicated nodes. So you can’t rely on replicated data to recover from such a scenario. I mention this not because it is a fault of data replication, but because some folks wrongly think that replication solves recovery issues.
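The human-error scenario can be sketched in a few lines. This is a toy illustration under stated assumptions, not a real replication or backup tool: the replicas are plain dictionaries, and the names replicate_write, backup, and the key "price:widget" are all made up for the example.

```python
import copy

# Two full replicas of the same (tiny) database, modeled as dictionaries.
replicas = [{"price:widget": 10}, {"price:widget": 10}]


def replicate_write(key, value):
    """Apply a write to every replica, right or wrong (hypothetical helper)."""
    for replica in replicas:
        replica[key] = value


# A point-in-time backup taken before the mistake.
backup = copy.deepcopy(replicas[0])

# Human error: the wrong value is written and faithfully replicated.
replicate_write("price:widget", -10)
all_wrong = all(r["price:widget"] == -10 for r in replicas)

# No replica holds the good value any more, so recovery needs the backup.
for replica in replicas:
    replica.clear()
    replica.update(copy.deepcopy(backup))

restored = all(r["price:widget"] == 10 for r in replicas)
print(all_wrong, restored)
```

After the bad write, every copy agrees on the wrong value, which is exactly why replication protects against a failed node but not against bad data.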

The Bottom Line

So the bottom line here is that organizations are adding more data and relying on that data to be available around the clock. DBAs are charged with assuring the availability of database data and cannot afford to let a hardware error cause a prolonged outage. Replication, therefore, is a useful technology for ensuring that downtime is for vacations, not databases!

The days are gone when organizations enjoyed a long batch window where databases could be offline for extended periods to perform nightly processing and maintenance tasks. Replication and intelligent failover are the modern way to minimize unplanned downtime.



I'm a strategist, researcher, and consultant with nearly three decades of experience in all facets of database systems development.