How Much Data Availability is Enough?

I bet that some of you reading the title of this blog post scoffed at it. I mean, in this day and age, isn’t round-the-clock availability for every application and user just a basic requirement?

No, it shouldn’t be. Let’s discuss.

Availability is traditionally discussed in terms of the percentage of total time that a service needs to be up. For example, a system with 99% availability will be up and running 99% of the time and down, or unavailable, 1% of the time.

Another term used to define availability is MTBF, or mean time between failures. More accurately, MTBF is a descriptor of reliability rather than availability. However, reliability has a definite impact on availability: in general, the more reliable the system, the more available it will be.

In this Internet age, the push is on to provide never-ending uptime: 24 hours a day, 365 days a year. At 60 minutes an hour, that means 525,600 minutes of uptime a year. Achieving 100% availability is clearly a laudable goal, but just as clearly an unreasonable one. Why? Because things break, human error is inevitable, and until everybody and everything is perfect, there will be downtime.

The term five nines is often used to describe highly available systems. Meaning 99.999% uptime, five nines describes what is essentially 100% availability, but with the understanding that some downtime is unavoidable (see the accompanying table).

Table 1. Availability versus Downtime

Availability     Approximate downtime per year
                 Minutes           Hours
99.999%          5                 0.08
99.99%           53                0.88
99.95%           262               4.37
99.9%            526               8.77
99.8%            1,052             17.5
99.5%            2,628             43.8
99%              5,256             87.6
98%              10,512            175.2 (7.3 days)
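
The downtime figures in Table 1 fall straight out of the arithmetic above. As a quick sanity check, here is a small query that derives them. Consider it a sketch only: the VALUES-in-the-FROM-clause construct and the t(availability) alias may need minor syntax adjustments depending on your DBMS.

-- Derive approximate downtime per year from an availability percentage.
-- 525,600 minutes per year works out to 5,256 minutes per percentage point
-- of downtime; 8,760 hours per year works out to 87.6 hours per point.
SELECT availability,
       (100 - availability) * 5256 AS downtime_minutes_per_year,
       (100 - availability) * 87.6 AS downtime_hours_per_year
FROM   (VALUES (99.999), (99.99), (99.9), (99.0), (98.0)) AS t(availability);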

Even though 100% availability is not a reasonable goal, some systems are achieving availability approaching five nines. DBAs can take measures to design databases and build systems that achieve high availability. However, just because high availability can be built into a system does not mean that every system should be built with a high-availability design. That is because a highly available system can cost many times more than a traditional system that tolerates some downtime. The DBA needs to negotiate with the end users and clearly explain the costs associated with a highly available system.

Whenever high availability is a goal for a new system, database, or application, careful analysis is required to determine how much downtime users can really tolerate, and what the impact of an outage would be. High availability is an alluring requirement, and end users will typically request as much as they think they can get. As a DBA, your job is to investigate the reality of the requirement.

The amount of availability that should be built into the database environment must be based on service level agreements and cost. How much availability does the business require? And just as importantly, how much availability can the business afford to implement?

That is the ultimate question. Although it may be possible to achieve high availability, it may not be cost-effective, given the nature of the application and the budget available to support it. The DBA needs to work proactively with the application owner to make sure the cost aspect of availability is fully understood.

Posted in availability, DBA, SLA | Leave a comment

Database Performance Management Solutions

Performance management, from a database perspective, comprises three basic components:

  1. Monitoring a database system and applications accessing it to find problems as they arise. This is typically referred to as performance monitoring.
  2. Analyzing performance data (logs, trace records, etc.) from the system to determine the root cause of the problem.
  3. Assembling and implementing a corrective action to fix the problem.

Database performance software can aid in all three areas. But some simply monitor systems or fix problems, while others deliver combined functionality.
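
As a concrete example of the monitoring side, most DBMSs expose performance statistics that can be queried directly. The sketch below uses SQL Server’s sys.dm_exec_query_stats view to surface the most CPU-hungry statements; other DBMSs offer comparable views (Oracle’s V$SQL, for instance) with different names and columns, so treat this as a pattern rather than a universal recipe.

-- Find the top 10 statements by average CPU time (SQL Server example).
SELECT TOP 10
       qs.total_worker_time / qs.execution_count AS avg_cpu_microseconds,
       qs.execution_count,
       st.text AS statement_text
FROM   sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY avg_cpu_microseconds DESC;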

Database performance management software can also be broken down by the type of database performance issues it addresses. Database performance problems can arise in any of the following three areas:

  • The DBMS itself, which must interact with other system software and hardware, requiring proper configuration to ensure it functions accurately and performs satisfactorily. Additionally, there are many database system parameters used to configure the resources to be used by the DBMS, as well as its behavior. This includes important performance criteria such as memory capacity, I/O throughput and locking of data pages.
  • Database schema/structures. The design of databases, tables and indexes can also impact database performance. Issues include the physical design of the database, disk usage, number of tables, index design and data definition language parameters. How the data is organized must also be managed. And as data is modified in the database, its efficiency will degrade. Reorganization and defragmentation are required to periodically remedy disorganized data.
  • SQL and application code. Coding efficient SQL statements can be complicated because there are many different ways to write SQL that return the same results. But the efficiency and performance of each formulation can vary significantly. DBAs need tools that can monitor the SQL code that’s being run, show the access paths it uses and provide guidance on how to improve the code.
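
To make that last point concrete, most DBMSs can externalize the access path the optimizer chooses. The exact syntax varies by product; what follows is a Db2-flavored sketch (Oracle uses EXPLAIN PLAN FOR, while MySQL and PostgreSQL simply prefix the statement with EXPLAIN), and it assumes a PLAN_TABLE exists to hold the output.

-- Externalize the access path for a statement, then review the plan rows.
EXPLAIN PLAN SET QUERYNO = 100 FOR
    SELECT LASTNAME
    FROM   EMP
    WHERE  EMPNO = '000010';

SELECT QUERYNO, QBLOCKNO, PLANNO, TNAME, ACCESSTYPE, ACCESSNAME, MATCHCOLS
FROM   PLAN_TABLE
WHERE  QUERYNO = 100
ORDER BY QBLOCKNO, PLANNO;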

Database performance tools can identify bottlenecks and points of contention, monitor workload and throughput, review SQL performance and optimization, monitor storage space and fragmentation, and view and manage your system and DBMS resource usage. Of course, a single tool is unlikely to perform all of these tasks, so you may need multiple tools (perhaps integrated into a functional suite) to perform all of your required database performance management tasks.

Without proactive tools that can identify problems as they occur, database performance problems are most commonly brought to the attention of the DBA by end users. The phone rings and the DBA hears a complaint that is usually vague and a bit difficult to interpret… things like “my system is a bit sluggish” or “my screen isn’t working as fast as it used to.” In such cases, the DBA needs tools that can help uncover the exact problem and identify a solution. Database performance management tools can help to find the problem as well as to put together and deploy a solution.

A lot of organizations use more than one production DBMS. Frequently, the same DBA team (and sometimes even the same DBA) will have to assure the performance of more than one DBMS (such as Oracle and SQL Server… or Db2 and MySQL). But each DBMS has different interfaces, parameters and settings that affect how it performs. Database performance tools can mitigate this complexity with intelligent interfaces that mask the underlying differences, making disparate components and settings look and feel similar from DBMS to DBMS.

There are many providers of database performance management tools, including the DBMS vendors (IBM, Microsoft and Oracle), large ISVs like BMC, CA and Quest, and a wide array of niche vendors that focus on DBA and database performance software.

What database performance tools do you use and recommend? Share your experiences with us in a comment here on the blog.

Posted in DBA, performance, tools | Leave a comment

SQL Basics

It is hard to imagine a time when SQL was unknown and not the lingua franca it is today for accessing databases. That said, there are still folks out there who don’t know what SQL is… so for them, here is an introductory place to start…

SQL is an acronym for Structured Query Language. It is often pronounced “sequel,” but it is also spelled out letter by letter, as ess-cue-ell. SQL is a powerful tool for accessing and manipulating data. It is the de facto standard query language for relational database management systems, used by all of the leading RDBMS products including Oracle, SQL Server (natch), Db2, MySQL, Postgres, SAP Adaptive Server, and more.

Interestingly enough, NoSQL database systems are increasingly being adapted to allow SQL access, too! So SQL is ubiquitous, and it makes sense for anybody with an interest in data management to learn how to code SQL.

SQL is a high-level language that provides a greater degree of abstraction than do procedural languages. Most programming languages require that the programmer navigate data structures. This means that program logic needs to be coded to proceed record-by-record through data elements in an order determined by the application programmer or systems analyst. This information is encoded in programs and is difficult to change after it has been programmed.

SQL, on the other hand, is fashioned so that the programmer can specify what data is needed, instead of how to retrieve it. SQL is coded without embedded data-navigational instructions. The DBMS analyzes the SQL and formulates data-navigational instructions “behind the scenes.” These data-navigational instructions are called access paths.

By having the DBMS determine the optimal access path to the data, a heavy burden is removed from the programmer. In addition, the database can have a better understanding of the state of the data it stores, and thereby can produce a more efficient and dynamic access path to the data. The result is that SQL, used properly, can provide for quicker application development.

Another feature of SQL is that it is not merely a query language. The same language used to query data is used also to define data structures, control access to the data, and insert, modify, and delete occurrences of the data. This consolidation of functions into a single language eases communication between different types of users. DBAs, systems programmers, application programmers, systems analysts, and end users all speak a common language: SQL. When all the participants in a project are speaking the same language, a synergy is created that can reduce overall system-development time.

Arguably, though, the single most important feature of SQL that has solidified its success is its capability to retrieve data easily using English-like syntax. It is much easier to understand the following than it is to understand pages and pages of program source code.

SELECT LASTNAME
FROM   EMP
WHERE  EMPNO = '000010';

Think about it; when accessing data from a file the programmer would have to code instructions to open the file, start a loop, read a record, check to see if the EMPNO field equals the proper value, check for end of file, go back to the beginning of the loop, and so on.

SQL is, by nature, quite flexible. It uses a free-form structure that gives the user the ability to develop SQL statements in a way best suited to the given user. Each SQL request is parsed by the DBMS before execution to check for proper syntax and to optimize the request. Therefore, SQL statements do not need to start in any given column and can be strung together on one line or broken apart on several lines. For example, the following SQL statement is equivalent to the previously listed SQL statement:

SELECT LASTNAME FROM EMP WHERE EMPNO = '000010';

Another flexible feature of SQL is that a single request can be formulated in a number of different and functionally equivalent ways. One example of this SQL capability is that it can join tables or nest queries. A nested query always can be converted to an equivalent join. Other examples of this flexibility can be seen in the vast array of functions and predicates. Examples of features with equivalent functionality are:

  • BETWEEN versus <= / >=
  • IN versus a series of predicates tied together with OR
  • INNER JOIN versus tables strung together in the FROM clause separated by commas
  • OUTER JOIN versus a simple SELECT, with a UNION, and a correlated subselect
  • CASE expressions versus complex UNION ALL statements
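
For example, here are two pairs of functionally equivalent formulations, using the same sample EMP table as before; each pair returns identical results, though the optimizer may treat them differently.

-- BETWEEN versus a pair of range predicates
SELECT LASTNAME FROM EMP
WHERE  EMPNO BETWEEN '000010' AND '000030';

SELECT LASTNAME FROM EMP
WHERE  EMPNO >= '000010' AND EMPNO <= '000030';

-- IN versus a series of predicates tied together with OR
SELECT LASTNAME FROM EMP
WHERE  EMPNO IN ('000010', '000020', '000030');

SELECT LASTNAME FROM EMP
WHERE  EMPNO = '000010' OR EMPNO = '000020' OR EMPNO = '000030';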

This flexibility exhibited by SQL is not always desirable, as different but equivalent SQL formulations can result in widely differing performance. The ramifications of this flexibility, along with guidelines for developing efficient SQL, are a topic for a future post.

As mentioned, SQL specifies what data to retrieve or manipulate, but does not specify how you accomplish these tasks. This keeps SQL intrinsically simple. If you can remember the set-at-a-time orientation of a relational database, you will begin to grasp the essence and nature of SQL. A single SQL statement can act upon multiple rows. The capability to act on a set of data coupled with the lack of need for establishing how to retrieve and manipulate data defines SQL as a non-procedural language.

Because SQL is a non-procedural language, a single statement can take the place of a series of procedures. Again, this is possible because SQL uses set-level processing and the DBMS optimizes the query to determine the data-navigation logic. Sometimes one or two SQL statements can accomplish tasks that would otherwise require an entire procedural program.
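
As a simple illustration of set-level processing, a single statement such as the following replaces an entire read-and-update loop. (The EMP, SALARY, and WORKDEPT names follow the familiar sample-table conventions; substitute your own schema.)

-- Give every employee in department D11 a 4 percent raise with one statement;
-- no loop and no cursor required.
UPDATE EMP
SET    SALARY = SALARY * 1.04
WHERE  WORKDEPT = 'D11';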

Summary

Of course, this brief introduction does not constitute an education in SQL and it will not make you a SQL programmer. For that, you will need education and practice. A good place to start is with a SQL book or two. I can recommend these:

After reading through some good books, practice writing some SQL and keep learning… move on to more advanced texts and, if you can, attend a class on SQL. Because learning SQL makes sense in this day and age of analytics!

Posted in DBA, SQL | Leave a comment

DBA Corner

Just a quick blog post today to remind my readers that I write a regular, monthly column for Database Trends & Applications magazine called DBA Corner.

The DBA Corner is geared toward news, issues, and technologies that will be of interest to database administrators. Sometimes the material is in-depth and technical (well, as much as 700 or so words allows) and sometimes it is more philosophical or newsy.

If you are not a DBA, do not worry, as the column regularly expands to focus on issues of interest to data architects, data analysts and even programmers and developers. Issues addressed recently in my column include data modeling, database design, database standards, SQL coding, DBA practices and procedures, performance, application development, optimization techniques, data governance, regulatory compliance with regard to data, industry trends, and more.

So I hope you will check back each month to read the DBA Corner column at the DBTA web site… and if you have any ideas or topics that you’d like me to address, add them as a comment to this blog post.

Posted in DBA, NoSQL, performance, SQL | Leave a comment

Managing MongoDB Databases with Navicat

I’ve written about Navicat tools for managing data in this blog before (performance monitoring, heterogeneous database administration), so I thought I’d take a look at their most recent offering, which provides many useful DBA features for MongoDB.

MongoDB is a NoSQL, open-source, cross-platform document-oriented database management system. MongoDB uses JSON-like documents with schemas. Use cases for which MongoDB excels include web commerce applications, content management, blogs, real-time analytics, and social networking. It is not particularly well-suited for systems with high transaction rates.

But I don’t really want to discuss MongoDB in-depth here. As a proponent of treating database administration as a management discipline, though, I will note that the world of NoSQL database systems lacks the in-depth management and administration tooling enjoyed by relational database systems. That has to change, and Navicat has obviously recognized this fact with its new Navicat for MongoDB offering.

Navicat for MongoDB delivers a GUI interface for MongoDB database management, administration and development (see Figure 1). You can use it to connect to local and remote MongoDB servers, and it is compatible with MongoDB Atlas.

Navicat for MongoDB

Figure 1. Navicat for MongoDB GUI – main screen

Navicat for MongoDB offers many useful features for DBAs to manage, monitor, query, and visualize MongoDB data. It supports adding, modifying, and deleting documents using built-in editors, including a tree view, a JSON view, and the classic spreadsheet-like grid view.

One of the bigger headaches of adopting a new database technology like MongoDB can be moving data around. Navicat for MongoDB makes this easier with an Import Wizard that can be used to transfer data into and out of your MongoDB databases. It supports multiple, diverse formats like Excel, Access, CSV and more. You also can ingest data from ODBC after setting up a data source connection. It provides strong data transfer and synchronization capabilities to enable the migration of your data. You can transfer data across databases, compare the data in your databases, and synchronize the data.

Querying data in MongoDB is a snap with the Navicat Visual Query Builder. It can be used to create, edit and run queries without having to worry about syntax and proper usage of commands. Additional features, like Code Completion and customizable Code Snippets, simplify your coding efforts by providing suggestions for keywords and eliminating repetitious coding.

For DBAs, Navicat for MongoDB provides an Intelligent Object Designer. It enables you to create, modify and manage all database objects using built-in professional advice and guidance. You can preview results on each step and debug the sampled data before running your jobs. And the Smart Schema Analyzer can be used to help you visually discover and explore your schema. With it, you can analyze your documents and display the structures within your collections, thereby making it easy to understand your data’s schema, find schema anomalies and inspect outliers.

Navicat for MongoDB even provides an intuitive Backup Utility that can be used to automate your backup process and reduce the potential for errors. You can set up a repeatable deployment process for job and script execution at a specific time or day.

Security is built into the product, too. It uses SSH tunneling and SSL to ensure every connection is secure, stable, and reliable. Different database server authentication methods, such as Kerberos and X.509, are supported.

Finally, you can use Navicat for MongoDB in the cloud to enable collaboration with your co-workers. Share your connection settings, queries and virtual groups with your coworkers anytime and anywhere.

All in all, Navicat for MongoDB goes a long way toward making your MongoDB environment as manageable as your relational environments.

You can download and try Navicat for MongoDB here: https://www.navicat.com/en/download/navicat-for-mongodb.

Posted in DBA | Leave a comment

My Computer Mug Collection, Part 4

Recently I tweeted a mug a day from my collection of coffee mugs given to me over the years by computer and software companies. Since I finished tweeting all of my mugs, I have been posting them here on my blog, as well. So far I have published three previous posts showcasing my computer mug collection. Today, as promised, here are the remaining mugs I have yet to blog…

First up is a mug I forgot to include in Part 2 (mugs from computer conferences). This one is from SHARE, and I received it as a Best Session Award for my DB2 for z/OS Performance Roadmap presentation. It is clear glass, so it is a bit difficult to see:

IMG_1075

Next up is a series of mugs from German software vendor Software Engineering. I think I was lucky enough to collect all of the mugs in the series:

IMG_0865

And here is a mug from ComputerWorld with one of their IT cartoons on it. I sure hope the ESC key worked!

IMG_0853

And this mug is back from the days when Oracle actually developed and marketed a version of their DBMS for MVS! Sure, you can run Oracle on a mainframe today, but it has to be in a Linux partition.

IMG_0858

Here are several mugs from IBM in my collection. The first one says “Tame your Data Monster with IBM” – and that is a good overall summation of what I’ve done my entire career!  Then there is the newest of these mugs, the IBM Champion mug. I use this one every day as a pen and pencil holder! And the last one is a joke, of sorts. Documentation and memos that are not meant to be shared are often marked “Internal Use Only,” as is this mug, probably referring to the coffee it will hold.

Next we have a mug from Memorex. Some of you might question whether it is actually a “computer” mug, but it is! This is from back in the day when Memorex was a big manufacturer of floppy disks.

IMG_0874

Here is a nice little mug from Peppers & Rogers that I think I got when I took a product management class from them back in the mid-1990s:

IMG_0871

And finally, here is a mug from Software Marketing Journal. I only subscribed to this magazine for a short time in the late 1990s when I was VP of marketing for Platinum’s database solutions… so I’m pretty sure that it is from that timeframe:

IMG_0870

And that concludes my cavalcade of computer mugs… I think. There may be another mug or two hiding around here somewhere… if I discover any more I’ll be sure to share them with you.

So what next? I have an extensive button/pin collection from various computer companies and conferences. Anybody want me to start sharing those? Or have you had enough?

Posted in DBA | Leave a comment

My Computer Mug Collection, Part 3

So far I have published two posts showing the computer-related mugs in my collection.  Over several weeks I first tweeted a mug a day from my collection, and now I am blogging them for those that missed any of the tweets.

In part 1 of my computer mug collection I highlighted the mugs from companies where I worked; and in part 2 I showed the mugs I received from conferences and user groups. Today’s post I call the dearly-departed — mugs from companies that have been acquired or gone out of business.

First up, we have Goal Systems. This is not the current company named Goal Systems (transport industry software), but the company that sold the performance monitor Insight for Db2 before CA acquired them.

IMG_0714

And here are two more companies that were bought by CA: Pansophic and I just had to include a Platinum mug again.

Then we have BGS Systems, the maker of mainframe capacity planning and performance software that I believe was acquired first by Boole and Babbage, and then by BMC Software.

IMG_0771

And here is a nice mug from Easel Corporation, which was a popular software development firm for GUI design. It was acquired by VMARK in the mid-1990s.

IMG_0847

Here we have a mug from R&O, the original makers of the Rochade repository. They have since been acquired by Allen Systems Group.

IMG_0855

Then we have this mug, from Cogito, the makers of EZ-Db2… they have since been acquired by Syncsort.

IMG_0860

And then there is XA Systems, which was acquired by Compuware back in the early 1990s.

IMG_0859

And finally, here is a mug from Sablesoft. They made the Compile/QMF product (which at some point was renamed to Compile/QQF). Sablesoft was acquired by Platinum technology, inc., and then CA.

IMG_0856

That concludes today’s post… but there are still a few more mugs I have yet to blog. Stay tuned for the final post in this series coming next week!

 

Posted in data, DB2, mugs | Leave a comment

My Computer Mug Collection, Part 2

Last week I posted part 1 of my computer mug collection and promised to continue posting the rest of the collection in later blog posts.

Well, here is the second in this series of posts. Today we will look at the mugs I received from user groups and industry conferences. And we’ll start with IDUG!

As you can see, I have several IDUG mugs, the newest of which is from this year. That would be the one with the orange lettering celebrating IDUG’s 30th Anniversary.

Next we have a mug from the old IBM DB2 Tech Conference (yes, I know Db2 is now spelled with a lower-case “b” but it never was when the Tech Conference was held, so I used an upper case “B” there)! This one is from the 1992 event held in Miami Beach. I remember attending this conference quite well… better than I should given that it was over 25 years ago!

IMG_0862

Then we have this nice mug from the Reno CMG conference in 2006. CMG always puts on a great annual conference for those of us who toil away in the realm of performance and capacity management.

IMG_0789

And how many of you remember DCI (Digital Consulting, Inc.) and their series of conferences? They hosted quite a few, as this mug attests. I remember speaking at Database World in Chicago (but I can’t remember the year).

IMG_0863

Finally, for today anyway, here is a mug I was given for speaking at the Heart of America Db2 User Group many years ago. Interestingly enough, I’ll be speaking there again this September (2018), so if you are in the Kansas City area, stop in and say “Hi!”

IMG_0877

More mugs to come in the next installment…

Posted in IDUG, mugs | 1 Comment

My Computer Mug Collection, Part 1

Recently I have been tweeting a mug a day from my collection of coffee mugs given to me over the years by computer and software companies. I just tweeted the last one (I think), and since these posts seemed to be popular, I thought I’d post a blog entry with all of the mugs.

Places I Have Worked

First up are all of the mugs from the companies I have worked for, starting with my first consulting company in Pittsburgh, ASSET, Inc. A couple of friends and I started it and kept it going for a few years (hi Bernard and Cindy):

IMG_0879

Next up is Platinum Technology, inc. I worked there for a number of years on two separate runs. It is probably the most fun I ever had working for a corporation. Here are the numerous mugs I have from Platinum:

After Platinum I worked at BMC Software for a while. Here are the mugs I accumulated while at BMC:

I also worked at NEON Enterprise Software for a spell:

IMG_0823

And finally, I have my own consulting company, so I made myself a mug!

That is all of the mugs from the software companies I have worked for. Over the course of the next few posts I will show the remaining mugs: those from other software and hardware companies, computer conferences, and other miscellaneous sources.

Hope you enjoy!

Posted in DB2, mugs, tools | 2 Comments

Don’t Forget the Free Space

When you design your databases you must take care to specify appropriate free space to allow for data growth.

Free space, sometimes called fill factor, is a DDL option that can be specified when creating or altering your table spaces and indexes. It is used to leave a portion of the database object empty and available to store newly added data. The specification of free space in a table space or index can reduce the frequency of reorganization, minimize contention, and improve the efficiency of inserting new data.

Each DBMS provides a method of specifying free space for a database object in the CREATE and ALTER statements. A typical parameter is PCTFREE, where the DBA specifies the percentage of each data page that should remain available for future inserts. Another common parameter is FREEPAGE, where the DBA indicates the number of pages after which a completely empty page should be left available.
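
For example, here is a Db2-flavored sketch of how these parameters might be coded. The object names are hypothetical, and the exact syntax varies by DBMS (SQL Server and PostgreSQL express the same idea with FILLFACTOR):

-- Leave 10% of each page free, plus one empty page after every 15 pages.
CREATE TABLESPACE CUSTTS IN CUSTDB
    PCTFREE  10
    FREEPAGE 15;

-- Free space can be specified for indexes, too.
CREATE INDEX XCUST1
    ON CUST (CUST_NAME)
    PCTFREE  10
    FREEPAGE 15;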

Ensuring a proper amount of free space for each database object provides the following benefits:

  • Inserts are faster when free space is available.
  • As new rows are inserted, they can be properly clustered.
  • Variable-length rows and altered rows have room to expand, potentially reducing the number of relocated rows.
  • Having fewer rows on a page results in better concurrency because less data is unavailable to other users when a page is locked.

However, free space also has several disadvantages:

  • Disk storage requirements are greater.
  • Scans take longer.
  • Fewer rows on a page can necessitate more I/O operations to access the requested information.
  • Because the number of rows per page decreases, the efficiency of data caching can decrease because fewer rows are retrieved per I/O.

The DBA should monitor free space and ensure that the appropriate amount is defined for each database object. The correct amount of free space must be based on:

  • The expected frequency of inserts and modifications
  • Amount of sequential versus random access
  • Impact of accessing unclustered data
  • Type of processing
  • Likelihood of row chaining, row migration, and page splits

Be careful, though, not to fall prey to a common database design mistake, namely specifying the same default amount of free space for every object. I’ve seen many database implementations where 20% (or 10%) free space has been specified for every object. This is surely not a good idea. Why?

Well, not every table grows at the same rate. A transaction history table will likely grow rapidly, but perhaps all new transactions are added to the end. Free space is not needed for such a table because you rarely, if ever, go back and edit old transactions. A customer table, hopefully, grows over time. Perhaps the data is clustered by customer name, in which case some free space will be helpful.

Finally, consider static tables. For example, a code table or maybe a table that contains the 50 state codes with the state name. This type of table rarely changes, so do not define any free space for it because it does not need room in which to expand.

Summary

Armed with appropriate knowledge of your applications and data you can define reasonable free space into your database objects to allow for data growth with minimal impact on performance. Be sure to understand how your DBMS enables free space and define it accordingly for your production databases.

Posted in database design, DBA, free space, performance | Leave a comment