Common Database Design Errors

Before we begin today’s blog post, wherein I explain some of the more common mistakes that rookies and non-database folks make (heck, even some database folks make mistakes), I first want to unequivocally state that your organization should have a data architecture team that is responsible for logical and conceptual modeling… and your DBA team should work in tandem with the data architects to ensure well-designed databases.

OK, so what if that isn’t your experience? Frankly, it is common for novices to be designing databases these days, so you aren’t alone. But that doesn’t really make things all that much better, does it?

The best advice I can give you is to be aware of design failures that can result in a hostile database. A hostile database is difficult to understand, hard to query, and takes an enormous amount of effort to change.

So with all of that in mind, let’s just dig in and look at some advice on things not to do when you are designing your databases.

Assigning inappropriate table and column names is a common design error made by novices. The names of the database objects used to store data should be as descriptive as possible so that, at least to some extent, the tables and columns document themselves. Application programmers are notorious for creating database naming problems, such as using screen variable names for columns or coded jumbles of letters and numbers for table names. Use descriptive names!
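
As a simple illustration (the table and column names here are hypothetical, not taken from any real system), compare a coded jumble of names with descriptive ones:

    -- Cryptic, coded names force readers to consult external documentation
    CREATE TABLE T00427
        (SCRN_FLD1  INTEGER,
         X_AMT2     DECIMAL(9,2),
         CD_X       CHAR(1));

    -- Descriptive names allow the table and its columns to document themselves
    CREATE TABLE CUSTOMER_PAYMENT
        (CUSTOMER_ID     INTEGER      NOT NULL,
         PAYMENT_AMOUNT  DECIMAL(9,2) NOT NULL,
         PAYMENT_METHOD  CHAR(1)      NOT NULL);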

When pressed for time, some DBAs resort to designing the database with output in mind. This can lead to flaws such as storing numbers in character columns because leading zeroes need to be displayed on reports. This is usually a bad idea with a relational database. It is better to let the database system perform the edit-checking to ensure that only numbers are stored in the column.

If the column is created as a character column, then the developer will need to program edit-checks to validate that only numeric data is stored in the column. It is better in terms of integrity and efficiency to store the data based on its domain. Users and programmers can format the data for display instead of forcing the data into display mode for storage in the database.
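
For example, using a hypothetical INVOICE table, define the column with a numeric domain so the DBMS rejects non-numeric data, and add the leading zeroes only when the data is displayed. The formatting function varies by DBMS; LPAD, shown here, is available in Db2, Oracle, MySQL and PostgreSQL, and some products require an explicit CAST of the number to a character string first.

    -- Store the invoice number as a number; the DBMS enforces the domain
    CREATE TABLE INVOICE
        (INVOICE_NO   INTEGER       NOT NULL PRIMARY KEY,
         INVOICE_AMT  DECIMAL(11,2) NOT NULL);

    -- Add the leading zeroes at display time, not at storage time
    SELECT LPAD(INVOICE_NO, 8, '0') AS INVOICE_NO_DISPLAY,
           INVOICE_AMT
      FROM INVOICE;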

Another common database design problem is overstuffing columns. This actually is a normalization issue. Sometimes a single column is used for convenience to store what should be two or three columns. Such design flaws are introduced when the DBA does not analyze the data for patterns and relationships. An example of overstuffing would be storing a person’s name in a single column instead of capturing first name, middle initial, and last name as individual columns.
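
For instance, with a hypothetical EMPLOYEE table, capture each component of the name as its own column rather than overstuffing a single column:

    -- Overstuffed design (shown as a comment for contrast only):
    --   EMP_NAME  VARCHAR(100)   -- holds 'Smith, John Q.' in one column

    -- Better: each component of the name is an individual column
    CREATE TABLE EMPLOYEE
        (EMP_ID       INTEGER     NOT NULL PRIMARY KEY,
         FIRST_NAME   VARCHAR(40) NOT NULL,
         MIDDLE_INIT  CHAR(1),
         LAST_NAME    VARCHAR(40) NOT NULL);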

Poorly designed keys can wreck the usability of a database. A primary key should be nonvolatile because changing the value of the primary key can be very expensive. When you change a primary key value, you have to ripple through the foreign keys to cascade the change into any child tables.

A common design flaw is using the Social Security number as the primary key of a personnel or customer table. This is a flaw for several reasons, two of which are: 1) a Social Security number is not necessarily unique, and 2) if your business expands outside the USA, those customers will not have a Social Security number, so then what do you store as the primary key?
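
One common alternative, sketched here with a hypothetical CUSTOMER table, is to use a stable, system-generated surrogate key as the primary key and to store the Social Security number, where one exists, as an ordinary attribute. The syntax for generating surrogate values (identity columns, sequences, AUTO_INCREMENT) varies by DBMS.

    -- Note: no UNIQUE constraint on SSN because, as noted above, it is not guaranteed unique
    CREATE TABLE CUSTOMER
        (CUSTOMER_ID  INTEGER     NOT NULL PRIMARY KEY,  -- nonvolatile surrogate key
         SSN          CHAR(11),                          -- optional; not every customer has one
         FIRST_NAME   VARCHAR(40) NOT NULL,
         LAST_NAME    VARCHAR(40) NOT NULL);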

Actually, failing to account for international issues can have greater repercussions. For example, when storing addresses, how do you define zip code? The zip code is a USA concept, but many countries have similar postal codes, though they are not necessarily numeric. And state is a USA concept, too.

Of course, some other countries have states or similar concepts (Canadian provinces). So just how do you create all of the address columns to ensure that you capture all of the information for every person to be stored in the table, regardless of country? The answer, of course, is to conduct proper data modeling and database design.
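
As a hedged sketch only (the column names and lengths are illustrative, not a prescription), generalized address columns might look something like this:

    CREATE TABLE CUSTOMER_ADDRESS
        (CUSTOMER_ID     INTEGER      NOT NULL,
         STREET_ADDRESS  VARCHAR(100) NOT NULL,
         CITY            VARCHAR(60)  NOT NULL,
         STATE_PROVINCE  VARCHAR(60),             -- state, province, prefecture, and so on
         POSTAL_CODE     VARCHAR(12),             -- ZIP code, postcode, etc.; not always numeric
         COUNTRY_CODE    CHAR(2)      NOT NULL);  -- for example, an ISO 3166 country code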

Denormalization of the physical database is a design option but it can only be done if the design was first normalized. How do you denormalize something that was not first normalized? Actually, a more fundamental problem with database design is improper normalization. By focusing on normalization, data modeling and database design, you can avoid creating a hostile database.
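
As a quick, hypothetical illustration of the kind of flaw normalization removes, a repeating group of columns such as PHONE_1, PHONE_2 and PHONE_3 belongs in its own table:

    -- Unnormalized (for contrast): PHONE_1, PHONE_2, PHONE_3 columns limit each
    -- customer to three numbers and waste space when fewer exist.

    -- Normalized: one row per phone number
    CREATE TABLE CUSTOMER_PHONE
        (CUSTOMER_ID   INTEGER     NOT NULL,
         PHONE_TYPE    VARCHAR(10) NOT NULL,   -- for example 'HOME', 'MOBILE', 'WORK'
         PHONE_NUMBER  VARCHAR(20) NOT NULL,
         PRIMARY KEY (CUSTOMER_ID, PHONE_TYPE));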

Without proper upfront analysis and design, the database is unlikely to be flexible enough to easily support the changing requirements of the user. With sufficient preparation, flexibility can be designed into the database to support the user’s anticipated changes. Of course, if you don’t take the time during the design phase to ask the users about their anticipated future needs, you cannot create the database with those needs in mind.

Summary

Of course, these are just a few of the more common database design mistakes. Can you name more? If so, please discuss your thoughts and experiences in the comments section.

Posted in data, data modeling, database design, DBA | Leave a comment

Happy New Year 2019

Just a quick post today to wish everybody out there a very Happy New Year!

I hope you have started 2019 off with a bang and that the year is successful and enjoyable for one and all!

Posted in Happy New Year | Leave a comment

FaunaDB: A multi-model, distributed database system with ACID consistency

Although relational, SQL database systems continue to dominate the DBMS market, modern database management has shifted to encompass additional types of database systems. This is exemplified by the rise of NoSQL database systems to serve the needs of modern applications that are not well served by existing relational, SQL database systems.

What used to be rather simple – choosing from three or four market-leading SQL DBMS products – has become a confusing and difficult exercise in understanding the morass of different DBMS types and offerings on the market.

A Multi-Model Approach

Well, one solution to avoid the confusion is to select a multi-model DBMS offering. A multi-model database system supports multiple types of database models, such as relational, document, graph, wide column, and key/value. FaunaDB is an example of a multi-model DBMS capable of managing both relational and NoSQL data, and designed to support modern, scalable, real-time applications.

FaunaDB combines the scale and flexibility of NoSQL with the safety and data integrity of relational systems. The company refers to this as Relational NoSQL. Unlike many NoSQL database systems, FaunaDB delivers ACID compliance. You can scale transactions across multiple shards and regions while FaunaDB guarantees the accuracy and integrity of your data and transactions.

FaunaDB enables your developers to write sophisticated transactions using languages they already know. And you can pull data from document, relational, graph, and temporal data sets all from within a single query.

Since FaunaDB is NoSQL, you won’t be using SQL to access databases. The Fauna Query Language (FQL) is the primary interface for interacting with a FaunaDB cluster. FQL is not a general-purpose programming language, but it provides for complex manipulation and retrieval of data stored within FaunaDB. The language is expression-oriented: all functions, control structures, and literals return values. This makes it easy to group multiple results together by combining them into an Array or Object, or map over a collection and compute a result – possibly fetching more data – for each member.

A query is executed by submitting it to a FaunaDB cluster, which computes and returns the result. Query execution is transactional, meaning that no changes are committed when something goes wrong. If a query fails, an error is returned instead of a result.

FQL supports a comprehensive set of data types in four categories: simple types, special types, collection types and complex types. A simple data type is one that is native to FaunaDB and also native to JSON, such as Boolean, Null, Number and String. Special data types in FaunaDB extend the limited number of native JSON data types: Bytes, Date, Query, Ref, Set and Timestamp. A complex data type is a composite of other existing data types, such as an Object or Instance. And the collection data types handle multiple items while maintaining order, such as Array and Page.

Consistency

Perhaps the most impressive aspect of FaunaDB is how it enables strict serializability for external transactions. By supporting serializable isolation, FaunaDB can process many transactions in parallel, but the final result is the same as processing them one after another. The FaunaDB distributed transaction protocol processes transactions in three phases:

  • In the first, speculative phase, reads are performed as of a recent snapshot, and writes are buffered.
  • The second phase uses a consensus protocol to insert the transaction into a distributed log. At this point, the transaction gets a global transaction identifier that indicates its equivalent serial order relative to all other concurrent transactions. This is the only point at which global consensus is required.
  • Finally, in the third phase, each replica verifies the speculative work. If there are no potential serializability violations, the work is made permanent and the buffered writes are applied to the database. Otherwise, the transaction is aborted and restarted.

This software approach is novel and allows for the scaling of transactions across multiple shards and regions while guaranteeing transactional correctness and data accuracy. Contrast this with other database systems, such as Google Spanner, that rely on distributed clock synchronization to ensure data consistency.

The FaunaDB approach is based on a 2012 Yale University paper titled “Calvin: Fast Distributed Transactions for Partitioned Database Systems.” You can download that paper here. And if you are interested in additional details, consult this blog post: Consistency without Clocks: The FaunaDB Distributed Transaction Protocol.

Multi-Tenancy

Many database systems provide multi-tenant capabilities. They can contain multiple databases, each with their own access controls. FaunaDB takes this further by allowing any database to have multiple child databases. This enables an operator to manage a single large FaunaDB cluster, create a few top-level databases, and give full administrative access of those databases to associated teams. Each team is free to create as many databases as they need without requiring operator intervention. As far as the team is concerned, they have their own full FaunaDB cluster.

Temporality

Strong temporal support is an additional capability of FaunaDB. Traditionally, a database system stores only data that is valid at the current point-in-time; it does not track the past state of the data. Most data changes over time, and different users and applications can have requirements to access that data at different points in time. Temporal support makes it possible to query data “as of” different past states.

All records in FaunaDB are temporal. When instances are changed, instead of overwriting the prior contents, a new instance version at the current transaction timestamp is inserted into the instance history, and marked as a create, update, or delete event. This means that with FaunaDB, all reads can be executed consistently at any point in the past or transformed into a change feed of events between any two points in time. This is useful for many different types of applications, such as auditing, rollback, cache coherency, and others.

Strong Security

Data protection and security have become more important as data breaches continue to dominate the news. Regulation and data governance practices dictate that organizations implement strong protective measures on sensitive data.

FaunaDB implements security at the API level. Access to the FaunaDB API uses access keys, which authenticate connections as having particular permissions. This access key system applies to administrator- and server-level connections, as well as to object- and user-level connections.

In other words, reading or writing instances of user-defined classes in FaunaDB requires a server key, or an instance token with appropriate permissions.

Delivery Models

FaunaDB can run anywhere you need it to run: on-premises, in your cloud, the public cloud, even multiple clouds. Basically, FaunaDB can run anywhere you can run a JVM.

The FaunaDB Serverless Cloud enables developers to implement and elastically scale cloud applications with no capacity planning or provisioning. FaunaDB Cloud provides essential features that enable developers to safely build and run serverless applications without configuring or operating infrastructure.

The serverless approach uses an event-driven architecture where developers code functions and deploy them to the infrastructure. The functions only consume resources when they are invoked, at which point they run within the architecture. A serverless architecture is conducive to modern development practices because it can eliminate many of the difficulties developers face reconciling their database infrastructure with today’s development methods.

Summing Things Up

Prior to founding Fauna in 2012, the FaunaDB team was part of the group that developed the infrastructure at Twitter. And FaunaDB is already being used at many leading enterprises. Check out these write-ups about FaunaDB usage at NVIDIA, ShiftX, and VoiceConnect. Others are available at Fauna’s web site.

So, if you are looking for a multi-model, secure NoSQL database platform with strong consistency, horizontal scalability, multi-tenancy and temporal capabilities, that can run on-premises and in the cloud, consider taking a look at FaunaDB.

Posted in cloud, data availability, DBMS, Isolation Level, NoSQL, relational, temporal | Leave a comment

SQL Performance and Optimization

Just a quick post today to refer my readers to a series of blog posts that I recently made to the IDERA database community blog.

This four-part blog series took a look into SQL performance and optimization from a generic perspective.  By that I mean that I did not focus on any particular DBMS, but on the general things that are required of and performed during the optimization of SQL.

Part one – Relational Optimization – introduces and explains the general concept of relational optimization and what it entails;

Part two – Query Analysis and Access Path Formulation – examines the process of analyzing SQL queries and introduces the types of access that can be performed on a single table;

Part three – Multiple Table Access Methods – takes a look at optimization methods for combining data from more than one table;

And finally, part four – Additional Considerations – concludes the series with an overview of several additional aspects of SQL optimization that we have yet to discuss.

If you are looking for a nice overview of SQL and relational optimization without DBMS-specific details, give these posts a read!

 

Posted in optimization, performance, SQL | 4 Comments

My Data Quotes – 2018

I am frequently approached by journalists and bloggers for my thoughts on the data-related news of the day… and I am usually happy to discuss data with anybody! Some of these discussions wind up getting quoted in news articles and posts. I like to try to keep track of these quotes.

With that in mind, I thought I’d share the articles where I have been quoted (so far) in 2018:

I may be missing some, so if you remember chatting with me last year and you don’t see your piece listed above please ping me to let me know…

And if you are interested in some of the older pieces where I’ve been quoted I keep a log of them on my web site at mullinsconsulting.com/quoted.html.  (Note, some of the older articles/posts are no longer available, so some of the links are inoperable.)

Posted in DBA | Leave a comment

Teradata Analytics Universe 2018 and Pervasive Data Intelligence

I spent last week in Las Vegas at the Teradata Analytics Universe conference, Teradata’s annual user conference. And there was a lot to do and learn there.

 

Attendees heading to the Expo Hall at the Teradata Analytics Universe conference in Las Vegas, NV — October 2018

 

The major message from Teradata is that the company is a “new Teradata.” And its rallying cry is “Stop buying analytics,” which may sound like a strange thing to say at a conference with analytics in its name!

But it makes sense if you listen to the entire strategy. Teradata is responding to the reality of the analytics marketplace. And that reality centers around three findings from a survey the company conducted of senior leaders from around the world:

  1. Analytics technology is too complex. 74 percent of senior leaders said their organization’s analytics technology is complex; 42 percent said that analytics is not easy for their employees to use and understand.
  2. Users don’t have access to all the data they need. 79 percent said they need access to more company data to do their job effectively.
  3. Data scientists are a bottleneck. Only 25 percent said that, within their enterprise, business decision makers have the skills to access and use intelligence from analytics without the need for data scientists.

 


 

To respond to these challenges, Teradata says you should buy “answers” not “analytics.” And they are correct. Organizations are not looking for more complex, time-consuming, difficult-to-use tools, but answers to their most pressing questions.

Teradata calls its new approach “pervasive data intelligence,” which delivers access to all data, all the time, to find answers to the toughest challenges. This can be done on-premises, in the cloud, and anywhere in between.

A big part of this new approach is founded on Teradata Vantage, which provides businesses the speed, scale and flexibility they need to analyze anything, deploy anywhere and deliver analytics that matter. At the center of Vantage is Teradata’s respected analytics database management system, but it also brings together analytic functions and engines within a single environment. And it integrates with all the popular open source workbenches, platforms, and languages, including SQL, R, Python, Jupyter, RStudio, SAS, and more.

“Uncovering valuable intelligence at scale has always been what we do, but now we’re taking our unique offering to new heights, unifying our positioning while making our software and consulting expertise available as-a-service, in the cloud, or on-premises,” said Victor Lund, Teradata CEO.

Moving from analytical silos to an analytics platform that can deliver pervasive data intelligence sounds to me like a reasonable way to tackle the complexity, confusion, and bottlenecks common today.

Check out what Teradata has to offer at teradata.com

Posted in analytics, data, Teradata, tools | Leave a comment

Data Modeling with Navicat Data Modeler

Data modeling is the process of analyzing the things of interest to your organization and how these things are related to each other. The data modeling process results in the discovery and documentation of the data resources of your business. Data modeling asks the question “What?” instead of the more common data-processing question “How?”

As data professionals, it is important that we understand what the data is and what it means before we attempt to build databases and applications using the data. Even with today’s modern infrastructure that includes databases with flexible schemas that are applied when read (instead of the more traditional method of applying the schema on write), you still need a schema and an understanding of the data in order to do anything useful with it. And that means a model of the data.

The modeling process requires three phases and types of models: conceptual, logical and physical. A conceptual data model is generally more abstract and less detailed than a complete logical data model. It depicts a high-level, business-oriented view of information. The logical data model consists of fully normalized entities with all attributes defined. Furthermore, the domain or data type of each attribute must be defined. A logical data model provides an in-depth description of the data independent of any physical database manifestations. The physical data model transforms the logical model into a physical implementation using a specific DBMS product such as Oracle, MySQL or SQL Server.
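
To make the distinction concrete, here is a small, hypothetical sketch (not taken from the Navicat model discussed below) of a logical entity translated into a physical implementation for MySQL:

    -- Logical model (informally stated): a STUDENT entity with attributes
    --   StudentID (identifier), LastName, FirstName, EnrollmentDate,
    --   each with a defined domain.

    -- Physical model generated for a specific DBMS (MySQL in this sketch)
    CREATE TABLE student
        (student_id       INT         NOT NULL AUTO_INCREMENT,
         last_name        VARCHAR(40) NOT NULL,
         first_name       VARCHAR(40) NOT NULL,
         enrollment_date  DATE        NOT NULL,
         PRIMARY KEY (student_id))
    ENGINE = InnoDB;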

Navicat Data Modeler

Which brings us to the primary focus of today’s blog post: Navicat Data Modeler. We have looked at other Navicat products in this blog before (1, 2, 3), but those were performance and DBA tools. Navicat Data Modeler is designed to be used by data architects and modelers (but it can, of course, be used by DBAs, too).

A good data modeling tool provides the user with an easy-to-use palette for creating data models, and Navicat Data Modeler succeeds in this area. Navicat Data Modeler provides a rich interface for visually designing and building conceptual, logical and physical data models. Figure 1 shows a portion of a larger logical data model for a university application.

Figure 1. A Logical Data Model in Navicat Data Modeler

 

The interface enables the user to clearly see the relationships, entities, and attributes at a high level, and to zoom in to see the details (see Figure 2).

Figure 2. Attribute details for the Student entity

 

The tool offers a lot of flexibility, so you can create, modify, and design models in a user-friendly manner, working the way you like. Navicat Data Modeler supports three standard notations: Crow’s Foot, IDEF1x and UML.

Although easy to use, Navicat Data Modeler is a powerful data modeling and database design tool. As already mentioned, it supports conceptual, logical, and physical modeling. Importantly, though, the tool manages migration of models using reverse and forward engineering processes.  Using the Model Conversion feature, you can convert a conceptual model into a logical model, modify and further design at the logical level, and then convert into a physical database implementation. Navicat Data Modeler supports MySQL, MariaDB, Oracle, Microsoft SQL Server, PostgreSQL, and SQLite. See Figure 3.

Figure 3. Forward engineering to a target database

 

OK, so that covers forward engineering, but what about reverse engineering? You can use Navicat Data Modeler to reverse engineer a physical database structure into a physical model, thereby enabling you to visualize the database to see the physical attributes (tables, columns, indexes, RI, and other objects) and how they relate to each other without showing any actual data.

Furthermore, you can import models from ODBC data sources, print models to files, and compare and synchronize databases and models. The Synchronize to Database function can be used to discover all database differences. You can view the differences and generate a synchronization script to update the destination database to make it identical to your model. And there are settings that can be used to customize how comparison and synchronization works between environments.

It is also worth noting that Navicat Data Modeler is fully integrated with Navicat Cloud. This makes sharing models much easier. You can sync your model files and virtual groups to the cloud for real-time access anytime, anywhere.

Summary

A proper database design cannot be thrown together quickly by novices. Data professionals require domain and design knowledge and powerful tools to implement their vision. Navicat Data Modeler offers one such tool that is worthy of your consideration.

Posted in data modeling, database design, DBA | Leave a comment

10 Rules for Succeeding as a DBA

Being a successful database administrator requires more than just technical acumen and deep knowledge of database systems. You also must possess a proper attitude, sufficient fortitude, and a diligent personality to achieve success in database administration. Gaining the technical know-how is important, yes, but there are many sources that offer technical guidance for DBAs. The non-technical aspects of the DBA job are just as challenging, though. So with that in mind, let’s take a look at ten “rules of thumb” for DBAs to follow as they improve their soft skills.

Rule #1: Write Down Everything – DBAs encounter many challenging tasks and time-consuming problems. The wise DBA always documents the processes used to resolve problems and overcome challenges. Such documentation can be very valuable (both to you and others) should you encounter a similar problem in the future. It is better to read your notes than to try to re-create a scenario from memory.

Rule #2: Keep Everything – Database administration is the perfect job for you if you are a pack rat. It is a good practice to keep everything you come across during the course of performing your job. If you don’t, it always seems like you’ll need that stuff the day after you threw it out! I still own some manuals for DB2 Version 2.

Rule #3: Automate – Why should you do it by hand if you can automate your DBA processes? Anything you can do probably can be done better by the computer – if it is programmed to do it properly. And once it is automated you save yourself valuable time that is better spent tackling other problems.

Rule #4: Share Your Knowledge – The more you learn the more you should try to share what you know with others. There are many vehicles for sharing your knowledge: local user groups, online forums, web portals, magazines, blogs, Twitter, and so on. Sharing your experiences helps to encourage others to share theirs, so we can all benefit from each other’s best practices.

Rule #5: Focus Your Efforts – The DBA job is complex and spans many diverse technological and functional areas. It is easy for a DBA to get overwhelmed with certain tasks – especially those tasks that are not performed regularly. Understand the purpose for each task you are going to perform and focus on performing the steps that will help you to achieve that purpose. Do not be persuaded to broaden the scope of work for individual tasks unless it cannot be avoided. Analyze, simplify, and focus. Only then will tasks become measurable and easier to achieve.

Rule #6: Don’t Panic! – Problems will occur. There is nothing you can do to eliminate every possible problem or error. Part of your job is to be able to react to problems calmly and analytically. When a database is down and applications are unavailable, your environment will become hectic and frazzled. The best thing you can do when problems occur is to remain calm and go about your job using your knowledge and training.

Rule #7: Measure Twice, Cut Once – Being prepared means analyzing, documenting, and testing your DBA policies and procedures. Creating simple procedures in a vacuum without testing will do little to help you run an efficient database environment. And it will not prepare you to react rapidly and effectively to problem situations.

Rule #8: Understand the Business – Remember that being technologically adept is just a part of being a good DBA. Technology is important but understanding your business needs is more important. If you do not understand the business reasons and impact of the databases you manage then you will simply be throwing technology around with no clear purpose.

Rule #9: Don’t Be a Hermit – Be accessible; don’t be one of those “curmudgeon in the corner” DBAs that developers are afraid to approach. The more you are valued for your expertise and availability, the more valuable you are to your company. By learning what the applications must do you can better adjust and tune the databases to support the business.

Rule #10: Use All of the Resources at Your Disposal – Remember that you do not have to do everything yourself. Use the resources at your disposal. Many times others have already encountered and solved the problem that vexes you. Use your DBMS vendor’s technical support to help with particularly thorny problems. Use internal resources for areas where you have limited experience, such as network specialists for connectivity problems and system administrators for OS and system software problems. Build a network of colleagues that you can contact for assistance. Your network can be an invaluable resource and no one at your company even needs to know that you didn’t solve the problem yourself.

Achieve DBA Success!

The job of the DBA is a challenging one – from a technological, political and interpersonal perspective. Follow the rules presented in this blog post to improve your success as a DBA.

Posted in DBA | 1 Comment

How Much Data Availability is Enough?

I bet that some of you reading the title of this blog post scoffed at it. I mean, in this day and age, isn’t round-the-clock availability for every application and user just a basic requirement?

No, it shouldn’t be. Let’s discuss.

Availability is traditionally discussed in terms of the percentage of total time that a service needs to be up. For example, a system with 99% availability will be up and running 99% of the time and down, or unavailable, 1% of the time.

Another term used to define availability is MTBF, or mean time between failure. More accurately, MTBF is a better descriptor of reliability than availability. However, reliability has a definite impact on availability. In general, the more reliable the system the more available it will be.

In this Internet age, the push is on to provide never-ending uptime, 365 days a year, 24 hours a day. At 60 minutes an hour, that means 525,600 minutes of uptime a year. Clearly, achieving 100% availability is a laudable goal, but just as clearly an unreasonable one. Why? Because things break, human error is inevitable, and until everybody and everything is perfect, there will be downtime.

The term five nines is often used to describe highly available systems. Meaning 99.999% uptime, five nines describes what is essentially 100% availability, but with the understanding that some downtime is unavoidable (see the accompanying table).

Table 1. Availability versus Downtime

Availability    Approximate downtime per year
                In minutes         In hours
99.999%         5 minutes          .08 hours
99.99%          53 minutes         .88 hours
99.95%          262 minutes        4.37 hours
99.9%           526 minutes        8.77 hours
99.8%           1,052 minutes      17.5 hours
99.5%           2,628 minutes      43.8 hours
99%             5,256 minutes      87.6 hours
98%             10,512 minutes     175.2 hours (or 7.3 days)
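
The downtime figures in Table 1 follow directly from the availability percentage: downtime per year is (1 - availability) multiplied by 525,600 minutes. As a quick sketch, the arithmetic can even be checked with a simple SQL query (the FROM clause varies by DBMS: SYSIBM.SYSDUMMY1 is the Db2 dummy table, Oracle uses DUAL, and several other products let you omit FROM entirely):

    SELECT (1 - 0.99999) * 525600 AS MINUTES_DOWN_AT_FIVE_NINES,   -- about 5 minutes
           (1 - 0.999)   * 525600 AS MINUTES_DOWN_AT_THREE_NINES,  -- about 526 minutes
           (1 - 0.98)    * 525600 AS MINUTES_DOWN_AT_98_PCT        -- about 10,512 minutes
      FROM SYSIBM.SYSDUMMY1;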

Even though 100% availability is not reasonable, some systems are achieving availability approaching five nines. DBAs can take measures to design databases and build systems that achieve high availability. However, just because high availability can be built into a system does not mean that every system should be built with a high-availability design. That is so because a highly available system can cost many times more than a traditional system designed with some acceptable downtime in mind. The DBA needs to negotiate with the end users and clearly explain the costs associated with a highly available system.

Whenever high availability is a goal for a new system, database, or application, careful analysis is required to determine how much downtime users can really tolerate, and what the impact of an outage would be. High availability is an alluring requirement, and end users will typically request as much as they think they can get. As a DBA, your job is to investigate the reality of the requirement.

The amount of availability that should be built into the database environment must be based on service level agreements and cost. How much availability does the business require? And just as importantly, how much availability can the business afford to implement?

That is the ultimate question. Although it may be possible to achieve high availability, it may not be cost-effective, given the nature of the application and the budget available to support it. The DBA needs to be proactive in working with the application owner to make sure the cost aspect of availability is fully understood.

Posted in availability, DBA, SLA | Leave a comment

Database Performance Management Solutions

Performance management, from a database perspective, comprises three basic components:

  1. Monitoring a database system and applications accessing it to find problems as they arise. This is typically referred to as performance monitoring.
  2. Analyzing performance data (logs, trace records, etc.) from the system to determine the root cause of the problem.
  3. Assembling a corrective action to implement a fix to the problem.

Database performance software can aid in all three areas. But some simply monitor systems or fix problems, while others deliver combined functionality.

Database performance management software can also be broken down by the type of database performance issues it addresses. Database performance problems can arise in any of the following three areas:

  • The DBMS itself, which must interact with other system software and hardware, requiring proper configuration to ensure it functions accurately and performs satisfactorily. Additionally, there are many database system parameters used to configure the resources to be used by the DBMS, as well as its behavior. This includes important performance criteria such as memory capacity, I/O throughput and locking of data pages.
  • Database schema/structures. The design of databases, tables and indexes can also impact database performance. Issues include the physical design of the database, disk usage, number of tables, index design and data definition language parameters. How the data is organized must also be managed. And as data is modified in the database, its efficiency will degrade. Reorganization and defragmentation are required to periodically remedy disorganized data.
  • SQL and application code. Coding efficient SQL statements can be complicated because there are many different ways to write SQL that return the same results. But the efficiency and performance of each formulation can vary significantly. DBAs need tools that can monitor the SQL code that’s being run, show the access paths it uses and provide guidance on how to improve the code (an example follows this list).
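
To illustrate that last point, consider a hypothetical ORDERS table and two formulations that return the same rows. The second expresses the date test as a range predicate, which gives the optimizer a chance to use an index on ORDER_DATE. (The YEAR function in the first query is available in Db2, MySQL and SQL Server; Oracle would use EXTRACT or TO_CHAR instead.)

    -- Formulation 1: applying a function to the column usually rules out index access
    SELECT ORDER_NO, ORDER_TOTAL
      FROM ORDERS
     WHERE YEAR(ORDER_DATE) = 2018;

    -- Formulation 2: the same rows, expressed as an indexable range predicate
    SELECT ORDER_NO, ORDER_TOTAL
      FROM ORDERS
     WHERE ORDER_DATE >= '2018-01-01'
       AND ORDER_DATE <  '2019-01-01';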

Database performance tools can identify bottlenecks and points of contention, monitor workload and throughput, review SQL performance and optimization, monitor storage space and fragmentation, and view and manage your system and DBMS resource usage. Of course, a single tool is unlikely to perform all of these tasks, so you may need multiple tools (perhaps integrated into a functional suite) to perform all of your required database performance management tasks.

Without proactive tools that can identify problems as they occur, database performance problems are most commonly brought to the attention of the DBA by end users. The phone rings and the DBA hears a complaint that is usually vague and a bit difficult to interpret… things like “my system is a bit sluggish” or “my screen isn’t working as fast as it used to.” In such cases, the DBA needs tools that can help uncover the exact problem and identify a solution. Database performance management tools can help to find the problem as well as to put together and deploy a solution.

A lot of organizations use more than one production DBMS. Frequently, the same DBA team (and sometimes even the same exact DBA) will have to assure the performance of more than one DBMS (such as Oracle and SQL Server… or Db2 and MySQL). But each DBMS has different interfaces, parameters and settings that affect how it performs. Database performance tools can mitigate this complexity with intelligent interfaces that mask the differences, making disparate components and settings look and feel similar from DBMS to DBMS.

There are many providers of database performance management tools, including the DBMS vendors (IBM, Microsoft and Oracle), large ISVs like BMC, CA and Quest, and a wide array of niche vendors that focus on DBA and database performance software.

What database performance tools do you use and recommend? Share your experiences with us in a comment here on the blog.

Posted in DBA, performance, tools | Leave a comment