Inside the Data Reading Room – 1Q2019

It has been a while since I published a blog post in the Inside the Data Reading Room series, but that isn’t because I have stopped reading! It is just that I have not been as active in reviewing as I’d like to be. So here we go with some short reviews of data and analytics books I’ve been reading.

Let’s start with Paul Armstrong’s Disruptive Technologies: Understand Evaluate Respond.  Armstrong is a technology strategist who has worked for and with many global companies and brands (including Coca Cola, Experian, and Sony, among others). In this book he discusses strategies for businesses to work with new and emerging technologies.

Perhaps the strongest praise I can give the book is that, after reading it, you will feel that it does justice to its title. Armstrong defines what a disruptive technology is and how to embrace the change required when something is “disruptive.”

The book offers a roadmap that can be used to assess, handle, and resolve issues as you identify upcoming technology changes and respond to them appropriately. It identifies a decision-making framework based on the dimensions of Technology, Behaviour and Data (TBD).

The book is clear, concise, and easy to read. It is not encumbered with a lot of difficult jargon. Since technology is a major aspect of all businesses today (digital transformation), I think both technical and non-technical folks can benefit from the sound approach outlined in this book.

Another interesting book you should take a look at if you are working with analytics and AI is Machine Learning: A Constraint-Based Approach by Marco Gori. This is a much weightier tome that requires attention and diligence to digest. But if you are working with analytics, AI, and/or machine learning in any way, the book is worth reading.

The book offers an introductory approach for all readers with an in-depth explanation of the fundamental concepts of machine learning. Concepts such as neural networks and kernel machines are explained in a unified manner.

The unified presentation is based on regarding symbolic knowledge bases as a collection of constraints. Special attention is reserved for deep learning, which nicely fits the constraint-based approach followed in this book.

The book is not for non-mathematicians or those only peripherally interested in the subject. Over more than 500 pages the author

There is also a companion web site that provides additional material and assistance.

The last book I want to discuss today is Prashanth H. Southekal’s Data for Business Performance. There is more data at our disposal than ever before and we continue to increase the rate at which we manufacture and gather more data. Shouldn’t we be using this data to improve our businesses? Well, this book provides guidance and techniques to derive value from data in today’s business environment.

Southekal looks at deriving value for three key purposes of data: decision making, compliance, and customer service. The book is structured into three main sections:

  • Part 1 (Define) builds fundamental concepts by defining the key aspects of data as it pertains to digital transformation. This section delves into the different processes that transform data into a useful asset.
  • Part 2 (Analyze) covers the challenges that can cause organizations to fail as they attempt to deliver value from their data… and it offers solutions to these challenges that are practical and can be implemented.
  • Part 3 (Realize) provides practical strategies for transforming data into a corporate asset. This section also discusses frameworks, procedures, and guidelines that you can implement to achieve results.

The book is well-organized and suitable for any student, business person, or techie looking to make sense of how to use data to optimize their business.

If you’ve read any of these books, let me know what you think… and if you have other books that you’d like to see me review here, let me know. I’m always looking for more good books!

Posted in AI, book review, books, business planning, data, data governance, Machine Learning

Navicat Enables DBAs to Adopt Modern Platforms and Practices

Database administration is a tried and true IT discipline with well-defined best practices and procedures for ensuring effective, efficient database systems and applications. Of course, as with every discipline, best practices must constantly be honed and improved. This can take on many forms. Sometimes it means automating a heretofore manual process. Sometimes it means adapting to new and changing database system capabilities. And it can also mean changing to support new platforms and methods of implementation.

To be efficient, effective, and up-to-date on industry best practices, your DBA team should be incorporating all of these types of changes. Fortunately, there are tools that can help, such as Navicat Premium, which can be used to integrate all of these forms of change into your database environment.

What is Navicat Premium? Well, it is a data management tool that supports and automates a myriad of DBA tasks from database design through development and implementation. Additionally, it supports a wide range of different database management systems, including MariaDB, Microsoft SQL Server, MongoDB, MySQL, Oracle Database, PostgreSQL, SQLite and multiple cloud offerings (including Amazon, Oracle, Microsoft, Google, Alibaba, Tencent, MongoDB Atlas and Huawei).

Automating DBA tasks with Navicat reduces the amount of time, effort, and human error involved in implementing and maintaining efficient database systems. And for organizations that rely on multiple database platforms – which is most of them these days – Navicat helps not only with automation, but also with a consistent interface and methodology across the different database technologies you use.

Navicat can also assist DBAs as their organizations adapt to new capabilities and new platforms, such as cloud computing.

Although Navicat Premium is typically installed on your desktop, it connects not only to on-premises databases, but also cloud databases such as Amazon RDS, Amazon Aurora, and Amazon Redshift. Amazon removes the need to set up, operate, and scale a relational database, allowing you to focus on the database design and management. Together with an Amazon instance, Navicat Premium can help your DBAs to deliver a high-quality end-to-end database environment for your business applications.

Let’s face it, you probably have a complex data architecture with multiple databases on premises, as well as multiple different databases in the cloud. And almost certainly you are using more than one flavor of DBMS. Without a means to simplify your administrative tasks, things are going to fall through the cracks or, even worse, be performed improperly. Using Navicat Premium, your DBA team will have an intuitive GUI to manipulate and manage all of your database instances, on premises and in the cloud, with a comprehensive set of features for database development and maintenance.

You can navigate the tree of database structures just as you would for on-premises data, and then connect to the database in the cloud to access and manage it, as we see here for “Amazon Aurora for MySQL connection”:


Perhaps one of the more vexing issues with cloud database administration is data movement. Navicat Premium provides a Data Transfer feature that automates the movement of data across database platforms – local to local, local to cloud, or to an SQL file.

Another important consideration is the ability to collaborate with other team members, especially for organizations with remote work teams. The Navicat Cloud option provides a central space for your team to collaborate on connection settings, queries and models. Multiple co-workers can contribute to any project, creating and modifying work as needed. All changes are synced automatically, giving all team members the latest information.

For example, here we see the Navicat Cloud Navigation pane:


Another reality of modern computing is that a lot of work is done on mobile devices, such as phones and tablets. DBA work is no longer always conducted on a laptop or directly on the database server. Being able to perform database administration tasks from mobile devices enables DBAs to react quickly, wherever they are, whenever their help is needed. You can run Navicat on iOS to enable your mobile workforce to use the devices they always have with them.

When moving from the large screens common on PCs and laptops to the smaller screens common on mobile phones and tablets, you do not want the same layout because it can be difficult to navigate on the smaller devices. Users want the interface to conform to the device, and that is what you get with Navicat iOS.

Let’s look at some examples. Here we see a data grid view for a MySQL table as it would look on an iPhone and an iPad:


But you may want to design databases from your mobile device. That is possible with Navicat iOS… here we see the Object Designer interface on the iPhone and iPad:


Another common task is building SQL queries, which is also configured appropriately for the mobile experience, as shown here:


Adapting to mobile technologies is important because mobile workers are here to stay. And we need to be ready to support them with robust software designed to operate properly in a mobile, modern workforce.

The Bottom Line

We must always be adapting to new and changing requirements by adopting tools and methodologies that not only automate tasks, but also incorporate new and modern capabilities. Take a look at what Navicat can do to help you accomplish these goals.

Posted in cloud, database design, DBA, mobile, SQL

Common Database Design Errors

Before we begin today’s blog post, wherein I explain some of the more common mistakes that rookies and non-database folks make (heck, even some database folks make mistakes), I first want to unequivocally state that your organization should have a data architecture team that is responsible for logical and conceptual modeling… and your DBA team should work in tandem with the data architects to ensure well-designed databases.

OK, so what if that isn’t your experience? Frankly, it is common for novices to be designing databases these days, so you aren’t alone. But that doesn’t really make things all that much better, does it?

The best advice I can give you is to be aware of design failures that can result in a hostile database. A hostile database is difficult to understand, hard to query, and takes an enormous amount of effort to change.

So with all of that in mind, let’s just dig in and look at some advice on things not to do when you are designing your databases.

Assigning inappropriate table and column names is a common design error made by novices. The names of the database objects used to store data should be as descriptive as possible so that the tables and columns are self-documenting, at least to some extent. Application programmers are notorious for creating database naming problems, such as using screen variable names for columns or coded jumbles of letters and numbers for table names. Use descriptive names!

When pressed for time, some DBAs resort to designing the database with output in mind. This can lead to flaws such as storing numbers in character columns because leading zeroes need to be displayed on reports. This is usually a bad idea with a relational database. It is better to let the database system perform the edit-checking to ensure that only numbers are stored in the column.

If the column is created as a character column, then the developer will need to program edit-checks to validate that only numeric data is stored in the column. It is better in terms of integrity and efficiency to store the data based on its domain. Users and programmers can format the data for display instead of forcing the data into display mode for storage in the database.
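
As a minimal sketch of this advice, the example below uses Python's built-in sqlite3 module. The employee table, the dept_code column, and the five-digit report format are hypothetical names chosen for illustration. The column is stored as a number, the database rejects non-numeric input through a CHECK constraint, and the leading zeroes are added only when the value is formatted for display.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: dept_code lives in its numeric domain, and the CHECK
# constraint lets the database itself reject anything that is not an integer.
conn.execute(
    """
    CREATE TABLE employee (
        emp_id    INTEGER PRIMARY KEY,
        dept_code INTEGER NOT NULL CHECK (typeof(dept_code) = 'integer')
    )
    """
)
conn.execute("INSERT INTO employee (emp_id, dept_code) VALUES (?, ?)", (1, 7))

# A non-numeric value is refused by the database; no application edit-check needed.
try:
    conn.execute("INSERT INTO employee (emp_id, dept_code) VALUES (?, ?)", (2, "A12"))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True


def format_dept_code(code: int) -> str:
    """Add leading zeroes at display time (five digits is an assumed report format)."""
    return f"{code:05d}"


(stored,) = conn.execute("SELECT dept_code FROM employee WHERE emp_id = 1").fetchone()
report_value = format_dept_code(stored)
```

The stored value stays a plain integer that can be compared, sorted, and summed, while the report layer decides how many zeroes to show.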

Another common database design problem is overstuffing columns. This actually is a normalization issue. Sometimes a single column is used for convenience to store what should be two or three columns. Such design flaws are introduced when the DBA does not analyze the data for patterns and relationships. An example of overstuffing would be storing a person’s name in a single column instead of capturing first name, middle initial, and last name as individual columns.
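
To illustrate the alternative to an overstuffed name column, here is a small sketch using Python's built-in sqlite3 module. The person table, its column names, and the sample rows are made up for the example. With the name parts stored separately, sorting by last name is a simple ORDER BY, and the full name is assembled only for display.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: each part of the name gets its own column.
conn.execute(
    """
    CREATE TABLE person (
        person_id      INTEGER PRIMARY KEY,
        first_name     TEXT NOT NULL,
        middle_initial TEXT,
        last_name      TEXT NOT NULL
    )
    """
)
# Made-up sample rows.
conn.executemany(
    "INSERT INTO person (first_name, middle_initial, last_name) VALUES (?, ?, ?)",
    [("Pat", "Q", "Smith"), ("Lee", None, "Adams"), ("Sam", "R", "Jones")],
)


def display_name(first, middle, last):
    """Assemble a display name from the individual columns."""
    parts = [first, f"{middle}." if middle else None, last]
    return " ".join(p for p in parts if p)


# Sorting on last name needs no string parsing.
full_names = [
    display_name(*row)
    for row in conn.execute(
        "SELECT first_name, middle_initial, last_name FROM person ORDER BY last_name"
    )
]
```

Going the other way, from one stuffed column back to its parts, would require parsing rules that break on the first unusual name.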

Poorly designed keys can wreck the usability of a database. A primary key should be nonvolatile because changing the value of the primary key can be very expensive. When you change a primary key value you have to ripple through foreign keys to cascade the changes into the child table.

A common design flaw is using the Social Security number as the primary key of a personnel or customer table. This is a flaw for several reasons, two of which are: 1) a Social Security number is not necessarily unique and 2) if your business expands outside the USA, no one will have a Social Security number to use, so then what do you store as the primary key?
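
One common way to avoid both problems is a surrogate key, sketched below with Python's built-in sqlite3 module. This is an illustration under assumptions, not the only valid design: the customer and customer_order tables, the national_id column, and the placeholder identifier values are all hypothetical. The primary key is a system-assigned number that never changes, so correcting a national identifier touches one row and nothing ripples into child tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,  -- surrogate key: assigned once, never updated
        national_id TEXT,                 -- SSN or similar; nullable, not assumed unique
        country     TEXT NOT NULL
    );
    CREATE TABLE customer_order (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer (customer_id)
    );
    """
)

# Placeholder identifier values, invented for the example.
us_id = conn.execute(
    "INSERT INTO customer (national_id, country) VALUES (?, ?)", ("000-00-0000", "US")
).lastrowid
# A customer outside the USA simply has no national_id recorded here.
ca_id = conn.execute(
    "INSERT INTO customer (national_id, country) VALUES (?, ?)", (None, "CA")
).lastrowid
conn.execute("INSERT INTO customer_order (customer_id) VALUES (?)", (us_id,))

# Correcting the identifier updates one row; the child table is untouched.
conn.execute(
    "UPDATE customer SET national_id = ? WHERE customer_id = ?", ("000-00-0001", us_id)
)

order_owner = conn.execute(
    """
    SELECT c.national_id
    FROM customer_order AS o
    JOIN customer AS c ON c.customer_id = o.customer_id
    """
).fetchone()[0]
```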

Actually, failing to account for international issues can have greater repercussions. For example, when storing addresses, how do you define zip code? The ZIP code is a USA concept; many countries have similar postal codes, though they are not necessarily numeric. And state is a USA concept, too.

Of course, some other countries have states or similar concepts (Canadian provinces). So just how do you create all of the address columns to assure that you capture all of the information for every person to be stored in the table regardless of country? The answer, of course, is to conduct proper data modeling and database design.

Denormalization of the physical database is a design option but it can only be done if the design was first normalized. How do you denormalize something that was not first normalized? Actually, a more fundamental problem with database design is improper normalization. By focusing on normalization, data modeling and database design, you can avoid creating a hostile database.

Without proper upfront analysis and design, the database is unlikely to be flexible enough to easily support the changing requirements of the user. With sufficient preparation, flexibility can be designed into the database to support the user’s anticipated changes. Of course, if you don’t take the time during the design phase to ask the users about their anticipated future needs, you cannot create the database with those needs in mind.


Of course, these are just a few of the more common database design mistakes. Can you name more? If so, please discuss your thoughts and experiences in the comments section.

Posted in data, data modeling, database design, DBA

Happy New Year 2019

Just a quick post today to wish everybody out there a very Happy New Year!


I hope you have started 2019 off with a bang and that the year is successful and enjoyable for one and all!

Posted in Happy New Year

FaunaDB: A multi-model, distributed database system with ACID consistency

Although relational, SQL database systems continue to dominate the DBMS market, modern database management has shifted to encompass additional types of database systems. This is exemplified in the rise of the NoSQL database system to serve the needs of modern applications that are not as well-suited for existing relational, SQL database systems.

What used to be rather simple – choosing from three or four market leading SQL DBMS products – has now become confusing and difficult, as you try to make sense of the morass of different DBMS types and offerings on the market.

A Multi-Model Approach

Well, one solution to avoid the confusion is to select a multi-model DBMS offering. A multi-model database system supports multiple types of database models, such as relational, document, graph, wide column, and key/value. FaunaDB is an example of a multi-model DBMS capable of managing both relational and NoSQL data, and designed to support modern, scalable, real-time applications.

FaunaDB combines the scale and flexibility of NoSQL with the safety and data integrity of relational systems. The company refers to this as Relational NoSQL. Unlike many NoSQL database systems, FaunaDB delivers ACID compliance. You can scale transactions across multiple shards and regions while FaunaDB guarantees the accuracy and integrity of your data and transactions.

FaunaDB enables your developers to write sophisticated transactions using languages they already know. And you can pull data from document, relational, graph, and temporal data sets all from within a single query.

Since FaunaDB is NoSQL, you won’t be using SQL to access databases. The Fauna Query Language (FQL) is the primary interface for interacting with a FaunaDB cluster. FQL is not a general-purpose programming language, but it provides for complex manipulation and retrieval of data stored within FaunaDB. The language is expression-oriented: all functions, control structures, and literals return values. This makes it easy to group multiple results together by combining them into an Array or Object, or map over a collection and compute a result – possibly fetching more data – for each member.

A query is executed by submitting it to a FaunaDB cluster, which computes and returns the result. Query execution is transactional, meaning that no changes are committed when something goes wrong. If a query fails, an error is returned instead of a result.

FQL supports a comprehensive set of data types in four categories: simple types, special types, collection types, and complex types. A simple data type is one that is native to FaunaDB and also native to JSON, such as Boolean, Null, Number and String. Special data types in FaunaDB extend the limited number of native JSON data types: Bytes, Date, Query, Ref, Set and Timestamp. A complex data type is a composite of other existing data types, such as an Object or Instance. And the collection data types are able to handle multiple items while maintaining order, such as Array and Page.


Perhaps the most impressive aspect of FaunaDB is how it enables strict serializability for external transactions. By supporting serializable isolation, FaunaDB can process many transactions in parallel, but the final result is the same as processing them one after another. The FaunaDB distributed transaction protocol processes transactions in three phases:

  • In the first, speculative phase, reads are performed as of a recent snapshot, and writes are buffered.
  • The second phase uses a consensus protocol to insert the transaction into a distributed log. At this point, the transaction gets a global transaction identifier that indicates its equivalent serial order relative to all other concurrent transactions. This is the only point at which global consensus is required.
  • Finally, the third phase checks each replica verifying the speculative work. If there are no potential serializability violations, the work is made permanent and buffered writes are written to the database. Otherwise, the transaction is aborted and restarted.
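
The three phases above can be sketched as a toy, single-process simulation in Python. This is emphatically not FaunaDB's implementation or API. The ToyStore class and its method names are invented for illustration, the "distributed log" is just a Python list standing in for the consensus step, and there is only one replica.

```python
class ToyStore:
    """Single-process stand-in for a replica plus the shared transaction log."""

    def __init__(self):
        self.data = {}      # key -> current value
        self.versions = {}  # key -> id of the transaction that last wrote the key
        self.log = []       # stand-in for the distributed log (no real consensus)

    def speculate(self, fn):
        """Phase 1: run fn against current state, recording reads and buffering writes."""
        reads, writes = {}, {}

        def read(key):
            if key in writes:
                return writes[key]
            reads[key] = self.versions.get(key, 0)  # remember the version we saw
            return self.data.get(key)

        def write(key, value):
            writes[key] = value  # buffered; nothing is changed yet

        fn(read, write)
        return reads, writes

    def sequence(self, reads, writes):
        """Phase 2: append to the log; the log position is the serial-order id."""
        self.log.append((reads, writes))
        return len(self.log)

    def apply(self, txn_id):
        """Phase 3: verify the speculative reads, then commit or abort."""
        reads, writes = self.log[txn_id - 1]
        if any(self.versions.get(k, 0) != seen for k, seen in reads.items()):
            return False  # something read has changed since phase 1: abort
        for key, value in writes.items():
            self.data[key] = value
            self.versions[key] = txn_id
        return True


def run(store, fn):
    """Run a transaction, restarting it until it commits."""
    while True:
        txn_id = store.sequence(*store.speculate(fn))
        if store.apply(txn_id):
            return txn_id


def deposit(amount):
    def txn(read, write):
        write("balance", read("balance") + amount)
    return txn


store = ToyStore()
store.data["balance"] = 100

# Two transactions speculate against the same state before either commits.
t1 = store.speculate(deposit(10))
t2 = store.speculate(deposit(5))
id1, id2 = store.sequence(*t1), store.sequence(*t2)

first_ok = store.apply(id1)       # commits: nothing it read has changed
second_ok = store.apply(id2)      # aborts: "balance" changed after it was read
retry_id = run(store, deposit(5))  # the restarted transaction commits
```

In this sketch the second deposit is aborted and rerun, so the final balance reflects both deposits applied one after the other, which is the serial outcome the log order dictates.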

This software approach is novel and allows for the scaling of transactions across multiple shards and regions while guaranteeing transactional correctness and data accuracy. Contrast this with other database systems, such as Google Spanner, that rely on distributed clock synchronization to ensure data consistency.

The FaunaDB approach is based on a 2012 Yale University paper titled “Calvin: Fast Distributed Transactions for Partitioned Database Systems.” You can download that paper here. And if you are interested in additional details, consult this blog post: Consistency without Clocks: The FaunaDB Distributed Transaction Protocol.


Many database systems provide multi-tenant capabilities. They can contain multiple databases, each with their own access controls. FaunaDB takes this further by allowing any database to have multiple child databases. This enables an operator to manage a single large FaunaDB cluster, create a few top-level databases, and give full administrative access of those databases to associated teams. Each team is free to create as many databases as they need without requiring operator intervention. As far as the team is concerned, they have their own full FaunaDB cluster.


Strong temporal support is an additional capability of FaunaDB. Traditionally, a database system stores only data that is valid at the current point-in-time; it does not track the past state of the data. Most data changes over time, and different users and applications can have requirements to access that data at different points in time. Temporal support makes it possible to query data “as of” different past states.

All records in FaunaDB are temporal. When instances are changed, instead of overwriting the prior contents, a new instance version at the current transaction timestamp is inserted into the instance history, and marked as a create, update, or delete event. This means that with FaunaDB, all reads can be executed consistently at any point in the past or transformed into a change feed of events between any two points in time. This is useful for many different types of applications, such as auditing, rollback, cache coherency, and others.
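
The idea of keeping every version and reading "as of" a timestamp can be sketched in a few lines of Python. This is a conceptual illustration only, not FaunaDB's storage engine or API. The TemporalStore class, its method names, and the integer timestamps are assumptions made for the example.

```python
from bisect import bisect_right


class TemporalStore:
    """Toy versioned store: every change is appended as an event, never overwritten."""

    def __init__(self):
        self._history = {}  # key -> list of (timestamp, action, value), ordered by timestamp

    def _append(self, key, ts, action, value):
        events = self._history.setdefault(key, [])
        if events and ts <= events[-1][0]:
            raise ValueError("timestamps must increase for a given key")
        events.append((ts, action, value))

    def put(self, key, value, ts):
        events = self._history.get(key, [])
        exists = bool(events) and events[-1][1] != "delete"
        self._append(key, ts, "update" if exists else "create", value)

    def delete(self, key, ts):
        self._append(key, ts, "delete", None)

    def read(self, key, at):
        """Return the value as of timestamp `at`, or None if absent at that time."""
        events = self._history.get(key, [])
        i = bisect_right([e[0] for e in events], at)
        if i == 0 or events[i - 1][1] == "delete":
            return None
        return events[i - 1][2]

    def changes(self, key, start, end):
        """Change feed: events with start < timestamp <= end."""
        return [e for e in self._history.get(key, []) if start < e[0] <= end]


store = TemporalStore()
store.put("user:1", {"plan": "free"}, ts=10)
store.put("user:1", {"plan": "pro"}, ts=20)
store.delete("user:1", ts=30)

as_of_15 = store.read("user:1", at=15)  # the version created at ts=10
as_of_25 = store.read("user:1", at=25)  # the version written at ts=20
as_of_35 = store.read("user:1", at=35)  # deleted by then
feed = store.changes("user:1", start=10, end=30)
```

Auditing and rollback both fall out of the same history: an audit walks the change feed, and a rollback re-reads the state as of an earlier timestamp.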

Strong Security

Data protection and security have become more important as data breaches continue to dominate the news. Regulation and data governance practices dictate that organizations implement strong protective measures on sensitive data.

FaunaDB implements security at the API level. Access to the FaunaDB API uses access keys, which authenticate connections as having particular permissions. This access key system applies to administrator- and server-level connections, as well as to object- and user-level connections.

In other words, reading or writing instances of user-defined classes in FaunaDB requires a server key, or an instance token with appropriate permissions.

Delivery Models

FaunaDB can run anywhere you need it to run: on-premises, in your cloud, the public cloud, even multiple clouds. Basically, FaunaDB can run anywhere you can run a JVM.

The FaunaDB Serverless Cloud enables developers to implement and elastically scale cloud applications with no capacity planning or provisioning. FaunaDB Cloud provides essential features that enable developers to safely build and run serverless applications without configuring or operating infrastructure.

The serverless approach uses an event-driven architecture where developers code functions and deploy them to the infrastructure. The functions only consume resources when they are invoked, at which point they run within the architecture. A serverless architecture is conducive to modern development practices because it can eliminate many of the difficulties developers face reconciling their database infrastructure with today’s development methods.

Summing Things Up

Prior to founding Fauna in 2012, the team behind FaunaDB was part of the team that developed the infrastructure at Twitter. And FaunaDB is already being used at many leading enterprises. Check out these write ups about FaunaDB usage at NVIDIA, ShiftX, and VoiceConnect. Others are available at Fauna’s web site.

So, if you are looking for a multi-model, secure NoSQL database platform with strong consistency, horizontal scalability, multi-tenancy and temporal capabilities that can run on-premises and in the cloud, consider taking a look at FaunaDB.

Posted in cloud, data availability, DBMS, Isolation Level, NoSQL, relational, temporal

SQL Performance and Optimization

Just a quick post today to refer my readers to a series of blog posts that I recently made to the IDERA database community blog.

This four-part blog series took a look into SQL performance and optimization from a generic perspective.  By that I mean that I did not focus on any particular DBMS, but on the general things that are required of and performed during the optimization of SQL.

Part one – Relational Optimization – introduces and explains the general concept of relational optimization and what it entails;

Part two – Query Analysis and Access Path Formulation – examines the process of analyzing SQL queries and introduces the types of access that can be performed on a single table;

Part three – Multiple Table Access Methods – takes a look at optimization methods for combining data from more than one table;

And finally, part four – Additional Considerations – concludes the series with an overview of several additional aspects of SQL optimization that we have yet to discuss.

If you are looking for a nice overview of SQL and relational optimization without DBMS-specific details, give these posts a read!


Posted in optimization, performance, SQL

My Data Quotes – 2018

I am frequently approached by journalists and bloggers for my thoughts on the data-related news of the day… and I am usually happy to discuss data with anybody! Some of these discussions wind up getting quoted in news articles and posts. I like to try to keep track of these quotes.

With that in mind, I thought I’d share the articles where I have been quoted (so far) in 2018:

I may be missing some, so if you remember chatting with me last year and you don’t see your piece listed above please ping me to let me know…

And if you are interested in some of the older pieces where I’ve been quoted I keep a log of them on my web site at  (Note, some of the older articles/posts are no longer available, so some of the links are inoperable.)

Posted in DBA

Teradata Analytics Universe 2018 and Pervasive Data Intelligence

I spent last week in Las Vegas at the Teradata Analytics Universe conference, Teradata’s annual user conference. And there was a lot to do and learn there.



Attendees heading to the Expo Hall at the Teradata Analytics Universe conference in Las Vegas, NV — October 2018


The major message from Teradata is that the company is a “new Teradata.” And the message is “Stop buying analytics,” which may sound like a strange message at a conference with analytics in its name!

But it makes sense if you listen to the entire strategy. Teradata is responding to the reality of the analytics marketplace. And that reality centers around three findings from a survey the company conducted of senior leaders from around the world:

  1. Analytics technology is too complex. 74 percent of senior leaders said their organization’s analytics technology is complex; 42 percent said that analytics is not easy for their employees to use and understand.
  2. Users don’t have access to all the data they need. 79 percent said they need access to more company data to do their job effectively.
  3. Data scientists are a bottleneck. Only 25 percent said that, within their enterprise, business decision makers have the skills to access and use intelligence from analytics without the need for data scientists.




To respond to these challenges, Teradata says you should buy “answers” not “analytics.” And they are correct. Organizations are not looking for more complex, time-consuming, difficult-to-use tools, but answers to their most pressing questions.

Teradata calls their new approach “pervasive data intelligence,” which delivers access to all data, all the time, to find answers to the toughest challenges. This can be done on-premises, in the cloud, and anywhere in between.

A big part of this new approach is founded on Teradata Vantage, which provides businesses the speed, scale and flexibility they need to analyze anything, deploy anywhere and deliver analytics that matter. At the center of Vantage is Teradata’s respected analytics database management system, but it also brings together analytic functions and engines within a single environment. And it integrates with all the popular open source workbenches, platforms, and languages, including SQL, R, Python, Jupyter, RStudio, SAS, and more.

“Uncovering valuable intelligence at scale has always been what we do, but now we’re taking our unique offering to new heights, unifying our positioning while making our software and consulting expertise available as-a-service, in the cloud, or on-premises,” said Victor Lund, Teradata CEO.

Moving from analytical silos to an analytics platform that can deliver pervasive data intelligence sounds to me like a reasonable way to tackle the complexity, confusion, and bottlenecks common today.

Check out what Teradata has to offer at

Posted in analytics, data, Teradata, tools

Data Modeling with Navicat Data Modeler

Data modeling is the process of analyzing the things of interest to your organization and how these things are related to each other. The data modeling process results in the discovery and documentation of the data resources of your business. Data modeling asks the question “What?” instead of the more common data-processing question “How?”

As data professionals, it is important that we understand what the data is and what it means before we attempt to build databases and applications using the data. Even with today’s modern infrastructure that includes databases with flexible schemas that are applied when read (instead of the more traditional method of applying the schema on write), you still need a schema and an understanding of the data in order to do anything useful with it. And that means a model of the data.

The modeling process requires three phases and types of models: conceptual, logical and physical. A conceptual data model is generally more abstract and less detailed than a complete logical data model. It depicts a high-level, business-oriented view of information. The logical data model consists of fully normalized entities with all attributes defined. Furthermore, the domain or data type of each attribute must be defined. A logical data model provides an in-depth description of the data independent of any physical database manifestations. The physical data model transforms the logical model into a physical implementation using a specific DBMS product such as Oracle, MySQL or SQL Server.

Navicat Data Modeler

Which brings us to the primary focus of today’s blog post: Navicat Data Modeler. We have looked at other Navicat products in this blog before (1, 2, 3), but those were performance and DBA tools. Navicat Data Modeler is designed to be used by data architects and modelers (but it can, of course, be used by DBAs, too).

A good data modeling tool provides the user with an easy-to-use palette for creating data models, and Navicat Data Modeler succeeds in this area. Navicat Data Modeler provides a rich interface for visually designing and building conceptual, logical and physical data models. Figure 1 shows a portion of a larger logical data model for a university application.


Figure 1. A Logical Data Model in Navicat Data Modeler


The interface enables the user to clearly see the relationships, entities, and attributes at a high level, and to zoom in to see details (see Figure 2).


Figure 2. Attribute details for the Student entity


The tool offers a lot of flexibility, so you can create, modify, and design models in a user-friendly manner that suits the way you like to work. Navicat Data Modeler supports three standard notations: Crow’s Foot, IDEF1x and UML.

Although easy to use, Navicat Data Modeler is a powerful data modeling and database design tool. As already mentioned, it supports conceptual, logical, and physical modeling. Importantly, though, the tool manages migration of models using reverse and forward engineering processes.  Using the Model Conversion feature, you can convert a conceptual model into a logical model, modify and further design at the logical level, and then convert into a physical database implementation. Navicat Data Modeler supports MySQL, MariaDB, Oracle, Microsoft SQL Server, PostgreSQL, and SQLite. See Figure 3.


Figure 3. Forward engineering to a target database


OK, so that covers forward engineering, but what about reverse engineering? You can use Navicat Data Modeler to reverse engineer a physical database structure into a physical model, thereby enabling you to visualize the database to see the physical attributes (tables, columns, indexes, RI, and other objects) and how they relate to each other without showing any actual data.
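The general idea behind reverse engineering can be sketched with nothing more than a database catalog. The example below is an illustration only, not how Navicat Data Modeler works internally: it uses Python's standard `sqlite3` module and SQLite's `PRAGMA table_info` and `PRAGMA foreign_key_list` to recover tables, columns, and foreign keys without reading any rows of data. The `reverse_engineer` function and the sample tables are hypothetical.

```python
import sqlite3


def reverse_engineer(conn):
    """Return {table: {"columns": [...], "foreign_keys": [...]}} from the catalog."""
    model = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = [
            {"name": row[1], "type": row[2], "pk": bool(row[5])}
            for row in conn.execute(f'PRAGMA table_info("{table}")')
        ]
        # PRAGMA foreign_key_list rows: (id, seq, table, from, to, ...)
        foreign_keys = [
            {"column": row[3], "references": f"{row[2]}.{row[4]}"}
            for row in conn.execute(f'PRAGMA foreign_key_list("{table}")')
        ]
        model[table] = {"columns": columns, "foreign_keys": foreign_keys}
    return model


# Build a small in-memory database to introspect (sample schema is made up).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, last_name TEXT NOT NULL);
    CREATE TABLE enrollment (
        enrollment_id INTEGER PRIMARY KEY,
        student_id INTEGER NOT NULL REFERENCES student (student_id)
    );
""")
model = reverse_engineer(conn)
```

The resulting dictionary is a crude physical model: enough structure to draw tables and the relationships between them.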

Furthermore, you can import models from ODBC data sources, print models to files, and compare and synchronize databases and models. The Synchronize to Database function can be used to discover all database differences. You can view the differences and generate a synchronization script to update the destination database to make it identical to your model. And there are settings that can be used to customize how comparison and synchronization work between environments.
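Conceptually, comparing a model to a database and generating a synchronization script comes down to a structural diff. The following is a rough sketch under simplifying assumptions: both sides are represented as plain dictionaries mapping table names to column definitions, and the hypothetical `sync_script` function only adds what is missing while flagging everything else for review. A real tool handles far more (indexes, constraints, type changes, drop rules).

```python
def sync_script(model, database):
    """Return SQL statements that bring `database` in line with `model`.

    Both arguments map table name -> {column name: column type}.
    """
    statements = []
    for table, columns in model.items():
        if table not in database:
            cols = ", ".join(f"{name} {ctype}" for name, ctype in columns.items())
            statements.append(f"CREATE TABLE {table} ({cols});")
            continue
        for name, ctype in columns.items():
            if name not in database[table]:
                statements.append(f"ALTER TABLE {table} ADD COLUMN {name} {ctype};")
            elif database[table][name] != ctype:
                # Type changes are DBMS-specific, so only flag them for review.
                statements.append(
                    f"-- REVIEW: {table}.{name} is {database[table][name]}, "
                    f"model says {ctype}"
                )
    for table in database:
        if table not in model:
            # Never drop automatically; leave the decision to a person.
            statements.append(f"-- REVIEW: table {table} exists only in the database")
    return statements


# Made-up model and database definitions for illustration.
model = {
    "student": {"student_id": "INTEGER", "last_name": "TEXT", "email": "TEXT"},
    "course": {"course_id": "INTEGER"},
}
database = {
    "student": {"student_id": "INTEGER", "last_name": "TEXT"},
    "audit_log": {"id": "INTEGER"},
}
script = sync_script(model, database)
```

Emitting review comments instead of destructive statements is a deliberate choice in this sketch: additive changes are safe to script, while drops and type changes deserve a human look.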

It is also worth noting that Navicat Data Modeler is fully integrated with Navicat Cloud. This makes sharing models much easier. You can sync your model files and virtual groups to the cloud for real-time access anytime and anywhere.


A proper database design cannot be thrown together quickly by novices. Data professionals require domain and design knowledge and powerful tools to implement their vision. Navicat Data Modeler offers one such tool that is worthy of your consideration.

10 Rules for Succeeding as a DBA

Being a successful database administrator requires more than just technical acumen and deep knowledge of database systems. You also must possess a proper attitude, sufficient fortitude, and a diligent personality to achieve success in database administration. Gaining the technical know-how is important, yes, but there are many sources that offer technical guidance for DBAs. The non-technical aspects of database administration are just as challenging, though. So with that in mind, let’s take a look at ten “rules of thumb” for DBAs to follow as they improve their soft skills.

Rule #1: Write Down Everything – DBAs encounter many challenging tasks and time-consuming problems. The wise DBA always documents the processes used to resolve problems and overcome challenges. Such documentation can be very valuable (both to you and others) should you encounter a similar problem in the future. It is better to read your notes than to try to re-create a scenario from memory.

Rule #2: Keep Everything – Database administration is the perfect job for you if you are a pack rat. It is a good practice to keep everything you come across during the course of performing your job. Otherwise, it always seems that you need something the day after you threw it out! I still own some manuals for DB2 Version 2.

Rule #3: Automate – Why should you do it by hand if you can automate your DBA processes? Anything you can do can probably be done better by the computer – if it is programmed to do it properly. And once it is automated, you save yourself valuable time that is better spent tackling other problems.

Rule #4: Share Your Knowledge – The more you learn, the more you should try to share what you know with others. There are many vehicles for sharing your knowledge: local user groups, online forums, web portals, magazines, blogs, Twitter, and so on. Sharing your experiences helps to encourage others to share theirs, so we can all benefit from each other’s best practices.

Rule #5: Focus Your Efforts – The DBA job is complex and spans many diverse technological and functional areas. It is easy for a DBA to get overwhelmed with certain tasks – especially those tasks that are not performed regularly. Understand the purpose for each task you are going to perform and focus on performing the steps that will help you to achieve that purpose. Do not be persuaded to broaden the scope of work for individual tasks unless it cannot be avoided. Analyze, simplify, and focus. Only then will tasks become measurable and easier to achieve.

Rule #6: Don’t Panic! – Problems will occur. There is nothing you can do to eliminate every possible problem or error. Part of your job is to be able to react to problems calmly and analytically. When a database is down and applications are unavailable, your environment will become hectic and frazzled. The best thing you can do when problems occur is to remain calm and go about your job using your knowledge and training.

Rule #7: Measure Twice, Cut Once – Being prepared means analyzing, documenting, and testing your DBA policies and procedures. Creating simple procedures in a vacuum without testing will do little to help you run an efficient database environment. And it will not prepare you to react rapidly and effectively to problem situations.

Rule #8: Understand the Business – Remember that being technologically adept is just a part of being a good DBA. Technology is important, but understanding your business needs is more important. If you do not understand the business reasons for, and the impact of, the databases you manage, then you will simply be throwing technology around with no clear purpose.

Rule #9: Don’t Be a Hermit – Be accessible; don’t be one of those “curmudgeon in the corner” DBAs that developers are afraid to approach. The more you are valued for your expertise and availability, the more valuable you are to your company. By learning what the applications must do you can better adjust and tune the databases to support the business.

Rule #10: Use All of the Resources at Your Disposal – Remember that you do not have to do everything yourself. Use the resources at your disposal. Many times others have already encountered and solved the problem that vexes you. Use your DBMS vendor’s technical support to help with particularly thorny problems. Use internal resources for areas where you have limited experience, such as network specialists for connectivity problems and system administrators for OS and system software problems. Build a network of colleagues that you can contact for assistance. Your network can be an invaluable resource and no one at your company even needs to know that you didn’t solve the problem yourself.

Achieve DBA Success!

The job of the DBA is a challenging one – from a technological, political and interpersonal perspective. Follow the rules presented in this blog post to improve your success as a DBA.
