Bad Database Standards, Part 3

Limiting The Number of Tables in “Online” Joins

Today’s entry continues our on-going series on “bad” database standards. And the “bad” standard we’ll discuss tonight is a particularly nasty one. Although the general idea behind this standard might seem to be understandable — at least at first blush — the implementation and wording is typically wrong, wrong, wrong.

OK, so what am I talking about? Well, the way this “bad” standard typically is stated is something like this: “Code no more than a 3 table join in any program that is accessed in an online environment.”

The idea is to limit the amount of work that is done by a transaction so that it can be accomplished in a limited amount of time. Given that there is a user waiting for a response you do not want the transaction to take an inordinate amount of time. Sounds reasonable, right? So why is this wrong?

The “general idea” behind the standard is not necessarily wrong, but the standard is. Whether it says a 3 table join or a 5 table join (or some other number) is not important. The number of tables participating in a join is not the defining factor for performance. Instead, it is the number of qualifying rows to be returned. A well defined query that specifically limits the data to be returned appropriately, can be very efficient whether it is a single table access or a multiple-table join. So do not place artificial limits on the number of tables that can participate in a join. Each DBMS has its upper limit and, of course, you cannot exceed that (for example, 225 tables per SQL statement when using DB2 for z/OS).

Another misunderstanding is when standards are written that “worry” about the size of tables. Once again, this factor is not necessarily important — instead it is the number of qualifying rows again. Consider, for example, a table with 10 million rows in it. A query that uses a matching index to return 2 rows from that table can be very efficient. On the other hand, a query against a smaller table, say 100 pages, that must scan data will likely consume more resources and take longer to complete than the indexed access against the bigger table. The number of rows that qualify for the predicates is more important than the size of the table.

You should create standards and guidelines that help your application developers and SQL users to create efficient queries instead of creating arbitrary standards based on table size or number of tables in a join.

Advertisements

About craig@craigsmullins.com

I'm a strategist, researcher, and consultant with nearly three decades of experience in all facets of database systems development.
This entry was posted in SQL, standards. Bookmark the permalink.

3 Responses to Bad Database Standards, Part 3

  1. Pingback: Tweets that mention Bad Database Standards, Part 3 | Data and Technology Today -- Topsy.com

  2. wow .. very nice your blog
    a lot of information I can from your blog
    thanks

  3. Tamra Hart says:

    If only I had a penny for each time I came here… Incredible article!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s