This tip is excerpted from my book Database Administration: The Complete Guide to DBA Practices and Procedures.
Today we start by discussing free space.
Free space, sometimes called fill factor, can be used to leave a portion of a table space or index empty and available to store newly added data. Specifying free space for a table space or index can reduce the frequency of reorganization, reduce contention, and increase the efficiency of access and insertion. Each DBMS provides a method of specifying free space for a database object in the CREATE and ALTER statements. A typical parameter is PCTFREE, with which the DBA specifies the percentage of each data page that should remain available for future inserts. Another common parameter is FREEPAGE, with which the DBA indicates how many pages should be filled before a completely empty page is left available.
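The exact syntax varies by DBMS. As one illustration, DB2 for z/OS accepts these parameters directly in the table space DDL; the sketch below uses hypothetical database, storage group, and table space names:

```sql
-- Reserve 10% of each page for future inserts, and leave every
-- 20th page completely empty (DB2 for z/OS style syntax).
-- DB1, SG1, and CUST_TS are hypothetical names.
CREATE TABLESPACE CUST_TS
    IN DB1
    USING STOGROUP SG1
    PCTFREE 10
    FREEPAGE 20;

-- The same parameters can be changed later without recreating the object:
ALTER TABLESPACE DB1.CUST_TS
    PCTFREE 5
    FREEPAGE 0;
```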
Ensuring a proper amount of free space for each database object provides the following benefits:
- Inserts are faster when free space is available.
- As new rows are inserted, they can be properly clustered.
- Variable-length rows and altered rows have room to expand, potentially reducing the number of relocated rows.
- Fewer rows on a page result in better concurrency because less data is unavailable to other users when a page is locked.
However, free space also has several disadvantages:
- Disk storage requirements are greater.
- Scans can take longer.
- Fewer rows on a page can require more I/O operations to access the requested information (when sequentially accessing many rows).
- Because the number of rows per page decreases, data caching becomes less efficient: each I/O retrieves fewer rows.
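The rows-per-page arithmetic behind these trade-offs is simple. The following sketch assumes a 4K data page and a fixed 800-byte row, and ignores page header overhead for simplicity:

```python
PAGE_SIZE = 4096   # 4K data page (header overhead ignored for simplicity)
ROW_SIZE = 800     # fixed-length row, in bytes

def rows_per_page(pctfree: int) -> int:
    """Rows that fit on a page after reserving pctfree percent for inserts."""
    usable = PAGE_SIZE * (100 - pctfree) // 100
    return usable // ROW_SIZE

print(rows_per_page(0))    # no free space: 5 rows per page
print(rows_per_page(10))   # PCTFREE 10:   4 rows per page
print(rows_per_page(30))   # PCTFREE 30:   3 rows per page
```

As the output shows, reserving even 10% of each page costs a row per page in this scenario, which is why free space should be tuned per object rather than set blindly.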
The DBA should monitor free space and ensure that the appropriate amount is defined for each database object. The correct amount of free space must be based on:
- Frequency of inserts and modifications
- Amount of sequential versus random access
- Impact of accessing unclustered data
- Type of processing
- Likelihood of row chaining, row migration, and page splits
Don’t define free space for a static table, that is, one that is not going to grow in size. Static tables do not need room in which to expand.
Another useful database performance tuning tactic is to deploy data compression.
Compression can be used to shrink the size of a database. By compressing data, the database requires less disk storage. Some DBMSs provide internal DDL options to compress database files; third-party software is available for those that do not provide such features.
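Again, the DDL option differs by DBMS. A DB2 for z/OS-style sketch, with hypothetical object names, looks like this:

```sql
-- Enable compression for a table space (DB2 for z/OS style syntax).
-- DB1, SG1, and ORDER_TS are hypothetical names.
CREATE TABLESPACE ORDER_TS
    IN DB1
    USING STOGROUP SG1
    COMPRESS YES;

-- Compression can also be enabled for an existing table space;
-- existing rows are typically compressed at the next LOAD or REORG.
ALTER TABLESPACE DB1.ORDER_TS COMPRESS YES;
```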
When compression is specified, data is algorithmically compressed upon insertion into the database and decompressed when it is read. Reading and writing compressed data consumes more CPU resources than reading and writing uncompressed data: The DBMS must execute code to compress and decompress the data as users insert, update, and read the data.
So why compress data? Consider an uncompressed table with a row size of 800 bytes. Five of this table’s rows would fit in a 4K data page (or block). Now what happens if the data is compressed? Assume that the compression routine achieves 30% compression on average (a very conservative estimate), meaning each row shrinks to 70% of its original size. In that case, the 800-byte row will consume only 560 bytes (800 x 0.70 = 560). After compressing the data, seven rows will fit on a 4K page. Because I/O occurs at the page level, a single I/O will retrieve more data, which will optimize the performance of sequential data scans and increase the likelihood of data residing in the cache (or buffer pool) because more rows fit on a physical page.
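The arithmetic above can be verified with a few lines of code (assuming a 4K page, an 800-byte row, 30% average compression, and ignoring page overhead):

```python
PAGE_SIZE = 4096
ROW_SIZE = 800
SAVINGS = 0.30   # 30% of each row's bytes are saved by compression

compressed_row = int(ROW_SIZE * (1 - SAVINGS))   # 560 bytes

rows_uncompressed = PAGE_SIZE // ROW_SIZE        # 5 rows per 4K page
rows_compressed = PAGE_SIZE // compressed_row    # 7 rows per 4K page

print(compressed_row, rows_uncompressed, rows_compressed)
```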
Of course, compression requires a trade-off that the DBA must analyze. On the positive side, we have disk savings and the potential for reducing I/O cost. On the negative side, we have the additional CPU cost required to compress and decompress the data. Depending on the DBMS and hardware being used, though, the CPU consumption may be nominal or not worth considering. For example, DB2 for z/OS takes advantage of an on-board chip for database compression operations, which avoids taxing the general CPU and is very efficient.
However, compression is not an option for every database index or table. For smaller amounts of data, it is possible that a compressed file will be larger than an uncompressed file. This is because some DBMSs and compression algorithms require an internal dictionary to manage the compression. The dictionary contains statistics about the composition of the data being compressed. For a trivial amount of data, the size of the dictionary may be greater than the amount of storage saved by compression.
Free space and compression are two additional tools in the DBA’s arsenal for optimizing and controlling database performance. Understanding how each is handled for your specific DBMS will allow you to become a better DBA and improve the performance of your databases, applications, and systems.
This series of blog posts is excerpted from Craig Mullins’ classic book Database Administration: The Complete Guide to DBA Practices and Procedures. Consult the book for more in-depth information on any of these topics – and much, much more.