I recently received a question about hot to put together a database environment from scratch. This is a quick question that requires an in-depth answer — usually moreso than can adequately be explained over email or texting. Furthremore, to do a thorough job of it would require a lot of planning and analysis of the organization in question.
Nevertheless, I always try to provide some good advice whenever somebody seeks me out to ask a question. So here is the answer I offered up:
That is indeed an interesting question that is not always easy to answer. Here is how I would go about determining the database architecture for a an organization.
The first thing to do is to determine the primary relational DBMS that will serve as the backbone of the organization’s transactional systems. This could be MySQL, of course, (note: the query talked about experience with MySQL) or you may need to go with a DBMS that has a stronger track record of enterprise support (such as IBM Db2, Microsoft SQL Server, or Oracle Database). Factors to examine include the O/S supported by the DBMS and your organization’s required O/S support, performance and availability requirements, scalability needs, available technicians, support for the cloud, total cost of ownership, and analyst recommendations.
Keep in mind that an organization may have multiple relational/SQL database systems due to packaged application support and historical needs. The lerading relational/SQL DBMSes will support transaction processing for most applications. The next step is to decide on an analytical DBMS for the backbone of analytical, data warehousing and data science type processing. Again, we should choose a relational/SQL DBMS, but it should have the capability to perform OLAP and column-based queries. If the DBMS supports HTAP (Hybrid Transaction Analytical Processing) then perhaps the same DBMS can be used for both transactions and analytics, but this often requires additional configuration and perhaps an appliance (ex. IBM Db2 Analytics Accelerator).
Then you need to review the use cases for all upcoming projects. If the project can be accomplished using the DBMSes selected already, this is ideal. However, life is not always ideal and sometimes this requires adding a new DBMS, and even a new type of DBMS. For example, consider the NoSQL DBMS variants (graph, wide column, document, and key/value databases). Each of these support specific use cases and might be required for web, social media, IoT, or other types of projects that are not easily supported by relational/SQL database systems.
And do not forget legacy DBMS systems required to run applications that have been around for awhile. This may mean that you will have to also support database systems like IMS (hierarchical), IDMS (network/CODASYL), Adabas (inverted list), or OO database systems.
Once you have your infrastructure set, which will likely consist of multiple DBMS products), you need to plan for the update cycle for each DBMS product. This means coordinating with the published release cycle for each vendor. You must always ensure that you are running a supported version of the DBMS. Vendors typically release new versions every 18 to 36t months, but support a product for 5 or more years running from GA (general availability), thru EOM (end of marketing – meaning no longer sold), and EOS (end of support – meaning no longer supported). You must plan to move to a new release of each DBMS before the EOS deadline.
Further complicating issues is that many vendors have moved to an agile, continuous availability model. This means that smaller, more frequent updates are being made available. You must decide where your organization fits on the Gartner technology adoption framework (https://www.gartner.com/it-glossary/type-a-b-and-c-enterprises) and create a strategy for how frequently you accept new modifications.
The cloud throws another monkey-wrench in to this because if you use a cloud database like SQL Azure, for example, then the cloud provider (in this case Microsoft) will always be running the most recent version of the DBMS. So you will have to adopt a more aggressive strategy for accepting change into your DBMS environment.
What do you think? Is this a good high-level approach to assembling a database infrastructure for an organization?