FaunaDB: A multi-model, distributed database system with ACID consistency

Although relational, SQL database systems continue to dominate the DBMS market, modern database management has shifted to encompass additional types of database systems. This is exemplified in the rise of the NoSQL database system to serve the needs of modern applications that are not as well-suited for existing relational, SQL database systems.

What used to be rather simple – choosing from three or four market leading SQL DBMS products – has now become confusing and difficult trying to understand the morass of different DBMS types and offerings on the market.

A Multi-Model Approach

Well, one solution to avoid the confusion is to select a multi-model DBMS offering. A multi-model database system supports multiple types of database models, such as relational, document, graph, wide column, and key/value. FaunaDB is an example of a multi-model DBMS capable of managing both relational and NoSQL data, and designed to support modern, scalable, real-time applications.

FaunaDB combines the scale and flexibility of NoSQL with the safety and data integrity of relational systems. The company refers to this as Relational NoSQL. Unlike many NoSQL database systems, FaunaDB delivers ACID compliance. You can scale transactions across multiple shards and regions while FaunaDB guarantees the accuracy and integrity of your data and transactions.

FaunaDB enables your developers to write sophisticated transactions using languages they already know. And you can pull data from document, relational, graph, and temporal data sets all from within a single query.

Since FaunaDB is NoSQL, you won’t be using SQL to access databases. The Fauna Query Language (FQL) is the primary interface for interacting with a FaunaDB cluster. FQL is not a general-purpose programming language, but it provides for complex, manipulation and retrieval of data stored within FaunaDB. The language is expression-oriented: all functions, control structures, and literals return values. This makes it easy to group multiple results together by combining them into an Array or Object, or map over a collection and compute a result – possibly fetching more data – for each member.

A query is executed by submitting it to a FaunaDB cluster, which computes and returns the result. Query execution is transactional, meaning that no changes are committed when something goes wrong. If a query fails, an error is returned instead of a result.

FQL supports a comprehensive set of data types in four categories: simple types, special types, collection type and complex types. A simple data type is one that is native to FaunaDB and also native to JSON, such as Boolean, Null, Number and String. Special data types in FaunaDB extend the limited number of native JSON data types; Bytes, Date, Query, Ref, Set and Timestamp. A complex data type is a composite of other existing data types, such as an Object or Instance. And the collection data type is able to handle multiple items while maintaining order, such as Array and Page.

Consistency

Perhaps the most impressive aspect of FaunaDB is how it enables strict serializability for external transactions. By supporting serializable isolation, FaunaDB can process many transactions in parallel, but the final result is the same as processing them one after another. The FaunaDB distributed transaction protocol processes transactions in three phases:

  • In the first, speculative phase, reads are performed as of a recent snapshot, and writes are buffered.
  • The second phase uses a consensus protocol to insert the transaction into a distributed log. At this point, the transaction gets a global transaction identifier that indicates its equivalent serial order relative to all other concurrent transactions. This is the only point at which global consensus is required.
  • Finally, the third phase checks each replica verifying the speculative work. If there are no potential serializability violations, the work is made permanent and buffered writes are written to the database. Otherwise, the transaction is aborted and restarted.

This software approach is novel and allows for the scaling of transactions across multiple shards and regions while guaranteeing transactional correctness and data accuracy. Contrast this with other database systems, such as Google Spanner, that rely on distributed clock synchronization to ensure data consistency.

The FaunaDB approach is based on a 2012 Yale University paper titled “Calvin: Fast Distributed Transactions for Partitioned Database Systems.” You can download that paper here. And if you are interested in additional details, consult this blog post: Consistency without Clocks: The FaunaDB Distributed Transaction Protocol.

Multi-Tenancy

Many database systems provide multi-tenant capabilities. They can contain multiple databases, each with their own access controls. FaunaDB takes this further by allowing any database to have multiple child databases. This enables an operator to manage a single large FaunaDB cluster, create a few top-level databases, and give full administrative access of those databases to associated teams. Each team is free to create as many databases as they need without requiring operator intervention. As far as the team is concerned, they have their own full FaunaDB cluster.

Temporality

Strong temporal support is an additional capability of FaunaDB. Traditionally, a database system stores only data that is valid at the current point-in-time; it does not track the past state of the data. Most data changes over time, and different users and applications can have requirements to access that data at different points in time. Temporal support makes it possible to query data “as of” different past states.

All records in FaunaDB are temporal. When instances are changed, instead of overwriting the prior contents, a new instance version at the current transaction timestamp is inserted into the instance history, and marked as a create, update, or delete event. This means that with FaunaDB, all reads can be executed consistently at any point in the past or transformed into a change feed of events between any two points in time. This is useful for many different types of applications, such as auditing, rollback, cache coherency, and others.

 Strong Security

Data protection and security has become more important as data breaches continue to dominate the news. Regulation and data governance practices dictate that organizations implement strong protective measure on sensitive data.

FaunaDB implements security at the API level. Access to the FaunaDB API uses  access keys, which authenticate connections as having particular permissions. This access key system applies to administrator- and server-level connections, as well as to object- and user-level connections.

In other words, reading or writing instances of user-defined classes in FaunaDB requires a server key, or an instance token with appropriate permissions.

Delivery Models

FaunaDB can run anywhere you need it to run: on-premises, in your cloud, the public cloud, even multiple clouds. Basically, FaunaDB can run anywhere you can run a JVM.

The FaunaDB Serverless Cloud enables developers to implement and elastically scale cloud applications with no capacity planning or provisioning. FaunaDB Cloud provides essential features that enable developers to safely build and run serverless applications without configuring or operating infrastructure.

The serverless approach uses an event-driven architecture where developers code functions and deploy them to the infrastructure. The functions only consume resources when they are invoked, at which point they run within the architecture. A serverless architecture is conducive to modern development practices because it can eliminate many of the difficulties developers face reconciling their database infrastructure with today’s development methods.

Summing Things Up

Prior to founding Fauna in 2012, the team at FaunaDB was part of the team that developed the infrastructure at Twitter. And FaunaDB is already being used at many leading enterprises. Check out these write ups about FaunaDB usage at NVIDIA, ShiftX, and VoiceConnect. Others are available at Fauna’s web site.

So, if you are looking for a multi-model, secure NoSQL database platform with strong consistency, horizontal scalability, multi-tenenacy and temporal capabilities, that can run on-premise and in the cloud, consider taking a look at FaunaDB.

About craig@craigsmullins.com

I'm a strategist, researcher, and consultant with nearly three decades of experience in all facets of database systems development.
This entry was posted in cloud, data availability, DBMS, Isolation Level, NoSQL, relational, temporal. Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s