Event Sourcing

Learn how data is synchronized using the event sourcing approach.

Synchronizing data using event sourcing

Event sourcing is another approach to data synchronization. Every update operation is written to an append-only event log. Interested applications consume events from this log and store the associated data in their preferred datastore. The current state of the system can be derived by consuming all the events from the beginning of the log.
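
As a rough illustration of this idea, the following sketch models the log as an in-memory list and derives the current state by replaying it from the start. The names `Event`, `EventLog`, `apply_event`, and `rebuild_state`, as well as the "set" and "delete" event types, are assumptions made for this example; a real system would use a durable log.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    kind: str                      # "set" or "delete" (hypothetical event types)
    key: str
    value: Optional[str] = None


class EventLog:
    """In-memory stand-in for a durable, append-only event log."""

    def __init__(self):
        self._events = []

    def append(self, event):
        # Events are only ever added at the end, never modified or removed.
        self._events.append(event)
        return len(self._events) - 1   # offset of the appended event

    def read_from(self, offset=0):
        return self._events[offset:]


def apply_event(state, event):
    """Apply a single event to a consumer's local state."""
    if event.kind == "set":
        state[event.key] = event.value
    elif event.kind == "delete":
        state.pop(event.key, None)


def rebuild_state(log):
    """Derive the current state by consuming the log from the beginning."""
    state = {}
    for event in log.read_from(0):
        apply_event(state, event)
    return state


log = EventLog()
log.append(Event("set", "user:1", "alice"))
log.append(Event("set", "user:2", "bob"))
log.append(Event("delete", "user:1"))
state = rebuild_state(log)
```

Each consuming application could keep its own `state` (a cache, a search index, a database) and apply the same events to it.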

However, applications typically save periodic snapshots (also known as checkpoints) of their state so that they do not have to re-consume the whole log after a failure. An application that recovers from a failure then only needs to replay the events that follow the latest snapshot.
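
A minimal sketch of snapshot-based recovery is shown below. It assumes the log is a plain list of `(key, value)` events and that a snapshot is a copy of the state together with the offset of the next event to consume. The `Consumer` class and the `snapshot_every` parameter are hypothetical names chosen for this example.

```python
import copy


class Consumer:
    def __init__(self, snapshot_every=3):
        self.state = {}
        self.next_offset = 0               # offset of the next event to consume
        self.snapshot_every = snapshot_every
        self.snapshot = ({}, 0)            # (copy of state, next offset)
        self.replayed = 0                  # events applied since the last (re)start

    def consume(self, log):
        for key, value in log[self.next_offset:]:
            self.state[key] = value
            self.next_offset += 1
            self.replayed += 1
            # Periodically save a snapshot of the state and the log position.
            if self.next_offset % self.snapshot_every == 0:
                self.snapshot = (copy.deepcopy(self.state), self.next_offset)

    def recover(self, log):
        # Restore the latest snapshot, then replay only the events after it.
        saved_state, saved_offset = self.snapshot
        self.state = copy.deepcopy(saved_state)
        self.next_offset = saved_offset
        self.replayed = 0
        self.consume(log)


log = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5), ("d", 6), ("a", 7)]
consumer = Consumer(snapshot_every=3)
consumer.consume(log)

consumer.state = {}        # simulate losing the in-memory state in a crash
consumer.recover(log)      # replays only the events after the last snapshot
```

In this run, the latest snapshot covers the first six events, so recovery replays a single event instead of all seven.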

Mitigating the problems of dual writes

The event sourcing approach does not violate atomicity, so there is no need for an atomic commit protocol. The reason is that every application consumes the log independently and will eventually process all the events successfully, restarting from the last consumed event after a temporary failure.
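
The sketch below illustrates this behavior under simple assumptions: two consumers read the same list-based log, and each advances its own offset only after an event has been written successfully. `FlakyStore` and `LogConsumer` are hypothetical names, and the temporary failure is simulated with a counter.

```python
class FlakyStore:
    """Hypothetical datastore whose writes fail a fixed number of times."""

    def __init__(self, failures=0):
        self.data = {}
        self._failures_left = failures

    def write(self, key, value):
        if self._failures_left > 0:
            self._failures_left -= 1
            raise ConnectionError("temporary failure")
        self.data[key] = value


class LogConsumer:
    def __init__(self, store):
        self.store = store
        self.offset = 0                    # index of the next event to process

    def poll(self, log):
        """Process events until the end of the log or the first failure."""
        while self.offset < len(log):
            key, value = log[self.offset]
            try:
                self.store.write(key, value)
            except ConnectionError:
                return False               # offset unchanged: retried on next poll
            self.offset += 1               # advance only after a successful write
        return True


log = [("item:1", "book"), ("item:2", "pen"), ("item:1", "notebook")]
cache = LogConsumer(FlakyStore(failures=0))
index = LogConsumer(FlakyStore(failures=2))

cache.poll(log)                            # succeeds on the first attempt
attempts = 1
while not index.poll(log):                 # retries from the last consumed event
    attempts += 1
```

Neither consumer needs to coordinate with the other. The slower one simply catches up later, and both end with the same data. The writes in this example are idempotent, so retrying an event is harmless.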

The isolation problem in the dual writes approach is also mitigated since the applications will be consuming all the events in the same order.
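
The effect of a shared order can be seen in a small sketch. It assumes two conflicting writes to the same key and a hypothetical helper, `apply_all`, that applies writes in the order given. With dual writes, each datastore may receive the writes in a different order. With a log, both apply them in log order.

```python
def apply_all(writes):
    """Apply (key, value) writes in the given order; the last write wins."""
    state = {}
    for key, value in writes:
        state[key] = value
    return state


write_a = ("x", "A")
write_b = ("x", "B")

# Dual writes: the two datastores may observe the writes in different orders.
db_dual = apply_all([write_a, write_b])
cache_dual = apply_all([write_b, write_a])

# Event sourcing: both datastores consume the same log in the same order.
log = [write_a, write_b]
db = apply_all(log)
cache = apply_all(log)
```

Here `db_dual` and `cache_dual` disagree on the value of `x`, while `db` and `cache` converge to the same value.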

Caveat in the event sourcing approach

There is a small caveat in the event sourcing approach: applications might consume the events from the log at different speeds, which means an event will not be reflected instantly in all the applications. This lag can be handled at the application level. For example, if an item is unavailable in the cache, the application can query the other datastores.
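
A minimal sketch of such a fallback follows, assuming the cache and the other datastore are plain dictionaries populated from the same log and that the cache has consumed fewer events. The helper names `apply_events` and `read_item` are hypothetical.

```python
def apply_events(store, log, upto):
    """Apply the first `upto` events of the log to a store."""
    for key, value in log[:upto]:
        store[key] = value


def read_item(key, cache, primary_store):
    """Read from the cache first; on a miss, query the other datastore."""
    if key in cache:
        return cache[key]
    return primary_store.get(key)          # None if the item is unknown everywhere


log = [("item:1", "book"), ("item:2", "pen")]
primary_store, cache = {}, {}
apply_events(primary_store, log, upto=2)   # fully caught up
apply_events(cache, log, upto=1)           # lagging behind by one event

hit = read_item("item:1", cache, primary_store)
fallback = read_item("item:2", cache, primary_store)
missing = read_item("item:3", cache, primary_store)
```

This handles only items that are missing from the cache. A cached value that is present but outdated would still be returned until the cache catches up.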

A different manifestation of this problem could be a successfully indexed item that was not stored in the authoritative datastore yet, leading to a dangling pointer. This could also be mitigated at the application level by identifying and discarding these items instead of displaying broken links. If no such technique can be applied at the application level, a concurrency control protocol could be used, e.g., a locking protocol with the associated performance and availability costs.
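
The application-level mitigation for dangling pointers might look like the sketch below. It assumes a search index that maps a term to item IDs and an authoritative datastore keyed by item ID, both modeled as dictionaries. The `search` function and the sample data are hypothetical.

```python
def search(term, index, primary_store):
    """Return indexed items, discarding those not yet in the authoritative store."""
    results = []
    for item_id in index.get(term, []):
        item = primary_store.get(item_id)
        if item is None:
            continue                       # indexed but not stored yet: skip it
        results.append(item)
    return results


# The index has already consumed the event for item:3, the datastore has not.
index = {"pen": ["item:2", "item:3"]}
primary_store = {"item:2": {"name": "blue pen"}}

results = search("pen", index, primary_store)
```

The user sees only `item:2`. Once the datastore consumes the corresponding event, `item:3` appears in later searches.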

Problems

Some kinds of applications need to perform update operations that require an up-to-date view of the data. The simplest example is a conditional update operation, such as creating a user only if no user with the same username already exists.

Note: A conditional update operation is also known as a compare-and-set (CAS) operation.

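Such an operation is not easily achievable with event sourcing because the applications consume the log asynchronously, so their view of the data is only eventually consistent. The check may therefore run against a view that has not yet consumed the latest events.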
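
The following sketch demonstrates the problem only, not a solution. It assumes a hypothetical `UserService` that checks a local set of usernames, which is updated asynchronously by consuming the log, before appending a "user_created" event.

```python
class UserService:
    def __init__(self):
        self.log = []                      # append-only event log
        self.usernames = set()             # view built by consuming the log
        self.consumed = 0                  # number of events consumed so far

    def create_user_if_absent(self, username):
        # The check runs against a view that may lag behind the log.
        if username in self.usernames:
            return False
        self.log.append(("user_created", username))
        return True

    def catch_up(self):
        """Consume the events that have not been applied to the view yet."""
        for _kind, username in self.log[self.consumed:]:
            self.usernames.add(username)
        self.consumed = len(self.log)


service = UserService()
first = service.create_user_if_absent("alice")
second = service.create_user_if_absent("alice")   # view has not caught up yet
service.catch_up()
third = service.create_user_if_absent("alice")    # view is now up to date
```

Both the first and second requests succeed, so the log ends up with two creation events for the same username. Only the third request, issued after the view has caught up, is rejected.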

In the next lesson, we will learn another approach that solves this problem.
