Event streaming: Capturing changes to data

5 min read

DevRev was designed to be a system of record, memorializing the work that businesses perform to serve their customers. Changes within DevRev are quantized as events, which encapsulate the creation of, and updates to, data. While events are crucial to the functioning of any software system, they are especially important for a management system like DevRev because various downstream functions rely on events to take actions such as generating notifications, applying SLAs and workflows, and delivering webhooks.

Two concepts are important for event handling. Change data capture (CDC) refers to the emission of changes (events) in a data source so they can be relayed to target systems. The event bus is a component that records and distributes the events so that other services can consume them.

Event Example

Let's look at an example of an event from end to end.

  1. The user accesses a ticket in the UI and changes the severity of the work item.
  2. The UI sends an API request to the DevRev back-end on the works.update endpoint to apply the change.
  3. The validation subsystem checks that the ticket is valid and that the user has authorization to make the change.
  4. The DB persists a record in its journal stating that the ticket's severity was updated by a given user.
  5. A connector reads the journal entry and publishes it to the event bus.
  6. Two services take action based on the ticket severity change that they read from the event bus.
    • The timeline consumer detects the update and sees that it's a property change that should be captured in the ticket's Events panel. It asks the timeline service to insert a new entry, after which the UI shows the updated severity and a "change severity" event on the ticket's timeline.
    • The SLA service recomputes the SLA of the ticket based on the new severity.
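
The steps above can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions, not DevRev's implementation: the journal and event bus are plain lists, and every name here (ChangeEvent, update_work, run_connector, the SLA_HOURS table and its values) is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeEvent:
    object_id: str
    field_name: str
    old_value: str
    new_value: str
    actor: str


# Hypothetical SLA targets in hours per severity; not DevRev's real values.
SLA_HOURS = {"blocker": 4, "high": 8, "medium": 24, "low": 72}


def update_work(ticket, journal, actor, new_severity):
    """Steps 2-4: validate the request, apply it, and journal the change."""
    if new_severity not in SLA_HOURS:
        raise ValueError(f"unknown severity: {new_severity}")
    if actor not in ticket["editors"]:
        raise PermissionError(f"{actor} may not edit {ticket['id']}")
    event = ChangeEvent(
        ticket["id"], "severity", ticket["severity"], new_severity, actor
    )
    ticket["severity"] = new_severity
    journal.append(event)


def run_connector(journal, bus):
    """Step 5: move journal entries onto the event bus, preserving order."""
    while journal:
        bus.append(journal.pop(0))


def timeline_consumer(bus, timeline):
    """Step 6, first bullet: turn property changes into timeline entries."""
    for event in bus:
        timeline.append(
            f"{event.actor} changed {event.field_name} "
            f"from {event.old_value} to {event.new_value}"
        )


def sla_consumer(bus, sla_targets):
    """Step 6, second bullet: recompute the SLA target on severity changes."""
    for event in bus:
        if event.field_name == "severity":
            sla_targets[event.object_id] = SLA_HOURS[event.new_value]


ticket = {"id": "TKT-1", "severity": "low", "editors": {"alice"}}
journal, bus, timeline, sla_targets = [], [], [], {}

update_work(ticket, journal, "alice", "high")
run_connector(journal, bus)
timeline_consumer(bus, timeline)
sla_consumer(bus, sla_targets)

print(timeline)     # ['alice changed severity from low to high']
print(sla_targets)  # {'TKT-1': 8}
```

Note that the two consumers never call each other; each one only reads from the bus, which is the property the rest of this post builds on.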

Change Data Capture (CDC)

Historically, databases have been ideal for retrieving the current state of the data (the "what") but not the incremental changes that accumulated to reach that state (the "how"). CDC, which has become popular more recently, is a concept in which the database lets clients read its recent stream of changes. However, CDC isn't a formal protocol and doesn't specify much more than that, so each database must offer its own implementation. Because CDC is not a database's primary focus, its capabilities are limited: change retention may be relatively short, and access may be restricted to only a few clients. This makes the database's CDC stream unsuitable for widespread back-end data processing on its own, so another mechanism is needed.
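
To make the retention limit concrete, here is a minimal sketch of a change stream that keeps only the most recent changes. It is an assumption-laden toy, not any particular database's CDC API: the ChangeStream class, its retention parameter, and the read_since method are all hypothetical names.

```python
from collections import deque


class ChangeStream:
    """Toy change stream that retains only the newest `retention` changes."""

    def __init__(self, retention):
        # A bounded deque silently drops the oldest entry once it is full.
        self._changes = deque(maxlen=retention)
        self._next_seq = 0

    def record(self, change):
        self._changes.append((self._next_seq, change))
        self._next_seq += 1

    def read_since(self, seq):
        """Return all changes with sequence number >= seq, if still retained."""
        oldest = self._changes[0][0] if self._changes else self._next_seq
        if seq < oldest:
            raise LookupError(f"changes before {oldest} are no longer retained")
        return [change for s, change in self._changes if s >= seq]


stream = ChangeStream(retention=2)
for change in ["created", "severity=high", "closed"]:
    stream.record(change)

print(stream.read_since(1))  # ['severity=high', 'closed']
```

A reader that asks for sequence 0 here gets a LookupError, because that change has already been evicted. A slow reader of a short-retention stream can lose data in exactly this way.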

Event Bus

The event bus opens up the consumption of the changes emitted from CDC to other services. It enables many independent, concurrent consumers and retains control over the ownership of the data. Standard implementations will optimize the data distribution for throughput, as well as provide mechanisms for data rebalancing, metrics tracking, and disaster recovery solutions. When properly constructed, it gives various services the ability to reliably consume the CDC events independently of each other without affecting the performance or scalability of the originating database.

Updates are always made to the database first, and the database then emits the CDC event asynchronously, shortly afterward (within tens of milliseconds). DevRev then uses a database connector to copy the changes into the event bus, where the various services consume them.

Consumers

Event bus data is segregated into topics, each of which contains related data. For example, one topic may contain only changes to work items. This lets services consume CDC events at a finer granularity, so that bandwidth and processing aren't wasted on receiving and discarding events for unwanted object types.
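
A minimal sketch of topic-based segregation, assuming events are routed by object type. The TopicBus class and the "object_type" field are hypothetical names chosen for illustration.

```python
from collections import defaultdict


class TopicBus:
    """Toy event bus that files each event under a topic named after its type."""

    def __init__(self):
        self._topics = defaultdict(list)

    def publish(self, event):
        self._topics[event["object_type"]].append(event)

    def read(self, topic):
        # Return a copy so that readers cannot mutate the topic's contents.
        return list(self._topics.get(topic, []))


bus = TopicBus()
bus.publish({"object_type": "work", "id": "TKT-1", "change": "severity=high"})
bus.publish({"object_type": "account", "id": "ACC-7", "change": "renamed"})
bus.publish({"object_type": "work", "id": "TKT-2", "change": "created"})

# A service that only cares about work items never sees the account event.
work_events = bus.read("work")
print([e["id"] for e in work_events])  # ['TKT-1', 'TKT-2']
```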

Consumers track their processing of the CDC topics independently. This means that if a consuming service falls behind or goes offline, the ability of other services to consume CDC events is unaffected. This avoids false coupling between services and makes it easy to add new consumers as the system evolves.
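
Independent tracking can be sketched by giving each consumer its own read position within a shared topic. This is an in-memory illustration only. The Topic class, poll, and lag are hypothetical names, and a production event bus would persist these positions durably.

```python
class Topic:
    """Toy append-only topic with one read offset per consumer."""

    def __init__(self):
        self._events = []
        self._offsets = {}  # consumer name -> index of the next event to read

    def append(self, event):
        self._events.append(event)

    def poll(self, consumer, max_events=10):
        """Return the next batch for this consumer and advance only its offset."""
        start = self._offsets.get(consumer, 0)
        batch = self._events[start:start + max_events]
        self._offsets[consumer] = start + len(batch)
        return batch

    def lag(self, consumer):
        """Number of events this consumer has not read yet."""
        return len(self._events) - self._offsets.get(consumer, 0)


topic = Topic()
for n in range(3):
    topic.append(f"event-{n}")

# The timeline consumer keeps up while the SLA consumer is offline.
topic.poll("timeline")
print(topic.lag("timeline"), topic.lag("sla"))  # 0 3

# When the SLA consumer comes back, it resumes from its own position.
print(topic.poll("sla"))  # ['event-0', 'event-1', 'event-2']
```

Adding a new consumer requires no change to the topic or to existing consumers. It simply starts polling under a new name.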

External Processing

Front-end clients, such as DevRev's web and mobile applications, cannot read from the event bus directly. To receive the real-time events that make for a dynamic user experience, clients instead connect to DevRev over WebSocket connections. Clients then subscribe to the particular objects that they're interested in, specifically those currently in view, so that they are informed of any updates to those objects without having to explicitly poll for them.
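
The subscription model can be sketched without a real network layer. In this hypothetical example, a per-client outbox list stands in for the WebSocket connection, and SubscriptionHub with its methods are illustrative names, not DevRev's API.

```python
from collections import defaultdict


class SubscriptionHub:
    """Toy fan-out of events to the clients subscribed to each object."""

    def __init__(self):
        self._subscribers = defaultdict(set)  # object id -> client ids
        self.outbox = defaultdict(list)       # client id -> pushed events

    def subscribe(self, client_id, object_id):
        self._subscribers[object_id].add(client_id)

    def unsubscribe(self, client_id, object_id):
        self._subscribers[object_id].discard(client_id)

    def dispatch(self, event):
        """Push the event only to clients currently viewing that object."""
        for client_id in self._subscribers.get(event["object_id"], ()):
            self.outbox[client_id].append(event)


hub = SubscriptionHub()
hub.subscribe("web-1", "TKT-1")     # the web client has TKT-1 in view
hub.subscribe("mobile-1", "TKT-2")  # the mobile client has TKT-2 in view

hub.dispatch({"object_id": "TKT-1", "change": "severity=high"})

print(len(hub.outbox["web-1"]), len(hub.outbox["mobile-1"]))  # 1 0
```

When an object scrolls out of view, the client would call unsubscribe so that it stops receiving updates it no longer needs.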

Data Backbone

The union of the database and event bus serves as a strong and resilient data backbone for DevRev. The decoupling of consumers makes it tenable to grow and manage a machine that already has many moving parts, so that the motions of one service do not sway and interfere with the others. DevRev continues to invest in, and make improvements to, our CDC architecture so that the system remains dynamic and responsive to our users' needs.

Brian Byrne, Member of Technical Staff

Brian is a software engineer at DevRev. Previously he held positions at Confluent, Nutanix, and Google.