AWS Aurora Anatomy

[Interactive diagram: a Primary Instance and two Replica Instances spread across Availability Zones A, B, and C, all backed by a Shared Storage Volume of six Storage Nodes (two per AZ), with continuous backups to S3, instance and storage monitoring, and a cluster endpoint such as mydbcluster.cluster-123456789012.us-east-1.rds.amazonaws.com:3306.]

Diagram Overview

AWS Aurora is a managed relational database service. It was purpose-built for the AWS cloud to make common relational database engines more performant, reliable, and scalable.

In this interactive diagram we break down the components of the service for you to explore and learn about.

Before getting to the details, here is an overview of the Aurora value proposition. As you'll see by exploring the diagram, much of this is made possible by the custom database storage platform that AWS built.

Performance
Aurora can provide up to a 5x performance increase for MySQL workloads, and up to 2x performance for PostgreSQL.

Scalability
Aurora provides easy horizontal scaling through read replicas. It also simplifies vertical scaling when you want to increase the size of the database hardware itself.

Reliability
Aurora instances and storage are constantly being monitored. It provides automatic recovery from a range of failure modes, including disk failure, node failure, and even datacenter failure.

Managed Service
Like its predecessor, the Relational Database Service, Aurora makes it easy to perform common database management tasks. Monitoring, failover, backups, scaling, patching and maintenance are managed by the service itself.

Primary Instance

When you create an Aurora instance AWS will automatically provision a number of resources for you. One of those is the primary instance. As with any MySQL or Postgres installation, the primary instance is the master node that handles both writes and reads for the system.

Aurora launched with support for MySQL (several versions are available) and is fully MySQL-compatible: any application that works with MySQL will work with Aurora. AWS is also working on a PostgreSQL-compatible version of Aurora, which is currently in preview.

When launching an Aurora instance you will select an instance type, which defines the hardware specs for the instance. AWS publishes the full list of supported instance types in its documentation.
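As a sketch of how this provisioning looks in code, the following models the request parameters for creating a cluster and its primary instance, shaped after boto3's RDS client. The identifiers, password, and instance class are made-up placeholders, and the actual API calls are left commented out.

```python
# Hypothetical parameters for an Aurora cluster and its primary instance,
# modeled on boto3's RDS API. All names and values here are invented.
cluster_params = {
    "DBClusterIdentifier": "mydbcluster",      # hypothetical cluster name
    "Engine": "aurora",                        # the MySQL-compatible engine
    "MasterUsername": "admin",
    "MasterUserPassword": "change-me",         # placeholder only
}

instance_params = {
    "DBInstanceIdentifier": "mydbcluster-primary",
    "DBClusterIdentifier": "mydbcluster",      # joins the cluster above
    "Engine": "aurora",
    "DBInstanceClass": "db.r3.large",          # the instance type: hardware specs
}

# With boto3 installed and AWS credentials configured, the calls would be roughly:
#   rds = boto3.client("rds")
#   rds.create_db_cluster(**cluster_params)
#   rds.create_db_instance(**instance_params)
print(instance_params["DBInstanceClass"])
```

Note that the instance type lives on the instance, not the cluster, so replicas can in principle use different hardware than the primary.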

Replica Instance

When you launch an Aurora instance, the service will automatically provision two replica instances. These instances are not optional. They are deployed in separate Availability Zones from the primary instance, which ensures that your Aurora deployment has a primary or replica instance in at least three AZs in the region. This is an important aspect of how it all works.

The purpose of the replica instances is twofold. First, the replicas help ensure a highly available deployment. If the primary instance goes down, Aurora will automatically detect the failure and fail over to one of the replicas within 30 seconds. That's pretty impressive for an open source database engine!

The second purpose of the replicas is to serve reads. You can access each replica individually, and even specify a failover order. This is helpful if one of your replicas is dedicated to reporting and you don't want it to be promoted to primary.
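The failover-order idea can be sketched in a few lines. Aurora exposes this as a promotion tier per replica (lower tiers are promoted first); the toy model below picks the next primary the same way, with replica names invented for illustration.

```python
# Toy model of choosing a failover target by promotion tier,
# where a lower tier number means "promote this one first".
def pick_new_primary(replicas):
    """replicas: list of (name, promotion_tier) tuples; returns the name
    of the replica with the lowest promotion tier."""
    return min(replicas, key=lambda r: r[1])[0]

replicas = [
    ("replica-reporting", 15),  # reporting replica: last in line for promotion
    ("replica-az-b", 0),        # preferred failover target
    ("replica-az-c", 1),
]
print(pick_new_primary(replicas))  # replica-az-b
```

Parking a reporting replica at the highest tier is exactly the "don't promote my reporting box" scenario described above.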

The replication process itself is part of the brilliance of the Aurora service. In a traditional master/read-replica setup, the replicated data is sent from the master to the replicas, which reduces the master's capacity to serve read/write requests.

Aurora is different in that it replicates data at the storage tier.

Shared Storage Volume

The storage engine is perhaps the most innovative part of the Aurora service. AWS effectively created a service-oriented architecture (SOA) database storage platform for the cloud.

As you can see in the diagram, the storage system is presented to the primary and replica instances as a virtualized volume. The virtualized volume spans all three Availability Zones (AZ). Within each AZ are two copies of the data. There are many benefits that come with a design like this.

If the primary instance goes down, one of the replicas can be promoted within 30 seconds (usually faster than that). In fact, an entire AZ could go down and the database would survive, because the data is already replicated in the other AZs. The only things that need to happen are for the Aurora service to confirm that a real failure occurred (you don't want false positives!) and for DNS to propagate.
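Because failover completes once DNS points at the newly promoted primary, client applications should be written to reconnect with retries rather than fail on the first error. A minimal sketch, with the database driver stubbed out by a fake connect function:

```python
import time

def connect_with_retry(connect, attempts=5, delay=0.01):
    """Retry a connection function until it succeeds or attempts run out.

    `connect` stands in for a real database driver's connect call; during an
    Aurora failover it fails until DNS resolves to the new primary.
    """
    for _ in range(attempts):
        try:
            return connect()
        except ConnectionError:
            time.sleep(delay)  # real code would back off exponentially
    raise ConnectionError("database still unavailable after retries")

# Simulate a failover: the first two attempts fail, then the new primary answers.
outcomes = iter([ConnectionError, ConnectionError, "connected"])
def fake_connect():
    result = next(outcomes)
    if result is ConnectionError:
        raise ConnectionError("primary unreachable")
    return result

print(connect_with_retry(fake_connect))  # connected
```

Pointing clients at the cluster endpoint (rather than an individual instance) is what lets this DNS-based handoff work without application config changes.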

This design helps the system improve performance by offloading the burden of replication. The master node no longer has to be responsible for managing the replication process. Rather, the replication is done at the storage tier level. This improves the overall performance of the master node (more resources to spend on app read/writes). It should be noted that you do have the option to turn on traditional MySQL replication if desired.

Another great thing about this storage system is that it can continuously back up to the Simple Storage Service (S3). And it does this without any impact on the database instances.

Storage Node

The storage nodes are a subset of the instance's storage volume. The Aurora service runs hundreds of storage nodes to support AWS customers.

In practice, when a write comes in through the primary instance, the storage system will save a total of six copies of the data: two copies in each of the three Availability Zones (AZs). This data is actually striped across hundreds of storage nodes that AWS manages behind the scenes.

It's at this tier that the actual replication occurs. The storage nodes are capable of peer-to-peer replication. If a storage node goes down, or a disk fails, the lost copies are re-replicated seamlessly from the surviving ones.
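AWS has described this six-copy design as quorum-based: a write succeeds once four of the six copies acknowledge it, and a read needs three. The toy model below shows why that tolerates the failures mentioned above, including the loss of an entire AZ; the node labels are invented.

```python
# Toy model of Aurora's six-copy storage quorum as AWS has described it:
# two copies in each of three AZs, a 4-of-6 write quorum, a 3-of-6 read quorum.
COPIES = [("az-a", 1), ("az-a", 2),
          ("az-b", 1), ("az-b", 2),
          ("az-c", 1), ("az-c", 2)]
WRITE_QUORUM, READ_QUORUM = 4, 3

def quorum_ok(available, quorum):
    """available: collection of copies currently reachable."""
    return len(available) >= quorum

# Losing an entire AZ leaves four copies: both writes and reads still succeed.
survivors = [c for c in COPIES if c[0] != "az-c"]
print(quorum_ok(survivors, WRITE_QUORUM), quorum_ok(survivors, READ_QUORUM))  # True True
```

With only three surviving copies (an AZ plus one more node lost), writes would stall but reads could continue, which is when the peer-to-peer repair described above restores the missing copies.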

S3 Backups

Aurora is able to run continuous backups to the AWS Simple Storage Service (S3). The backups are performed without impacting performance or availability. Backups are streamed to S3, which makes fine-grained point-in-time restores possible.
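The connection between streaming backups and point-in-time restore can be sketched simply: if every change is logged with a timestamp, a restore just replays the log up to the requested moment. The log entries below are invented for illustration.

```python
# Toy model of point-in-time restore from a streamed change log.
log = [
    (100, ("INSERT", "a")),
    (105, ("INSERT", "b")),
    (110, ("DELETE", "a")),
]

def restore_to(log, t):
    """Replay logged changes with timestamp <= t into a fresh state."""
    state = set()
    for ts, (op, key) in log:
        if ts > t:
            break
        if op == "INSERT":
            state.add(key)
        else:
            state.discard(key)
    return state

print(sorted(restore_to(log, 107)))  # ['a', 'b']  -- restored to before the delete
```

Because the stream is continuous rather than a nightly snapshot, the restore granularity is per-change, not per-backup-window.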