Amazon Aurora

Amazon Aurora is AWS's cloud-native relational database, MySQL- and PostgreSQL-compatible, delivered through Amazon RDS. It combines the familiarity of open-source engines with a re-architected storage layer that keeps six copies of the data across three Availability Zones, which gives higher throughput, faster recovery, and near-zero data loss.


Key Features:


Aurora vs. RDS (MySQL/Postgres):


Common Use Cases:


Service Limits & Quotas:


Pricing Model:


Code Example — Create a Serverless v2 Aurora PostgreSQL Cluster:


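# Create the cluster. Serverless v2 capacity limits are set at the cluster level.
# --manage-master-user-password stores the master password in Secrets Manager.
# --enable-http-endpoint turns on the RDS Data API used by the boto3 example.
aws rds create-db-cluster \
  --db-cluster-identifier prod-app \
  --engine aurora-postgresql \
  --engine-version 16.3 \
  --serverless-v2-scaling-configuration MinCapacity=0.5,MaxCapacity=16 \
  --master-username dbadmin \
  --manage-master-user-password \
  --vpc-security-group-ids sg-0abc123 \
  --db-subnet-group-name prod-private \
  --storage-encrypted \
  --kms-key-id alias/aurora \
  --backup-retention-period 14 \
  --enable-http-endpoint \
  --deletion-protection

# The first instance added to the cluster becomes the writer.
aws rds create-db-instance \
  --db-instance-identifier prod-app-writer \
  --db-cluster-identifier prod-app \
  --db-instance-class db.serverless \
  --engine aurora-postgresql

# Any further instance joins as a reader and is a failover target.
aws rds create-db-instance \
  --db-instance-identifier prod-app-reader \
  --db-cluster-identifier prod-app \
  --db-instance-class db.serverless \
  --engine aurora-postgresql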

Connect via boto3 RDS Data API (Serverless):


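import boto3

# Requires the Data API (HTTP endpoint) to be enabled on the cluster.
data = boto3.client("rds-data", region_name="us-west-2")
resp = data.execute_statement(
    resourceArn="arn:aws:rds:us-west-2:111122223333:cluster:prod-app",
    secretArn="arn:aws:secretsmanager:us-west-2:111122223333:secret:rds!cluster-xxx",
    database="orders",
    sql="SELECT id, total FROM orders WHERE created_at > :since",
    # typeHint sends the string as a DATE; without it PostgreSQL may reject
    # the comparison between a timestamp column and a text parameter.
    parameters=[
        {"name": "since", "typeHint": "DATE", "value": {"stringValue": "2026-04-01"}}
    ],
)
# Each record is a list of typed fields, e.g. [{"longValue": 1}, {"stringValue": "19.99"}].
for row in resp["records"]:
    print(row)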
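
The Data API returns each row as a list of single-key typed fields, which is awkward to consume directly. The sketch below shows one way to flatten a response into plain dictionaries. It assumes the call was made with includeResultMetadata=True so that columnMetadata is present; the helper names decode_field and rows_as_dicts are hypothetical, and array values are passed through undecoded.

```python
from typing import Any


def decode_field(field: dict) -> Any:
    """Turn one typed field such as {"longValue": 7} into a plain value."""
    if field.get("isNull"):
        return None
    # A non-null field carries exactly one key naming its wire type
    # (stringValue, longValue, doubleValue, booleanValue, ...).
    return next(iter(field.values()))


def rows_as_dicts(response: dict) -> list:
    """Pair column names with decoded values for every record."""
    names = [col["name"] for col in response.get("columnMetadata", [])]
    return [
        dict(zip(names, (decode_field(f) for f in record)))
        for record in response.get("records", [])
    ]


if __name__ == "__main__":
    # Hand-written sample shaped like an execute_statement response.
    sample = {
        "columnMetadata": [{"name": "id"}, {"name": "total"}],
        "records": [[{"longValue": 1}, {"stringValue": "19.99"}]],
    }
    print(rows_as_dicts(sample))
```

If column metadata was not requested, names is empty and each row decodes to an empty dictionary, so the caller should request metadata or supply its own column list.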
  


Common Interview Questions:

How does Aurora storage differ from RDS storage?

Aurora uses a distributed, log-structured storage volume shared by the writer and all readers; instances don't have their own EBS volumes. The writer sends only redo log records to six storage nodes across three AZs, and a write is acknowledged once four of the six confirm (write quorum). Readers attach to the same volume and apply the redo stream only to pages already in their cache, so replica lag is typically well under 100 ms, though not zero.
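
The quorum arithmetic can be checked with a small model. This is only a sketch of the counting, not of Aurora's storage protocol: the AZ names are hypothetical, and the read quorum of 3 comes from AWS's published description of the design rather than from the answer above.

```python
WRITE_QUORUM = 4   # 4 of 6 copies must confirm a write
READ_QUORUM = 3    # assumption taken from AWS's design description
COPIES_PER_AZ = 2  # 6 copies spread evenly over 3 AZs
AZS = ("az-a", "az-b", "az-c")  # hypothetical AZ names


def healthy_copies(failed_azs=(), extra_failed_copies=0):
    """Count surviving copies after whole-AZ and single-copy failures."""
    alive = COPIES_PER_AZ * (len(AZS) - len(set(failed_azs)))
    return max(alive - extra_failed_copies, 0)


def can_write(failed_azs=(), extra_failed_copies=0):
    return healthy_copies(failed_azs, extra_failed_copies) >= WRITE_QUORUM


def can_read(failed_azs=(), extra_failed_copies=0):
    return healthy_copies(failed_azs, extra_failed_copies) >= READ_QUORUM


if __name__ == "__main__":
    print("one AZ down, writable:", can_write(["az-a"]))
    print("one AZ plus one copy down, writable:", can_write(["az-a"], 1))
    print("one AZ plus one copy down, readable:", can_read(["az-a"], 1))
```

Under these numbers the volume stays writable after losing a whole AZ and stays readable after losing an AZ plus one more copy, which is the usual argument for six copies rather than three.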

Aurora Serverless v1 vs. v2?

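v1 paused fully when idle and scaled by doubling capacity, with cold-start latency on resume; v2 scales in 0.5-ACU increments within seconds while the instance stays online, supports reader instances, and is the recommended option for production. Newer engine versions also let v2 pause automatically when the minimum capacity is set to 0 ACUs. AWS has announced end of life for v1, so new clusters should use v2.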
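
As a rough illustration of the 0.5-ACU granularity, the sketch below validates a capacity range and passes it to modify_db_cluster through boto3. The helper names scaling_config and apply_scaling are hypothetical, and the validation covers only the step size and ordering; the service enforces its own lower and upper bounds.

```python
def scaling_config(min_acu: float, max_acu: float) -> dict:
    """Build a ServerlessV2ScalingConfiguration dict in 0.5-ACU steps."""
    for name, value in (("min_acu", min_acu), ("max_acu", max_acu)):
        # Doubling a valid capacity must give a whole number.
        if value < 0 or (value * 2) % 1 != 0:
            raise ValueError(f"{name} must be a non-negative multiple of 0.5")
    if min_acu > max_acu:
        raise ValueError("min_acu must not exceed max_acu")
    return {"MinCapacity": min_acu, "MaxCapacity": max_acu}


def apply_scaling(cluster_id: str, min_acu: float, max_acu: float,
                  region: str = "us-west-2") -> dict:
    """Apply the range to an existing cluster (needs AWS credentials)."""
    import boto3  # imported here so the validator works without boto3

    rds = boto3.client("rds", region_name=region)
    return rds.modify_db_cluster(
        DBClusterIdentifier=cluster_id,
        ServerlessV2ScalingConfiguration=scaling_config(min_acu, max_acu),
        ApplyImmediately=True,
    )


if __name__ == "__main__":
    print(scaling_config(0.5, 16))
```

A call such as apply_scaling("prod-app", 0.5, 16) would match the range used in the CLI example.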

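What is Backtrack and when would you use it?

An Aurora MySQL feature that rewinds the cluster's storage volume in place to an earlier timestamp (up to 72 hours back) in seconds, without restoring a snapshot into a new cluster. It must be enabled with a backtrack window when the cluster is created or restored, and it rewinds the whole cluster, not individual tables. Used for "oops" recoveries such as an accidental DROP TABLE or a bad migration. Not available in Aurora PostgreSQL; use point-in-time restore there.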
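
A backtrack request might look like the following sketch. It assumes an Aurora MySQL cluster that already has backtracking enabled; backtrack_target and backtrack_cluster are hypothetical helper names, and the 72-hour check mirrors the limit stated above (a cluster's configured window may be shorter).

```python
from datetime import datetime, timedelta, timezone

MAX_BACKTRACK_WINDOW = timedelta(hours=72)  # upper limit from the text


def backtrack_target(minutes_ago: int, now: datetime = None) -> datetime:
    """Return the UTC timestamp to rewind to, rejecting out-of-window values."""
    now = now or datetime.now(timezone.utc)
    delta = timedelta(minutes=minutes_ago)
    if delta <= timedelta(0) or delta > MAX_BACKTRACK_WINDOW:
        raise ValueError("backtrack target must be within the last 72 hours")
    return now - delta


def backtrack_cluster(cluster_id: str, minutes_ago: int,
                      region: str = "us-west-2") -> dict:
    """Rewind the cluster in place (needs AWS credentials)."""
    import boto3  # imported here so the date helper works without boto3

    rds = boto3.client("rds", region_name=region)
    return rds.backtrack_db_cluster(
        DBClusterIdentifier=cluster_id,
        BacktrackTo=backtrack_target(minutes_ago),
    )


if __name__ == "__main__":
    print(backtrack_target(30))
```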

How does failover work in Aurora?

The cluster has a writer endpoint (DNS) pointing at the current writer; on writer failure, Aurora promotes one replica (chosen by promotion priority tier) and re-points the writer endpoint, typically in under 30 seconds. The reader endpoint load-balances connections across the remaining replicas. Clients should reconnect through the endpoints and avoid caching resolved IP addresses for long.
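
The priority ordering can be sketched as a simple selection. This follows the rule described in AWS's documentation as I understand it: the lowest tier number wins, and ties go to the larger instance. The Replica type and the memory_gib field are hypothetical stand-ins, not Aurora API objects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    name: str
    promotion_tier: int  # 0 is the highest priority, 15 the lowest
    memory_gib: int      # hypothetical stand-in for "instance size"


def pick_failover_target(replicas):
    """Choose the replica a failover would be expected to promote."""
    if not replicas:
        raise ValueError("no replicas available for promotion")
    # Lowest tier first; within a tier, prefer the larger instance.
    return min(replicas, key=lambda r: (r.promotion_tier, -r.memory_gib))


if __name__ == "__main__":
    fleet = [
        Replica("reader-a", promotion_tier=1, memory_gib=64),
        Replica("reader-b", promotion_tier=0, memory_gib=16),
        Replica("reader-c", promotion_tier=0, memory_gib=32),
    ]
    print(pick_failover_target(fleet).name)
```

In practice this is why a reader sized like the writer is usually placed in tier 0: it keeps capacity steady after promotion.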

What's Zero-ETL to Redshift?

Native managed replication from Aurora MySQL/PostgreSQL into Amazon Redshift: change data is captured at the storage layer and applied to Redshift, typically within seconds. It removes the need for hand-built Glue/DMS pipelines in the common operational-to-analytical pattern.

When should you pick I/O-Optimized over Standard storage?

When I/O charges on Standard exceed ~25% of total cluster cost, or when workload IOPS are unpredictable enough that flat-rate pricing simplifies budgeting. I/O-Optimized has higher instance and per-GB storage prices but no per-I/O charges.
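
The 25% rule of thumb is easy to apply to a monthly bill. The sketch below takes the three Standard-configuration cost components as inputs; the figures in the example are made up for illustration and are not AWS prices.

```python
IO_SHARE_THRESHOLD = 0.25  # rule of thumb from the text


def io_share(compute: float, storage: float, io: float) -> float:
    """Fraction of the total Standard bill that is I/O charges."""
    total = compute + storage + io
    if total <= 0:
        raise ValueError("total cost must be positive")
    return io / total


def prefer_io_optimized(compute: float, storage: float, io: float) -> bool:
    """True when I/O charges exceed the threshold share of the bill."""
    return io_share(compute, storage, io) > IO_SHARE_THRESHOLD


if __name__ == "__main__":
    # Hypothetical monthly costs in USD for one cluster on Standard.
    print(prefer_io_optimized(compute=600, storage=100, io=300))
    print(prefer_io_optimized(compute=800, storage=100, io=100))
```

This is only a first-pass screen; an exact comparison would reprice compute and storage at I/O-Optimized rates for the Region in question.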