Databases - Aurora

  • Compatible API for PostgreSQL / MySQL
  • Define EC2 instance type for aurora instances
  • Automatic fail-over: Failover in Aurora is instantaneous. It’s HA native.
  • Auto-expanding: Automatically grows in increments of 10 GB, up to 64 TB
  • Does not have free tier.
  • DB engine can be MySQL / Postgres with version
  • DB instance must have unique identifier unique in AWS region
  • Other features: backup and recovery, industry compliance, push-button scaling, automated patching with zero downtime, advanced monitoring, routine maintenance
  • Capacity types
    • Provisioned: Provision and manage the instance sizes
    • Serverless: Specify min and max resources and Aurora auto-scales
    • Multi-AZ deployment: Through replicas in different zone
  • Can enable Enhanced monitoring for more metrics.
  • Monitoring: Export log types to publish to Amazon CloudWatch (Audit, Error, General, Slow query)
  • Deletion protection: You can’t delete the database (protects from accidental deletions)
  • You can handle parameters using
    • DB Parameter Group
    • DB Cluster Parameter Group
    • Option groups
  • Pricing
    • Aurora costs more than RDS (20% more) but more efficient
    • Billed per IO.
      • Read IO: Every database page read operation is 1 IO.
      • Write IO: Pushing transaction logs to shared storage for high availability.
  • 💡 Improve query performances
    • Hash Joins: If you need to join a large amount of data by using an equijoin.
    • Asynchronous key prefetch (AKP) Improve the performance of queries that join tables across indexes
  • Aurora DB Cluster

Aurora DB Cluster

  • Cluster-types
    • Multi-master clusters
      • All DB instances have read-write capability.
    • Single-master clusters
      • Has two instances types:
        1. Primary DB instance: master
          • Each Aurora DB cluster has one primary DB instance.
          • You can use MySQL / PostgreSQL RDS as master in a cluster
        2. Aurora Replica
          • Read-only replicas of master
  • Cluster endpoints
    • Reader endpoint: Read replicas
    • Writer endpoint: Master
      • If master fails, clients still talk to the endpoint and will be redirected to the right instance
    • Custom Endpoints: E.g. point analytics workload to higher memory capacity replicas
    • 💡 Best practice: connect writer endpoint to write and reader endpoint to read.

Availability

  • Automated backups are always enabled.
    • Or use Backtrack to restore data at any point of time without using backups
    • You can also take snapshots & share with other AWS accounts without impacting performance.
      • User-created snapshots are retained after deleting the DB.
      • You can share snapshots across accounts.
  • Faster RTO than other RDS engines by making buffer cache of DB immediately available.
  • Multi-AZ by default with 6 copies (replicas) across 3 AZ:
    • 4 copies out of 6 needed for writes
    • 3 copies out of 6 need for reads
    • You can add additional MySQL or Aurora (recommended) replicas.
      • External MySQL databases as replicas are supported.
      • ❗ Global databases are recommended instead.
  • You can have multi-region replicas.
  • Self healing with peer-to-peer replication
    • Auto healing i.e. Aurora will recover if we lose one replica
  • One Aurora Instance takes writes (master)
  • Automated failover for master in less than 30 seconds
  • Failover behavior
    • You can failover to any replica -> Secondary becomes the master, replication breaks.
      • You can assign priority for replicas, or the one which has same size as primary will be more prioritized.
    • Single instance: Create a new DB in the same AZ, if it fails => try different AZ
      • RTO: 15 min
    • Multi replicas: Flips the canonical name record (CNAME) for your DB Instance to point at the healthy replica
      • RTO: 15 seconds
  • Aurora Global databases
    • Span multiple regions (global)
    • One primary region, one DR region (can be used for lower latency read)
    • < 1 second replica lag on average
    • 📝If not using Global Databases, you can create cross-region Read Replicas
      • 💡 Global databases are recommended: much faster replicas.

Scalability

  • Vertical Scaling
    • Storage Scaling
      • No need to provision
      • Auto-scales up to 64 TB, in 10GB increments
      • Zero-impact to database performance when scaling.
    • Compute resources
      • Can happen during maintainance window
      • Or immediately (a few minutes of downtime)
  • Horizontal Scaling
    • Write Scalability
      • Multi-Master: create multiple read-write instances across multi-AZ.
    • Read Scalability
      • Aurora can have 15 replicas (+ master)
        • MySQL has 5, an the replication process is faster (sub 10ms replica lag)
      • Support for Cross Region Replication
      • All replicas and master uses a shared storage that’s auto expandable
      • Read Replicas can be global
      • Read replicas can have second their (their own) read replicas.
      • 💡 You can use Aurora Replicas as failover targets.
      • Auto-scaling is done through Auto Scaling policy
        • You can scale with based on metrics e.g. average CPU utilization of Aurora Replicas
        • For target value e.g. 60% of CPU utilization, specify minimum & maximum replicas and cooldown periods (Seconds between scale-in and out), and enable/disable “Scale in”
  • Shared storage architecture
    • Makes your data independent from the DB instances in the cluster.
    • You can add a DB instance quickly because Aurora doesn’t make a new copy of the table data
    • Storage is striped across 100s volumes
    • You have auto-scaling for all replicas
    • All read replicas connects to Reader Endpoint (e.g. dbname-ro.id.region.amazonaws.com)
      • Reader endpoint is connection load balancing
      • Load balancing happens at the connection level not the statement level.
        • Anytime client connects to reader endpoint, the client will get connected to the one of the reader endpoint.

Security

  • Encryption at rest using KMS
    • Automated backups, snapshots and replicas are also encrypted
  • Encryption in flight using SSL (same process as MySQL or Postgres)
  • 📝Can use IAM authentication for Aurora MySQL and Postgres
    • You can also use master username, master password but it’s worse.
  • Network level: VPC, subnet, public accessibility (can SSH), availability zone (specify where), create new security group or choose existing one.
  • ❗ You cannot access the SSH into underlying instances directly by design.
  • ❗ You cannot encrypt an existing unencrypted database.
    • Create a new database with encryption enabled and migrate your data instead.

Aurora Serverless

  • No need to choose an instance size -> it auto-scales for you
  • Helpful when you can’t predict the workload
  • DB cluster starts, shutdown and scales automatically based on CPU / connections
  • Can migrate from Aurora Cluster to Aurora Serverless and vice versa
    • Through restoring snapshots
  • Aurora Serverless usage is measured in ACU (Aurora Capacity Units)
    • Billed in 5 minutes increment of ACU
  • ❗ Some of Aurora features are not supported, check documentation

Aurora parallel query

  • Ability to push down and distribute the computational load of a single query across thousands of CPUs in Aurora’s storage layer
  • Good fit for analytical workloads requiring fresh data and good query performance, even on large tables.
  • Can be enabled/disabled globally and on session level with aurora_pq parameter.
  • Causes higher I/O costs as it scans all data.
  • ❗ Not available for serverless, Backtrack-enabled instances, and non R-* instances.
  • 💡 You can extend your own MySQL databases to Aurora
  • You can use Aurora to scale reads for external MySQL databases
  • You can use Aurora for DR with external MySQL databases

Licenses and Attributions


Speak Your Mind