28 November 2018

Accelerate Database Development and Testing with Amazon Aurora (DAT313)

by mo

  • devs, devops, dbas
  • db architecture
  • deployment, automation
  • improve dev/test outcomes
  • performance focused dev
  • cost effective

aurora provides native capabilities to meet these goals

develop with scale of production in mind. not an afterthought for a dba.

architecture, deployment and automation

aurora: cloud native db. moves from the monolithic database model to scale-out cloud patterns.

scale-out distributed architecture, purpose-built storage system, service-oriented architecture leveraging aws services:

  • compute -> ec2
  • iam
  • lambda triggers
  • integrates with s3 for imports and exports
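The s3 integration above can be exercised with aurora mysql's LOAD DATA FROM S3 statement; a minimal sketch that builds one (the bucket, table, and column names are hypothetical, and the cluster needs an IAM role granting s3 access):

```python
def build_s3_load_statement(s3_uri, table, columns):
    """Build an Aurora MySQL LOAD DATA FROM S3 statement.
    Hypothetical names; the cluster needs an attached IAM role for S3."""
    cols = ", ".join(columns)
    return (
        f"LOAD DATA FROM S3 '{s3_uri}' "
        f"INTO TABLE {table} "
        "FIELDS TERMINATED BY ',' "
        f"({cols})"
    )

stmt = build_s3_load_statement("s3://example-bucket/users.csv", "users", ["id", "name"])
print(stmt)
```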

fully managed service, they automate admin tasks so i don’t have to worry about things like patching and HA.

aurora delivers 5x the throughput of mysql and 3x the throughput of postgresql

key characteristics

  1. log-structured storage is custom built and distributed, shared between all cluster nodes.
  2. readers are strictly read only
  3. optimized for high throughput

mysql uses binary log replication (binlogs).

  • replicate binlogs to another instance then apply binlogs to local storage.

aurora nodes all see the same storage, so no binlog transport is necessary. aurora does more for less.

core use cases: read scaling architecture

  • auto scaling optional
  • all readers treated as the same size as the master
  • any reader can be a failover target
  • application should use cluster and reader endpoints
  • dev tip: monitor cluster topology for faster failover
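A minimal sketch of the endpoint tip above: route reads to the reader endpoint and everything else to the cluster (writer) endpoint. The hostnames are made-up examples; a real client would also need pooling and failover handling:

```python
# hypothetical aurora endpoints for a cluster named "mycluster"
CLUSTER_ENDPOINT = "mycluster.cluster-abc.us-east-1.rds.amazonaws.com"
READER_ENDPOINT = "mycluster.cluster-ro-abc.us-east-1.rds.amazonaws.com"

def endpoint_for(sql: str) -> str:
    """Send SELECTs to the reader endpoint, everything else to the writer."""
    first_word = sql.lstrip().split(None, 1)[0].upper()
    return READER_ENDPOINT if first_word == "SELECT" else CLUSTER_ENDPOINT

print(endpoint_for("SELECT * FROM orders"))      # reader endpoint
print(endpoint_for("UPDATE orders SET paid = 1"))  # cluster (writer) endpoint
```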

core use cases: separated workloads arch

  • each reader used for a specific purpose
  • ordered failover targeting
  • application uses cluster, custom, or even db instance endpoints

caveat: readers share the same storage and undo log

deployment automation with amazon aurora


  • reduce risk of human errors
  • repeatability
  • accountability and management of change


  • declarative automation (cloudformation, terraform): it’s about the outcome, less about the journey
  • procedural automation (aws cli, aws sdks): bring it from state a to state b; it’s about the journey, less about the outcome

procedural automation

  • think about your steps and validate that the initial state meets your expectations; otherwise, garbage in, garbage out.
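The garbage-in-garbage-out point can be sketched as a step runner that validates the expected initial state before acting; all names here are illustrative, not an AWS API:

```python
def run_step(state, precondition, action, step_name):
    """Run one procedural-automation step only after validating the state
    it expects; fail fast instead of acting on garbage input."""
    if not precondition(state):
        raise RuntimeError(f"precondition failed for step: {step_name}")
    return action(state)

# toy example: only "reboot" an instance that is currently available
state = {"status": "available"}
new_state = run_step(
    state,
    precondition=lambda s: s["status"] == "available",
    action=lambda s: {**s, "status": "rebooting"},
    step_name="reboot",
)
print(new_state["status"])  # rebooting
```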


  • check for feature gaps between aws cloudformation and the rds apis.
  • check feasibility of workloads using aws cloudformation custom resources
  • ensure efficient ordering during resource provisioning
  • create alternative automations for minimal downtime

improve development and test outcomes

bad practices:

  • app dev and testing on tiny subsets of non-production data
  • bulk data exports from live production instances
  • running long, unoptimized queries on cluster readers
  • lack of outcome focus

right way:

  • know your features; point in time restore, database cloning, backtrack
  • proper workload isolation
  • test, don’t hope for the best

database cloning

  • test changes in pre-prod on relevant data sets
  • reorganize a database with minimal production impact
  • save a point in time snapshot for data analysis without impacting production systems

if you need to run db intensive reporting tasks on a cron, then spin up a clone, run the queries on the cron interval and tear it down when it is done. then you do not impact your running instance that is serving a production load
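The clone-for-reporting pattern above, sketched with the boto3 RDS calls for copy-on-write cloning. The cluster names and the `run_queries` callback are hypothetical, and in practice you would also add a DB instance to the clone cluster before connecting:

```python
def run_reporting_on_clone(rds, source_cluster, clone_id, run_queries):
    """Spin up a copy-on-write clone, run the reporting workload against it,
    then tear it down. `rds` is a boto3 RDS client (or a stub for testing)."""
    rds.restore_db_cluster_to_point_in_time(
        SourceDBClusterIdentifier=source_cluster,
        DBClusterIdentifier=clone_id,
        RestoreType="copy-on-write",
        UseLatestRestorableTime=True,
    )
    try:
        # point the reporting queries at the clone, not production
        run_queries(clone_id)
    finally:
        rds.delete_db_cluster(DBClusterIdentifier=clone_id, SkipFinalSnapshot=True)
```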


backtrack

allows you to move the database back in time.

  • reduce risk of db changes at scale. (try it at scale, if it fails then backtrack)
  • undo unintentional dml and ddl changes
  • mitigate risk

how does backtrack work?

  • we keep periodic snapshots of each segment, we also preserve the logs
  • for backtrack, we identify the appropriate segment snapshots
  • apply log streams to segment snapshots in parallel and async
  • backtrack is not destructive
  • you can backtrack multiple times to find the right point in time
  • pay for the volume of change records retained for the desired duration
  • available for aurora mysql 5.6 compatible. working to bring it to other engines.

backtrack operates on the whole db not on a single table.
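A hedged sketch of invoking backtrack via boto3's `backtrack_db_cluster`; the cluster id is hypothetical, the cluster must have a backtrack window configured, and `rds` can be a stub for testing:

```python
from datetime import datetime, timedelta, timezone

def undo_recent_change(rds, cluster_id, minutes=5):
    """Rewind the whole cluster (backtrack is cluster-wide, not per-table)
    to a few minutes ago; `rds` is a boto3 RDS client or a stub."""
    target = datetime.now(timezone.utc) - timedelta(minutes=minutes)
    return rds.backtrack_db_cluster(
        DBClusterIdentifier=cluster_id,
        BacktrackTo=target,
    )
```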

rds console database cluster

dev focused on production scale. bad practices:

  • app dev and testing on tiny subsets of non prod data
  • database is always there for me and never fails
  • running long, unoptimized queries on cluster readers
  • lack of outcome focus

right way:

  • be aware of architecture differences between aurora and equivalent engines
  • know aurora capabilities and operation procedures: automated failover, upgrades
  • extend aurora features with application side cluster awareness
  • test, don’t hope for the best: failure injection
    • simulate loss of a disk
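Failure injection in aurora mysql is done with fault-injection queries; a few examples collected as strings, with arbitrary percentages and durations (run these only against a test cluster, never production):

```python
# aurora mysql fault-injection queries; the 25% / 30-second values
# are arbitrary examples, not recommendations
FAULT_QUERIES = {
    "crash_instance": "ALTER SYSTEM CRASH INSTANCE",
    "disk_failure": (
        "ALTER SYSTEM SIMULATE 25 PERCENT DISK FAILURE "
        "FOR INTERVAL 30 SECOND"
    ),
    "replica_failure": (
        "ALTER SYSTEM SIMULATE 25 PERCENT READ REPLICA FAILURE "
        "FOR INTERVAL 30 SECOND"
    ),
}

for name, sql in FAULT_QUERIES.items():
    print(name, "->", sql)
```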

everything fails all the time - Werner Vogels

robust self healing, but you need to be aware from an app perspective

  • mysql drivers are not cluster or topology aware.
  • aurora is wire compatible with mysql and pg.
  • use client side connection pool, recycle connections periodically, avoid connection storms
  • use smart driver, or build topology awareness in your client data layer, validate connections
  • understand the behaviour of the aurora dns endpoints, manage dns caching
  • split reads to reader nodes and writes to write node.
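The recycle-connections-periodically advice can be sketched as a tiny pool that discards connections older than a max age, so clients make a fresh DNS lookup and notice topology changes after a failover. An illustrative sketch, not a production pool (no locking, single connection):

```python
import time

class RecyclingPool:
    """Minimal client-side pool that recycles its connection after
    max_age seconds; `connect` is any callable returning a connection."""
    def __init__(self, connect, max_age=300):
        self.connect = connect
        self.max_age = max_age
        self._conn = None
        self._born = 0.0

    def get(self):
        if self._conn is None or time.monotonic() - self._born > self.max_age:
            self._conn = self.connect()  # fresh connection -> fresh DNS lookup
            self._born = time.monotonic()
        return self._conn
```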

monitor query perf; start with the outcome in mind

  • establish baseline of acceptable/desired query perf
  • assess perf impact of workload changes
  • troubleshoot poor perf, identify bottlenecks
  • effective capacity planning
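A sketch of the baseline idea above: compare current query latencies against an established baseline and flag regressions beyond a tolerance. The query names and numbers are invented:

```python
def regression_check(baseline_ms, current_ms, tolerance=0.20):
    """Flag queries whose latency regressed more than `tolerance`
    (20% here, an arbitrary example) versus the baseline."""
    regressed = {}
    for query, base in baseline_ms.items():
        cur = current_ms.get(query)
        if cur is not None and cur > base * (1 + tolerance):
            regressed[query] = (base, cur)
    return regressed

baseline = {"orders_by_day": 40.0, "top_customers": 120.0}
current = {"orders_by_day": 42.0, "top_customers": 300.0}
print(regression_check(baseline, current))
```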

health at a glance

  • cloudwatch metrics: (query throughput, latency)
  • cloudwatch logs: (send slow query logs, audit)
  • enhanced monitoring: additional db specific metrics at up to 1 second granularity
  • performance insights: query and wait level perf data

performance insights provides dashboard of top slow queries and other performance issues

  • available for mysql and pg
  • 7 days free storage of perf metrics
  • api access to perf insights for 3rd party tools integration