Sometimes, as a backend developer, you find yourself facing the challenge of updating the schema of a rather large MySQL table. Recently, I had to add a new column to a large (around 32 GB) table in a MySQL cluster running on AWS Aurora. This can't be difficult, right?

Sure, the right way of doing it is always simple:

ALTER TABLE [table] ADD COLUMN [column] AFTER [column]

This works out of the box, but it was not acceptable here because altering locks the table for writes. Even on a fairly fast DB cluster, it would still take around an hour to complete.
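Before picking a strategy, it helps to know exactly how big the table is. A quick way to check, using the same bracketed placeholders as the rest of this post, is to ask information_schema (the query below is a common sizing idiom, not something specific to this migration):

```shell
# Sizing query; run it with:
#   mysql -h [db-instance-hostname] -u [user] -p -e "$SIZE_SQL"
SIZE_SQL="SELECT table_name,
       ROUND((data_length + index_length) / 1024 / 1024 / 1024, 2) AS size_gb
  FROM information_schema.tables
 WHERE table_schema = '[database]' AND table_name = '[table]';"
echo "$SIZE_SQL"
```

Note that data_length + index_length is the storage footprint reported by InnoDB, which is a good proxy for how much data the copy step will have to move.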

Another option to explore was MySQL 5.6 online DDL. To find out why this is a bad idea, check this article: Why you should not use MySql 5.6 online DDL on AWS Aurora

Enter Percona Toolkit.

Thanks to pt-online-schema-change from Percona Toolkit, it's possible to alter a large table without downtime. Long story short, this clever tool automates the following process:

  1. Create a new, empty table with the altered schema
  2. Create insert, delete and update triggers on the old table to make sure all changes are automatically replicated to the new table
  3. Copy the old table's data into the new table in chunks
  4. When in sync, replace the old table with the new one by renaming both, then drop the old table.
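To make step 2 concrete, here is a rough sketch of what such triggers look like for a hypothetical two-column table. This is illustrative only; pt-online-schema-change generates its own trigger DDL, which differs in naming and detail:

```shell
# Illustrative trigger DDL only -- pt-online-schema-change generates its own.
# The idea: every write to [table] is mirrored into _[table]_new.
TRIGGER_SKETCH='
CREATE TRIGGER pt_osc_ins AFTER INSERT ON `[table]` FOR EACH ROW
  REPLACE INTO `_[table]_new` (id, name) VALUES (NEW.id, NEW.name);
CREATE TRIGGER pt_osc_upd AFTER UPDATE ON `[table]` FOR EACH ROW
  REPLACE INTO `_[table]_new` (id, name) VALUES (NEW.id, NEW.name);
CREATE TRIGGER pt_osc_del AFTER DELETE ON `[table]` FOR EACH ROW
  DELETE IGNORE FROM `_[table]_new` WHERE id = OLD.id;'
echo "$TRIGGER_SKETCH"
```

REPLACE and DELETE IGNORE keep the mirroring idempotent, so a row changed both by the chunked copy and by a live write ends up in the same final state.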

Because of the nature of this operation there are some risks and limitations, so it's critical to familiarise yourself with the documentation.

It's always a good idea to perform this change in an isolated environment first. AWS gives you all the tools to restore an RDS cluster from a snapshot and test the procedure without affecting the production database.
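For instance, a throwaway copy can be restored from a snapshot with the AWS CLI. The identifiers below are placeholders, and remember that restoring the cluster alone is not enough: you still need to create a DB instance inside it before you can connect:

```shell
# Restore a test cluster from a snapshot (identifiers are placeholders).
CLUSTER_ID="schema-change-test"
SNAPSHOT_ID="[production-snapshot-id]"
cmd="aws rds restore-db-cluster-from-snapshot \
  --db-cluster-identifier $CLUSTER_ID \
  --snapshot-identifier $SNAPSHOT_ID \
  --engine aurora"
echo "$cmd"   # review the command, then run it
```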

What do I need?

  1. A large table on an AWS Aurora DB instance (obviously)
  2. An EC2 instance in the same availability zone as the Aurora instance. Yes, you can install 'pt-online-schema-change' locally, but I strongly recommend using a separate instance for optimal network performance with your Aurora instance.
  3. Some DB cluster parameter group settings are required on Aurora. Luckily, the settings below are dynamic, so your instances do not require a reboot.
binlog_checksum = CRC32

The above is needed because pt-online-schema-change performs sanity checks; without this DB Cluster Parameter Group setting you will get a "Segmentation fault" error.

log_bin_trust_function_creators = true

By default, log_bin_trust_function_creators is set to false, preventing non-admin users from creating triggers. This needs to be changed in the DB Parameter Group if you're using a non-admin user.

Ready to go! Let’s go through this step by step.

  1. First, SSH to the EC2 instance.
    ssh -i /path/to/my/private/key.pem [user]@[ec2-instance-hostname]
  2. Let's make sure the DB instance is reachable from EC2
    # nc -zv [db-instance-hostname] 3306
    Connection to [db-instance-hostname] 3306 port [tcp/mysql] succeeded!
  3. Time to install Percona Toolkit.
    apt-get install percona-toolkit
  4. This process may take a few hours, so let's create a screen session
    screen -S schema_change
  5. Start altering the table with 'pt-online-schema-change'
    pt-online-schema-change --execute --recursion-method none --progress percentage,1 --alter "ADD COLUMN [column] INT DEFAULT NULL" h=[db_host],D=[database],t=[table],u=[user],p=[password]

    A few words on the parameters used:
    --execute tells the tool to actually perform the alteration; without it, nothing is changed
    --recursion-method none prevents the tool from performing checks against Aurora's internal replicas; if not disabled, this would cause a connection error:
    >Cannot connect to [database]
    >No slaves found. See --recursion-method if host [host] has slaves.
    --alter is the statement to be run against the new table
    --progress percentage,1 prints the progress of the copy at every 1%
  6. The alteration has started. At this point you should see a new table in the database called _[tablename]_new. While the data is being copied, we can detach from the screen session using the Ctrl+A, D key combination.
  7. Once the process has finished, let's go back to our screen session.
    screen -r schema_change
    At this point you should see a report similar to this:
    Copied rows OK.
    Analyzing new table…
    Swapping tables…
    Swapped original and new tables OK.
    Dropping old table…
    Dropped old table [old_table] OK.
    Dropping triggers…
    Dropped triggers OK.
    Successfully altered [table]
  8. Done! At this point everything should be ready and your table is now updated.
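The steps above can be condensed into a small helper that assembles the pt-online-schema-change invocation. All bracketed values are placeholders, and the script only prints the command so it can be reviewed before being run inside the screen session:

```shell
# Assemble the pt-online-schema-change command from placeholders.
DB_HOST="[db-instance-hostname]"
DB_NAME="[database]"
TABLE="[table]"
DB_USER="[user]"

PT_CMD="pt-online-schema-change --execute --recursion-method none \
--progress percentage,1 \
--alter \"ADD COLUMN [column] INT DEFAULT NULL\" \
h=$DB_HOST,D=$DB_NAME,t=$TABLE,u=$DB_USER,p=[password]"

echo "$PT_CMD"   # review, then run inside screen -S schema_change
```

Keeping the invocation in a script like this also makes it easy to diff the exact flags you used between the snapshot rehearsal and the production run.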

Percona has done an amazing job here. Thanks to this clever tool I was able to complete a critical schema update without affecting the production environment.
For more information, see the Percona Toolkit documentation.