Third & Grove
Jul 9, 2015 - Mike DeWolf

Creating a robust local development environment for Drupal


Engineers are at their best when they can create a robust local development environment.

A local development machine should be able to replicate, in a tangible form, all of the features of the production server. It should share a similar operating system and build, run equivalent software versions, and its application database and code should match production.

Bad local development environments slow software production. There aren't formal studies on this, but most engineers will agree. Being able to replicate production with a reasonable degree of precision lets engineers be confident in their code, whether it is headed for alpha testing, refinement, UAT, or production.

This series will address some common challenges to that ideal, at least when it comes to Drupal. We will examine common pitfalls of local development (especially prevalent on enterprise Drupal websites) and propose solutions we’ve tested while developing for live, high-volume websites.

Part 1: Ignore Revision Tables

One common problem when setting up local development for an enterprise Drupal content website is the sheer size of the database. Once a database reaches many gigabytes, the download and import process can become prohibitive. Here is one trick that TAG engineers have come up with to alleviate this. Spoiler alert: 95% of Drupal’s revision table data is unnecessary for local development.

As content changes, Drupal diligently records every revision of its revisionable entities. This precision provides amazing functionality for editors, but it becomes a bane to developers attempting to set up a local environment. On a high-volume Drupal content website, revision tables can easily account for 50% or more of the database, resulting in slow downloads and imports for the whole team.
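
To gauge how much of your own database is revision data, a rough query like the following (substitute your own connection details and database name) sums the size of all field revision tables via information_schema:

mysql -u{user} -p{pword} -h{host} -e "select round(sum(data_length + index_length)/1024/1024, 1) as revision_mb from information_schema.tables where table_schema='{dbname}' and table_name regexp '^field_revision';"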

The mysqldump command has an oft-unused but highly useful option for excluding tables: --ignore-table. Each table must be excluded by its exact name, but you can generate that list of names with a regular expression against the MySQL schema.
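
For example, excluding a couple of tables by name looks like this (the cache and watchdog table names here are just placeholders):

mysqldump -u{user} -p{pword} -h{host} {dbname} --ignore-table={dbname}.cache --ignore-table={dbname}.watchdog > ~/dump.sql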

For one of our use cases, we needed to download a database where revision tables occupied over 85% of the total volume. In order to optimize this MySQL dump for our developers, we approached the problem in a calculated way.

We started by creating a list of tables to ignore and adding them to a .cnf file. Since this was a Drupal website, we decided to exclude every field revision table from our download, which meant first generating a list of those tables from the production MySQL schema.


mysql -u{user} -p{pword} -h{host} -e "select concat('ignore-table=', table_schema, '.', table_name) as '[mysqldump]' from information_schema.tables where table_schema='{dbname}' and table_name regexp '^field_revision';" > ~/dumpignore.cnf


The preceding command creates a .cnf file listing the revision tables to ignore; the '[mysqldump]' column alias conveniently doubles as the section header the options file needs. From here, all we had to do was run a traditional mysqldump with this file supplying the ignore options.
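
For reference, the generated dumpignore.cnf should look roughly like this (the exact table names will of course vary from site to site):

[mysqldump]
ignore-table={dbname}.field_revision_body
ignore-table={dbname}.field_revision_field_image
ignore-table={dbname}.field_revision_field_tags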


mysqldump --defaults-file=~/dumpignore.cnf -u{user} -p{pword} -h{host} {dbname} > ~/bh.sql


Unfortunately, Drupal will crash without its field revision tables. Conveniently, though, the field revision tables are structurally almost identical to the field data tables, so we can recreate them locally by copying the data tables over.


echo "select table_name from information_schema.tables where table_schema=‘{dbname}’ and table_name regexp '^field_data';" | mysql -u{user} -p{word} -h{host} {dbname} | while read -r datatable
do
  revisionstable=$(sed 's/_data_/_revision_/g' <<< $datatable)
  mysql -u{user} -p{word} -h{host} {dbname} -e "create table ${revisionstable} like ${datatable}; insert ${revisionstable} select * from ${datatable};"
done


The preceding code block identifies all Drupal field data tables in the MySQL schema via regexp and then copies each one (under the proper revision table name) to serve as its revision table. In doing this, we were able to preserve a fully functional local development environment while reducing database bloat, and without touching the amazing revisioning features available to editors on staging and production.
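
As a quick sanity check (a hypothetical one-liner, not part of the original workflow), you can verify locally that every field data table now has a matching revision table:

mysql -u{user} -p{pword} {dbname} -e "select sum(table_name regexp '^field_data') as data_tables, sum(table_name regexp '^field_revision') as revision_tables from information_schema.tables where table_schema='{dbname}';"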

Part 2 of this series will deal with the choice of a local Drupal development machine. Please comment below with questions and feedback.