Keeping local data in sync with prod
I wanted to write this mini-post to highlight a developer productivity hack that might not be obvious but provides a huge productivity boost.
I maintain two production systems that are relatively small in their operating demands — the database size is ~25MB — but the service is critical and needs to be up and running at all hours for trading purposes. I maintain a separate development environment where I can develop new features, fix bugs, and experiment with various trading concepts before implementing them in production.
Typically with such a setup, you end up with separate production and development databases. Occasionally, you might import a subset of production trades to get a “production-like” feel. That’s what I used to do. Now, for almost all the small systems I work on, I ensure I have a script in place that allows me to restore production data locally and examine it from my development environment.
I recognize I wouldn’t have this luxury if I worked on a larger system or had strict privacy controls and possession of other users’ data. However, if it’s your own data and you have full control over how you use it, don’t hesitate to implement a “clean slate restore from production” script that can reset your local environment to match the production state. It’s been a huge time saver.
Here’s the script I consistently use for backing up production PostgreSQL databases to my local environment. While the database names and login values are specific to one of my databases, you can easily adjust them for your own setup:
dropdb -U stockanalysis stockanalysis
createdb -U stockanalysis stockanalysis
pg_dump -U stockanalysis -h HOST -p PORT -d PRODDB -F c -v -f stockanalysis.dump
pg_restore -U stockanalysis -d stockanalysis -v stockanalysis.dump
REM print the size of the database
psql -U stockanalysis -c "SELECT pg_size_pretty(pg_database_size('stockanalysis'))"
REM print the size of the dump file
DIR stockanalysis.dump
erase stockanalysis.dump
This script:
- Drops the local database
- Recreates it
- Creates a backup dump from the prod db
- Restores local db from the backup dump
For tracking purposes, I also output the size of the database to monitor data growth, then delete the backup to prevent system clutter.
As mentioned, I now include this type of script in all my smaller projects as part of the repository setup. It’s invaluable to have the ability to start fresh locally and restore from production, ensuring you can recover your data if needed in case of failure scenarios.