Google Cloud SQL Live Machine Learning System Migration

https://BitBank.nz is a cryptocurrency forecasting tool. It runs constantly, churning out high-frequency forecasts every few seconds that traders use to trade cryptocurrency both manually and automatically via our API.

As with any machine learning system, we keep three main groups of data: the features we observe/compute, what we predict will happen, and, when available, what actually happened.
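To make that concrete, here is a minimal sketch of those three groups as tables. This is for illustration only, not our actual schema; it assumes Postgres and psycopg2, and every name in it is made up:

    # A rough sketch of the three data groups as tables. Names and columns
    # are illustrative, not our actual schema; assumes Postgres + psycopg2.
    import psycopg2

    DDL = """
    CREATE TABLE IF NOT EXISTS features (      -- features we observe/compute
        observed_at TIMESTAMPTZ NOT NULL,
        market      TEXT NOT NULL,
        payload     JSONB NOT NULL
    );
    CREATE TABLE IF NOT EXISTS forecasts (     -- what we predict will happen
        id           BIGSERIAL PRIMARY KEY,
        made_at      TIMESTAMPTZ NOT NULL,
        market       TEXT NOT NULL,
        horizon_secs INTEGER NOT NULL,
        predicted    DOUBLE PRECISION NOT NULL
    );
    CREATE TABLE IF NOT EXISTS outcomes (      -- what actually happened
        forecast_id BIGINT REFERENCES forecasts (id),
        actual      DOUBLE PRECISION NOT NULL,
        recorded_at TIMESTAMPTZ NOT NULL
    );
    """

    with psycopg2.connect("dbname=bitbank") as conn, conn.cursor() as cur:
        cur.execute(DDL)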

Google Cloud SQL can automatically increase storage size. We were a bit lazy and leaned on this feature without building any data retention policy/strategy (e.g. off-lining old data to Google Cloud Storage). Fast forward to now and our database has ballooned to an expensive ~256GB SSD, most of which is old market data that's never accessed live by anyone (the site only shows the past 2 hours of high-frequency forecasts).
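The retention job we should have written isn't complicated: archive rows older than some cutoff to Google Cloud Storage, then delete them from Postgres. A minimal sketch, assuming psycopg2 and the google-cloud-storage client (the table, bucket name and 30-day cutoff are all made up):

    # Sketch of a retention job: archive old forecast rows to Cloud Storage,
    # then delete them from Postgres. Names and cutoff are illustrative.
    import csv, io, datetime
    import psycopg2
    from google.cloud import storage

    CUTOFF = datetime.datetime.utcnow() - datetime.timedelta(days=30)

    with psycopg2.connect("dbname=bitbank") as conn, conn.cursor() as cur:
        cur.execute("SELECT * FROM forecasts WHERE made_at < %s", (CUTOFF,))
        rows = cur.fetchall()

        # Write the old rows to a CSV object in a GCS bucket.
        buf = io.StringIO()
        csv.writer(buf).writerows(rows)
        bucket = storage.Client().bucket("bitbank-archive")   # assumed bucket
        bucket.blob(f"forecasts/{CUTOFF:%Y%m%d}.csv").upload_from_string(buf.getvalue())

        # Only delete once the archive upload has succeeded.
        cur.execute("DELETE FROM forecasts WHERE made_at < %s", (CUTOFF,))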

There are a few other reasons to undergo a database migration, e.g. resizing a machine, changing storage type (SSD/HDD), changing database systems etc., but ours was cutting costs: after shrinking the data we also needed to shrink the database's provisioned storage, which means migrating to a new, smaller instance.

In BitBank.nz's architecture the database is only used for historical queries over larger amounts of market data; live updates are pushed straight to the browser via Firebase. The database also stores user/subscriber data, so we can't afford to lose any data in the process.

Because of this there are (at least) three systems writing to the database at any one time: the forecaster, the web frontend and the cron jobs (one set checking people's bitcoin balances to see if we have been paid, another adding data about what the markets actually did so we can measure our accuracy, predict volatility and make better predictions in future).

The migration procedure involves spinning up a new database to migrate to, exporting the old database and importing it into the new one, then running a new forecaster process against the new database (with pushing to the live Firebase disabled). We can also run new instances of the cron jobs and the website against the new database. This brings forward the testing, before we actually cut over to serving live traffic from the new database, and lets us fix any issues (e.g. connection issues) beforehand. The forecaster process also takes a while to start up because it builds indexing structures over live data in memory, so cutting over between two already-running forecaster processes reduces the downtime a restart would cause.

Once we have verified the new systems are running okay, we can cut over by flipping a flag: the new forecaster is promoted to push into the live Firebase and the old forecaster stops. This way we can roll back if anything immediately goes wrong, and eventually remove the old forecaster/cron job hardware later.
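The flag itself only needs to gate the Firebase publish. The sketch below is an assumed version, not our real code: we don't say where the flag actually lives, so here it's a small file re-read on every forecast, which is what lets it be flipped (and flipped back for a rollback) without restarting either forecaster:

    # Minimal sketch of the cutover gate. The flag location, file format and
    # function names are assumptions for illustration.
    import json

    FLAG_FILE = "/etc/bitbank/live_forecaster.json"   # assumed location
    INSTANCE = "new"                                   # "old" on the old box

    def is_live() -> bool:
        # The file holds e.g. {"live": "new"}; flipping the value promotes one
        # forecaster and demotes the other, flipping it back is the rollback.
        with open(FLAG_FILE) as f:
            return json.load(f)["live"] == INSTANCE

    def publish(forecast: dict) -> None:
        # Stand-in for the real Firebase push (e.g. firebase_admin's
        # db.reference("forecasts").set(forecast)).
        print("pushed", forecast)

    def on_new_forecast(forecast: dict) -> None:
        # Both forecasters keep writing to their own database, but only the
        # one holding the flag publishes to the live Firebase.
        if is_live():
            publish(forecast)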

The web frontend stores users' passwords in Firebase (salted and hashed) and uses a get-or-create mechanism to make the equivalent record in our Postgres database. This means that if any users have been created since the export was taken, their details will be created automatically the first time they log in against the new database that doesn't yet have them.
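The get-or-create itself can be a single idempotent insert, roughly like this (a sketch assuming psycopg2 and an illustrative users table keyed on the Firebase UID, not our real schema):

    # Sketch of get-or-create, assuming psycopg2; table/column names are
    # illustrative.
    import psycopg2

    def get_or_create_user(conn, firebase_uid: str, email: str) -> int:
        with conn.cursor() as cur:
            # Insert if missing; ON CONFLICT makes the call safe to repeat,
            # which is what lets users created after the export "heal"
            # themselves on their first login against the new database.
            cur.execute(
                """
                INSERT INTO users (firebase_uid, email)
                VALUES (%s, %s)
                ON CONFLICT (firebase_uid) DO NOTHING
                """,
                (firebase_uid, email),
            )
            cur.execute("SELECT id FROM users WHERE firebase_uid = %s",
                        (firebase_uid,))
            return cur.fetchone()[0]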

Another thing that makes the migration easier is the way bitcoin/altcoin payments are accepted: a cron job polls every user's unique bitcoin address for payment, and any new amount we see at the address is added to that user's total and credited as time on the platform. One complication is that we can also withdraw from a user's payment address, e.g. to pay out users of the referral program, so the system additionally tracks the amount taken from each address.

This means having the payment-acceptance cron jobs running against both the new and old databases is enough, as long as every user in the new system is attached to the same unique bitcoin address (we can run a re-sync after we switch over to make sure we haven't lost any users created between the time the export started and the cutover).
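Crediting is driven by the difference between the total ever received at the address and the total we have already credited, which is why running the cron job from two databases at once converges on the same numbers. A rough sketch, assuming psycopg2; the balance lookup and the column names are illustrative:

    # Sketch of the payment cron job. The credit is derived from totals
    # observed on-chain, so re-running it (even against two databases at
    # once) is safe. total_received() and the columns are assumptions.
    from decimal import Decimal
    import psycopg2

    def total_received(address: str) -> Decimal:
        # Stand-in for asking the blockchain / a block explorer for the
        # total amount ever sent to this address.
        return Decimal("0")

    def poll_payments(conn) -> None:
        with conn.cursor() as cur:
            cur.execute("SELECT id, btc_address, credited_btc FROM users")
            for user_id, address, credited in cur.fetchall():
                # total_received() counts everything ever sent to the address,
                # so our own withdrawals (tracked in a separate withdrawn_btc
                # column) never make a user's credit go backwards.
                new_credit = total_received(address) - credited
                if new_credit > 0:
                    cur.execute(
                        "UPDATE users SET credited_btc = credited_btc + %s "
                        "WHERE id = %s",
                        (new_credit, user_id),
                    )
                    # ...and credit the equivalent time on the platform here.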

Hiccup: while running the import we got this error about a readaccess role:

Import failed:
SET SET SET SET SET
set_config ------------ (1 row)
SET SET SET SET SET
CREATE TABLE CREATE SEQUENCE CREATE TABLE CREATE SEQUENCE CREATE TABLE CREATE SEQUENCE
CREATE TABLE CREATE SEQUENCE CREATE TABLE CREATE SEQUENCE CREATE TABLE CREATE SEQUENCE
CREATE SEQUENCE CREATE TABLE CREATE TABLE CREATE SEQUENCE
COPY 11905173 setval ----------- 998869359 (1 row)
COPY 0 setval -------- 1 (1 row)
COPY 1248939 setval ----------- 149081802 (1 row)
COPY 0 setval -------- 1 (1 row)
COPY 0 setval -------- 1 (1 row)
COPY 4632562 setval ----------- 344940961 (1 row)
setval ----------- 507136919 (1 row) COPY 4865056
COPY 627 setval -------- 628 (1 row)
ALTER TABLE ALTER TABLE ALTER TABLE ALTER TABLE ALTER TABLE ALTER TABLE ALTER TABLE ALTER TABLE
CREATE INDEX CREATE INDEX CREATE INDEX CREATE INDEX CREATE INDEX CREATE INDEX CREATE INDEX CREATE INDEX
CREATE INDEX CREATE INDEX CREATE INDEX CREATE INDEX CREATE INDEX CREATE INDEX CREATE INDEX
ALTER TABLE REVOKE REVOKE GRANT GRANT
ERROR: role "readaccess" does not exist
Import error: exit status 3

I don't have a user called readaccess in the Google Cloud SQL UI... although on closer inspection the role does exist in the old database; it was just created outside the Google Cloud SQL interface, so the UI doesn't show it. Postgres itself still lists it:

SELECT * FROM pg_roles;  

... some of the roles in here don't show up in the UI; remember to create them through the interface next time...
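Once you can see the roles, the cleanup is short. A sketch assuming psycopg2, with readaccess as our rogue role:

    # List every role Postgres knows about (including ones the Cloud SQL UI
    # hides because they were created outside it), then drop the rogue one so
    # future exports stop emitting GRANT/REVOKE statements that reference it.
    import psycopg2

    with psycopg2.connect("dbname=bitbank") as conn, conn.cursor() as cur:
        cur.execute("SELECT rolname FROM pg_roles")
        print([r[0] for r in cur.fetchall()])

        # If the role still owns objects or holds grants, you may need
        # to run DROP OWNED BY readaccess; first.
        cur.execute("DROP ROLE IF EXISTS readaccess")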

After deleting the rogue unused roles the procedure went smoothly :)

We also had to run a script to delete any data the forecaster created while testing against the new database, and to re-sync the data added to the old database since the export started, so that the new database contains every forecast that was made live before the cutover.
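The cleanup/re-sync boils down to two steps keyed off the timestamp the export started at: delete everything the test forecaster wrote into the new database after that point, then copy across everything the live forecaster wrote into the old database after that point. A sketch, assuming psycopg2 and an illustrative forecasts table (the DSNs, columns and timestamp are placeholders):

    # Sketch of the post-cutover cleanup/re-sync between old and new databases.
    # EXPORT_STARTED is the time the export was taken; all names are assumed.
    import datetime
    import psycopg2

    EXPORT_STARTED = datetime.datetime(2018, 1, 1, 0, 0)   # placeholder

    old = psycopg2.connect("dbname=bitbank_old")            # assumed DSNs
    new = psycopg2.connect("dbname=bitbank_new")

    with old, new, old.cursor() as src, new.cursor() as dst:
        # 1. Drop the forecasts the test forecaster wrote into the new
        #    database; everything newer than the export came from testing.
        dst.execute("DELETE FROM forecasts WHERE made_at > %s", (EXPORT_STARTED,))

        # 2. Copy across the forecasts that were made live against the old
        #    database between the export and the cutover.
        src.execute(
            "SELECT made_at, market, horizon_secs, predicted "
            "FROM forecasts WHERE made_at > %s",
            (EXPORT_STARTED,),
        )
        for row in src:
            dst.execute(
                "INSERT INTO forecasts (made_at, market, horizon_secs, predicted) "
                "VALUES (%s, %s, %s, %s)",
                row,
            )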

If you're into cryptocurrency trading, check out https://BitBank.nz for up-to-date forecasts and try our referral program to earn bitcoin.