We added four new fields to a "featureset" which represents a prediction in the BitBank.nz forecasting system.
The fields are:
- estimated_future_wstd_5
- estimated_future_wstd_30
- estimated_future_wstd_60
- estimated_future_wstd_120
They are the standard deviations of trade prices over the next 5, 30, 60 and 120 minutes, weighted by trade amount.
For example, if you picked a random dollar traded in the market over the next 5 minutes, there would be roughly a 95% chance that it traded within 2 * estimated_future_wstd_5 of the mean, assuming our forecast standard deviation is accurate and trade prices are roughly normally distributed.
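As a rough illustration of what an amount-weighted standard deviation is, here is a minimal sketch. The function name and the sample trades are made up for the example, and it uses the population form of the variance; the formula our system actually uses may differ.

```python
import math

def weighted_mean_std(prices, amounts):
    """Return the amount-weighted mean and standard deviation of trade prices.

    Hypothetical helper for illustration only: each price is weighted by
    the amount traded at that price.
    """
    total = sum(amounts)
    if total <= 0:
        raise ValueError("total traded amount must be positive")
    mean = sum(p * a for p, a in zip(prices, amounts)) / total
    variance = sum(a * (p - mean) ** 2 for p, a in zip(prices, amounts)) / total
    return mean, math.sqrt(variance)

# Made-up trades: price and amount traded at that price
prices = [100.0, 102.0, 98.0]
amounts = [1.0, 2.0, 1.0]
wavg, wstd = weighted_mean_std(prices, amounts)
```

Larger trades pull both the mean and the spread towards the prices where most of the money actually changed hands.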
Forecasting process
Every 5-15 seconds, different features of the market are computed, such as:
- how badly the data fits straight lines over time (r values of best fit slopes)
- how much the price has changed between now (the current midpoint) and the weighted trade average 2, 5 and 30 minutes ago, e.g. wavg_distance_to_midpoint_percent5min
- weighted standard deviation in the trades/orderbook over time.
- spread
- orderbook imbalance (midpoint vs weighted average of the top n orders)
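A few of the features above can be sketched roughly as follows. The function names, sign conventions and the choice to express everything as a percentage of the midpoint are assumptions made for this example, not the exact definitions used in production.

```python
def midpoint(best_bid, best_ask):
    """Midpoint between the best bid and best ask prices."""
    return (best_bid + best_ask) / 2.0

def spread_percent(best_bid, best_ask):
    """Bid/ask spread as a percentage of the midpoint."""
    return (best_ask - best_bid) / midpoint(best_bid, best_ask) * 100.0

def wavg_distance_to_midpoint_percent(past_wavg, mid):
    """Percent gap between a past weighted trade average and the current midpoint."""
    return (past_wavg - mid) / mid * 100.0

def orderbook_imbalance_percent(bids, asks, n=5):
    """Percent gap between the amount-weighted average price of the top n
    orders on each side and the midpoint.

    bids and asks are lists of (price, amount), best price first.
    """
    top = bids[:n] + asks[:n]
    total = sum(amount for _, amount in top)
    wavg = sum(price * amount for price, amount in top) / total
    mid = midpoint(bids[0][0], asks[0][0])
    return (wavg - mid) / mid * 100.0

# Made-up orderbook: more size is resting on the bid side
bids = [(99.0, 3.0), (98.0, 1.0)]
asks = [(101.0, 1.0), (102.0, 1.0)]
imbalance = orderbook_imbalance_percent(bids, asks, n=2)
```

With these conventions, a negative imbalance means the weighted average of the top orders sits below the midpoint, i.e. more size is resting on the bid side.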
The features are mapped to volatility forecasts using a machine learning algorithm trained via backtesting, similar to how we forecast the weighted average price of trades.
Other important metrics include:
- past data on weighted standard deviations in the trades over different time periods
- best fit slopes in the orderbook midpoint over time, or the change in other metrics over time
- 2nd derivatives (the change in the change over time)
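The slope, r value and 2nd-derivative style metrics above can be approximated with ordinary least squares and second differences. This is a minimal sketch with hypothetical function names and made-up sample data, not the production feature code.

```python
import math

def best_fit(xs, ys):
    """Least-squares slope of ys against xs, plus the Pearson r value.

    An r value near +/-1 means the data fits a straight line well;
    a value near 0 means it fits badly.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("xs must not all be equal")
    slope = sxy / sxx
    r = sxy / math.sqrt(sxx * syy) if syy > 0 else 0.0
    return slope, r

def second_differences(values):
    """Discrete 2nd derivative: the change in the change between samples."""
    return [values[i + 1] - 2 * values[i] + values[i - 1]
            for i in range(1, len(values) - 1)]

# Made-up midpoints sampled every 10 seconds, rising in a straight line
times = [0, 10, 20, 30]
mids = [100.0, 100.5, 101.0, 101.5]
slope, r = best_fit(times, mids)
accel = second_differences(mids)
```

For a perfectly straight line like this one the second differences are all zero, so any non-zero values indicate the trend is speeding up or slowing down.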
Visualisation
We visualise this predicted volatility data by adding space above and below our predictions: two standard deviations either side of our predicted weighted average price (estimated_trade_wavg_5), to represent the range within which we predict 95% of the volume will be traded.
The graph shows the previous prediction (in orange, with error bars either side). In this example the previous prediction was working well, with around 5% falling outside the shaded area (the predicted 95%, i.e. two standard deviations from the predicted mean). We can reason that the weighted average price of the underlying trades (what the forecasts are for) would have followed roughly this distribution of price too (especially true for liquid pairs like BTC/ETH).
Unfortunately, adding error bars equally either side of a mean can make graphs look as though the volatility must be normally distributed (which it probably is over a long enough time period). We are not predicting the price distribution, just the standard deviation, which is something we will keep in mind.
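The shaded band, and the check of how much volume actually landed inside it, can be sketched as below. The helper names and sample numbers are hypothetical, and the sketch assumes the predicted standard deviation is expressed in the same price units as the predicted weighted average.

```python
def prediction_band(predicted_wavg, predicted_wstd, num_std=2.0):
    """Lower and upper edges of the shaded area around a predicted mean."""
    half_width = num_std * predicted_wstd
    return predicted_wavg - half_width, predicted_wavg + half_width

def volume_share_inside(trades, lower, upper):
    """Fraction of traded amount whose price fell inside [lower, upper].

    trades is a list of (price, amount).
    """
    total = sum(amount for _, amount in trades)
    inside = sum(amount for price, amount in trades if lower <= price <= upper)
    return inside / total

# Made-up prediction and the trades that followed it
lower, upper = prediction_band(predicted_wavg=100.0, predicted_wstd=0.5)
trades = [(100.2, 5.0), (99.5, 4.0), (101.3, 1.0)]
share = volume_share_inside(trades, lower, upper)
```

If the forecast is accurate and prices are roughly normal, the share inside the two standard deviation band should hover around 0.95 over many predictions; a single window like this one can easily be higher or lower.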
Check out the live BTC/ETH charts.
Accuracy
We compute accuracy in a similar way to our estimated_future_wavg_5 forecasts for weighted averages: a background process fills in what actually happened following each forecast (future_wavg and future_wstd), so we can see how far off, in percent, our volatility and other forecasts are.
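The comparison between forecasts and the back-filled actuals can be summarised roughly like this. It is a sketch only: the function names are hypothetical, and measuring the error as a percentage of the actual value is an assumption about how the percentages are defined.

```python
import statistics

def percent_errors(estimated, actual):
    """Signed percent error of each forecast; negative means UNDER actual."""
    return [(e - a) / a * 100.0 for e, a in zip(estimated, actual)]

def accuracy_summary(estimated, actual):
    """Mean signed percent error and its standard deviation (sigma)."""
    errors = percent_errors(estimated, actual)
    return statistics.fmean(errors), statistics.pstdev(errors)

# Made-up forecasts paired with what the background process later filled in
estimated_wstd = [0.9, 1.1, 1.0]
actual_wstd = [1.0, 1.0, 1.0]
mean_error, sigma = accuracy_summary(estimated_wstd, actual_wstd)
```

The mean shows whether forecasts are biased over or under, while sigma shows how widely individual forecasts scatter around that bias.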
A view of the current accuracy of our volatility predictions over the past 5 days (ending 2 Feb 2018 20:52 UTC) shows:
- estimated_future_wstd_5 was on average 0.114731% UNDER actual wstd, σ: 0.592479%
- estimated_future_wstd_30 was on average 0.083497% UNDER actual wstd, σ: 0.445679%
- estimated_future_wstd_60 was on average 0.099871% UNDER actual wstd, σ: 0.483839%
- estimated_future_wstd_120 was on average 0.167406% UNDER actual wstd, σ: 0.630792%
This means our wstd predictions have been lower than the actual values, particularly given the recent volatility (bitcoin nearly dropping to 8k and rebounding, and Tether). There is still too much standard deviation in our prediction error, which means we aren't always predicting results accurate to within ~0.1%. Roughly 95% of our predictions have fallen within 1% above or below a centre about 0.1% under the actual value (1% is roughly two standard deviations).
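The "roughly 1% either side" figure follows from the reported means and σ values. A small sketch of the arithmetic, assuming the prediction errors are approximately normally distributed (the dictionary layout and function name are just for this example):

```python
# Reported over the 5 days ending 2 Feb 2018: (mean error %, sigma %).
# Means are negative because the forecasts were UNDER the actual wstd.
reported = {
    "estimated_future_wstd_5": (-0.114731, 0.592479),
    "estimated_future_wstd_30": (-0.083497, 0.445679),
    "estimated_future_wstd_60": (-0.099871, 0.483839),
    "estimated_future_wstd_120": (-0.167406, 0.630792),
}

def two_sigma_interval(mean_error, sigma):
    """Range expected to hold ~95% of errors if they are roughly normal."""
    return mean_error - 2 * sigma, mean_error + 2 * sigma

intervals = {name: two_sigma_interval(mean_error, sigma)
             for name, (mean_error, sigma) in reported.items()}

for name, (low, high) in intervals.items():
    print(f"{name}: {low:+.3f}% to {high:+.3f}%")
```

Each interval is shifted slightly negative because of the small under-prediction bias, and its half-width of two σ is close to 1% for every horizon.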
It's also normal for the forecasts further out (estimated_future_wstd_120) to be less accurate. It's unfortunately a symptom of the compounding unpredictability over time inherent in life. Many companies (and the consumers who believe them) go wrong in this space by trying to predict 7-30+ days out, which is like trying to predict today what the weather will be next year (hard if you're in windy Wellington, New Zealand).
We are always working hard to get the prediction errors to zero, and our algorithm is being continuously retrained with newer market data. This has meant that when volatility suddenly jumps up or down, there can be some delay before our predictions catch up.
Hyper-parameters, such as how the algorithm continuously retrains itself, are things we also run machine learning algorithms to optimise, with increasingly harder goals (first aiming for low prediction error, then for making money in simulated/real markets). This is similar to what OpenAI has outlined on meta learning (continuous retraining) and proximal policy optimization (training first on easier goals).
Check out the live forecasts/charts at https://bitbank.nz with a free 1 day trial! Also check out our referral program to earn 0.003 BTC per paying user referred!