Trading with Machine Learning Models¶

This tutorial will show how to train and backtest a machine learning price forecast model with backtesting.py framework. It is assumed you're already familiar with basic framework usage and machine learning in general.

For this tutorial, we'll use almost a year's worth sample of hourly EUR/USD forex data:

In [1]:

Copied!

from minitrade.backtest.core.test import EURUSD, SMA

data = EURUSD.copy()
data
from minitrade.backtest.core.test import EURUSD, SMA

data = EURUSD.copy()
data

Out[1]:

	Open	High	Low	Close	Volume
2017-04-19 09:00:00	1.07160	1.07220	1.07083	1.07219	1413
2017-04-19 10:00:00	1.07214	1.07296	1.07214	1.07260	1241
2017-04-19 11:00:00	1.07256	1.07299	1.07170	1.07192	1025
2017-04-19 12:00:00	1.07195	1.07280	1.07195	1.07202	1460
2017-04-19 13:00:00	1.07200	1.07230	1.07045	1.07050	1554
...	...	...	...	...	...
2018-02-07 11:00:00	1.23390	1.23548	1.23386	1.23501	2203
2018-02-07 12:00:00	1.23501	1.23508	1.23342	1.23422	2325
2018-02-07 13:00:00	1.23422	1.23459	1.23338	1.23372	2824
2018-02-07 14:00:00	1.23374	1.23452	1.23238	1.23426	4065
2018-02-07 15:00:00	1.23427	1.23444	1.22904	1.22904	6143

5000 rows × 5 columns

In supervised machine learning, we try to learn a function that maps input feature vectors (independent variables) into known output values (dependent variable):

$$ f\colon X \to \mathbf{y} $$

That way, provided our model function is sufficient, we can predict future output values from the newly acquired input feature vectors to some degree of certainty. In our example, we'll try to map several price-derived features and common technical indicators to the price point two days in the future. We construct model design matrix $X$ below:

In [2]:

Copied!





def BBANDS(data, n_lookback, n_std):
    """Bollinger bands indicator"""
    hlc3 = (data.High + data.Low + data.Close) / 3
    mean, std = hlc3.rolling(n_lookback).mean(), hlc3.rolling(n_lookback).std()
    upper = mean + n_std*std
    lower = mean - n_std*std
    return upper, lower


close = data.Close.values
sma10 = SMA(data.Close, 10)
sma20 = SMA(data.Close, 20)
sma50 = SMA(data.Close, 50)
sma100 = SMA(data.Close, 100)
upper, lower = BBANDS(data, 20, 2)

# Design matrix / independent features:

# Price-derived features
data['X_SMA10'] = (close - sma10) / close
data['X_SMA20'] = (close - sma20) / close
data['X_SMA50'] = (close - sma50) / close
data['X_SMA100'] = (close - sma100) / close

data['X_DELTA_SMA10'] = (sma10 - sma20) / close
data['X_DELTA_SMA20'] = (sma20 - sma50) / close
data['X_DELTA_SMA50'] = (sma50 - sma100) / close

# Indicator features
data['X_MOM'] = data.Close.pct_change(periods=2)
data['X_BB_upper'] = (upper - close) / close
data['X_BB_lower'] = (lower - close) / close
data['X_BB_width'] = (upper - lower) / close
data['X_Sentiment'] = ~data.index.to_series().between('2017-09-27', '2017-12-14')

# Some datetime features for good measure
data['X_day'] = data.index.dayofweek
data['X_hour'] = data.index.hour

data = data.dropna().astype(float)
def BBANDS(data, n_lookback, n_std):
    """Bollinger bands indicator"""
    hlc3 = (data.High + data.Low + data.Close) / 3
    mean, std = hlc3.rolling(n_lookback).mean(), hlc3.rolling(n_lookback).std()
    upper = mean + n_std*std
    lower = mean - n_std*std
    return upper, lower


close = data.Close.values
sma10 = SMA(data.Close, 10)
sma20 = SMA(data.Close, 20)
sma50 = SMA(data.Close, 50)
sma100 = SMA(data.Close, 100)
upper, lower = BBANDS(data, 20, 2)

# Design matrix / independent features:

# Price-derived features
data['X_SMA10'] = (close - sma10) / close
data['X_SMA20'] = (close - sma20) / close
data['X_SMA50'] = (close - sma50) / close
data['X_SMA100'] = (close - sma100) / close

data['X_DELTA_SMA10'] = (sma10 - sma20) / close
data['X_DELTA_SMA20'] = (sma20 - sma50) / close
data['X_DELTA_SMA50'] = (sma50 - sma100) / close

# Indicator features
data['X_MOM'] = data.Close.pct_change(periods=2)
data['X_BB_upper'] = (upper - close) / close
data['X_BB_lower'] = (lower - close) / close
data['X_BB_width'] = (upper - lower) / close
data['X_Sentiment'] = ~data.index.to_series().between('2017-09-27', '2017-12-14')

# Some datetime features for good measure
data['X_day'] = data.index.dayofweek
data['X_hour'] = data.index.hour

data = data.dropna().astype(float)

Since all our indicators work only with past values, we can safely precompute the design matrix in advance. Alternatively, we would reconstruct the matrix every time before training the model.

Notice the made-up sentiment feature. In real life, one would obtain similar features by parsing news sources, Twitter sentiment, Stocktwits or similar. This is just to show input data can contain all sorts of additional explanatory columns.

As mentioned, our dependent variable will be the price (return) two days in the future, simplified into values $1$ when the return is positive (and significant), $-1$ when negative, or $0$ when the return after two days is roughly around zero. Let's write some functions that return our model matrix $X$ and dependent, class variable $\mathbf{y}$ as plain NumPy arrays:

In [3]:

Copied!





import numpy as np


def get_X(data):
    """Return model design matrix X"""
    return data.filter(like='X').values


def get_y(data):
    """Return dependent variable y"""
    y = data.Close.pct_change(48).shift(-48)  # Returns after roughly two days
    y[y.between(-.004, .004)] = 0             # Devalue returns smaller than 0.4%
    y[y > 0] = 1
    y[y < 0] = -1
    return y


def get_clean_Xy(df):
    """Return (X, y) cleaned of NaN values"""
    X = get_X(df)
    y = get_y(df).values
    isnan = np.isnan(y)
    X = X[~isnan]
    y = y[~isnan]
    return X, y
import numpy as np


def get_X(data):
    """Return model design matrix X"""
    return data.filter(like='X').values


def get_y(data):
    """Return dependent variable y"""
    y = data.Close.pct_change(48).shift(-48)  # Returns after roughly two days
    y[y.between(-.004, .004)] = 0             # Devalue returns smaller than 0.4%
    y[y > 0] = 1
    y[y < 0] = -1
    return y


def get_clean_Xy(df):
    """Return (X, y) cleaned of NaN values"""
    X = get_X(df)
    y = get_y(df).values
    isnan = np.isnan(y)
    X = X[~isnan]
    y = y[~isnan]
    return X, y

Let's see how our data performs modeled using a simple k-nearest neighbors (kNN) algorithm from the state of the art scikit-learn Python machine learning package. To avoid (or at least demonstrate) overfitting, always split your data into train and test sets; in particular, don't validate your model performance on the same data it was built on.

In [4]:

Copied!





import pandas as pd
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

X, y = get_clean_Xy(data)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.5, random_state=0)

clf = KNeighborsClassifier(7)  # Model the output based on 7 "nearest" examples
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)

_ = pd.DataFrame({'y_true': y_test, 'y_pred': y_pred}).plot(figsize=(15, 2), alpha=.7)
print('Classification accuracy: ', np.mean(y_test == y_pred))
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

X, y = get_clean_Xy(data)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.5, random_state=0)

clf = KNeighborsClassifier(7)  # Model the output based on 7 "nearest" examples
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)

_ = pd.DataFrame({'y_true': y_test, 'y_pred': y_pred}).plot(figsize=(15, 2), alpha=.7)
print('Classification accuracy: ', np.mean(y_test == y_pred))

Classification accuracy:  0.4210960032962505

No description has been provided for this image

We see the forecasts are all over the place (classification accuracy 42%), but is the model of any use under real backtesting?

Let's backtest a simple strategy that buys the asset for 20% of available equity with 20:1 leverage whenever the forecast is positive (the price in two days is predicted to go up), and sells under the same terms when the forecast is negative, all the while setting reasonable stop-loss and take-profit levels. Notice also the steady use of data.df accessor:

In [5]:

Copied!





%%time

from minitrade.backtest import Backtest, Strategy

N_TRAIN = 400


class MLTrainOnceStrategy(Strategy):
    price_delta = .004  # 0.4%

    def init(self):        
        # Init our model, a kNN classifier
        self.clf = KNeighborsClassifier(7)

        # Train the classifier in advance on the first N_TRAIN examples
        df = self.data.df.iloc[:N_TRAIN]
        X, y = get_clean_Xy(df)
        self.clf.fit(X, y)

        # Plot y for inspection
        self.I(get_y, self.data.df, name='y_true')

        # Prepare empty, all-NaN forecast indicator
        self.forecasts = self.I(lambda: np.repeat(np.nan, len(self.data)), name='forecast')

    def next(self):
        # Skip the training, in-sample data
        if len(self.data) < N_TRAIN:
            return

        # Proceed only with out-of-sample data. Prepare some variables
        high, low, close = self.data.High, self.data.Low, self.data.Close
        current_time = self.data.index[-1]

        # Forecast the next movement
        X = get_X(self.data.df.iloc[-1:])
        forecast = self.clf.predict(X)[0]

        # Update the plotted "forecast" indicator
        self.forecasts[-1] = forecast

        # If our forecast is upwards and we don't already hold a long position
        # place a long order for 20% of available account equity. Vice versa for short.
        # Also set target take-profit and stop-loss prices to be one price_delta
        # away from the current closing price.
        upper, lower = close[-1] * (1 + np.r_[1, -1]*self.price_delta)

        if forecast == 1 and not self.position().is_long:
            self.buy(size=.2, tp=upper, sl=lower)
        elif forecast == -1 and not self.position().is_short:
            self.sell(size=.2, tp=lower, sl=upper)

        # Additionally, set aggressive stop-loss on trades that have been open 
        # for more than two days
        for trade in self.trades():
            if current_time - trade.entry_time > pd.Timedelta('2 days'):
                if trade.is_long:
                    trade.sl = max(trade.sl, low[-1])
                else:
                    trade.sl = min(trade.sl, high[-1])


bt = Backtest(data, MLTrainOnceStrategy, commission=.0002, margin=.05)
bt.run()
%%time

from minitrade.backtest import Backtest, Strategy

N_TRAIN = 400


class MLTrainOnceStrategy(Strategy):
    price_delta = .004  # 0.4%

    def init(self):        
        # Init our model, a kNN classifier
        self.clf = KNeighborsClassifier(7)

        # Train the classifier in advance on the first N_TRAIN examples
        df = self.data.df.iloc[:N_TRAIN]
        X, y = get_clean_Xy(df)
        self.clf.fit(X, y)

        # Plot y for inspection
        self.I(get_y, self.data.df, name='y_true')

        # Prepare empty, all-NaN forecast indicator
        self.forecasts = self.I(lambda: np.repeat(np.nan, len(self.data)), name='forecast')

    def next(self):
        # Skip the training, in-sample data
        if len(self.data) < N_TRAIN:
            return

        # Proceed only with out-of-sample data. Prepare some variables
        high, low, close = self.data.High, self.data.Low, self.data.Close
        current_time = self.data.index[-1]

        # Forecast the next movement
        X = get_X(self.data.df.iloc[-1:])
        forecast = self.clf.predict(X)[0]

        # Update the plotted "forecast" indicator
        self.forecasts[-1] = forecast

        # If our forecast is upwards and we don't already hold a long position
        # place a long order for 20% of available account equity. Vice versa for short.
        # Also set target take-profit and stop-loss prices to be one price_delta
        # away from the current closing price.
        upper, lower = close[-1] * (1 + np.r_[1, -1]*self.price_delta)

        if forecast == 1 and not self.position().is_long:
            self.buy(size=.2, tp=upper, sl=lower)
        elif forecast == -1 and not self.position().is_short:
            self.sell(size=.2, tp=lower, sl=upper)

        # Additionally, set aggressive stop-loss on trades that have been open 
        # for more than two days
        for trade in self.trades():
            if current_time - trade.entry_time > pd.Timedelta('2 days'):
                if trade.is_long:
                    trade.sl = max(trade.sl, low[-1])
                else:
                    trade.sl = min(trade.sl, high[-1])


bt = Backtest(data, MLTrainOnceStrategy, commission=.0002, margin=.05)
bt.run()

CPU times: user 4.23 s, sys: 13.5 ms, total: 4.25 s
Wall time: 4.26 s

Out[5]:

Start                                                   2017-04-25 12:00:00
End                                                     2018-02-07 15:00:00
Duration                                                  288 days 03:00:00
Exposure Time [%]                                                 79.412365
Equity Final [$]                                                  14176.625
Equity Peak [$]                                                14979.134698
Return [%]                                                         41.76625
Buy & Hold Return [%]                                             12.869869
Return (Ann.) [%]                                                 42.978175
Volatility (Ann.) [%]                                             26.903561
Sharpe Ratio                                                        1.59749
Sortino Ratio                                                      3.607936
Calmar Ratio                                                       4.553512
Max. Drawdown [%]                                                 -9.438468
Avg. Drawdown [%]                                                 -1.105954
Max. Drawdown Duration                                     41 days 23:00:00
Avg. Drawdown Duration                                      2 days 15:00:00
# Trades                                                                353
Win Rate [%]                                                      52.691218
Best Trade [%]                                                     0.578258
Worst Trade [%]                                                   -0.519427
Avg. Trade [%]                                                      0.02372
Max. Trade Duration                                         3 days 09:00:00
Avg. Trade Duration                                         0 days 19:00:00
Profit Factor                                                      1.216404
Expectancy [%]                                                     0.024147
SQN                                                                1.850516
Kelly Criterion                                                    0.121275
_strategy                                               MLTrainOnceStrategy
_equity_curve                                        Equity        Asset...
_trades                        EntryBar  ExitBar Ticker   Size  EntryPri...
_orders                                       Ticker  Side   Size
Signal...
_positions                                      {'Asset': 0, 'Cash': 14176}
_trade_start_bar                                                          0
dtype: object

In [6]:

Copied!

bt.plot()
bt.plot()

/Users/ww/workspace/minitrade/minitrade/backtest/core/_plotting.py:737: UserWarning: found multiple competing values for 'toolbar.active_drag' property; using the latest value
  fig = gridplot(
/Users/ww/workspace/minitrade/minitrade/backtest/core/_plotting.py:737: UserWarning: found multiple competing values for 'toolbar.active_scroll' property; using the latest value
  fig = gridplot(

Out[6]:

GridPlot(

id = 'p1342', …)

Despite our lousy win rate, the strategy seems profitable. Let's see how it performs under walk-forward optimization, akin to k-fold or leave-one-out cross-validation:

In [7]:

Copied!





%%time

class MLWalkForwardStrategy(MLTrainOnceStrategy):
    def next(self):
        # Skip the cold start period with too few values available
        if len(self.data) < N_TRAIN:
            return

        # Re-train the model only every 20 iterations.
        # Since 20 << N_TRAIN, we don't lose much in terms of
        # "recent training examples", but the speed-up is significant!
        if len(self.data) % 20:
            return super().next()

        # Retrain on last N_TRAIN values
        df = self.data.df.iloc[-N_TRAIN:]
        X, y = get_clean_Xy(df)
        self.clf.fit(X, y)

        # Now that the model is fitted, 
        # proceed the same as in MLTrainOnceStrategy
        super().next()


bt = Backtest(data, MLWalkForwardStrategy, commission=.0002, margin=.05)
bt.run()
%%time

class MLWalkForwardStrategy(MLTrainOnceStrategy):
    def next(self):
        # Skip the cold start period with too few values available
        if len(self.data) < N_TRAIN:
            return

        # Re-train the model only every 20 iterations.
        # Since 20 << N_TRAIN, we don't lose much in terms of
        # "recent training examples", but the speed-up is significant!
        if len(self.data) % 20:
            return super().next()

        # Retrain on last N_TRAIN values
        df = self.data.df.iloc[-N_TRAIN:]
        X, y = get_clean_Xy(df)
        self.clf.fit(X, y)

        # Now that the model is fitted, 
        # proceed the same as in MLTrainOnceStrategy
        super().next()


bt = Backtest(data, MLWalkForwardStrategy, commission=.0002, margin=.05)
bt.run()

CPU times: user 4.6 s, sys: 30.3 ms, total: 4.63 s
Wall time: 4.64 s

Out[7]:

Start                                                   2017-04-25 12:00:00
End                                                     2018-02-07 15:00:00
Duration                                                  288 days 03:00:00
Exposure Time [%]                                                 71.720057
Equity Final [$]                                                5885.371145
Equity Peak [$]                                                10052.843539
Return [%]                                                       -41.146289
Buy & Hold Return [%]                                             12.869869
Return (Ann.) [%]                                                -41.902348
Volatility (Ann.) [%]                                             10.794943
Sharpe Ratio                                                      -3.881664
Sortino Ratio                                                     -2.803659
Calmar Ratio                                                      -1.010398
Max. Drawdown [%]                                                -41.471113
Avg. Drawdown [%]                                                -14.356287
Max. Drawdown Duration                                    261 days 18:00:00
Avg. Drawdown Duration                                     88 days 05:00:00
# Trades                                                                324
Win Rate [%]                                                      41.666667
Best Trade [%]                                                     0.384255
Worst Trade [%]                                                   -0.506643
Avg. Trade [%]                                                    -0.048462
Max. Trade Duration                                         3 days 07:00:00
Avg. Trade Duration                                         0 days 18:00:00
Profit Factor                                                      0.673567
Expectancy [%]                                                     -0.04804
SQN                                                               -2.792099
Kelly Criterion                                                   -0.212186
_strategy                                             MLWalkForwardStrategy
_equity_curve                                        Equity  Asset      ...
_trades                        EntryBar  ExitBar Ticker   Size  EntryPri...
_orders                                       Ticker  Side   Size
Signal...
_positions                                       {'Asset': 0, 'Cash': 5885}
_trade_start_bar                                                          0
dtype: object

In [8]:

Copied!

bt.plot()
bt.plot()

/Users/ww/workspace/minitrade/minitrade/backtest/core/_plotting.py:737: UserWarning: found multiple competing values for 'toolbar.active_drag' property; using the latest value
  fig = gridplot(
/Users/ww/workspace/minitrade/minitrade/backtest/core/_plotting.py:737: UserWarning: found multiple competing values for 'toolbar.active_scroll' property; using the latest value
  fig = gridplot(

Out[8]:

GridPlot(

id = 'p1710', …)

Apparently, when repeatedly retrained on past N_TRAIN data points in a rolling manner, our basic model generalizes poorly and performs not quite as well.

This was a simple and contrived, tongue-in-cheek example that shows one way to use machine learning forecast models with backtesting.py framework. In reality, you will need a far better feature space, better models (cf. deep learning), and better money management strategies to achieve consistent profits in automated short-term forex trading. More proper data science is an exercise for the keen reader.

Some instant optimization tips that come to mind are:

Data is king. Make sure your design matrix features as best as possible model and correlate with your chosen target variable(s) and not just represent random noise.
Instead of modelling a single target variable $y$, model a multitude of target/class variables, possibly better designed than our "48-hour returns" above.
Model everything: forecast price, volume, time before it "takes off", SL/TP levels, optimal position size ...
Reduce false positives by increasing the conviction needed and imposing extra domain expertise and discretionary limitations before entering trades.

Also make sure to familiarize yourself with the full Backtesting.py API reference