Parameter Heatmap
This tutorial will show how to optimize strategies with multiple parameters and how to examine and reason about optimization results. It is assumed you're already familiar with basic backtesting.py usage.
First, let's again import our helper moving average function. In practice, one should use functions from an indicator library, such as TA-Lib or Tulipy.
from minitrade.backtest.core.test import SMA
Our strategy will be a similar moving average cross-over strategy to the one in Quick Start User Guide, but we will use four moving averages in total: two moving averages whose relationship determines a general trend (we only trade long when the shorter MA is above the longer one, and vice versa), and two moving averages whose cross-over with daily close prices determine the signal to enter or exit the position.
from minitrade.backtest import Strategy
from minitrade.backtest.core.lib import crossover
class Sma4Cross(Strategy):
n1 = 50
n2 = 100
n_enter = 20
n_exit = 10
def init(self):
self.sma1 = self.I(SMA, self.data.Close.df, self.n1)
self.sma2 = self.I(SMA, self.data.Close.df, self.n2)
self.sma_enter = self.I(SMA, self.data.Close.df, self.n_enter)
self.sma_exit = self.I(SMA, self.data.Close.df, self.n_exit)
def next(self):
if not self.position():
# On upwards trend, if price closes above
# "entry" MA, go long
# Here, even though the operands are arrays, this
# works by implicitly comparing the two last values
if self.sma1[-1] > self.sma2[-1]:
if crossover(self.data.Close, self.sma_enter):
self.buy()
# On downwards trend, if price closes below
# "entry" MA, go short
else:
if crossover(self.sma_enter, self.data.Close):
self.sell()
# But if we already hold a position and the price
# closes back below (above) "exit" MA, close the position
else:
if (self.position().is_long and
crossover(self.sma_exit, self.data.Close)
or
self.position().is_short and
crossover(self.data.Close, self.sma_exit)):
self.position().close()
It's not a robust strategy, but we can optimize it.
Grid search is an exhaustive search over a specified set of values for each hyperparameter: the strategy is evaluated for every combination of parameter values, and the combination that performs best is selected.
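For intuition, here is a minimal, framework-independent sketch of what a randomized grid search amounts to; the parameter grid mirrors the one we use below, while the evaluate() placeholder stands in for running one backtest and is purely illustrative:
import random
from itertools import product

# Hypothetical search space, mirroring the strategy parameters used below
grid = {
    'n1': range(10, 110, 10),
    'n2': range(20, 210, 20),
    'n_enter': range(15, 35, 5),
    'n_exit': range(10, 25, 5),
}

# Exhaustive grid: the cartesian product of all parameter values ...
combos = [dict(zip(grid, values)) for values in product(*grid.values())]
# ... restricted to combinations satisfying the same constraint we use below
combos = [p for p in combos if p['n_exit'] < p['n_enter'] < p['n1'] < p['n2']]

# Randomized grid search: evaluate only a random subset of, say, 200 combinations
random.seed(0)
sample = random.sample(combos, min(200, len(combos)))

def evaluate(params):
    """Placeholder for the expensive objective, i.e. running one backtest."""
    ...

# best = max(sample, key=evaluate)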
Let's optimize our strategy on Google stock data using randomized grid search over the parameter space, evaluating at most (approximately) 200 randomly chosen combinations:
%%time
from minitrade.backtest import Backtest
from minitrade.backtest.core.test import GOOG
backtest = Backtest(GOOG, Sma4Cross, commission=.002)
stats, heatmap = backtest.optimize(
n1=range(10, 110, 10),
n2=range(20, 210, 20),
n_enter=range(15, 35, 5),
n_exit=range(10, 25, 5),
constraint=lambda p: p.n_exit < p.n_enter < p.n1 < p.n2,
maximize='Equity Final [$]',
max_tries=200,
random_state=0,
return_heatmap=True)
CPU times: user 6.47 s, sys: 38.4 ms, total: 6.51 s Wall time: 6.55 s
Notice the return_heatmap=True parameter passed to Backtest.optimize(). It makes the function return a heatmap series along with the usual stats of the best run.
heatmap is a pandas Series indexed with a MultiIndex, a cartesian product of all permissible (tried) parameter values. The series values are taken from the maximize= argument we provided.
heatmap
n1   n2   n_enter  n_exit
20   60   15       10          9780.10498
     80   15       10          9864.21924
     100  15       10         11003.21764
30   40   20       15         11888.74610
          25       15         16346.34842
                                  ...
100  200  15       10         13118.24766
          20       10         11308.46180
                   15         16518.74380
          25       10          8991.55294
          30       10          9953.07010
Name: Equity Final [$], Length: 177, dtype: float64
This heatmap contains the results of all the runs, making it very easy to obtain parameter combinations for e.g. three best runs:
heatmap.sort_values().iloc[-3:]
n1   n2   n_enter  n_exit
40   140  20       15        18296.45394
100  160  20       15        19417.90456
50   160  20       15        19767.05222
Name: Equity Final [$], dtype: float64
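Since heatmap is an ordinary pandas Series, we can also recover the single best parameter combination programmatically. This is plain pandas, not a dedicated API:
# Pair the MultiIndex level names with the index labels of the maximum value
best_params = dict(zip(heatmap.index.names, heatmap.idxmax()))
best_params  # here: {'n1': 50, 'n2': 160, 'n_enter': 20, 'n_exit': 15}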
But visualization lets us make judgements about larger data sets much faster. Let's plot the whole heatmap by projecting it onto two chosen dimensions. Say we're mostly interested in how parameters n1 and n2, on average, affect the outcome.
hm = heatmap.groupby(['n1', 'n2']).mean().unstack()
hm
| n1 \ n2 | 40 | 60 | 80 | 100 | 120 | 140 | 160 | 180 | 200 |
|---|---|---|---|---|---|---|---|---|---|
| 20 | NaN | 9780.104980 | 9864.219240 | 11003.217640 | NaN | NaN | NaN | NaN | NaN |
| 30 | 14117.54726 | 11656.102327 | 11833.501340 | 15248.209270 | 13286.483360 | 11598.391895 | 11271.353850 | 11449.573465 | 10717.850688 |
| 40 | NaN | 13462.739695 | NaN | 7624.609980 | 10696.599030 | 12991.038870 | 11438.851153 | 10935.961380 | 10718.967365 |
| 50 | NaN | 8467.364960 | 10180.502548 | 10563.790150 | 9149.067013 | 14351.970500 | 13653.468355 | 11458.974993 | 10128.978620 |
| 60 | NaN | NaN | 9232.415117 | 8046.485900 | 10838.454280 | 12929.726093 | 10416.431300 | 9519.835100 | 9611.335367 |
| 70 | NaN | NaN | 14712.143280 | 7192.892540 | 10461.744630 | 10165.959860 | 8355.260353 | 9950.317090 | 9435.988292 |
| 80 | NaN | NaN | NaN | 10909.253515 | 7746.413967 | 9190.286300 | 8883.167490 | 10478.420200 | 8979.801500 |
| 90 | NaN | NaN | NaN | 9050.433200 | 9578.601733 | 9884.415550 | 9782.404510 | 11385.593830 | 8844.327300 |
| 100 | NaN | NaN | NaN | NaN | 11306.293220 | 7154.397113 | 11419.912430 | 10222.051700 | 11978.015260 |
Let's plot this table using the excellent Seaborn package:
%matplotlib inline
import seaborn as sns
sns.heatmap(hm[::-1], cmap='viridis')
<Axes: xlabel='n2', ylabel='n1'>
We see that, on average, we obtain the highest result using trend-determining parameters n1=30 and n2=100, and it's not as if other nearby combinations work similarly well; in our particular strategy, this combination really stands out.
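To confirm that reading numerically rather than by eye (again plain pandas, nothing framework-specific):
# Stack the n1-by-n2 table of mean final equity back into a Series and
# find the (n1, n2) pair with the highest average result
hm.stack().idxmax()  # -> (30, 100) for this particular run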
Since our strategy contains several parameters, we might be interested in other relationships between their values. We can use the minitrade.backtest.core.lib.plot_heatmaps() function to plot interactive heatmaps of all parameter combinations simultaneously.
from minitrade.backtest.core.lib import plot_heatmaps
plot_heatmaps(heatmap, agg='mean')
Model-based optimization
Above, we used the randomized grid search optimization method. Any kind of grid search, however, might be computationally expensive for large data sets. In the following example, we will use the scikit-optimize package to guide our optimization with a surrogate model based on forests of decision trees. The hyperparameter model is sequentially improved by evaluating the expensive function (the backtest) at the next best point, thereby hopefully converging to a set of optimal parameters with as few evaluations as possible.
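To illustrate the idea independently of Backtest.optimize(), here is a tiny standalone sketch using scikit-optimize's forest_minimize on a made-up objective; the toy surface below is purely illustrative and has nothing to do with the backtest itself:
from skopt import forest_minimize
from skopt.space import Integer

def objective(params):
    # skopt minimizes, so a real use would return the negated backtest metric;
    # here we use a made-up surface with a known minimum at (30, 100)
    n1, n2 = params
    return (n1 - 30) ** 2 + (n2 - 100) ** 2

result = forest_minimize(
    objective,
    dimensions=[Integer(10, 100, name='n1'), Integer(20, 200, name='n2')],
    n_calls=30,        # number of (expensive) objective evaluations
    random_state=0,
)
# result.x holds the best parameters found, result.fun the objective value there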
So, with method="skopt":
%%capture
# Use the forked version from https://github.com/dodid/scikit-optimize.git, which contains numpy and scikit-learn fixes.
# ! pip install scikit-optimize # This is a run-time dependency
%%time
stats_skopt, heatmap, optimize_result = backtest.optimize(
n1=[10, 100], # Note: For method="skopt", we
n2=[20, 200], # only need interval end-points
n_enter=[10, 40],
n_exit=[10, 30],
constraint=lambda p: p.n_exit < p.n_enter < p.n1 < p.n2,
maximize='Equity Final [$]',
method='skopt',
max_tries=200,
random_state=0,
return_heatmap=True,
return_optimization=True)
CPU times: user 10.5 s, sys: 93.8 ms, total: 10.6 s Wall time: 10.7 s
heatmap.sort_values().iloc[-3:]
n1  n2   n_enter  n_exit
68  96   29       24        28424.01724
35  98   28       24        28658.79512
44  134  39       27        30251.80700
Name: Equity Final [$], dtype: float64
Notice how the optimization runs somewhat slower even though max_tries= is the same. That is due to the sequential nature of the algorithm, but it should perform rather comparably even for much larger parameter spaces, where grid search would effectively blow up, while likely (hopefully) reaching a better local optimum than a randomized search would.
A note of warning, again: take steps to avoid overfitting insofar as possible.
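One simple precaution is an out-of-sample check, sketched below on the assumption that GOOG is a plain OHLC DataFrame (as in this tutorial) and that Backtest.run() accepts strategy parameters as keyword arguments, as in the Quick Start User Guide: optimize on the earlier part of the data only, then re-run the chosen parameters on the later, unseen part and watch for a large drop-off in performance.
split = int(len(GOOG) * 0.7)                       # 70% in-sample, 30% held out
in_sample, out_of_sample = GOOG.iloc[:split], GOOG.iloc[split:]

# Optimize on the in-sample portion only
bt_train = Backtest(in_sample, Sma4Cross, commission=.002)
stats_train, heatmap_train = bt_train.optimize(
    n1=range(10, 110, 10),
    n2=range(20, 210, 20),
    n_enter=range(15, 35, 5),
    n_exit=range(10, 25, 5),
    constraint=lambda p: p.n_exit < p.n_enter < p.n1 < p.n2,
    maximize='Equity Final [$]',
    max_tries=200,
    random_state=0,
    return_heatmap=True)

# Re-run the tuned parameters on the unseen data
best_params = dict(zip(heatmap_train.index.names, heatmap_train.idxmax()))
bt_test = Backtest(out_of_sample, Sma4Cross, commission=.002)
stats_test = bt_test.run(**best_params)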
Understanding the impact of each parameter on the computed objective function is easy in two dimensions, but as the number of dimensions grows, partial dependency plots are increasingly useful. Plotting tools from scikit-optimize take care of many of the more mundane things needed to make good and informative plots of the parameter space:
from skopt.plots import plot_objective
_ = plot_objective(optimize_result, n_points=10)
from skopt.plots import plot_evaluations
_ = plot_evaluations(optimize_result, bins=10)
Learn more by exploring further examples or find more framework options in the full API reference.