Backtesting.py Quick Start User Guide¶
This tutorial shows some of the features of backtesting.py, a Python framework for backtesting trading strategies.
Backtesting.py is a small, lightweight, and fast backtesting framework built on the standard Python data stack (Python 3.6+, Pandas, NumPy, Bokeh). Its API is small and simple, so it is easy to remember and quick to shape towards meaningful results. The library doesn't really support stock picking or trading strategies that rely on arbitrage or multi-asset portfolio rebalancing; instead, it works with one tradeable asset at a time and is best suited for optimizing position entry and exit signals and decisions based on the values of technical indicators. It also serves as a versatile interactive tool for trade visualization and statistics.
Data¶
You bring your own data. Backtesting ingests all kinds of OHLC data (stocks, forex, futures, crypto, ...) as a pandas.DataFrame with columns 'Open', 'High', 'Low', 'Close' and (optionally) 'Volume'.
Such data is widely obtainable, e.g. with packages:
- pandas-datareader,
- Quandl,
- findatapy,
- yFinance,
- investpy, etc.
Besides these columns, your data frames can have additional columns which are accessible in your strategies in a similar manner.
DataFrame should ideally be indexed with a datetime index (convert it with pd.to_datetime()); otherwise a simple range index will do.
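As a rough illustration of the expected shape, the sketch below builds a tiny frame by hand. The lowercase source column names and the `raw` variable are assumptions standing in for whatever your data provider returns; the price values are copied from the sample rows in this guide.

```python
import pandas as pd

# Hypothetical raw data, e.g. as loaded from a CSV file with lowercase headers.
raw = pd.DataFrame({
    "date": ["2013-02-27", "2013-02-28", "2013-03-01"],
    "open": [794.8, 801.1, 797.8],
    "high": [804.75, 806.99, 807.14],
    "low": [791.11, 801.03, 796.15],
    "close": [799.78, 801.20, 806.19],
    "volume": [2026100, 2265800, 2175400],
})

# Rename the columns to the capitalized names described above.
data = raw.rename(columns={
    "open": "Open", "high": "High", "low": "Low",
    "close": "Close", "volume": "Volume",
})

# Convert the date strings and use them as a sorted datetime index.
data["date"] = pd.to_datetime(data["date"])
data = data.set_index("date").sort_index()

print(data)
```

Any extra columns left in the frame would simply ride along with the required ones.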
# Example OHLC daily data for Google Inc.
from minitrade.backtest.core.test import GOOG
GOOG.tail()
| | Open | High | Low | Close | Volume |
|---|---|---|---|---|---|
| 2013-02-25 | 802.3 | 808.41 | 790.49 | 790.77 | 2303900 |
| 2013-02-26 | 795.0 | 795.95 | 784.40 | 790.13 | 2202500 |
| 2013-02-27 | 794.8 | 804.75 | 791.11 | 799.78 | 2026100 |
| 2013-02-28 | 801.1 | 806.99 | 801.03 | 801.20 | 2265800 |
| 2013-03-01 | 797.8 | 807.14 | 796.15 | 806.19 | 2175400 |
Strategy¶
Let's create our first strategy to backtest on this Google data: a simple moving average (MA) cross-over strategy.
Backtesting.py doesn't ship its own set of technical analysis indicators. Users favoring TA should probably refer to functions from proven indicator libraries, such as TA-Lib or Tulipy, but for this example, we can define a simple helper moving average function ourselves:
import pandas as pd


def SMA(values, n):
    """
    Return simple moving average of `values`, at
    each step taking into account `n` previous values.
    """
    return values.rolling(n).mean()
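To see what the helper returns, here is a small self-contained check on made-up numbers (the `closes` series is illustrative, not market data). Note the warm-up period: the first `n - 1` entries are NaN because the rolling window is not yet full.

```python
import pandas as pd


def SMA(values, n):
    """Simple moving average over a rolling window of `n` values."""
    return values.rolling(n).mean()


closes = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])
sma3 = SMA(closes, 3)

# The first two entries are NaN; from the third on, each entry is the
# mean of the current value and the two before it.
print(sma3.tolist())  # [nan, nan, 2.0, 3.0, 4.0]
```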
A new strategy needs to extend the Strategy class and override its two abstract methods: init() and next().
Method init() is invoked before the strategy is run. Within it, one ideally precomputes in efficient, vectorized manner whatever indicators and signals the strategy depends on.
Method next() is then iteratively called by the Backtest instance, once for each data point (data frame row), simulating the incremental availability of each new full candlestick bar.
Note, backtesting.py cannot make decisions / trades within candlesticks: any new orders are executed on the next candle's open (or the current candle's close if trade_on_close=True). If you find yourself wishing to trade within candlesticks (e.g. daytrading), you instead need to begin with more fine-grained (e.g. hourly) data.
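The execution rule above can be sketched in a few lines. `Bar` and `fill_price` are hypothetical names used only for illustration, not part of the framework, and the sketch ignores commission and any other price adjustments. The two bars use the open and close values of the last two sample rows shown earlier.

```python
from typing import NamedTuple, Optional, Sequence


class Bar(NamedTuple):
    open: float
    close: float


def fill_price(bars: Sequence[Bar], signal_bar: int,
               trade_on_close: bool = False) -> Optional[float]:
    """Hypothetical helper: price at which an order placed on `signal_bar` fills."""
    if trade_on_close:
        # Fill at the close of the bar that produced the signal.
        return bars[signal_bar].close
    next_bar = signal_bar + 1
    if next_bar >= len(bars):
        return None  # no later bar exists, so nothing fills in this sketch
    # Default: fill at the open of the following bar.
    return bars[next_bar].open


bars = [Bar(open=801.1, close=801.20), Bar(open=797.8, close=806.19)]
print(fill_price(bars, 0))                       # 797.8
print(fill_price(bars, 0, trade_on_close=True))  # 801.2
```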
from minitrade.backtest import Strategy
from minitrade.backtest.core.lib import crossover


class SmaCross(Strategy):
    # Define the two MA lags as *class variables*
    # for later optimization
    n1 = 10
    n2 = 20

    def init(self):
        # Precompute the two moving averages
        self.sma1 = self.I(SMA, self.data.Close.df, self.n1)
        self.sma2 = self.I(SMA, self.data.Close.df, self.n2)

    def next(self):
        # If sma1 crosses above sma2, close any existing
        # short trades, and buy the asset
        if crossover(self.sma1, self.sma2):
            self.position().close()
            self.buy()

        # Else, if sma1 crosses below sma2, close any existing
        # long trades, and sell the asset
        elif crossover(self.sma2, self.sma1):
            self.position().close()
            self.sell()
In init() as well as in next(), the data the strategy is simulated on is available as an instance variable self.data.
In init(), we declare and compute indicators indirectly by wrapping them in self.I(). The wrapper is passed a function (our SMA function) along with any arguments to call it with (our close values and the MA lag). Indicators wrapped in this way will be automatically plotted, and their legend strings will be intelligently inferred.
In next(), we simply check if the faster moving average just crossed over the slower one. If it did and upwards, we close the possible short position and go long; if it did and downwards, we close the open long position and go short. Note, we don't adjust order size, so Backtesting.py assumes maximal possible position. We use the crossover() function imported above instead of writing more obscure and confusing conditions, such as:
%%script echo

def next(self):
    if (self.sma1[-2] < self.sma2[-2] and
            self.sma1[-1] > self.sma2[-1]):
        self.position().close()
        self.buy()

    elif (self.sma1[-2] > self.sma2[-2] and    # Ugh!
          self.sma1[-1] < self.sma2[-1]):
        self.position().close()
        self.sell()
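The two-bar comparison spelled out above can be wrapped in a small helper. The following is a minimal sketch of the idea only; `crossed_above` is a hypothetical name and this is not the library's own implementation of crossover().

```python
from typing import Sequence


def crossed_above(a: Sequence[float], b: Sequence[float]) -> bool:
    """Hypothetical helper: True if `a` just moved from below `b` to above `b`."""
    if len(a) < 2 or len(b) < 2:
        return False  # not enough history to compare two consecutive bars
    return a[-2] < b[-2] and a[-1] > b[-1]


fast = [9.0, 10.0, 12.0]   # illustrative values for the faster average
slow = [10.0, 11.0, 11.5]  # illustrative values for the slower average

print(crossed_above(fast, slow))  # True
print(crossed_above(slow, fast))  # False
```

Calling the helper with the arguments swapped detects the opposite, downward cross, which mirrors how the strategy above calls crossover() twice.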
In init(), the whole series of points was available, whereas in next(), the length of self.data and all declared indicators is adjusted on each next() call so that array[-1] (e.g. self.data.Close[-1] or self.sma1[-1]) always contains the most recent value, array[-2] the previous value, etc. (ordinary Python indexing of ascending-sorted 1D arrays).
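A plain-Python sketch of that incremental view, under the assumption that each step simply exposes one more bar than the previous one (the framework's internals may differ). The `closes` list holds the closing prices from the sample rows shown earlier.

```python
closes = [790.77, 790.13, 799.78, 801.20, 806.19]

seen = []
for i in range(len(closes)):
    # On step i, only bars up to and including bar i are visible.
    window = closes[: i + 1]
    latest = window[-1]                                  # most recent value
    previous = window[-2] if len(window) > 1 else None   # value before it
    seen.append((len(window), latest, previous))

print(seen[0])   # (1, 790.77, None)
print(seen[-1])  # (5, 806.19, 801.2)
```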
Note: self.data and any indicators wrapped with self.I (e.g. self.sma1) are NumPy arrays for performance reasons. If you prefer pandas Series or DataFrame objects, use Strategy.data.<column>.s or Strategy.data.df accessors respectively. You could also construct the series manually, e.g. pd.Series(self.data.Close, index=self.data.index).
We might avoid the self.position().close() calls if we primed the Backtest instance with Backtest(..., exclusive_orders=True).
from minitrade.backtest import Backtest
bt = Backtest(GOOG, SmaCross, cash=10_000, commission=.002)
stats = bt.run()
stats
Start                     2004-08-19 00:00:00
End                       2013-03-01 00:00:00
Duration                  3116 days 00:00:00
Exposure Time [%]         97.067039
Equity Final [$]          68935.11986
Equity Peak [$]           68991.21986
Return [%]                589.351199
Buy & Hold Return [%]     607.370361
Return (Ann.) [%]         25.673113
Volatility (Ann.) [%]     38.68591
Sharpe Ratio              0.66363
Sortino Ratio             1.303752
Calmar Ratio              0.776041
Max. Drawdown [%]         -33.082172
Avg. Drawdown [%]         -5.581506
Max. Drawdown Duration    688 days 00:00:00
Avg. Drawdown Duration    41 days 00:00:00
# Trades                  94
Win Rate [%]              54.255319
Best Trade [%]            57.11931
Worst Trade [%]           -16.629898
Avg. Trade [%]            2.085687
Max. Trade Duration       121 days 00:00:00
Avg. Trade Duration       33 days 00:00:00
Profit Factor             2.1966
Expectancy [%]            2.618979
SQN                       2.001593
Kelly Criterion           0.259164
_strategy                 SmaCross
_equity_curve             Equity Asset Cash...
_trades                   EntryBar ExitBar Ticker Size EntryPrice...
_orders                   Ticker Side Size SignalTime ...
_positions                {'Asset': 85, 'Cash': 408}
_trade_start_bar          19
dtype: object
The Backtest.run() method returns a pandas Series of simulation results and statistics associated with our strategy. We see that this simple strategy makes almost 600% return over a period of about eight and a half years, with a maximum drawdown of 33% and the longest drawdown period spanning almost two years ...
The Backtest.plot() method provides the same insights in a more visual form.
bt.plot()
Optimization¶
We hard-coded the two lag parameters (n1 and n2) into our strategy above. However, the strategy may work better with 15–30 or some other cross-over. We declared the parameters as optimizable by making them class variables.
We optimize the two parameters by calling the Backtest.optimize() method with each parameter as a keyword argument pointing to its pool of possible values to test. Parameter n1 is tested for values from 5 up to (but not including) 30 and parameter n2 for values from 10 up to (but not including) 70, both in steps of 5. Some combinations of values of the two parameters are invalid, i.e. n1 should not be larger than or equal to n2. We limit admissible parameter combinations with an ad hoc constraint function, which takes in the parameters and returns True (i.e. admissible) whenever n1 is less than n2. Additionally, we search for the parameter combination that maximizes final equity, and hence return, over the observed period. We could instead choose to optimize any other key from the returned stats series.
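To get a feel for the size of the search space described above, the sketch below enumerates the same two ranges with itertools and applies the same n1 < n2 constraint. It only counts combinations; how the optimizer actually walks or batches the grid is not shown here and may differ.

```python
from itertools import product

n1_values = range(5, 30, 5)    # 5, 10, 15, 20, 25
n2_values = range(10, 70, 5)   # 10, 15, ..., 65

# Full Cartesian grid, then keep only the admissible pairs.
grid = list(product(n1_values, n2_values))
admissible = [(n1, n2) for n1, n2 in grid if n1 < n2]

print(len(grid))        # 60
print(len(admissible))  # 50
```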
%%time
stats = bt.optimize(n1=range(5, 30, 5),
                    n2=range(10, 70, 5),
                    maximize='Equity Final [$]',
                    constraint=lambda param: param.n1 < param.n2)
stats
CPU times: user 1.74 s, sys: 27.1 ms, total: 1.77 s
Wall time: 1.79 s
Start                     2004-08-19 00:00:00
End                       2013-03-01 00:00:00
Duration                  3116 days 00:00:00
Exposure Time [%]         99.068901
Equity Final [$]          105040.12612
Equity Peak [$]           108327.71798
Return [%]                950.401261
Buy & Hold Return [%]     687.987489
Return (Ann.) [%]         32.010932
Volatility (Ann.) [%]     45.029728
Sharpe Ratio              0.710884
Sortino Ratio             1.504932
Calmar Ratio              0.727597
Max. Drawdown [%]         -43.995445
Avg. Drawdown [%]         -6.138853
Max. Drawdown Duration    690 days 00:00:00
Avg. Drawdown Duration    43 days 00:00:00
# Trades                  153
Win Rate [%]              51.633987
Best Trade [%]            61.562908
Worst Trade [%]           -19.778312
Avg. Trade [%]            1.557227
Max. Trade Duration       83 days 00:00:00
Avg. Trade Duration       21 days 00:00:00
Profit Factor             1.988207
Expectancy [%]            1.987172
SQN                       1.619735
Kelly Criterion           0.166257
_strategy                 SmaCross(n1=10,n2=15)
_equity_curve             Equity Asset Cas...
_trades                   EntryBar ExitBar Ticker Size EntryPric...
_orders                   Ticker Side Size SignalTime ...
_positions                {'Asset': 130, 'Cash': 235}
_trade_start_bar          14
dtype: object
We can look into stats['_strategy'] to access the Strategy instance and its optimal parameter values (10 and 15).
stats._strategy
<Strategy SmaCross(n1=10,n2=15)>
bt.plot(plot_volume=False, plot_pl=False)
Strategy optimization managed to raise final equity on in-sample data by just over 50% (from about $68,900 to about $105,000) and even beat simple buy & hold. In real-life optimization, however, do take steps to avoid overfitting.
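One common precaution, offered here only as an illustration and not as something this guide's workflow performs, is to optimize on an earlier slice of the data and evaluate the chosen parameters on a later, unseen slice. `chronological_split` is a hypothetical helper; the integers stand in for time-ordered bars.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def chronological_split(rows: Sequence[T],
                        train_fraction: float = 0.75) -> Tuple[List[T], List[T]]:
    """Hypothetical helper: split time-ordered rows into in-sample and out-of-sample parts."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    cut = int(len(rows) * train_fraction)
    # No shuffling: the out-of-sample part must lie strictly after the in-sample part.
    return list(rows[:cut]), list(rows[cut:])


bars = list(range(8))  # stand-ins for eight consecutive bars
in_sample, out_of_sample = chronological_split(bars, 0.75)

print(in_sample)      # [0, 1, 2, 3, 4, 5]
print(out_of_sample)  # [6, 7]
```

With a DataFrame, the same idea would amount to slicing by position or date before building each backtest.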
Trade data¶
In addition to the backtest statistics returned by Backtest.run() shown above, you can look into individual trade returns and the changing equity curve and drawdown by inspecting the last few, internal keys in the result series.
stats.tail()
_equity_curve       Equity Asset Cas...
_trades             EntryBar ExitBar Ticker Size EntryPric...
_orders             Ticker Side Size SignalTime ...
_positions          {'Asset': 130, 'Cash': 235}
_trade_start_bar    14
dtype: object
The columns should be self-explanatory.
stats['_equity_curve'] # Contains equity/drawdown curves. DrawdownDuration is only defined at ends of DD periods.
| | Equity | Asset | Cash | DrawdownPct | DrawdownDuration |
|---|---|---|---|---|---|
| 2004-08-19 | 10000.00000 | 0.0 | 10000.00000 | 0.000000 | NaT |
| 2004-08-20 | 10000.00000 | 0.0 | 10000.00000 | 0.000000 | NaT |
| 2004-08-23 | 10000.00000 | 0.0 | 10000.00000 | 0.000000 | NaT |
| 2004-08-24 | 10000.00000 | 0.0 | 10000.00000 | 0.000000 | NaT |
| 2004-08-25 | 10000.00000 | 0.0 | 10000.00000 | 0.000000 | NaT |
| ... | ... | ... | ... | ... | ... |
| 2013-02-25 | 103035.52612 | 102800.1 | 235.42612 | 0.048854 | NaT |
| 2013-02-26 | 102952.32612 | 102716.9 | 235.42612 | 0.049622 | NaT |
| 2013-02-27 | 104206.82612 | 103971.4 | 235.42612 | 0.038041 | NaT |
| 2013-02-28 | 104391.42612 | 104156.0 | 235.42612 | 0.036337 | NaT |
| 2013-03-01 | 105040.12612 | 104804.7 | 235.42612 | 0.030349 | 533 days |
2148 rows × 5 columns
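The DrawdownPct column appears to be the fractional decline of equity from its running peak. The sketch below recomputes it under that assumption (`drawdown_fractions` is a hypothetical helper and assumes strictly positive equity values), then checks one row using the Equity Peak reported in the optimized results and the 2013-02-25 equity value from the table.

```python
from typing import List, Sequence


def drawdown_fractions(equity: Sequence[float]) -> List[float]:
    """Fractional decline of each equity value from the running peak so far."""
    peak = float("-inf")
    result = []
    for value in equity:
        peak = max(peak, value)            # highest equity seen up to this bar
        result.append(1.0 - value / peak)  # 0.0 whenever a new peak is set
    return result


print(drawdown_fractions([100.0, 120.0, 90.0, 130.0]))  # [0.0, 0.0, 0.25, 0.0]

# Reported equity peak followed by the 2013-02-25 equity value.
dd = drawdown_fractions([108327.71798, 103035.52612])[-1]
print(round(dd, 6))  # compare with the DrawdownPct of 0.048854 in that row
```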
stats['_trades'] # Contains individual trade data
| | EntryBar | ExitBar | Ticker | Size | EntryPrice | ExitPrice | PnL | ReturnPct | EntryTime | ExitTime | Tag | Duration |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 20 | 60 | Asset | 87 | 114.64884 | 185.23 | 6140.56092 | 0.615629 | 2004-09-17 | 2004-11-12 | None | 56 days |
| 1 | 60 | 69 | Asset | -87 | 184.85954 | 175.80 | 788.17998 | 0.049008 | 2004-11-12 | 2004-11-26 | None | 14 days |
| 2 | 69 | 71 | Asset | 96 | 176.15160 | 180.71 | 437.60640 | 0.025878 | 2004-11-26 | 2004-11-30 | None | 4 days |
| 3 | 71 | 75 | Asset | -96 | 180.34858 | 179.13 | 116.98368 | 0.006757 | 2004-11-30 | 2004-12-06 | None | 6 days |
| 4 | 75 | 82 | Asset | 97 | 179.48826 | 177.99 | -145.33122 | -0.008347 | 2004-12-06 | 2004-12-15 | None | 9 days |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 148 | 2085 | 2111 | Asset | 139 | 689.15556 | 735.54 | 6447.43716 | 0.067306 | 2012-11-29 | 2013-01-08 | None | 40 days |
| 149 | 2111 | 2113 | Asset | -139 | 734.06892 | 742.83 | -1217.79012 | -0.011935 | 2013-01-08 | 2013-01-10 | None | 2 days |
| 150 | 2113 | 2121 | Asset | 136 | 744.31566 | 735.99 | -1132.28976 | -0.011186 | 2013-01-10 | 2013-01-23 | None | 13 days |
| 151 | 2121 | 2127 | Asset | -136 | 734.51802 | 750.51 | -2174.90928 | -0.021772 | 2013-01-23 | 2013-01-31 | None | 8 days |
| 152 | 2127 | 2147 | Asset | 130 | 752.01102 | 806.19 | 7043.26740 | 0.072045 | 2013-01-31 | 2013-03-01 | None | 29 days |
153 rows × 12 columns
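Per-trade records like these make it easy to derive your own summary figures. As a small sketch, the PnL values of the first five trades listed above give the following win rate and profit factor; the profit factor here uses the common definition of gross profit divided by gross loss, which is an assumption and may not match how the library computes its own statistic.

```python
# PnL of the first five trades listed in the table above.
pnl = [6140.56092, 788.17998, 437.60640, 116.98368, -145.33122]

wins = [p for p in pnl if p > 0]
losses = [p for p in pnl if p < 0]

# Share of trades that closed with a profit.
win_rate = len(wins) / len(pnl)

# Gross profit divided by gross loss; infinite if there were no losing trades.
profit_factor = sum(wins) / abs(sum(losses)) if losses else float("inf")

print(f"{win_rate:.0%}")  # 80%
print(profit_factor)
```

With the full frame, the same figures could be computed directly from the PnL column of stats['_trades'].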
Learn more by exploring further examples or find more framework options in the full API reference.