Automated trading systems make decisions about how to invest in the stock market. These decisions often depend on parameters that must be optimized to maximize returns and stability while minimizing risk. Quantopian’s mission is to provide you with the necessary tools to implement and test your own trading algorithms. As a simple example, you might have an algorithm that starts buying once a particular stock has gone up x times in a row and starts selling once the stock has gone down y times in a row. But how do you determine good values for x and y? As a general guiding principle, you might choose values that worked well in the past, and ideally you would want those values to be determined automatically. Our goal with parameter optimization is to let you specify that the backtesting engine should find good values for x and y. In your algorithm you would then simply use these parameters as you would any other variable, except that their values will be chosen for you!
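To make the streak rule concrete, here is a minimal sketch of the "buy after x up-days, sell after y down-days" logic. The function name and plain-list interface are illustrative, not part of the Quantopian API:

```python
def streak_signal(prices, x, y):
    """Return one 'buy'/'sell'/'hold' signal per price change.

    Buys once the price has risen x times in a row; sells once it
    has fallen y times in a row; otherwise holds.
    """
    signals = []
    up_streak = down_streak = 0
    for prev, curr in zip(prices, prices[1:]):
        if curr > prev:
            up_streak += 1
            down_streak = 0
        elif curr < prev:
            down_streak += 1
            up_streak = 0
        else:  # unchanged price resets both streaks
            up_streak = down_streak = 0
        if up_streak >= x:
            signals.append("buy")
        elif down_streak >= y:
            signals.append("sell")
        else:
            signals.append("hold")
    return signals
```

With x = 2 and y = 2, a price series that rises three times and then falls three times produces hold/buy/buy followed by hold/sell/sell, which is exactly the behavior whose x and y we want to tune.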
Fortunately for us, a large branch of applied math and computer science has been working on this very problem of parameter optimization for quite some time. In general terms, we have some input parameter(s) (x and y in our case) and an objective function we want to maximize (e.g. the cumulative returns). A multitude of diverse optimization algorithms exist, ranging from more formal methods like gradient descent to more heuristic techniques like particle swarm optimization. And while this is a fascinating and clearly relevant topic, this blog post focuses on how to apply an optimization algorithm rather than on which algorithm to apply. I will discuss the pros and cons of the different algorithms in a future blog post.
I personally applied a fair share of machine learning and optimization techniques during my ongoing academic endeavors in computational cognitive neuroscience. However, I have mostly dealt with batch data, never with real-time data. With this background, my first solution was to apply an offline optimization algorithm (e.g. gradient descent) and evaluate each parameter value by running the trading algorithm over the complete data set. Specifically, we set an initial parameter value, run the trading algorithm on the whole historical financial data set while keeping this parameter value fixed, and evaluate how well we did (e.g. via the cumulative returns). The optimization algorithm then suggests a new parameter value, which we again test over the whole historical data set. Rinse and repeat until we find the parameter value that provides us with the highest returns.
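The batch loop just described can be sketched in a few lines. This is a simplified stand-in (an exhaustive search over candidate values rather than gradient descent), and `run_backtest` is a hypothetical function representing a full simulation over the entire history:

```python
def optimize_batch(candidates, run_backtest, history):
    """Pick the candidate parameter value with the best objective.

    Each candidate is evaluated by running the trading algorithm
    over the *entire* historical data set (the offline approach).
    """
    best_value, best_score = None, float("-inf")
    for value in candidates:
        score = run_backtest(history, value)  # e.g. cumulative return
        if score > best_score:
            best_value, best_score = value, score
    return best_value
```

Note that every candidate is scored against the same full history, which is precisely what leads to the problems discussed next.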
While this approach did work with a carefully designed example trading algorithm, we discovered a few interesting problems and limitations to this solution:
- Unrealistic: It is hard to imagine how we would use this method in a real-world trading system. We probably wouldn’t want to use a parameter that worked well from 1997 to 2007 for all our day-to-day trading. Market demands change all the time and we want to incorporate the new data we observed! Moreover, we probably care less about which parameter values worked well in 1997 and want more recent data to have a bigger influence on our choice of parameters.
- Over-fitting: We might find that certain parameter values work extremely well for the period we used in the optimization; however, once we test them on unseen data, we notice that they perform quite poorly. In the machine learning literature this is a very well-known problem called over-fitting. It results from fitting models to noise artifacts in the data.
- Boring to watch: It would be pretty boring to just hit the optimize button and be notified a couple of hours (or days) later of the optimal parameter values found, with no intermediate feedback and no real insight into what was going on.
Walk-forward optimization
Our current approach improves on all three of these points. Instead of batch optimizing the parameters over the whole historical data set, we run an optimization for each day individually, using only the most recent data (e.g. the last week). This is also known as walk-forward optimization. For example, say we wanted to optimize our parameters iteratively using a time window of two days. For day t, we would use the parameter values obtained by optimizing over days t-2 and t-1. We then move on to the next day, t+1, for which we use the parameters optimized over days t-1 and t, and continue to move forward in this fashion. The image below should illustrate this more clearly.
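The procedure can be sketched as follows. As before, `run_backtest` is a hypothetical stand-in for the trading simulation, and the optimizer is simplified to a search over candidate values; the point is the rolling window, which re-optimizes every day over only the preceding `window` days:

```python
def walk_forward(days, candidates, run_backtest, window=2):
    """Yield (day index, parameter) pairs for walk-forward optimization.

    For each day t, the parameter used is the candidate that scored
    best over days t-window .. t-1 only, never over day t itself,
    so every day is traded out-of-sample.
    """
    for t in range(window, len(days)):
        train = days[t - window:t]  # e.g. days t-2 and t-1
        best = max(candidates, key=lambda v: run_backtest(train, v))
        yield t, best  # trade day t with the freshly optimized value
```

Because each day's parameter comes only from the trailing window, the scheme adapts to recent market behavior and never evaluates a parameter on the data that chose it.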
This solves the problems outlined above in the following ways:
- Realistic: We could directly apply this method in a real-time trading system that continues to learn as new data becomes available. Moreover, because we only use the most recent data (e.g. last week) in each step, we quickly adapt to changes in the market.
- Out-of-sample testing: Note that we always test our chosen parameters on data not used for the optimization (day t uses the optimal values from days t-2 and t-1, not those of day t itself). This is known as out-of-sample testing.
- Continuous progress updates: We get immediate, real-time feedback on the performance of the trading algorithm and the current optimal parameter values while the simulation is ongoing. Not only is this exciting to watch, it allows us to spot potential problems with the trading algorithm or the parameter optimization early on.
On top of these benefits, I think there are some other potentially interesting things to explore with this approach. For example, one might want to focus on how the optimal parameter values change over time, as this might reflect how our algorithm responds to changing market demands. Coming back to our original example, lower values of x might be better in times of financial turmoil, while higher values of y might be beneficial during a boom.
What other cool things would you do with this? Leave a comment!