Whether a strategy has overfit the data it was tested on should be one of the biggest concerns of any quantitative trader. Overfit strategies appear extremely profitable on the backtesting data, only to fall apart when traded on out-of-sample data. This can lead to the implosion of many young trading accounts.
Andrea from Math Trading wrote an outside-the-box post in which he discussed applying techniques from the Machine Learning field to explore how the features of a trading strategy relate to how overfit the strategy is.
In his post, Andrea suggests that trading algorithms are simply a form of artificial intelligence applied to price action. He also explains that the more features are added to a strategy, the more likely it is to overfit the data it is being tested on.
Building on that idea, he suggests that the same feature can affect different strategies differently. The impact of a feature therefore depends on the context it is used in, so we can’t assume that any system feature will always have the same impact on overfitting. He explains that Machine Learning techniques can be applied to help with these judgements.
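The claim that adding features makes overfitting more likely can be illustrated with a small experiment. The sketch below is not from Andrea's post. It assumes pure-noise "returns" and pure-noise "features" (so there is nothing real to learn), fits an ordinary least squares model with NumPy, and compares the fit on the data used for fitting against the fit on fresh data. The sample sizes, feature counts, and function names are all illustrative assumptions.

```python
import numpy as np

def r_squared(y, y_hat):
    """Coefficient of determination of predictions y_hat against y."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def fit_and_score(n_features, X_train, y_train, X_test, y_test):
    """Fit OLS on the first n_features columns; return (in-sample, out-of-sample) R^2."""
    A_train = np.column_stack([np.ones(len(y_train)), X_train[:, :n_features]])
    A_test = np.column_stack([np.ones(len(y_test)), X_test[:, :n_features]])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return r_squared(y_train, A_train @ coef), r_squared(y_test, A_test @ coef)

# Assumption: features and targets are independent noise, so any
# in-sample fit is overfitting by construction.
rng = np.random.default_rng(0)
n_train, n_test, max_features = 100, 100, 40
X_train = rng.normal(size=(n_train, max_features))
X_test = rng.normal(size=(n_test, max_features))
y_train = rng.normal(size=n_train)
y_test = rng.normal(size=n_test)

feature_counts = [1, 5, 10, 20, 40]
results = [fit_and_score(k, X_train, y_train, X_test, y_test) for k in feature_counts]
for k, (r2_in, r2_out) in zip(feature_counts, results):
    print(f"{k:>2} features: in-sample R^2 = {r2_in:.3f}, out-of-sample R^2 = {r2_out:.3f}")
```

Because each larger model contains the smaller ones, the in-sample fit can only improve as features are added, even though the features carry no information. The out-of-sample column is where that apparent improvement should fail to show up.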
Andrea’s Real World Example
The real world example that Andrea uses to explain this idea is a sporting goods store in Australia. He suggests that because Australian consumers are big fans of watersports, spring and summer would be the store's best seasons for sales. That basic idea makes sense.
He explains that this seasonal approach would have less predictive power for a sporting goods store located in the United States, where there are a number of popular sports in every season of the year. Therefore, the same predictive feature would produce dramatically different results depending on the environment.
Trading System Example
To apply this idea to trading strategies, Andrea uses the rather obvious example of trailing stops. He explains that trailing stops generally work well when they are used in trend following strategies. Those same trailing stops have a much different effect when they are applied to mean reversion strategies.
Because trailing stops work differently in these different situations, we cannot say that they are inherently good or bad with respect to overfitting. They have to be considered in the context of the strategy that they are being used in.
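One way to explore this yourself is to apply the same trailing stop to two synthetic price series, one trending and one mean reverting, and compare the results with simply holding the position. The sketch below is an assumption-laden toy, not Andrea's method: the price models, the parameter values (`drift`, `speed`, `vol`, `trail`), and the function names are all hypothetical choices made for illustration.

```python
import numpy as np

def trailing_stop_pnl(prices, trail):
    """P&L of a long entered at prices[0] and exited when price falls
    `trail` or more below its running high, else at the final bar."""
    entry = prices[0]
    high = entry
    for price in prices[1:]:
        high = max(high, price)
        if price <= high - trail:
            return price - entry
    return prices[-1] - entry

def trending_path(rng, n, drift=0.1, vol=1.0):
    """Random walk with positive drift (assumed stand-in for a trend)."""
    return 100.0 + np.cumsum(rng.normal(drift, vol, size=n))

def mean_reverting_path(rng, n, mean=100.0, speed=0.2, vol=1.0, start=97.0):
    """Discrete mean-reverting process; the long is entered below the mean."""
    prices = np.empty(n)
    prices[0] = start
    for t in range(1, n):
        prices[t] = prices[t - 1] + speed * (mean - prices[t - 1]) + rng.normal(0.0, vol)
    return prices

rng = np.random.default_rng(1)
n_paths, n_bars, trail = 2000, 250, 3.0

for name, make_path in [("trending", trending_path), ("mean reverting", mean_reverting_path)]:
    hold, stopped = [], []
    for _ in range(n_paths):
        path = make_path(rng, n_bars)
        hold.append(path[-1] - path[0])                # exit at the final bar
        stopped.append(trailing_stop_pnl(path, trail)) # exit on the trailing stop
    print(f"{name:>14}: mean P&L hold = {np.mean(hold):.2f}, "
          f"with trailing stop = {np.mean(stopped):.2f}")
```

The numbers depend heavily on the assumed parameters, so they should be read as a prompt for experimentation rather than as evidence for either strategy type. The point is only that the identical stop rule can be evaluated separately in each regime instead of being judged in isolation.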
After explaining how different features can impact different strategies in different ways, Andrea suggests that there are a few papers in the Machine Learning field that could help to guide our thinking on this topic.