I recently had someone from Portugal approach me with an Expert Advisor that he programmed. He felt that it was successful and wanted to get my opinion on its viability. I knew only a little about the system, yet I was able to confidently reject it as unproven.
Statistics is the key. It provides a relatively straightforward toolbox for dismissing the simplest of systems. The measure that I care about in particular is called Student’s t-test.
Let’s first talk about when this test is appropriate. If you have fixed take profits and stop losses in place on every trade, then it is appropriate to use this test. The reason is that these values put fixed limits on the outcome of each trade. The distribution of your trades is likely to form a nice bell curve. In math terms, this is called a Gaussian distribution.
If your strategy uses dynamic exits based on market conditions, then this test is not appropriate. The distribution of forex, stock and futures prices does not follow a bell curve, and the test depends on the assumption that the distribution under study does. If you make assumptions that don’t match what you’re measuring, then the results are useless at best and dangerously misleading at worst.
If you’d like to go into the mathematics involved, then you can find numerous sources on the internet that explain t-tests. I also really enjoy the book Statistics by Freedman, Pisani and Purves.
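For readers who want to see the arithmetic, here is a minimal sketch of a one-sample t statistic computed on a list of trade results. The function name `t_statistic`, the trade values and the 40/30 pip exits are all made up for illustration, and the sketch assumes you are testing whether the average trade differs from zero.

```python
import math
import statistics


def t_statistic(trade_results, expected_mean=0.0):
    """One-sample Student's t statistic for the average trade result."""
    n = len(trade_results)
    if n < 2:
        raise ValueError("need at least two trades")
    mean = statistics.mean(trade_results)
    stdev = statistics.stdev(trade_results)  # sample standard deviation (n - 1)
    if stdev == 0:
        raise ValueError("all trades are identical; t is undefined")
    standard_error = stdev / math.sqrt(n)
    return (mean - expected_mean) / standard_error


# Hypothetical results in pips: take profit at +40, stop loss at -30.
trades = [40, -30, 40, 40, -30, -30, 40, -30, 40, -30]
t = t_statistic(trades)
print(round(t, 3))
```

The statistic only means something once you compare it against a t table for your degrees of freedom. With so few trades, even a positive average like this one produces a small t value, which is the statistical way of saying that the sample is too thin to trust.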
Most trading systems do not follow bell curves, which makes the details of the t-test largely irrelevant. What the test does usefully show is that the more results a backtest contains, the more confidence you can place in its outcomes and predictions.
Restrictions and Degrees of Freedom
The goal of any test is to ensure that the results are accurate. The more often you test a concept, the more confident you feel that the outcome will repeat itself consistently. The idea of predictability largely matches our intuitive expectations. If my co-worker shows up on time regularly, then I feel confident about him showing up on time tomorrow. If he shows up late regularly, then I know that he’s likely to show up late in the future.
More experience brings more confidence. Eventually, the number of observations gets so big that we feel very comfortable with the probabilities.
The Portuguese client presented a system based on moving averages with 3 filters, a stop loss and a take profit. Let’s assume that each filter only had one parameter. The restrictions for the buy trade are the moving average period (1), the one parameter for each of the three filters (3), the stop loss distance (1) and the take profit distance (1). This yields a total of 6 restrictions for buy trades. Assuming that sell trades use the same set of inputs, we have a total of 12 restrictions.
We can’t begin counting our total number of trades (i.e., degrees of freedom) until the backtest shows at least 12 trades to account for our restrictions. It’s a good idea to not infer anything about a trading system until 30+ trades elapse. With the 12 restrictions in place, that sets the threshold for the minimum number of trades to reach a conclusion at 30 + 12 = 42.
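The counting above is simple enough to sketch in a few lines. The function name `minimum_trades` and its parameters are illustrative, and the 30-trade baseline is the rule of thumb from this post rather than a hard statistical law.

```python
def minimum_trades(restrictions_per_side, sides=2, baseline=30):
    """Trades needed before drawing any conclusion, per the rule of thumb."""
    total_restrictions = restrictions_per_side * sides
    return baseline + total_restrictions


# Moving average period (1) + three filters (3) + stop loss (1) + take profit (1)
per_side = 1 + 3 + 1 + 1
required = minimum_trades(per_side)
print(per_side, required)  # 6 42
```

Every extra input on the strategy raises the bar, which is one more reason to prefer systems with few parameters.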
Generally, it’s a good idea to see 300-400 trades before drawing conclusions about any system. The reason for this is that some events are very rare. They may only occur once every couple of hundred trials. Letting the number of trades approach this threshold gives the trader a much better chance of seeing what hidden risks may be present.
The backtest that the Portuguese individual submitted only contained 27 trades, well short of even the 42-trade minimum. Knowing what we know about basic analysis, I comfortably decided that there was nowhere near enough information on the system to consider evaluating it.
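To put a rough number on the rare event argument, the sketch below estimates the chance that an event occurring once in every 200 trades shows up at least once in a backtest. It assumes that trades are independent, which real markets do not guarantee, and the function name `chance_of_seeing_event` is made up for illustration.

```python
def chance_of_seeing_event(event_probability, trades):
    """Probability that the event occurs at least once in the given trades."""
    return 1.0 - (1.0 - event_probability) ** trades


rare = 1 / 200  # an event that happens once every couple of hundred trades
for n in (30, 100, 400):
    print(n, round(chance_of_seeing_event(rare, n), 2))
```

Under these assumptions, a backtest of a few dozen trades will usually miss the event entirely, while one with 400 trades is far more likely than not to contain it.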
A Word of Caution
An algorithm’s trading statistics are almost certain to change with time. Unless you have a mathematically sophisticated model for evaluating volatility, best practice demands that you evaluate the trading results in light of the type of volatility experienced. Making money in 2008 when the markets nosedived does not mean that you would have made money in 2010, which was much quieter in comparison. Any signal in 2008 that indicated a short would almost certainly show returns that include a handful of monster winners. If those monsters fail to show up because the volatility does not cooperate, then your expert advisor will more than likely flop.