For nearly every trader, there is one, all-consuming question: how do you build a profitable strategy?
Naturally, one would expect to dive straight into the statistics and begin crunching the numbers. Yet, before we delve further into the mechanics – and believe me there are plenty – I want to talk a bit about the philosophy and, indeed, the rules behind a good strategy.
With all the numbers needing to be crunched, it’s unquestionably important to have a rough sketch of your ideas for building a strategy. Only with that rough sketch in hand can you then move on to the blueprints of actually constructing a sound, viable strategy.
There is No Perfect Trading Strategy
Having said that, let’s get something perfectly clear and out of the way: Your first rule of trading is to understand that there is no perfect strategy. But don’t let that stop you from crafting the best trading strategy possible. There are some steps to take before you even begin to draft your strategy.
If you hope to one day be able to gauge the market you need to first find your philosophical starting point. That is the acceptance of a single immutable truth which you must first embrace.
Before you dive into the practice, the data, the equations and, of course, the money, you need a reference point. And that reference point is simply this: The market has a heartbeat. That means there is no one-size-fits-all strategy that can predict the market each and every time. That’s because the market is not static; it’s constantly moving and the rules are constantly changing.
Many, many times I’ve seen traders try to optimize their strategy to account for every conceivable condition. The next thing you know, they’ve not only lost direction but they’ve lost a bundle. As I’ve said, no single strategy works all the time. Thus, building a successful strategy calls for you to predict the market at certain very specific points. Moreover, you need to recognize the “dead zones” where the strategy is not predictive, and make it a point not to engage.
Stick to the Rules
Your second rule to producing a viable trading strategy is to have clear mechanical rules of engagement. That is, clear conditions as to when you open a trade and when you close a trade. And you need to stick with that rule, no matter what. If you allow any leeway you actually change the mechanism, thus making it impossible to measure.
Let’s say your first rule of engagement looks like this: When the 120-day EMA crosses below the 60-day EMA (in other words, the faster average moves above the slower one), you buy. This time, though, it appears to you that the pair you’re trading is still bearish. You decide that you’re not going to open a trade after all.
Though in this instance it was the right call, you’ve interfered with the rules. Thus the results you got are not a byproduct of your strategy. Instead, they’re the result of your judgment call. When a trader interferes with the rules of engagement they eliminate any ability to assess the strategy’s effectiveness. So why is that a tragedy?
Because it means you’ll never know where you’re really getting it wrong. And if you never learn from your mistakes you’ll always make them. This is the most dangerous, even lethal, pitfall, and one that has burned thousands of traders with so-called “good strategies.” So, if you want to make sure your strategy is successful (and not just randomly successful), stick to the rules.
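To make the idea of a mechanical rule concrete, here is a minimal sketch in Python of the EMA cross rule from the example above. The function names, the EMA seeding and the sample prices are my own assumptions for illustration, not a reference implementation. The point is that the signal is computed from the data alone, with no room for a judgment call.

```python
def ema(prices, period):
    """Exponential moving average, seeded with the first price (an assumption)."""
    alpha = 2.0 / (period + 1)
    values = [prices[0]]
    for price in prices[1:]:
        values.append(alpha * price + (1 - alpha) * values[-1])
    return values


def buy_signals(prices, fast=60, slow=120):
    """Bar indices where the fast EMA moves from at-or-below to above the slow EMA."""
    fast_ema = ema(prices, fast)
    slow_ema = ema(prices, slow)
    signals = []
    for i in range(1, len(prices)):
        was_not_above = fast_ema[i - 1] <= slow_ema[i - 1]
        is_above = fast_ema[i] > slow_ema[i]
        if was_not_above and is_above:
            signals.append(i)  # the rule fires; there is no discretionary override
    return signals


# Hypothetical price series: 50 falling bars followed by 100 rising bars.
prices = [100 - i for i in range(50)] + [51 + i for i in range(100)]
signals = buy_signals(prices, fast=5, slow=20)
```

Shorter periods are used in the sample call only so that a small made-up series produces a cross; the defaults match the 60 and 120 periods in the example.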
Four Phases of Testing
So now that we’ve established the philosophy it’s time to get down to business. How do you test whether or not your strategy is worth implementing? The testing process constitutes four different phases:
- In-Sample Testing
- Optimization
- Out-of-Sample Testing
- Forward Testing (i.e. paper trading)
In-Sample Testing
When you think about testing a strategy what instinctively comes to mind? Back testing, or testing your strategy based on historical data. But while back testing is one of the more important parts of testing, it might also create some misconceptions.
For example, if you test your strategy on an entire set of data, how will you know how it performed when market conditions changed? To tackle this problem professionals use what’s called In-Sample/Out-of-Sample testing.
How you perform the test is relatively simple. The historical data is divided into two parts, the In-Sample and the Out-of-Sample. In-Sample represents about 2/3 of the testing period while Out-of-Sample accounts for the remaining 1/3. You can see how it plays out in the illustration below.
The In-Sample will be the preliminary test of your strategy, the first dry run, if you will. If your strategy doesn’t do well in the In-Sample test, it means you might have to ditch that strategy and go back to the drawing board.
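The split itself can be sketched in a few lines of Python. This is only an illustration: the function name and the sample data are hypothetical, and it assumes the bars are stored oldest first so that the In-Sample is the earlier 2/3 of the history.

```python
def split_in_out_of_sample(bars, in_sample_fraction=2 / 3):
    """Split chronologically ordered bars into (in_sample, out_of_sample)."""
    cut = round(len(bars) * in_sample_fraction)
    return bars[:cut], bars[cut:]


# Hypothetical data: 900 daily closes, oldest first.
closes = [100 + 0.1 * day for day in range(900)]
in_sample, out_of_sample = split_in_out_of_sample(closes)
```

The split is by position rather than at random, so the Out-of-Sample part always comes after the data the strategy was first tested on.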
However, if the In-Sample testing shows an ascending curve of returns that’s good news! It means you’ve got something to work with. Now it’s time to tighten the screws, just like a mechanic, and in the world of trading that means optimizing your strategy.
Optimizing Your Trading Strategy
Now, optimizing a strategy is perhaps the most mathematically intensive part of carving out a strategy. Even if math is not your forte, it’s important enough not to ignore. There are three methods we can use: Correlation, return distribution and curve fitting. Let’s look at how we’d put them to use.
As a case study, we will use the simplest strategy that traders use to ride a trend: the moving average cross. The moving average cross works like this: If the fast moving average (short period) is above the slow moving average (longer period) that’s a buy signal.
Conversely, if the fast moving average is below the slow moving average then that’s a sell signal. Now, let’s say that we decide to open a position but only if certain parameters are met. The question is how can you know if those parameters are, in fact, the optimal ones? Well, that’s where our statistical methodologies will come in handy.
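Here is a rough Python sketch of the moving average cross as just described. It uses simple moving averages and hypothetical function names; treating bars where the averages are equal, or not yet defined, as "no signal" is my own assumption.

```python
def sma(prices, period):
    """Simple moving average; None until enough bars are available."""
    averages = []
    for i in range(len(prices)):
        if i + 1 < period:
            averages.append(None)
        else:
            window = prices[i + 1 - period : i + 1]
            averages.append(sum(window) / period)
    return averages


def cross_signals(prices, fast, slow):
    """+1 = buy (fast above slow), -1 = sell (fast below slow), 0 = no signal."""
    signals = []
    for f, s in zip(sma(prices, fast), sma(prices, slow)):
        if f is None or s is None or f == s:
            signals.append(0)
        elif f > s:
            signals.append(1)
        else:
            signals.append(-1)
    return signals


# Hypothetical steadily rising prices; the fast average sits above the slow one.
rising = list(range(1, 11))
rising_signals = cross_signals(rising, fast=2, slow=4)
```

The `fast` and `slow` arguments are exactly the parameters that the optimization step tries to choose, for example 30 and 120.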
Correlation, Return Distribution and Curve Fitting
The first method is correlation. Essentially, you switch to a different set of parameters that better correlates to the market. Let me elaborate; say your first test or trial was 120 days for the long average and 30 days for the short average (or 120, 30). Then you tested a few more options, let’s say 120, 14 and then 60, 30 days.
Next you compare the correlation of each data set. The closer the correlation coefficient (R) is to 1, the better; that means the strategy is better at predicting the market. (R² on its own always lies between 0 and 1, so the sign comes from R.) What if you get an R value that is closer to -1?
Well, that’s good, too, in its own way. It means you should be selling rather than buying, because the market is moving in the opposite direction. (Of course, with the MA crosses it’s very unlikely to get a -1 correlation anyway.)
Now, if you get a value close to 0 that’s not so good. That signals that there’s no or very little correlation between your strategy and the market. If you had positive gains on the first back testing and the correlation is 0 then the reality is your success was random and not indicative.
As you can see from the three options, our first choice was actually the best correlated to the market. That suggests that the parameters we chose in this particular case were optimal.
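The article does not spell out the calculation, so the following Python sketch rests on an assumption: that the correlation is the Pearson coefficient between how far the market moved on each signal and how much the strategy gained on that signal. All numbers are made up for illustration and are not the results behind the charts.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical pips per signal: market move vs. gains of two parameter sets.
market_moves = [10, -8, 15, -12, 6, 9]
gains_set_a = [9, -7, 13, -10, 5, 8]   # tracks the market closely
gains_set_b = [3, 5, -2, -4, 6, -1]    # barely related to the market

r_a = pearson_r(market_moves, gains_set_a)
r_b = pearson_r(market_moves, gains_set_b)
```

Under that assumption, the parameter set with R closer to 1 is the one whose gains follow the market most reliably, while an R near 0 suggests any profit was closer to chance.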
Distribution of Returns
Once we’ve checked which set of parameters correlates better with the market, there is another dimension to consider. Let’s say one set of parameters is better correlated to the market and thus succeeds more consistently. Another set of parameters is not quite as correlated, but a successfully executed trade yields more on average per trade.
As the charts above show, this makes for an interesting insight. Our preliminary parameters (120, 30) had a better distribution of returns, meaning returns per trade are stable rather than fluctuating. And that makes sense because it has a higher correlation to the market, as indicated by the first test we made.
Now, we get an interesting result, and one that you are likely to encounter. Our first parameters made more constant returns because the correlation to the market was higher. However, the average return per trade was lower than the second set of numbers (120,14).
This is quite puzzling. How would you determine, then, which strategy is better? Is it better to earn less per trade but constantly or to earn more per trade but less constantly?
To answer this question we must move to our final staple in optimization. That is to compare the curves created when using the two parameters and see which one works to our advantage, i.e. constant smaller gains or less constant bigger gains.
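Before comparing curves, it can help to put numbers on "earn less but constantly" versus "earn more but less constantly". The Python sketch below summarizes per-trade returns by their average and their standard deviation. Using the standard deviation as the measure of consistency is my assumption, and the trade lists are hypothetical.

```python
import statistics


def summarize_trades(returns):
    """Average return per trade and its sample standard deviation."""
    return {
        "mean": statistics.mean(returns),
        "stdev": statistics.stdev(returns),
    }


# Hypothetical per-trade returns in pips for two parameter sets.
steady = [10, 12, 9, 11, 10]      # smaller but stable gains
lumpy = [40, -25, 35, -20, 30]    # bigger average, large swings

steady_stats = summarize_trades(steady)
lumpy_stats = summarize_trades(lumpy)
```

In this made-up data the second list earns more per trade on average, but its standard deviation is far larger, which is the trade-off described above.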
Comparing the Curves
When we overlay the curves we get our answer: The first set of parameters (120, 30) is still preferable. While it had inferior results at the beginning, the high fluctuations of the second option mean that its results were more random.
That randomness would eventually lead to a strategy that does not efficiently predict the market. However, if the returns on the second set of numbers were vastly superior then the volatility in returns might be worth the risk.
But in this case, the 120, 30 produced superior returns in the end, rewarding us for taking less risk and being more predictive of the market. So, now that we’ve finished our preliminary optimization we’re ready to test our strategy with the Out-of-Sample test phase.
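A curve comparison like this one can be sketched in Python by accumulating per-trade returns into an equity curve. The data is hypothetical, and measuring fluctuation as the largest peak-to-trough drop is just one possible choice on my part, not necessarily how the charts were built.

```python
from itertools import accumulate


def equity_curve(trade_returns):
    """Running total of per-trade returns."""
    return list(accumulate(trade_returns))


def max_drawdown(curve):
    """Largest peak-to-trough drop, measured from a starting balance of zero."""
    peak = 0
    worst = 0
    for value in curve:
        peak = max(peak, value)
        worst = max(worst, peak - value)
    return worst


# Hypothetical per-trade returns in pips for two parameter sets.
steady = [10, 12, 9, 11, 10, 12]
lumpy = [40, -25, 35, -20, 30, -15]

steady_curve = equity_curve(steady)
lumpy_curve = equity_curve(lumpy)
```

In this invented example the lumpy curve starts ahead, but the steady curve finishes higher and never gives back any gains, which mirrors the pattern described for the two parameter sets.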
Out-of-Sample Testing
By running the Out-of-Sample test after the other tests, you’ll gain valuable insight into how your strategy reacts to market conditions different from those initially considered.
Now we need to check the results of each. There are two different strategies (A and B) that were tested with both the In- and Out-of-Sample. As you can see, Strategy A had been rather successful on the In-Sample (2/3) but once checked against the entire data set it performed rather poorly.
In contrast, Strategy B had done well on both parts of the In- and Out-of-Sample, meaning you’ve got something viable. Thus, there is a greater chance that your strategy is what every trader wants – a money maker.
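The comparison between Strategy A and Strategy B can be expressed as a small check in Python. Everything here is a sketch under assumptions: the trade lists are invented, and "holds up" is defined simply as a positive total in both the In-Sample and the Out-of-Sample part, which is my simplification.

```python
def sample_report(trade_returns, in_sample_fraction=2 / 3):
    """Total return in each part of a chronologically ordered trade list."""
    cut = round(len(trade_returns) * in_sample_fraction)
    in_total = sum(trade_returns[:cut])
    out_total = sum(trade_returns[cut:])
    return {
        "in_sample": in_total,
        "out_of_sample": out_total,
        # Assumed criterion: profitable on both parts of the data.
        "holds_up": in_total > 0 and out_total > 0,
    }


# Hypothetical per-trade returns, oldest first.
strategy_a = [5, 6, 4, 7, 5, 6, -8, -6, -9]   # good early, poor later
strategy_b = [4, 5, 3, 6, 4, 5, 4, 3, 5]      # consistent throughout

report_a = sample_report(strategy_a)
report_b = sample_report(strategy_b)
```

A strategy that only passes on the In-Sample part, like the invented Strategy A here, is a warning sign that its parameters were fitted to one stretch of history.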
Graduating to Paper Trading
Congratulations! If you’ve reached this level then you’ve been successful in generating returns. Now it’s time to try out your trading strategy on live data. We’re not yet trading with actual money but it’s the closest thing, and should give you a good idea as to how well your strategy will perform live.
You may find that there are glitches that you didn’t at first notice. Or perhaps the entry or exit of each trade could be improved with another indicator. These are things you might only discover when trading goes live. After an adequate sample, something akin to the Out-of-Sample length, you’ve – voilà – crafted a trading strategy.
Shaun put together a free, 6-step checklist to help traders build their own automated trading systems. It’s the same one that he used for QB Pro. If you’re feeling overwhelmed with where to begin your own strategy, then that free handout is the logical place to start.
I haven’t understood the concept yet
Hi Azhar,
Which part is unclear?
–Shaun
very nice and helpful article
Thank you, Nikos!
Excellent article. Very useful for someone trying to devise an automated trading strategy. I have read many articles on trading and rarely find one so short and clear which covers so much.
Wow! I am bookmarking this one.
Thanks, David. That’s one of the best comments I’ve ever had on my blog!
The correlation part is unclear. The y-axis and x-axis values are not labeled, and calculation of R² is absent.
Could you elaborate on that?
Hey Rob,
I’ll have Lior weigh in. Thanks for the question.
Hi Rob,
Thank you for your question. What the correlation measures is the number of pips the market moved vs. the strategy’s gains on a specific signal. So if the market moved 10 pips and your strategy gained 10 pips, that would be a correlation of 1. If your strategy had gained less, say 8 pips, the correlation would have been less than 1. If the market moved 10 pips but your strategy lost 10 pips, the correlation would be -1. In essence, it measures how well your strategy predicts what’s about to happen next.
I’m guessing the correlation graphs are statistical linear regression where the X axis would be 100% wins (independent variable) and the Y axis would be the strategy I’m testing (dependent variable)? https://onlinecourses.science.psu.edu/stat501/node/250
Are the data points cumulative? The graphs seem to suggest that they are.
Are the data points sampled over a given time frame (ie once per week/month/etc)?
Why is the R² correlation better than a simple win/loss ratio?
I couldn’t get that link to work, but the graphs are the correlations of the moving average pairs compared to the underlying market.
If you look closely at the x-axis, you’ll notice that it’s over several thousand bars, which is years of data.
Correlations are better than win:loss because win:loss can often result from a purely random signal. Getting a 90% correlation is not a coincidence, but 1,000 coin tosses can come out profitable. There are even rare instances where it’s wildly profitable. You want to know how reliable the profit actually was.
I think that makes sense. I’ve been toying with my backtest class and I added what I call a perfect balance. This is calculated by looking at the candles between trade start and trade close. Wherever the highest (or lowest) price falls is used to calculate the perfect balance. So this would ideally be the perfect strategy (winning 100% of the time using only what the market gives it.)
My idea is to compare the perfect balance with the actual strategy balance performance using linear regression and r squared correlations.
That sound right or am I way off?
It’s very right. I’ve done machine learning along those lines trading at random frequencies. If you know the optimal entry and exits for a given frequency, the goal is to then find predictors to match those exits. Whether or not you use linear regression depends on the distribution of the variables, but that’s usually the tool of choice.
Fantastic! Always a learning experience. Now if I could go back in time and redo my system in python instead of php I would be in good shape 😀
Thanks Shaun. I appreciate your work.
It’s a pity your site isn’t in French.
I’m sorry.
You guys are kidding yourselves. Technical analysis is junk science.
I don’t understand the comment. This article isn’t about technical analysis. It’s about measuring the results of your predictions and then acting accordingly. What exactly is the objection? You don’t think mathematical observation works?
Based on your comment, you may enjoy this video: The RSI Doesn’t Work.
eurusd open don
What?
For me this is not clear: How do you do the division between in-sample and out-of-sample? What data do you take for that? Or do you simply say that a strategy should be successful for 2/3 of the time? Thanks for an explanation.
Hi Ean,
There are several methods, though these are the 2 most common:
1) Divide the data in half. The first half is in-sample. The second is out-of-sample.
2) Divide the data into 6-12 month chunks. Optimize in one chunk, then test in the next chunk. Re-optimize, then test in the next chunk.
The second method is only appropriate if your strategy is sensitive to parameter settings. Most of the stuff I do is very simple, which leads me to use the first method 99% of the time.
–Shaun
Which is your best robot for beginners?
Hi Noble,
It’s not quite that simple. I suggest you read through my beginner’s trading article to get my advice for new traders.
–Shaun
Hi Shaun,
Have you heard of the Algorithmic Traders Association? Do you think it’s a valid organization in the sense of giving quality training or qualification?
Just wanted your opinion on this as algo trading is a still new subject and doesn’t seem to have specific training on this.
This might be even an area where you could expand to. Provide broad training on all concepts of algo trading.
Hi Daniel,
I’ve never heard of the organization. I’d be leery of organizations issuing credentials, though. Algo trading is insanely difficult. You really need to develop math and science skills if you want to do it well. Your time would be better spent on physical science subjects rather than market-specific ones. You’ll get better ideas from other disciplines.
–Shaun
Your correlation charts, are these created automatically by Tradestation, or can this be done in Excel?
How would you go about creating these – do you have the code, or steps?
It’s a very broad subject, but the short answer is that it’s not something that trading platforms support. You’ll need to use 3rd party software to build correlation charts.
I usually use R, which is not beginner friendly but very powerful. Excel would be quite a bit easier. Plot your x-y data, then calculate a linear regression of the sorted data. You can use CORREL to calculate the correlation coefficient behind the graphs.