Backtesting for efficiency - Algorithmic and Mechanical Forex Strategies

Your forex backtests are absolutely worthless if you do not test the statistical entry efficiency and exit efficiency of the strategy. Everyone who runs a backtest inevitably reports the dollars earned as the outcome. Other measures exist, like the average win to loss, the profit factor and the Sharpe ratio, but they do not tell you anything useful until the final step of designing an automated trading system.

The correct approach to testing a strategy should focus on the question, “is my strategy a piece of garbage?” Most people try to prove themselves right. The real test is to not be able to prove yourself wrong. The only way to do that is through a statistical approach.

Entry and Exit Efficiency

Efficiency puts a hard number on the percentage of the available trading range that a strategy captures. The trading window opens on the bar where a trade enters the market. The window closes on the bar where the trade exits.

The total available range is the highest high minus the lowest low inside the window. Entry and exit efficiency simply measure what percentage of that range your strategy tends to capture. Average the values across all of the trades and you get the overall efficiency.

Entry efficiency formula

Formula for a long trade: (Highest high – entry price) ÷ (Highest high – lowest low)
Formula for a short trade: (entry price – lowest low) ÷ (Highest high – lowest low)

Exit efficiency formula

Formula for a long trade: (Exit price – lowest low) ÷ (Highest high – lowest low)
Formula for a short trade: (Highest high – exit price) ÷ (Highest high – lowest low)

Take an example where you buy a hypothetical currency at 150 and sell it at 170. The lowest low between the time of entry and exit was 140. The price then ran all the way up to 200 before settling back down to 170, which is where the exit took place.

The entry efficiency is (200-150) ÷ (200-140) = 50 ÷ 60 = 83%. Nearly anyone would agree that this makes for a great entry.
The exit efficiency is (170-140) ÷ (200-140) = 30 ÷ 60 = 50%. Most would agree that the exit would have ideally occurred sooner than it did.
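The formulas and the worked example above can be sketched in a few lines of Python. This is only an illustration: the function names and the `is_long` flag are assumptions made for the sketch, not part of any trading platform.

```python
def entry_efficiency(entry, highest_high, lowest_low, is_long=True):
    """Share of the trade window's range that was still ahead of the entry."""
    window = highest_high - lowest_low
    if window <= 0:
        raise ValueError("highest_high must be greater than lowest_low")
    # Long: room left up to the high. Short: room left down to the low.
    captured = highest_high - entry if is_long else entry - lowest_low
    return captured / window


def exit_efficiency(exit_price, highest_high, lowest_low, is_long=True):
    """Share of the trade window's range that the exit managed to keep."""
    window = highest_high - lowest_low
    if window <= 0:
        raise ValueError("highest_high must be greater than lowest_low")
    # Long: distance above the low. Short: distance below the high.
    captured = exit_price - lowest_low if is_long else highest_high - exit_price
    return captured / window


# The example from the text: buy at 150, sell at 170, low 140, high 200.
entry_eff = entry_efficiency(150, 200, 140)
exit_eff = exit_efficiency(170, 200, 140)
print(f"entry {entry_eff:.1%}, exit {exit_eff:.1%}")  # entry 83.3%, exit 50.0%
```

Both functions use the same denominator, so the entry and exit numbers for one trade are always measured against the same window.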

Efficiencies do not change by instrument or time frame

One major problem that we encounter with forex backtests is the limited data set. This is especially true for those interested in testing long term strategies like those on the H4 or D1 charts. The wonderful thing about entry and exit efficiencies is that they do not vary from chart to chart or even period to period.

I like to jump down to M1 charts for efficiency testing. The data is nearly endless. I never have to worry about running out. The great thing is that I know when I shift back to the H4 chart, the efficiencies should not change more than ±5%.

If you see the efficiency vary too much, then you may not have enough trades to form a statistically significant group. My experience tells me that 75 trades usually get very close to the actual efficiency. 100 trades or more is better. When I run tests on M1 charts, I often get several thousand trades over the course of a few months. Numbers that large can tell you with a great deal of confidence just how robust a strategy’s parameters truly are.

Usually, you can assume that any results that fall within 45-55% are the result of a random, stochastic process. When I see backtests that creep right up to those barriers like 54.9% or even 55.1%, the results inevitably tank back to around the 50% mark.
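A rough sketch of how those rules of thumb could be checked in Python follows. The function name, the dictionary keys, and the sample numbers are all made up for illustration. The 75-trade minimum and the 45-55% band are simply the figures quoted above, not statistically derived thresholds.

```python
from statistics import mean


def summarize_efficiency(efficiencies, min_trades=75, band=(0.45, 0.55)):
    """Average per-trade efficiencies and apply the rules of thumb from the text."""
    avg = mean(efficiencies)
    return {
        "trades": len(efficiencies),
        "average": avg,
        # Fewer trades than this and the average may still be moving around.
        "enough_trades": len(efficiencies) >= min_trades,
        # An average inside the band is treated as indistinguishable from random.
        "in_random_band": band[0] <= avg <= band[1],
    }


# 100 made-up per-trade entry efficiencies, purely for demonstration.
sample = [0.62, 0.48, 0.71, 0.55, 0.66] * 20
report = summarize_efficiency(sample)
print(report)
```

The same function works for entry or exit efficiency; run it once per list.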

Random trade outcomes and dollar profits

I wish this section were about how to make money with a random efficiency. Alas, we must cover how randomness can result in unjustified euphoria.

I’ve been interested in the concept of randomness for several years now. Mathematicians refer to it with the more opaque name of a “stochastic process”. Despite the nonsensical name, it’s just a fancy way of saying the study of randomness: how it changes, its distribution, how far it “walks”, etc.

Yesterday, I used the analogy of coin flips to describe how Martingale strategies are probabilistically doomed to failure. One interesting concept that I did not mention relates to Brownian motion. Even with a set of random outcomes, trades will go on a random walk away from the starting point.

Einstein gets the real credit for solving the math behind the concept, even though his name is not on the term. He demonstrated that the distance a random process drifts from its starting point grows with the square root of the number of trials. If we decide to flip a coin 60 times, we expect 50% of the flips, or 30, to fall on heads and the other 30 on tails.

It actually turns out that we should expect a slight bias toward either winners or losers, although we do not know which one. It’s random. The typical size of that bias, whichever way it prefers to go, should be about √60, which works out to roughly 7. The heads outcomes should typically range from 23 to 37, with the tails outcomes making up the difference.

Seven trades out of sixty strongly alter the percentages, even though we know that the true rate is supposed to be 50%. If heads only came up 23 times out of 60, that’s 38%. The problem is not with the coin. It’s with the number of trials. As you run an increasingly large number of trials, the random bias shrinks in significance relative to the percent accuracy. 50,000 trades, for example, should show a surplus of roughly 223 trades in favor of winning or losing. The accuracy range falls within 1% of 50% on either side, a dramatic improvement.
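The arithmetic behind the 60-flip and 50,000-trade examples can be reproduced with a short Python sketch. The function name is invented for the illustration, and the square-root rule is taken directly from the text as a rule of thumb.

```python
import math


def expected_drift(trials):
    """Square-root rule of thumb: typical surplus of wins or losses after `trials`."""
    drift = math.sqrt(trials)
    # Return the drift in trades and as a share of all trials.
    return drift, drift / trials


for n in (60, 50_000):
    drift, share = expected_drift(n)
    print(f"{n} trials: drift of about {drift:.1f} trades, {share:.2%} of all trials")
```

The drift keeps growing in absolute terms as the number of trials rises, but it shrinks as a percentage of the total, which is why the large sample looks so much closer to 50%.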

Risks of curve fitting

Curve fitting a random efficiency relates to the idea of Brownian motion. Let’s say that we use a strategy that I know will never show an entry or exit efficiency: the moving average crossover. I’ve gone through this strategy six ways from Sunday, almost exclusively at the behest of clients. It does not work as a fully automated strategy. There is no secret set of fast and slow periods that will unlock the hidden keys to profit.

Most traders, experienced or not, abuse the backtester by searching for the set of parameters that yields the most dollar profit. They curve fit their test to optimize for maximum profit. What really happens is that the traders optimize for the amount of random drift that already occurred.

When I used the example of 50,000 trades creating a natural drift of 223, I cited it to show how small the error in the real percent accuracy becomes. The other consequence for trading systems is that as the error percentage decreases, the natural bias in your outcomes increases in absolute terms. Blindly running the optimizer only selects the set of parameters that yields the best combination of two criteria:

The drift that happened to work out in favor of that set of parameters
The profit and loss that varies with those parameters. The dollar profit naturally changes because the two moving averages cross at different points
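A small simulation can illustrate the first of those two criteria. In the sketch below, every "parameter set" is a fair coin with no edge at all, yet picking the best one still produces an impressive-looking result. The function names, the number of parameter sets, and the seed are arbitrary choices for the demonstration, and the model ignores the second criterion (varying dollar profit per trade) entirely.

```python
import random


def random_strategy_net_wins(trades, rng):
    """Wins minus losses for a strategy with no edge (a fair coin per trade)."""
    return sum(1 if rng.random() < 0.5 else -1 for _ in range(trades))


def optimize_over_noise(parameter_sets=200, trades=1000, seed=42):
    """Mimic an optimizer: test many edge-free parameter sets, keep the best."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    results = [random_strategy_net_wins(trades, rng) for _ in range(parameter_sets)]
    return max(results), sum(results) / len(results)


best, average = optimize_over_noise()
print(f"best parameter set: {best:+d} net wins; average across sets: {average:+.1f}")
```

The average across all sets hovers near zero, as it should for a coin flip. The "optimized" winner is just the set whose random drift happened to be largest.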

You need a tool like efficiency to guard against these types of random outcomes. It’s the only method that I know of that definitively states whether or not a strategy behaves in a random manner. I especially like the fact that it breaks those elements down into two of the three basic components of a trading strategy: the entry, the exit, and the position sizing.

Efficient strategies do not work all of the time

Position sizing marks the final obstacle to building your fully automated trading strategy. A set of rules that yields a statistically efficient entry that is paired with an efficient exit does not necessarily make money. The value of each trading setup can vary, too.

Each strategy contains different sets of winners and losers. Each winner and loser varies in its dollar value. Whatever money management approach you take requires balancing the ratio of the winners and losers in a way that normalizes the outcome of each trade. You ideally want to eliminate the variation in dollar value. 20 pip trades should earn or lose you exactly as much as the 100 pip trades.

That seems counter-intuitive. Most traders want to win in proportion with the size of the opportunity. It’s better from a system perspective to entirely ignore the size of the opportunity and to make each trade worth the same amount. Betting more or less with each trade effectively normalizes the value of each trade.
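One way to picture this normalization is a sizing function that scales the lot size inversely with the size of the move. The sketch below is only an assumption about how it might be done: the function name, the $100 target value, and the $10 pip value per lot are illustrative numbers, not figures from this article.

```python
def position_size(target_value, distance_pips, pip_value_per_lot):
    """Lots to trade so that a move of `distance_pips` is worth `target_value`."""
    if distance_pips <= 0 or pip_value_per_lot <= 0:
        raise ValueError("distance_pips and pip_value_per_lot must be positive")
    return target_value / (distance_pips * pip_value_per_lot)


# Assumed for illustration: every trade should be worth $100,
# and one pip on one lot is worth $10.
for pips in (20, 100):
    lots = position_size(100.0, pips, 10.0)
    print(f"{pips} pip trade: {lots:.2f} lots, worth ${lots * pips * 10.0:.2f}")
```

The 20 pip trade gets five times the size of the 100 pip trade, so both end up worth the same dollar amount.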

Using a stop loss stands out as an obvious candidate to fix how much a trade is worth. The severe disadvantage is that it almost always negatively affects the exit efficiency. Whenever I can get away with it, I recommend using a market-based exit instead of an arbitrary stop loss. Traders usually scream at the top of their lungs when they hear me say this. I’m just speaking as a systems developer. The numbers are what they are.

Comments

Femto Trader says
September 18, 2012 at 13:18
I liked this article. The paragraph about “Risks of curve fitting” is interesting… Ernie Chan’s book (Quantitative Trading: How to Build Your Own Algorithmic Trading Business) calls it “data-snooping bias”, which is why he advises splitting your data into two parts: “in-sample data” (to optimize settings) and “out-of-sample data” (to test settings)
The concept of testing the “effectiveness” or “efficiency” of entry/exit points is also interesting… that’s another point of view that I will probably try to consider in my next backtests.
I have also seen an idea that is not present in your article, called “walk forward analysis” (WFA) or “walk forward optimization” (WFO)… a picture is sometimes better than text… https://www.google.fr/search?q=walk+forward+optimization
What is your opinion of this way of backtesting / optimizing?
Shaun Overton says
September 19, 2012 at 08:27
Hi Femto,
Thanks for your comment. I highly recommend Ernie Chan’s book. It sits proudly on my bookshelf.
Walk forward analysis is exactly the same idea that you outlined from Chan with data-snooping bias. Walk forward is the more commonly accepted terminology. As for my opinion, walk forward is the only acceptable method for strategy testing.
odie.rachmat says
July 1, 2013 at 13:49
I know it might seem silly or dumb.
As we know, walk-forward backtesting checks whether a system is robust enough to trade on out-of-sample data.
Let’s say we optimize on in-sample data from 2000-2008, then test on out-of-sample data from 2008-2013. If the strategy still generates a profit in the 2008-2013 out-of-sample data, the system is robust enough.
Now, what if we optimize on 2008-2013 as the in-sample data, then test it on 2000-2013? I’ve seen some of my tests generate a profit in that earlier out-of-sample data as well. So the system also looks robust for past market conditions that we didn’t optimize on.
The thing is, we have optimized on the data closest to current market conditions. So the system is better adapted to current conditions, with proof of robustness in past out-of-sample data.
I guess the point of walk forward is to show robustness on out-of-sample data that has not been optimized. So when the system shows the same robustness on out-of-sample data in the backward test, why don’t we accept it?
Another point: what matters is not that we will trade our real money in the future, but that we trade our real money in “out-of-sample/unknown data/market conditions”.
Anything that we didn’t optimize on is an “out-of-sample/unknown market condition”. So I guess the walk-backward test (I don’t know what to name this idea) fulfills the out-of-sample robustness requirement if it also makes a profit in the out-of-sample past. It also has the benefit of better adaptation to the current market, since the optimization period is the nearest to it.
We might run the optimization monthly, or, if some basic criterion such as the historical drawdown is breached, re-optimize on the past 2-3 years and again check the robustness of that optimization set against the backward out-of-sample data.
Does this make sense statistically? Please let me know your opinion.
Best Regards
Odie
- Shaun Overton says
  July 1, 2013 at 15:12
  Hey Odie,
  There’s no mathematical theorem that I can give to disprove the idea. What I feel strongly about is that it doesn’t work.
  Let’s say you optimize for 2008-2010, then “forward test” the idea from 2000-2007. To some extent, the idea is already known and studied for the time that you’re testing.
  I actually spent the last month experiencing this very idea. A client tasked me with recreating his manual trading strategy, based on data supplied from 2010. I tested my algorithm, which did a fair job of approximating his trades. I was curious about prior performance, so I ran a test all the way back to 2000. The results, not surprisingly, were equally solid. If I stopped there, it’d look like 10 years of solid performance optimized only on a single year.
  Then, along came the dreaded walk forward test. The results completely fell apart. “Forward testing” on previous data didn’t accomplish anything.
  Thanks for the great question!
drofwarc says
February 15, 2015 at 13:13
Hi Shaun,
Congrats on your website. Lots of solid information here. I’ve just read with interest your article discussing the concept of measuring entry and exit efficiency. You’ve provided the formulas, but you don’t explain how to apply them. I’m in the process of backtesting a number of strategies in MT4. Is it possible to code the formulas into the EA I’m using for backtesting?
I look forward to hearing back from you.
drofwarc
- Shaun Overton says
  February 16, 2015 at 07:55
  Hi drofwarc,
  I’ve honestly moved on from efficiency to different analysis tools. Efficiency is still solid for throwing out most unadvantaged entry and exit strategies. That said, it doesn’t mean that what’s left is worth keeping.
  It’s not worth building the efficiency measurements when you can program a strategy quickly in TradeStation or NinjaTrader to uncover the same number.
  –Shaun