Back Test Results


It’d be helpful for analysis, particularly for high frequency trading, if the back test results could be shown to two decimal places. For example, if the “Average % Profit” is shown as, say, 3.9% it matters whether the actual number is 3.85% or 3.94%. Similarly, if “Average Days Held” is shown as “8 Days”, it matters whether that is actually 7.50 days or 8.49 days.

I do the calculations currently by copy/pasting the Trade List into a spreadsheet, but that is obviously much slower.
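As a rough sketch of that spreadsheet step, the two averages can be computed to two decimal places from an exported trade list. The `trades` data and the `summarize` helper are hypothetical, made up for illustration; they are not part of the back tester.

```python
from statistics import mean

# Hypothetical trade list: (percent profit, days held) for each closed trade.
trades = [(4.10, 7), (3.60, 8), (3.85, 8), (3.85, 7)]

def summarize(trades):
    """Return (average % profit, average days held), each rounded to 2 decimals."""
    avg_profit = mean(profit for profit, _ in trades)
    avg_days = mean(days for _, days in trades)
    return round(avg_profit, 2), round(avg_days, 2)

avg_profit, avg_days = summarize(trades)
print(f"Average % Profit: {avg_profit:.2f}%")
print(f"Average Days Held: {avg_days:.2f}")
```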



Hi Kim,

This is an interesting one. While we could do that, the concern I have is that back tested results are a guide, not a forecast. If one system was 3.85% and the other was 3.94%, I would consider them the same result. The reason is that the result comes from one very specific path through all the possible buys and sells which could have been made with the capital allocated. We call this path dependency because the result depends on the path taken. The smallest change to position size, capital, start date, etc. can have a large effect on the result.
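A toy illustration of path dependency, under stated assumptions: the `signals` list and the `backtest` function below are invented for this example and do not reflect how the actual tester works. With capital for only one position at a time, moving the start date by a single day changes which trades are taken and therefore the average return.

```python
# Hypothetical signals: (entry_day, exit_day, percent_return).
signals = [(0, 5, 2.0), (1, 3, -1.0), (4, 8, 6.0), (6, 9, 1.0), (9, 12, 3.0)]

def backtest(signals, start_day, max_positions=1):
    """Average % return of the trades actually taken.

    Signals are processed in date order. A signal is skipped if it arrives
    before start_day or while all capital is tied up in open positions.
    Assumption: a position exiting on day N frees its capital for an entry on day N.
    """
    open_exits = []  # exit days of positions currently held
    taken = []       # returns of the trades that were taken
    for entry, exit_day, ret in sorted(signals):
        if entry < start_day:
            continue
        # Release capital from positions that have closed by this entry day.
        open_exits = [e for e in open_exits if e > entry]
        if len(open_exits) < max_positions:
            open_exits.append(exit_day)
            taken.append(ret)
    return sum(taken) / len(taken) if taken else 0.0

print(round(backtest(signals, start_day=0), 2))
print(round(backtest(signals, start_day=1), 2))
```

The same five signals give two different averages because each start date leads down a different path of taken and skipped trades.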

On top of that is the issue that averages hide a lot of information (you can read more on that here).

That’s not to say that backtesting is a fool’s errand. It is certainly very important to test an idea, but I’m more interested in the Win Rate (Probability of Gain) because that will determine how long I can stay in the market. There is a whole study by Ralph Vince on Optimal f and Risk of Ruin (things we want to implement in our new generation testers). The idea is to quantify my risk of going broke based on my Win Rate. You can see that with no leverage (column 1:1), I need a Win Rate above 60% to survive (0% means no chance of going bust; note that this is also position size dependent).

That’s why the Win Rate is much more important than the return from a test.
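To make the link between Win Rate and the risk of going broke concrete, here is a minimal sketch using the classic gambler's ruin formula. This is a simplification and not Ralph Vince's method or the table referred to above: it assumes every trade wins or loses exactly one unit, and that the account starts with `units` such units. The function name and parameters are hypothetical.

```python
def risk_of_ruin(win_rate, units):
    """Probability of eventually losing all capital (gambler's ruin).

    Assumptions: each trade wins or loses exactly one unit, trades are
    independent, and the account starts with `units` units of capital.
    """
    if not 0.0 <= win_rate <= 1.0:
        raise ValueError("win_rate must be between 0 and 1")
    if units <= 0:
        raise ValueError("units must be positive")
    if win_rate <= 0.5:
        return 1.0  # with no edge and even payoffs, ruin is eventually certain
    return ((1.0 - win_rate) / win_rate) ** units

# Risk per trade matters: 10 units means risking 10% of capital per trade,
# 5 units means risking 20% per trade.
for units in (5, 10, 20):
    print(units, f"{risk_of_ruin(0.60, units):.4f}")
```

Under these assumptions the risk falls quickly as the Win Rate rises above 50% or as the amount risked per trade shrinks, which is the position size dependency mentioned above.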


Of course, everything is a compromise: we may take a lower win rate if the returns are higher. But if I had a really high return and a low win rate (less than 50%), it would tell me that my return came from rare events which I cannot count on.
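A small made-up example shows how a high average can sit on top of a low win rate. The `returns` list is hypothetical data chosen for illustration; removing the single best trade turns the average negative.

```python
# Hypothetical trade returns in percent: mostly small losses, one rare large win.
returns = [-2.0, -1.5, -2.5, -1.0, 40.0, -2.0, 1.0, -1.5, -2.0, 1.5]

def win_rate(returns):
    """Fraction of trades that closed with a gain."""
    return sum(r > 0 for r in returns) / len(returns)

def average(returns):
    """Plain arithmetic mean of the trade returns."""
    return sum(returns) / len(returns)

def average_without_best(returns):
    """Mean return after dropping the single largest winner."""
    rest = sorted(returns)[:-1]
    return sum(rest) / len(rest)

print(f"Win Rate: {win_rate(returns):.0%}")
print(f"Average % Profit: {average(returns):.2f}")
print(f"Average without best trade: {average_without_best(returns):.2f}")
```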

Sorry for jumping on the soapbox on this. I work a lot in this area and I just want to help people understand the limitations of testing.

All the best