Scientific research is becoming increasingly dependent on sophisticated software, custom written for individual research projects. These complex computer programs are rarely published together with the research they produce, which makes it cumbersome for other researchers to validate results.
This is how Thomas Wiecki opened his recent piece on the Quantopian blog about the challenges of reproducing the results of academic papers. In the post, Thomas looks at how this growing trend in scientific research affects theoretical testing in the quantitative trading community.
The basic point Thomas makes is that without the exact data and algorithms behind a trading system's published results, the system itself has little value: it is nearly impossible to reproduce the results without knowing exactly how they were obtained.
He describes the problem very concisely:
There are many articles describing strategies that seem to work very well on paper, but without access to the code and data used to produce those strategies, it is very difficult to confirm their validity.
He then goes on to describe a recent example where his community encountered this exact problem:
I came across a paper claiming that Google Search trends for certain queries (e.g. the word “debt”) are predictive of market movements. According to the paper, this ability to anticipate market movements led to a trading strategy that yielded a whopping 326% over the course of 7 years.
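The rule behind that headline number is straightforward: compare each week's search volume for the query against its average over the preceding few weeks, short the market for the following week when interest is rising, and go long when it is falling. Here is a minimal pandas sketch of that kind of rule (the function name, the three-week look-back, and the assumption that both series are weekly and share a date index are mine, not the paper's):

```python
import numpy as np
import pandas as pd

def trends_strategy(search_volume: pd.Series, index_close: pd.Series,
                    window: int = 3) -> pd.Series:
    """Cumulative return of a weekly long/short rule of the kind
    described in the paper: short after rising search interest,
    long after falling interest."""
    # Average search volume over the preceding `window` weeks
    trailing_avg = search_volume.rolling(window).mean().shift(1)
    # +1 = long, -1 = short; NaN during the warm-up weeks
    position = np.sign(trailing_avg - search_volume)
    # Return realized over the week after the signal is observed
    next_week_return = index_close.pct_change().shift(-1)
    strategy_return = (position * next_week_return).dropna()
    return (1.0 + strategy_return).cumprod() - 1.0
```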
Not surprisingly, his community of researchers had a hard time reproducing those theoretical testing results:
Unfortunately, although the algorithm was easy to program we were getting nowhere close to the original 326% returns achieved in the paper. Was there an error in our programming somewhere? Or did the original paper contain a bug?
Since this is the Internet, and I did not know who the original author was, at this point I was ready to discredit the results and move on. Interestingly, Thomas and his community kept trying to match the original results. They actually reached out to the original author, Tobias Preis:
He responded to our outreach and provided the data that was used in the publication along with a script to reproduce the results. Over the next couple of weeks our community worked tirelessly to iron out the bugs we found by comparing against the reference implementation.
However, even after making sure the algorithm was working identically, we still were not able to exactly reproduce the results from the paper.
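Debugging work like that usually comes down to lining the two implementations up week by week and finding the first point at which they disagree. A sketch of how such a comparison might be organized (the function name and tolerance are my illustration, not something from the Quantopian thread):

```python
import pandas as pd

def first_divergence(ours: pd.Series, reference: pd.Series,
                     tol: float = 1e-9):
    """Return the first date at which two weekly return series
    disagree by more than `tol`, or None if they match."""
    # Align on common dates first; off-by-one date handling is a
    # classic source of silent disagreement between implementations.
    ours, reference = ours.align(reference, join="inner")
    diff = (ours - reference).abs()
    mismatches = diff[diff > tol]
    return None if mismatches.empty else mismatches.index[0]
```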
Once again, at this point I was ready to discredit the author and move on. However, Thomas and his community pushed forward. They eventually found that the differing results came down to different data:
As it turns out, in 2011 Google changed the data format in a way that degrades the signal. The paper was based on the higher quality data downloaded prior to this change. Once we plugged the original data into our algorithm, we were finally able to reproduce the results from the paper.
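This is a useful reminder that even an identical algorithm can flip its long/short signal on many weeks when the input series changes. A hypothetical check along those lines, reusing the signal rule sketched above (`old` and `new` stand in for pre- and post-change downloads of the same query; the comparison is my illustration, not something from the post):

```python
import numpy as np
import pandas as pd

def signal_flip_rate(old: pd.Series, new: pd.Series,
                     window: int = 3) -> float:
    """Fraction of weeks where the long/short signal differs
    between two versions of the same search-volume series."""
    def signal(volume: pd.Series) -> pd.Series:
        trailing_avg = volume.rolling(window).mean().shift(1)
        return np.sign(trailing_avg - volume)
    a, b = signal(old).align(signal(new), join="inner")
    paired = pd.concat([a, b], axis=1).dropna()
    return float((paired.iloc[:, 0] != paired.iloc[:, 1]).mean())
```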
Thomas points out that if the original author had not been so willing to have his work completely dissected, they never would have been able to figure out why they were getting such different results. He concludes by addressing the key point of reproducibility:
Reproducibility is not only what “keeps us honest” as scientists – it is also a key step in the iterative process of developing ideas and building on the work of peers. If journals required the publication of all software and raw data used in research papers, reproducing results would be far simpler and the process of innovation would only accelerate.
Having all of the source data and code is the easiest way to reproduce quantitative strategies, and the faster we can reproduce them, the more time we can spend improving them.