Friday, April 27, 2007

Stock Market Analysis Tools

For the last five months (basically since I finished my Master's thesis), my stock market prediction software has been on hold. I was making great progress on it and thought that I had finally cracked the problem last September. However, after running a real-time trial that essentially paper traded all the stocks in the S&P 100 on a daily basis, my theoretical prediction accuracy of 60-70% turned out to be a bit of a wash. In short, after three months I had neither made nor lost money. But I did spend about 1000 dollars a day in trading fees (err... virtual dollars, that is).

The upper line is my theoretical profit on the stock CAT using predictions; the lower line is the value without predictions. The triangles are predicted short sells, while the circles are predicted long buys. See why I was excited?

I racked my brain and pored over my code. After a month of searching through the code and comparing the real-time predictions to the simulated predictions, my conclusion is that my framework for building and testing my learning models is sound. However, the data that I was using to train the models was not. Tracing it down even further, I began to realize that the functions I had written to compute the technical analysis indicators (see Note 1) were incorrectly programmed and introduced inconsistencies into the data. Thus the data for the training sets was inconsistent with the live data... my models were learning Portuguese while the stock market was speaking Spanish.

Not wishing to rewrite all those functions, I set out looking for some open source software to compute the technical indicators. Amazingly, I could not find a single package for Matlab, Python, Java, or any other language that did such a simple task. I put it on the back burner and pursued other things for a while.

But all is not lost! Yesterday, while searching through packages for the statistical computing language R, I finally found one that did it! So now the task is for me to write some functions to take my data, use the preexisting functions in Rmetrics to transform it, and then run it back through my prediction engine... and start another real-time trial! The website where this cool package can be found is Rmetrics.

Enjoy! I know you all are just dying to try out both R and Rmetrics!

Note 1 - Technical analysis indicators are nothing more than transformations of the price data, many of them non-linear, which are meant to bring out certain aspects of the price. For instance, a 200-day moving average indicator attempts to show a trend in the data using a 200-day smoothing transformation (actually just an average, so one of the simpler, linear ones). For SVMs (the basis of my prediction algorithm), non-linear transforms are like glasses... they really bring clarity to an otherwise very fuzzy data set.
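
To make the moving average in Note 1 concrete, here is a minimal Python sketch. It is only an illustration, not the code from the prediction software described in this post and not the Rmetrics implementation. The function name `simple_moving_average` and the sample prices are made up for the example, and a 3-day window stands in for the 200-day one to keep the output short.

```python
from collections import deque


def simple_moving_average(prices, window):
    """Return the trailing `window`-period average for each price.

    Positions that do not yet have `window` prices behind them yield None,
    so the output lines up index-for-index with the input.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    recent = deque()      # the prices currently inside the window
    running_sum = 0.0
    averages = []
    for price in prices:
        recent.append(price)
        running_sum += price
        if len(recent) > window:
            # Drop the oldest price once the window is over-full.
            running_sum -= recent.popleft()
        averages.append(running_sum / window if len(recent) == window else None)
    return averages


# Made-up closing prices, purely for illustration (not real CAT data).
closes = [10.0, 11.0, 12.0, 13.0, 14.0]
sma3 = simple_moving_average(closes, window=3)
print(sma3)  # [None, None, 11.0, 12.0, 13.0]
```

Because each value uses only the prices up to that day, the same function gives the same numbers whether it is run over historical data or fed one new price at a time, which is the kind of consistency between training data and live data that matters here.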

2 comments:

Nathan said...

That is exciting that you found a pre-built set of tools for your program and that they are open source. The only challenge will be if and when you decide to commercialize it: you will have to look at the licensing terms for using open source code in your software. I am looking forward to your analysis of how the new algorithms help your program.

Ray said...

Theo,

This is indeed exciting. I am glad you found another possibility to try out and more importantly, you are persisting and not giving up. Looking back on many of my projects, I gave up when it got hard and that is no way to innovate!

Let me know if you need any help with database stuff.

Ray