Evolving investment strategies using genetic programming

Back dated: 2013-07-28

This is AI applied in Computational Finance...

May/June of 2013 a friend of mine and I worked on an AI varsity project together. The aim of the project was to either evolve solutions to soduku puzzles using an evolutionary algorithm of our choice, or to evolve decision trees using evolutionary programming.

We made the choice to evolve decision trees that could be used to decide, given a set of financial data for a trading stock, whether to short or long that stock. Using genetic programming we evolved the tree using real stock data from 62 IT stocks off the S&P 500.

The study

We did a comparitive study between the performance of trees evolved using financial fundamental analysis against technical analysis as well as the average movement of the market (for those stocks) of the year 2011.

Sample Decision Tree (shown using various fundamental analysis indicators)

Sample Decision Tree

First we're going to look at the problem we are attempting to solve, Securities Analysis, then a brief overview of Genetic Programming, and lastly some technical words about specific implementation details.

Securities Analysis

Security analysis is the branch of Investment Management that deals with the analysis of tradable financial instruments called securities. Many techniques of security analysis exist, the most popular of which are Technical Analysis and Fundamental Analysis. Technical Analysis uses historic market activity including past prices and volumes traded to evaluate securities.

Fundamental analysis uses core business and economic metrics relating to the company and
the broader economy to evaluate the security. Both strategies have the end goal of determining whether a security should be bought or sold, meaning, whether to take a long or short position on the stock.

Financial indicators

We used financial indicators from Fundamental Analysis and Technical Analysis as the decision criteria (inner nodes) of the decision tree. Giving a decision tree a particular stocks indicator values one is able to traverse the tree moving left or right at each decision criteria node, and finally reaching a leaf node which represents one of 2 decisions, a SHORT or LONG position to be taken on the stock.

Genetic Programming

Genetic Programming is the computational evolution of tree data structures, in our case, decision trees. Genetic Programming forms part of the family of Evolutionary Algorithms (EA's), but where typical EA's are concerned with the evolution of solutions to the objective problem (these solutions are called genotypes), we are concerned with the evolution of behaviour of a solution these solutions are called phenotypes).

Read up a little on genetic programming and evolutionary algorithms should you not be familiar with the topics, they're absolutely fascinating.

Fitness function

As our fitness function we evaluated a tree by giving it each stock's set of financial indicators (either fundamental or technical) and evaluated the trees performance (how much money it made) on historical stock data for the year 2011 for all 62 stocks (stock data acquired from google finance and morning star).

We then in addition to the return on investment, rewarded the fitness of a tree if it was small (reward = 0.2/size(tree)), this had the effect of evolving trees that were not only accurate investment strategies but also memory efficient, comprehensible decision trees that may be themselves analyzed to identify relationships between investment outcome and the financial indicators present in a tree.

Reproduction operators

Crossover operators

We used 2 crossover strategies and compared the results of each:

  1. Sexual One Point Crossover: given 2 parent trees, one sub branch of each is randomly selected and swapped, producing 2 new offspring, each containing new genetic material.
  2. Coarse Grained Sexual One Point Crossover: due to the destructive nature of selecting sub trees that may be a severely different depth levels in the tree therefore swapping large chunks of genetic material for much smaller trunks, we found the above strategy to be rather destructive. To control this problem we proposed swapping the left branch of the first parent, with the left branch of the second. By doing so, on average, the amount of genetic material transferred was kept relatively constant.
Mutation Operators

We made use of many mutation operators:

  1. Non-terminal node mutation: one non-terminal node from each tree (residing in one of the last 2 depths before the leafs) is randomly selected and is swapped. This can be seen as a mild version of Sexual One Point Crossover.
  2. Terminal node mutation: one terminal node from each tree is randomly selected and is swapped.
  3. Indicator mutation: a non-terminal node is randomly selected and it's financial indicator replaced with a randomly selected financial indicator from the set of all financial indicators for the tree type (Fundamental / Technical).
  4. Gaussian mutation: the floating point value of a non-terminal node is mutated according to a gaussian distribution of mean 0 and deviation of 10% of the domain of that node's indicator.
  5. Grow mutation: a randomly selected terminal node is replaced with a randomly generated non-terminal node, which contains in itself 2 opposing decisions (SHORT / LONG).
  6. Trunc mutation: a randomly selected non-terminal node appearing just before a terminal node in a traversal is replaced with a randomly selected terminal node.

Average Market Trend for 62 IT Stocks Analyzed

Average Market Trend for 62 IT Stocks Analyzed

Average Trend against Decision Trees evolved using Fundamental Analysis (FA) and Technical Analysis (TA) for Quarters 1-4 of 2013

Performance comparison

Tabulated data of above comparison graph

Performance comparison tabular data

Conclusion

This article really targets a niche of people interested in artificial intelligence (more specifically, computational intelligence) and well as computational finance. If you enjoyed this read please take a look at the full extent of our research, this article was really just a taste of the depth of the reports. The code base is also very well structured and is easily adaptable/extendible.

In addition to the below links, check out Quantopian, the first online algorithmic trading platform; and a great TED talk by Kevin Slavin about how algorithms are changing our world: [TED (http://www.ted.com/talks/kevinslavinhowalgorithmsshapeourworld.html).

Interested in this article? Find out more

  1. Mirror blog post: Technical Analysis
  2. Research reports: Fundamental Analysis
    & Technical Analysis
  3. Code base: github repo

Simon van Dyk

Read more posts by this author.

Johannesburg, South Africa