Category Archives: Optimization

Strategy Replication – Evolutionary Optimization based on Financial Sentiment Data

Wow, I enjoyed replicating this neatly written paper by Ronald Hochreiter.
Ronald is an Assistant Professor at the Vienna University of Economics and Business (Institute for Statistics and Mathematics).

In his paper he applies evolutionary optimization techniques to compute optimal rule-based trading strategies based on financial sentiment data.

The evolutionary technique is a general Genetic Algorithm (GA).

The GA is a mathematical optimization algorithm drawing inspiration from the processes of biological evolution to breed solutions to problems. Each member of the population (genotype) encodes a solution (phenotype) to the problem. Evolution in the population of encodings is simulated by means of evolutionary processes; selection, crossover and mutation.
Selection exploits information in the current population, concentrating interest on high-fitness solutions. Crossover and mutation perturb these solutions in an attempt to uncover better solutions. Mutation does this by introducing new gene values into the population, while crossover allows the recombination of fragments of existing solutions to create new ones.

After reading Ronald’s paper I immediately wanted to test the hypothesis that the model is good at predicting the 1-day ahead direction of returns. For example, when the rule determines to go long, are the next day returns positive and when the rule determines to exit the long position or stay flat, are the next day returns negative. The results are not much better than a flip of a coin (see the results in the attachment below). Also, turnover is high (see the plot) which may warrant the strategy useless.

However, many variations on this genetic algorithm exist; different selection and mutation operators could be tested and a crossover operator could be added. Instead of using financial sentiment data a variety of technical indicators could be used for generating an optimal trading rule – e.g. see “Evolving Trading Rule-Based Policies“.

I emailed Ronald to get clarification regarding several questions I had. He kindly and swiftly responded with appropriate answers.

  • No crossover is used as the chromosome is too short.
  • The target return for the Markowitz portfolio is calculated as the mean of the scenario means, i.e. mean of the mean vector.
  • Pyramiding is not considered. The rule just checks whether we are invested (long) in the asset or not.
  • A maximum number of iterations is specified as the stopping rule.

Like my earlier post, End-of-Day (EOD) stock prices are sourced from QuoteMedia through Quandl’s premium subscription and the StockTwits data is sourced from PsychSignal.

The following comparisons and portfolios were constructed:

  1. In-Sample single stock results – Long-only buy-and-hold strategy vs. Optimal rule-based trading strategy
  2. Out-of-Sample Buy-and-hold optimal Markowitz portfolio
  3. Out-of-Sample Buy-and-hold 1-over-N portfolio
  4. Out-of-Sample Equally weighted portfolio of the single investment evolutionary strategies

I used R packages quadprog and PerformanceAnalytics but I wrote my own Genetic Algorithm. I’ll continue using this algorithm to evaluate other indicatorssignals and rules  🙂

Here’s some code with the results. The evolutionary risk metrics (pg. 11) are not as good as those in the original paper (I used 100 generations for my GA) but as you can see, my output is almost identical to Ronald’s. A clone perhaps – hehe.

replicatoR_mendel_ronh


If you have a specific paper or strategy that you would like replicated, either for viewing publically or for private use, please contact me.

Risk Budgeted Portfolios with an Evolutionary Algorithm

In my previous post I demonstrated how a risk parity portfolio could be calculated numerically using a Newton method. However, the more common problem is to be able to set risk budget constraints as opposed to having equal risk. We can do this with PCTR constrained optimization by setting PCTR inequality constraints on portfolio assets, or groups of assets. This results in an optimization problem with non-linear inequality constraints. As such, Linear or Quadratic programming won’t work and a more powerful optimization technique is needed. Differential Evolution (DE) is one such solution.

DE is a search heuristic introduced by Kenneth Price and Rainer Storn. It’s a very simple yet powerful population based, stochastic function minimizer that has been shown to be effective for global optimization of multidimensional, nonlinear, multimodal, and highly-constrained functions. It does not require that the function be either continuous or differentiable.

DE belongs to the class of evolutionary algorithms which use biology-inspired operations of crossover, mutation, and selection on a population in order to minimize an objective function over the course of successive generations. As with other evolutionary algorithms, DE solves optimization problems by evolving a population of candidate solutions using alteration and selection operators. DE uses floating-point instead of bit-string encoding of population members, and arithmetic operations instead of logical operations in mutation.

The DEoptim function in the R package with the same name performs evolutionary global optimization via the Differential Evolution algorithm.

DEoptim can be used to optimize a risk parity portfolio also. We can apply the Gini coefficient as a measure of portfolio concentration by computing risk contribution inequality (or risk contribution concentration). The Gini coefficient has an advantage over the more popular Herfindahl–Hirschman Index (HHI) measure of portfolio concentration in that it is always unitized between 0 and 1. For reference, see “Efficient Algorithms for Computing Risk Parity Portfolio Weights

First I demonstrate PCTR constrained optimization by setting PCTR inequality constraints on the portfolio assets I used in my previous post. Specifically, I set the upper threshold for each PCTR to be 20%.

library(parallel)
library(PerformanceAnalytics)
library(DEoptim)

# N assets
n <- ncol(returns) 

portfolio_risk_budget_obj <- function(w, r, upper_pctr_threshold) {
    
    # Instead of adding a penalty to the full investment constraint, ensure the 
    # weights sum to 1. (see Ardia, 2011, Differential Evolution with DEoptim)
    if (sum(w) == 0) {
        w <- w + 0.01
    }
    w <- w/sum(w)
    
    # Calculate the component contribution to standard deviation of the portfolio
    component_contrib <- StdDev(r, portfolio_method = "component", weights = w)
    
    # ###########################
    # Now compute the penalties #
    # ###########################
    
    # Penalty for portfolio volatility
    port_risk_penalty <- 200 * as.numeric(component_contrib[[1]])
    
    # Assets above the upper PCTR threshold
    assets_above_pctr_threshold <- component_contrib[[3]][component_contrib[[3]] > 
                                                              upper_pctr_threshold]
    
    # Penalty for risk budget (upper)
    risk_budget_penalty <- sum(assets_above_pctr_threshold)
    
    # Objective function to minimize
    obj <- port_risk_penalty + risk_budget_penalty
    
    return(obj)
}

# ##############################################################################
# Perform evolutionary global optimization via the 
# Differential Evolution algorithm.
# ##############################################################################

# Set the seed of the random number generator before calling DEoptim because DE 
# is a randomized algorithm, and the results may vary between runs. This allows
# the exact reproduction of results.
set.seed(1)

# Lower and Upper values for the weights (long-only)
lower <- rep(0, n)
upper <- rep(1, n)

# Number of population members
NP <- 20 * n

# Optimize
out <- DEoptim(fn = portfolio_risk_budget_obj, # the function to be optimized
               lower = lower, # lower bounds on each parameter to be optimized
               upper = upper, # upper bounds on each parameter to be optimized
               control = list(
                   itermax = 200, # maximum iteration (population generation) allowed.
                   NP = NP,       # number of population members
                   strategy = 1,  # DE / rand / 1 / bin (classical strategy)
                   storepopfrom = 1, # store intermediate populations from 1st gen
                   parallelType = 1, # use all available cores
                   trace = FALSE,    # do not print out progress at each iteration
                   packages = c("PerformanceAnalytics")), # load package for fn
               # two arguments to be passed to fn portfolio_risk_obj
               r = returns, 
               upper_pctr_threshold = 0.20)

# The best set of parameters (i.e. weights) found.
(w <- out$optim$bestmem / sum(out$optim$bestmem))
##        par1        par2        par3        par4        par5        par6 
## 0.002915525 0.244992874 0.046066171 0.056544851 0.165894008 0.145604661 
##        par7        par8        par9 
## 0.119851233 0.007811000 0.210319676
component_contrib <- StdDev(returns, portfolio_method = "component", weights = w)

# Display each asset's PCTR
round(component_contrib[[3]], 4)
##   AAPL      T    BAC   CSCO      F     GE   INTC   MSFT    PFE 
## 0.0032 0.1994 0.0517 0.0628 0.1785 0.1599 0.1381 0.0098 0.1968

As is shown above, each asset’s PCTR is constrained to be below 20%.

A risk parity portfolio using the Gini coefficient is now demonstrated.

portfolio_risk_parity_obj <- function(w, r, n) {
    # Instead of adding a penalty to the full investment constraint, ensure the 
    # weights sum to 1. (see Ardia, 2011, Differential Evolution with DEoptim)
    if (sum(w) == 0) {
        w <- w + 0.01
    }
    w <- w/sum(w)
    
    # Calculate the component contribution to standard deviation of the portfolio
    component_contrib <- StdDev(r, portfolio_method = "component", weights = w)
    
    # #####################################
    # Now compute the "Gini" coefficient  #
    # #####################################
    
    # First sort the PCTR in non-decreasing order
    risk_contributions <- sort(component_contrib[[3]])
    
    # G = 2/N * SUM(i * (PCTR_i - mean(PCTR)))
    gini <- 0
    for(i in 1:length(risk_contributions)) {
        gini <- gini + (i * (risk_contributions[i] - mean(risk_contributions)))
    }
    gini <- 2/n * gini
    
    return(gini)
}

set.seed(1)

# Optimize
out <- DEoptim(fn = portfolio_risk_parity_obj, # the function to be optimized
               lower = lower, # lower bounds on each parameter to be optimized
               upper = upper, # upper bounds on each parameter to be optimized
               control = list(
                   itermax = 200, # maximum iteration (population generation) allowed.
                   NP = NP,       # number of population members
                   strategy = 1,  # DE / rand / 1 / bin (classical strategy)
                   storepopfrom = 1, # store intermediate populations from 1st gen
                   parallelType = 1, # use all available cores
                   trace = FALSE,    # do not print out progress at each iteration
                   packages = c("PerformanceAnalytics")), # load package for fn
               # two arguments to be passed to fn portfolio_risk_obj
               r = returns, 
               n = n)

# The best set of parameters (i.e. weights) found.
(w <- out$optim$bestmem/sum(out$optim$bestmem))
##       par1       par2       par3       par4       par5       par6 
## 0.09361924 0.15820304 0.09990865 0.09989595 0.11490732 0.11324291 
##       par7       par8       par9 
## 0.10180668 0.08351423 0.13490198
component_contrib <- StdDev(returns, portfolio_method = "component", weights = w)

# Display each asset's PCTR
round(component_contrib[[3]], 4)
##   AAPL      T    BAC   CSCO      F     GE   INTC   MSFT    PFE 
## 0.1118 0.1112 0.1115 0.1102 0.1105 0.1113 0.1113 0.1108 0.1114

As is shown above, each asset’s PCTR is 11.11% (i.e. 1/n)