Category Archives: Quandl

Intra-day data from Quandl and a new tick database in town – party time!

(with permission from Quandl)

Quandl will soon be offering intra-day data (1-minute bars). Rock on!

I was kindly given some data to test out (see below). I can’t say much more than this, but keep an eye out for an official announcement soon 🙂

With both QuantGo and Quandl offering reasonably priced intra-day data, smaller trading shops have never had it so good.

I’ve been involved in the integration (i.e. messaging) of trading systems since my days working at IBM nearly 10 years ago. My current research centres on the development of predictive models for trading systems, looking at different data sources to feed into my models. In today’s world data streams in from all directions, so the successful integration of data is vitally important.

Kerf is a new columnar tick database and time-series language designed especially for large volumes of numeric data. It is written in C and natively speaks JSON and SQL. Kerf has support for both real-time and historical databases and it provides data loaders for all the common Quandl formats, so you can run queries against Quandl data right away.

I’ve taken Kerf for a quick spin. Via its Foreign Function Interface (FFI), which makes it possible to call out to C libraries from Kerf and to call into Kerf from C, I can query the data from R very easily. With ZeroMQ for my messaging, it could stack up to be a seriously good trading or analytics system.

I really like what I’ve seen so far. Is this a true competitor to Kx‘s kdb+ database? Possibly. You should check it out for sure!


Here is the output from a simple request/response using S&P 500 One Minute Bars from Quandl. A C application acts as the handler for Kerf requests: it calls into Kerf and responds to the client with the data. An R script requests the closing prices for AAPL, but C++, Java or Python could be used just as easily. The messages are in JSON format and are then transformed to xts in R. Messaging uses ZeroMQ.

R (rzmq) -> C (libzmq) -> kerf (FFI) -> C (libzmq) -> R (rzmq)
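To make the transformation step concrete, here is a minimal sketch (in Python, for illustration) of decoding such a JSON reply into a time series. The field names (`symbol`, `times`, `close`) are my own illustrative assumptions, not Kerf’s actual wire format:

```python
import json

# Hypothetical JSON reply from the Kerf handler -- the field names are
# illustrative assumptions, not Kerf's actual output format.
reply = json.dumps({
    "symbol": "AAPL",
    "times": ["2016-01-04 09:31:00", "2016-01-04 09:32:00"],
    "close": [105.35, 105.41],
})

def to_series(raw):
    """Decode the JSON reply into a list of (timestamp, close) pairs,
    analogous to building an xts object on the R side."""
    msg = json.loads(raw)
    return list(zip(msg["times"], msg["close"]))

series = to_series(reply)
print(series[0])  # ('2016-01-04 09:31:00', 105.35)
```

The same decode step applies whatever language sits on the client end of the ZeroMQ socket.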

I’ve decided to further develop the R API and to start integrating my real-time Interactive Brokers feed into Kerf (e.g. Java API and ZeroMQ). If anyone is interested in this, please get in touch.

kerf_handler_jpg

rclient_kerf_jpg

 

Strategy Replication – Evolutionary Optimization based on Financial Sentiment Data

Wow, I enjoyed replicating this neatly written paper by Ronald Hochreiter.
Ronald is an Assistant Professor at the Vienna University of Economics and Business (Institute for Statistics and Mathematics).

In his paper he applies evolutionary optimization techniques to compute optimal rule-based trading strategies based on financial sentiment data.

The evolutionary technique is a general Genetic Algorithm (GA).

The GA is a mathematical optimization algorithm drawing inspiration from the processes of biological evolution to breed solutions to problems. Each member of the population (genotype) encodes a solution (phenotype) to the problem. Evolution in the population of encodings is simulated by means of evolutionary processes: selection, crossover and mutation.
Selection exploits information in the current population, concentrating interest on high-fitness solutions. Crossover and mutation perturb these solutions in an attempt to uncover better solutions. Mutation does this by introducing new gene values into the population, while crossover allows the recombination of fragments of existing solutions to create new ones.
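As an illustrative sketch (in Python, not Ronald’s implementation), here is a minimal GA over bit-strings showing all three operators, with fitness simply counting ones:

```python
import random

def evolve(length=20, pop_size=30, generations=60, seed=1):
    """Minimal genetic algorithm on bit-strings.
    Fitness = number of ones; selection, crossover and mutation
    follow the textbook scheme described above."""
    rng = random.Random(seed)
    fitness = lambda g: sum(g)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: tournament of two, keep the fitter genotype.
        def pick():
            a, b = rng.sample(pop, 2)
            return a if fitness(a) >= fitness(b) else b
        nxt = []
        while len(nxt) < pop_size:
            p1, p2 = pick(), pick()
            # Crossover: recombine fragments of two parents at a random cut.
            cut = rng.randrange(1, length)
            child = p1[:cut] + p2[cut:]
            # Mutation: flip each gene with small probability.
            child = [g ^ 1 if rng.random() < 0.02 else g for g in child]
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)

best = evolve()
print(sum(best))  # close to the optimum of 20 after 60 generations
```

In the paper’s setting the genotype would instead encode the parameters of a sentiment-based trading rule and fitness would be a backtest statistic, but the operator loop is the same.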

After reading Ronald’s paper I immediately wanted to test the hypothesis that the model is good at predicting the 1-day-ahead direction of returns. For example, when the rule says to go long, are the next-day returns positive, and when the rule says to exit the long position or stay flat, are the next-day returns negative? The results are not much better than a flip of a coin (see the results in the attachment below). Also, turnover is high (see the plot), which may render the strategy useless.
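The directional check boils down to comparing each day’s rule with the sign of the next day’s return. A small sketch (the signals and returns below are made-up illustrative numbers, not the paper’s data):

```python
# Hypothetical daily signals (1 = long, 0 = flat) and next-day returns.
signals = [1, 1, 0, 1, 0, 0, 1, 0]
next_day_returns = [0.004, -0.002, -0.001, 0.003, 0.002, -0.005, -0.001, 0.001]

# A "hit" is a long signal followed by a positive return,
# or a flat signal followed by a negative return.
hits = sum(
    (s == 1 and r > 0) or (s == 0 and r < 0)
    for s, r in zip(signals, next_day_returns)
)
hit_rate = hits / len(signals)
print(hit_rate)  # 0.5 -- exactly a coin flip on this toy sample
```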

However, many variations on this genetic algorithm exist; different selection and mutation operators could be tested and a crossover operator could be added. Instead of financial sentiment data, a variety of technical indicators could be used to generate an optimal trading rule – see, e.g., “Evolving Trading Rule-Based Policies”.

I emailed Ronald to get clarification regarding several questions I had. He kindly and swiftly responded with appropriate answers.

  • No crossover is used as the chromosome is too short.
  • The target return for the Markowitz portfolio is calculated as the mean of the scenario means, i.e. mean of the mean vector.
  • Pyramiding is not considered. The rule just checks whether we are invested (long) in the asset or not.
  • A maximum number of iterations is specified as the stopping rule.

Like my earlier post, End-of-Day (EOD) stock prices are sourced from QuoteMedia through Quandl’s premium subscription and the StockTwits data is sourced from PsychSignal.

The following comparisons and portfolios were constructed:

  1. In-Sample single stock results – Long-only buy-and-hold strategy vs. Optimal rule-based trading strategy
  2. Out-of-Sample Buy-and-hold optimal Markowitz portfolio
  3. Out-of-Sample Buy-and-hold 1-over-N portfolio
  4. Out-of-Sample Equally weighted portfolio of the single investment evolutionary strategies

I used R packages quadprog and PerformanceAnalytics but I wrote my own Genetic Algorithm. I’ll continue using this algorithm to evaluate other indicators, signals and rules  🙂

Here’s some code with the results. The evolutionary risk metrics (pg. 11) are not as good as those in the original paper (I used 100 generations for my GA) but as you can see, my output is almost identical to Ronald’s. A clone perhaps – hehe.

replicatoR_mendel_ronh


If you have a specific paper or strategy that you would like replicated, either for viewing publicly or for private use, please contact me.

A Trendy and Innovative Investment Strategy

“It has been a cruel summer for one of the trendiest, most innovative investment strategies of the asset management industry”.

This quote was taken from an article last week in the FT.

• Title:  Risk parity funds suffer a cruel summer
• Author: Robin Wigglesworth
• Source: Financial Times (FT)
http://www.ft.com/cms/s/0/d210373e-5142-11e5-8642-453585f2cfcd.html?siteedition=intl#axzz3kbVfJfnX

Here I will demonstrate how a risk parity portfolio can be calculated quite easily using R.

By definition, a risk parity portfolio is one for which all percentage contributions to risk (PCTR) are equal. Equivalently, the total contributions to risk (TCTR) are all equal as well.

Using some basic calculus, it can be shown that the volatility of a portfolio can be decomposed as the weighted sum of each asset’s marginal contribution to risk (MCTR). MCTR tells us the impact of an infinitesimal increase in an asset’s weight on the total portfolio risk. With each MCTR known, each asset’s PCTR can be derived.

In general, a risk parity portfolio must be solved for numerically, for example with Newton’s method.
The setup for the Newton method requires writing the risk parity problem as a system of nonlinear equations (imposing the restriction that the weights sum to one), finding the Jacobian (using multivariable calculus) and taking a one-term Taylor expansion approximation. The Newton iteration can then be formulated by defining a stopping rule and a starting point.
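For illustration, here is a simpler alternative to the Newton scheme: a damped fixed-point iteration on the same equal-risk-contribution condition w_i(Σw)_i = constant (the covariance values are made up):

```python
import math

# Illustrative 3-asset covariance matrix (made-up numbers).
cov = [[0.040, 0.010, 0.000],
       [0.010, 0.090, 0.020],
       [0.000, 0.020, 0.160]]

def risk_parity(cov, iters=1000):
    """Damped fixed-point iteration: w_i <- sqrt(w_i / (Σw)_i), renormalized.
    At the fixed point every asset's risk contribution w_i * (Σw)_i is equal.
    (A simpler sketch than the Newton scheme described above.)"""
    n = len(cov)
    w = [1.0 / n] * n
    for _ in range(iters):
        sigma_w = [sum(cov[i][j] * w[j] for j in range(n)) for i in range(n)]
        w = [math.sqrt(w[i] / sigma_w[i]) for i in range(n)]
        s = sum(w)
        w = [x / s for x in w]
    return w

w = risk_parity(cov)
n = len(cov)
sigma_w = [sum(cov[i][j] * w[j] for j in range(n)) for i in range(n)]
rc = [w[i] * sigma_w[i] for i in range(n)]  # total risk contributions
print([round(x, 4) for x in w])
```

For a diagonal covariance this converges in one step to weights proportional to 1/σ_i, which is the textbook uncorrelated risk parity solution.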

For reference, see “Efficient Algorithms for Computing Risk Parity Portfolio Weights”

So, using simple arithmetic and matrix operations, I demonstrate how this can be implemented. My example portfolio consists of 9 liquid stocks, with end-of-day (EOD) prices sourced from Quandl.

Risk_Parity_Newton

As we can see in the attachment, each asset’s PCTR is 11.11%.
Now, is that trendy and innovative or what?

Thanks to Doug Martin for his lectures on portfolio optimization.

Tap into the Pulse of the Markets

My first post 🙂

So to begin, here is a strategy I created recently, combining sentiment data with technical indicators, and using a machine learning classification technique named Support Vector Machines (SVM).

End-of-Day (EOD) U.S. stock prices are sourced from QuoteMedia through Quandl’s premium subscription.
The chosen source of sentiment is from StockTwits message posts that have been aggregated and scored by PsychSignal. This data too can be obtained through Quandl.

Tap_into_the_Pulse_of_the_Markets