Quantitative Trading

A field of trading systems that facilitate the automation of trading to profit from market inefficiencies.

Author: Punit Manjani
Punit Manjani is a highly skilled professional with experience in VC, contributing to strategic investments, market research, and deal sourcing. Currently, Punit works at Loka Capital, demonstrating expertise in financial modeling, due diligence, and market research. Known for negotiation and leadership prowess, Punit has a proven track record of successful leadership and entrepreneurial endeavors.
Reviewed By: Christy Grimste
Real Estate | Investment Property Sales

Christy currently works as a senior associate for EdR Trust, a publicly traded multi-family REIT. Prior to joining EdR Trust, Christy worked for CBRE in investment property sales. Before completing her MBA and breaking into finance, Christy founded an education startup, which she actively pursued for seven years, and worked as an internal auditor for the U.S. Department of State and CIA.

Christy has a Bachelor of Arts from the University of Maryland and a Master of Business Administration from the University of London.

Last Updated: October 20, 2023

What Is Quantitative Trading?

Quantitative trading is a field of trading systems that uses algorithms and complex mathematical formulations to automate trading (buy and sell) signals. A computer can be used in various ways to find profitable trades.

These algorithms can be divided into two categories:

  1. Machine learning algorithms
  2. High-frequency trading arbitrage strategies

Mathematicians and statisticians come up with various alphas or financial models, backtest them over historical data to generate output, and analyze the resulting strategy.

Some commonly used metrics to understand the efficiency of a strategy are:

  • Sharpe ratio
  • Drawdown
  • Profit factor
  • Percentage of profitable trades
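To make these four metrics concrete, here is a minimal sketch in plain Python. The trade results and equity curve are hypothetical numbers chosen for illustration, and the function names, the 252-period annualization, and the zero risk-free rate are assumptions rather than a standard taken from any particular platform.

```python
import math
import statistics


def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized Sharpe ratio from per-period returns."""
    excess = [r - risk_free for r in returns]
    stdev = statistics.stdev(excess)
    if stdev == 0:
        return float("nan")
    return statistics.mean(excess) / stdev * math.sqrt(periods_per_year)


def max_drawdown(equity_curve):
    """Largest peak-to-trough decline, as a fraction of the peak."""
    peak = equity_curve[0]
    worst = 0.0
    for value in equity_curve:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def profit_factor(trade_pnls):
    """Gross profit divided by gross loss."""
    gains = sum(p for p in trade_pnls if p > 0)
    losses = -sum(p for p in trade_pnls if p < 0)
    return gains / losses if losses > 0 else float("inf")


def win_rate(trade_pnls):
    """Fraction of trades that closed with a profit."""
    return sum(1 for p in trade_pnls if p > 0) / len(trade_pnls)


# Hypothetical backtest output, for illustration only.
trade_pnls = [120.0, -50.0, 80.0, -30.0, 60.0]
equity_curve = [1000.0, 1120.0, 1070.0, 1150.0, 1120.0, 1180.0]
returns = [b / a - 1 for a, b in zip(equity_curve, equity_curve[1:])]

print("Sharpe ratio:", round(sharpe_ratio(returns), 2))
print("Max drawdown:", round(max_drawdown(equity_curve), 4))
print("Profit factor:", profit_factor(trade_pnls))
print("Win rate:", win_rate(trade_pnls))
```

A profit factor above 1 means gross profits exceed gross losses, while drawdown shows how much of the account was given back along the way, so the metrics are usually read together rather than in isolation.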

To backtest a strategy, various platforms and software are used. Backtesting platforms are software that helps with backtesting simple strategies, portfolio optimization, futures testing, etc.

This way, an algorithm's hyperparameters can be tuned to improve profitability. Here are some of the most commonly used backtesting platforms:

  1. TradingView
  2. MetaTrader 5
  3. AmiBroker
  4. MetaStock
  5. NinjaTrader
  6. MATLAB
  7. Python
  8. Excel

Key Takeaways

  • Quantitative trading uses algorithms for automated trading signals, divided into machine learning and high-frequency strategies.

  • Machine learning algorithms analyze historical data to predict future outcomes, while high-frequency strategies exploit speed and price differences for small profits.

  • Common quantitative trading algorithms include LSTM, RNN, ARIMA, and decision trees.

  • High-frequency trading strategies include latency arbitrage, statistical arbitrage, and index arbitrage.

  • Cross-validation is essential for evaluating model performance and avoiding overfitting.

Machine Learning Algorithms Vs. High-Frequency Strategies

Machine learning algorithms and high-frequency trading strategies both play important roles in finance, but they serve different purposes and have different characteristics.

Let us understand the difference between the two and how they are used to generate profits.

Machine Learning Algorithms

As the term suggests, the machine learns patterns in the trend.

The researcher inputs historical data, also known as time-series data, and the machine tries to find a mathematical mapping between the date and the price of the stock (it can be closing, opening, low, or high; closing is the most frequently used price data).

Based on the past data it was trained on, the machine tries to predict future outcomes. A machine learning model should always be backtested before use. The model might also need to be updated regularly, as seasonality and trend cycles may change over time.

High-Frequency Strategies

High-frequency trading strategies use the concept of arbitrage. Let us first understand the term arbitrage before proceeding further. Arbitrage is an investment strategy that involves buying an asset on one exchange and selling it on another where the price is slightly higher; that small price difference, less trading costs, constitutes profit for the investor.

Since these are high-frequency signals, these small profits can add up to millions for investors.
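The arithmetic behind a single arbitrage trade can be sketched as follows. The quotes, the quantity, and the proportional `fee_rate` charged on each leg are hypothetical assumptions; real cost structures vary by venue.

```python
def arbitrage_profit(buy_price, sell_price, quantity, fee_rate=0.0):
    """Net profit from buying on one venue and selling on another.

    fee_rate is an assumed proportional cost charged on each leg.
    """
    cost = buy_price * quantity * (1 + fee_rate)
    proceeds = sell_price * quantity * (1 - fee_rate)
    return proceeds - cost


# Hypothetical quotes: 100.00 on exchange A, 100.05 on exchange B.
no_fees = arbitrage_profit(100.00, 100.05, quantity=1_000)
low_fees = arbitrage_profit(100.00, 100.05, quantity=1_000, fee_rate=0.0001)
high_fees = arbitrage_profit(100.00, 100.05, quantity=1_000, fee_rate=0.0005)

print(round(no_fees, 3), round(low_fees, 3), round(high_fees, 3))
```

The sketch shows why costs matter so much here: the price gap is tiny, so a modest increase in fees is enough to turn the same trade from a profit into a loss.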

Quantitative Trading—Machine Learning Algorithms

The following are the most commonly used Machine Learning Algorithms for time series analysis:

  1. Recurrent Neural Network
  2. Long Short-term Memory

1. Classic Methods

  • Multi-Layer Perceptron
  • ARIMA
  • Bayesian Neural Network
  • Radial Basis Functions Neural Networks
  • Generalized Regression Neural Network
  • K-Nearest Neighbor
  • CART Regression Trees
  • Support Vector Regression
  • Gaussian Processes

2. Topical Methods

  • Convolutional Neural Network
  • Attention Mechanism
  • Transformer Neural Network
  • Methods popular in Kaggle competitions: LightGBM, Decision Trees, XGBoost, and AdaBoost

3. Challenges in Time Series Analysis

  • One of the major challenges from a mathematical point of view is the sample size available. For most kinds of data, several independent samples can be generated, but the price history of a particular stock is a single realization that cannot be repeated; hence the sample size is effectively one.
  • The other major challenge from a mathematical point of view is that a time series captures only a particular cycle or trend in the market and not the complete cycle from the very beginning.
  • Many time series models work as a black box. Their mathematical formulation and the correlations they rely on are hard to interpret, which makes it difficult for anyone to supervise the results.

4. Recurrent Neural Network

This network maintains an internal state that is carried from one time step to the next and hence can handle the temporal dependence of data.
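The idea of a carried-over state can be illustrated with a one-unit recurrent cell. This is a toy sketch: the weights `w_in` and `w_rec` are arbitrary illustrative values rather than trained ones, and a practical network would use many units and a deep learning library.

```python
import math


def rnn_forward(inputs, w_in=0.5, w_rec=0.8, bias=0.0):
    """Run a one-unit recurrent cell over a sequence and return every state."""
    state = 0.0
    states = []
    for x in inputs:
        # The new state depends on the current input AND the previous state.
        state = math.tanh(w_in * x + w_rec * state + bias)
        states.append(state)
    return states


# The input 1.0 appears at the first and last position, yet the resulting
# states differ because the cell remembers what came before.
states = rnn_forward([1.0, 0.0, 0.0, 1.0])
print([round(s, 4) for s in states])
```

Note that the state stays non-zero even while the input is zero, which is the "memory" that lets the network relate today's price to earlier ones.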

5. Long Short-term Memory

LSTMs are a modified version of RNNs. They take into consideration the temporal dynamics of sequential data and use gating to mitigate the vanishing-gradient problem that makes plain RNNs hard to train on long sequences. The gates also help limit the influence of an outlier (or a short-term change) in the time series data.

The above two algorithms are among the most important, as they train from a temporal perspective and hence often give better results.

Alternatively, a window of past time steps can be flattened into a single input vector for one variable. This is how a Multi-Layer Perceptron (MLP) is applied to a time series. In some autoregression settings, an MLP can outperform an LSTM.
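Converting time steps into vector inputs is usually done with a sliding window. The sketch below shows one way to do it; `make_windows` is a hypothetical helper name and the closing prices are made up for illustration.

```python
def make_windows(series, window):
    """Turn a series into (input vector, next value) training pairs."""
    features, targets = [], []
    for start in range(len(series) - window):
        features.append(series[start:start + window])
        targets.append(series[start + window])
    return features, targets


closes = [10.0, 10.5, 10.2, 10.8, 11.0, 10.9]  # hypothetical closing prices
X, y = make_windows(closes, window=3)

for row, target in zip(X, y):
    print(row, "->", target)
```

Each row of `X` holds three consecutive closes and the matching entry of `y` is the close that followed, so any regression model that accepts fixed-length vectors can be trained on the pairs.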

6. ARIMA

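ARIMA stands for Auto-Regressive Integrated Moving Average. The model expresses the time series as an explicit mathematical formula: the series is differenced until it is stable (the "integrated" part), and each value is then described in terms of its own past values (auto-regressive) and past forecast errors (moving average). This interpretability makes it one of the most popular time series models.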

Looking at the mathematical formula of the trend, one can understand the market at a deeper level.
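As an illustration of such a formula, the sketch below fits the simplest non-trivial case, which can be written as ARIMA(1,1,0): the series is differenced once and each difference is modeled as `c + phi * previous difference`. The least-squares fit and the price series are simplifying assumptions for illustration; a full ARIMA fit with moving-average terms needs a dedicated statistics library.

```python
def difference(series):
    """First difference: the 'integrated' step of ARIMA, undone when forecasting."""
    return [b - a for a, b in zip(series, series[1:])]


def fit_ar1(values):
    """Least-squares fit of values[t] = c + phi * values[t-1]."""
    x, y = values[:-1], values[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x)
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = cov_xy / var_x
    c = mean_y - phi * mean_x
    return c, phi


def forecast_next(series):
    """One-step forecast: last price plus the predicted next difference."""
    diffs = difference(series)
    c, phi = fit_ar1(diffs)
    return series[-1] + c + phi * diffs[-1]


# Hypothetical prices whose differences follow d[t] = 1 + 0.5 * d[t-1] exactly.
prices = [100.0, 104.0, 107.0, 109.5, 111.75, 113.875]
c, phi = fit_ar1(difference(prices))
print("c =", round(c, 6), "phi =", round(phi, 6))
print("next price forecast:", round(forecast_next(prices), 4))
```

Because the fitted `c` and `phi` are explicit numbers, one can read off how strongly yesterday's move carries into today's, which is the kind of insight a black-box model does not offer directly.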

For forecasting a non-linear time series, there are a series of algorithms like Generalized Regression Neural Networks, RBF neural Networks, Bayesian Neural Networks, etc. 

These algorithms use function approximation theory, which uses all kinds of functions, such as polynomials, Fourier series, finite elements, etc., for approximation. Support Vector Machine algorithms can also be trained on non-linear time series.

7. CART Regression Trees

CART Regression Trees can produce accurate results based on input, and techniques such as pruning help them deal with common ML issues like overfitting. However, because a tree can only predict values within the range it saw during training, it cannot capture steep gradients or drifts.

8. Random Forest and Decision Trees

Random Forest is an ensemble method consisting of many decision trees. Each decision tree is trained on a random subset of the data and a random subset of the factors considered for modeling. Random Forest, similar to CART regression trees, can take care of common machine learning model issues.

High-Frequency Quantitative Trading Strategies

There are various types of arbitrage strategies, such as latency arbitrage, statistical arbitrage, and index arbitrage. These arbitrage strategies rely on speed and price inefficiencies in the market.

Let us understand these strategies in detail.

What is Latency?

Latency is a term for the delay in network transmission: the time taken for data to travel from one point to another. A major factor determining the latency value is the physical distance.
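A rough lower bound on latency from distance alone can be estimated as below. The 1,000 km distance is a hypothetical example, and the figure of two-thirds of the speed of light for optical fiber is an approximation; real links add switching and processing delays on top.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in a vacuum
FIBER_FRACTION = 2 / 3  # assumed: light in fiber travels at roughly 2/3 of that


def one_way_latency_ms(distance_km, speed_fraction=FIBER_FRACTION):
    """Propagation delay in milliseconds over a given distance."""
    return distance_km / (SPEED_OF_LIGHT_KM_S * speed_fraction) * 1000


fiber_ms = one_way_latency_ms(1_000)
vacuum_ms = one_way_latency_ms(1_000, speed_fraction=1.0)

print("fiber:", round(fiber_ms, 3), "ms")
print("theoretical minimum:", round(vacuum_ms, 3), "ms")
```

No amount of spending can push latency below the physical minimum for a given distance, so shortening the path and moving servers closer to the exchange are the levers that remain.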

What is Latency Arbitrage?

Now that we understand what latency is, let us look at how HFT firms use it to their advantage. In trading, there is a physical distance between a firm's servers and the servers of the broker or exchange, and this distance causes a time delay.

To overcome this delay, firms have spent huge amounts of money on better, faster infrastructure to reduce latency to a minimum. With lower latency, these firms are able to act on new prices before retail investors or other non-quant investors can.

This lead time in seeing prices helps firms gain profits in a fraction of a second. This type of strategy relies on the small profits gained from the time disparity between quant investors and non-quant investors.

What is Statistical Arbitrage?

Statistical arbitrage, also known as stat arb, is a technique based on cointegration and pairs trading. Two price series are cointegrated when a combination of them (the spread) tends to revert to a stable mean, so the technique applies mean reversion principles along with hedging strategies.

For pairs trading, stocks of similar companies (i.e., of the same sector/industry) are taken and hedged against each other. 

  • If one company is likely to increase its market share, a quant will open a long position in that company and a short position in its competitors. Mean reversion analysis is applied over several stocks in diverse portfolios for a short period of time to reduce exposure.
  • This strategy includes the use of a predetermined stock portfolio with minimal risk.
  • For statistical arbitrage, correlation values between various stocks are calculated, and highly correlated stocks are taken as pairs.
  • These strategies are not risk-free; hence, they are combined with other high-frequency trading algorithms. This way, HFT firms can take advantage of small price deviations.
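The mean reversion idea behind pairs trading is often expressed as a z-score of the spread between the two stocks. The following is a minimal sketch under stated assumptions: the prices are hypothetical, the `hedge_ratio` of 2.0 is given rather than estimated, and the entry threshold of 2 standard deviations is an illustrative choice.

```python
import statistics


def spread_zscore(prices_a, prices_b, hedge_ratio=1.0):
    """Z-score of the latest spread relative to the spread's own history."""
    spread = [a - hedge_ratio * b for a, b in zip(prices_a, prices_b)]
    mean = statistics.mean(spread)
    stdev = statistics.stdev(spread)
    return (spread[-1] - mean) / stdev


def pair_signal(zscore, entry=2.0):
    """Bet on the spread reverting to its mean once it is stretched far enough."""
    if zscore > entry:
        return "short A / long B"
    if zscore < -entry:
        return "long A / short B"
    return "no trade"


# Hypothetical prices for two stocks in the same sector; A jumps on the last day.
prices_a = [100.0, 100.4, 99.6, 100.1, 100.3, 100.2, 100.1, 99.7, 100.0, 103.2]
prices_b = [50.0, 50.1, 49.9, 50.0, 50.2, 50.1, 50.0, 49.9, 50.0, 50.1]

z = spread_zscore(prices_a, prices_b, hedge_ratio=2.0)
print(round(z, 3), pair_signal(z))
```

Being long one stock and short the other hedges out moves that affect the whole sector, so the position profits only if the gap between the two closes.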

Statistical arbitrage often includes CVaR portfolio optimization. CVaR stands for conditional value-at-risk. This quantity helps measure the risk of the complete portfolio and is an extended version of value-at-risk (VaR). At a confidence level α, VaR is the loss exceeded only in the worst (1 - α) fraction of outcomes, and CVaR is the average loss within that worst fraction.
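One simple way to estimate both quantities is the historical method, sketched below. The return series is hypothetical, the 90% confidence level is an illustrative choice, and the tail-size rounding is one of several possible conventions.

```python
def historical_var_cvar(returns, alpha=0.95):
    """Historical VaR and CVaR, reported as positive loss fractions."""
    losses = sorted((-r for r in returns), reverse=True)  # worst loss first
    tail_size = max(1, int(round(len(losses) * (1 - alpha))))
    tail = losses[:tail_size]
    var = tail[-1]  # smallest loss that still falls inside the tail
    cvar = sum(tail) / len(tail)  # average loss inside the tail
    return var, cvar


# Twenty hypothetical daily portfolio returns.
returns = [0.01, -0.02, 0.015, -0.05, 0.005, 0.02, -0.01, 0.0, 0.01, -0.08,
           0.03, -0.015, 0.01, 0.005, -0.005, 0.02, -0.03, 0.01, 0.0, 0.015]

var_90, cvar_90 = historical_var_cvar(returns, alpha=0.90)
print("VaR:", round(var_90, 4), "CVaR:", round(cvar_90, 4))
```

CVaR is never smaller than VaR at the same confidence level, because it averages over losses that are at least as bad as the VaR threshold.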

What is Index Arbitrage?

As the term suggests, these strategies utilize the price discrepancy between market indices. Generally, the market indices traded at two different exchanges will have a small price difference; this is where index arbitrage comes into play. 

Another possibility arises when two different market indices normally trade at similar values. If the price deviates between the two, index arbitrage can be used.

One emerging type of analysis is sentiment analysis on news articles. Quants have been trying to figure out the public’s reaction to various types of news articles and how it is going to affect the future trend.

What is Sentiment Analysis?

Sentiment analysis is also known as opinion mining. It involves the use of natural language processing, text analysis, and computational linguistics to understand and predict the public’s emotions.

  • Based on pre-built models, a passage is fed to the model, and the model predicts or calculates the sentiment score. Generally, a higher sentiment score indicates a better mood or positive reaction, and a lower (possibly negative) score indicates a negative or bearish reaction.
  • Besides sentiment scores, these models also provide different scores for various emotions for a deeper understanding of the public reaction towards the market. One pre-built model for these analyses is VADER (Valence Aware Dictionary and sEntiment Reasoner).
  • This model, along with the words, takes into consideration punctuation, capitalization of words, degree modifiers, and conjunctions to understand the fed passage.

Automating news reading for sentiment analysis typically requires web scraping.
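To show how words, capitalization, punctuation, and degree modifiers can each move a score, here is a deliberately tiny lexicon-based scorer. It is not VADER and does not reproduce its rules or weights: the word list, the multipliers, and the function name are all hypothetical choices made for illustration.

```python
# Hypothetical mini-lexicon; real tools ship far larger, validated word lists.
LEXICON = {"gain": 1.0, "beat": 1.0, "strong": 1.0,
           "loss": -1.0, "miss": -1.0, "weak": -1.0}
BOOSTERS = {"very": 1.5, "slightly": 0.5}  # assumed degree-modifier weights


def sentiment_score(text):
    """Sum word scores, adjusted for modifiers, ALL CAPS, and '!'."""
    score = 0.0
    multiplier = 1.0
    for raw in text.split():
        word = raw.strip(".,!?")
        lower = word.lower()
        if lower in BOOSTERS:
            multiplier = BOOSTERS[lower]  # applies to the next word only
            continue
        value = LEXICON.get(lower, 0.0) * multiplier
        if value and word.isupper():
            value *= 1.5  # capitalization signals emphasis
        if value and raw.endswith("!"):
            value *= 1.2  # exclamation mark signals emphasis
        score += value
        multiplier = 1.0
    return score


for headline in ["Earnings beat estimates",
                 "Earnings BEAT estimates",
                 "Very weak guidance",
                 "Quarterly loss!"]:
    print(headline, "->", round(sentiment_score(headline), 2))
```

A positive total suggests a bullish reading of the passage and a negative total a bearish one; production models refine this idea with negation handling, conjunctions, and normalized scores.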

Alternative Data Trading Strategies

  1. Brand Value Factor: Contrary to the popular belief in investing in big brands for stable income, some investors invest in unpopular brands to earn an unpopularity premium.

  2. Google Search Strategy: One can use Google search activity to find the stocks receiving the lowest attention and to gauge the volatility of those stocks.

  3. Newspaper Picture and Text Pessimism: These strategies are like sentiment analysis; they use newspapers to understand investors' pessimism.

  4. Technology Momentum: This strategy involves technology stocks that have the potential to innovate; this can be judged based on the number of patents held by the company. These intangible assets lead to pricing inefficiencies in the market, thus creating opportunities for quant traders to take advantage of these deviations.

  5. Management Diversity: This strategy is similar to sentiment analysis; instead of emotions, it looks at diversity in upper management. It uses NLP over various sources, such as annual reports, webpages, etc., to determine the diversity. The premise is that more diversity leads to better returns.

  6. Lexical Density of Filings: This strategy again uses Natural Language Processing to determine the amount and quality of information the company disseminates. Determining factors are the verbal and financial content of the company filings. 

Hence, the higher the lexical score, the higher the chances that the company will perform better. 

Obscure and Small Markets

Quant traders also aim at other markets called Obscure and Small Markets. What are small and obscure markets in trading, and how do Quants benefit from them?

  • Securities in obscure and small markets tend to trade at a premium or a heavy discount, which opens up big opportunities for investors. A typical example is closed-end funds: funds that trade on an exchange like stocks and can therefore trade above or below their Net Asset Value (NAV).
  • Closed-end funds are actively managed, as opposed to ETFs, which are typically passively managed. Active management carries higher risk but can deliver higher returns over a short period of time. Hence, active management plus a heavy discount to Net Asset Value can lead to high returns (profits) on investment.

Take note that active management tends to provide higher returns only for a short period of time; for longer periods, passive management has generally been able to produce higher returns.
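The premium or discount of a closed-end fund relative to its Net Asset Value is a simple ratio, sketched below with hypothetical prices.

```python
def premium_to_nav(market_price, nav_per_share):
    """Positive means a premium, negative a discount, as a fraction of NAV."""
    return (market_price - nav_per_share) / nav_per_share


# Hypothetical fund trading at 18.00 while its assets are worth 20.00 a share.
discount = premium_to_nav(market_price=18.0, nav_per_share=20.0)
print(f"{discount:.1%}")
```

A negative value means the buyer pays less than the underlying assets are worth per share, which is the discount that quant traders screen for.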

Cross-Validation

  • One of the most important parts of any machine learning modeling is cross-validation. Even for time-series analysis, cross-validation is very important before moving further with the financial metrics analysis.
  • One of the most commonly used cross-validation methods is k-fold cross-validation. The k in k-fold represents the number of parts (folds) the dataset will be divided into.
  • Suppose we use 5-fold cross-validation: we divide the training dataset into 5 parts and train the model 5 times, each time holding out a different part for validation and training on the remaining 4. Averaging the 5 validation scores helps detect overfitting and lets us evaluate the model in a more robust way.
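The splitting step of k-fold cross-validation can be sketched as index bookkeeping: each fold takes one turn as the held-out validation set while the remaining folds are used for training. `k_fold_indices` is a hypothetical helper name, and the sketch keeps the samples in their original order.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        validation = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, validation
        start = stop


folds = list(k_fold_indices(n_samples=10, k=5))
for train, validation in folds:
    print("train:", train, "validate:", validation)
```

Every sample is used for validation exactly once, so the averaged score reflects the whole dataset rather than one lucky or unlucky split.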

Let us understand another frequently used cross-validation method for backtesting HFT strategies.

Roll forward cross-validation

This method, as the term suggests, rolls the training window forward through time, similar to a moving average; the only requirement is that the test dataset should always come after the training dataset.

We need to keep two things in mind for cross-validation. First, since this is a time series, one shouldn't shuffle the dataset. Second, the test dataset should always come after the training dataset: one cannot train on future data and predict past data.
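Both rules can be seen in a small sketch of roll forward splitting. The window sizes are illustrative assumptions and `roll_forward_splits` is a hypothetical helper name; the data is never shuffled, and every test window starts right after its training window ends.

```python
def roll_forward_splits(n_samples, train_size, test_size):
    """Yield (train_indices, test_indices) with the test window always later."""
    start = 0
    while start + train_size + test_size <= n_samples:
        train_end = start + train_size
        train = list(range(start, train_end))
        test = list(range(train_end, train_end + test_size))
        yield train, test
        start += test_size  # roll both windows forward in time


splits = list(roll_forward_splits(n_samples=10, train_size=4, test_size=2))
for train, test in splits:
    print("train:", train, "test:", test)
```

Each split mimics how the strategy would have been used in practice: fit on what was known up to a date, then evaluate on what came next.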

Researched and authored by Punit Manjani | LinkedIn
