Leonardo Valencia

# A simple Statistical Arbitrage System

Most retail traders who arrive at the world of options do so after a long search for the holy grail: that perfect system that always wins and makes money consistently. Somehow, they think that by using clever combinations of strikes and expirations, long and short legs, they will be able to create the perfect “system”, one with zero risk and all gains. Well folks, allow me to burst that bubble: there is no such thing as the holy grail in options (or pretty much any other instrument). The concept of pure arbitrage, making money with zero risk, is beyond the grasp of any retail trader. The reality is that the old adage is truer than ever: in order to make money you have to risk money.

There is indeed the possibility of true arbitrage, and in fact it happens every second across all markets. You can see it when you have different instruments that essentially represent the same thing, for example: ES minis vs the SPY ETF vs the S&P 500 basket. At any given time, there are tiny discrepancies among the three that can be exploited for a quick risk-free trade. Of course, you have to be incredibly fast to spot them and profit from them, and you also need massive amounts of capital to make the whole thing worth it. This is an industry of its own, and it is at the core of the efficiency of advanced markets (like the US). Sadly, it is an activity that is beyond the reach of retail traders.

Is everything lost then? Of course not! There are still the traditional ways of making money in the market: investing in undervalued stocks (buy and hold, basically), betting on inside information (this is more common than people think, although mostly illegal), trend following, dispersion trading (a slow-moving form of arbitrage, but with risk) and finally a set of strategies that fall under the umbrella of **statistical arbitrage**.

## Statistical Arbitrage defined

At the core of this set of strategies is the concept of statistical edge. The idea here is not to have a perfect system but to have a system that wins more than what is being priced for the risk taken. This is a system that wins in the long run, after a big enough number of trades, with profit accumulating slowly over time. A simple example of this is the roulette game in a casino. The outcome of each bet is impossible to predict; it is perfectly random (neither the player nor the casino has an information advantage), but the structure of the bet produces massive amounts of edge in favor of the casino. The reason is that the casino prices the bet in such a way that the player is not fairly compensated for the risk taken. In the US, the roulette wheel has 38 slots in total (36 numbers, one 0, and one 00). When a player bets on a single number, the probability of winning is 1/38, yet a win pays only 35 times the original bet. A fair payout would be 37 times the bet, so the casino has an edge of 2/38, or about 5.26%, per bet. This, folks, is statistical arbitrage. The casino will win and lose individual bets, but in the long run it accumulates a positive edge of 5.26% on a bet whose outcome is pretty much random.
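The roulette arithmetic above can be checked in a few lines. This is a minimal sketch; the function name `house_edge` is just an illustrative choice, and exact fractions are used so the result is not blurred by floating-point rounding.

```python
from fractions import Fraction


def house_edge(win_prob, payout_multiple):
    """Expected casino profit per unit staked by the player.

    win_prob: probability that the player's bet wins.
    payout_multiple: profit paid to the player on a win, per unit staked.
    """
    # Player's expected profit: wins payout_multiple with win_prob,
    # loses the 1-unit stake otherwise.
    player_ev = win_prob * payout_multiple - (1 - win_prob)
    # Whatever the player loses on average, the casino keeps.
    return -player_ev


# Single-number bet on an American wheel: 38 slots, pays 35 to 1.
edge = house_edge(Fraction(1, 38), 35)
print(edge)                   # 1/19
print(f"{float(edge):.2%}")   # 5.26%
```

With a fair payout of 37 to 1 the same function returns zero, which is the sense in which the casino "misprices" the bet in its own favor.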

## Using options for statistical arbitrage

We need to find a “roulette” in the options space in which we can beat the odds, so we can deploy a system that slowly makes money trade after trade. Luckily, there are tons of those potential roulettes to be found, and that is mostly due to the way options are priced. The details are very technical, and I’ll expand on them in separate articles, but it boils down to the fact that options dealers price options in a risk-neutral world (they don’t care if the stock goes up or down) while we, lowly retail players, fully assume directional risk. That opens the door to scenarios where both the dealers and we can make some money on the same trade (demolishing the myth of the zero-sum game in options).

## Presenting the VRP system

This is a **statistical arbitrage** system that uses *neural networks* to predict a *probabilistic distribution* of the SPX index a number of sessions ahead. We then compare this predicted distribution with the one implied by options prices and generate a potential trade based on the difference between the two. The system employs two different sources of positive edge. One portion of the edge comes from the asymmetry between the risk-neutral pricing of options and the actual terminal distribution of prices that we see at expiration. The other big portion comes from the neural network's ability to identify patterns in market activity and refine the terminal distribution prediction. The core of the system is the ability to replicate binary options with very tight vertical spreads (an imperfect replication, though). This lets us capture the implied odds of a particular trade, compare them with the odds generated by the neural nets, and play the binary in such a way that we are on the right side of positive edge most of the time.
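To make the comparison of implied odds versus model odds concrete, here is a rough sketch of the idea, not the actual VRP implementation. It assumes a tight vertical spread behaves like a binary that pays its full width or nothing, and it ignores discounting, bid/ask spreads, commissions, and the case where the index settles between the strikes. The function names and the numbers in the example are hypothetical.

```python
def implied_probability(debit, width):
    """Probability of a full payout implied by the spread price.

    debit: price paid for the vertical spread.
    width: distance between strikes, i.e. the maximum payout.
    """
    if not 0 < debit < width:
        raise ValueError("debit must be strictly between 0 and width")
    return debit / width


def pick_side(model_prob, debit, width):
    """Return (side, expected profit per unit of capital risked).

    model_prob: our own estimate of the full-payout probability,
    for example from a predicted terminal distribution.
    """
    implied = implied_probability(debit, width)
    if model_prob >= implied:
        # Buying: risk the debit to collect the width.
        return "buy", (model_prob * width - debit) / debit
    # Selling: collect the debit as credit, risk (width - debit).
    return "sell", (debit - model_prob * width) / (width - debit)


# Hypothetical numbers: a 5-point spread quoted at 2.00 implies 40% odds.
print(implied_probability(2.0, 5.0))
# If our model says 45%, the spread is cheap and we buy it.
print(pick_side(0.45, 2.0, 5.0))
# If our model says 30%, the spread is rich and we sell it.
print(pick_side(0.30, 2.0, 5.0))
```

In both cases the chosen side has a positive expected profit, which is the options equivalent of sitting on the casino's side of the roulette table.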

The system has many different implementations, and my favorite one is the VRP 350 AI version, which has delivered a positive edge of 17.2% since Jul 1, 2021 (we are doing better than the American roulette, by the way). It is important to highlight that the system is not perfect and doesn’t win all the time. In fact, it has no information about where the SPX will be in the future (it treats that as a random event, just like the roulette). It just exploits mispricing in the implied odds of the trade to capture a long-term positive edge and accumulate profits from that.

## Closing Remarks

The VRP system is indeed part of the statistical arbitrage set of systems. We don’t need to correctly predict the outcome of any event. We just need to be able to value the odds of an event, compare them with the odds presented to us by options prices, and always pick the side of the house. That is how positive edge and profit emerge over time.