At Lucena, we continually strive to expand our research and introduce new technologies and machine learning concepts, whether on our flagship platform, QuantDesk®, or in the context of designing new investment strategies. Today, I want to touch on a new technique we employ to design an intra-day price forecaster.
Most of the strategies and machine learning disciplines I have presented so far were focused on multi-day holdings. We have intentionally omitted high-frequency and even lower-frequency intra-day strategies, since our database is predicated on end-of-day data (and not intra-day). We’ve decided to support holding periods of one day to one year for two main reasons:
- Logistics (mainly driven by database size).
- Business philosophy.
Logistics and Database Size: With end-of-day data, our database maintains approximately 800 features to support approximately 10,000 securities worldwide. If we maintained a single value per feature per security per trading day since January 1st, 2000, we would be looking at a database of a few terabytes holding approximately 34 billion data points. You can do the math yourself: 17 years * 252 trading days per year * 800 features * 10,000 securities. Just imagine how much larger our database would become if we multiplied those 34 billion data points by 510 minutes per day (8.5 trading hours * 60 minutes), which comes to roughly 17.5 trillion values. If we wanted to go deeper, to the level of seconds, we would have to multiply the 34 billion data points by 30,600 seconds per day, or roughly a quadrillion values.
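For readers who would rather check the sizing than take it on faith, here is a small back-of-the-envelope sketch. It uses only the factors quoted above; the variable names are mine and purely illustrative. Multiplying the four end-of-day factors gives about 34.3 billion values.

```python
# Back-of-the-envelope sizing using only the figures quoted in the text.
YEARS = 17
TRADING_DAYS_PER_YEAR = 252
FEATURES = 800
SECURITIES = 10_000

# One value per feature, per security, per trading day.
end_of_day_points = YEARS * TRADING_DAYS_PER_YEAR * FEATURES * SECURITIES

# Intra-day granularity, using the 8.5 trading hours quoted in the text.
MINUTES_PER_DAY = int(8.5 * 60)         # 510
SECONDS_PER_DAY = MINUTES_PER_DAY * 60  # 30,600

minute_points = end_of_day_points * MINUTES_PER_DAY
second_points = end_of_day_points * SECONDS_PER_DAY

print(f"end-of-day values: {end_of_day_points:,}")
print(f"one-minute values: {minute_points:,}")
print(f"one-second values: {second_points:,}")
```

The point of the exercise is the multiplier rather than the exact totals: every step down in granularity scales storage by two to four orders of magnitude.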
Business Philosophy: Investment styles span a spectrum. On one end are the Warren Buffetts of the world, advocating buy-and-hold strategies; on the other end are the sub-second high-frequency traders. Moving from the first end to the second, the reliance on human insight gradually diminishes in favor of automation. A portfolio manager buying and holding a position for decades would normally rely more heavily on his or her intellect and foresight, derived from well-thought-out projections based on the likes of socioeconomic, geopolitical, and macroeconomic long-term trends. At the microsecond level, on the other hand, human insight is essentially nonexistent. The algorithm, although originally constructed by humans, makes all the decisions with virtually no human input.
Image describing the progressive influence of human intellect, at the expense of machine-based algorithmic decisions, as the time horizon grows from milliseconds to years
Building Our Version of an Intra-Day Price Forecaster: Despite our logistical and business philosophy constraints, we’ve heeded our clients’ and prospects’ requests and come up with a pretty interesting technology that doesn’t greatly impact our database size and that enables our users to apply their own discretion to the final intra-day trade. The goal is to identify a smaller subset of indicators (most of our 800 features are NOT applicable to intra-day situations anyway) that are influential for decisions with a 30-minute to 1-hour time horizon.
Intra-Day Price Change: Let’s start by looking at minute by minute price changes, and attempt to estimate the probability of a price change 30 minutes or so into the future within the same trading day. In other words, at 3:00PM we can estimate the price projection for 3:30PM based on all the data gathered since the market opened at 9:30AM.
Imagine looking at the price change of a specific stock, every minute over the entire trading day.
Let’s take a hypothetical scenario of AAPL’s minute-by-minute price change on a certain trading day:
- At 9:30 we mark its open price.
- At 9:31 we mark the change in price from its previous mark at 9:30.
- At 9:32 we mark the change in price against its 9:31 mark.
- And so on, until the end of the trading day.
Image 1: A hypothetical one minute price change (%) interval of AAPL between 9:30AM and 9:40AM.
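The marking procedure above can be sketched in a few lines of Python. This is a minimal illustration, not our production code: the function name `minute_changes` and the AAPL prices are made up for the example.

```python
def minute_changes(prices):
    """Return the % change of each one-minute mark versus the previous mark.

    prices[0] is the 9:30 open; the result has one entry per later minute
    (9:31 versus 9:30, 9:32 versus 9:31, and so on).
    """
    return [
        (curr - prev) / prev * 100.0
        for prev, curr in zip(prices, prices[1:])
    ]


# Hypothetical AAPL marks from 9:30 to 9:35 (made-up numbers).
aapl_marks = [150.00, 150.30, 150.15, 150.15, 150.45, 150.00]
changes = minute_changes(aapl_marks)

for minute, change in enumerate(changes, start=31):
    print(f"9:{minute}  {change:+.3f}%")
```

Six marks produce five changes, because the open at 9:30 has no previous mark to compare against.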
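Now, imagine we gather the minute-by-minute price change path for AAPL for each of the past 1,000 trading days. As a new trading day unfolds, we can then discard the historical paths that do not conform to what we have observed so far, as Images 3 and 4 illustrate.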
Image 2: A compilation of 1000 one-minute price change paths. One path per trading day.
Image 3: After a period of time during the trading day (marked by the red ticks), we have eliminated all the non-conforming paths (marked in blue) while the green paths remain in contention.
Image 4: After a longer period of time during the trading day (marked by the red ticks), we have eliminated more of the non-conforming paths (marked in blue) while the green paths remain in contention.
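The elimination shown in Images 3 and 4 could be sketched as follows. Two things here are my assumptions for illustration only: each path is stored as the cumulative % change since the open at each minute, and a historical path "conforms" if it stays within a fixed tolerance band of today's path at every minute observed so far. The function name `conforming_paths`, the tolerance value, and the sample data are all hypothetical.

```python
def conforming_paths(history, today_so_far, tolerance=0.10):
    """Keep historical paths that stay within `tolerance` percentage points
    of today's path at every minute observed so far.

    Each path is a list of cumulative % changes since the open, one per minute.
    """
    n = len(today_so_far)
    survivors = []
    for path in history:
        if len(path) < n:
            continue  # not enough data to compare against today
        if all(abs(p - t) <= tolerance for p, t in zip(path[:n], today_so_far)):
            survivors.append(path)
    return survivors


# Four made-up historical days, five one-minute marks each.
history = [
    [0.0, 0.10, 0.20, 0.35, 0.50],
    [0.0, 0.15, 0.05, -0.20, -0.40],
    [0.0, -0.30, -0.50, -0.60, -0.20],
    [0.0, 0.05, 0.25, 0.30, 0.10],
]
today = [0.0, 0.10, 0.20]  # what we have observed so far today

early = conforming_paths(history, today[:2])  # two marks into the day
later = conforming_paths(history, today)      # three marks into the day

print(len(early), "paths in contention early on")
print(len(later), "paths in contention a minute later")
```

As in the images, the set of paths in contention can only shrink as more of the day is observed, since each new minute adds one more condition a path must satisfy.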
Now that only a refined set of potential paths is left, we can visually picture the average (mean) price change among all the remaining paths. The narrower the standard deviation of the remaining paths, the higher the confidence in our forecast.
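To make the mean and standard deviation idea concrete, here is a minimal sketch using Python's standard `statistics` module. The `forecast` helper, the choice of population standard deviation, and the sample paths are assumptions for illustration; each path is again taken to hold the cumulative % change since the open at each minute.

```python
from statistics import mean, pstdev


def forecast(remaining_paths, target_minute):
    """Return (mean, standard deviation) of the remaining paths' values
    at `target_minute`, an index into each path.
    """
    if not remaining_paths:
        raise ValueError("no paths left in contention")
    values = [path[target_minute] for path in remaining_paths]
    return mean(values), pstdev(values)


# Two made-up sets of surviving paths with the same average outcome.
narrow = [[0.0, 0.28], [0.0, 0.30], [0.0, 0.32]]
wide = [[0.0, -0.40], [0.0, 0.30], [0.0, 1.00]]

narrow_mean, narrow_std = forecast(narrow, target_minute=1)
wide_mean, wide_std = forecast(wide, target_minute=1)

print(f"narrow: mean {narrow_mean:+.2f}%, std {narrow_std:.2f}")
print(f"wide:   mean {wide_mean:+.2f}%, std {wide_std:.2f}")
```

Both sets project the same average price change, but the tightly clustered set would warrant far more confidence than the scattered one.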
Image 5: By the last 30 minutes of the trading day, very few paths are left, which enables a more refined price prediction.
I like the simplicity and visual clarity of how one could potentially take a simple data mining example and convert it to a trading strategy. I caution you, however, that this example by itself will probably NOT prove very effective. When looking at price alone, one can reasonably assume that others have already exploited the method depicted here. However, by augmenting the above process with additional features (minute-by-minute change in volume in addition to price change, for example) and with additional machine learning disciplines, we can come up with some very promising scenarios. Lucena is in the process of finalizing a comprehensive intra-day forecaster. (Stay tuned, an announcement is coming soon.)
Erez M. Katz — Lucena Research Inc.