ATLANTA, April 3, 2017 —
When we founded Lucena and embarked on this incredible journey, we based our business approach on one key principle: distinguish substance from noise. While many fin-tech start-ups raised capital on the basis of unrealistic valuations and unsustainable growth projections, we decided to build substance before going to market with a robust sales and marketing infrastructure. Our seed funding and cash flow were funneled almost exclusively into building meaningful value through breakthrough machine-learning technology that works.
As big data, predictive analytics and machine learning become not just relevant but critical to business, the confusion among industry buzzwords continues to grow. It’s incredible to see how many companies profess to be machine-learning experts while providing, at most, static models in the form of statistical modeling.
This discussion could be controversial, but I want to shed some light on the main differences between machine learning and statistical modeling. Both rely on historical data to predict the probability of an outcome, but machine learning has one main distinguishing ingredient: inductive reasoning. Let’s look at why inductive reasoning is so important for financial modeling.
According to Investopedia, inductive bias (also known as learning bias) of a learning algorithm is the set of assumptions that the learner uses to predict outputs given inputs that it has NOT yet encountered. Simply put, the machine-learning technology that Lucena employs is capable of providing answers to questions our users may not have thought of asking.
Statistical analysis has lost its luster with the advent of big data science, the speed at which information becomes available, and the affordability and pervasiveness of data storage. The efficient market hypothesis (EMH) asserts that it’s impossible to “beat the market” because the stock market is so efficient that share prices always reflect all relevant information. It’s therefore important for us at Lucena to incorporate capabilities that are not easily accessible to the masses while also enabling an individual’s instinct to have the final say before applying the machine’s recommendations.
When evaluating a machine-learning solution, here are a few good questions you should ask in order to determine if a provider can really back up its marketing claims:
Dynamic Models vs. Static Models
Ask if the provider supports dynamic models. Dynamic models are designed to morph to accommodate changing conditions in the financial markets. To use the autonomous vehicle as an analogy, a driverless car must accommodate multiple terrains, weather conditions, traffic conditions and environmental surroundings that could drastically change during the course of a route from point A to point B. The financial markets are similar in the sense that one model cannot be suitable for all market dynamics.
Ask if the provider’s models self-adjust and how they get “smarter” over time. One of the most powerful concepts behind the models Lucena uses is that they score their statistical success and self-adjust. In other words, they learn from their successes and failures.
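To make the idea of scoring and self-adjusting a little more concrete, here is a minimal sketch in Python. It is an illustration only and not a description of Lucena's actual technology: the function name update_weights, the model names and the learning_rate parameter are all hypothetical. It assumes a small set of competing models whose influence is reduced each time they make a wrong call, a simple multiplicative-weights scheme.

```python
from typing import Dict


def update_weights(
    weights: Dict[str, float],
    correct: Dict[str, bool],
    learning_rate: float = 0.5,
) -> Dict[str, float]:
    """Shrink the weight of every model that was wrong, then renormalize.

    Assumes 0 < learning_rate < 1 so that the weights never all reach zero.
    """
    adjusted = {
        name: w * (1.0 if correct[name] else 1.0 - learning_rate)
        for name, w in weights.items()
    }
    total = sum(adjusted.values())
    # Renormalize so the weights still sum to 1 after the penalty.
    return {name: w / total for name, w in adjusted.items()}


# Hypothetical example: two models start with equal influence.
weights = {"momentum": 0.5, "mean_reversion": 0.5}
# Suppose the momentum model called the last period correctly and the other did not.
weights = update_weights(weights, {"momentum": True, "mean_reversion": False})
```

After this single update the model that was right carries more weight than the one that was wrong. A useful question for a provider is what plays the role of this scoring step in their system and how often it runs.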
Backtesting, Cross Validation and Overfitting
Improper backtesting can easily distort reality and create a false sense of optimism. Ask how the provider tests for the models’ robustness, and how the model performed during unfavorable market conditions, the financial crisis of 2008/2009, for example. Does the provider incorporate forward testing? What are the in-sample and out-of-sample testing periods? Do they account for slippage, short borrowing cost, and transaction costs?
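For readers who want to see what these questions look like in practice, the sketch below shows one common way to separate in-sample from out-of-sample periods (a rolling, walk-forward split) and to deduct trading frictions from a period's return. It is a simplified illustration under stated assumptions, not any provider's actual methodology: the function names, the basis-point cost inputs and the flat per-period borrow charge are all hypothetical.

```python
from typing import List, Tuple


def walk_forward_splits(
    n_obs: int, train_size: int, test_size: int
) -> List[Tuple[range, range]]:
    """Return (in-sample, out-of-sample) index ranges that roll forward in time.

    Each out-of-sample window starts right after its in-sample window,
    so the model is never tested on data it was fitted to.
    """
    splits = []
    start = 0
    while start + train_size + test_size <= n_obs:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        splits.append((train, test))
        start += test_size  # slide forward by one test window
    return splits


def net_return(
    gross_return: float,
    turnover: float,
    cost_bps: float,
    slippage_bps: float,
    short_weight: float = 0.0,
    borrow_bps: float = 0.0,
) -> float:
    """Subtract transaction costs, slippage and short borrowing cost
    (all quoted in basis points per period) from a gross period return."""
    trading_cost = turnover * (cost_bps + slippage_bps) / 10_000
    borrow_cost = short_weight * borrow_bps / 10_000
    return gross_return - trading_cost - borrow_cost


# Hypothetical example: 10 periods, fit on 4, then test on the next 2.
splits = walk_forward_splits(n_obs=10, train_size=4, test_size=2)
# A 1% gross return with full turnover, 5 bps commission and 5 bps slippage.
after_costs = net_return(0.01, turnover=1.0, cost_bps=5, slippage_bps=5)
```

A backtest that reports only gross returns, or that evaluates the model on the same periods it was fitted to, skips both of these steps, and that is where much of the false optimism tends to come from.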
As in any emerging industry, there are good honest providers, but at the same time there are others ready to exploit misinformation and confusion.
As an educated consumer, one should be aware of some of the pitfalls to avoid.