Is AI Lender Upstart (UPST) Living on Borrowed Time?
Opaque machine learning and higher-than-expected costs dim prospects for a leader in automated loans
The robots of the future seem to be struggling with the lending of the present. Case in point: Upstart Holdings (UPST).
Founded in 2012 by San Mateo-based Google alums Paul Gu, Dave Girouard, and Anna Counselman, Upstart began experimenting with bringing math-based automation to consumer and business lending. The firm tinkered with basic predictive statistical tools, like Monte Carlo simulations and logistic regression, to better model the lending process. Model inputs were mostly well-understood information, like third-party credit scores and educational attainment data.
Upstart’s approach was not novel. Other early loan automators, such as LendingClub (LC), Rocket Mortgage (RKT), then known as Quicken Loans, and LendingTree (TREE), also attempted to innovate in lending. Early automation included loans made without banker middlemen, analysis of web data, and mobile marketing.
But finding true efficiencies in lending proved elusive. A 1978 study from the Journal of Finance, written by real estate economist Kerry D. Vandell, summed up the challenges of using data to analyze lending. “The first has to do with the availability of data,” wrote Vandell. “The second problem has to do with the methodology used in generating data to be used in the estimation.”
Lending data was scarce. Something on the order of half of all Americans turned out to have either thin or no active credit information. And globally, such credit “invisibles” can make up as many as 9 out of 10 potential borrowers, depending on how such non-customers are counted.
A wave of so-called alternative lending models sprang up to create the data needed to lend profitably to this unserved market. No technique seemed out of bounds. Loan application language was studied. Social media accounts were scanned. Web behavior was modeled. Macroeconomic tools like multi-department structured credit data were factored into lending data streams. Whizbang machine learning techniques, like stochastic gradient boosting, were also tried.
Alternative lending data began to drive business. Upstart Holdings claims to have originated more than $13.6 billion in total credit. As of last quarter, roughly 70% of its loans were fully automated. Estimates say the alternative lending market will grow by roughly 17% over the next 5 years.
But on the whole, the credit industry remains quietly skittish over loans made with alternative data. Researchers worry that as lending standards loosen, loan volume and bank profits jump, and mass default becomes a serious risk. A 2006 study by Giovanni Dell’Ariccia and Robert Marquez elegantly lays out how loosened lending leads to a boom in loans and the seemingly inevitable financial disaster.
Dell’Ariccia and Marquez point out that the bank crises in Argentina in 1980, Chile in 1982, the United States in 1986, Sweden, Norway, and Finland in 1992, Mexico in 1994, the United States again in 2008, and the almost certain lending and insurance corruption crises happening in China now all follow the same timeworn path to disaster.
The drive to lend in new ways has already led some to cheat. In 2021, peer-to-peer start-up LendingClub agreed to pay $18 million in fines to settle 2018 Federal Trade Commission charges that it deceived customers with fraudulent no-fee loans. In 2016, CEO, founder, and long-distance sailor Renaud Laplanche was forced to resign. He had raised nearly $1 billion in what was probably 2014’s largest technology IPO. By 2020, LendingClub had acquired a traditional financial entity, Radius Bank, and shuttered its automated lending platform entirely.
A pandemic-altered world appears to have worsened fraud. A 2022 LexisNexis Risk Solutions report estimates that the cost of fraud for U.S. financial services and lending firms has increased roughly 9.9% compared to pre-pandemic levels. Every dollar lost to fraud now costs U.S. financial services businesses $4.00, compared to $3.64 in 2020 and $3.25 in 2019.
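The arithmetic behind those per-dollar figures can be checked in a few lines. This is only a sketch: it assumes "now" refers to the report's 2022 year, and the helper name `pct_change` is ours, not the report's. On those assumptions, the roughly 9.9% increase lines up with the change from the 2020 figure, while the change from 2019 works out larger.

```python
# Cost to U.S. financial services firms per $1 lost to fraud, as quoted above.
# Assumption: the "now" figure of $4.00 belongs to the 2022 report year.
cost_per_dollar = {2019: 3.25, 2020: 3.64, 2022: 4.00}


def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new (hypothetical helper)."""
    return (new - old) / old * 100


vs_2020 = pct_change(cost_per_dollar[2020], cost_per_dollar[2022])
vs_2019 = pct_change(cost_per_dollar[2019], cost_per_dollar[2022])

print(f"2022 vs 2020: {vs_2020:.1f}%")  # 9.9%
print(f"2022 vs 2019: {vs_2019:.1f}%")  # 23.1%
```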
The Automation of Questionable Lending
Upstart Holdings seems unusually opaque about how it manages the inherent risks of alternative lending. We could find no firm methodology for how it defines loan defaults. That’s unusual in emerging lending markets. Over the past decade, both LendingClub and Rocket Companies offered extensive disclosures for various families of default for their new loan products.
Instead, default analysis at Upstart appears to be limited to top-line performance metrics, cribbed from company presentations. The small print from 2021 company guidance quietly disclosed dramatic increases in the costs to get new customers and manage existing loans.
While revenues grew by a seemingly dramatic 249%, the cost to service those customers rose by roughly 247%, and the cost to acquire them rose by over 310%. These figures specifically exclude other operations costs, like payroll, personnel-related expenses, and stock-based compensation for operations employees not directly tied to servicing customers.
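What those growth rates imply for margins can be sketched without knowing the underlying dollar amounts. The example below assumes each percentage is an increase over the prior period, and it treats "roughly 247%" and "over 310%" as exactly 247 and 310. The variable and function names are illustrative only. It compares how each cost line grew relative to revenue, which shows how the cost's share of each revenue dollar shifted.

```python
def growth_factor(pct_increase: float) -> float:
    """Turn a percentage increase into a multiplier, e.g. 249% -> 3.49x."""
    return 1 + pct_increase / 100


# Assumption: figures are year-over-year increases, rounded as stated above.
revenue = growth_factor(249)
servicing = growth_factor(247)
acquisition = growth_factor(310)

# Relative change in each cost's share of revenue. This ratio does not
# depend on the starting dollar amounts, only on the growth rates.
servicing_share_change = (servicing / revenue - 1) * 100
acquisition_share_change = (acquisition / revenue - 1) * 100

print(f"Servicing cost per revenue dollar:   {servicing_share_change:+.1f}%")
print(f"Acquisition cost per revenue dollar: {acquisition_share_change:+.1f}%")
```

On these assumptions, servicing costs roughly kept pace with revenue, while acquisition costs took a share of each revenue dollar that was about 17.5% larger than before.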
Spiraling costs and complexity seem to be a theme throughout Upstart’s disclosures, particularly when it comes to artificial intelligence. Currently, the firm says, more than 1,000 variables are modeled in its analytic universe. That dataset contains more than 10.5 million “repayment events.” The growth of the data Upstart attempts to harness is startling. Starting from essentially no information in 2014, Upstart’s models studied about 30,000 data points in 2015 and 1.1 million in 2017.
The company’s models currently touch more than 10.5 million data points. They are getting dizzyingly complex. Initially, analysis simply focused on the likelihood of loan default. But as time passed, more complex models were applied throughout the loan process, including fee optimization, borrower income fraud, acquisition targeting, prepayment predictions, and identity fraud.
The tools brought to bear to study these new statistical themes seem to come from every aisle of the big-data big-box store.
Researchers are beginning to wonder if bigger may not always be better when it comes to automated intelligence. A Cornell University study submitted by Mark Chen, but credited to dozens of other researchers, revealed that building complex AI models using vast swaths of information can be effective. But there are also significant limits.
Researchers say machine learning is most accurate when it’s designed to solve a specific problem. But as the domain of the problem broadens, performance decreases.
Regulators and market participants have been cautious about such alternative lending. The Government Accountability Office released a report finding that alternative loans have a future. But the risk for such loans remains significant. And those close to these transactions sometimes wonder if lending standards are simply too lax.
“It frankly just feels a bit too easy to get money,” said one credit investor familiar with Upstart’s platform.