Cash Flow Credit Scoring | DSC 180B Capstone

Introduction

Why Traditional Credit Scoring Falls Short

Traditional credit scoring models rely heavily on past repayment records, systematically putting credit-thin individuals such as students, new immigrants, and cash-based workers at a disadvantage when applying for loans.

To address this gap, we implement a cash-flow underwriting model that measures credit risk by leveraging income stability, spending patterns, and liquidity dynamics from bank transaction data.

Instead of depending solely on historical credit lines, our model parses everyday cash flows to evaluate creditworthiness more inclusively via a dynamic metric: the Cash Score.

Project Scope: We strictly focused on feature engineering and predictive modeling using raw transaction logs to forecast delinquency risk. We did not test for demographic fairness due to the strict anonymization of the dataset.

Inclusive Assessments

By removing the dependency on historical debt usage, we differentiate risk among borrowers with similar financial profiles but no traditional credit footprint.

Data-Driven Thresholds

We use tree-based boosting models, evaluated by AUC-ROC, so that lenders can set customizable risk thresholds that match their own risk tolerance.

Transparent Decisions

We produce interpretable signals through actionable "reason codes" so underwriters can see which financial behaviors are driving a score.

Data

Hierarchical Financial Records

We utilized a proprietary, anonymized dataset provided by Prism Data, consisting of millions of banking records securely linked to specific evaluation dates.

Target Variable

Delinquency (DQ)

Our target was forecasting a delinquent payment, defined as "a payment that is late or missed past its due date." Our population consisted of ~10% delinquent consumers (DQ=1) and ~90% non-delinquent consumers (DQ=0).

Exclusion Funnel

9,319 Valid Users

We started with 12,000 consumers and applied a strict screening funnel: dropping duplicates in transaction/account IDs, removing consumers without valid accounts, and dropping users with fewer than 3 months of transaction history.
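The screening steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the project's actual pipeline: the function name, the tuple layouts, the reading of "valid account" as "at least one account record", and the choice to count history as distinct calendar months are all assumptions.

```python
from datetime import date


def apply_exclusion_funnel(consumer_ids, accounts, transactions, min_months=3):
    """Return the consumer IDs that pass all three screening steps.

    consumer_ids: list of consumer IDs
    accounts:     list of (consumer_id, account_id)
    transactions: list of (consumer_id, transaction_id, posted_date)
    """
    # Step 1: drop duplicate account and transaction IDs (first occurrence wins).
    unique_accounts = {}
    for consumer_id, account_id in accounts:
        unique_accounts.setdefault(account_id, consumer_id)
    unique_txns = {}
    for consumer_id, txn_id, posted in transactions:
        unique_txns.setdefault(txn_id, (consumer_id, posted))

    # Step 2: keep consumers that own at least one account
    # (assumption: "valid account" means any account record exists).
    has_account = set(unique_accounts.values())

    # Step 3: count distinct calendar months with activity per consumer
    # (assumption: history length = number of distinct year-month pairs).
    active_months = {}
    for consumer_id, posted in unique_txns.values():
        active_months.setdefault(consumer_id, set()).add((posted.year, posted.month))

    return [
        c for c in consumer_ids
        if c in has_account and len(active_months.get(c, ())) >= min_months
    ]


# Toy data: consumer 2 has no account, consumer 3 has only two months of history.
accounts = [(1, 10), (1, 10), (3, 30)]
transactions = [
    (1, 100, date(2021, 1, 5)), (1, 101, date(2021, 2, 5)),
    (1, 102, date(2021, 3, 5)), (1, 102, date(2021, 3, 5)),  # duplicate ID
    (2, 200, date(2021, 1, 1)), (2, 201, date(2021, 2, 1)), (2, 202, date(2021, 3, 1)),
    (3, 300, date(2021, 1, 9)), (3, 301, date(2021, 2, 9)),
]
kept = apply_exclusion_funnel([1, 2, 3], accounts, transactions)
```

In this toy run only consumer 1 survives all three steps.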

Consumer Dataframe: Each row represents a unique consumer and their target variable (DQ) indicating delinquency status.
Consumer ID	Evaluation Date	Credit Score	DQ Target
1608	2021-08-01	746.0	0.0
8752	2023-10-21	441.0	1.0
5606	2023-12-06	600.0	0.0

Account Dataframe: Each row represents a unique account linked to a consumer, containing financial details like balance and account type.
Consumer ID	Account ID	Account Type	Balance Date	Balance
6754	13777	CHECKING	2023-04-15	23.67
1283	2291	CHECKING	2021-01-31	1153.85
1624	5015	SAVINGS	2021-05-28	544.34

Transaction Dataframe: Each row represents a unique transaction, detailing the amount, date, and category of the transaction.
Consumer ID	Transaction ID	Category	Amount	Credit or Debit	Posted Date
10961	3835508	1	14.00	CREDIT	2021-09-21
14792	4010056	14	21.58	DEBIT	2021-08-30
13182	5183051	17	8.35	DEBIT	2021-07-12

Category Mapping: Each category ID corresponds to a specific type of transaction.
Category ID	Category
0	SELF TRANSFER
1	EXTERNAL_TRANSFER
2	DEPOSIT

Methods

Data Wrangling & Feature Engineering

Our approach transforms raw, irregular transaction time-series into structured, consumer-level tabular features across multiple temporal dimensions.
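As a rough illustration of turning irregular transactions into consumer-level features over several time windows, the sketch below sums credit inflows in trailing windows that end at the evaluation date. The function name, the 30-day approximation of a month, and the treatment of every CREDIT as an inflow are assumptions made for the example, not details confirmed by the project.

```python
from datetime import date, timedelta


def inflow_window_features(transactions, evaluation_date, windows_months=(1, 3, 6, 12)):
    """Aggregate credit inflows over trailing windows before the evaluation date.

    transactions: list of (posted_date, amount, credit_or_debit)
    """
    features = {}
    for months in windows_months:
        # Assumption: a "month" is approximated as 30 days.
        start = evaluation_date - timedelta(days=30 * months)
        # Only use activity strictly before the evaluation date to avoid look-ahead.
        inflows = [
            amount
            for posted, amount, kind in transactions
            if kind == "CREDIT" and start <= posted < evaluation_date
        ]
        features[f"inflow_total_{months}m"] = sum(inflows)
        features[f"inflow_count_{months}m"] = len(inflows)
    return features


toy_transactions = [
    (date(2021, 7, 20), 100.0, "CREDIT"),
    (date(2021, 6, 1), 50.0, "CREDIT"),
    (date(2021, 7, 25), 30.0, "DEBIT"),    # ignored: not an inflow
    (date(2021, 8, 5), 999.0, "CREDIT"),   # ignored: after the evaluation date
    (date(2020, 12, 1), 20.0, "CREDIT"),
]
features = inflow_window_features(toy_transactions, date(2021, 8, 1))
```

Each consumer ends up with one flat dictionary of features, which is the tabular shape that tree-based models expect.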

01

Feature Engineering

We aggregated raw events into comprehensive financial profiles capturing liquidity, stability, and consumption.

Feature Categories

Income Features (Inflow stability): Monthly income, income volatility, source count, regularity, recency across 1, 3, 6, and 12-month windows.
Income-to-Spending Ratio Features (Consumption burden): Category ratios, multi-window summaries (1/3/6/9m), and income-adjusted intensity.
Balance Features (Liquidity trajectory): Reconstructed running balance series, volatility, drawdowns, trend slopes, and overdraft flags.
Account Features (Resource composition): Total/avg balances, account diversity, dispersion, negative exposure, and wealth tiers.
Temporal Behavior Features (Financial regularity): Weekday spending habits, bill-cycle timing (day-of-month), transaction periodicity, spectral stability (CWT).
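The balance features rely on a reconstructed running balance. One plausible way to rebuild it, sketched below, is to start from the account's balance snapshot and undo transactions in reverse order. The helper names are hypothetical, and the sketch assumes the snapshot reflects the balance after every listed transaction has posted.

```python
from datetime import date


def reconstruct_running_balance(snapshot_balance, transactions):
    """Walk backward from a balance snapshot to recover the balance after each transaction.

    transactions: list of (posted_date, amount, credit_or_debit), all posted
    on or before the snapshot date (assumption).
    Returns a list of (posted_date, balance_after_transaction), oldest first.
    """
    ordered = sorted(transactions, key=lambda t: t[0])
    balance = snapshot_balance
    series = []
    for posted, amount, kind in reversed(ordered):
        series.append((posted, balance))
        signed = amount if kind == "CREDIT" else -amount
        balance -= signed  # undo this transaction to get the earlier balance
    series.reverse()
    return series


def max_drawdown(balances):
    """Largest drop from a running peak to a later balance."""
    peak, worst = balances[0], 0.0
    for value in balances:
        peak = max(peak, value)
        worst = max(worst, peak - value)
    return worst


def has_overdraft(balances):
    """Flag any negative balance in the series."""
    return any(value < 0 for value in balances)


toy = [
    (date(2021, 1, 1), 50.0, "CREDIT"),
    (date(2021, 1, 5), 80.0, "DEBIT"),
    (date(2021, 1, 10), 200.0, "CREDIT"),
]
series = reconstruct_running_balance(100.0, toy)
balances = [value for _, value in series]
```

Volatility and trend slopes can then be computed from the same `balances` list.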

02

Feature Selection

To reduce high dimensionality and eliminate noise, we applied rigorous selection methodologies to isolate the most predictive signals.

Selection Techniques

L1 (Lasso) Regularization: Drove the weights of less predictive features to zero.
Feature Importance: Retained the top 50 features with the largest importance weights.
Max_Features Hyperparameter: Utilized embedded feature selection directly within tree models.
Zero-Variance Elimination: Automatically removed features with zero variance.
Collinearity Screening: Dropped redundant features whose pairwise correlation exceeded 0.85.
Manual Inspection: Manually picked out and removed certain uninterpretable features to preserve reason-code clarity.
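Two of the techniques above, zero-variance elimination and collinearity screening, are simple enough to sketch directly. The version below is illustrative only: it assumes Pearson correlation compared in absolute value, and it keeps whichever feature of a correlated pair appears first. The source does not specify either choice.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists with non-zero variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def drop_zero_variance(columns):
    """Remove features that take a single constant value."""
    return {name: values for name, values in columns.items() if len(set(values)) > 1}


def drop_collinear(columns, threshold=0.85):
    """Greedily keep a feature only if it is not highly correlated with one already kept."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical feature columns for four consumers.
toy_columns = {
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, 6.0, 8.1],    # nearly a multiple of "a"
    "c": [1.0, -1.0, 1.0, -1.0],  # weakly correlated with "a"
    "d": [7.0, 7.0, 7.0, 7.0],    # constant
}
selected = drop_collinear(drop_zero_variance(toy_columns))
```

Running the zero-variance filter first matters here, because a constant column would make the correlation denominator zero.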

03

Model Training & Tuning

We trained multiple models to predict the probability of delinquency using a strict 80-20 Train-Test split.

Optimization Details

We used Optuna to tune hyperparameters for our gradient-boosted decision trees (XGBoost, LightGBM) and evaluated Logistic Regression as a baseline.

Because our target labels were highly skewed (~10% DQ), we avoided standard accuracy as a metric. We applied imbalanced-data handling techniques (such as scale_pos_weight and balanced class weights) and evaluated model performance strictly with AUC-ROC.
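A short sketch shows why accuracy misleads at a ~10% positive rate and how the two weighting schemes are commonly computed. The formulas below follow the usual conventions (negatives divided by positives for `scale_pos_weight`, and `n_samples / (n_classes * class_count)` for balanced class weights); whether the project used exactly these values is an assumption.

```python
def scale_pos_weight(labels):
    """Ratio of negative to positive examples, a common setting for boosted trees."""
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    return negatives / positives


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    classes = sorted(set(labels))
    return {c: len(labels) / (len(classes) * labels.count(c)) for c in classes}


def accuracy(labels, predictions):
    """Share of predictions that match the labels."""
    return sum(1 for y, p in zip(labels, predictions) if y == p) / len(labels)


# Toy population mirroring the ~10% delinquency rate.
labels = [1] * 10 + [0] * 90
always_non_delinquent = [0] * len(labels)

spw = scale_pos_weight(labels)
weights = balanced_class_weights(labels)
naive_accuracy = accuracy(labels, always_non_delinquent)
```

A model that never predicts delinquency scores 90% accuracy on this toy population while catching no delinquent consumers, which is why a ranking metric such as AUC-ROC is the more informative choice.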

Results

Results & Discussion

Our final CatBoost model reached a test AUC-ROC of 0.8585, the highest among the models we benchmarked, demonstrating the predictive power of cash-flow underwriting and supporting transparent deployment in real lending workflows.

Model Performance

Test-set AUC-ROC scores for each benchmarked model.

Model	Test AUC
CatBoost	0.8585
XGBoost	0.8539
LightGBM	0.8222
Logistic Regression	0.7532
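For readers less familiar with the metric in the table, AUC-ROC can be read as the probability that a randomly chosen delinquent consumer receives a higher predicted risk than a randomly chosen non-delinquent one, with ties counted as half. The pairwise sketch below computes it directly. It is meant for intuition on small inputs, not as the evaluation code used in the project.

```python
def auc_roc(labels, scores):
    """AUC-ROC via pairwise comparison of positive and negative scores."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc_roc(toy_labels, toy_scores)  # 3 of 4 positive-negative pairs are ordered correctly
```

A score of 0.5 corresponds to random ranking and 1.0 to perfect separation.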

Outputs

Model Outputs

Model Outputs: Cash Score + Top 3 Reason Codes for Each Consumer

The model produces two outputs for each consumer: a Cash Score and the top 3 factors influencing the model's prediction. These reason codes highlight the key behavioral signals that contributed to the risk assessment. Providing reason codes is important for transparency and helps ensure compliance with the Fair Credit Reporting Act (FCRA), which requires that consumers be given understandable explanations for adverse credit-related decisions.
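The source does not say how reason codes are derived. One common approach, assumed here, is to take per-feature contributions to the predicted risk (for example, SHAP-style attributions) and report the three that push risk up the most. The feature names and reason texts below are hypothetical.

```python
def top_reason_codes(contributions, reason_text, k=3):
    """Return readable reasons for the k features that raise predicted risk the most.

    contributions: dict of feature name -> contribution toward delinquency
                   (positive values increase predicted risk)
    reason_text:   dict of feature name -> human-readable explanation
    """
    risk_raising = [(name, value) for name, value in contributions.items() if value > 0]
    risk_raising.sort(key=lambda item: item[1], reverse=True)
    return [reason_text.get(name, name) for name, _ in risk_raising[:k]]


# Hypothetical contributions for one consumer.
toy_contributions = {
    "overdraft_flag_3m": 0.42,
    "income_volatility_6m": 0.31,
    "avg_balance_12m": -0.25,   # lowers risk, so it is never reported as a reason
    "spend_to_income_3m": 0.18,
    "weekday_spend_share": 0.02,
}
toy_reason_text = {
    "overdraft_flag_3m": "Recent overdraft activity",
    "income_volatility_6m": "Irregular income over the past 6 months",
    "spend_to_income_3m": "High spending relative to income",
}
reasons = top_reason_codes(toy_contributions, toy_reason_text)
```

Restricting the list to risk-raising features keeps the explanation focused on what hurt the score, which is what an adverse-decision notice needs to convey.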

Conclusion

The Future of Credit

Heat map of delinquency rates by Cash Score and Credit Score bins, showing delinquency patterns differing within Credit Score bands.

Delinquency Rates of Cash Scores Within Credit Bins

Traditional credit scoring inherently excludes millions of financially responsible individuals. By shifting the paradigm to cash flow underwriting, we successfully demonstrated the ability to leverage transaction-level behavior to evaluate repayment risk.

As the heat map shows, Cash Scores are intended to support traditional credit scores, not replace them. Credit scores often rely on fixed cutoff thresholds for approval decisions, so they may not fully capture short-term liquidity conditions or behavioral risk differences among borrowers with similar scores. The Cash Score addresses this gap: by providing additional screening within each credit score band, it helps differentiate repayment risk among consumers who appear identical under traditional scoring models. For example, individuals with high credit scores but weak cash-flow stability may still face elevated default risk. Incorporating the Cash Score therefore enhances risk stratification and supports more informed lending decisions.
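The grid behind a heat map like this can be computed by binning both scores and taking the delinquency rate per cell. The sketch below is an assumption-laden illustration: the bin edges, the Cash Score scale, and the function name are hypothetical, since the source specifies none of them.

```python
from bisect import bisect_right
from collections import defaultdict


def delinquency_rate_grid(records, cash_edges, credit_edges):
    """Delinquency rate per (credit_bin, cash_bin) cell.

    records: iterable of (cash_score, credit_score, dq) with dq in {0, 1}
    Bin index i covers scores in [edges[i-1], edges[i]); index 0 is below the first edge.
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [delinquent, total]
    for cash, credit, dq in records:
        cell = (bisect_right(credit_edges, credit), bisect_right(cash_edges, cash))
        counts[cell][0] += dq
        counts[cell][1] += 1
    return {cell: delinquent / total for cell, (delinquent, total) in counts.items()}


# Hypothetical scores: one edge per axis gives a 2x2 grid.
toy_records = [
    (400, 700, 1), (450, 720, 0),  # high credit score, low Cash Score
    (600, 700, 0), (620, 710, 0),  # high credit score, high Cash Score
    (300, 600, 1),                 # low credit score, low Cash Score
]
grid = delinquency_rate_grid(toy_records, cash_edges=[500], credit_edges=[650])
```

In this toy grid, consumers in the same credit band show different delinquency rates depending on their Cash Score bin, which is the pattern the heat map is meant to surface.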

Our Team

Contributors

Ada Mo
admo@ucsd.edu

Brighton Chan
chc@ucsd.edu

Haris Saif
hasaif@ucsd.edu

Kyle Choi
k3choi@ucsd.edu

Mentor: Kyle Nero
kyle.nero@prismdata.com

Mentor: Daniel Mathew
daniel.mathew@prismdata.com

Evaluate Credit Risk with Cash Flow Underwriting