Case Study

Numida Reduces Fraud and Accelerates Underwriting with ML on AWS

At a glance

Numida partnered with New Math Data to build a machine learning pipeline that detects internal and external fraud in digital loan applications. Built on AWS-native services including Amazon SageMaker, AWS Lambda, and AWS Step Functions, the solution scores applications in real time, retrains weekly, and supports rapid iteration. It improves risk assessment and lays the foundation for scalable fraud prevention across new financial products.

Company Snapshot

Numida

Numida empowers Africa’s overlooked micro- and small businesses (MSBs) with digital financial services tailored for growth.

Location

Africa

Numida Builds Real-Time Fraud Detection to Support Scalable Credit Decisions

As one of Africa’s leading digital lenders to micro- and small businesses (MSBs), Numida needed a way to identify both internal and external fraud before it impacted revenue or risk scores.

Internal fraud, where staff improperly approve loans to boost their commission or collude with applicants, required a sensitive, high-precision approach. External fraud, where applicants borrow with no intent to repay, posed a growing risk as Numida expanded its loan portfolio.

Working with New Math Data, Numida launched a proof of concept to move from manual fraud analysis to a production-grade, automated machine learning pipeline that retrains weekly and delivers scores in real time.

The pipeline not only flags suspicious activity at the point of submission but also establishes a robust foundation for model evolution as fraud patterns change.

Problem

Numida’s fraud detection was historically ad hoc. Loan records were stored in Amazon RDS, but manual analysis and slow iteration made it hard to respond to evolving fraud techniques, especially as new financial products came online.

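To prepare for scale, Numida needed a solution that could score applications at the point of submission, retrain on a regular schedule, and adapt as fraud patterns change.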

Solution

Building an AWS-Native ML Pipeline for Real-Time Fraud Scoring

New Math Data designed and implemented a machine learning pipeline tailored for structured data fraud detection using AWS-native services.

Training Workflow

Triggered by AWS Step Functions, a custom SageMaker training container pulls feature data from Amazon RDS, trains an XGBoost model on EC2, and saves results to S3.
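As a rough illustration of the training workflow, the sketch below builds a one-step Step Functions definition (Amazon States Language) that starts a SageMaker training job and waits for it to finish. It is not Numida's actual configuration: the function name, instance type, volume size, and runtime limit are assumptions, and the container image, IAM role, and S3 path are placeholders supplied by the caller. How the workflow is started each week is also assumed to be handled elsewhere, for example by a scheduled rule.

```python
import json


def build_training_state_machine(image_uri, role_arn, output_s3_uri):
    """Return a minimal Amazon States Language definition with one training step.

    All three arguments are caller-supplied placeholders (hypothetical values).
    """
    return {
        "Comment": "Hypothetical retraining workflow for a fraud model",
        "StartAt": "TrainFraudModel",
        "States": {
            "TrainFraudModel": {
                "Type": "Task",
                # Service integration that creates a SageMaker training job;
                # the .sync suffix makes the state wait until the job ends.
                "Resource": "arn:aws:states:::sagemaker:createTrainingJob.sync",
                "Parameters": {
                    # Training job names must be unique. Reusing the execution
                    # name assumes it is also a valid training job name.
                    "TrainingJobName.$": "$$.Execution.Name",
                    "AlgorithmSpecification": {
                        "TrainingImage": image_uri,  # custom training container
                        "TrainingInputMode": "File",
                    },
                    "RoleArn": role_arn,
                    "OutputDataConfig": {"S3OutputPath": output_s3_uri},
                    "ResourceConfig": {
                        "InstanceCount": 1,
                        "InstanceType": "ml.m5.xlarge",  # assumed size
                        "VolumeSizeInGB": 30,  # assumed size
                    },
                    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
                },
                "End": True,
            }
        },
    }


definition = build_training_state_machine(
    image_uri="<account>.dkr.ecr.<region>.amazonaws.com/fraud-train:latest",
    role_arn="arn:aws:iam::<account>:role/<training-role>",
    output_s3_uri="s3://<bucket>/fraud-model/",
)
definition_json = json.dumps(definition, indent=2)
```

In this sketch the container itself is responsible for reading feature data from the database and fitting the model, so the state machine only needs to know which image to run and where to write the output.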

Model Choice & Evaluation

XGBoost was selected for its strong performance on structured datasets and its support for feature importance scoring. Because fraud data is highly imbalanced, the F1 score was used to evaluate success. The model was trained on ~300,000 records (an 80/20 train/test split), of which only 5% were labeled fraudulent, and achieved an F1 score of 0.98 on the test set.
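To see why F1 is a better yardstick than accuracy at this class balance, consider the small calculation below. It assumes the ~300,000 figure is the full dataset and that the 5% fraud rate carries over to the 20% test portion (for instance through a stratified split). Neither is stated in the case study, so the counts are illustrative only and are not Numida's actual confusion matrix.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def accuracy(tp, fp, fn, tn):
    """Share of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


# Illustrative test set: 20% of ~300,000 records, about 5% of them fraud.
test_size = 300_000 // 5      # 60,000 records
fraud_count = test_size // 20  # 3,000 fraudulent
legit_count = test_size - fraud_count

# A useless baseline that labels every application as legitimate:
# it never raises a true or false alarm and misses every fraud case.
baseline_accuracy = accuracy(tp=0, fp=0, fn=fraud_count, tn=legit_count)
_, _, baseline_f1 = precision_recall_f1(tp=0, fp=0, fn=fraud_count)

print(f"baseline accuracy: {baseline_accuracy:.2f}")  # 0.95
print(f"baseline F1:       {baseline_f1:.2f}")        # 0.00
```

The do-nothing baseline looks strong on accuracy yet scores zero on F1, which is why a metric built from precision and recall on the fraud class is the more honest measure here.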

On-Demand Inference

An AWS Lambda function scores incoming applications in real time, calling the trained model, writing results back to RDS, and logging outputs for further analysis.
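The shape of such a function might look like the sketch below. The case study does not say how the Lambda function reaches the model or the database, so both are passed in as plain callables: `score_fn` and `save_fn` are hypothetical stand-ins, as are the event fields and the 0.5 threshold. The stubs at the bottom only show how the pieces fit together.

```python
import json
import logging

logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)

FRAUD_THRESHOLD = 0.5  # hypothetical cut-off, not taken from the case study


def make_handler(score_fn, save_fn, threshold=FRAUD_THRESHOLD):
    """Build a Lambda-style handler from injected scoring and storage callables.

    score_fn(features) -> probability-like score (stand-in for the model call)
    save_fn(result)    -> persists the result (stand-in for the database write)
    """

    def handler(event, context):
        score = float(score_fn(event["features"]))
        result = {
            "application_id": event["application_id"],
            "fraud_score": score,
            "flagged": score >= threshold,
        }
        save_fn(result)                  # write the score back to storage
        logger.info(json.dumps(result))  # structured log line for later analysis
        return {"statusCode": 200, "body": json.dumps(result)}

    return handler


# Usage with stubs in place of the real model and database.
saved = []
handler = make_handler(score_fn=lambda features: 0.91, save_fn=saved.append)
response = handler(
    {"application_id": "app-123", "features": {"loan_amount": 500}},
    None,
)
```

Injecting the two dependencies keeps the handler testable without any AWS resources; in a deployed function they would be replaced by the real model invocation and the database write.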

Scalable Infrastructure

The entire infrastructure was provisioned using Terraform and handed over with clear documentation to enable reuse across future ML workflows.
