Business Problem
Manual document processing was creating bottlenecks: assessors spent hours extracting data from unstructured documents, decisions were inconsistent due to human bias, and rejected cases had no explainability.
The platform needed to automate extraction, compute risk scores in real time, and give every decision a clear, auditable explanation for compliance teams.
Architecture — End-to-End Flow
① Sources → Kafka (real-time) · Blob Storage · REST API · Synapse · Storage Queue
↓
② Schema Registry + Data Contracts → validation gate pre-Bronze
↓
③ Bronze Layer → raw immutable Delta tables · append-only · CDF enabled
↓
④ Silver Layer → parse · validate · deduplicate · LLM extraction (ai_extract)
↓
⑤ Feature Store → pre-computed features · training/serving parity · point-in-time correct
↓
⑥ ML + AI Layer → XGBoost scoring · LLM recommendation · Vector Search + RAG
↓
⑦ Gold Layer → decisions · risk profiles · pipeline health tables
↓
⑧ Model Serving Endpoint → REST API · A/B traffic split · inference logging
↓
⑨ Consumption → Databricks SQL · Power BI · Chat Interface · REST API
How It Works — Step by Step
1. Multi-Source Ingestion
Five source systems: Kafka (high-throughput streaming), ADLS Blob, REST APIs, Synapse JDBC, and Azure Storage Queue, each with a dedicated ingestion pattern (Structured Streaming, Auto Loader, Python job, Synapse Connector, Azure Function trigger).
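A minimal sketch of two of the five patterns; the broker, topic, path, and storage account names are illustrative assumptions, not the project's real identifiers:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Kafka -> Bronze via Structured Streaming
kafka_stream = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")   # assumed broker
    .option("subscribe", "documents.events")            # assumed topic
    .load()
)

# ADLS Blob -> Bronze via Auto Loader (incremental file discovery)
blob_stream = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "/chk/doc_schemas")     # assumed path
    .load("abfss://landing@account.dfs.core.windows.net/docs/")  # assumed container
)
```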
2. Schema Registry + Data Contracts Gate
Confluent Schema Registry enforces Avro schema compatibility. YAML data contracts define the owner, SLA, and field-level expectations for each source. Great Expectations validates at the boundary; a contract breach blocks the data and triggers an alert.
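A hedged sketch of the validation gate, assuming Great Expectations' legacy SparkDFDataset API; the field names and thresholds are illustrative, not the project's real contract:

```python
from great_expectations.dataset import SparkDFDataset

def validation_gate(batch_df):
    """Block a batch pre-Bronze and raise (-> alert) on any contract breach."""
    suite = SparkDFDataset(batch_df)
    suite.expect_column_values_to_not_be_null("document_id")           # assumed field
    suite.expect_column_values_to_be_between("amount", 0, 10_000_000)  # assumed bounds
    result = suite.validate()
    if not result.success:
        # Contract breach: data is blocked here; alerting hooks into this path
        raise ValueError(f"Data contract breach: {result.statistics}")
    return batch_df
```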
3. Bronze → Silver Medallion Processing
Bronze stores raw, immutable data with Change Data Feed (CDF) enabled. Silver parses, validates, deduplicates via MERGE, applies 2-hour watermarks, and enriches with LLM extraction (ai_extract); invalid records are routed to a dead-letter queue (DLQ).
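A sketch of the dedup-and-MERGE step under assumed table, column, and checkpoint names; the ai_extract call mirrors the Databricks SQL AI function named above:

```python
from delta.tables import DeltaTable
# `spark` is the ambient SparkSession on Databricks

def upsert_to_silver(microbatch_df, batch_id):
    # Deduplicate within the microbatch, then MERGE new documents into Silver
    deduped = microbatch_df.dropDuplicates(["document_id"])
    (DeltaTable.forName(spark, "silver.documents").alias("t")   # assumed table
     .merge(deduped.alias("s"), "t.document_id = s.document_id")
     .whenNotMatchedInsertAll()
     .execute())

bronze_stream = (
    spark.readStream
    .option("readChangeFeed", "true")          # Bronze has CDF enabled
    .table("bronze.documents")                 # assumed table
)

(bronze_stream
 .withWatermark("ingest_ts", "2 hours")        # 2-hour late-data watermark
 .dropDuplicates(["document_id", "ingest_ts"])
 .writeStream.foreachBatch(upsert_to_silver)
 .option("checkpointLocation", "/chk/silver")  # assumed path
 .start())

# LLM enrichment with the ai_extract SQL function (assumed labels)
extracted = spark.sql("""
  SELECT *, ai_extract(raw_text, array('applicant_name', 'amount')) AS fields
  FROM silver.documents
""")
```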
4. Feature Store — Training/Serving Parity
Databricks Feature Store computes and stores features (debt_ratio, tx_velocity_30d, country_risk). The same feature logic is used in training and real-time serving, eliminating training-serving skew.
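A minimal sketch using the FeatureStoreClient API; the schema, key, and label names are assumptions:

```python
from databricks.feature_store import FeatureStoreClient, FeatureLookup

fs = FeatureStoreClient()

# features_df: Spark DataFrame with applicant_id, debt_ratio, tx_velocity_30d,
# country_risk (computed upstream; shown here as an assumed input)
fs.create_table(
    name="features.applicant_features",        # assumed schema/table
    primary_keys=["applicant_id"],             # assumed key
    df=features_df,
    description="Point-in-time correct applicant risk features",
)

# The same lookup feeds both offline training and online serving
training_set = fs.create_training_set(
    df=labels_df,                              # assumed labels DataFrame
    feature_lookups=[FeatureLookup(
        table_name="features.applicant_features",
        lookup_key="applicant_id",
    )],
    label="defaulted",                         # assumed label column
)
```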
5. XGBoost + LLM + Vector Search
XGBoost computes a 0–100 risk score with a SHAP explanation per prediction. An LLM (Llama 3.1 405B via ai_query) generates the recommendation text. Vector Search retrieves relevant policy documents, which are injected as RAG context into the LLM for grounded answers.
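A sketch of the scoring and retrieval calls; the model artifact, feature matrix, endpoint, index, and column names are illustrative assumptions:

```python
import shap
import xgboost as xgb
from databricks.vector_search.client import VectorSearchClient
# `spark` is the ambient SparkSession on Databricks

model = xgb.XGBClassifier()
model.load_model("risk_model.json")            # assumed artifact

# X: feature matrix fetched from the Feature Store (assumed input)
risk_scores = model.predict_proba(X)[:, 1] * 100          # 0-100 risk score
shap_values = shap.TreeExplainer(model).shap_values(X)    # per-feature contributions

# Retrieve policy context for RAG (assumed endpoint and index names)
vsc = VectorSearchClient()
index = vsc.get_index(endpoint_name="policy-vs", index_name="docs.policy_index")
context = index.similarity_search(
    query_text="debt ratio above policy threshold", columns=["chunk"], num_results=3
)

# Grounded recommendation text via the ai_query SQL function
recs = spark.sql("""
  SELECT ai_query(
    'llama-3-1-405b',  -- assumed serving endpoint name
    CONCAT('Given the policy context and a risk score of ',
           CAST(risk_score AS STRING), ', draft a recommendation.')
  ) AS recommendation
  FROM gold.decisions
""")
```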
6. Model Serving + Champion-Challenger
A Databricks Model Serving endpoint exposes scoring as a REST API (<200 ms P99) with a champion 90% / challenger 10% traffic split. A new model is proven in shadow mode before full rollout, and every prediction is logged for drift detection.
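A hedged sketch of the traffic split using the Databricks SDK; the endpoint name, model name, and versions are assumptions:

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.serving import (
    EndpointCoreConfigInput, ServedModelInput, TrafficConfig, Route,
)

w = WorkspaceClient()
w.serving_endpoints.create(
    name="risk-scoring",                                   # assumed endpoint
    config=EndpointCoreConfigInput(
        served_models=[
            ServedModelInput(name="champion", model_name="risk_model",
                             model_version="5", workload_size="Small",
                             scale_to_zero_enabled=False),
            ServedModelInput(name="challenger", model_name="risk_model",
                             model_version="6", workload_size="Small",
                             scale_to_zero_enabled=False),
        ],
        # Champion 90% / Challenger 10%, as described above
        traffic_config=TrafficConfig(routes=[
            Route(served_model_name="champion", traffic_percentage=90),
            Route(served_model_name="challenger", traffic_percentage=10),
        ]),
    ),
)
```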
7. Gold Layer + Consumption
Gold tables store the final decisions with risk score, LLM recommendation, and SHAP reasons. They are consumed by Databricks SQL dashboards, Power BI (DirectQuery), a conversational chat interface, and an external REST API for CRM integration.
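An illustrative query over the Gold decisions table (table and column names assumed):

```python
# `spark` is the ambient SparkSession on Databricks
spark.sql("""
  SELECT decision_id, risk_score, llm_recommendation, shap_top_reason
  FROM gold.decisions                              -- assumed table
  WHERE decision_ts >= current_date() - INTERVAL 7 DAYS
""").show()
```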
MLOps Practices
- Experiment tracking: Every training run logged — params, metrics, SHAP plots, LLM prompt versions
- Model registry: Staging → Production promotion with an evaluation gate (AUC must exceed the champion's; see the sketch after this list)
- Champion-Challenger: 10% traffic to new model, 2-week comparison before full rollout
- Drift monitoring: Evidently weekly report — auto-retrain triggered if drift exceeds 10%
- Feature Store: Eliminates training-serving skew — same features at training and inference time
- SHAP explainability: Every decision auditable — feature contribution per prediction logged to MLflow
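A sketch of the promotion gate, assuming MLflow's stage-based registry workflow; the model name and metric key are illustrative:

```python
from mlflow.tracking import MlflowClient

client = MlflowClient()

# Latest Staging (challenger) and Production (champion) versions
challenger = client.get_latest_versions("risk_model", stages=["Staging"])[0]
champion = client.get_latest_versions("risk_model", stages=["Production"])[0]

challenger_auc = client.get_run(challenger.run_id).data.metrics["auc"]  # assumed metric key
champion_auc = client.get_run(champion.run_id).data.metrics["auc"]

# Gate: promote only if the challenger beats the champion's AUC
if challenger_auc > champion_auc:
    client.transition_model_version_stage(
        name="risk_model", version=challenger.version,
        stage="Production", archive_existing_versions=True,
    )
```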
Governance & Compliance
- Unity Catalog: Column masking on PII fields and row-level security per business unit (see the sketch after this list)
- GDPR compliance: Retention policies on all tables, right-to-erasure via VACUUM, PII tagged
- Data lineage: Automatic end-to-end lineage — source to Gold — auditable in Unity Catalog
- Data Contracts: Schema evolution controlled, breaking changes blocked at ingestion boundary
- Audit logs: system.access.audit captures every read/write — full compliance trail
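A hedged sketch of these controls, issued from PySpark for consistency with the other examples; the function, table, and column names are assumptions:

```python
# `spark` is the ambient SparkSession on Databricks

# Column mask: only the compliance group sees raw PII (assumed names)
spark.sql("""
  CREATE FUNCTION IF NOT EXISTS governance.mask_ssn(ssn STRING)
  RETURNS STRING
  RETURN CASE WHEN is_account_group_member('compliance') THEN ssn ELSE '***' END
""")
spark.sql("ALTER TABLE gold.decisions ALTER COLUMN ssn SET MASK governance.mask_ssn")

# Compliance trail: recent Unity Catalog activity from the audit system table
spark.sql("""
  SELECT event_time, user_identity.email, action_name, request_params
  FROM system.access.audit
  WHERE service_name = 'unityCatalog'
  ORDER BY event_time DESC
  LIMIT 100
""").show()
```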