Nishan Shetty
ML systems that work in the real world
Data Scientist and AI Engineer with 7+ years building production ML systems that are accurate, explainable, and fast enough to matter. Proven in regulated, multi-site production environments, on problems ranging from scheduling optimization and revenue forecasting to multi-agent AI systems.

Off the clock
Mt. Rainier · 2024
I'm a data scientist and AI engineer who likes building systems that are practical, explainable, and actually useful in the real world. Most of my professional work sits at the intersection of machine learning, operations, healthcare, and decision support. Outside of work I'm usually drawn to things that involve exploration, creativity, and learning how systems work.
I like to travel, especially through Europe, and I'm always interested in places with history, architecture, good food, and a little bit of mythology or mystery. Greece is high on my list, partly because I grew up fascinated by Greek mythology. I also enjoy road trips, camping, trying new recipes, making cocktails, and finding interesting local spots when I'm in a new city.
A lot of my interests overlap with how I approach data science: patterns, stories, constraints, tradeoffs, and building something useful from messy inputs. Whether I'm planning a trip, cooking, styling a space, or designing an AI workflow, I'm usually thinking about how the pieces fit together.
Outside of that, I'm into TV, movies, sports, home projects, and the occasional overly detailed itinerary. I'm happiest when I'm learning something new, solving a practical problem, or turning a vague idea into something tangible.
Scheduling Optimization Engine
The Problem
Staffing across 25+ locations was inefficient due to varying preferences, coverage needs, and shift constraints - costing the organization overtime and misallocated resources.
Approach
Modeled the problem as a constraint program over ~12k decision variables (CP-SAT) with shift-balance, skill-mix, and preference constraints; used a genetic algorithm warm-start to seed CP-SAT and cut solve time below the nightly batch window. SimPy discrete-event simulation stress-tested schedules against demand variability before deployment. Orchestrated on AWS Batch with Step Functions; FastAPI interface for leadership to run what-if scenarios (demand shock, headcount cuts, policy changes).
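A minimal sketch of the coverage model in OR-Tools CP-SAT, with a solver hint standing in for the genetic-algorithm warm start. The shift spans, demand numbers, and variable bounds here are toy assumptions, not the production model:

```python
# Minimal CP-SAT coverage sketch: integer staff-per-shift variables, a
# demand-coverage constraint per time block, and a warm-start hint
# standing in for the GA seed used in production.
from ortools.sat.python import cp_model

demand = [2, 3, 5, 5, 5, 4, 2, 2]                  # required staff per 3h block (toy data)
covers = {s: [s, (s + 1) % 8] for s in range(8)}   # assumption: each shift spans 2 blocks

model = cp_model.CpModel()
x = [model.NewIntVar(0, 10, f"x_{s}") for s in range(8)]  # staff assigned per shift

for b in range(8):
    # every block must meet demand from the shifts that cover it
    model.Add(sum(x[s] for s in range(8) if b in covers[s]) >= demand[b])

model.Minimize(sum(x))                             # minimize total headcount

# warm start: hint the solver with a feasible seed (the GA output in production)
for s in range(8):
    model.AddHint(x[s], demand[s])

solver = cp_model.CpSolver()
solver.parameters.max_time_in_seconds = 10
status = solver.Solve(model)
if status in (cp_model.OPTIMAL, cp_model.FEASIBLE):
    print([solver.Value(v) for v in x], "total =", solver.ObjectiveValue())
```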
Tech Stack
Impact
- 35% improvement in staffing efficiency across 25+ locations
- Reduced scheduling conflicts and overtime spend
- Solve time under 8 minutes on nightly batch (down from 4+ hours with prior heuristic)
- Enabled dynamic decision support for planners
Validated in a regulated, multi-site production environment.
[Interactive demo: live JS hill-climbing solver on an 8-shift ILP · required vs. allocated coverage across 8 shift patterns (00h–21h), 105 total demand-hours · production system uses OR-Tools CP-SAT on ~12k variables across 25+ locations]
Revenue Forecasting & Denial Risk System
The Problem
High denial rates and unpredictable accounts-receivable aging caused revenue leakage, with no early-warning system to flag it.
Approach
ARIMA captured trend and seasonality on the aggregate series; an LSTM modeled residuals against payer-mix and claim-aging features to capture non-linear dynamics ARIMA could not. Walk-forward cross-validation, drift monitoring, and nightly Lambda-triggered retraining on AWS SageMaker with S3-based feature storage. Predictions surfaced in Snowflake dashboards used by finance and operations.
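The hybrid structure, sketched below on synthetic data: ARIMA fits the aggregate series, and a second model learns the residuals from exogenous features. A gradient-boosted regressor stands in for the production LSTM to keep the sketch self-contained, and the payer-mix and claim-age features are illustrative:

```python
# Hybrid forecast sketch: ARIMA for trend/seasonality, residual model on
# exogenous features. GradientBoostingRegressor is a stand-in for the LSTM.
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n = 120                                             # 120 periods of synthetic revenue
t = np.arange(n)
y = pd.Series(100 + 0.5 * t + 10 * np.sin(2 * np.pi * t / 7) + rng.normal(0, 3, n))

arima = ARIMA(y, order=(1, 1, 1)).fit()
resid = y - arima.fittedvalues                      # what ARIMA could not explain

# hypothetical exogenous features: payer-mix share and mean claim age
X = pd.DataFrame({"payer_mix": rng.uniform(0.3, 0.7, n),
                  "claim_age": rng.uniform(10, 60, n)})
resid_model = GradientBoostingRegressor().fit(X, resid)

# one-step forecast = ARIMA forecast + predicted residual
point = arima.forecast(steps=1).iloc[0]
adj = resid_model.predict(X.tail(1))[0]
print(f"hybrid forecast: {point + adj:.1f}")
```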
Tech Stack
Impact
- 77% reduction in reprocessing costs
- 25% fewer denials
- 18% MAPE improvement over prior Prophet baseline
- Enabled proactive scenario-based revenue planning
Validated in a regulated, multi-site production environment.
[Demo metrics: ARIMA + seasonal decomposition · 12.4% forecast error (MAE) · 77%↓ reprocessing cost · 25%↓ denial rate · 18%↓ forecast error · synthetic data, 24-month simulation, nightly retraining cadence]
Incidental Findings Follow-up Agent
The Problem
Manual tracking of incidental findings in large volumes of unstructured documents led to missed follow-ups and compliance risk across 25+ locations.
Approach
LangChain RAG agent over a DynamoDB-indexed corpus of radiology and clinical notes, with LLM inference via AWS Bedrock. Built an evaluation harness measuring extraction precision and recall against a radiologist-labeled gold set, with guardrails for HIPAA-sensitive content and a human-in-the-loop review queue for low-confidence outputs. Integrated into HL7/FHIR-based follow-up scheduling and compliance workflows via FastAPI on SageMaker endpoints.
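The scoring core of the evaluation harness might look like the sketch below: set-match extracted findings against the radiologist-labeled gold set and report precision and recall. The normalization rule and the example findings are illustrative assumptions:

```python
# Eval-harness sketch: precision/recall of extracted findings vs. gold labels.
def normalize(finding: str) -> str:
    # illustrative normalization; production matching is more forgiving
    return " ".join(finding.lower().split())

def precision_recall(extracted: list[str], gold: list[str]) -> tuple[float, float]:
    pred = {normalize(f) for f in extracted}
    truth = {normalize(f) for f in gold}
    tp = len(pred & truth)                          # exact set matches
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(truth) if truth else 0.0
    return precision, recall

p, r = precision_recall(
    extracted=["4mm pulmonary nodule, RLL", "renal cyst"],
    gold=["4mm pulmonary nodule, rll", "renal cyst", "thyroid nodule"],
)
print(f"precision={p:.2f} recall={r:.2f}")          # precision=1.00 recall=0.67
```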
Tech Stack
Impact
- 30% increase in follow-up compliance
- 45% reduction in manual review effort
- 94% extraction precision on radiologist-labeled eval set
- Improved safety outcomes at scale
Validated in a regulated, multi-site production environment.
Segmentation, Causal Experimentation & Referral Network Platform
The Problem
Outreach campaigns were generic and underperforming. There was no visibility into which interventions actually caused behavior change, and no map of the referral relationships driving the most value.
Approach
Three-layer platform: (1) K-Means and hierarchical clustering on PCA-reduced behavioral features, validated with silhouette analysis and bootstrap stability tests; (2) NetworkX referral graph with PageRank-style scoring to surface high-value pathways; (3) X-Learner uplift modeling with CUPED variance reduction to measure causal lift, not correlation. Automated experiment tracking via AWS SageMaker and Lambda.
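The CUPED step is compact enough to sketch directly: estimate θ = cov(Y, X) / var(X) from a pre-period covariate X, then subtract θ·(X − mean(X)) from the in-experiment metric Y. Synthetic data for illustration:

```python
# CUPED variance-reduction sketch on synthetic data: the adjusted metric
# keeps the same mean but has lower variance, shrinking required runtime.
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(50, 10, 5000)                 # pre-experiment metric (covariate)
y = 0.8 * x + rng.normal(0, 5, 5000) + 2.0   # in-experiment metric

theta = np.cov(y, x)[0, 1] / np.var(x, ddof=1)
y_adj = y - theta * (x - x.mean())           # same mean, lower variance

print(f"var before: {y.var():.1f}, after CUPED: {y_adj.var():.1f}")
```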
Tech Stack
Impact
- 25% improvement in campaign ROI vs. control
- Identified highest-value referral pathways for targeted engagement
- Reduced experiment runtime ~40% via CUPED variance reduction
- Causal (not merely correlational) retention lift validated across 3 holdout cohorts
Validated in a regulated, multi-site production environment.
[Demo: patient segments, PCA projection (PC1 vs PC2) · referral network: 2.4k volume (+18%), 0.34 density (+0.07), 127 high-value paths (+31%), 25% campaign ROI vs. baseline · synthetic PCA projection, 4-cluster K-Means, 1,218-patient cohort simulation]
Automated Insight Narrative Generator
The Problem
Analysts spent hours each week manually interpreting dashboards and writing executive summaries - a bottleneck that delayed decisions and did not scale across a growing organization.
Approach
Agent ingests a structured dataset or dashboard export, runs automated anomaly detection and trend analysis, identifies the 3–5 most significant signals, and generates a polished executive-ready narrative using LLM synthesis. Built with Python, Claude API, and Pandas profiling.
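A minimal sketch of the signal-selection step, assuming KPI history arrives as a tidy DataFrame: z-score each KPI's latest value against its own history and keep the largest movers to seed the LLM prompt. The KPI names and cutoff are illustrative:

```python
# Signal-selection sketch: rank KPIs by how anomalous their latest value is.
import pandas as pd

def top_signals(history: pd.DataFrame, k: int = 3) -> pd.Series:
    mean, std = history.iloc[:-1].mean(), history.iloc[:-1].std()
    z = (history.iloc[-1] - mean) / std          # z-score of the latest period
    return z.abs().sort_values(ascending=False).head(k)

kpis = pd.DataFrame({
    "denial_rate": [9.1, 9.0, 9.2, 8.7],
    "days_in_ar": [40, 41, 39, 42],
    "net_collection": [96.1, 96.0, 96.2, 96.4],
})
print(top_signals(kpis))   # these ranked signals seed the narrative prompt
```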
Tech Stack
Impact
- ~70% reduction in analyst reporting time
- Consistent insight quality regardless of analyst experience level
- Closed the last mile between model output and executive decision
Validated in a regulated, multi-site production environment.
Q1 2025 · Revenue Cycle Dashboard

| Metric | Value | Change |
|---|---|---|
| Monthly Revenue | $4.2M | +12.3% |
| Patient Volume | 18,400 | -3.1% |
| Denial Rate | 8.7% | -1.9pp |
| Days in AR | 42 | +5.2% |
| Net Collection Rate | 96.4% | +0.8pp |
| New Patient Acq. | 1,240 | +22.1% |
IoT Predictive Maintenance & Failure Explanation Agent
The Problem
Fault-driven downtime across 10,000+ IoT-enabled assets was unpredictable and expensive. Technician logs contained valuable diagnostic signal that was never systematically analyzed.
Approach
Predictive maintenance models trained on sensor telemetry with lifecycle simulation. NLP pipeline extracted root cause patterns from unstructured technician logs and automated fault classification. Agent layer explains predicted failures in plain English with recommended interventions.
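A toy sketch of the telemetry side, assuming a single vibration channel: score drift against a rolling baseline and map it to a hypothetical 0-100 health score. The production models are trained on full multi-sensor telemetry with lifecycle simulation:

```python
# Health-score sketch: flag an asset when a sensor drifts past its baseline.
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
vibration = pd.Series(np.r_[rng.normal(1.0, 0.05, 200),
                            1.0 + np.linspace(0, 0.5, 40)])  # drift = incipient fault

baseline = vibration.rolling(48).mean()
spread = vibration.rolling(48).std()
z = (vibration - baseline) / spread
health = (100 - 20 * z.clip(lower=0)).clip(lower=0)          # hypothetical 0-100 score

print(f"latest z={z.iloc[-1]:.1f}, health score={health.iloc[-1]:.0f}")
```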
Tech Stack
Impact
- 22% reduction in fault-driven downtime
- Automated root cause classification integrated into operational workflows
Validated in a regulated, multi-site production environment.
[Demo: asset health score for C2-COMP-03 (air compressor) · failure predicted, est. 4–6 days · synthetic IoT telemetry, 24 h window, 10,000+ asset fleet simulation]
A/B Test Interpreter Agent
The Problem
Experiment results were routinely misread by non-technical stakeholders - statistical significance confused with practical significance, peeking bias ignored, and recommendations written without accounting for novelty effects.
Approach
Agent ingests raw experiment results, runs validity checks (sample ratio mismatch, peeking detection, novelty effect flagging), interprets statistical and practical significance, and writes a plain-English recommendation memo with confidence levels and caveats.
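One of the validity checks, sample ratio mismatch, reduces to a chi-square test of observed arm counts against the intended split. A minimal sketch with illustrative counts:

```python
# SRM check sketch: chi-square test of assignment counts vs. a 50/50 split.
from scipy.stats import chisquare

observed = [50_312, 49_102]                  # users per arm (illustrative)
expected = [sum(observed) / 2] * 2           # intended even split
stat, p = chisquare(observed, f_exp=expected)
if p < 0.001:
    print(f"SRM detected (p={p:.2e}): halt interpretation, audit assignment")
else:
    print("assignment ratio looks healthy")
```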
Tech Stack
Impact
- Eliminated systematic misinterpretation of experiment results
- 60% reduction in time from experiment close to decision
- Enabled non-technical teams to act on results independently
Validated in a regulated, multi-site production environment.
[Demo: experiment parameters · +25.0% relative lift · p < 0.001, significant at α = 0.05]
Self-Healing Pipeline Agent
The Problem
Data pipeline failure handling was reactive - teams discovered issues hours after silent data corruption had already propagated downstream into models and dashboards.
Approach
Monitoring agent wired into Airflow DAGs detects failures and classifies root cause (schema drift, upstream data issue, compute failure, or dependency timeout). Agent either auto-remediates known failure patterns or escalates with a structured incident report.
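A simplified sketch of the classify-then-route decision, with hypothetical failure signatures and remediation actions standing in for the production rule set:

```python
# Classify-then-route sketch: map a failure signature to a root-cause class
# and decide whether to auto-remediate or escalate with a report.
from dataclasses import dataclass

@dataclass
class Incident:
    task: str
    error: str

KNOWN_PATTERNS = {                           # hypothetical signature -> root cause
    "column not found": "schema_drift",
    "connection timed out": "dependency_timeout",
    "out of memory": "compute_failure",
}
AUTO_REMEDIABLE = {"dependency_timeout": "retry_with_backoff",
                   "compute_failure": "rerun_on_larger_instance"}

def handle(incident: Incident) -> str:
    cause = next((c for sig, c in KNOWN_PATTERNS.items()
                  if sig in incident.error.lower()), "unknown")
    if cause in AUTO_REMEDIABLE:
        return f"auto-remediate {incident.task}: {AUTO_REMEDIABLE[cause]}"
    return f"escalate {incident.task}: structured report, cause={cause}"

print(handle(Incident("load", "Connection timed out after 300s")))
print(handle(Incident("validate", "column not found: payer_id")))
```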
Tech Stack
Impact
- Reduced mean time to detection from hours to minutes
- Eliminated a class of silent data corruption failures
- Shifted team posture from reactive firefighting to proactive maintenance
Validated in a regulated, multi-site production environment.
[Interactive demo: Airflow DAG real-time status (Ingest → Validate → Transform → Load → Publish) with an "Inject Failure" control · synthetic DAG simulation, no real pipelines or data]
LTV & Churn Cohort Analyzer
The Problem
Churn was being measured in aggregate, masking the fact that specific acquisition cohorts were degrading months before the signal appeared in top-line retention metrics.
Approach
Cohort segmentation by acquisition channel, behavioral pattern, and tenure. LTV trajectory modeling per cohort with leading indicator identification. Agent surfaces which cohorts are at risk, why, and what intervention historically works for that segment.
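The retention matrix behind the heatmap further down can be computed with a short pandas pivot. A sketch on toy activity data, with illustrative column names:

```python
# Retention-matrix sketch: pivot user activity by acquisition cohort and
# months since acquisition, then normalize each row by its month-0 size.
import pandas as pd

events = pd.DataFrame({                      # one row per active user-month
    "user": [1, 1, 1, 2, 2, 3],
    "cohort": ["2024-01", "2024-01", "2024-01", "2024-01", "2024-01", "2024-02"],
    "month_offset": [0, 1, 2, 0, 1, 0],
})
active = events.groupby(["cohort", "month_offset"])["user"].nunique().unstack()
retention = active.div(active[0], axis=0)    # share of M0 cohort still active
print(retention.round(2))
```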
Tech Stack
Impact
- Enabled proactive retention investment 60–90 days before churn materialized in aggregate metrics
- Improved campaign targeting efficiency
- Improved LTV forecasting accuracy
Validated in a regulated, multi-site production environment.
Cohort Analysis
Retention Heatmap
| Cohort | M0 | M1 | M2 | M3 | M4 | M5 |
|---|---|---|---|---|---|---|
| C1 | 100% | 84% | 71% | 62% | 57% | 53% |
| C2 | 100% | 76% | 58% | 44% | 37% | 32% |
| C3 | 100% | 89% | 79% | 72% | 68% | 65% |
| C4 | 100% | 71% | 52% | 41% | 35% | 30% |
| C5 | 100% | 82% | 68% | 58% | 52% | 48% |
[Chart: 6-month LTV trajectory ($) by cohort · synthetic cohort data, 5 acquisition channels, 6-month window]
Capacity Planning Simulation
The Problem
Leadership made capacity decisions based on point estimates, with no visibility into the range of outcomes under different demand or resource scenarios.
Approach
Monte Carlo simulation engine ingests demand forecasts, resource constraints, and growth assumptions. Runs thousands of scenario iterations and outputs a capacity plan with confidence intervals, break-even thresholds, and a plain-English summary of key tradeoffs.
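A stripped-down sketch of the simulation core, with illustrative demand and capacity distributions: draw thousands of scenarios and summarize shortfall risk as percentile bands:

```python
# Monte Carlo capacity sketch: sample demand and capacity, report shortfall
# probability and percentile bands rather than a single point estimate.
import numpy as np

rng = np.random.default_rng(3)
n_iter = 2_000
demand = rng.lognormal(mean=np.log(1_000), sigma=0.2, size=n_iter)   # units/week
capacity = rng.normal(loc=1_100, scale=80, size=n_iter)              # units/week

shortfall = np.clip(demand - capacity, 0, None)
p50, p90 = np.percentile(shortfall, [50, 90])
print(f"P(shortfall) = {(shortfall > 0).mean():.0%}, "
      f"median shortfall = {p50:.0f}, P90 = {p90:.0f}")
```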
Tech Stack
Impact
- Replaced gut-feel capacity decisions with probabilistic scenario planning
- Enabled leadership to stress-test assumptions before committing capital or headcount
Validated in a regulated, multi-site production environment.
[Interactive demo: scenario parameters · synthetic Monte Carlo, 2,000 iterations, parametric model only]
Automated EDA & Data Quality Report Agent
The Problem
Every new dataset required hours of manual profiling before any modeling could begin - a recurring tax on every data science project.
Approach
Agent ingests a raw dataset, profiles distributions, missingness, outliers, correlations, and cardinality. Flags data quality issues with severity scores (critical / warning / info). Outputs a structured report with visualizations and recommended preprocessing steps ranked by impact.
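A minimal sketch of the profiling pass with severity flags; the thresholds are illustrative and the production agent runs many more checks:

```python
# Profiling sketch: flag missingness and outliers per column with severities.
import numpy as np
import pandas as pd

def profile(df: pd.DataFrame) -> list[tuple[str, str, str]]:
    issues = []
    for col in df.columns:
        miss = df[col].isna().mean()
        if miss > 0.2:
            issues.append((col, "critical", f"{miss:.0%} missing"))
        elif miss > 0.05:
            issues.append((col, "warning", f"{miss:.0%} missing"))
        if pd.api.types.is_numeric_dtype(df[col]):
            z = (df[col] - df[col].mean()) / df[col].std()
            outliers = (z.abs() > 4).mean()
            if outliers > 0.01:
                issues.append((col, "warning", f"{outliers:.1%} extreme outliers"))
    return issues

df = pd.DataFrame({"charge": [100, 120, np.nan, 9_000, 110, 95],
                   "payer": ["A", "B", None, None, "A", "B"]})
for issue in profile(df):
    print(issue)
```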
Tech Stack
Impact
- Reduced dataset onboarding from hours to minutes
- Standardized data quality assessment across all projects
- Gave non-technical stakeholders visibility into data health before models were built
Validated in a regulated, multi-site production environment.
[Demo scores: completeness 91 · consistency 83 · validity 88 · overall DQ 87 · synthetic dataset profile, 6 columns, 48,000-row simulation]
Multi-Agent Financial Scenario Planner
The Problem
Financial planning relied on single-point forecasts that could not capture the interdependencies between demand volatility, cost structure, and strategic decisions.
Approach
Multi-agent system with three specialized agents - a forecasting agent (demand and revenue projections), a risk assessment agent (downside scenario modeling and sensitivity analysis), and a narrative agent (plain-English synthesis). Agents hand off outputs sequentially with a shared state object.
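A skeletal sketch of the handoff pattern: a shared state object that each agent reads and enriches in turn. Agent internals are stubbed here; in production each step calls its own models and prompts:

```python
# Sequential multi-agent sketch: forecasting -> risk -> narrative, passing
# one shared state object. Field names and stub logic are illustrative.
from dataclasses import dataclass, field

@dataclass
class ScenarioState:
    assumptions: dict
    forecast: dict = field(default_factory=dict)
    risks: list = field(default_factory=list)
    narrative: str = ""

def forecasting_agent(s: ScenarioState) -> ScenarioState:
    s.forecast = {"revenue": s.assumptions["demand"] * s.assumptions["price"]}
    return s

def risk_agent(s: ScenarioState) -> ScenarioState:
    if s.assumptions["demand_volatility"] > 0.3:
        s.risks.append("downside: revenue sensitive to demand shocks")
    return s

def narrative_agent(s: ScenarioState) -> ScenarioState:
    s.narrative = (f"Projected revenue ${s.forecast['revenue']:,.0f}; "
                   f"{len(s.risks)} material risk(s) flagged.")
    return s

state = ScenarioState({"demand": 12_000, "price": 45.0, "demand_volatility": 0.35})
for agent in (forecasting_agent, risk_agent, narrative_agent):
    state = agent(state)
print(state.narrative)
```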
Tech Stack
Impact
- Enabled leadership to stress-test strategic decisions against multiple futures before committing
- Replaced static spreadsheet models with a dynamic, explainable scenario planning system
Validated in a regulated, multi-site production environment.