Available · AI Engineer

Sushil Dalavi.

Production AI Systems for Retrieval, ML & Data

AI Engineer at USC Annenberg Norman Lear Center — architecting AWS data platforms, hybrid retrieval pipelines, and multi-modal ML systems at scale.

1M+ Records Processed
99.3% Alignment F1
21.8% MRR Lift
30% Latency Reduction
Los Angeles, CA · USC '26
About

A short bio

Architect-minded AI engineer. Measurable outcomes over ceremony.

I'm an AI Engineer at the USC Annenberg Norman Lear Center and an MS in Computer Science candidate at the University of Southern California. I architect production AI systems that span AWS data platforms, hybrid retrieval, and large-scale ML pipelines — with an emphasis on measurable outcomes, reliability, and reproducibility.

My current work fuses multi-modal signals — audio, speaker diarization, and caption streams — into alignment systems reaching 99.3% F1 across 1M+ multi-region records ingested through S3, Glue, SageMaker, and Bedrock.

Previously at Reliance Jio, I deployed quantized transformer inference and ResNet/DenseNet vision networks into production, cutting p95 latency by 30% and lifting recall on medical anomaly detection by 35%.

Focus

Retrieval-Augmented Generation · Hybrid Retrieval & Reranking · LLM Inference & Evaluation · Distributed Workflows · AWS Data Platforms · MLOps & Observability · Multi-Modal ML
Education
University of Southern California
MS in Computer Science
Los Angeles, CA
Aug 2024 to May 2026

University of Mumbai
BE in Computer Engineering
Mumbai, India
Jun 2019 to May 2023

Relevant Coursework

Machine Learning · Deep Learning · Distributed Systems · Information Retrieval · Natural Language Processing
Selected Work

Systems I've shipped

Production AI, retrieval, and distributed workflow platforms — architected end-to-end.

01 · Workflows · Retrieval

JobSense

Durable Distributed Workflow Platform

A fault-tolerant orchestration platform built on Temporal — coordinating 12 tool integrations, automated retries, human-in-the-loop checkpoints, and a provider-agnostic inference gateway with semantic caching and CI regression gates.

  • Durable orchestration on Temporal with 12 tool integrations, automated retries, human-in-the-loop checkpoints, and end-to-end observability
  • Provider-agnostic inference gateway with multi-backend failover, Redis semantic caching, and structured-output validation
  • CI regression gates blocking merges on quality or cost drift
  • Hybrid retrieval (BM25, dense vector, cross-encoder rerank) fused with Reciprocal Rank Fusion, benchmarked end-to-end
Python · FastAPI · Temporal · PostgreSQL · Redis · BM25 · Cross-Encoder · RRF

Shipped Metrics

12 Tool Integrations
Hybrid + RRF Retrieval
Multi-Backend Gateway

Stack

Python
FastAPI
Temporal
PostgreSQL
Redis
BM25
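As an illustration of the fusion step named in the JobSense bullets, Reciprocal Rank Fusion can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the JobSense implementation: the function name, the constant `k=60` (a commonly used default), and the toy document IDs are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed default, not a tuned value.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Toy rankings standing in for BM25 and dense-vector results (made-up IDs).
bm25_hits = ["d3", "d1", "d2"]
dense_hits = ["d1", "d4", "d3"]
fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
print(fused)  # ['d1', 'd3', 'd4', 'd2']
```

Documents that rank well in both lists (`d1`, `d3`) rise above those found by only one retriever. In a pipeline like the one described, a cross-encoder would then rerank the top of this fused list.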
02 · LLM Systems · MLOps

ScribeAI

Inference Service with Evaluation Pipeline

An async FastAPI inference service with SSE streaming, multi-backend routing (GPT-4o, Claude, fallback engine), an MLflow-tracked evaluation harness, and a compliance-aware pipeline with entity-level redaction and append-only audit logging.

  • Async FastAPI inference service with SSE streaming, multi-backend routing (GPT-4o, Claude, fallback), and graceful degradation under upstream failure
  • MLflow-tracked evaluation harness running ROUGE, BLEU, BERTScore, faithfulness, and leakage checks on every model change
  • Automated regression alerts on metric drift across versioned model releases
  • Compliance-aware pipeline: entity-level redaction across 10+ PII types, encrypted storage via pgcrypto, and append-only audit logging
Python · FastAPI · Qdrant · MLflow · pgcrypto · SSE · GPT-4o · Claude

Shipped Metrics

10+ PII Types
5 Eval Metrics
SSE Streaming

Stack

Python
FastAPI
Qdrant
MLflow
pgcrypto
SSE
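The multi-backend routing with graceful degradation described for ScribeAI can be pictured roughly as follows. This is a hedged sketch only: the backend callables, the `AllBackendsFailed` exception, and the function name are hypothetical stand-ins, and no real provider SDK is called.

```python
class AllBackendsFailed(RuntimeError):
    """Raised when every configured backend has errored (hypothetical name)."""


def route_with_fallback(prompt, backends):
    """Try each (name, callable) backend in order; return the first success.

    Errors are collected so the final exception explains what went wrong.
    """
    errors = []
    for name, call in backends:
        try:
            return name, call(prompt)
        except Exception as exc:  # broad on purpose: any upstream failure degrades
            errors.append(f"{name}: {exc}")
    raise AllBackendsFailed("; ".join(errors))


# Stand-in backends for illustration; a real service would wrap provider clients.
def flaky_primary(prompt):
    raise TimeoutError("upstream timed out")


def local_fallback(prompt):
    return f"[fallback] {prompt}"


used, text = route_with_fallback(
    "Summarize the meeting notes",
    [("primary", flaky_primary), ("fallback", local_fallback)],
)
print(used, "->", text)
```

Ordering the list by preference keeps the routing policy in data rather than in control flow, so adding or reordering backends needs no code change.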
03 · RAG · Retrieval

ScholarRAG

Retrieval & Data Engineering System

A hybrid retrieval pipeline for scholarly discovery — combining dense embeddings, BM25, RRF, and MiniLM reranking over 120+ evaluation queries, with DOI/title normalization and citation-aware grounding.

  • Hybrid retrieval pipeline (dense, BM25, RRF, MiniLM rerank) lifting MRR by 21.8% and nDCG@10 by 18.0% across a 120+ query evaluation harness
  • Reduced duplicate indexing by 50% and re-ingestion time by 60% via DOI/ID/title normalization and SHA-256 content hashing
  • Improved answer grounding from 0.505 to 0.616 faithfulness and claim support from 45.4% to 85.6%
  • Evidence-constrained generation with citation-aware prompting across heterogeneous scholarly sources
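For readers unfamiliar with the MRR figure quoted above, the metric itself is simple to compute. The sketch below uses made-up queries and relevance labels purely for illustration; it is not the ScholarRAG evaluation harness, and the function and variable names are assumptions.

```python
def mean_reciprocal_rank(results, relevant):
    """Average of 1/rank of the first relevant document per query.

    results:  {query: [doc_id, ...]} ranked best-first
    relevant: {query: {doc_id, ...}} ground-truth relevant documents
    A query with no relevant document in its ranking contributes 0.
    """
    if not results:
        return 0.0
    total = 0.0
    for query, ranked in results.items():
        gold = relevant.get(query, set())
        for rank, doc_id in enumerate(ranked, start=1):
            if doc_id in gold:
                total += 1.0 / rank
                break
    return total / len(results)


# Toy data: q1 finds its relevant doc at rank 2, q2 at rank 1.
results = {"q1": ["a", "b", "c"], "q2": ["x", "y", "z"]}
relevant = {"q1": {"b"}, "q2": {"x"}}
mrr = mean_reciprocal_rank(results, relevant)
print(mrr)  # (1/2 + 1/1) / 2 = 0.75
```

A relative lift such as the one reported compares this value for the hybrid pipeline against a baseline retriever over the same query set.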
Python · FastAPI · pgvector · PostgreSQL · MiniLM · BM25 · RRF · SHA-256

Shipped Metrics

+21.8% MRR Lift
0.616 Faithfulness
85.6% Claim Support

Stack

Python
FastAPI
pgvector
PostgreSQL
MiniLM
BM25
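The duplicate-indexing reduction in ScholarRAG is attributed to DOI/ID/title normalization plus SHA-256 content hashing. One plausible shape for that idea is sketched below. The normalization rules, the choice to hash the normalized DOI (falling back to the title), and the sample records with a placeholder DOI are all assumptions for illustration, not the project's actual logic.

```python
import hashlib
import re


def normalize_doi(doi):
    """Lowercase a DOI and strip common URL or 'doi:' prefixes."""
    doi = doi.strip().lower()
    return re.sub(r"^(https?://(dx\.)?doi\.org/|doi:\s*)", "", doi)


def normalize_title(title):
    """Lowercase, drop punctuation, and collapse whitespace."""
    title = re.sub(r"[^\w\s]", "", title.lower())
    return " ".join(title.split())


def content_key(record):
    """SHA-256 of the normalized DOI if present, else of the normalized title."""
    doi = record.get("doi")
    basis = normalize_doi(doi) if doi else normalize_title(record["title"])
    return hashlib.sha256(basis.encode("utf-8")).hexdigest()


def deduplicate(records):
    """Keep the first record seen for each content key."""
    seen = {}
    for record in records:
        seen.setdefault(content_key(record), record)
    return list(seen.values())


# Placeholder records: the first two are the same paper written two ways.
records = [
    {"doi": "https://doi.org/10.1000/XYZ123", "title": "Hybrid Retrieval"},
    {"doi": "doi:10.1000/xyz123", "title": "Hybrid retrieval."},
    {"doi": None, "title": "A Different Paper"},
]
unique = deduplicate(records)
print(len(unique))  # 2
```

Storing the hash alongside each indexed record also lets a re-ingestion run skip anything whose key is already present.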
Experience

Where I've built

Shipping production ML and data systems — from research labs to telecom platforms.

Jun 2025 — Present
AI Engineer · USC Annenberg Norman Lear Center
Los Angeles, CA

Architecting AWS data platforms, multi-modal alignment systems, and large-scale batch ML pipelines for media and social-impact research.

  • Architected an AWS data platform (S3, Glue, SageMaker, Bedrock) ingesting, deduplicating, and normalizing 1M+ multi-region records for downstream ML training and retrieval workloads
  • Shipped a multi-modal alignment system fusing audio, speaker diarization, and caption streams — reaching 99.3% F1 and 99.9% coverage on ground-truth evaluation
  • Developed large-scale batch pipelines processing long-form video and audio through Whisper ASR, pyannote diarization, and model-based refinement stages
  • Automated dataset QA, Unicode normalization, and deduplication in Python, distilling 10,819 raw inputs into 9,735 analysis-ready records (about 90% retained) with full reproducibility
Python · AWS · S3 · Glue · SageMaker · Bedrock · Whisper · pyannote
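The Unicode normalization and deduplication step in the last bullet can be illustrated with the standard library alone. This is a sketch under assumptions: the choice of NFKC, case-insensitive matching, and the sample strings are illustrative and not taken from the Lear Center pipeline.

```python
import unicodedata


def normalize_text(text):
    """Apply NFKC (folds full-width letters and ligatures) and collapse whitespace."""
    text = unicodedata.normalize("NFKC", text)
    return " ".join(text.split())


def dedupe_rows(rows):
    """Drop empty rows and rows that match an earlier one after normalization."""
    seen = set()
    kept = []
    for row in rows:
        cleaned = normalize_text(row)
        key = cleaned.casefold()  # case-insensitive comparison key
        if key and key not in seen:
            seen.add(key)
            kept.append(cleaned)
    return kept


# Made-up inputs: full-width letters, an 'fi' ligature, and a blank row.
raw = ["Ｈｅｌｌｏ  world", "Hello world", "ﬁnal cut", "final cut", "   "]
clean = dedupe_rows(raw)
print(clean)  # ['Hello world', 'final cut']
```

Because the function is deterministic and order-preserving, rerunning it on the same raw inputs yields the same cleaned set, which is the property the reproducibility claim depends on.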
Dec 2023 — Jul 2024
Software Engineer · Reliance Jio Platforms
Navi Mumbai, India

Trained and deployed deep vision and quantized transformer models into production, and engineered demand-forecasting microservices for business-critical workloads.

  • Trained and deployed ResNet-50 and DenseNet-121 deep vision networks for medical image anomaly detection — improving recall by 35% through transfer learning, augmentation, and loss tuning
  • Optimized quantized transformer inference (BERT, GPT-2) on GPU with batched serving — cutting p95 latency by 30% and lifting throughput while preserving 20% accuracy gains
  • Engineered demand-forecasting microservices (TFT, CatBoost, LSTM) over Hive SQL batch pipelines, reducing forecast MAPE by 25% for business-critical workloads
  • Rolled out shadow-testing and canary-release workflows for 3 production ML upgrades, catching 2 latency regressions before fleet-wide deployment
Python · PyTorch · BERT · GPT-2 · ResNet · DenseNet · Hive SQL · CatBoost
Toolchain

What I build with

Stacks and tools I reach for when shipping production systems.

01

Languages

Python · C++ · Go · SQL · Bash · TypeScript · JavaScript

02

ML & Deep Learning

PyTorch · Hugging Face · ONNX Runtime · Quantization · LoRA / PEFT · Fine-Tuning · Distributed Training · MLflow

03

LLMs & Retrieval

Prompt Engineering · Function Calling · Structured Outputs · RAG · Hybrid Retrieval · Reranking · pgvector · Qdrant · RAGAS

04

Backend & Data Systems

FastAPI · Temporal · gRPC · REST · SSE · Kafka · Spark · PostgreSQL · Redis · MongoDB

05

Cloud & DevOps

AWS S3 · AWS Glue · SageMaker · Bedrock · GCP · Docker · Kubernetes · Linux · GitHub Actions · Prometheus · Grafana · Airflow

06

AI-Assisted Development

Claude Code · Cursor · GitHub Copilot · Codex
Off Hours

Beyond the terminal

What keeps perspective — the things worth stepping away for.

Interests

Football · 👑 Real Madrid · 🏊 Swimming · 🏓 Table Tennis · 🎬 Netflix · 🎧 Spotify · 🎮 Game Dev

Hala Madrid

Real Madrid CF

A proud Real Madrid supporter through and through: the mentality, history, and winning culture are unmatched.

Hala Madrid y nada más.

You miss 100% of the shots you don't take.

Wayne Gretzky
— Michael Scott
Contact

Let's build
something real.

I'm open to full-time SDE / AI engineering roles, research collaborations, and serious engineering conversations. Reach out and I'll reply within a day.

Say hello
AI Engineering · Retrieval Systems · ML Platforms · Open to SDE roles