Sanket Muchhala - AI Engineer and Data Scientist

Hello, I'm

Sanket Muchhala

AI/ML Engineer

I build AI that ships. MS in Data Science from Indiana University Bloomington.

About

AI/ML Engineer with 3+ years of experience building scalable solutions using generative AI, LLMs, and NLP.

I specialize in designing agentic systems, document intelligence workflows, and ML pipelines deployed at scale. My work spans insurance, esports, and enterprise analytics, and covers the full path from research and development to production deployment of AI systems with real-world impact.

Technologies & Tools

Python, SQL, R, JavaScript, TensorFlow, PyTorch, Scikit-learn, FastAPI, MLflow, SpaCy, GPT-4, LangChain, RAG, Agentic AI, Vector DBs, NER, Text Classification, Summarization, Sentiment Analysis, Pandas, NumPy, PySpark, AWS, Azure Data Lake, Azure SQL, Tableau, Power BI, R Shiny

Education

MS in Data Science

Indiana University Bloomington, Aug 2022 – May 2024

Machine Learning, Deep Learning, NLP, Computer Vision, Applied Database Technologies, Data Mining

B.Tech in Computer Engineering

Thakur College of Engineering & Technology, Jun 2017 – Jun 2021

Data Structures, Algorithms, Database Systems, Operating Systems, Computer Networks

Experience

AI Engineer

Progressive Insurance, May 2024 – Present
  • Engineered custom NLP and CV models using TensorFlow and PyTorch to process and classify claim-related texts, forms, and images
  • Built ML pipelines with Apache Airflow and Azure Data Factory for streaming data from SQL Server and Azure Data Lake
  • Deployed models to production using Azure ML Services; managed versioning with MLflow and DVC
  • Achieved a 35% reduction in manual claim processing time and improved fraud detection accuracy by 25%

Research Assistant, Generative AI

Indiana University Bloomington, Dec 2023 – May 2024
  • Improved transcript accuracy by 18 percentage points using a GPT-4 RAG pipeline deployed on BigRed200, processing 200+ hours of esports videos
  • Reduced chat feature latency by 40% with a GPT-4 sentiment analysis microservice that processes 1M+ messages in near real time
  • Automated retraining pipelines using SLURM on HPC systems, cutting manual ETL effort by 6 hours per match

Data Analyst

IBM, Sep 2020 – Jun 2022
  • Led end-to-end development of a churn prediction model using Python and Scikit-learn, driving a 20% reduction in customer attrition
  • Refactored ETL workflows using Azure Data Lake and SQL, improving data availability and cutting processing time by 15%
  • Deployed ML models to Azure ML environments with CI/CD support, accelerating release cycles by 25%

Other Projects

LexOrchestrator

Litigation workflow agent demo with a multi-agent orchestration pipeline. Features hybrid RAG retrieval, citation verification, adversarial review, and eval scoring.

Next.js, TypeScript, Supabase, RAG, MCP

GTM Simulator

An open-source AI simulation engine for B2B founders. Simulates go-to-market strategies using MiroFish swarm intelligence to test messaging against buyer personas.

Vue 3, Flask, Python, Docker

Mirofish API Server

A monetized prediction API powered by the MiroFish swarm intelligence engine. Spawns multi-agent simulations where diverse AI agents debate and collectively predict outcomes.

FastAPI, Python, Web3, Docker