available · full-time · internships · usa

Building production-grade ML systems that ship — from multimodal transformers and RAG pipelines to real-time AWS inference APIs. M.S. Data Science & AI · UCM · 4.0 GPA.

4.0 GPA · MS AI
150 arXiv Papers RAG'd
62K+ Articles Analyzed
01 · about

ML Engineer & Data Scientist

I don't just build models — I ship ML systems. My work spans the full stack: statistical research through model training, containerized deployment, REST APIs, and CI/CD automation on AWS.

I'm currently an M.S. student in Data Science & AI at the University of Central Missouri with a perfect 4.0 GPA. I spend my time building production RAG pipelines, multimodal deep learning systems, and real-time fraud detection APIs you can actually hit with a curl command.

I'm looking for full-time & internship roles in Data Science, ML Engineering, or Applied AI in the United States — available immediately.

02 · arsenal

Tech Stack

Languages
Python · SQL · R · C++ · Java · Bash
Machine Learning
Scikit-learn · XGBoost · Random Forest · Logistic Regression · Decision Trees · KMeans · SMOTE · Feature Engineering · ROC-AUC
Deep Learning & NLP
PyTorch · TensorFlow · Keras · CNNs · Transformers · HuggingFace · BERT · DistilBERT · Wav2Vec2 · ViT · TF-IDF · Word Embeddings · FAISS · Sentence Transformers
Generative AI & LLMs
RAG Pipelines · LLMs · Gemini API · Vector Databases · BM25 · Prompt Engineering
Data & Statistics
Pandas · NumPy · SciPy · Statsmodels · SARIMAX · Prophet · EDA · Hypothesis Testing · A/B Testing · Time Series
Cloud & MLOps
AWS EC2 · S3 · SageMaker · Athena · Glue · Redshift · MLflow · Docker · FastAPI · Nginx · GitHub Actions · CI/CD
Visualization
Plotly · Dash · Streamlit · Matplotlib · Seaborn · Bokeh
03 · work

Featured Projects

Featured · RAG + LLM Pipeline

arXiv RAG — Semantic Research Q&A System

Production-structured RAG pipeline that answers questions about 150 machine learning research papers. It empirically compares BGE, MPNet, and MiniLM dense embeddings against a BM25 sparse baseline across chunk sizes (256–1024 tokens), measuring both retrieval and generation quality. BGE excels at retrieval (MRR@5 0.990, Precision@5 0.950) and at generation (Answer Relevance 0.912, Faithfulness 0.989).

MRR@5 · 0.990 (BGE)
Precision@5 · 95.0%
Answer Relevance · 0.912
150 arXiv Papers
Python · FAISS · SentenceTransformers · Gemini 2.5 Flash Lite · BM25 · Streamlit · PyMuPDF · Apple MPS
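MRR@5 and Precision@5 are rank-based retrieval metrics. As a minimal sketch, assuming each query comes with a ranked list of retrieved chunk IDs and a set of IDs judged relevant, they could be computed as follows. The function names and the toy chunk IDs are illustrative and are not taken from the project code.

```python
from typing import List, Sequence, Set, Tuple


def reciprocal_rank_at_k(ranked_ids: Sequence[str], relevant: Set[str], k: int = 5) -> float:
    """Return 1/rank of the first relevant hit within the top k, or 0.0 if none."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def precision_at_k(ranked_ids: Sequence[str], relevant: Set[str], k: int = 5) -> float:
    """Return the fraction of the top k results that are relevant."""
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant)
    return hits / k


def evaluate(queries: List[Tuple[Sequence[str], Set[str]]], k: int = 5) -> Tuple[float, float]:
    """Average both metrics over all queries: (MRR@k, mean Precision@k)."""
    mrr = sum(reciprocal_rank_at_k(r, rel, k) for r, rel in queries) / len(queries)
    prec = sum(precision_at_k(r, rel, k) for r, rel in queries) / len(queries)
    return mrr, prec


# Toy data (made up): two queries, each with a ranked list and a relevant set.
toy_queries = [
    (["c3", "c1", "c9", "c4", "c7"], {"c1", "c4"}),  # first relevant hit at rank 2
    (["c2", "c8", "c5", "c6", "c0"], {"c2"}),        # first relevant hit at rank 1
]
mrr_at_5, precision_at_5 = evaluate(toy_queries, k=5)
```

Running the same evaluation once per embedding model and once for BM25 gives directly comparable numbers, as long as the relevance judgments are held fixed.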
Featured · Production MLOps · AWS

Real-Time Credit Card Fraud Detection

End-to-end production ML system. An automated GitHub Actions CI/CD workflow trains, evaluates, and promotes models to the MLflow Model Registry (PostgreSQL on RDS). A Dockerized FastAPI inference API runs behind Nginx on AWS EC2. XGBoost and Random Forest reach ROC-AUC 1.000 with a SMOTE-balanced pipeline.

ROC-AUC 1.000
MLflow Registry
CI/CD Automated
AWS Deployed
XGBoost · MLflow · FastAPI · Docker · AWS EC2 · RDS · S3 · GitHub Actions · Nginx
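The evaluate-then-promote step of a CI/CD workflow like this can be sketched in plain Python. This is an illustration under assumptions, not the project's pipeline: `roc_auc` computes the metric from its pairwise-ranking definition, and `should_promote` is a hypothetical gate a workflow might run before registering a new model version. The MLflow registry call itself is left out.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. This O(P*N) loop is fine for a sketch; a real
    pipeline would normally use a library implementation.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def should_promote(candidate_auc: float, production_auc: float, min_gain: float = 0.0) -> bool:
    """Hypothetical promotion gate: promote only if the candidate beats production
    by more than `min_gain` on the same held-out set."""
    return candidate_auc - production_auc > min_gain


# Toy held-out labels and scores (made up): 1 = fraud, 0 = legitimate.
y_true = [0, 0, 1, 1]
candidate_scores = [0.10, 0.40, 0.35, 0.80]
production_scores = [0.50, 0.60, 0.40, 0.55]

candidate_auc = roc_auc(y_true, candidate_scores)
production_auc = roc_auc(y_true, production_scores)
promote = should_promote(candidate_auc, production_auc)
```

Scoring both models on an untouched, non-resampled holdout matters here: SMOTE belongs in the training fold only, otherwise synthetic points leak into evaluation and inflate the metric.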
🎭 Deep Learning · Multimodal Fusion

Multimodal Emotion Recognition via Cross-Modal Transformers

Fuses text (BERT), audio (Wav2Vec2), and video (ViT) using a custom 4-layer Transformer Encoder (8 attention heads) for 7-class emotion classification on MELD. Mixed-precision AMP training, staged fine-tuning, cached tensor preprocessing, and class-weighted loss for minority emotion handling.

Accuracy · 59.96%
Weighted F1 · 60.66%
3 Modalities Fused
PyTorch · BERT · Wav2Vec2 · ViT · HuggingFace · AMP · OpenCV · Librosa
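Class-weighted loss is one common way to handle minority emotions. The sketch below assumes inverse-frequency ("balanced") weights and a weight-normalized mean of per-example cross-entropy. The project's actual weighting scheme is not stated, the helper names are hypothetical, and the label counts are made up for illustration.

```python
import math
from collections import Counter
from typing import Dict, Sequence


def balanced_class_weights(labels: Sequence[str]) -> Dict[str, float]:
    """Inverse-frequency weights: w_c = N / (K * n_c), so rare classes weigh more."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def weighted_cross_entropy(
    probs: Sequence[Dict[str, float]],
    labels: Sequence[str],
    weights: Dict[str, float],
) -> float:
    """Weighted mean of -log p(true class), normalized by the sum of applied weights."""
    total, norm = 0.0, 0.0
    for p, y in zip(probs, labels):
        w = weights[y]
        total += -w * math.log(p[y])
        norm += w
    return total / norm


# Toy imbalanced label set (made-up counts, two of the seven MELD classes).
train_labels = ["neutral"] * 6 + ["fear"] * 2
class_weights = balanced_class_weights(train_labels)

# Two predictions: confident and right on "neutral", unsure on "fear".
batch_probs = [{"neutral": 0.9, "fear": 0.1}, {"neutral": 0.6, "fear": 0.4}]
batch_labels = ["neutral", "fear"]
loss = weighted_cross_entropy(batch_probs, batch_labels, class_weights)
```

With these weights, a mistake on the rare class contributes three times as much to the loss as a mistake on the common one, which pushes the model away from always predicting the majority emotion.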
📰 NLP · Transformers vs Classical

Fake News Detection — Classical NLP vs Transformers

Systematic comparison of TF-IDF + Logistic Regression vs fine-tuned DistilBERT on 62K+ articles. Dynamic F1-threshold optimization, NLTK preprocessing, feature explainability from TF-IDF weights, and real-time NewsAPI integration for live article classification via Streamlit.

62K+ Articles
DistilBERT Fine-tuned
Dynamic Threshold
PyTorch · DistilBERT · TF-IDF · Scikit-learn · NLTK · Streamlit · NewsAPI
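One way to read "dynamic F1-threshold optimization" is a sweep over candidate decision thresholds on held-out probabilities, keeping the one with the highest F1. A minimal sketch under that assumption, with hypothetical function names and toy data:

```python
from typing import Sequence, Tuple


def f1_at_threshold(labels: Sequence[int], probs: Sequence[float], threshold: float) -> float:
    """F1 for the positive class when predicting 1 wherever prob >= threshold."""
    tp = fp = fn = 0
    for y, p in zip(labels, probs):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_f1_threshold(labels: Sequence[int], probs: Sequence[float]) -> Tuple[float, float]:
    """Try every distinct predicted probability as a threshold; return (threshold, F1).

    On ties, the lowest threshold wins because candidates are scanned in ascending order.
    """
    candidates = sorted(set(probs))
    best = max(candidates, key=lambda t: f1_at_threshold(labels, probs, t))
    return best, f1_at_threshold(labels, probs, best)


# Toy validation set (made up): 1 = fake, 0 = real.
val_labels = [0, 0, 1, 1, 1, 0]
val_probs = [0.10, 0.40, 0.35, 0.80, 0.70, 0.60]
threshold, f1 = best_f1_threshold(val_labels, val_probs)
```

The threshold should be chosen on a validation split and then frozen, so the test-set F1 is not tuned against itself.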
Live · Time Series · Deployed on Render

US COVID-19 Mortality Intelligence & Forecasting System

End-to-end epidemiological forecasting: live JHU data ingestion → preprocessing → feature engineering → multi-model benchmarking (SARIMAX, Prophet, XGBoost) → interactive Plotly Dash dashboard with 30-day forecasts and 95% CI bands. Seasonal SARIMAX(1,1,1)(1,1,1,7) wins with Holdout MAE 138.14 ≈ CV MAE 137.86; the close agreement between holdout error and 5-fold rolling cross-validation error suggests little overfitting.

Holdout MAE · 138.14
CV MAE · 137.86
30-Day Forecast · 95% CI
3,300+ Records
5-Fold Rolling CV
Python · SARIMAX · Prophet · XGBoost · Plotly Dash · Statsmodels · Docker · Render
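Rolling cross-validation for a forecaster can be sketched as expanding-window splits, each followed by a fixed-horizon test block. The example below assumes 5 folds with a 30-step test horizon (the project's fold horizon is not stated) and uses a seasonal-naive forecast that repeats the last 7 days as a stand-in model. It is not SARIMAX, and the helper names are hypothetical.

```python
from typing import Iterator, List, Sequence, Tuple


def rolling_origin_splits(
    n_obs: int, n_folds: int = 5, horizon: int = 30
) -> Iterator[Tuple[range, range]]:
    """Yield (train_idx, test_idx) pairs; training always ends where testing begins."""
    first_cutoff = n_obs - n_folds * horizon
    if first_cutoff <= 0:
        raise ValueError("series too short for the requested folds and horizon")
    for fold in range(n_folds):
        cutoff = first_cutoff + fold * horizon
        yield range(0, cutoff), range(cutoff, cutoff + horizon)


def seasonal_naive_forecast(history: Sequence[float], horizon: int, period: int = 7) -> List[float]:
    """Stand-in model: repeat the last observed season (e.g. the last 7 days)."""
    last_season = list(history[-period:])
    return [last_season[h % period] for h in range(horizon)]


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rolling_cv_mae(series: Sequence[float], n_folds: int = 5, horizon: int = 30, period: int = 7) -> float:
    """Average MAE over all folds, refitting (here: re-reading history) at each origin."""
    fold_errors = []
    for train_idx, test_idx in rolling_origin_splits(len(series), n_folds, horizon):
        history = [series[i] for i in train_idx]
        actual = [series[i] for i in test_idx]
        fold_errors.append(mae(actual, seasonal_naive_forecast(history, horizon, period)))
    return sum(fold_errors) / len(fold_errors)


# Synthetic, perfectly weekly-periodic series (made up) to exercise the splitter.
toy_series = [float(t % 7) for t in range(210)]
cv_error = rolling_cv_mae(toy_series)
```

Because every test block lies strictly after its training window, no future observations leak into fitting, which is what makes a CV MAE comparable to a final holdout MAE.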
04 · experience

Professional History

Feb 2025 — May 2026
University of Central Missouri · Warrensburg, MO
Graduate Laboratory Assistant
  • Support 40+ students per semester in data science labs — Python programming, data analysis, and ML algorithm implementation
  • Guide students through data preprocessing, EDA, and visualization workflows using Pandas, NumPy, and Matplotlib
May 2023 — Jul 2023
AWS Academy – AICTE EduSkills · Virtual
Data Analytics Virtual Intern
  • Designed analytics pipelines on AWS to ingest and process datasets using Amazon S3, AWS Glue, and Amazon Athena
  • Queried large semi-structured datasets (10K+ records) using Athena SQL; maintained metadata in AWS Glue Data Catalog
  • Conducted EDA and predictive modeling using Python, Jupyter, and Amazon SageMaker
May 2022 — Jun 2022
Swecha · Hyderabad, Telangana
Data Scientist Intern
  • Processed structured and semi-structured datasets using Python (Pandas, NumPy) for ML workflows
  • Performed EDA using Matplotlib and Seaborn; trained and evaluated scikit-learn models with feature engineering
05 · education

Academic Background

Aug 2024 — May 2026
Master of Science
Data Science & Artificial Intelligence
University of Central Missouri · Warrensburg, MO
GPA: 4.0 / 4.0 ⭐ · In Progress
Nov 2020 — Apr 2024
Bachelor of Technology (Hons.)
Computer Science & Engineering
CMR College of Engineering & Technology · Hyderabad, India
GPA: 9.06 / 10.0 🎍 · First Class · Distinction

Certifications

Introduction to Model Context Protocol
Anthropic Education
Apr 2026
AWS Academy Graduate — Data Analytics
Amazon Web Services
Jun 2023
AWS Academy Graduate — Cloud Foundations
Amazon Web Services
Jun 2023
Foundations: Data, Data, Everywhere
Google / Coursera
Feb 2023
Artificial Intelligence & Machine Learning
Blackbuck Engineers
Dec 2021

Honors & Awards

🥇
Global Rank 1 — CodeChef May Long Two (Division 4 Rated)
🏆
Finalist — IICC National Coding Competition (Top 1% · 100,000+ participants)
🥉
Bronze Medal in Academics — Ranked Top 3 in B.Tech Program
5★ HackerRank · 4★ CodeChef — Data Structures & Algorithms
🔥
GeeksforGeeks POTD Streak — 255 Consecutive Days Solved
🎖️
District Rank 4 — SAST Olympiad
06 · articles

Medium Articles

Long-form breakdowns on applied deep learning and production AI systems — written for practitioners who care about the details.

07 · contact

Let's Connect

Actively seeking full-time & internship Data Science, ML Engineering, and Applied AI roles in the United States. Available immediately. Let's build something that matters.

Open to Opportunities

Data Science & AI · MS · University of Central Missouri

Data Science · Machine Learning · MLOps · NLP · Deep Learning · AWS · RAG