available · full-time · internships · usa

Building production-grade ML systems that ship — from multimodal transformers and RAG pipelines to real-time AWS inference APIs. M.S. Data Science & AI · UCM · 4.0 GPA.

4.0 GPA · MS AI
120 arXiv Papers RAG'd
62K+ Articles Analyzed
01 · about

ML Engineer &
Data Scientist

I don't just build models — I ship ML systems. My work spans the full stack: statistical research through model training, containerized deployment, REST APIs, and CI/CD automation on AWS.

Currently an M.S. student in Data Science & AI at the University of Central Missouri with a perfect 4.0 GPA, I spend my time building production RAG pipelines, multimodal deep learning systems, and real-time fraud detection APIs you can actually hit with a curl command.

I'm looking for full-time & internship roles in Data Science, ML Engineering, or Applied AI in the United States — available immediately.

02 · arsenal

Tech Stack

Languages
Python · SQL · R · C++ · Java · Bash
Machine Learning
Scikit-learn · XGBoost · Random Forest · Logistic Regression · Decision Trees · KMeans · SMOTE · Feature Engineering · ROC-AUC
Deep Learning & NLP
PyTorch · TensorFlow · Keras · CNNs · Transformers · HuggingFace · BERT · DistilBERT · Wav2Vec2 · ViT · TF-IDF · Word Embeddings · FAISS · Sentence Transformers
Generative AI & LLMs
RAG Pipelines · LLMs · Gemini API · Vector Databases · BM25 · Prompt Engineering
Data & Statistics
Pandas · NumPy · SciPy · Statsmodels · SARIMAX · Prophet · EDA · Hypothesis Testing · A/B Testing · Time Series
Cloud & MLOps
AWS EC2 · S3 · SageMaker · Athena · Glue · Redshift · MLflow · Docker · FastAPI · Nginx · GitHub Actions · CI/CD
Visualization
Plotly · Dash · Streamlit · Matplotlib · Seaborn · Bokeh
03 · work

Featured Projects

Featured · RAG + LLM Pipeline

arXiv RAG — Semantic Research Q&A System

Production-structured RAG pipeline that answers questions over 120 arXiv ML papers. Benchmarks BGE, MPNet, and MiniLM embeddings against BM25 across chunk sizes and generation quality. BGE achieves a perfect MRR of 1.000 and 7.7× the Answer Relevance of BM25 (0.910 vs 0.118).

MRR 1.000 (BGE) | Precision@5 · 97.5% | Faithfulness · 97.8% | 120 arXiv Papers
Python · FAISS · SentenceTransformers · Gemini 2.5 Flash · BM25 · Streamlit · PyMuPDF
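For readers unfamiliar with the retrieval metrics quoted above, here is a minimal sketch of how MRR and Precision@k are commonly computed. It is illustrative only and not the project's evaluation code; the function names and the toy chunk ids are assumptions.

```python
from typing import List, Sequence, Set, Tuple


def reciprocal_rank(ranked_ids: Sequence[str], relevant: Set[str]) -> float:
    """Return 1/rank of the first relevant item, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def precision_at_k(ranked_ids: Sequence[str], relevant: Set[str], k: int = 5) -> float:
    """Fraction of the top-k retrieved items that are relevant."""
    top_k = ranked_ids[:k]
    return sum(doc_id in relevant for doc_id in top_k) / k


def mean_reciprocal_rank(runs: List[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average the reciprocal rank over (ranked_ids, relevant_ids) pairs."""
    return sum(reciprocal_rank(ranked, rel) for ranked, rel in runs) / len(runs)


# Hypothetical toy data: two queries with made-up chunk ids.
runs = [
    (["c1", "c7", "c3", "c9", "c4"], {"c1", "c3"}),  # first hit at rank 1
    (["c5", "c2", "c8", "c6", "c0"], {"c2"}),        # first hit at rank 2
]
mrr = mean_reciprocal_rank(runs)
p_at_5 = precision_at_k(runs[0][0], runs[0][1], k=5)
```

An MRR of 1.000 means the first retrieved chunk was relevant for every evaluated query.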
Featured · Production MLOps · AWS

Real-Time Credit Card Fraud Detection

End-to-end production ML system. An automated GitHub Actions CI/CD pipeline trains, evaluates, and promotes models to the MLflow Model Registry (PostgreSQL on RDS). A Dockerized FastAPI inference API runs behind Nginx on AWS EC2. XGBoost and Random Forest reach ROC-AUC 1.000 with a SMOTE-balanced pipeline.

ROC-AUC 1.000 | MLflow Registry | CI/CD Automated | AWS Deployed
XGBoost · MLflow · FastAPI · Docker · AWS EC2 · RDS · S3 · GitHub Actions · Nginx
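As a rough illustration of the evaluate-then-promote step, the sketch below computes ROC-AUC from scratch and applies a simple promotion gate. It is a dependency-free approximation under stated assumptions: `should_promote` and its `min_gain` parameter are hypothetical, the scores are made up, and the actual pipeline may gate on different criteria.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg),
    which is fine for a sketch but slow on large datasets.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def should_promote(candidate_auc: float, production_auc: float, min_gain: float = 0.0) -> bool:
    """Hypothetical gate: promote only if the candidate beats production by min_gain."""
    return candidate_auc - production_auc > min_gain


# Made-up fraud scores for six transactions (1 = fraud).
labels = [0, 0, 1, 0, 1, 1]
scores = [0.05, 0.20, 0.35, 0.40, 0.80, 0.95]
auc = roc_auc(labels, scores)
promote = should_promote(candidate_auc=auc, production_auc=0.85)
```

A score of exactly 1.000 means every fraudulent transaction in the evaluation set was ranked above every legitimate one.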
🎭 Deep Learning · Multimodal Fusion

Multimodal Emotion Recognition via Cross-Modal Transformers

Fuses text (BERT), audio (Wav2Vec2), and video (ViT) using a custom 4-layer Transformer Encoder (8 attention heads) for 7-class emotion classification on MELD. Mixed-precision AMP training, staged fine-tuning, cached tensor preprocessing, and class-weighted loss for minority emotion handling.

Accuracy · 59.96% | Weighted F1 · 60.66% | 3 Modalities Fused
PyTorch · BERT · Wav2Vec2 · ViT · HuggingFace · AMP · OpenCV · Librosa
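One common way to build a class-weighted loss for minority classes is inverse-frequency ("balanced") weighting. The sketch below shows that heuristic in plain Python. It is an assumption that the project uses this exact scheme, and the label counts are invented rather than MELD's real distribution.

```python
from collections import Counter
from typing import Dict, Sequence


def balanced_class_weights(labels: Sequence[str]) -> Dict[str, float]:
    """Weight each class by n_samples / (n_classes * class_count).

    Rare classes get weights above 1 and frequent classes below 1,
    so errors on minority emotions cost more during training.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Hypothetical label distribution (not MELD's actual class counts).
labels = ["neutral"] * 6 + ["joy"] * 3 + ["fear"] * 1
weights = balanced_class_weights(labels)
```

In PyTorch, weights like these would typically be converted to a tensor, ordered by class index, and passed as the `weight` argument of `torch.nn.CrossEntropyLoss`.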
📰 NLP · Transformers vs Classical

Fake News Detection — Classical NLP vs Transformers

Systematic comparison of TF-IDF + Logistic Regression vs fine-tuned DistilBERT on 62K+ articles. Dynamic F1-threshold optimization, NLTK preprocessing, feature explainability from TF-IDF weights, and real-time NewsAPI integration for live article classification via Streamlit.

62K+ Articles | DistilBERT Fine-tuned | Dynamic Threshold
PyTorch · DistilBERT · TF-IDF · Scikit-learn · NLTK · Streamlit · NewsAPI
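Dynamic F1-threshold optimization can be sketched as a sweep over candidate cut-offs, keeping the one with the best F1. The version below is a minimal illustration with made-up probabilities. The helper names are assumptions, and the project's own search strategy may differ.

```python
from typing import Sequence, Tuple


def f1_at_threshold(labels: Sequence[int], probs: Sequence[float], threshold: float) -> float:
    """F1 for the positive class when predicting 1 if prob >= threshold."""
    tp = fp = fn = 0
    for y, p in zip(labels, probs):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_f1_threshold(labels: Sequence[int], probs: Sequence[float]) -> Tuple[float, float]:
    """Try every distinct predicted probability as a cut-off; return (threshold, f1).

    Starts from the default 0.5 cut-off and only replaces it on a strict improvement.
    """
    best = (0.5, f1_at_threshold(labels, probs, 0.5))
    for t in sorted(set(probs)):
        f1 = f1_at_threshold(labels, probs, t)
        if f1 > best[1]:
            best = (t, f1)
    return best


# Made-up validation labels (1 = fake) and classifier probabilities.
labels = [0, 0, 1, 0, 1, 1]
probs = [0.10, 0.30, 0.35, 0.45, 0.60, 0.90]
threshold, f1 = best_f1_threshold(labels, probs)
```

The threshold is best chosen on a validation split and then frozen, so the reported test score is not inflated by tuning on the test set.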
Live · Time Series · Deployed on Render

US COVID-19 Mortality Intelligence & Forecasting System

End-to-end epidemiological forecasting: live JHU data ingestion → preprocessing → feature engineering → multi-model benchmarking (SARIMAX, Prophet, XGBoost) → interactive Plotly Dash dashboard with 30-day forecasts and 95% CI bands. Seasonal SARIMAX(1,1,1)(1,1,1,7) wins with Holdout MAE 138.14 ≈ CV MAE 137.86 — near-zero overfitting confirmed across 5-fold rolling cross-validation.

Holdout MAE · 138.14 | CV MAE · 137.86 | 30-Day Forecast · 95% CI | 3,300+ Records | 5-Fold Rolling CV
Python · SARIMAX · Prophet · XGBoost · Plotly Dash · Statsmodels · Docker · Render
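Rolling cross-validation for a time series keeps every test window strictly after its training data. The sketch below assumes an expanding training window and uses a seasonal-naive forecaster as a stand-in for SARIMAX so that it runs without dependencies. The function names, the 7-day horizon, and the toy series are all assumptions, not the project's configuration.

```python
from typing import Callable, Iterator, List, Sequence, Tuple


def rolling_origin_splits(n: int, n_folds: int, horizon: int) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges; each fold trains on everything before its test window."""
    first_cut = n - n_folds * horizon
    if first_cut <= 0:
        raise ValueError("series too short for the requested folds and horizon")
    for fold in range(n_folds):
        cut = first_cut + fold * horizon
        yield range(0, cut), range(cut, cut + horizon)


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def seasonal_naive(train: Sequence[float], horizon: int, period: int = 7) -> List[float]:
    """Stand-in forecaster: repeat the last observed weekly cycle (needs len(train) >= period)."""
    last_cycle = train[-period:]
    return [last_cycle[i % period] for i in range(horizon)]


def rolling_cv_mae(
    series: Sequence[float],
    forecaster: Callable[[Sequence[float], int], List[float]],
    n_folds: int = 5,
    horizon: int = 7,
) -> float:
    """Average the per-fold MAE over all rolling-origin folds."""
    fold_errors = []
    for train_idx, test_idx in rolling_origin_splits(len(series), n_folds, horizon):
        train = [series[i] for i in train_idx]
        test = [series[i] for i in test_idx]
        fold_errors.append(mae(test, forecaster(train, horizon)))
    return sum(fold_errors) / len(fold_errors)


# Hypothetical series: a perfectly repeating weekly pattern, 8 weeks long.
series = [10, 12, 11, 13, 15, 9, 8] * 8
cv_mae = rolling_cv_mae(series, seasonal_naive)
splits = list(rolling_origin_splits(len(series), n_folds=5, horizon=7))
```

Comparing the cross-validated MAE with the MAE on a final holdout window, as the project summary does, is a quick check for overfitting: a large gap between the two would be a warning sign.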
04 · experience

Professional History

Feb 2025 — May 2026
University of Central Missouri · Warrensburg, MO
Graduate Laboratory Assistant
  • Support 40+ students per semester in data science labs — Python programming, data analysis, and ML algorithm implementation
  • Guide students through data preprocessing, EDA, and visualization workflows using Pandas, NumPy, and Matplotlib
May 2023 — Jul 2023
AWS Academy – AICTE EduSkills · Virtual
Data Analytics Virtual Intern
  • Designed analytics pipelines on AWS to ingest and process datasets using Amazon S3, AWS Glue, and Amazon Athena
  • Queried large semi-structured datasets (10K+ records) using Athena SQL; maintained metadata in AWS Glue Data Catalog
  • Conducted EDA and predictive modeling using Python, Jupyter, and Amazon SageMaker
May 2022 — Jun 2022
Swecha · Hyderabad, Telangana
Data Scientist Intern
  • Processed structured and semi-structured datasets using Python (Pandas, NumPy) for ML workflows
  • Performed EDA using Matplotlib and Seaborn; trained and evaluated scikit-learn models with feature engineering
05 · education

Academic Background

Aug 2024 — May 2026
Master of Science
Data Science & Artificial Intelligence
University of Central Missouri · Warrensburg, MO
GPA: 4.0 / 4.0 ⭐ In Progress
Nov 2020 — Apr 2024
Bachelor of Technology (Hons.)
Computer Science & Engineering
CMR College of Engineering & Technology · Hyderabad, India
GPA: 9.06 / 10.0 🎍 First Class · Distinction

Certifications

AWS Academy Graduate — Data Analytics
Amazon Web Services
Jun 2023
AWS Academy Graduate — Cloud Foundations
Amazon Web Services
Jun 2023
Foundations: Data, Data, Everywhere
Google / Coursera
Feb 2023
Artificial Intelligence & Machine Learning
Blackbuck Engineers
Dec 2021

Honors & Awards

🥇
Global Rank 1 — CodeChef May Long Two (Division 4 Rated)
🏆
Finalist — IICC National Coding Competition (Top 1% · 100,000+ participants)
🥉
Bronze Medal in Academics — Ranked Top 3 in B.Tech Program
5★ HackerRank · 4★ CodeChef — Data Structures & Algorithms
🔥
GeeksforGeeks POTD Streak — 255 Consecutive Days Solved
🎖️
District Rank 4 — SAST Olympiad
06 · articles

Medium Articles

Long-form breakdowns on applied deep learning and production AI systems — written for practitioners who care about the details.

07 · contact

Let's Connect

Actively seeking full-time & internship Data Science, ML Engineering, and Applied AI roles in the United States. Available immediately. Let's build something that matters.

Open to Opportunities

Data Science & AI · MS · University of Central Missouri

Data ScienceMachine LearningMLOpsNLPDeep LearningAWSRAG