arXiv RAG — Semantic Research Q&A System
A production-structured RAG pipeline that answers questions about 150 machine learning research papers. It empirically compares three dense embedding models (BGE, MPNet, MiniLM) against a BM25 sparse baseline across chunk sizes (256–1024 tokens), measuring both retrieval and generation quality. BGE excels at retrieval (MRR@5 0.990, Precision@5 0.950) and at generation (Answer Relevance 0.912, Faithfulness 0.989).
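
For readers unfamiliar with the retrieval metrics quoted above, the sketch below shows how MRR@5 and Precision@5 are conventionally computed. It is a minimal illustration using the standard textbook definitions, not this project's evaluation code: the function names, the `runs` structure, and the document IDs are all hypothetical, and the pipeline may define relevance or handle edge cases differently.

```python
from typing import Callable, List, Sequence, Set, Tuple


def reciprocal_rank_at_k(retrieved: Sequence[str], relevant: Set[str], k: int = 5) -> float:
    """Return 1/rank of the first relevant item in the top k, or 0.0 if none appears."""
    for rank, doc_id in enumerate(retrieved[:k], start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def precision_at_k(retrieved: Sequence[str], relevant: Set[str], k: int = 5) -> float:
    """Return the fraction of the top k slots filled by relevant items.

    Divides by k (not by the number of items returned), so a retriever
    that returns fewer than k results is penalised.
    """
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / k


Run = Tuple[Sequence[str], Set[str]]  # (ranked chunk IDs, IDs judged relevant)


def mean_over_queries(metric: Callable[..., float], runs: List[Run], k: int = 5) -> float:
    """Average a per-query metric over all queries (MRR@k when given reciprocal rank)."""
    if not runs:
        raise ValueError("runs must contain at least one query")
    return sum(metric(retrieved, relevant, k) for retrieved, relevant in runs) / len(runs)


if __name__ == "__main__":
    # Hypothetical example data: two queries with made-up chunk IDs.
    runs: List[Run] = [
        (["c1", "c7", "c3", "c9", "c2"], {"c1", "c3"}),
        (["c4", "c8", "c5", "c6", "c0"], {"c8"}),
    ]
    print("MRR@5:", mean_over_queries(reciprocal_rank_at_k, runs))
    print("Precision@5:", mean_over_queries(precision_at_k, runs))
```

Under these definitions, an MRR@5 near 1.0 means the first relevant chunk is almost always ranked first, while Precision@5 reflects how many of the five returned chunks are relevant.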