About
Accomplished Data Science Engineer with a strong record of developing and optimizing LLM-powered applications and scalable data pipelines. Raised SQL query generation accuracy by 20 percentage points (from 70% to 90%) and cut information retrieval time by over 85% through automation. Eager to apply deep expertise in AI/ML, backend development, and database management to drive significant technological advancements.
Work
Thoughtclan Technologies | Data Science Engineer
Bangalore, Karnataka, India
Summary
As a Data Science Engineer, I developed and optimized advanced LLM-powered applications and scalable data pipelines, significantly enhancing system efficiency and accuracy for RAG and natural language to SQL query generation projects.
Highlights
Engineered a scalable Python-based data ingestion pipeline, scraping over 50,000 pages from diverse sources to support a Retrieval-Augmented Generation (RAG) system.
Designed a custom chunking algorithm, extracting structured data from over 12,000 documents (PDFs, PPTs, DOCs) including text, tables, and images.
Optimized LLM context retrieval and Milvus ingestion using delta-run and re-configuration optimizations, improving system performance.
Fine-tuned hybrid, keyword, and semantic search models with re-ranking techniques, boosting search relevance scores by 25%.
Benchmarked and optimized open-source LLMs (LLaMA 3.1, Mistral 7B, Mixtral 8x7B) using parameter-efficient fine-tuning (PEFT) techniques such as LoRA and QLoRA.
Evaluated proprietary LLMs (OpenAI, Claude, Gemini 1.5 Pro/Flash) to inform cost-effective deployment strategies.
Automated information retrieval, reducing processing time from 1 hour to under 30 seconds, a reduction of more than 99%.
Thoughtclan Technologies | Data Science Engineer
Bangalore, Karnataka, India
Summary
Led initiatives to integrate advanced LLMs and database technologies for enhanced natural language to SQL query generation, significantly improving accuracy and response times.
Highlights
Evaluated Neo4j database integration into existing architecture, extracting relevant table and column descriptions from more than 50 tables.
Engineered core logic for a ReAct agent using the LangGraph framework, enabling dynamic tool selection based on user queries.
Co-led the integration of Google's Gemini LLM with few-shot prompting, raising SQL query generation accuracy from 70% to 90%.
Revamped the backend codebase, decreasing overall response time from 27 seconds to 20 seconds, achieving a 25.9% improvement.
Received an Amazon voucher award for exceptional contributions to the project.
Education
RNS Institute of Technology
Bachelor of Technology
Computer Science and Engineering
Grade: CGPA of 8.77
Vikas PU College
Higher Secondary (PCMC)
Grade: 91.6%
Skills
Languages
Python, C++, HTML/CSS, JavaScript, SQL.
Frameworks/Libraries/Technologies
FastAPI, LangChain, LangGraph, Flask, Pandas, NumPy, Matplotlib, BeautifulSoup, Selenium.
Developer Tools/Platforms
AWS, Azure, Git, GitHub, Docker, Hugging Face, Argo CD, Jenkins, Visual Studio Code, PyCharm, Windows, Linux.
Databases
PostgreSQL, MongoDB, Milvus, Weaviate, Chroma, DynamoDB, Neo4j.