Satvik G S

Data Science Engineer
Bangalore, IN.

About

Accomplished Data Science Engineer with a strong record of developing and optimizing advanced LLM-powered applications and scalable data pipelines. Raised SQL query generation accuracy by 20 percentage points (from 70% to 90%) and reduced information retrieval time by over 85% through automation. Eager to apply deep expertise in AI/ML, backend development, and database management to drive significant technological advancements.

Work

Thoughtclan Technologies

Data Science Engineer

Bangalore, Karnataka, India

Summary

As a Data Science Engineer, I developed and optimized advanced LLM-powered applications and scalable data pipelines, significantly enhancing system efficiency and accuracy for RAG and natural language to SQL query generation projects.

Highlights

Engineered a scalable Python-based data ingestion pipeline, scraping over 50,000 pages from diverse sources to support a Retrieval-Augmented Generation (RAG) system.

Designed a custom chunking algorithm, extracting structured data from over 12,000 documents (PDFs, PPTs, DOCs) including text, tables, and images.

Optimized LLM context retrieval and MilvusDB ingestion using Delta run and Re-config optimizations to enhance system performance.

Fine-tuned hybrid, keyword, and semantic search models with re-ranking techniques, boosting search relevance scores by 25%.

Benchmarked and optimized open-source LLMs (LLaMA 3.1, Mistral 7B, Mixtral 8x7B) using parameter-efficient fine-tuning (PEFT) techniques such as LoRA and QLoRA.

Evaluated proprietary LLMs (OpenAI, Claude, Gemini 1.5 Pro/Flash) to inform cost-effective deployment strategies.

Automated information retrieval, reducing processing time from 1 hour to under 30 seconds, a reduction of more than 99%.

Thoughtclan Technologies

Data Science Engineer

Bangalore, Karnataka, India

Summary

Led initiatives to integrate advanced LLMs and database technologies for enhanced natural language to SQL query generation, significantly improving accuracy and response times.

Highlights

Evaluated Neo4j database integration into existing architecture, extracting relevant table and column descriptions from more than 50 tables.

Engineered core logic for a ReAct agent using the LangGraph framework, enabling dynamic tool selection based on user queries.

Jointly led the integration of Google's Gemini LLM and few-shot learning, elevating SQL query generation accuracy from 70% to 90%.

Revamped the backend codebase, decreasing overall response time from 27 seconds to 20 seconds, achieving a 25.9% improvement.

Received an Amazon voucher award for exceptional contributions to the project.

Education

RNS Institute of Technology
Bangalore, Karnataka, India

Bachelor of Technology

Computer Science and Engineering

Grade: CGPA of 8.77

Vikas PU College
Mangalore, Karnataka, India

Higher Secondary (PCMC)

Grade: 91.6%

Skills

Languages

Python, C++, HTML/CSS, JavaScript, SQL.

Frameworks/Libraries/Technologies

FastAPI, LangChain, LangGraph, Flask, Pandas, NumPy, Matplotlib, BeautifulSoup, Selenium.

Developer Tools/Platforms

AWS, Azure, Git, GitHub, Docker, Hugging Face, Argo CD, Jenkins, Visual Studio Code, PyCharm, Windows, Linux.

Databases

PostgreSQL, MongoDB, Milvus, Weaviate, Chroma, DynamoDB, Neo4j.