Shane Zhang (张欣耕)
- 732-491-6378
- [email protected]
- www.shanezhang.com
- Piscataway, NJ

AI/ML Engineer with over 5 years of experience specializing in fine-tuning Large Language Models (LLMs) and building advanced Agentic Retrieval-Augmented Generation (RAG) systems. Passionate about AI, I aim to master emerging technologies to solve real-world problems and enhance human creativity. My dream is to witness the development of the first AGI. I have a proven ability to bring AI models from the lab to tangible products that reach millions of users.
Work Experience
AI/ML Engineer
Rutherford, NJ
- Built a fast structured data processing pipeline with Python, reducing processing time from 10 seconds to under 1 second per document, enhancing document handling efficiency by 90%.
- Designed and accelerated a robust RESTful API with Python FastAPI for real-time data ingestion and NLP processing into a PostgreSQL database, supporting over 1 million API calls per day.
- Deployed solutions at scale with OpenShift and Apache Spark, streamlining CI/CD procedures with zero downtime over 120 days.
- Leveraged advanced Agentic RAG systems and Knowledge Graphs with RDF and LPG formats using Python and LangChain to improve global retrieval accuracy, increasing the F1 score of AI-driven compliance analysis from 0.32 to 0.71.
- Implemented evaluation metrics and visualizations with LangChain and RAGAS to ensure LLM quality, reducing hallucination rate by 30% and error investigation time by 60%.
- Used OpenAI GPT, Claude, and Google Gemini models and fine-tuned open-source LLMs to develop customized and scalable Agentic RAG systems handling over 200,000 API calls per day.
- Developed human-centered evaluation frameworks (RLHF) to assess LLM performance in real-world scenarios, continuously ensuring alignment with user intent for over 2,000 active users.
- Deployed scalable AI-driven RAG systems on AWS, using S3, EC2, Glue, Lambda, SageMaker, and Bedrock for efficient data processing and LLM integration pipelines.
- Developed and deployed FastAPI interfaces and gRPC protocols for efficient API integration across cloud providers like Azure and AWS.
AI/ML Engineer
New Brunswick, NJ
- Designed, built, and deployed an Agentic RAG system using Python, NumPy, Pandas, JavaScript, SQL, and Chroma vector database, automating the parsing and summarization of over 5,000 research documents.
- Engineered an advanced Agentic RAG system with OpenAI and the React framework, reducing onboarding time by 50% and improving research efficiency by 40%.
- Collaborated with a cross-functional team of 15 academic researchers, integrating over 200 feedback points into the NLP system to align the platform with current and future research objectives.
- Conducted monthly workshops on ML usage for 10+ staff members, achieving 80% adoption of ML-enhanced workflows across the organization.
- Utilized AWS SageMaker for model fine-tuning and AWS Bedrock for serving the models in production environments.
Machine Learning Engineer
San Francisco, CA
- Integrated NLP-driven features into the Node.js backend with Python PyTorch (C++ CUDA), enabling real-time text generation and summarization functionalities, cutting response times by 70%.
- Conducted data pre-processing and data exploration using Python PySpark, NumPy, and Pandas, ensuring high-quality data integration and leading to a 20% improvement in model training results using PyTorch distributed.
- Optimized deep learning models through quantization and pruning techniques with TensorRT, achieving a 40% decrease in inference latency.
- Produced a semantic graph database using Neo4j and Cypher queries to store and query complex relationships, improving data retrieval efficiency by 60%.
Projects
Enterprise-Scale RAG System
Built a comprehensive RAG system for financial document analysis using LangChain, pgvector, and AWS Bedrock
Healthcare Research Assistant
Developed an AI assistant for medical researchers using OpenAI APIs, vector databases, and React
NLP Text Analysis Platform
Created an NLP-powered platform for text analysis with Node.js backend and PyTorch models
Information
Publications
- Berns, M. P., Nunez, G. M., Zhang, X., et al. (Sep 2024). Auditory Decision-making Deficits After Permanent Noise-induced Hearing Loss.
Education
- M.S. in Computer Science, Machine Learning Specialty, Rutgers, The State University of New Jersey, Dec 2020 - Dec 2022
- B.S. in Computer Science, Rutgers, The State University of New Jersey, Sep 2017 - Sep 2020
Skills
Generative AI
Machine Learning & NLP
Data Engineering
Programming
DevOps & MLOps
Professional & Soft Skills
Languages
- English (Native, Fluent)
- Mandarin (Native, Fluent)