Shane Zhang (张欣耕)

AI/ML Engineer, LLM Specialist, RAG Expert


AI/ML Engineer with over 5 years of experience specializing in fine-tuning Large Language Models (LLMs) and building advanced Agentic Retrieval-Augmented Generation (RAG) systems. Passionate about AI, I aim to master emerging technologies to solve real-world problems and enhance human creativity. My dream is to witness the development of the first AGI. I have a proven ability to bring AI models from the lab to tangible products that reach millions of users.


Work Experience

AI/ML Engineer

Citigroup Inc. US | Aug 2024 - Present

Rutherford, NJ

  • Built a fast structured data processing pipeline with Python, reducing processing time from 10 seconds to under 1 second per document, enhancing document handling efficiency by 90%.
  • Designed and optimized a robust RESTful API with Python FastAPI for real-time data ingestion and NLP processing into a PostgreSQL database, supporting over 1 million API calls per day.
  • Deployed solutions at scale with OpenShift and Apache Spark, streamlining CI/CD procedures with zero downtime over 120 days.
  • Leveraged advanced Agentic RAG systems and Knowledge Graphs with RDF and LPG formats using Python and LangChain to improve global retrieval accuracy, increasing the F1 score of AI-driven compliance analysis from 0.32 to 0.71.
  • Implemented evaluation metrics and visualizations with LangChain and RAGAS to ensure LLM quality, reducing hallucination rate by 30% and error investigation time by 60%.
  • Used OpenAI GPT, Claude, and Google Gemini models and fine-tuned open-source LLMs to develop customized and scalable Agentic RAG systems handling over 200,000 API calls per day.
  • Developed human-centered evaluation frameworks (RLHF) to assess LLM performance in real-world scenarios, continuously ensuring alignment with user intent for over 2,000 active users.
  • Deployed scalable AI-driven RAG systems on AWS, using S3, EC2, Glue, Lambda, SageMaker, and Bedrock for efficient data processing and LLM integration pipelines.
  • Developed and deployed FastAPI interfaces and gRPC services for efficient API integration across cloud providers such as Azure and AWS.

AI/ML Engineer

Robert Wood Johnson University Hospital | Jan 2023 - Aug 2024

New Brunswick, NJ

  • Designed, built, and deployed an Agentic RAG system using Python, NumPy, Pandas, JavaScript, SQL, and Chroma vector database, automating the parsing and summarization of over 5,000 research documents.
  • Engineered an advanced Agentic RAG system with OpenAI APIs and the React framework, reducing onboarding time by 50% and improving research efficiency by 40%.
  • Collaborated with a cross-functional team of 15 academic researchers, integrating over 200 feedback points into the NLP system to align the platform with current and future research objectives.
  • Conducted monthly workshops on ML usage for 10+ staff members, achieving 80% adoption of ML-enhanced workflows across the organization.
  • Utilized AWS SageMaker for model fine-tuning and AWS Bedrock for serving the models in production environments.

Machine Learning Engineer

Fiskkit Inc. | Jan 2020 - Jul 2021

San Francisco, CA

  • Integrated NLP-driven features into the Node.js backend with Python PyTorch (C++ CUDA), enabling real-time text generation and summarization and cutting response times by 70%.
  • Conducted data pre-processing and data exploration using PySpark, NumPy, and Pandas, ensuring high-quality data integration and leading to a 20% improvement in model training results using PyTorch distributed.
  • Optimized deep learning models through quantization and pruning techniques with TensorRT, achieving a 40% decrease in inference latency.
  • Built a semantic graph database using Neo4j and Cypher queries to store and query complex relationships, improving data retrieval efficiency by 60%.

Projects

Enterprise-Scale RAG System

Citigroup

Built a comprehensive RAG system for financial document analysis using LangChain, pgvector, and AWS Bedrock.

Healthcare Research Assistant

RWJUH

Developed an AI assistant for medical researchers using OpenAI APIs, vector databases, and React.

NLP Text Analysis Platform

Fiskkit

Created an NLP-powered platform for text analysis with a Node.js backend and PyTorch models.

Information

Publications

  • Berns, M. P., Nunez, G. M., Zhang, X., et al. (Sep 2024). Auditory Decision-making Deficits After Permanent Noise-induced Hearing Loss.

Education

  • M.S. in Computer Science, Machine Learning Specialty
    Rutgers, The State University of New Jersey
    Dec 2020 - Dec 2022
  • B.S. in Computer Science
    Rutgers, The State University of New Jersey
    Sep 2017 - Sep 2020

Skills

Generative AI

LLM Fine-Tuning
Parameter-Efficient Fine-Tuning (PEFT)
Retrieval-Augmented Generation (RAG)
Agentic RAG Systems
Knowledge Graphs (RDF)
Knowledge Graphs (LPG)
GraphRAG
OpenAI GPT
Google Gemini
Anthropic Claude
LangChain
LangGraph
Prompt Engineering
RLHF

Machine Learning & NLP

PyTorch
TensorFlow
Keras
Transformers
HuggingFace
Natural Language Processing (NLP)
spaCy
NLTK
Feature Engineering
Statistical Modeling
Quantization
Pruning Techniques
Model Evaluation
Performance Analysis
Neural Networks
GANs
Stable Diffusion

Data Engineering

PostgreSQL with pgvector
FAISS
Chroma
Neo4j
Cypher
MySQL
PostgreSQL
Oracle
Elasticsearch
MongoDB
PySpark
Pandas
NumPy
Apache Spark
Hadoop
ETL Workflows
Data Processing Pipelines

Programming

Python
FastAPI
Streamlit
PyTest
JavaScript
Node.js
React
Scala
Java
C++
REST API Development
gRPC
JWT
Authentication Systems

DevOps & MLOps

Docker
Kubernetes
OpenShift
CI/CD Pipelines
Development Workflows
AWS SageMaker
AWS Bedrock
AWS S3
AWS EC2
AWS Lambda
AWS Glue
Azure Cloud Services
MLflow
RAGAS

Professional & Soft Skills

Technical Communication
Documentation
Team Collaboration
Cross-functional Leadership
Stakeholder Management
Client Presentations
Mentoring
Training
Agile Methodologies
Project Management

Languages

  • English (Native, Fluent)
  • Mandarin (Native, Fluent)

Interests

Generative AI
SCUBA Diving
Reading
Philosophy
Psychology
Sociology

© 2025 Shane Zhang. All Rights Reserved.