About Me

AI/ML Engineer with over 5 years of experience specializing in fine-tuning Large Language Models (LLMs) and building advanced Agentic Retrieval-Augmented Generation (RAG) systems. Passionate about AI, I aim to master emerging technologies to solve real-world problems and enhance human creativity. My dream is to witness the development of the first AGI. I have a proven ability to bring AI models from the lab to tangible products that reach millions of users.

Work Experience

AI/ML Engineer

Citigroup Inc. US | Aug 2024 - Present

Rutherford, NJ

Built a fast structured-data processing pipeline in Python, cutting processing time from 10 seconds to under 1 second per document, a reduction of more than 90%.
Designed and optimized a RESTful API with Python and FastAPI for real-time data ingestion and NLP processing into a PostgreSQL database, supporting over 1 million API calls per day.
Deployed solutions at scale with OpenShift and Apache Spark, streamlining CI/CD procedures with zero downtime over 120 days.
Leveraged advanced Agentic RAG systems and Knowledge Graphs with RDF and LPG formats using Python and LangChain to improve global retrieval accuracy, increasing the F1 score of AI-driven compliance analysis from 0.32 to 0.71.
Implemented evaluation metrics and visualizations with LangChain and RAGAS to ensure LLM quality, reducing hallucination rate by 30% and error investigation time by 60%.
Used OpenAI GPT, Claude, and Google Gemini models and fine-tuned open-source LLMs to develop customized, scalable Agentic RAG systems handling over 200,000 API calls per day.
Developed human-centered evaluation frameworks (RLHF) to assess LLM performance in real-world scenarios, continuously ensuring alignment with user intent for over 2,000 active users.
Deployed scalable AI-driven RAG systems on AWS, using S3, EC2, Glue, Lambda, SageMaker, and Bedrock for efficient data processing and LLM integration pipelines.
Developed and deployed FastAPI interfaces and gRPC services for efficient API integration across cloud providers such as Azure and AWS.

AI/ML Engineer

Robert Wood Johnson University Hospital | Jan 2023 - Aug 2024

New Brunswick, NJ

Designed, built, and deployed an Agentic RAG system using Python, NumPy, Pandas, JavaScript, SQL, and Chroma vector database, automating the parsing and summarization of over 5,000 research documents.
Engineered an advanced Agentic RAG system with OpenAI and React framework, reducing onboarding time by 50% and improving research efficiency by 40%.
Collaborated with a cross-functional team of 15 academic researchers, integrating over 200 feedback points into the NLP system to align the platform with current and future research objectives.
Conducted monthly workshops on ML usage for 10+ staff members, achieving 80% adoption of ML-enhanced workflows across the organization.
Utilized AWS SageMaker for model fine-tuning and AWS Bedrock for serving the models in production environments.

Machine Learning Engineer

Fiskkit Inc. | Jan 2020 - Jul 2021

San Francisco, CA

Integrated NLP-driven features into the Node.js backend with Python and PyTorch (C++ CUDA), enabling real-time text generation and summarization and cutting response times by 70%.
Conducted data pre-processing and exploration using PySpark, NumPy, and Pandas, ensuring high-quality data integration and leading to a 20% improvement in model training results with PyTorch distributed training.
Optimized deep learning models through quantization and pruning techniques with TensorRT, achieving a 40% decrease in inference latency.
Produced a semantic graph database using Neo4j and Cypher queries to store and query complex relationships, improving data retrieval efficiency by 60%.

Projects

Enterprise-Scale RAG System

Citigroup

Built a comprehensive RAG system for financial document analysis using LangChain, pgvector, and AWS Bedrock

Healthcare Research Assistant

RWJUH

Developed an AI assistant for medical researchers using OpenAI APIs, vector databases, and React

NLP Text Analysis Platform

Fiskkit

Created an NLP-powered platform for text analysis with Node.js backend and PyTorch models

Information

Publications

Berns, M. P., Nunez, G. M., Zhang, X., et al. (Sep 2024). Auditory Decision-making Deficits After Permanent Noise-induced Hearing Loss.

Education

M.S. in Computer Science, Machine Learning Specialty

Rutgers, The State University of New Jersey

Dec 2020 - Dec 2022

B.S. in Computer Science

Rutgers, The State University of New Jersey

Sep 2017 - Sep 2020

Skills

Generative AI

LLM Fine-Tuning

Parameter-Efficient Fine-Tuning (PEFT)

Retrieval-Augmented Generation (RAG)

Agentic RAG Systems

Knowledge Graphs (RDF)

Knowledge Graphs (LPG formats)

GraphRAG

OpenAI GPT

Google Gemini

Anthropic Claude

LangChain

LangGraph

Prompt Engineering

RLHF

Machine Learning & NLP

PyTorch

TensorFlow

Keras

Transformers

HuggingFace

Natural Language Processing (NLP)

spaCy

NLTK

Feature Engineering

Statistical Modeling

Quantization

Pruning Techniques

Model Evaluation

Performance Analysis

Neural Networks

GANs

Stable Diffusion

Data Engineering

Postgres with pgvector

FAISS

Chroma

Neo4j

Cypher

MySQL

PostgreSQL

Oracle

Elasticsearch

MongoDB

PySpark

Pandas

NumPy

Apache Spark

Hadoop

ETL Workflows

Data Processing Pipelines

Programming

Python

FastAPI

Streamlit

PyTest

JavaScript

Node.js

React

Scala

Java

C++

REST API Development

gRPC

JWT

Authentication Systems

DevOps & MLOps

Docker

Kubernetes

OpenShift

CI/CD Pipelines

Development Workflows

AWS SageMaker

AWS Bedrock

AWS S3

AWS EC2

AWS Lambda

AWS Glue

Azure Cloud Services

MLflow

RAGAS

Professional & Soft Skills

Technical Communication

Documentation

Team Collaboration

Cross-functional Leadership

Stakeholder Management

Client Presentations

Mentoring

Training

Agile Methodologies

Project Management

Languages

Interests

Generative AI

SCUBA Diving

Reading

Philosophy

Psychology

Sociology

Shane Zhang (张欣耕)
