Shane Zhang (张欣耕)

AI/ML Engineer, LLM Specialist, RAG Expert

AI/ML Engineer with over 5 years of experience specializing in fine-tuning Large Language Models (LLMs) and building advanced Agentic Retrieval-Augmented Generation (RAG) systems. Passionate about AI, I aim to master emerging technologies to solve real-world problems and enhance human creativity. My dream is to witness the development of the first AGI. I have a proven ability to bring AI models from the lab to tangible products that reach millions of users.

Work Experience

AI/ML Engineer

Citigroup Inc. US | Aug 2024 - Present

Rutherford, NJ

  • Built a fast structured-data processing pipeline in Python, cutting per-document processing time from 10 seconds to under 1 second (a reduction of more than 90%).
  • Designed and optimized a robust RESTful API with Python and FastAPI for real-time data ingestion and NLP processing into a PostgreSQL database, supporting over 1 million API calls per day.
  • Deployed solutions at scale with OpenShift and Apache Spark, streamlining CI/CD procedures with zero downtime over 120 days.
  • Leveraged advanced Agentic RAG systems and Knowledge Graphs with RDF and LPG formats using Python and LangChain to improve global retrieval accuracy, increasing the F1 score of AI-driven compliance analysis from 0.32 to 0.71.
  • Implemented evaluation metrics and visualizations with LangChain and RAGAS to ensure LLM quality, reducing hallucination rate by 30% and error investigation time by 60%.
  • Used OpenAI GPT, Claude, and Google Gemini models and fine-tuned open-source LLMs to develop customized, scalable Agentic RAG systems handling over 200,000 API calls per day.
  • Developed human-centered evaluation frameworks (RLHF) to assess LLM performance in real-world scenarios, continuously ensuring alignment with user intents for over 2,000 active users.
  • Deployed scalable AI-driven RAG systems on AWS, using S3, EC2, Glue, Lambda, SageMaker, and Bedrock for efficient data processing and LLM integration pipelines.
  • Developed and deployed FastAPI interfaces and gRPC protocols for efficient API integration across cloud providers like Azure and AWS.

AI/ML Engineer

Robert Wood Johnson University Hospital | Jan 2023 - Aug 2024

New Brunswick, NJ

  • Designed, built, and deployed an Agentic RAG system using Python, NumPy, Pandas, JavaScript, SQL, and Chroma vector database, automating the parsing and summarization of over 5,000 research documents.
  • Engineered an advanced Agentic RAG system with OpenAI and React framework, reducing onboarding time by 50% and improving research efficiency by 40%.
  • Collaborated with a cross-functional team of 15 academic researchers, integrating over 200 feedback points into the NLP system to align the platform with current and future research objectives.
  • Conducted monthly workshops on ML usage for 10+ staff members, achieving 80% adoption of ML-enhanced workflows across the organization.
  • Utilized AWS SageMaker for model fine-tuning and AWS Bedrock for serving the models in production environments.

Machine Learning Engineer

Fiskkit Inc. | Jan 2020 - Jul 2021

San Francisco, CA

  • Integrated NLP-driven features into the Node.js backend with Python PyTorch (C++ CUDA), enabling real-time text generation and summarization functionalities, cutting response times by 70%.
  • Conducted data pre-processing and data exploration using Python PySpark, NumPy, and Pandas, ensuring high-quality data integration and leading to a 20% improvement in model training results using PyTorch distributed.
  • Optimized deep learning models through quantization and pruning techniques with TensorRT, achieving a 40% decrease in inference latency.
  • Produced a semantic graph database using Neo4j and Cypher queries to store and query complex relationships, improving data retrieval efficiency by 60%.

Projects

Built a comprehensive RAG system for financial document analysis using LangChain, pgvector, and AWS Bedrock

Python FastAPI LangChain Postgres pgvector AWS Bedrock

Developed an AI assistant for medical researchers using OpenAI APIs, vector databases, and React

Python React OpenAI Chroma DB AWS SageMaker

Created an NLP-powered platform for text analysis with Node.js backend and PyTorch models

Node.js Python PyTorch PySpark Neo4j

Education

  • M.S. in Computer Science, Machine Learning Specialty
    Rutgers, The State University of New Jersey
    Dec 2020 - Dec 2022
    GPA: 3.71
  • B.S. in Computer Science
    Rutgers, The State University of New Jersey
    Sep 2017 - Sep 2020

Certifications

  • AWS Certified Machine Learning - Specialty (MLS-C01)
    Amazon Web Services
    Jun 2024
    This certification validates expertise in building, training, tuning, and deploying machine learning, deep learning, and generative AI models, including LLMs, on AWS.

Teaching

  • Teaching Assistant
    Rutgers University
    Dec 2020 - Dec 2022
    Tutored over 400 students in Computer Science and Machine Learning topics

Publications

  • Berns, M. P., Nunez, G. M., Zhang, X., et al. (Sep 2024). Auditory Decision-making Deficits After Permanent Noise-induced Hearing Loss.
© 2025 Shane Zhang. All Rights Reserved.