Inference Acceleration for the 70B LLaMA-2 Large Language Model

Qingyu Zhang
Master's Student in Computer Science and Technology

I work on AI sales and customer-service agents, with experience across LLM pretraining, post-training, evaluation, and model efficiency.