Qingyu Zhang

Master's Student in Computer Science and Technology

Institute of Software, Chinese Academy of Sciences

Biography

I’m Qingyu Zhang, a first-year master’s student at the Chinese Information Processing Laboratory, Institute of Software, Chinese Academy of Sciences. My research focuses on large language models, particularly their long-context and multi-turn dialogue capabilities.

Interests
  • LLM Long Context
  • LLM Compression & Efficiency
  • LLM Post-training
  • LLM Reinforcement Learning
Education
  • M.S. in Computer Science and Technology, 2024 - Present

    Institute of Software, Chinese Academy of Sciences

  • B.S. in Computer Science and Technology, 2020 - 2024

    College of Computer and Data Science, Fuzhou University

News

  • May, 2025 Our paper “ShortGPT” was accepted to ACL Findings 2025.
  • Sep, 2024 Our paper “Base of RoPE Bounds Context Length” was accepted to NeurIPS 2024.
  • Jun, 2024 Honored as an Outstanding Graduate at Fuzhou University.
  • May, 2023 Won the First Prize in the 10th ASC Student Supercomputer Challenge.
  • Nov, 2022 Won the First Prize in the 13th National College Student Mathematics Competition.

Experience

Algorithm Intern
December 2024 – Present, Beijing, China
  • Led the R&D of an RL-based dialogue optimization system for large models.
  • Deployed the system in a live business environment, increasing the core business conversion rate by ~20%.
  • Research submitted to AAAI 2026.
Foundation Model Intern
January 2024 – October 2024, Beijing, China
  • Investigated Transformer layer redundancy and proposed a layer-pruning method (ShortGPT, ACL Findings 2025).
  • Studied the lower bound of the RoPE base (Base of RoPE Bounds Context Length, NeurIPS 2024).
  • Proposed a variant of the “Needle in a Haystack” evaluation method (Patent Granted).
Research Intern
October 2023 – September 2024, Beijing, China
  • Adapted and optimized SFT/DPO algorithms for the Megatron framework (ACL Demo 2025).
  • Implemented large-scale distributed training on Ascend 910B using the ModelLink framework.

Recent Publications

Here are some of my recent publications. You can find the full list in my CV.
(2025). ShortGPT: Layers in Large Language Models are More Redundant Than You Expect. In ACL Findings 2025.

(2024). Base of RoPE Bounds Context Length. In NeurIPS 2024.

Contact