Qingyu Zhang
Model Compression
ShortGPT: Layers in Large Language Models are More Redundant Than You Expect
We investigate the redundancy within Transformer layers and propose an effective layer-based pruning method.
Xin Men, Mingyu Xu, Qingyu Zhang, Qianhao Yuan, Bingning Wang, Hongyu Lin, Yaojie Lu, Xianpei Han, Weipeng Chen
PDF · Cite