张清宇
Distributed Training
Training the 10-Billion Parameter Yuan-1.0 LLM
As a key member of the ASC23 team, I trained the 10-billion-parameter Yuan-1.0 large language model using the DeepSpeed-Megatron framework, combining tensor, pipeline, and data parallelism. Our work won the First Prize.
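To make the parallelism layout concrete, here is a minimal sketch of how the three degrees combine in a Megatron/DeepSpeed-style 3D-parallel job. The specific sizes (TP=8, PP=4, 64 GPUs) are illustrative assumptions, not the actual ASC23 configuration.

```python
# Hypothetical sketch of 3D parallelism accounting: the tensor-parallel,
# pipeline-parallel, and data-parallel degrees must multiply to the total
# GPU count. All sizes below are assumed values for illustration only.
tensor_parallel = 8       # assumed: shards each layer's weights within a node
pipeline_parallel = 4     # assumed: splits the layer stack into pipeline stages
world_size = 64           # assumed: total number of GPUs in the job

# The data-parallel degree is whatever remains after the tensor and
# pipeline groups are carved out of the world size.
assert world_size % (tensor_parallel * pipeline_parallel) == 0
data_parallel = world_size // (tensor_parallel * pipeline_parallel)

print(f"TP={tensor_parallel} x PP={pipeline_parallel} x DP={data_parallel} "
      f"= {tensor_parallel * pipeline_parallel * data_parallel} GPUs")
```

In Megatron-style launchers the first two degrees are set explicitly (e.g. via `--tensor-model-parallel-size` and `--pipeline-model-parallel-size`), and the data-parallel degree falls out of the world size as above.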