autotp training (fix dco) #7004

Merged: 1 commit into deepspeedai:master on Feb 5, 2025

Conversation

@inkcherry (Contributor) commented on Feb 5, 2025

Same as this PR: affeb88.
I noticed the CI updated the DCO check recently. Using the suggested rebase method for sign-off would reintroduce many conflicts, so I opted for a squash merge with a sign-off instead. Thanks :)

Signed-off-by: inkcherry <mingzhi.liu@intel.com>
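
For anyone hitting the same DCO failure, a minimal sketch of the squash-and-sign-off workaround described above, assuming a feature branch named `my-feature` based on `master` (both names are placeholders):

```bash
# Collapse the whole branch into one commit instead of rebasing each
# commit individually (which would replay every conflict):
git checkout my-feature
git reset --soft "$(git merge-base master HEAD)"

# Recommit everything as a single commit; -s/--signoff appends the
# "Signed-off-by: Name <email>" trailer that the DCO check looks for.
git commit -s -m "autotp training"

# Rewriting history requires a force push to the PR branch.
git push --force-with-lease origin my-feature
```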
@tjruwase added this pull request to the merge queue on Feb 5, 2025
Merged via the queue into deepspeedai:master with commit f04649d Feb 5, 2025
12 checks passed
@delock (Collaborator) commented on Feb 6, 2025

Kudos @inkcherry for contributing AutoTP training! It's a nice feature that makes tensor-parallel training/finetuning more accessible to HF model users.

I think a tutorial page would help users discover and learn how to use this feature in DeepSpeed. Would it be possible to write a tutorial introducing the steps to use this feature and add it under https://github.com/deepspeedai/DeepSpeed/tree/master/docs/_tutorials? I remember you have an example of training Alpaca with DeepSpeed AutoTP.
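
Until such a tutorial lands, here is a hypothetical minimal sketch of what AutoTP training with a HF model might look like. The `tensor_parallel`/`autotp_size` config keys are an assumption based on this PR, and the model name is a placeholder; treat the eventual tutorial as authoritative.

```python
import deepspeed
from transformers import AutoModelForCausalLM

# "tensor_parallel"/"autotp_size" is an assumed config key based on the
# AutoTP training PR; check the DeepSpeed docs for the exact schema.
ds_config = {
    "train_batch_size": 8,
    "bf16": {"enabled": True},
    "optimizer": {"type": "AdamW", "params": {"lr": 2e-5}},
    "tensor_parallel": {"autotp_size": 4},  # shard each layer across 4 GPUs
    "zero_optimization": {"stage": 0},
}

model = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder HF model
model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)

# Training then proceeds with the usual DeepSpeed engine loop:
#   loss = model_engine(**batch).loss
#   model_engine.backward(loss)
#   model_engine.step()
```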
