Pull upstream into saforem2/fix-formatting #90

Open
saforem2 wants to merge 8 commits into argonne-lcf:saforem2/fix-formatting from deepspeedai:main

Conversation

@saforem2
Member

Pull upstream commits into saforem2/fix-formatting branch:

argonne-lcf/Megatron-DeepSpeed @ saforem2/fix-formatting, from deepspeedai/Megatron-DeepSpeed @ main

Copilot Summary

This pull request primarily updates repository links across multiple files and introduces enhancements to the fine-tuning configuration for Hugging Face LLAMA models. The changes ensure consistency in repository references and improve usability for fine-tuning workflows.

Repository Link Updates:

Fine-Tuning Enhancements:

YJHMITWEB and others added 8 commits December 4, 2024 17:34
* pass batch_dim_idx to deepspeed sequence parallel distributed attention for supporting batch size larger than 1

* add FPDT support; add Ulysses rotary position embedding support

* add FPDT support; add Ulysses rotary position embedding support

* add FPDT support; add Ulysses rotary position embedding support

* add FPDT support; add Ulysses rotary position embedding support

* remove unnecessary files

* set the warmup length to be FPDT chunk size if enabled

---------

Co-authored-by: Jinghan Yao <yjhmitweb@ascend-rw02.ten.osc.edu>
Co-authored-by: Jinghan Yao <yjhmitweb@ascend-rw01.ten.osc.edu>
* [tools]GQA convert support

* fix readme
Previously, `deepspeed_to_megatron.py` would raise an import error
due to the relative import.

This commit fixes this issue by changing from the relative import
to the absolute import like in `deepspeed_to_transformers.py`.
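A minimal sketch of the failure mode this commit message describes, assuming the script is launched directly (for example `python tools/convert.py`) rather than as part of a package. The file names `convert.py` and `helper.py` and the constant `VALUE` are hypothetical stand-ins, not the actual contents of `deepspeed_to_megatron.py`.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_script(import_line: str) -> subprocess.CompletedProcess:
    """Create a tiny tools/ directory and run convert.py directly as a script."""
    with tempfile.TemporaryDirectory() as tmp:
        tools = Path(tmp) / "tools"
        tools.mkdir()
        # Hypothetical helper module standing in for whatever the real script imports.
        (tools / "helper.py").write_text("VALUE = 42\n")
        # The script under test: only its import line differs between runs.
        (tools / "convert.py").write_text(f"{import_line}\nprint(VALUE)\n")
        return subprocess.run(
            [sys.executable, str(tools / "convert.py")],
            capture_output=True,
            text=True,
        )


# Relative import: fails because a directly executed script has no parent package.
relative = run_script("from .helper import VALUE")

# Absolute import: works because the script's own directory is on sys.path.
absolute = run_script("from helper import VALUE")

print("relative import exit code:", relative.returncode)
print("absolute import exit code:", absolute.returncode)
```

When a file is executed directly, Python runs it as `__main__` with no parent package, so `from .helper import ...` cannot be resolved; the absolute form relies on the script's directory being the first entry of `sys.path`.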
Signed-off-by: Logan Adams <loadams@microsoft.com>
…run successfully with DeepSpeed (#468)

Signed-off-by: yisheng <yi.sheng@intel.com>
Signed-off-by: yisheng <yi.sheng@intel.com>
Signed-off-by: Schwidola0607 <khoadangpham82944@gmail.com>
…nabled (#479)

* pass batch_dim_idx to deepspeed sequence parallel distributed attention for supporting batch size larger than 1

Signed-off-by: Jinghan Yao <yjhmitweb@cardinal-rw02.ten.osc.edu>

* add fused_rms_norm support on XPU device (#431)

Signed-off-by: Jinghan Yao <yjhmitweb@cardinal-rw02.ten.osc.edu>

* [LLaMa] Adding support converting checkpoint from mds to hf (#432)

* add support converting checkpoint from hf to mds

* Fix PP issue

* update

Signed-off-by: Jinghan Yao <yjhmitweb@cardinal-rw02.ten.osc.edu>

* add device check when import ipex (#436)

Signed-off-by: Jinghan Yao <yjhmitweb@cardinal-rw02.ten.osc.edu>

* fix TFLOPs calculation (#371)

* fix TFLOPs calculation

When GQA is used, we observe the correct TFLOPs after this fix.
When GQA is not used, the large difference in TFLOPs is resolved with
selective recompute.
Some other minor differences will also be observed, because the logits MACs are now included.

* add copyrights

Signed-off-by: Jinghan Yao <yjhmitweb@cardinal-rw02.ten.osc.edu>

* fix nan issue when running megatron-deepspeed (#434)

Signed-off-by: Jinghan Yao <yjhmitweb@cardinal-rw02.ten.osc.edu>

* enable empty cache on XPU device (#438)

Signed-off-by: Jinghan Yao <yjhmitweb@cardinal-rw02.ten.osc.edu>

* [wandb] disable wandb more gracefully (#422)

Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Signed-off-by: Jinghan Yao <yjhmitweb@cardinal-rw02.ten.osc.edu>

* [Bug] Fix crash when logging optimizer state to tb (#417)

Signed-off-by: Jinghan Yao <yjhmitweb@cardinal-rw02.ten.osc.edu>

* add FPDT support; add Ulysses rotary position embedding support

Signed-off-by: Jinghan Yao <yjhmitweb@cardinal-rw02.ten.osc.edu>

* add FPDT support; add Ulysses rotary position embedding support

Signed-off-by: Jinghan Yao <yjhmitweb@cardinal-rw02.ten.osc.edu>

* add FPDT support; add Ulysses rotary position embedding support

Signed-off-by: Jinghan Yao <yjhmitweb@cardinal-rw02.ten.osc.edu>

* add FPDT support; add Ulysses rotary position embedding support

Signed-off-by: Jinghan Yao <yjhmitweb@cardinal-rw02.ten.osc.edu>

* remove unnecessary files

Signed-off-by: Jinghan Yao <yjhmitweb@cardinal-rw02.ten.osc.edu>

* set the warmup length to be FPDT chunk size if enabled

Signed-off-by: Jinghan Yao <yjhmitweb@cardinal-rw02.ten.osc.edu>

* Enable Sequence Parallelism (#429)

Signed-off-by: Jinghan Yao <yjhmitweb@cardinal-rw02.ten.osc.edu>

* grad_wei can't be NoneType when running with DeepSpeed, for zero3 will divide the gradient (#428)

Signed-off-by: Jinghan Yao <yjhmitweb@cardinal-rw02.ten.osc.edu>

* fix init issue for rms_norm in sequence_parallel (#448)

Signed-off-by: Jinghan Yao <yjhmitweb@cardinal-rw02.ten.osc.edu>

* enable profiler for specific ranks (#451)

Signed-off-by: Jinghan Yao <yjhmitweb@cardinal-rw02.ten.osc.edu>

* fix init issue for silently ignoring the deepspeed config (#452)

Signed-off-by: Jinghan Yao <yjhmitweb@cardinal-rw02.ten.osc.edu>

* fix moe tflops (#445)

Signed-off-by: Jinghan Yao <yjhmitweb@cardinal-rw02.ten.osc.edu>

* [tool]GQA convert support (#454)

* [tools]GQA convert support

* fix readme

Signed-off-by: Jinghan Yao <yjhmitweb@cardinal-rw02.ten.osc.edu>

* Fix import error in `deepspeed_to_megatron.py` (#455)

Previously, `deepspeed_to_megatron.py` would raise an import error
due to the relative import.

This commit fixes this issue by changing from the relative import
to the absolute import like in `deepspeed_to_transformers.py`.

Signed-off-by: Jinghan Yao <yjhmitweb@cardinal-rw02.ten.osc.edu>

* Update references to new GitHub org (deepspeedai) (#462)

Signed-off-by: Logan Adams <loadams@microsoft.com>
Signed-off-by: Jinghan Yao <yjhmitweb@cardinal-rw02.ten.osc.edu>

* add sequence_parallel in layernorm init so that 3D parallelism can run successfully with DeepSpeed (#468)

Signed-off-by: yisheng <yi.sheng@intel.com>
Signed-off-by: Jinghan Yao <yjhmitweb@cardinal-rw02.ten.osc.edu>

* fix bug when FPDT is disabled but with original Ulysses

Signed-off-by: Jinghan Yao <yjhmitweb@cardinal-rw02.ten.osc.edu>
Signed-off-by: jinghan yao yjhmitweb@gmail.com
Signed-off-by: Jinghan Yao <yjhmitweb@cardinal-rw02.ten.osc.edu>

---------

Signed-off-by: Jinghan Yao <yjhmitweb@cardinal-rw02.ten.osc.edu>
Signed-off-by: Logan Adams <loadams@microsoft.com>
Signed-off-by: yisheng <yi.sheng@intel.com>
Signed-off-by: jinghan yao yjhmitweb@gmail.com
Co-authored-by: Jinghan Yao <yjhmitweb@ascend-rw02.ten.osc.edu>
Co-authored-by: YiSheng5 <syhm@mail.ustc.edu.cn>
Co-authored-by: billishyahao <yahao.he@gmail.com>
Co-authored-by: Polisetty V R K Jyothendra Varma <jvarma@habana.ai>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Jinghan Yao <yjhmitweb@ascend-rw01.ten.osc.edu>
Co-authored-by: ranzhejiang <zhejiang.ran@intel.com>
Co-authored-by: Xinyu Lian <lian7@illinois.edu>
Co-authored-by: inkcherry <mingzhi.liu@intel.com>
Co-authored-by: hotsuyuki <hotsuyuki.kawanishi@gmail.com>
Co-authored-by: Jinghan Yao <yjhmitweb@cardinal-rw02.ten.osc.edu>
7 participants