
Update efa-env-var.md #760

Open
wants to merge 1 commit into base: master


@KeitaW KeitaW commented Dec 31, 2024

This change makes efa-env-var.md consistent with https://github.com/aws-samples/awsome-distributed-training/blob/main/1.architectures/efa-cheatsheet.md

Issue #, if available:

Description of changes:

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@KeitaW KeitaW requested a review from a team as a code owner December 31, 2024 00:20
Member

@rajachan rajachan left a comment

Thanks for the PR, @KeitaW. Since we folded this file into the repo, we have improved the project's wiki pages, and all the environment variables are now documented here: https://github.com/aws/aws-ofi-nccl/wiki/Environment-Variables . A cheatsheet-style doc like this one is bound to get stale (it already is, with incorrect recommendations and missing tunables). IMO, the wiki is the best place to capture all configurations, and I am in favor of removing this file altogether. I can move the contents into the wiki if you agree.

@aws-nslick
Contributor

Unfortunately, unless you're a team member you cannot push to the wiki or submit PRs against it, but I want to note here for others that GitHub supports editing wiki pages locally:

aws-ofi-nccl.wiki on master [?]
❯ git rev-list --all
1a38c65e376751bf5f41378ee79fbec0552637aa
1a7b16347291974a515e930496082cecc2cae7df
f8104e3fa4ab77b63e9d85e6adb2da0ce9a09286
de297662553dab93a4d9d41ff66c4ab06edabf0b
dce70c4b358112e8c733088202ce82421bee90fe
d6485f3d0363b13144e797eec633d1deb371606f
c587642e2ee9e5e1a6c71a479ca1fcb1b3355ddb
47581c73baefd3238ac89ae2b4ec9b872da278bc
10b09bad5596ea8653b4e80d91b7b6945dc388c4
bda4d469dce95d8f23fd85c5b79c47ac4b2989b4
b6232d9185d63bc59a3071ace1c858da99deea13
befc654e11e47572d40524d7e63680b18f3aa014
633a57dc32aebc2cd27338b634cfe5dd4391a9c9
b69ff42b99be8be075fb6e757b6e03f714c1a003
fef45007c0e560ffb45ba776aae09f738b9240f3
048cbeb331a5345cd1707fc4559935ecaf8f3d8d
506b405bb2d49655fcc2d0374fc1c593eb9a41d6
092daf8e6c84a8f2b55ee531c9f6693af0d2a3b1
8b49c87dd4ea575dc8c5032922bbbf44fbb44429
cc7a06e08c66cc589403152547d85e4cdb53ad63
42d39a63e46a9fe5e70c64c7c6fab6d251e4fc1d
aws-ofi-nccl.wiki on master [?]
❯ git remote -v
origin	https://github.com/aws/aws-ofi-nccl.wiki.git (fetch)
origin	https://github.com/aws/aws-ofi-nccl.wiki.git (push)

I wonder what would happen if you made the doc folder in the main repo a git submodule pointing to this URL?
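
For anyone who wants to try this, here is a minimal sketch of both ideas using plain `git clone` and `git submodule add`. The helper names `clone_wiki` and `add_wiki_submodule`, the `WIKI_URL` override, and the default `wiki` path are assumptions for illustration, not anything the project ships. The submodule variant is untested against this repo, and since the main repo already has a doc folder, that directory would presumably need to be moved out of the way first.

```shell
#!/usr/bin/env bash
# Sketch only: treat the GitHub wiki as an ordinary Git repository.
# WIKI_URL defaults to the remote shown in the transcript above;
# the override exists so the helpers can be tried against another repo.
WIKI_URL="${WIKI_URL:-https://github.com/aws/aws-ofi-nccl.wiki.git}"

# clone_wiki DEST (hypothetical helper): clone the wiki into DEST.
clone_wiki() {
  local dest="${1:-aws-ofi-nccl.wiki}"
  git clone "$WIKI_URL" "$dest"
}

# add_wiki_submodule PATH (hypothetical helper): run from inside a checkout
# of the main repo to register the wiki as a submodule at PATH.
# PATH must not already exist as a tracked directory.
add_wiki_submodule() {
  local path="${1:-wiki}"
  git submodule add "$WIKI_URL" "$path"
}
```

After `clone_wiki`, the pages are regular Markdown files that can be edited and committed locally; pushing them back with `git push origin master` still requires write access to the wiki, as noted above.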

@KeitaW
Author

KeitaW commented Jan 22, 2025

Sorry for the late reply and thank you very much for your feedback. I agree with you that the wiki page is the right place to concentrate the information. If you could add the info in the PR to the wiki as well, that would be awesome.

3 participants