From ff2c214992eb0eee6038a082d5e4f18de791785e Mon Sep 17 00:00:00 2001
From: Keita Watanabe
Date: Tue, 31 Dec 2024 09:19:49 +0900
Subject: [PATCH] Update efa-env-var.md

Making it consistent with
https://github.com/aws-samples/awsome-distributed-training/blob/main/1.architectures/efa-cheatsheet.md

Signed-off-by: Keita Watanabe
---
 doc/efa-env-var.md | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/doc/efa-env-var.md b/doc/efa-env-var.md
index cf75034e2..f5e04c56b 100644
--- a/doc/efa-env-var.md
+++ b/doc/efa-env-var.md
@@ -74,6 +74,11 @@ versions of your libfabric.
 behaves very differently, especially on newer kernels, where
 RDMAV_FORK_SAFE=1 can break things.
+
+`NCCL_SHM_USE_CUDA_MEMCPY=1`
+Set this when you run NCCL on g6/g5 instances. It gives 2x the performance of the default memcpy.
+
+
 RDMAV_*
 Do not use.
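
For reviewers who want to see how the new `NCCL_SHM_USE_CUDA_MEMCPY` entry might be applied, here is a minimal shell sketch. It is not part of the patch. The helper `is_g5_or_g6` and the `INSTANCE_TYPE` variable are hypothetical names introduced only for illustration, and the sketch assumes the instance type is already known as a string such as `g5.12xlarge`.

```shell
# Hypothetical helper (not from the patch): succeeds when the given
# instance type string names a g5 or g6 instance, e.g. "g5.12xlarge".
is_g5_or_g6() {
  case "$1" in
    g5.*|g6.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Assumed placeholder value; in a real job script this would come from
# the scheduler or from the launch configuration.
INSTANCE_TYPE="g5.12xlarge"

# Set the variable only on the instance families the doc entry names.
if is_g5_or_g6 "$INSTANCE_TYPE"; then
  export NCCL_SHM_USE_CUDA_MEMCPY=1
fi
```

The variable has to be exported before the NCCL processes start so that they inherit it. Other instance families are left at the NCCL default.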