Conference call notes 20210609
Kenneth Hoste edited this page Jun 23, 2021 · 5 revisions
Notes on the 174th EasyBuild conference call, Wednesday June 9th 2021 (08:00 UTC)
Alphabetical list of attendees (10):
- Sebastian Achilles (Jülich Supercomputing Centre, Germany)
- Miguel Dias Costa (National University of Singapore)
- Alexander Grund (TU Dresden, Germany)
- Jorge Guerra (Universidad Politécnica de Madrid, Spain)
- Kenneth Hoste (HPC-UGent, Belgium)
- Adam Huffman (Big Data Institute, Oxford, UK)
- Kurt Lust (Univ. of Antwerp, Belgium + LUMI User Support Team)
- Alan O'Cais (Jülich Supercomputing Centre, Germany)
- Mikael Öhman (Chalmers University of Technology, Sweden)
- Jörg Saßmannshausen (NIHR Biomedical Research Centre, UK)
- overview of recent developments
- Q&A
- last release: EasyBuild v4.4.0 (June 2nd)
- ETA next release: early July
- recent changes
- framework
- bug fixes
- various fixes for Fujitsu toolchain support (PR #3704, PR #3712, PR #3713, PR #3714, PR #3717, PR #3721, PR #3731)
- Miguel: toolchain concept in EasyBuild helps a lot here
- now looking into `numpy`
- easy when you ignore the failing tests...
- currently fighting with numpy compiler detection
- fix support for specifying multiple PRs to `--from-pr` (PR #3707, PR #3708)
- avoid overwriting `pr_nr` in `post_pr_test_report` for reports with `--include-easyblocks-from-pr` (PR #3724 + PR #3726)
- results in posting test report in easyblock PR rather than easyconfigs PR, but only when running EasyBuild on top of Python 2
- enhancements
- changes
- drop support for Python 2.6 (PR #3715)
- easyblocks
- bug fixes
- correctly handle empty list of sources in `PythonPackage._should_unpack_source` (PR #2442)
- bug introduced in PR that skips unpacking of `*.whl` files
- make sure that self.python_cmd is set before using it in PythonPackage.sanity_check_step (PR #2447)
- required to avoid breaking `--module-only` for easyconfigs using `PythonBundle`, now that we're also checking extensions when using `--module-only` (unless `--skip-extensions` is used)
- only use siterc fix for NVHPC < 21.3 (PR #2453)
- enhancements
- enhance sanity check for Clang to verify if CUDA offload library was produced (PR #2454)
- new software
- add custom easyblock for FreeFEM (PR #1969)
- add custom easyblock for NCCL (built from source) (PR #2337 + PR #2460)
- some PyTorch 1.8.1 tests fail with NCCL 2.8.3, fixed in NCCL 2.8.4
- currently can't patch NCCL due to binary install
- replacing NCCL 2.8.3 with 2.8.4 is painful (can lead to module clashes for stuff that's already installed)
- alternative: build NCCL from source + apply patch
- we do need to bump toolchain for NCCL to `GCCcore` from `system`
- see NCCL easyconfigs PRs https://github.com/easybuilders/easybuild-easyconfigs/pull/13071 + https://github.com/easybuilders/easybuild-easyconfigs/pull/12183
- add custom easyblock for AOMP (AMD OpenMP compiler) (PR #2435)
- changes
- (none)
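
The NCCL toolchain bump from `system` to `GCCcore` noted above also changes the name of the generated module. As a rough sketch only (not EasyBuild's actual code), the helper below imitates how the default module naming scheme composes a module name. The function name `module_name`, the GCCcore version and the CUDA version suffix are illustrative assumptions and are not taken from the PRs.

```python
def module_name(name, version, toolchain, versionsuffix=''):
    """Compose <name>/<version>[-<tcname>-<tcversion>][<versionsuffix>].

    Simplified imitation of EasyBuild's default module naming scheme;
    the toolchain label is left out for the system toolchain.
    """
    if toolchain['name'] == 'system':
        tc_label = ''
    else:
        tc_label = '-%s-%s' % (toolchain['name'], toolchain['version'])
    return '%s/%s%s%s' % (name, version, tc_label, versionsuffix)


# binary install: system toolchain
binary = module_name('NCCL', '2.8.3', {'name': 'system', 'version': 'system'},
                     versionsuffix='-CUDA-11.1.1')
# source build: GCCcore toolchain (version chosen for illustration only)
source = module_name('NCCL', '2.8.3', {'name': 'GCCcore', 'version': '10.2.0'},
                     versionsuffix='-CUDA-11.1.1')
print(binary)  # NCCL/2.8.3-CUDA-11.1.1
print(source)  # NCCL/2.8.3-GCCcore-10.2.0-CUDA-11.1.1
```

Since the two names differ, a source build under these assumptions would get its own module rather than overwriting the one from the binary install.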
- easyconfigs
- close to 100 easyconfig PRs merged since last conf call
- over 10,000 easyconfig PRs merged! \o/
- #10,000 was easyconfig for EasyBuild v4.4.0 (PR #13012)
- bug fixes
- add patches for PyTorch 1.7.1 avoiding failures on POWER and A100 (PR #12753)
- fix download URL for DB 18.1.40 (PR #12974)
- fix test failure in TensorFlow 2.4.1 on recent CUDA drivers (PR #12979)
- add elfutils as build dependency for Clang easyconfigs with CUDA dependency (PR #13008 + PR #13015)
- add patch to fix buffer overflow in OpenMPI 4.1.x (PR #12983)
- add patch to fix installation of HDF 4.2.15 on aarch64 (PR #13059)
- add new checksum of mvabund to R v4.0.4 (PR #13020 + PR #13021)
- fix checksum for snpEff 5.0 (PR #13062)
- enhancements
- add check to easyconfigs test suite to ensure OpenSSL wrapper is used in easyconfigs using a recent toolchain (PR #13079)
- new software
- noteworthy software updates
- noteworthy changes
- update easyconfigs for binutils 2.35 to use binutils 2.35.2 source tarball instead to pick up bug fixes (PR #12967 + PR #12988)
- promote `foss/2021.04` to `foss/2021a` (PR #12975)
- promote intel/2021.03 to intel/2021a (PR #12976)
- add UCX patch to allow overriding modules (PR #12980)
- to facilitate collapse of `foss` and `fosscuda` (`2021a`)
- disable debuginfod for elfutils to minimize required dependencies (PR #13034)
- to merge/fix/tackle soon
- framework
- reported bugs / bug fixes
- specified easyblock for extension is not taken into account (issue #3710)
- fix crash in `get_config_dict` when copying modules that were imported in easyconfig file (like 'import os') (PR #3729, fixes issue #3727)
- rebuilding module breaks for HMNS if there are sort-of circular builddependencies (e.g. XZ and gettext) (issue #3722)
- enhancements
- support additional features in easystack files
- support for filtering via labels (PR #3620)
- avoid using a priority in `prepend_module_path` (Lmod) to avoid costly module calls (PR #3636)
- add support for installing extensions in parallel (WIP) (PR #3667)
- add `make_extension_string` and `_make_extension_list` to `EasyBlock` (PR #3697)
- related to avoiding duplicates in Perl extensions
- enhance detection of patch files with better error messages (PR #3709)
- add per-step timing information (PR #3716)
- add module-write hook (PR #3728)
- add option to ignore failing test step (`--ignore-test-failure`) (PR #3732)
- finding modules with multiple modulepaths and HMNS (issue #3703)
- changes
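
To illustrate the kind of customization the module-write hook listed above (PR #3728) is meant to enable, here is a minimal sketch of a hook function. The hook name and its arguments (`self`, `filepath`, `module_txt`) are assumptions about the interface, and the appended comment text is purely hypothetical; check the PR for the actual behaviour.

```python
def module_write_hook(self, filepath, module_txt, *args, **kwargs):
    """Return the module file text with an extra comment line appended.

    ASSUMPTION: the hook name and arguments are a guess at the interface
    proposed in PR #3728 and may differ from what is merged.
    """
    # Lua module files use '--' for comments, Tcl module files use '#'
    comment = '--' if filepath.endswith('.lua') else '#'
    note = '%s site-specific note (hypothetical)\n' % comment
    return module_txt.rstrip('\n') + '\n' + note


# stand-alone demonstration with a dummy 'self' and a made-up path
demo = module_write_hook(None, '/tmp/modules/foo/1.0.lua', 'help([[foo]])\n')
print(demo)
```

A function like this would live in a hooks file passed to `eb` via `--hooks`.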
- easyblocks
- bug fixes
- treat files/directories of unpacked sources equally in `PackedBinary` (PR #2306)
- `--module-only` doesn't always work as expected
- we need a better way of catching this in tests
- problem is that you typically need an actual installation to catch these problems, so can't be done in easyconfigs or easyblocks test suite run in CI
- test installations done on `generoso` via `boegelbot` could be enhanced to catch problems with `--module-only`?
- explicitly use only OpenBLAS for PyTorch if MKL is not used (PR #2448)
- fix CPU-only runtime for dpcpp-generated executables in custom easyblock for `intel-compilers` (oneAPI) (PR #2457)
- enhancements
- enhance test and install step of `CMakePythonPackage` easyblock (PR #2318)
- add support for installing R extensions in parallel (WIP) (PR #2408)
- allow for Perl modules being part of other, already installed Perl modules (PR #2386)
- including FlexiBLAS as the default BLAS in foss will require easyblock changes (issue #2421)
- should set `BLA_VENDOR` in `CMakeMake` easyblock if BLAS is in the toolchain (PR #2420)
- enhance `sitecfg` to support overriding core Python packages (PR #2458)
- enable make check and sanity check exec for OpenMPI (PR #2444)
- add CMake support for Amber 20 (PR #2445)
- enhance `ConfigureMake` generic easyblock to add support for building multiple build targets (PR #2449)
- update custom easyblock for Boost to always build single and multi threaded versions (PR #2456)
- included CMake modules are broken (and skipped) because we install Boost libraries multiple times
- update CMakeMake to handle old and new Boost/Boost.Python builds using custom easyblock for Boost (PR #2461)
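
For the `BLA_VENDOR` item above (PR #2420), the idea can be sketched as a lookup from the BLAS library in the toolchain to the vendor name that CMake's FindBLAS module expects. This is an illustration under assumptions, not the easyblock's implementation: the names `BLA_VENDORS` and `bla_vendor_option` are hypothetical, and the choice of `Intel10_64lp` for `imkl` is just one of several MKL variants FindBLAS accepts.

```python
# Map from BLAS software name (as used in EasyBuild toolchains) to a
# CMake FindBLAS vendor name. ASSUMPTION: this mapping is illustrative only
# and is not what PR #2420 implements.
BLA_VENDORS = {
    'OpenBLAS': 'OpenBLAS',
    'FlexiBLAS': 'FlexiBLAS',
    'imkl': 'Intel10_64lp',
}


def bla_vendor_option(blas_name):
    """Return the CMake option for the given BLAS library, or None if unknown."""
    vendor = BLA_VENDORS.get(blas_name)
    if vendor is None:
        return None
    return '-DBLA_VENDOR=%s' % vendor


print(bla_vendor_option('FlexiBLAS'))  # -DBLA_VENDOR=FlexiBLAS
print(bla_vendor_option('imkl'))       # -DBLA_VENDOR=Intel10_64lp
```

Returning `None` for an unknown library leaves the choice to CMake's own detection, which seems the safer default when no BLAS is in the toolchain.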
- changes
- (nothing major)
- new software
- new easyblock for NCCL (built from source) (PR #2337)
- easyconfigs
- TODO: `fosscuda/2021a`
- collapsing `foss` and `fosscuda` toolchains
toolchains - see https://github.com/easybuilders/easybuild-easyconfigs/issues/12484
- status? (Mikael)
- we wouldn't have `fosscuda/2021a` anymore, just depend on an extra dependency (`UCX-CUDA`) for GPU Direct RDMA support
- `CUDA` bundle that depends on `CUDAcore` + `UCX-CUDA` to use as dependency?
- CUDA support should be reflected in `versionsuffix`?
- Intel MPI probably doesn't even support GPU Direct?
- upstream UCX issue opened, developers were open to suggestion of environment variable
- currently hardcoded in `UCX-CUDA` easyconfig PR, could be better to have a custom easyblock for this?
- CUDA 11.1.3 is not supported with GCC 10.3
- known issues with GCC 10.3
- patches already available for a compiler error (C++ template parsing error)
- CUDA 11 is only compatible with GCC 9.x officially on `x86_64`
- custom AMD toolchain (AOCC, AOMP, AMD BLIS, etc.)?
- AOCC currently best compiler for CPUs, no GPU offloading support
- AOMP comes with ROCm, behind on CPU optimizations
- there's even a 3rd one (development only)
- can we include both AOCC and AOMP in an AMD toolchain?
- these compilers are not compatible for Fortran...
- would there be a CPU and a GPU variant?
- CPU: AOCC-based
- GPU: AOMP-based + ROCm
- Alexander: include `GCC` rather than `binutils` as build dependency
- a lot easier, don't have to figure out which binutils version matched with GCCcore
- Mikael: should we be more careful with `$CPATH` when building binutils?
- there may be some mixup there...
- see problem reported by Mikael on having to build binutils 2.35.2 with system before the one with GCCcore