Multi progbar backend #358

mkolopanis · 2021-08-03T21:12:17Z

Adds tqdm progress bar as a backend option for run_uvdata_uvsim also implements a send and receive protocol for run_uvdata_uvsim where the tasks are chunked and sent out to PUs on request.

Motivation and Context

I like how tqdm looks honestly and was tired of two gross branches continually rebasing. This allows for all backends depending on the user's preference. I think I would prefer having some kind of plugin or something to allow custom progress bars but that sounds much more complicated.

TODO:

Show reference sims are identical
benchmarking reference sims
Documentation/tutorial about backend selection
changelog

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)
Reference simulation update or replacement
Documentation change (documentation changes only)
Version change
Build or continuous integration change

Checklist:

For all pull requests:

I have read the contribution guide.
My code follows the code style of this project.

New feature checklist:

I have added or updated the docstrings associated with my feature using the numpy docstring format.
I have updated the documentation to highlight my new feature (if appropriate).
I have added tests to cover my new feature.
All new and existing tests pass.
I have checked that I reproduce the reference simulations or if there are differences they are explained below (if appropriate). If there are changes that are correct, I will update the reference simulation files after this PR is merged.
I have checked (e.g., using the benchmarking tools) that this change does not significantly increase typical runtimes. If it does, I have included a justification in the comments on this PR.
I have updated the CHANGELOG.

codecov · 2021-08-03T21:15:48Z

Codecov Report

Attention: Patch coverage is 96.33028% with 8 lines in your changes missing coverage. Please review.

Project coverage is 98.44%. Comparing base (0876a90) to head (f4e5b25).

Files	Patch %	Lines
src/pyuvsim/uvsim.py	95.72%	8 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##              main     #358      +/-   ##
===========================================
- Coverage   100.00%   98.44%   -1.56%     
===========================================
  Files           11       11              
  Lines         2165     2317     +152     
===========================================
+ Hits          2165     2281     +116     
- Misses           0       36      +36

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

mkolopanis · 2021-08-04T15:23:26Z

pyuvsim/uvsim.py

+                if tag == mpi.tags.READY:
+                    pass
+                elif tag == mpi.tags.DONE:
+                    uv_container.data_array[msg[1]] += msg[0]
+                    pbar.update(1)
+
+                elif tag == mpi.tags.EXIT:
+                    completed_workers += 1
+                else:
+                    raise ValueError(f"{msg} {tag}")


There's two lines in here that just won't ever get touched by tests. the READY tag is never sent back from a worker thread, and I think the else catch was a debug thing that just never went away just in case. Ideally the "enum" defined in the mpi module is an exhaustive list of statuses that can be sent back. Would it be preferred to just take out these lines?

aelanman · 2021-08-04T17:04:32Z

While I fully support switching to tqdm over progsteps, I'm still not convinced of the push/pull model for task distribution. Has it been shown conclusively that active distribution performs better than the existing system? It looks like it should be capable of about the same performance except with an added overhead from using blocking sends to return visibilities to rank 0.

mkolopanis · 2021-08-04T17:52:14Z

Definitely part of this endeavor is to just get the technology in the repo right alongside the other things in order to test it better. There were branches before, but it was almost impossible to rebase or something after a new feature was brought in. Even just sitting on this branch, it is much easier to make a comparison.

I think one thing we should do is to get a few really good timing tests. We could always drop the send/recv part in the end.

move inner progsteps to own function add in tqdm progressbar, check diffuse analytic, check for npus > 1 add tqdm to 'all' dependency, try to test importerror in min deps added a two helper functions for chunking iterators added the send_recv backend which combines with tqdm to distribute tasks propagate the backend keyword up to the run script allow all backends to use the profiling option make analytic diffuse test also run against all backends import from pyuvsim.uvsim in the import tqdm error test add tqdm to full test suite, make tests that run with only progsteps add test to run profiling for all backends. only make profiling output if uvdata_indices is not empty remove debug prints, remove thread_multiple as a parameter to run fix number of arguments to run_uvsim test define a proper enum for the tags added tqdm to option list in readme fixed typo in run_param_pyuvsim argument parser force ntasks tot to be an int, ceil to be conservative add a context manger to progsteps disentangle the progressbar from the backend; test all combinations revamp rma to not use rank 0, remove messaging equiv to s/r update tests to run in paralell. do gymnastics for warnings have newer tests account for parallel and backend differences propagate errors on worker nodes to rank0 in send_recv. always clean up the window in rma process. [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci restore mpich test yaml lost to rebase overlords update progsteps only ci yaml match versioning on progsteps_only yaml to other ci yamls break some check from run_uvdata_uvsim into a separate function

for more information, see https://pre-commit.ci

mkolopanis commented Aug 4, 2021

View reviewed changes

mkolopanis force-pushed the multi_progbar_backend branch from 8357105 to de5680a Compare August 11, 2021 20:20

mkolopanis force-pushed the multi_progbar_backend branch from 3d49b7f to c5ee33d Compare November 3, 2021 16:36

mkolopanis force-pushed the multi_progbar_backend branch from 3f8888d to 08f25e1 Compare November 12, 2021 17:43

mkolopanis force-pushed the multi_progbar_backend branch 2 times, most recently from 7daa109 to 186d4ba Compare January 25, 2022 16:58

mkolopanis force-pushed the multi_progbar_backend branch 2 times, most recently from 47c6ed9 to 34767ae Compare March 21, 2024 21:22

mkolopanis force-pushed the multi_progbar_backend branch 2 times, most recently from bf28b61 to f87a714 Compare April 1, 2024 17:16

mkolopanis and others added 8 commits June 27, 2024 09:35

[pre-commit.ci] auto fixes from pre-commit.com hooks

8ed8a0c

for more information, see https://pre-commit.ci

remove 10s timeout on non square bls test

20236cc

bump only progsteps test version to py3.10. give longer for some tests

786f7b1

rebase fall through the cracks

5c1f5b1

[pre-commit.ci] auto fixes from pre-commit.com hooks

fe3b1d4

for more information, see https://pre-commit.ci

remove duplicated _all__

53aa9d0

move gleam test to only run checks on rank 0

984d83d

mkolopanis force-pushed the multi_progbar_backend branch from 0d2f1f3 to 984d83d Compare June 27, 2024 17:10

[pre-commit.ci] auto fixes from pre-commit.com hooks

f4e5b25

for more information, see https://pre-commit.ci

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Multi progbar backend #358

Multi progbar backend #358

mkolopanis commented Aug 3, 2021

codecov bot commented Aug 3, 2021 •

edited

Loading

mkolopanis Aug 4, 2021

aelanman commented Aug 4, 2021

mkolopanis commented Aug 4, 2021

Multi progbar backend #358

Are you sure you want to change the base?

Multi progbar backend #358

Conversation

mkolopanis commented Aug 3, 2021

Motivation and Context

Types of changes

Checklist:

codecov bot commented Aug 3, 2021 • edited Loading

Codecov Report

mkolopanis Aug 4, 2021

Choose a reason for hiding this comment

aelanman commented Aug 4, 2021

mkolopanis commented Aug 4, 2021

codecov bot commented Aug 3, 2021 •

edited

Loading