#36 add DM support to field_checksum() and turn off field tiling and allow for that in copy_field() #37
Ready for review. This is required in order to fix the manual, distributed-memory version of nemolite2d.
@arporter I did a superficial read-through of the changes and added a couple of in-line comments.
Does this solve the manual MPI implementation in PSycloneBench without any change in that repository? If that's the case I will give it a go bringing the submodule to this branch to double check it works.
do it = 1, field_out%ntiles, 1
  do jj = field_out%tile(it)%whole%ystart, field_out%tile(it)%whole%ystop
    do ji = field_out%tile(it)%whole%xstart, field_out%tile(it)%whole%xstop
      field_out%data(ji,jj) = field_in%data(ji,jj)
!$OMP private(it,ji,jj) ?
I thought loop variables were private by default but I've just checked and that only applies to the loop being parallelised. I've therefore fixed this and checked that it compiles OK with OpenMP enabled (which it didn't but I've sorted that). In doing that I've realised that we don't in fact provide any support for building dl_esm_inf with OpenMP enabled so I've created a new issue (#38) in case we care in the future.
I am OK deferring this to #38, but now that you have added the DEFAULT(none) clause I assume the 'loop variables are private' is not necessarily true anymore.
I think the standard specifies that the loop variable is private and therefore it's not affected by the DEFAULT(none) clause.
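For anyone following the data-sharing discussion above, here is a minimal sketch of a tiled copy loop under `default(none)`. It is not the dl_esm_inf code: the plain arrays `xstart`, `xstop`, `ystart` and `ystop` are hypothetical stand-ins for the bounds held in `field_out%tile(it)%whole`, and the sizes are made up. Listing the inner loop indices in `private` explicitly is valid whichever reading of the standard is correct, so it sidesteps the question.

```fortran
program tiled_copy_sketch
  implicit none
  ! Hypothetical sizes and tile bounds (assumptions, not dl_esm_inf values)
  integer, parameter :: nx = 8, ny = 6, ntiles = 2
  integer :: xstart(ntiles), xstop(ntiles), ystart(ntiles), ystop(ntiles)
  real :: data_in(nx, ny), data_out(nx, ny)
  integer :: it, ji, jj

  ! Split the domain into two tiles along y
  xstart = 1
  xstop  = nx
  ystart = (/ 1, ny/2 + 1 /)
  ystop  = (/ ny/2, ny /)

  call random_number(data_in)
  data_out = 0.0

  ! default(none) forces every variable used in the region to be given an
  ! explicit data-sharing attribute; the loop indices are listed as private.
  !$omp parallel do default(none) schedule(static) &
  !$omp   shared(data_in, data_out, xstart, xstop, ystart, ystop) &
  !$omp   private(it, ji, jj)
  do it = 1, ntiles
     do jj = ystart(it), ystop(it)
        do ji = xstart(it), xstop(it)
           data_out(ji, jj) = data_in(ji, jj)
        end do
     end do
  end do
  !$omp end parallel do

  if (all(data_out == data_in)) then
     print *, "copy OK"
  else
     print *, "copy FAILED"
  end if
end program tiled_copy_sketch
```

This should build with something like `gfortran -fopenmp tiled_copy_sketch.f90`; without `-fopenmp` the directives are treated as comments and the loop runs serially.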
Thanks @sergisiso. Yes, it does solve the DM problem without any changes to PSycloneBench (stfc/PSycloneBench#20).
Thanks for spotting the errors Sergi. Ready for another look if Travis is happy...
Given that this is a fix for the DM version of NEMOLite2D, @sergisiso has also suggested we add DM support into the checksum routine. I'll do that as part of this PR.
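As a rough illustration of what a distributed-memory checksum involves, the sketch below sums a per-rank partial result across all ranks with `MPI_Allreduce`. It is only a sketch under stated assumptions and not the routine added in this PR: the array `local_data`, the size `nlocal`, and the choice of a sum of absolute values are all hypothetical.

```fortran
program checksum_sketch
  use mpi
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: nlocal = 4   ! hypothetical number of owned points per rank
  real(dp) :: local_data(nlocal), local_sum, global_sum
  integer :: ierr, rank

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)

  ! Fill the locally owned points with rank-dependent dummy values
  local_data = real(rank + 1, dp)

  ! Each rank sums only the points it owns (halo points would be excluded
  ! so that no grid point is counted twice)
  local_sum = sum(abs(local_data))

  ! Combine the partial sums so that every rank holds the global value
  call MPI_Allreduce(local_sum, global_sum, 1, MPI_DOUBLE_PRECISION, &
                     MPI_SUM, MPI_COMM_WORLD, ierr)

  if (rank == 0) print *, "checksum = ", global_sum

  call MPI_Finalize(ierr)
end program checksum_sketch
```

Built with an MPI compiler wrapper (for example `mpif90`) and launched with `mpirun`, every rank ends up with the same `global_sum`.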
@sergisiso would you mind trying this again with the Intel compiler and nemolite2d? I'm getting an ICE with gfortran :-(
I compiled it with gfortran without any issues and it runs fine:
It's in time_step_mod.f90:
I've just managed to build it OK on my desktop so I must have messed something up on my laptop (or there's a bug related to the specific version of gfortran that I have).
This is ready for another look @sergisiso.
See some in-line comments. The global_sum and its new test look good.
I still have some problems with the compiler. When compiling without MPI, using just gfortran but with the -fopenmp flag, I am getting the following error (note this is needed for the manual_verison/psykal_omp of nemolite2d):
gfortran -Wall -Wsurprising -Wuninitialized -faggressive-function-elimination -Ofast -mtune=native -finline-limit=50000 -fopt-info-all=gnu_opt_report.txt -march=core2 -mtune=core2 -ffree-line-length-none -fopenmp -c /home/sergi/workspace/euroexa/PSycloneBench/shared/dl_esm_inf/finite_difference/src//grid_mod.f90
/home/sergi/workspace/euroexa/PSycloneBench/shared/dl_esm_inf/finite_difference/src//grid_mod.f90:353:0:
do ji = xstart-1, xstop+1
Error: ‘xstart’ not specified in enclosing ‘parallel’
/home/sergi/workspace/euroexa/PSycloneBench/shared/dl_esm_inf/finite_difference/src//grid_mod.f90:353:0: Error: enclosing ‘parallel’
/home/sergi/workspace/euroexa/PSycloneBench/shared/dl_esm_inf/finite_difference/src//grid_mod.f90:353:0: Error: ‘xstop’ not specified in enclosing ‘parallel’
/home/sergi/workspace/euroexa/PSycloneBench/shared/dl_esm_inf/finite_difference/src//grid_mod.f90:353:0: Error: enclosing ‘parallel’
This again can be solved by removing the default(none) clause or by listing the variables in a shared() clause, but I am a bit puzzled as this was already on master and I didn't have this issue before.
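A minimal sketch of the second fix, for reference. The program, the array `a` and the bounds are made up for illustration and are not taken from grid_mod.f90; the point is only that with `default(none)` the variables used in the loop bounds have to appear in a data-sharing clause.

```fortran
program default_none_sketch
  implicit none
  integer, parameter :: n = 10
  integer :: xstart, xstop, ji
  real :: a(0:n+1)

  xstart = 1
  xstop  = n
  a = 0.0

  ! If shared(xstart, xstop, a) is dropped, gfortran reports errors such as
  ! "'xstart' not specified in enclosing 'parallel'" for this loop.
  !$omp parallel do default(none) shared(xstart, xstop, a) private(ji)
  do ji = xstart-1, xstop+1
     a(ji) = real(ji)
  end do
  !$omp end parallel do

  print *, sum(a)
end program default_none_sketch
```

The error only appears when the file is compiled with `-fopenmp`, since otherwise the directives are ignored as comments.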
Finally, the manual_verison/psykal_dm still gives a different checksum for rank > 4 compared to the serial version. If the other issues mentioned are fixed I can approve this PR, but we still have to find the cause of the PsycloneBench/20 issue.
I too don't get this problem but then I see from the Makefile that we don't build dl_esm_inf with OpenMP enabled - just the mini-app itself. Therefore we will only build dl_esm_inf with OpenMP if we set F90FLAGS to include "-fopenmp" before we make. Given that we don't have explicit support in dl_esm_inf for building with OpenMP I think we should leave this for the moment.
I've built the
@arporter I tried again with the current HEAD of the branch, with the same namelist I was using and a few others, and now I get the same checksums regardless of the number of ranks. It was probably me messing something up trying to force the OMP flag into the compilation. So if you can bring the FortCL submodule to master I am happy to merge the PR.
@sergisiso Ooh, I've made it go faster? That feels good :-) Ready for another look now.
All looks good now. I did remove the performance comment because I realized I was reporting wrong times, but the performance degradation I observed some time ago is now gone.
Small fix that corrects use of tiling in the copy_field() routine.