How to specify parameters to allocate distributed memory #293
These are the commands I use in the two runs:
Huh... actually, I have the same problem when I try to run a scale-27 Graph500 on 10 nodes with 15 GB of memory each. It seems that Grappa just allocates all the memory on node 0; here is a hint: https://github.com/uwsampa/grappa/blob/master/system/GlobalAllocator.hpp#L86. That is odd...
That piece of code is just saying that core 0 runs the code that allocates a region of the shared address space. The actual storage backing this address space is allocated when the Grappa program starts, and it is always striped across all the cores in the cluster. If this is not happening, it's most likely a problem with your MPI installation keeping your cores from talking to each other. Usually the best way to debug this is to run the Grappa hello_world program. If it prints "hello from core 0 of 1" over and over again, there's an MPI problem unrelated to Grappa. If it prints "hello from core n of m" where n and m make sense, then there's some other problem.

And to respond to the question from October: the stats printed when Grappa starts are all per-node stats we use for debugging internal data structures, so you shouldn't expect them to change significantly as you add more nodes. I don't believe Grappa prints the total amount of shared memory across all the nodes unless you request it.
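For reference, a minimal sketch of the kind of hello_world check described above (assuming the standard Grappa API of init/run/on_all_cores/mycore/cores; the demo shipped with the repo is similar):

```cpp
#include <Grappa.hpp>

// Each core logs its own id, so the output shows whether MPI actually
// launched m cores that can talk to each other ("core n of m"), or m
// isolated single-core jobs ("core 0 of 1" printed over and over).
int main(int argc, char* argv[]) {
  Grappa::init(&argc, &argv);
  Grappa::run([]{
    Grappa::on_all_cores([]{
      LOG(INFO) << "hello from core " << Grappa::mycore()
                << " of " << Grappa::cores();
    });
  });
  Grappa::finalize();
}
```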
Yeah, nelsonje is right. After taking a closer look at that piece of code, I now understand the magic of your PGAS...
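To make the allocation semantics concrete, here is a hedged sketch of the pattern the linked code implements, assuming Grappa's global_alloc/forall/global_free API: the allocation request is delegated to core 0, but the region it returns lives in the global heap that is striped across every core.

```cpp
#include <Grappa.hpp>

int main(int argc, char* argv[]) {
  Grappa::init(&argc, &argv);
  Grappa::run([]{                     // the root task executes on core 0
    size_t n = 1 << 20;
    // The allocation request is handled on core 0 (the code linked above),
    // but the bytes it hands back come from the global heap, which is
    // striped across all the cores in the job.
    GlobalAddress<int64_t> arr = Grappa::global_alloc<int64_t>(n);
    // Each iteration runs on the core that owns that element, so this
    // loop touches memory on every core, not just node 0.
    Grappa::forall(arr, n, [](int64_t& e){ e = Grappa::mycore(); });
    Grappa::global_free(arr);
  });
  Grappa::finalize();
}
```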
Hi, I ran the PageRank in applications/graphlab/pagerank. It finishes so fast that I suspect the memory is allocated locally and there is only one worker.
I ran PageRank on the same graph twice, first with 1 node and then with 10 nodes, but the shared-memory breakdown in the output of the two runs is identical: for example, "node total" is 31 GB in both runs, "locale shared heap total" is 18 GB, and so on.
Why is the reported total shared memory still the same when I run with 10 nodes?
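One way to check empirically whether the global heap is really striped, rather than reading the startup stats (which, per the answer above, are per-node), is to allocate a global array and count on each core how many elements land there. A sketch under the same API assumptions as above:

```cpp
#include <Grappa.hpp>

int64_t local_count = 0;  // one instance per core (each core is its own process)

int main(int argc, char* argv[]) {
  Grappa::init(&argc, &argv);
  Grappa::run([]{
    size_t n = 1 << 24;
    GlobalAddress<char> arr = Grappa::global_alloc<char>(n);
    // forall runs each iteration on the element's owning core, so a
    // core's counter only grows for elements stored on that core.
    Grappa::forall(arr, n, [](char& e){ local_count++; });
    Grappa::on_all_cores([]{
      LOG(INFO) << "core " << Grappa::mycore()
                << " owns " << local_count << " elements";
    });
    Grappa::global_free(arr);
  });
  Grappa::finalize();
}
```

With 10 nodes the counts should come out roughly even across all cores; if one core reports owning everything, the job really is running on a single core and the MPI setup is the place to look.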