-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add reference implementation for parallel_phase feature #1570
Conversation
@akukanov @aleksei-fedotov @vossmjp Could you please take a look at the PR in terms of implementation and ABI. P.S. Tests are still WIP. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a complete review, just a couple of starters to think about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bunch of comments from me. Have not looked at the tests yet. Also, not finished reviewing the PHASE_* switching logic because of found issue.
test/tbb/test_task_arena.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition, I think need to cover in tests the following:
- The DELAYED_LEAVE is returned back once new task is submitted.
- Here I think we can test that the return of a worker after one-time-fast-leave happens generally longer than the following returns made after new task(s) is/are submitted.
- Nested parallel phases with combinations of DELAYED and (ONE TIME) FAST leaves.
Signed-off-by: pavelkumbrasev <pavel.kumbrasev@intel.com>
Signed-off-by: pavelkumbrasev <pavel.kumbrasev@intel.com>
Signed-off-by: pavelkumbrasev <pavel.kumbrasev@intel.com>
Signed-off-by: Isaev, Ilya <ilya.isaev@intel.com>
Signed-off-by: Isaev, Ilya <ilya.isaev@intel.com>
…rallel_block Signed-off-by: Isaev, Ilya <ilya.isaev@intel.com>
Signed-off-by: Isaev, Ilya <ilya.isaev@intel.com>
Signed-off-by: Isaev, Ilya <ilya.isaev@intel.com>
Signed-off-by: Isaev, Ilya <ilya.isaev@intel.com>
Signed-off-by: Isaev, Ilya <ilya.isaev@intel.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My comments below are not critical; feel free to commit as-is and address later.
include/oneapi/tbb/task_arena.h
Outdated
@@ -95,6 +95,11 @@ TBB_EXPORT void __TBB_EXPORTED_FUNC isolate_within_arena(d1::delegate_base& d, s | |||
TBB_EXPORT void __TBB_EXPORTED_FUNC enqueue(d1::task&, d1::task_arena_base*); | |||
TBB_EXPORT void __TBB_EXPORTED_FUNC enqueue(d1::task&, d1::task_group_context&, d1::task_arena_base*); | |||
TBB_EXPORT void __TBB_EXPORTED_FUNC submit(d1::task&, d1::task_group_context&, arena*, std::uintptr_t); | |||
|
|||
#if __TBB_PREVIEW_PARALLEL_PHASE | |||
TBB_EXPORT void __TBB_EXPORTED_FUNC register_parallel_phase(d1::task_arena_base*, std::uintptr_t); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason to use "register" and "unregister" in the names for the entry points? To me "registration" does not imply the start and end of the active time in the region, but might be used to indicate that a region just exists. For example, an athlete might register for a race but that doesn't mean they've started running.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the main idea was to use more generic names, which do not depend on feature API. So, if API will change dramatically, entry point names would still make some sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There might be alternative suitable generic verbs, such as "initiate", "enter", "commence", "launch", "open" for the beginning and "complete", "conclude", "close", "finish", "exit".
From the semantical point of view (i.e. speaking of a program execution phase), I would likely use "enter/exit" or "initiate/complete"; but since it's an internal entry point, I am OK with any choice made, including "register/unregister".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of the suggestions, I like "enter/exit".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
auto median_automatic = utils::median(times_automatic.begin(), times_automatic.end()); | ||
auto median_fast = utils::median(times_fast.begin(), times_fast.end()); | ||
|
||
WARN_MESSAGE(median_automatic < median_fast, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are all of the tests guaranteed to work (i.e. not have warnings) for hybrid systems where automatic is not delayed leave?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a good question. I suppose that only one test case(Parallel Phase retains workers in task_arena) would produce expected results on hybrid systems. Other tests don't make much sense on such configurations, but I think it is fine since these are warnings, not asserts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we skip the warning on hybrid systems? For example, don't output warning if there is more than 1 core type.
Co-authored-by: Alexey Kukanov <alexey.kukanov@intel.com> Signed-off-by: Isaev, Ilya <ilya.isaev@intel.com>
Signed-off-by: Isaev, Ilya <ilya.isaev@intel.com>
c05d9a7
to
515fd33
Compare
Signed-off-by: Isaev, Ilya <ilya.isaev@intel.com>
Signed-off-by: Isaev, Ilya <ilya.isaev@intel.com>
Signed-off-by: Isaev, Ilya <ilya.isaev@intel.com>
Signed-off-by: Isaev, Ilya <ilya.isaev@intel.com>
@aleksei-fedotov @akukanov @vossmjp @pavelkumbrasev @dnmokhov I have updated the RFC document: included some technical details and conditions to leave the experimental stage. |
Signed-off-by: Isaev, Ilya <ilya.isaev@intel.com>
@@ -433,6 +528,9 @@ void arena::advertise_new_work() { | |||
workers_delta = 1; | |||
} | |||
|
|||
#if __TBB_PREVIEW_PARALLEL_PHASE | |||
my_thread_leave.reset_if_needed(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will the following assert be useful?
my_thread_leave.reset_if_needed(); | |
__TBB_ASSERT(mandatory_delta > 0 || workers_delta > 0, "Potentially delaying fast leave?"); | |
my_thread_leave.reset_if_needed(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not a valid assert. In case of workerless arena and spawn of the task both of those values will be zeroes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay... let it be then:
my_thread_leave.reset_if_needed(); | |
__TBB_ASSERT((mandatory_delta > 0 || workers_delta > 0) || | |
(is_arena_workerless() && work_type == work_spawned), | |
"Potentially delaying fast leave?"); | |
my_thread_leave.reset_if_needed(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First of all, I do not readily see how this assert is related to the parallel phase.
And if you move the assertion next to the code in the lines 522-529, it seems it just duplicates almost exactly the conditions and assignments there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I know that it duplicates. But the stress here is on the words "almost exactly". With the assert I sort of wanted to make sure we do not fall into recently discovered issue of dropping the fast leave setting when threads are recalled from arena. I don't mind not having an assert here, since, as you said, the conditions duplicate it. However, I do think we need to tell reader about the desired behavior here, it feels like the code itself might not be good at telling that.
Another interesting consideration is to reduce cross-NUMA accesses as much as possible. If thread can base its decision whether it needs to read and/or modify global-to-arena variables on the data it has already read, then it makes sense to avoid doing more re-reads of other global-to-arena data. Of course, if we are sure that the set of "almost exactly" is empty here, then the thread already avoids unnecessary touches to the data that increases cross-NUMA traffic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assertions are good to enforce preconditions for a function, and in some cases of sophisticated internal logic also good to check expected invariants in certain point. Having an assertion that checks the conditions explicitly set by the code right above is a pure duplication. The part that is not an exact duplication could be checked in a proper place, of course; also reset_if_needed
could internally assert on its expected preconditions.
For the NUMA part of the comment, I cannot make the connection to the assertion being discussed.
Signed-off-by: Isaev, Ilya <ilya.isaev@intel.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that might also simplify the implementation logic.
Below is how. Note that separation of significant bits for different leave states was a necessary prerequisite to operate with ONE_TIME_FAST_LEAVE as a single bit flag.
Signed-off-by: Isaev, Ilya <ilya.isaev@intel.com>
__TBB_ASSERT(prev - PARALLEL_PHASE < prev, | ||
"A call to unregister without its register complement"); | ||
desired = prev - PARALLEL_PHASE; // Mark the end of this phase in reference counter | ||
if (enable_fast_leave && /*it was the last parallel phase*/desired == DELAYED_LEAVE) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit of nitpicking: the comment does not really match the condition tested. To check if that's the last parallel phase, you'd use desired < PARALLEL_PHASE
. Here you also check that neither of the fast-leave flags is set.
Perhaps it can be improved like this:
if (enable_fast_leave && /*it was the last parallel phase*/desired == DELAYED_LEAVE) { | |
if (enable_fast_leave && /*it was the last parallel phase, and delayed leave is the default*/ desired == DELAYED_LEAVE) { |
But also feel free to ignore :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. At some point we need to improve tests though.
Description
Add a comprehensive description of proposed changes
Fixes # - issue number(s) if exists
Type of change
Choose one or multiple, leave empty if none of the other choices apply
Add a respective label(s) to PR if you have permissions
Tests
Documentation
Breaks backward compatibility
Notify the following users
List users with
@
to send notificationsOther information