-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MDEV-28452 wsrep_ready: OFF after MDL BF-BF conflict #426
base: 10.6
Are you sure you want to change the base?
Conversation
Galera Cluster failed when two transactions that were executed serially on a master node were incorrectly executed in parallel on a slave node. The slave node detected MDL BF-BF conflict and quit the cluster. The problem was caused by OPTIMIZE TABLE on a table that is the child table of a foreign key constraint. If UPDATE was performed on the parent table of the foreign key constraint while OPTIMIZE TABLE was running on the child table, the master node would run the transactions serially, but a slave node might run them in parallel resulting in MDL BF-BF conflict. The problem is fixed by adding foreign key constraint to the replicated write set preventing parallel apply of the transactions on the slave node.
Removed debug messages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test case needs work to remove unnecessary sleeps by replacing them with wait_condition or debug_sync WAIT_FOR/SIGNAL controlling.
# 1) OPTIMIZE TABLE on the child table of the foreign key constraint, | ||
--connect node_1a, 127.0.0.1, root, , test, $NODE_MYPORT_1 | ||
--connection node_1a | ||
--sleep 5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you really need this sleep? I suggest using wait_condition instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, this sleep is not needed.
Removed the sleep.
|
||
# create two tables | ||
CREATE TABLE `user` ( | ||
`id` char(36) COLLATE utf8mb4_unicode_ci NOT NULL, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is these table column names directly from customer and do you really need all columns?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These column names are indeed copied exactly from the customer case.
I have now removed all those columns that are not part of the keys and, thus, do not affect the test in any way.
But it would be possible to create a minimal test case with table names t1
and t2
and with just two columns per table.
|
||
# allow the stopped OPTIMIZE TABLE transaction to proceed | ||
--connection node_2a | ||
--sleep 5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These sleeps make test slow, there is mechanism to wait until we have reached debug sync by sending messages and then using WAIT_FOR e.g see test MW-369.inc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have not been able to replace this sleep with other synchronization mechanism, such as WAIT_FOR.
But I commented out the sleep and the test seems to work without it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replaced this sleep with DEBUG_SYNC synchronization mechanism.
|
||
# cleanup | ||
--connection node_1 | ||
--sleep 5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove all sleeps and rely on either wait_condition (using processlist) or debug_sync WAIT_FOR/SIGNAL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have now removed two of the three sleeps in the MTR test.
Removing the last sleep is more tricky.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed also the last sleep.
sql/mdl.cc
Outdated
DBUG_EXECUTE_IF("sync.mdev_28452", | ||
{ | ||
const char act[]= | ||
"now " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here you could signal that we have reached this code by now signal mdev_28452_reached and you could wait that on test case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How could I wait for a signal in the MTR code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
now SIGNAL mdev_28452_reached to above code and in mtr DEBUG_SYNC='NOW WAIT_FOR mdev_28452_reached";
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed as suggested.
Now there are no sleeps in the MTR test.
sql/sql_admin.cc
Outdated
} | ||
} | ||
} | ||
#else | ||
WSREP_TO_ISOLATION_BEGIN_WRTCHK(NULL, NULL, first_table); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The line WSREP_TO_ISOLATION_BEGIN_WRTCHK()
should be removed. It is replaced by your code above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you are right.
Fixed as suggested.
there. Variables are reset back in THD::reset_for_next_command() before | ||
processing of next command. | ||
*/ | ||
if (wsrep_auto_increment_control) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why we need to change auto_increment variables here.
What happens if you don't?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested also without changing auto_increment_*
variables and the MTR tests succeeded.
I don't understand what these variable do, but the code segment was copied from
Sql_cmd_alter_table::execute()
(in sql/sql_admin.cc
) and this fix is very similar to that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optimize table could do table re-create, but I do not know if that has any effect on autoinc values (could be verified using show create table x; in both nodes).
sql/sql_admin.cc
Outdated
} | ||
|
||
wsrep::key_array keys; | ||
if (!wsrep_append_fk_parent_table(thd, first_table, &keys)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This call to function wsrep_append_fk_parent_table()
will be executed also when TOI applying. This is unnecessary.
I think it would be best to add clause wsrep_thd_is_local(thd)
to the if
statement above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed as suggested.
|
||
wsrep::key_array keys; | ||
if (wsrep_thd_is_local(thd) && | ||
!wsrep_append_fk_parent_table(thd, first_table, &keys)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If wsrep_append_fk_parent_table fails, then we skip TOI, and but still perform the OPTIMIZE locally.
This is likely harmless. Still, I would prefer to perform OPTIMIZE only if we did enter TOI mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is certainly possible to change the code so that wsrep_append_fk_parent_table()
failure causes Sql_cmd_optimize_table::execute()
to fail.
But the current code is now analogous to the implementation of Sql_cmd_alter_table::execute()
.
Galera Cluster failed when two transactions that were executed
serially on a master node were incorrectly executed in parallel on a
slave node. The slave node detected MDL BF-BF conflict and quit the
cluster.
The problem was caused by OPTIMIZE TABLE on a table that is the child
table of a foreign key constraint. If UPDATE was performed on the
parent table of the foreign key constraint while OPTIMIZE TABLE was
running on the child table, the master node would run the transactions
serially, but a slave node might run them in parallel resulting in MDL
BF-BF conflict. The problem is fixed by adding foreign key constraint
to the replicated write set preventing parallel apply of the
transactions on the slave node. The code fixing this issue for
OPTIMIZE TABLE is similar to the code for ALTER TABLE.