mdev-31272 new fix variant #332

Open
wants to merge 12 commits into
base: 10.6

sjaakola

Here, empty transaction is checked in wsrep_run_commit_hook

sjaakola force-pushed the 10.6-MDEV-31272-new branch from 3d04fe2 to 9b32f40 on June 21, 2023 13:20
sjaakola and others added 12 commits September 29, 2023 10:13
If a statement execution has successfully modified some rows, then the
write set for the transaction will be appended with the key values related
to the modified rows. If the statement execution eventually fails,
e.g. with a duplicate key error, then a statement rollback is carried out
to undo the changes in the storage engine and to truncate the binlog cache,
wiping out the replication events related to the statement execution.

If the transaction has no other successful write statements, then there is
nothing to commit at commit time, and there are no binlog events either. But as
there were earlier key appends, the transaction remains registered for wsrep
replication, and at commit time the replicator extracts binlog events from the
cache and populates the final write set for replication. In the buggy version,
it was observed that there were no binlog events, but the write set was populated
nevertheless, and the empty write set was replicated. Note that this consumes
one GTID for a void operation.

The fix for this issue cancels the replication of such an empty write set and gives
a warning to the client.
This fix is in: Wsrep_client_service::prepare_data_for_replication()
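
The scenario described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyTransaction`, `ReplicationDecision` and `decide_at_commit` are hypothetical names invented for illustration and do not exist in the MariaDB or wsrep-lib sources. The sketch shows how a failed statement can leave appended keys behind while the event cache is empty, and how a commit-time check could cancel replication in that case.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a transaction's replication state.
// None of these names come from the MariaDB or wsrep-lib sources.
struct ToyTransaction {
    std::vector<std::string> appended_keys;  // keys appended to the write set
    std::vector<std::string> binlog_events;  // replication events in the cache
    std::size_t stmt_start = 0;              // cache size when the statement began

    void begin_statement() { stmt_start = binlog_events.size(); }

    // A successful row change appends a key and caches a replication event.
    void modify_row(const std::string& key) {
        appended_keys.push_back(key);
        binlog_events.push_back("row event for " + key);
    }

    // Statement rollback truncates the cache back to the statement start.
    // The appended keys are left in place, which is what keeps the
    // transaction registered for replication.
    void rollback_statement() {
        assert(stmt_start <= binlog_events.size());
        binlog_events.resize(stmt_start);
    }
};

enum class ReplicationDecision { replicate, skip_empty_write_set };

// Commit-time decision: keys without any cached events mean there is
// nothing to replicate, so replication is cancelled.
ReplicationDecision decide_at_commit(const ToyTransaction& trx) {
    if (trx.binlog_events.empty())
        return ReplicationDecision::skip_empty_write_set;
    return ReplicationDecision::replicate;
}
```

In this model, a transaction whose only statement fails after modifying a row ends up with a non-empty `appended_keys` and an empty `binlog_events`, so `decide_at_commit` returns `skip_empty_write_set` and no GTID would be consumed.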

It also turned out that the binlog cache truncation for statement rollback was not
complete if the server was not configured to use binlogging. In the buggy version,
only the statement cache was truncated, but events remained in the transaction cache.
This has been fixed in wsrep_thd_binlog_stmt_rollback()
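
A minimal sketch of the truncation problem, assuming a simplified model with two in-memory caches. `ToyCache` and `ToyCachePair` are hypothetical names used only for illustration; the `before_stmt_pos` member merely mirrors the idea behind `get_prev_position()` and is not the real `binlog_cache_data` class.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical in-memory cache that remembers where the current
// statement started writing.
struct ToyCache {
    std::vector<std::string> events;
    std::size_t before_stmt_pos = 0;

    void mark_statement_start() { before_stmt_pos = events.size(); }

    // Drop everything written since the statement started.
    void truncate_to_statement_start() {
        if (before_stmt_pos < events.size())
            events.resize(before_stmt_pos);
    }
};

// Hypothetical pair of statement cache and transaction cache.
struct ToyCachePair {
    ToyCache stmt_cache;
    ToyCache trx_cache;

    void begin_statement() {
        stmt_cache.mark_statement_start();
        trx_cache.mark_statement_start();
    }

    // Statement rollback has to truncate BOTH caches. Truncating only
    // stmt_cache corresponds to the buggy behaviour described above,
    // where events of the failed statement stay in the transaction cache.
    void rollback_statement() {
        stmt_cache.truncate_to_statement_start();
        trx_cache.truncate_to_statement_start();
    }
};
```

Events written by earlier, successful statements sit below `before_stmt_pos` and survive the rollback; only the failed statement's events are removed.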
binlog_commit also maintains the binlog cache, which is needed
even when binlogging is not used, e.g. to support statement rollback
…have empty last WS carrying only commit flag
methods for managing binlog cache when actual binlog is not used
…ines

mtr test galera.galera_ctas should pass with this patch
where the IO cache is inspected and, if there are no events in the cache, the
wsrep commit hooks are skipped.
With this, wsrep replication is not attempted, and the first version of the
fix becomes obsolete.
@@ -138,6 +136,10 @@ int Wsrep_client_service::prepare_data_for_replication()
{
WSREP_DEBUG("empty rbr buffer, query: %s", wsrep_thd_query(m_thd));
}

/* SR may have empty last WS to just carry the comit flag */

Typo: "comit" should be "commit"

@@ -375,7 +398,7 @@ static inline int wsrep_after_commit(THD* thd, bool all)
wsrep_is_active(thd),
(long long)wsrep_thd_trx_seqno(thd),
wsrep_has_changes(thd));
-DBUG_ASSERT(wsrep_run_commit_hook(thd, all));
+//DBUG_ASSERT(wsrep_run_commit_hook(thd, all));

Let's not have assertions that are commented out. Either remove this line altogether, or put it back and fix if necessary.

if (thd->wsrep_cs().mode() == Wsrep_client_state::m_local &&
!thd->wsrep_trx().is_streaming())
{
IO_CACHE* trx_cache= wsrep_get_cache(thd, true);

Perhaps it would be good to put this block into a function in wsrep_binlog.cc or log.cc
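
One possible shape for such a helper, sketched with stand-in types so that it compiles on its own. `ToyIoCache`, `ToyThd`, `toy_trx_cache_is_empty` and `toy_should_run_commit_hook` are hypothetical names; the real code would work on `THD` and `IO_CACHE` and obtain the cache via `wsrep_get_cache(thd, true)`. The rule that non-local and streaming transactions always run the hook is an assumption derived from the condition in the hunk above.

```cpp
#include <cstddef>

// Hypothetical stand-in for IO_CACHE: only tracks how much was written.
struct ToyIoCache {
    std::size_t bytes_written = 0;
};

// Hypothetical stand-in for the parts of THD the check depends on.
struct ToyThd {
    bool local_mode = true;           // models mode() == m_local
    bool streaming = false;           // models wsrep_trx().is_streaming()
    ToyIoCache* trx_cache = nullptr;  // models wsrep_get_cache(thd, true)
};

// True when the transaction cache holds no events at all.
bool toy_trx_cache_is_empty(const ToyThd& thd) {
    return thd.trx_cache == nullptr || thd.trx_cache->bytes_written == 0;
}

// Decide whether the wsrep commit hooks should run.
bool toy_should_run_commit_hook(const ToyThd& thd) {
    // Only local, non-streaming transactions are subject to the
    // empty-cache check; a streaming transaction may legitimately send
    // an empty last write set that carries just the commit flag.
    if (thd.local_mode && !thd.streaming)
        return !toy_trx_cache_is_empty(thd);
    return true;  // assumption: all other cases keep running the hooks
}
```

Keeping the emptiness test in its own function would let the commit hook call site stay a one-line condition.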

@@ -348,7 +348,7 @@ class binlog_cache_data

my_off_t get_prev_position()
{
return(before_stmt_pos);

Drop this whitespace change.

up the cache, so this must be done explicitly when the transaction
terminates.
*/
if (WSREP_EMULATE_BINLOG_NNULL(this))

If you drop this optimization, then we should probably remove the patch at the beginning of ha_savepoint().



#
# Case 5: testing statement rollback

Case 5: It is not clear what case 5 is trying to test. If I understand correctly, you want to make sure that the COMMIT from node_1 does not go through replication / certification because it is an empty transaction. If so, simply check that no GTID was consumed or, alternatively, that the seqno has not advanced. The case would then not need debug sync at all, which would make it trivial to understand.

DROP TABLE t1;

#
# Case 7:

Case 7 looks like case 2. If so, drop it.



#
# Case 6: testing statement rollback with BF abort

Case 6 can also be simplified to use wait conditions instead of debug sync points (see galera_bf_abort.test). Making this test free of debug sync has the advantage that it can run in both debug and release builds.

3 participants