
Galera feature: retry applying of write sets at slave nodes #387

Open: wants to merge 2 commits into 11.4 from 11.4-retry-applying
Conversation

@plampio plampio commented Jan 9, 2024

Description

Initial version of a feature that allows retrying the application of write sets on slave nodes. If Galera Cluster fails to apply a write set on a slave node, replication fails and the slave node is dropped from the cluster. However, the apply failure might be only temporary, and a second apply attempt might succeed. Therefore, Galera Cluster can sometimes recover from such a failure by retrying the application of write sets on slave nodes.

This patch is quite separate from the server code and should not introduce any side-effects in other parts of the server.

How can this PR be tested?

This patch introduces two new MTR tests, galera.galera_retry_applying and galera.galera_retry_applying2, for testing the feature. In the first test a retry succeeds; in the second test all retry attempts fail.

The new MTR tests need some more work: the cleanup after the tests is not complete.

@plampio plampio requested review from sjaakola and temeo January 9, 2024 09:52
--connection node_2
SET GLOBAL wsrep_applier_retry_count = 2;
SET GLOBAL debug_dbug = "+d,innodb_insert_fail:o,/dev/null";
CALL mtr.add_suppression("WSREP: Event 3 Write_rows_v1 apply failed: 146, seqno 4");

Better not to include sequence numbers in suppressions; the seqno will not remain the same if the test is run repeatedly or as part of a full MTR run.
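For example, the suppression could match only the stable prefix of the message, dropping the trailing seqno:

```
CALL mtr.add_suppression("WSREP: Event 3 Write_rows_v1 apply failed: 146");
```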

Author:

Ok, I will fix this as suggested.

Author:

Fixed as suggested.

@janlindstrom janlindstrom left a comment

LGTM, but is the default of 0 a good choice? What are the dangers of setting it larger, e.g. to 5?

@@ -0,0 +1,40 @@
#


The common practice in MTR testing is to bundle several test cases into the same test file, which should yield faster testing. As galera_retry_applying and galera_retry_applying2 are quite similar test cases, please accommodate both in galera_retry_applying.test and remove galera_retry_applying2.test and its result file as well.

Author:

Fixed as suggested.

SET GLOBAL debug_dbug = "+d,innodb_insert_fail:o,/dev/null";

--connection node_1
SLEEP 1;


There should be no reason for a sleep here; please remove it.
It is possible to add a wait_condition for connection node_2 to wait until table t2 has been replicated there, but as we have wsrep_sync_wait=15 by default, there should be no absolute reason to wait for the replication.

Author:

Fixed as suggested.

} else if (0 == strcmp("test/t2", node->table->name.m_name)) {
err = DB_LOCK_WAIT_TIMEOUT;
goto error_handling;}});

err = row_ins(node, thr);


Using hard-coded table names here for the failure injection works, but is not elegant.
We have recently created a similar failure injection for FK failure cases, where the failure injection happens a given number of times and execution succeeds after the last failure. Please take a look at commit ea08242 and the DBUG_EXECUTE_IF implementation in storage/innobase/row/row0ins.cc. The same approach should work here.
The FK failure injection also has a hard-coded part that fails exactly 4 times; I assume this could be parametrized, enabling the MTR script to decide how many failures are wanted (if we want to take this to the next level).

Author:

Removed one of the two hard-coded table names and replaced the other hard-coded name with the option variable wsrep_innodb_insert_fail_table.


Now failure injection is effective only for table t1. However, MTR test phase 2 expects an insert failure for table t2, so test phase 2 is not effective at the moment.

Author:

Fixed this problem.

@sjaakola sjaakola left a comment

mtr test needs more attention


--echo Shutting down server ...
SET wsrep_on=OFF;
--source include/shutdown_mysqld.inc
--remove_file $MYSQLTEST_VARDIR/mysqld.2/data/grastate.dat

Here is a race condition: previously a transaction was committed in node_1, and here node_2 will shut down. But there is no check for the fate of the replicated INSERT from node_1: it may still be replicating, may currently be applying, or may have already committed in node_2. For deterministic test behavior, the state of the INSERT transaction should be synced here, or it should be documented if it does not matter for the test result and can be safely ignored.
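One way to sync the INSERT before the shutdown would be a wait_condition on node_2 (a sketch; the table name and expected row count depend on what the test actually inserts):

```
--connection node_2
--let $wait_condition = SELECT COUNT(*) = 1 FROM test.t1;
--source include/wait_condition.inc
```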


Also, as this test phase is supposed to cause a sure applier failure, the node should crash, so there is no need to shut it down anymore.

Author:

Removed the table names in row0ins.cc, but the DBUG_EXECUTE_IF macros still
mention the test database test. There are now two separate DBUG_EXECUTE_IF labels:
innodb_insert_fail_once, for failing an INSERT once inside InnoDB, and
innodb_insert_fail_always, for failing an INSERT always inside InnoDB.

Synchronization: there are now four synchronization points in the MTR test,
two in each of the two test cases:

  1. Test case 1: wait till the insert transaction has been replicated and committed in node_2 (line 25)

  2. Test case 1: wait till the transaction has been replicated and committed in node_2 (line 44)

  3. Test case 2: wait for node_2 to crash (line 54)

  4. Test case 2: wait till node 2 is back in the cluster (line 120)

The test does not work without shutting down the server on node 2 after applier failure.


/* rollback to savepoint without telling Wsrep-lib */
bool saved_wsrep_on = thd->variables.wsrep_on;
thd->variables.wsrep_on = false;

This thread is a replication applier, so it must have wsrep_on == ON; therefore using saved_wsrep_on is redundant.

Author:

Fixed as suggested.

sql/sys_vars.cc Outdated
VALID_RANGE(0, UINT_MAX), DEFAULT(0), BLOCK_SIZE(1),
NO_MUTEX_GUARD, NOT_IN_BINLOG);

#if defined(ENABLED_DEBUG_SYNC)

Adding a new system variable just for test fault injection does not sound acceptable. This change will change the result set for all MTR tests which list system variables, and the change depends on the build configuration.

Author:

Reverted the change that introduced a new system variable. So, now the table names for which inserts fail inside InnoDB are once again hard-coded in the InnoDB code.

Is there any other way (besides a system variable) to pass table names from MTR tests to InnoDB code at run-time?
If not, then hard-coding table names in InnoDB code can't be avoided for the MTR test galera.galera_retry_applying.


As a reference, take a look at another PR, #385, which injects a failure a given number of times. That injection applies to all tables, but it could be usable in your test as well.

Author:

It would be possible to follow the method shown in PR #385, but I think the way failure injection is now implemented inside InnoDB for INSERTs is actually simpler.

@temeo temeo left a comment

Adding a new system variable just for test fault injection does not sound acceptable. This change will change the result set for all MTR tests which list system variables, and the change depends on the build configuration.

@sjaakola sjaakola left a comment

This is good; the branch needs a rebase though.

@plampio plampio force-pushed the 11.4-retry-applying branch from 37c0855 to 2b477c4 Compare November 8, 2024 13:56
@plampio plampio commented Nov 8, 2024

Rebased the branch.

@plampio plampio force-pushed the 11.4-retry-applying branch 2 times, most recently from 57d413f to dfedb1a Compare January 16, 2025 13:26
@temeo temeo left a comment

The following tests fail in this branch, but are not failing in 11.4 HEAD:

  • galera.galera_performance_schema
  • galera.galera_retry_applying
  • galera.GCF-939

Also, see the MariaDB coding standards about commit messages: https://github.com/codership/mariadb-server/blob/11.4/CODING_STANDARDS.md#git-commit-messages. The title line should be separated from the text body by a blank line.

A new Galera feature that allows retrying the application of write sets
at slave nodes (codership/mysql-wsrep-bugs/MariaDB#1619). Currently
replication applying stops at the first non-ignored failure occurring
in event applying, and the node will do an emergency abort (or start
inconsistency voting). Some failures, however, can be concurrency
related, and applying may succeed if the operation is retried at a
later time.

This feature introduces a new dynamic global option variable
"wsrep_applier_retry_count" that controls the retry-applying feature:
a zero value disables retrying and a positive value sets the maximum
number of retry attempts. The default value for this option is zero,
which means that this feature is disabled by default.
Fixed the changes that caused the following MTR tests to fail:
   galera.galera_performance_schema,
   galera.galera_retry_applying, and
   galera.GCF-939

   1) Moved the mtr.add_suppression() calls to the beginning of the
      galera_retry_applying MTR test.
   2) Modified the condition in wsrep_apply_events() for
      calling the wsrep_dump_rbr_buf_with_header() function.
@plampio plampio force-pushed the 11.4-retry-applying branch from dfedb1a to 7b5783b Compare January 27, 2025 14:27