
Galera feature: retry applying of write sets at slave nodes #387

Open: wants to merge 2 commits into 11.4 from 11.4-retry-applying
Conversation

@plampio plampio commented Jan 9, 2024

Description

Initial version of a feature that allows retrying the application of write sets on slave nodes. If Galera Cluster fails to apply a write set on a slave node, replication fails and the slave node is dropped from the cluster. However, the apply failure might be only temporary, and a second apply attempt might succeed. Therefore, Galera Cluster can sometimes recover from such a failure by retrying the application of write sets on slave nodes.

This patch is quite separate from the server code and should not introduce any side-effects in other parts of the server.

How can this PR be tested?

This patch introduces two new MTR tests, galera.galera_retry_applying and galera.galera_retry_applying2, for testing the feature. In the first test a retry succeeds; in the second test all retry attempts fail.

The new MTR tests need some more work: the cleanup after the tests is not complete.

@plampio plampio requested review from sjaakola and temeo January 9, 2024 09:52
--connection node_2
SET GLOBAL wsrep_applier_retry_count = 2;
SET GLOBAL debug_dbug = "+d,innodb_insert_fail:o,/dev/null";
CALL mtr.add_suppression("WSREP: Event 3 Write_rows_v1 apply failed: 146, seqno 4");

Better not to include sequence numbers in suppressions; the seqno will not remain the same if the test is run repeatedly or as part of a full MTR run.
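For example, the suppression could match only the stable prefix of the message, dropping the trailing seqno:

```
CALL mtr.add_suppression("WSREP: Event 3 Write_rows_v1 apply failed: 146");
```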

Author:

Ok, I will fix this as suggested.

Author:

Fixed as suggested.

@janlindstrom janlindstrom left a comment

LGTM, but is the default of 0 a good choice? What are the dangers of setting it larger, e.g. to 5?

@@ -0,0 +1,40 @@
#


The common practice in MTR testing is to bundle several test cases into the same test file, which should yield faster testing. As galera_retry_applying and galera_retry_applying2 are quite similar test cases, please accommodate both in galera_retry_applying.test and remove galera_retry_applying2.test and its result file as well.

Author:

Fixed as suggested.

SET GLOBAL debug_dbug = "+d,innodb_insert_fail:o,/dev/null";

--connection node_1
SLEEP 1;


There should be no reason for a sleep here; please remove it.
It is possible to add a wait_condition for connection node_2 to wait until table t2 has been replicated there, but as we have wsrep_sync_wait=15 by default, there should be no absolute reason to wait for the replication.

Author:

Fixed as suggested.

} else if (0 == strcmp("test/t2", node->table->name.m_name)) {
err = DB_LOCK_WAIT_TIMEOUT;
goto error_handling;}});

err = row_ins(node, thr);


Using hard-coded table names here for the failure injection works, but is not elegant.
We have recently created a similar failure injection for FK failure cases, where the failure injection happens a given number of times and execution succeeds after the last failure. Please take a look at commit ea08242 and the DBUG_EXECUTE_IF implementation in storage/innobase/row/row0ins.cc. The same approach should work here.
The FK failure injection also has a hard-coded part that fails exactly 4 times; I assume this could be parametrized, enabling the MTR script to decide how many failures are wanted (if we want to take this to the next level).

Author:

Removed one of the two hard-coded table names and replaced the other hard-coded name with the option variable wsrep_innodb_insert_fail_table.


Now failure injection is effective only for table t1. However, MTR test phase 2 expects an insert failure for table t2, so test phase 2 is not effective at the moment.

Author:

Fixed this problem.

@sjaakola sjaakola left a comment

mtr test needs more attention


--echo Shutting down server ...
SET wsrep_on=OFF;
--source include/shutdown_mysqld.inc
--remove_file $MYSQLTEST_VARDIR/mysqld.2/data/grastate.dat

Here is a race condition: previously a transaction was committed in node_1, and here node_2 will shut down. But there is no check for the fate of the replicated INSERT from node_1: it may still be replicating, may currently be applying, or may have already committed in node_2. For deterministic test behavior, the state of the INSERT transaction should be synced here, or it should be documented if it does not matter for the test result and can be safely ignored.
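One way to sync the INSERT before the shutdown would be a wait_condition on node_2 (a sketch; the table name and expected row count depend on what the test actually inserts):

```
--connection node_2
--let $wait_condition = SELECT COUNT(*) = 1 FROM test.t1;
--source include/wait_condition.inc
```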


Also, as this test phase is supposed to cause a sure applier failure, the node should crash, so there is no need to shut it down anymore.

Author:

Removed the table names in row0ins.cc, but the DBUG_EXECUTE_IF macros still
mention the test database test. There are now two separate DBUG_EXECUTE_IF labels:
innodb_insert_fail_once, for failing an INSERT once inside InnoDB, and
innodb_insert_fail_always, for failing an INSERT always inside InnoDB.

Synchronization: there are now four synchronization points in the MTR test,
two in each of the two test cases:

  1. Test case 1: wait till the insert transaction has been replicated and committed in node_2 (line 25)

  2. Test case 1: wait till the transaction has been replicated and committed in node_2 (line 44)

  3. Test case 2: wait for node_2 to crash (line 54)

  4. Test case 2: wait till node 2 is back in the cluster (line 120)

The test does not work without shutting down the server on node 2 after applier failure.


/* rollback to savepoint without telling Wsrep-lib */
bool saved_wsrep_on = thd->variables.wsrep_on;
thd->variables.wsrep_on = false;

This thread is a replication applier, so it must have wsrep_on == ON; therefore using saved_wsrep_on is redundant.

Author:

Fixed as suggested.

sql/sys_vars.cc Outdated
VALID_RANGE(0, UINT_MAX), DEFAULT(0), BLOCK_SIZE(1),
NO_MUTEX_GUARD, NOT_IN_BINLOG);

#if defined(ENABLED_DEBUG_SYNC)

Adding a new system variable just for test fault injection does not sound acceptable. This change will change the result set for all MTR tests which list system variables, and the change depends on the build configuration.

Author:

Reverted the change that introduced a new system variable. So, now the table names for which inserts fail inside InnoDB are once again hard-coded in the InnoDB code.

Is there any other way (besides a system variable) to pass table names from MTR tests to InnoDB code at run-time?
If not, then hard-coding table names in InnoDB code can't be avoided for the MTR test galera.galera_retry_applying.


As a reference, take a look at another PR, #385, which injects a failure a given number of times. That injection applies to all tables, but it could be usable in your test as well.

Author:

It would be possible to follow the method shown in PR #385, but I think the way failure injection is now implemented inside InnoDB for INSERTs is actually simpler.

@temeo temeo left a comment

Adding a new system variable just for test fault injection does not sound acceptable. This change will change the result set for all MTR tests which list system variables, and the change depends on the build configuration.

@sjaakola sjaakola left a comment

This is good; the branch needs a rebase though.

@plampio plampio force-pushed the 11.4-retry-applying branch from 37c0855 to 2b477c4 Compare November 8, 2024 13:56
@plampio plampio commented Nov 8, 2024

Rebased the branch.

@plampio plampio force-pushed the 11.4-retry-applying branch 2 times, most recently from 57d413f to dfedb1a Compare January 16, 2025 13:26
@temeo temeo left a comment

The following tests fail in this branch, but are not failing in 11.4 HEAD:

  • galera.galera_performance_schema
  • galera.galera_retry_applying
  • galera.GCF-939

Also, see the MariaDB coding standards about commit messages: https://github.com/codership/mariadb-server/blob/11.4/CODING_STANDARDS.md#git-commit-messages. The title line should be separated from the text body by a blank line.

A new Galera feature that allows retrying the application of write sets
at slave nodes (codership/mysql-wsrep-bugs/MariaDB#1619). Currently
replication applying stops at the first non-ignored failure occurring
in event applying, and the node will do an emergency abort (or start
inconsistency voting). Some failures, however, can be concurrency
related, and applying may succeed if the operation is retried at a
later time.

This feature introduces a new dynamic global option variable
"wsrep_applier_retry_count" that controls the retry-applying feature:
a zero value disables retrying and a positive value sets the maximum
number of retry attempts. The default value for this option is zero,
which means that this feature is disabled by default.
Fixed the changes that caused the following MTR tests to fail:
   galera.galera_performance_schema,
   galera.galera_retry_applying, and
   galera.GCF-939

   1) Moved the mtr.add_suppression() calls to the beginning of the
      galera_retry_applying MTR test.
   2) Modified the condition in wsrep_apply_events() for
      calling the wsrep_dump_rbr_buf_with_header() function.
@plampio plampio force-pushed the 11.4-retry-applying branch from dfedb1a to 7b5783b Compare January 27, 2025 14:27