Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MDEV-35693: Improve SSS column sizes #3772

Merged
merged 2 commits into from
Feb 1, 2025
Merged

MDEV-35693: Improve SSS column sizes #3772

merged 2 commits into from
Feb 1, 2025

Conversation

ParadoxV5
Copy link
Contributor

@ParadoxV5 ParadoxV5 commented Jan 17, 2025

  • The Jira issue number for this PR is: MDEV-35693

Description

Resize the type and width of SHOW REPLICA STATUS (technically INFORMATION_SCHEMA.SLAVE_STATUS) columns to better match their possible values

In case of intentionally but absurdly long lists, text columns that list an uncapped number of elements have expanded to accept as many bytes as we could support.

While here, I found all our integer columns are unsigned, so I updated ones that weren’t configured as unsigned before for both column types and the field storing function (grep \(\*field\+\+\)->store\(.+(?<!false|true|my_charset_bin)\); minus the one (double)).

In response to ‘MAX_SLAVE_ERRMSG’ was not declared in this scope in Embedded builds, a new #ifdef HAVE_REPLICATION guard wraps slave_status_info to skip this unused data in Replication-less builds.

What problem is the patch trying to solve?

Particularly, the first-gen Replicate_ filters were incorrectly typed as singlular Name()s during MDEV-33526.
Under Names’ 64-char limit, they could overflow (read: truncate) even before their lengths got absurd.

Do you think this patch might introduce side-effects in other parts of the server?

Let me know if resizing I.S. column types is a significant backward-incompatibility.

Release Notes

Increased VARCHAR width limits of SHOW SLAVE STATUS so it doesn’t truncate long lists that should be long.

Optionally list any https://mariadb.com/kb/ pages that need changing.

By the way, these three columns are missing on https://mariadb.com/kb/en/show-replica-status/#column-descriptions:

Replicate_Do_Domain_Ids
Replicate_Ignore_Domain_Ids
Parallel_Mode

How can this PR be tested?

🤔 How shall we test this? Cook up a replica so its status report has all columns filled to the brim?
Discussion: #3772 (comment)
This PR includes a cherry-pick of 3795 to test for the MDEV-35693 issue itself.

Sample manual test from MDEV-35693

Of course, the original issue about Replicate_ fields specifically is reproducible by setting them to anything longer than 64 bytes.

set @@global.replicate_do_table='db_to_filter.0123456789_1,db_to_filter.0123456789_2,db_to_filter.0123456789_3,db_to_filter.0123456789_4,db_to_filter.0123456789_5,db_to_filter.0123456789_6,db_to_filter.0123456789_7';
 
show slave status\G

Unexpected: Replicate_Do_Table: db_to_filter.0123456789_7,db_to_filter.0123456789_1,db_to_filter

Basing the PR against the correct MariaDB version

  • This is a new feature or a refactoring, and the PR is based against the main branch.
  • This is a bug fix, and the PR is based against the earliest maintained branch in which the bug can be reproduced.

PR quality check

  • I checked the CODING_STANDARDS.md file and my PR conforms to this where appropriate.
  • For any trivial modifications to the PR, I am ok with the reviewer making the changes themselves.

sql/sql_show.cc Outdated
Comment on lines 10666 to 10668
Column("Gtid_IO_Pos", Longtext(), NOT_NULL),
Column("Replicate_Do_Domain_Ids", Longtext(), NOT_NULL),
Column("Replicate_Ignore_Domain_Ids", Longtext(), NOT_NULL),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we continue truncating those not-Name() ones, just at a more lenient limit?
How about the others?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are all probably ok as-is. I wouldn't expect these do/ignore_domain_ids lists to ever be that long. Then for Gtid_IO_Pos, each GTID is %u-%u-%llu, so that'd be what, 10+1+10+1+20=42 max-len each? At which point we could show 23 max-len GTIDs (which would be rare), but say the more likely length of a GTID would be 20 chars long, leading to ~50 being shown. I haven't heard of such configurations, but I suppose if someone really knew what they were doing (and potentially using domain ids specifically to enhance replication performance via parallelization), they could hit that.. I guess I'm ok either way for Gtid_IO_Pos.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

4096, the current width of INFORMATION_SCHEMA.GLOBAL_VARIABLES.VARIABLE_VALUE?

server/sql/sql_show.cc

Lines 10275 to 10280 in 74331a4

ST_FIELD_INFO variables_fields_info[]=
{
Column("VARIABLE_NAME", Varchar(64), NOT_NULL, "Variable_name"),
Column("VARIABLE_VALUE", Varchar(4096), NOT_NULL, "Value"),
CEnd()
};

sql/sql_show.cc Outdated
@@ -10626,66 +10623,66 @@ ST_FIELD_INFO slave_status_info[]=
Column("Slave_IO_State", Varchar(64), NULLABLE),
Column("Master_Host", Varchar(HOSTNAME_LENGTH), NULLABLE),
Column("Master_User", Varchar(USERNAME_LENGTH), NULLABLE),
Column("Master_Port", ULong(7), NOT_NULL),
Column("Master_Port", ULong(5), NOT_NULL),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TCP port numbers are uint16_t (up to 65535). (There’s no UShort() though…)

Let me know if MariaDB doesn’t use TCP, and our port numbers are uint32 with 7 decimal digits.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Looks like it has been 7 for a long time (in MySQL source at present too)... I wonder if in the distant past, there would have been a comma and \0 to account for with that number.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There’s no UShort() though…

I’ve added one for this and the three Last_Errnos.

Column("Last_Errno", SLong(4), NOT_NULL),
Column("Last_Error", Varchar(20), NULLABLE),
Column("Last_Error", Varchar(MAX_SLAVE_ERRMSG), NULLABLE),
Column("Skip_Counter", ULong(10), NOT_NULL),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Relay_log_info::slave_skip_counter is a ulonglong, but

(*field++)->store((uint32) mi->rli.slave_skip_counter);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sql_slave_skip_counter system variable is max UINT_MAX:

static Sys_var_multi_source_ulonglong Sys_slave_skip_counter(
       "sql_slave_skip_counter", "Skip the next N events from the master log",
       SESSION_VAR(slave_skip_counter), NO_CMD_LINE,
       &Master_info::get_slave_skip_counter,
       VALID_RANGE(0, UINT_MAX), DEFAULT(0), BLOCK_SIZE(1),
       ON_UPDATE(update_slave_skip_counter));

Perhaps the variable type itself should be fixed (Relay_log_info::slave_skip_counter and system_variables::slave_skip_counter). I don't think we'd need to do that in any GA release (better not to touch it if we don't need to), but I'd think we can do it here (11.7) or in the next release.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, another one similar to MDEV-35706 Shrink my_thread_id to uint32_t?

sql/sql_show.cc Outdated Show resolved Hide resolved
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

U/Long(void) and U/Longlong(void) already exists, but they use MY_INT32_NUM_DECIMAL_DIGITS (11) and MY_INT32_NUM_DECIMAL_DIGITS (21) respectively for both signed and unsigned, which doesn’t match sql/sql_show.cc’s ULong(10) and ULonglong(20).

09876543210987654321 width
18446744073709551616 1<<64
          4294967296 1<<32

So, what excactly are those constant macros?

sql/sql_show.cc Show resolved Hide resolved
sql/sql_show.cc Outdated Show resolved Hide resolved
sql/sql_show.cc Show resolved Hide resolved
Column("Gtid_IO_Pos", Longtext(), NOT_NULL),
Column("Replicate_Do_Domain_Ids", Longtext(), NOT_NULL),
Column("Replicate_Ignore_Domain_Ids", Longtext(), NOT_NULL),
Column("Parallel_Mode", Varchar(sizeof("conservative")-1), NOT_NULL),
Column("SQL_Delay", ULong(10), NOT_NULL),
Copy link
Contributor Author

@ParadoxV5 ParadoxV5 Jan 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm?

(*field++)->store((uint32) mi->rli.get_sql_delay());

Not only Relay_log_info::get_sql_delay() returns a signed int, but it’s also missing , true; this would invoke store(double nr) instead of store(longlong nr, bool unsigned_val).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added the , true


According to rpl.rpl_delayed_slave, SQL_Delay’s unsigned in practice.
So changing int to unsigned int is not a bug but a clarifying refactor. I won’t include it here.
Same for checking whether each of the (uint32) casts make sense. We could instead use more Field::store() overloads – for the various integer sizes as well as separate handlers for unsigned ones.

@ParadoxV5 ParadoxV5 requested review from bnestere and removed request for andrelkin January 17, 2025 04:36
Copy link
Contributor

@bnestere bnestere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ParadoxV5 , you have some great finds here! See my notes.

Also, can you add a new test which fails before your patch to show the problems, which then succeeds with your patch?

Then for the git commit message, for the title, SSS is only common for us casually speaking on the replication team. Can you spell it out?

sql/sql_show.cc Outdated
@@ -10626,66 +10623,66 @@ ST_FIELD_INFO slave_status_info[]=
Column("Slave_IO_State", Varchar(64), NULLABLE),
Column("Master_Host", Varchar(HOSTNAME_LENGTH), NULLABLE),
Column("Master_User", Varchar(USERNAME_LENGTH), NULLABLE),
Column("Master_Port", ULong(7), NOT_NULL),
Column("Master_Port", ULong(5), NOT_NULL),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Looks like it has been 7 for a long time (in MySQL source at present too)... I wonder if in the distant past, there would have been a comma and \0 to account for with that number.

Column("Last_Errno", SLong(4), NOT_NULL),
Column("Last_Error", Varchar(20), NULLABLE),
Column("Last_Error", Varchar(MAX_SLAVE_ERRMSG), NULLABLE),
Column("Skip_Counter", ULong(10), NOT_NULL),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sql_slave_skip_counter system variable is max UINT_MAX:

static Sys_var_multi_source_ulonglong Sys_slave_skip_counter(
       "sql_slave_skip_counter", "Skip the next N events from the master log",
       SESSION_VAR(slave_skip_counter), NO_CMD_LINE,
       &Master_info::get_slave_skip_counter,
       VALID_RANGE(0, UINT_MAX), DEFAULT(0), BLOCK_SIZE(1),
       ON_UPDATE(update_slave_skip_counter));

Perhaps the variable type itself should be fixed (Relay_log_info::slave_skip_counter and system_variables::slave_skip_counter). I don't think we'd need to do that in any GA release (better not to touch it if we don't need to), but I'd think we can do it here (11.7) or in the next release.

sql/sql_show.cc Show resolved Hide resolved
mysql-test/suite/funcs_1/r/is_tables_is.result Outdated Show resolved Hide resolved
sql/sql_show.cc Outdated Show resolved Hide resolved
sql/sql_show.cc Outdated
Comment on lines 10666 to 10668
Column("Gtid_IO_Pos", Longtext(), NOT_NULL),
Column("Replicate_Do_Domain_Ids", Longtext(), NOT_NULL),
Column("Replicate_Ignore_Domain_Ids", Longtext(), NOT_NULL),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are all probably ok as-is. I wouldn't expect these do/ignore_domain_ids lists to ever be that long. Then for Gtid_IO_Pos, each GTID is %u-%u-%llu, so that'd be what, 10+1+10+1+20=42 max-len each? At which point we could show 23 max-len GTIDs (which would be rare), but say the more likely length of a GTID would be 20 chars long, leading to ~50 being shown. I haven't heard of such configurations, but I suppose if someone really knew what they were doing (and potentially using domain ids specifically to enhance replication performance via parallelization), they could hit that.. I guess I'm ok either way for Gtid_IO_Pos.

sql/sql_show.cc Outdated Show resolved Hide resolved
sql/sql_show.cc Outdated
Column("Retried_transactions", ULong(10), NOT_NULL),
Column("Max_relay_log_size", ULonglong(10), NOT_NULL),
Column("Executed_log_entries", ULong(10), NOT_NULL),
Column("Slave_received_heartbeats", ULong(10), NOT_NULL),
Column("Slave_heartbeat_period", Float(703), NOT_NULL), // 3 decimals
Column("Gtid_Slave_Pos", Varchar(FN_REFLEN), NOT_NULL),
Column("Gtid_Slave_Pos", Longtext(), NOT_NULL),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same goes here as for Gtid_IO_Pos

Copy link
Contributor Author

@ParadoxV5 ParadoxV5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, can you add a new test which fails before your patch to show the problems, which then succeeds with your patch?

I wondered at “§ How can this PR be tested?”, which fields should we test and how long the strings should we assert?

Then for the git commit message, for the title, SSS is only common for us casually speaking on the replication team. Can you spell it out?

😄 When I picked up MDEV-35304, I too questioned what its description meant by ‘SSS’.
I thought it was a common jargon!

As for the title… With ‘SHOW REPLICA STATUS’ it doesn’t fit the 50-char guideline, but I suppose it’s fine omitting MDEV-35693: from the count.
(It’s time to avoid calling things ‘SLAVE’s when we can.)

Column("Last_Errno", SLong(4), NOT_NULL),
Column("Last_Error", Varchar(20), NULLABLE),
Column("Last_Error", Varchar(MAX_SLAVE_ERRMSG), NULLABLE),
Column("Skip_Counter", ULong(10), NOT_NULL),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, another one similar to MDEV-35706 Shrink my_thread_id to uint32_t?

sql/sql_show.cc Outdated Show resolved Hide resolved
sql/sql_show.cc Outdated Show resolved Hide resolved
Comment on lines +145 to +148
// utf8mb3 Varchars longer than MAX_FIELD_VARCHARLENGTH/3 become Longtexts
DBUG_ASSERT(length * 3 <= MAX_FIELD_VARCHARLENGTH);
}
Varchar(): Type(&type_handler_varchar, MAX_FIELD_VARCHARLENGTH/3, false) {}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://mariadb.com/kb/en/unicode/ says 11.8 wants to change utf8 to utf8mb4.
Assuming the default charset is simply ‘utf8’, I’ll let that work change these 3s to 4s, in hopes that it notices it creates unexpected #3772 (comment) changes too.

@bnestere
Copy link
Contributor

bnestere commented Jan 19, 2025

I wondered at “§ How can this PR be tested?”, which fields should we test and how long the strings should we assert?

master_last_event_time_row.test should provide a good example to use on how to query/assert the correctness of SHOW REPLICA STATUS columns. Then as far as how long the strings should be, with 64 being the old length where truncation would happen, maybe just double it? I don't necessarily think it needs to be MAX_FIELD_VARCHARLENGTH characters long.. But perhaps make it easy to change? E.g. a lot of our tests use REPEAT('a', 128) (or whatever string and number parameters to repeat); I'd suggest using that. And note you can parameterize the length with something like

--let $len= 64
--let $long_db_a= `SELECT REPEAT('a', $len)`
--let $long_db_b= `SELECT REPEAT('b', $len)`

And then you can use it in queries using eval:

--eval SET @@global.replicate_do_db="$long_db_a,$long_db_b"

Then to

As for the title… With ‘SHOW REPLICA STATUS’ it doesn’t fit the 50-char guideline, but I suppose it’s fine omitting MDEV-35693: from the count.

Hmm, I didn't realize we had a 50-char title maximum - I probably break that rule often 😅. I'd say not to omit the MDEV-35693 part, that's the most important part. Maybe try to summarize as best you can within 50 chars, and if you can't, then just use the JIRA title and truncate it after 50.

Also, in the git commit message, the high-level description of the problem was dropped since your last commit:

In particular, many text columns – namely Replicate_ ones – could overflow (read: truncate) if the data is absurdly (but validly) long.

Such descriptions are good to say up-front to summarize the issue for non-programmers, who often read through our commit messages (I also think we should be explicit about the length where the truncation happens, so I'd say to add that they'd truncate at 64 characters to your above message.

@ParadoxV5
Copy link
Contributor Author

ParadoxV5 commented Jan 21, 2025

I wondered at “§ How can this PR be tested?”, which fields should we test and how long the strings should we assert?

master_last_event_time_row.test should provide a good example to use on how to query/assert the correctness of SHOW REPLICA STATUS columns. Then as far as how long the strings should be, with 64 being the old length where truncation would happen, maybe just double it? I don't necessarily think it needs to be MAX_FIELD_VARCHARLENGTH characters long.. […]

So, just testing those was-Named() OG-Replicate_s is fine, since they came directly from user configs and should never truncate?
Hmm, then maybe the newer Replicate_s too.

Hmm, I didn't realize we had a 50-char title maximum - I probably break that rule often 😅.

rules are made to be broken

Also, in the git commit message, the high-level description of the problem was dropped since your last commit:

In particular, many text columns – namely Replicate_ ones – could overflow (read: truncate) if the data is absurdly (but validly) long.

Such descriptions are good to say up-front to summarize the issue for non-programmers, who often read through our commit messages (I also think we should be explicit about the length where the truncation happens, so I'd say to add that they'd truncate at 64 characters to your above message.

The mention is still there but yes, the emphasis on how it’s a problem was lost. 👍

Comment on lines +427 to +436
def information_schema SLAVE_STATUS Replicate_Do_DB 15 NULL NO varchar 21844 65532 NULL NULL NULL utf8mb3 utf8mb3_general_ci varchar(21844) select NEVER NULL NO NO
def information_schema SLAVE_STATUS Replicate_Do_Domain_Ids 47 NULL NO varchar 21844 65532 NULL NULL NULL utf8mb3 utf8mb3_general_ci varchar(21844) select NEVER NULL NO NO
def information_schema SLAVE_STATUS Replicate_Do_Table 17 NULL NO varchar 21844 65532 NULL NULL NULL utf8mb3 utf8mb3_general_ci varchar(21844) select NEVER NULL NO NO
def information_schema SLAVE_STATUS Replicate_Ignore_DB 16 NULL NO varchar 21844 65532 NULL NULL NULL utf8mb3 utf8mb3_general_ci varchar(21844) select NEVER NULL NO NO
def information_schema SLAVE_STATUS Replicate_Ignore_Domain_Ids 48 NULL NO varchar 21844 65532 NULL NULL NULL utf8mb3 utf8mb3_general_ci varchar(21844) select NEVER NULL NO NO
def information_schema SLAVE_STATUS Replicate_Ignore_Server_Ids 41 NULL NO varchar 21844 65532 NULL NULL NULL utf8mb3 utf8mb3_general_ci varchar(21844) select NEVER NULL NO NO
def information_schema SLAVE_STATUS Replicate_Ignore_Table 18 NULL NO varchar 21844 65532 NULL NULL NULL utf8mb3 utf8mb3_general_ci varchar(21844) select NEVER NULL NO NO
def information_schema SLAVE_STATUS Replicate_Rewrite_DB 56 NULL NO varchar 21844 65532 NULL NULL NULL utf8mb3 utf8mb3_general_ci varchar(21844) select NEVER NULL NO NO
def information_schema SLAVE_STATUS Replicate_Wild_Do_Table 19 NULL NO varchar 21844 65532 NULL NULL NULL utf8mb3 utf8mb3_general_ci varchar(21844) select NEVER NULL NO NO
def information_schema SLAVE_STATUS Replicate_Wild_Ignore_Table 20 NULL NO varchar 21844 65532 NULL NULL NULL utf8mb3 utf8mb3_general_ci varchar(21844) select NEVER NULL NO NO
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually @bnestere do these count as tests for this issue?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sometimes they will, but in this case I'd say it isn't sufficient, as the existing tests weren't enough to catch the regression in the first place 😅. So we should add tests to ensure validly-long values don't get truncated.

Copy link
Contributor Author

@ParadoxV5 ParadoxV5 Jan 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reviewed the SHOW REPLICA STATUS listing and gave them a good consideration.
(Some like these Replicate_s come straight from user configs, others are computed live or are string messages (the L10N-able kind).)

I’ve decided that edge case testing should be part of their feature testing; that is, entries should validate their SHOW REPLICA STATUS apperance in addition to their SELECT GLOBAL/SESSION/SHOW STATUS/etc.
Which means they ideally should be available on the branch SHOW REPLICA STATUS added them; i.e., ideally backport to the earliest applicable branches.

The maintenance sprint is finishing up, @bnestere. Lemme see if tests for MDEV-35693’s Replicate_s ones can make it.
Update: don’t think so. They were in 10.5 or earlier, whose new version is releasing soon™️.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

filed MDEV-35948 to generalize the scope to all of SHOW REPLICA STATUS

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i.e., ideally backport to the earliest applicable branches.
...
filed MDEV-35948 to generalize the scope to all of SHOW REPLICA STATUS

Sure. It makes sense to have the complete test in 10.5+ as well, though with the addition...

Lemme see if tests for MDEV-35693’s Replicate_s ones can make it.

I think would be good. I'd say we should add tests now for any of the fields you've changed in this patch, and then backport it and test the rest in 10.5 (say during the next light week). And the cutoff for this part is Friday, so I'd think there's time (this would take priority over MDEV-35304).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’ve prepared #3795 for the ones reported (and are mainly affected) by MDEV-35693.
I’ve yet to search where the rests’ (both the latter Replicate_s and others) tests are.

@ParadoxV5 ParadoxV5 force-pushed the mdev-35693 branch 2 times, most recently from e112005 to eee96c7 Compare January 23, 2025 00:32
@ParadoxV5 ParadoxV5 requested a review from bnestere January 23, 2025 00:48
@ParadoxV5
Copy link
Contributor Author

I wondered at “§ How can this PR be tested?”, which fields should we test and how long the strings should we assert?

master_last_event_time_row.test should provide a good example to use on how to query/assert the correctness of SHOW REPLICA STATUS columns. Then as far as how long the strings should be, with 64 being the old length where truncation would happen, maybe just double it? I don't necessarily think it needs to be MAX_FIELD_VARCHARLENGTH characters long.. […]

So, just testing those was-Named() OG-Replicate_s is fine, since they came directly from user configs and should never truncate?
Hmm, then maybe the newer Replicate_s too.

Uhh… I don’t feel right testing only a subset. Reflecting from the Broken Windows Theory, we either test none of them or we should test all of them.
Concerningly, looks like overflow test are done for the underlying variables but not for SSS reports.

sql/sql_show.cc Outdated Show resolved Hide resolved
sql/sql_show.cc Outdated
Column("Replicate_Ignore_Table", Name(), NOT_NULL),
Column("Replicate_Wild_Do_Table", Name(), NOT_NULL),
Column("Replicate_Wild_Ignore_Table", Name(), NOT_NULL),
Column("Slave_IO_Running", Varchar(sizeof("Connecting")-1), NOT_NULL),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find this unnecessary. Both versions are not future-proof in that a new state with the name length greater than sizeof("Connecting")-1 might be introduced.
A better approach to me would be to max out the allowed size with Varchar() or restore the OLD version plus add a compile-time assert to constrain the string lengths to 10 in slave_running[].

I'd go with the simple Varchar().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You’re right, both versions are future-proof only by assuming that a new enum is tested to not print truncated in SSS.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[…] restore the OLD version plus add a compile-time assert to constrain the string lengths to 10 in slave_running[].

I prefer this better, so these columns aren’t unnecessarily max-sized.
Thus I’ve reverted these changes (at least for now).
This is a 🐞, not a 🐛. Harmless 🐞s should be refactored, not forcibly fixed.

sql/sql_show.cc Outdated Show resolved Hide resolved
@ParadoxV5
Copy link
Contributor Author

sql/slave.cc Outdated Show resolved Hide resolved
Copy link
Contributor

@bnestere bnestere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates, @ParadoxV5 !

After addressing my 2 standalone comments (sorry, I didn't make them part of this review chunk):

  1. Adding master_ssl_verify_server_cert=0 to CHANGE MASTER TO commands
  2. Reverting the Seconds_Behind_Master changes (temporarily, where we can investigate more thoroughly later)

and my next point, I'd say this is good to go!

For the PR structure, I'd also say to squash the third commit with the tests into the second, so the tests go in with the fixes, for easy reference/potential future debugging.

Also a few quick remarks about the commit message:

  1. I'd say to avoid phrases like "See that PR for details", and be up-front right away about any relevant details (so one doesn't have to go on a goose chase looking for why something was done some way, in a volatile way). So here, for example, I'd say something along the lines of "these tests are eventually to be back-ported into a GA release, with its progress tracked via MDEV-35948). Effectively, it is to clearly organize the work by keeping all the content of this patch within the context of MDEV-35693, and the future-backport and extensions to be done later.
  2. I'd also say just to refer to JIRA tickets (i.e. to drop references to the PR in the commit message). The JIRA is supposed to be the source of truth of the ticket (which one can find the PR from). Ideally too, we should take important decisions that happen when discussing the PR, and add them as comments to the JIRA

@andrelkin andrelkin self-requested a review January 31, 2025 11:57
Copy link
Contributor

@andrelkin andrelkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Thanks Jimmy and Brandon!

Resize the types and widths of SHOW REPLICA STATUS
(technically `INFORMATION_SCHEMA.SLAVE_STATUS`)
columns to better match their possible values

In case of intentionally but absurdly long lists,
text columns that list an uncapped number of elements
have expanded to accept as many bytes as we could support.

Particularly, the first-gen `Replicate_` filters were
incorrectly typed as singlular `Name()`s during MDEV-33526.
Under `Name`s’ 64-char limit, they could overflow
(read: truncate) even before their lengths got absurd.

In response to `‘MAX_SLAVE_ERRMSG’ was not declared in this scope` in
Embedded builds, a new `#ifdef HAVE_REPLICATION` guard wraps
`slave_status_info` to skip this unused data in Replication-less builds.

For testing, this commit forward-ports a modified cherry-pick of #3795
(the latter targets our oldest maintained LTS as part of MDEV-35948).

> Assert that 1st-gen `replicate_*` filter variables display
> their input – including long but reasonable lists –
> correctly (without truncation) in
> * direct SELECT
> * [semi-new] INFORMATION_SCHEMA.GLOBAL_VARIABLES.VARIABLE_VALUE
> * [new] SHOW REPLICA STATUS

Reviewed-by: Brandon Nesterenko <brandon.nesterenko@mariadb.com>
Update all integer columns of SHOW REPLICA STATUS (technically
INFORMATION_SCHEMA.SLAVE_STATUS) to unsigned because, well, they are (:.

Some `uint32` ones were accidentally using the `Field::store(double nr)`
overload because they forgot the `, true` for
`Field::store(longlong nr, bool unsigned_val)`.
The mistake’s harmless, fortunately, as `double` supports over 15
significant decimal digits, well over `uint32`’s 9-and-a-half.
Copy link
Contributor Author

@ParadoxV5 ParadoxV5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And thanks Brandon and Andrei for reviewing!

I'd say to avoid phrases like "See that PR for details", and be up-front right away about any relevant details.

Alrighty, I now state it a forward port and include the entire relevant commit description.

I'd also say just to refer to JIRA tickets (i.e. to drop references to the PR in the commit message).

I’d like MDEV-35948 to track the progress of all columns, not just the ones MDEV-35693 requested, which is #3795’s (at least current) coverage.
So mentioning the PR as well is more precise, albeit somewhat redundant.

sql/slave.cc Outdated Show resolved Hide resolved
mysql-test/suite/sys_vars/r/replicate_do_db_basic.result Outdated Show resolved Hide resolved
@ParadoxV5 ParadoxV5 enabled auto-merge (rebase) February 1, 2025 03:44
@ParadoxV5 ParadoxV5 merged commit 697b88b into 11.7 Feb 1, 2025
13 of 14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

3 participants