Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preserve HEAP_COMBOCID when restoring t_cid from WAL #8503

Merged
merged 9 commits into from
Aug 14, 2024
Merged

Conversation

knizhnik
Copy link
Contributor

@knizhnik knizhnik commented Jul 25, 2024

Problem

See #8499

Summary of changes

Save HEAP_COMBOCID flag in WAL and do not clear it in redo handlers.

Related Postgres PRs:
neondatabase/postgres#457
neondatabase/postgres#458
neondatabase/postgres#459

Checklist before requesting a review

  • I have performed a self-review of my code.
  • If it is a core feature, I have added thorough tests.
  • Do we need to implement analytics? if so did you add the relevant metrics to the dashboard?
  • If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.

Checklist before merging

  • Do not forget to reformat commit message to not include the above checklist

@knizhnik knizhnik requested a review from a team as a code owner July 25, 2024 05:13
@knizhnik knizhnik requested a review from MMeent July 25, 2024 05:13
Copy link

github-actions bot commented Jul 25, 2024

2144 tests run: 2075 passed, 0 failed, 69 skipped (full report)


Flaky tests (2)

Postgres 15

Postgres 14

Code coverage* (full report)

  • functions: 32.3% (7162 of 22161 functions)
  • lines: 50.3% (57930 of 115176 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
2b569a4 at 2024-08-13T20:13:25.776Z :recycle:

@knizhnik knizhnik changed the title Preserve HEAP_COMBOCID when restroing t_cid from WAL Preserve HEAP_COMBOCID when restoring t_cid from WAL Jul 26, 2024
@knizhnik knizhnik force-pushed the preserve_combocid branch from b20f670 to 333678c Compare July 29, 2024 10:53
Copy link
Contributor

@MMeent MMeent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ololobus ololobus requested a review from hlinnaka July 30, 2024 16:14
@knizhnik knizhnik force-pushed the preserve_combocid branch from 5dbd642 to e6fa1f3 Compare August 4, 2024 04:49
@hlinnaka
Copy link
Contributor

hlinnaka commented Aug 6, 2024

Hmm, so let's look at the case that you UPDATE a row that has been inserted earlier in the same transaction. In that case:

  • the old tuple needs to be stamped with a combo cid that represents the cmin and cmax combination. It needs to have the HEAP_COMBOCID flag set.
  • the new tuple needs to be stamped with the command id of the updating command, as cmin. It must not have the HEAP_COMBOCID flag set.

The combocid stored on the old tuple, and the cmin stored on the new tuple are fundamentally two different values. But we only have one t_cid field in the WAL record. That's bad news :-(. I don't see how this can possibly work without changing the WAL record format.

@hlinnaka
Copy link
Contributor

hlinnaka commented Aug 6, 2024

I added a test that fails without the fixes from this PR, and passes with them. It only covers the DELETE case, but it should be straightforward to exercise the other cases with a copy-paste of this test.

@knizhnik
Copy link
Contributor Author

knizhnik commented Aug 7, 2024

The combocid stored on the old tuple, and the cmin stored on the new tuple are fundamentally two different values. But we only have one t_cid field in the WAL record. That's bad news :-(. I don't see how this can possibly work without changing the WAL record format.

Sorry, I do not understand it. We really have two t_cid in HEAP_UPDATE WAL record: one in record itself and another in xl_neon_heap_header.

@hlinnaka
Copy link
Contributor

hlinnaka commented Aug 7, 2024

The combocid stored on the old tuple, and the cmin stored on the new tuple are fundamentally two different values. But we only have one t_cid field in the WAL record. That's bad news :-(. I don't see how this can possibly work without changing the WAL record format.

Sorry, I do not understand it. We really have two t_cid in HEAP_UPDATE WAL record: one in record itself and another in xl_neon_heap_header.

Oh ok. Good, never mind then :-).

Copy link
Contributor

@hlinnaka hlinnaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks correct. TODOs:

  • Add tests to cover each of the cases, INSERT, UPDATE, DELETE, LOCK, MULTI_INSERT
  • There's a funny case within heap_update, where it writes a LOCK record for the tuple first, if it has a toast tuple. Not sure of the details, but it looks correct. Let's just make sure we have a test to cover that too.
  • There's some unrelated smgrexists() changes in the postgres branches. Need to rebase, I guess

The compute_infobits function is also used by the LOCK_UPDATED record. That means that we will set the XLHL_COMBOCID flag on those records too, even though we don't use the custom neon rmgr for that. That's a bit dirty. So what happens if you feed such a record to vanilla Postgres? It will work, but vanilla fix_infomask_from_infobits() won't set the COMBOCID flag. That's OK, the COMBOCID flag is irrelevant in that scenario.

pgxn/neon_rmgr/neon_rmgr.c Show resolved Hide resolved
@knizhnik knizhnik force-pushed the preserve_combocid branch 2 times, most recently from f47ddbc to 126a2d6 Compare August 7, 2024 09:34
@knizhnik
Copy link
Contributor Author

knizhnik commented Aug 7, 2024

Can somebody also approve correspondent Postgres PRs?

@knizhnik knizhnik merged commit 7a1736d into main Aug 14, 2024
63 checks passed
@knizhnik knizhnik deleted the preserve_combocid branch August 14, 2024 05:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants