Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

safekeeper: consider partial uploads when pulling timeline #8628

Merged
merged 12 commits into from
Aug 15, 2024

Conversation

VladLazar
Copy link
Contributor

@VladLazar VladLazar commented Aug 7, 2024

Problem:

The control file contains the id of the safekeeper that uploaded it.
Previously, when sending a snapshot of the control file to another sk,
it would eventually be gc-ed by the receiving sk. This is incorrect
because the original sk might still need it later.

Summary of Changes:

When sending a snapshot and the control file contains an uploaded
segment:

  • Create a copy of the segment in s3 with the destination sk in the
    object name
  • Tweak the streamed control file to point to the object create in the
    previous step

Note that the snapshot endpoint now has to know the id of the requestor,
so the api has been extended to include the node if of the destination
sk.

Closes #8542

Checklist before requesting a review

  • I have performed a self-review of my code.
  • If it is a core feature, I have added thorough tests.
  • Do we need to implement analytics? if so did you add the relevant metrics to the dashboard?
  • If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.

Checklist before merging

  • Do not forget to reformat commit message to not include the above checklist

Will be useful when we need to this from multiple places
Note that the write is not meant to be persistent.
Problem:
The control file contains the id of the safekeeper that uploaded it.
Previously, when sending a snapshot of the control file to another sk,
it would eventually be gc-ed by the receiving sk. This is incorrect
because the original sk might still need it later.

Summary of Changes:
When sending a snapshot and the control file contains an uploaded
segment:
* Create a copy of the segment in s3 with the destination sk in the
  object name
* Tweak the streamed control file to point to the object create in the
  previous step

Note that the snapshot endpoint now has to know the id of the requestor,
so the api has been extended to include the node if of the destination
sk.
Copy link

github-actions bot commented Aug 7, 2024

2160 tests run: 2091 passed, 0 failed, 69 skipped (full report)


Code coverage* (full report)

  • functions: 32.4% (7200 of 22251 functions)
  • lines: 50.3% (58230 of 115676 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
8f88330 at 2024-08-14T17:15:23.486Z :recycle:

@VladLazar VladLazar marked this pull request as ready for review August 7, 2024 12:04
@VladLazar VladLazar requested a review from a team as a code owner August 7, 2024 12:04
@VladLazar VladLazar requested review from skyzh and arssher August 7, 2024 12:04
@VladLazar VladLazar added the c/storage/safekeeper Component: storage: safekeeper label Aug 8, 2024
@VladLazar VladLazar requested a review from petuhovskiy August 13, 2024 12:15
Copy link
Member

@petuhovskiy petuhovskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, can be merged after a few fixes

safekeeper/src/wal_backup_partial.rs Show resolved Hide resolved
safekeeper/src/pull_timeline.rs Outdated Show resolved Hide resolved
safekeeper/src/pull_timeline.rs Outdated Show resolved Hide resolved
safekeeper/src/pull_timeline.rs Show resolved Hide resolved
@VladLazar VladLazar requested a review from petuhovskiy August 13, 2024 13:50
test_runner/regress/test_wal_acceptor.py Show resolved Hide resolved
test_runner/regress/test_wal_acceptor.py Outdated Show resolved Hide resolved
@arssher
Copy link
Contributor

arssher commented Aug 14, 2024

Thanks! LGTM.

@VladLazar VladLazar enabled auto-merge (squash) August 14, 2024 16:15
@VladLazar VladLazar merged commit fef77b0 into main Aug 15, 2024
64 checks passed
@VladLazar VladLazar deleted the vlad/sk-control-file-fix branch August 15, 2024 08:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c/storage/safekeeper Component: storage: safekeeper
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Partial upload is not working well together will pull_timeline
3 participants