gc: add --expire-to option #1843

Open
adlternative wants to merge 2 commits into base: master


@adlternative adlternative commented Dec 24, 2024

I want to run a "safe" garbage collection on Git repositories
hosted on a server, one that avoids the data corruption that
concurrent pushes can cause during git gc. Today this requires
combining git repack --cruft --expire-to=<dir> with git prune.
It would be simpler to pass --expire-to=<dir> directly to the
git-gc command.

cc: gitster@pobox.com
cc: me@ttaylorr.com
cc: peff@peff.net

This commit extends `git gc` with a new option,
`--expire-to=<dir>`. The option already exists in `git repack`
(see 91badeb), where it names a location in which a cruft pack
containing the pruned objects is written. Until now, users who
wanted this behavior had to run
`git repack --cruft --expire-to=<dir>` followed by `git prune`
instead of a single `git gc`.

With `--expire-to=<dir>` available directly in `git gc`, one
command is enough. `git gc` passes the `--expire-to=<dir>`
parameter through to `git repack`, which makes it easier to
set up a backup location for the objects that are about to be
pruned.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
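
The pass-through described in the commit message can be sketched in a few lines of standalone C. This is only an illustration under stated assumptions, not the patch itself: `struct gc_sketch_config`, `build_repack_args`, and the fixed-size argument array are hypothetical stand-ins for Git's internal `struct gc_config` and `strvec` helpers, and only the cruft-pack branch is modeled.

```c
#include <stdio.h>

#define MAX_ARGS 8
#define ARG_LEN 256

/* Hypothetical, simplified stand-in for the relevant gc settings. */
struct gc_sketch_config {
	int cruft_packs;              /* non-zero when --cruft is in effect */
	const char *repack_expire_to; /* value of --expire-to, or NULL */
};

/*
 * Fill `args` with the options a cruft-mode gc would hand to
 * `git repack`, and return how many entries were written.
 */
static int build_repack_args(const struct gc_sketch_config *cfg,
			     char args[MAX_ARGS][ARG_LEN])
{
	int n = 0;

	if (cfg->cruft_packs) {
		snprintf(args[n++], ARG_LEN, "--cruft");
		/* Forward --expire-to only when the user supplied it. */
		if (cfg->repack_expire_to)
			snprintf(args[n++], ARG_LEN, "--expire-to=%s",
				 cfg->repack_expire_to);
	}
	return n;
}
```

With `cruft_packs` set and `repack_expire_to` pointing at `./expired/pack`, the sketch produces `--cruft` followed by `--expire-to=./expired/pack`; with `repack_expire_to` left as NULL, only `--cruft` is produced.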
@adlternative

/submit


gitgitgadget bot commented Dec 24, 2024

Submitted as pull.1843.git.1735041177817.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-1843/adlternative/zh/gc-expire-to-v1

To fetch this version to local tag pr-1843/adlternative/zh/gc-expire-to-v1:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-1843/adlternative/zh/gc-expire-to-v1


gitgitgadget bot commented Dec 30, 2024

There are issues in commit 4254269:
fix(gc): make --prune=now compatible with --expire-to
Lines in the body of the commit messages should be wrapped between 60 and 76 characters.
Indented lines, and lines without whitespace, are exempt

Originally, `git gc --prune=now` attempted to delete all
unreachable objects. With `--cruft` and `--expire-to=<dir>`
available in git gc, `--prune=now` can instead pack
unreachable objects into a cruft pack and store it in the
specified <dir> rather than deleting the objects outright.
This helps recovery if data is corrupted during repository
GC. Therefore, update the handling of `--prune=now` in gc so
that the `-a` parameter is passed to the repack command only
when neither `--cruft` nor `--expire-to` is used.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
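
The rule stated in this commit message can be written as a single predicate. The following is a hedged sketch only: the second commit's diff is not shown in this thread, so `repack_wants_dash_a` and its parameters are hypothetical names, and the function encodes nothing beyond the condition the message describes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: decide whether gc should pass `-a` to repack.
 * Per the commit message, `-a` is used for --prune=now only when
 * neither --cruft nor --expire-to is in effect.
 */
static bool repack_wants_dash_a(const char *prune_expire, bool cruft_packs,
				const char *expire_to)
{
	bool prune_now = prune_expire && !strcmp(prune_expire, "now");

	return prune_now && !cruft_packs && !expire_to;
}
```

What gc passes to repack in the remaining cases (for example, `--expire-to` without `--cruft`) is not specified in the message, so the sketch deliberately leaves that out.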

gitgitgadget bot commented Dec 31, 2024

Submitted as pull.1843.v2.git.1735611513.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-1843/adlternative/zh/gc-expire-to-v2

To fetch this version to local tag pr-1843/adlternative/zh/gc-expire-to-v2:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-1843/adlternative/zh/gc-expire-to-v2

@@ -69,6 +69,12 @@ be performed as well.
the `--max-cruft-size` option of linkgit:git-repack[1] for

On the Git mailing list, ZheNing Hu wrote (reply to this):

ZheNing Hu via GitGitGadget <gitgitgadget@gmail.com> wrote on Tue, Dec 31, 2024 at 10:18:
>
> From: ZheNing Hu <adlternative@gmail.com>
>
> This commit extends the functionality of `git gc`
> by adding a new option, `--expire-to=<dir>`. Previously,
> this feature was implemented in `git repack` (see 91badeb),
> allowing users to specify a directory where unreachable and
> expired cruft packs are stored during garbage collection.
> However, users had to run `git repack --cruft --expire-to=<dir>`
> followed by `git prune` to achieve similar results within `git gc`.
>
> By introducing `--expire-to=<dir>` directly into `git gc`,
> we simplify the process for users who wish to manage their
> repository's cleanup more efficiently. This change involves
> passing the `--expire-to=<dir>` parameter through to `git repack`,
> making it easier for users to set up a backup location for cruft
> packs that will be pruned.
>
> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> ---
>  Documentation/git-gc.txt | 6 ++++++
>  builtin/gc.c             | 6 +++++-
>  t/t6500-gc.sh            | 6 ++++++
>  3 files changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt
> index 370e22faaeb..b4c0cf02972 100644
> --- a/Documentation/git-gc.txt
> +++ b/Documentation/git-gc.txt
> @@ -69,6 +69,12 @@ be performed as well.
>         the `--max-cruft-size` option of linkgit:git-repack[1] for
>         more.
>
> +--expire-to=<dir>::
> +       When packing unreachable objects into a cruft pack, write a cruft
> +       pack containing pruned objects (if any) to the directory `<dir>`.
> +       See the `--expire-to` option of linkgit:git-repack[1] for
> +       more.
> +
>  --prune=<date>::
>         Prune loose objects older than date (default is 2 weeks ago,
>         overridable by the config variable `gc.pruneExpire`).
> diff --git a/builtin/gc.c b/builtin/gc.c
> index d52735354c9..77904694c9f 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -136,6 +136,7 @@ struct gc_config {
>         char *prune_worktrees_expire;
>         char *repack_filter;
>         char *repack_filter_to;
> +       char *repack_expire_to;
>         unsigned long big_pack_threshold;
>         unsigned long max_delta_cache_size;
>  };
> @@ -441,6 +442,8 @@ static void add_repack_all_option(struct gc_config *cfg,
>                 if (cfg->max_cruft_size)
>                         strvec_pushf(&repack, "--max-cruft-size=%lu",
>                                      cfg->max_cruft_size);
> +               if (cfg->repack_expire_to)
> +                       strvec_pushf(&repack, "--expire-to=%s", cfg->repack_expire_to);
>         } else {
>                 strvec_push(&repack, "-A");
>                 if (cfg->prune_expire)
> @@ -675,7 +678,6 @@ struct repository *repo UNUSED)
>         const char *prune_expire_sentinel = "sentinel";
>         const char *prune_expire_arg = prune_expire_sentinel;
>         int ret;
> -
>         struct option builtin_gc_options[] = {
>                 OPT__QUIET(&quiet, N_("suppress progress reporting")),
>                 { OPTION_STRING, 0, "prune", &prune_expire_arg, N_("date"),
> @@ -694,6 +696,8 @@ struct repository *repo UNUSED)
>                            PARSE_OPT_NOCOMPLETE),
>                 OPT_BOOL(0, "keep-largest-pack", &keep_largest_pack,
>                          N_("repack all other packs except the largest pack")),
> +               OPT_STRING(0, "expire-to", &cfg.repack_expire_to, N_("dir"),
> +                          N_("pack prefix to store a pack containing pruned objects")),
>                 OPT_END()
>         };
>
> diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
> index ee074b99b70..d4b0653a9b7 100755
> --- a/t/t6500-gc.sh
> +++ b/t/t6500-gc.sh
> @@ -339,6 +339,12 @@ test_expect_success 'gc.maxCruftSize sets appropriate repack options' '
>         test_subcommand $cruft_max_size_opts --max-cruft-size=3145728 <trace2.txt
>  '
>
> +test_expect_success '--expire-to sets appropriate repack options' '
> +       mkdir expired &&
> +       GIT_TRACE2_EVENT=$(pwd)/trace2.txt git -C cruft--max-size gc --cruft --expire-to=./expired/pack &&
> +       test_subcommand $cruft_max_size_opts --expire-to=./expired/pack <trace2.txt
> +'
> +
>  run_and_wait_for_gc () {
>         # We read stdout from gc for the side effect of waiting until the
>         # background gc process exits, closing its fd 9.  Furthermore, the
> --
> gitgitgadget
>

Hi Jeff King, could you take a look at this patch when you have time?
I would be very grateful.

ZheNing Hu
