Potential regression due to chroot #11806

Closed
gador opened this issue Nov 5, 2024 · 7 comments

@gador
Member

gador commented Nov 5, 2024

Describe the bug

On all Nix versions that include the fix 0e4baff (or backports such as 1ee7a9b for GHSA-q82p-44mg-mgh5), yarn install (and potentially npm) hangs indefinitely for reasons that are still unknown.

Excerpt from my bug report on nixpkgs (NixOS/nixpkgs#353709):

  • This is a Linux issue, as the chroot is deactivated on __APPLE__
  • When running yarn install on a working derivation, it will hang and be unkillable. This implies not being terminatable by sudo kill -9 $PID and will make a shutdown or a reboot also hang. A cold reset is needed
  • When
    tmpDir = topTmpDir + "/build";
    createDir(tmpDir, 0700);
    are removed (or the conditional above is changed to include Linux), said derivation builds fine

Steps To Reproduce

  1. Use a nix version which includes GHSA-q82p-44mg-mgh5
  2. Run nix build github:nixos/nixpkgs/71e91c409d1e654808b2621f28a327acfdad8dc2#pgadmin --rebuild
  3. Wait forever

Note: "Forever" here means at least 48 hours, which is how long I let it run.
Note 2: This does not happen on the NixOS 24.05 release, as it uses Nix version 2.18.2.

Expected behavior

pgadmin should build fine.

nix-env --version output

nix-shell -p nix-info --run "nix-info -m"
 - system: `"x86_64-linux"`
 - host os: `Linux 6.6.57, NixOS, 24.11 (Vicuna), 24.11.20241021.257ee9d`
 - multi-user?: `yes`
 - sandbox: `yes`
 - version: `nix-env (Nix) 2.24.9`
 - channels(wogan): `""`
 - channels(root): `"nixos, nixpkgs"`
 - nixpkgs: `/nix/store/6cj8fs8ry03hykxhdl9y64rcg4r2q621-source`

Additional context

@roberth @edolstra

Priorities

Add 👍 to issues you find important.

@gador gador added the bug label Nov 5, 2024
@gador
Member Author

gador commented Nov 5, 2024

with

diff --git a/src/libstore/unix/build/local-derivation-goal.cc b/src/libstore/unix/build/local-derivation-goal.cc
index 2a09e3dd4..12e49d7d3 100644
--- a/src/libstore/unix/build/local-derivation-goal.cc
+++ b/src/libstore/unix/build/local-derivation-goal.cc
@@ -527,6 +527,7 @@ void LocalDerivationGoal::startBuilder()
         tmpDir = topTmpDir;
     }
     chownToBuilder(tmpDir);
+    chownToBuilder(topTmpDir);

     for (auto & [outputName, status] : initialOutputs) {
         /* Set scratch path we'll actually use during the build.

it builds.
So if the parent directory is owned by the Nix build user instead of root, yarn completes. It seems yarn needs a readable parent directory to work in. An easy workaround, moving the yarn $HOME directory further down the tree, did not work. It seems the process needs to be able to read at least up to the root dir.
I'll test it later without the chown but with 740 permissions to confirm my suspicion.

@gador
Member Author

gador commented Nov 5, 2024

> I'll later test it without chown but with 740 permissions to confirm my suspicion.

With 750 permissions (not 740), it also builds. The directory structure looks like this:

ls -la /tmp
drwx------  3 root  root  nix-build-pgadmin-8.12.drv-0
sudo ls -la /tmp/nix-build-pgadmin-8.12.drv-0
drwxr-x---  5 nixbld1 nixbld  7 Nov  5 06:21 build

So the /build directory is now readable and searchable (r-x) by the build process group, and this fixes it. I don't know why, though.
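For reference, the layout shown above (a 0700 top-level directory containing a 0750 build subdirectory) can be approximated in an unprivileged sketch. The helper names makeNestedBuildDir and permissionBits and the nix-build-demo- prefix are made up for illustration and are not part of Nix. The sketch also leaves ownership alone, since changing it would need root, so it only mirrors the permission bits.

```cpp
#include <stdlib.h>
#include <string>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: create /tmp/nix-build-demo-XXXXXX/build and give the
// inner directory the requested mode. Returns the inner path, or "" on failure.
std::string makeNestedBuildDir(mode_t innerMode)
{
    std::string top = "/tmp/nix-build-demo-XXXXXX";
    // mkdtemp replaces XXXXXX in place and creates the directory with mode 0700.
    if (mkdtemp(&top[0]) == nullptr) return "";
    std::string inner = top + "/build";
    if (mkdir(inner.c_str(), 0700) != 0) return "";
    // Unlike the mode passed to mkdir, chmod is not filtered by the umask.
    if (chmod(inner.c_str(), innerMode) != 0) return "";
    return inner;
}

// Permission bits of a path (file-type bits masked off), or -1 on error.
int permissionBits(const std::string & path)
{
    struct stat st;
    if (stat(path.c_str(), &st) != 0) return -1;
    return static_cast<int>(st.st_mode & 07777);
}
```

Calling makeNestedBuildDir(0750) and then permissionBits on the returned path and on its parent should report 0750 and 0700 under a typical umask.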

@gador
Member Author

gador commented Nov 5, 2024

So, it seems there are two potential fixes:

  1. Let the parent build-dir be owned by the nixbld[1-...] process
  2. Set the /build dir to 750

Option 1 is IMO not preferable, as it just removes the security added by using a subdir.

Option 2 seems viable. nixbld1 still cannot access a build by nixbld2, even though they share the same group and the /build directory is group-readable, because it cannot traverse the root-owned parent directory, which is 700. This way we keep the newly gained sandbox security but fix the yarn and npm build issue.
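The reasoning behind option 2 can be sketched as a small model of directory traversal: a process needs search (x) permission on every directory along the path, so a 0700 root-owned parent blocks other build users regardless of the mode of the directory below it. This is a simplified model, not Nix or kernel code. The Dir struct, the function names, and the uid/gid values are illustrative assumptions, and the model ignores root's override, supplementary groups, ACLs, and capabilities.

```cpp
#include <cassert>
#include <sys/types.h>
#include <vector>

// One directory on the path from the top down to the target (hypothetical model).
struct Dir {
    uid_t owner;
    gid_t group;
    mode_t mode; // permission bits only, e.g. 0700
};

// True if a process with this uid/gid has search (x) permission on `d`.
// Exactly one permission class applies: owner, else group, else other.
bool canSearch(const Dir & d, uid_t uid, gid_t gid)
{
    if (uid == d.owner) return (d.mode & 0100) != 0;
    if (gid == d.group) return (d.mode & 0010) != 0;
    return (d.mode & 0001) != 0;
}

// Entering the last directory requires search permission on it and on
// every directory above it in the chain.
bool canEnter(const std::vector<Dir> & chain, uid_t uid, gid_t gid)
{
    assert(!chain.empty());
    for (const Dir & d : chain)
        if (!canSearch(d, uid, gid)) return false;
    return true;
}
```

With a chain of /tmp (root, 1777), the top directory (root:root, 0700), and build (nixbld1:nixbld, 0750), canEnter returns false for a second build user in the nixbld group, because the check already fails at the 0700 parent.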

@edolstra
Member

edolstra commented Nov 5, 2024

> When running yarn install on a working derivation, it will hang and be unkillable. This implies not being terminatable by sudo kill -9 $PID and will make a shutdown or a reboot also hang. A cold reset is needed

That sounds like a kernel bug...

@FliegendeWurst
Member

> > When running yarn install on a working derivation, it will hang and be unkillable. This implies not being terminatable by sudo kill -9 $PID and will make a shutdown or a reboot also hang. A cold reset is needed
>
> That sounds like a kernel bug...

Not really, the process is just stuck in "uninterruptible sleep". See e.g. Stack Overflow, LWN.
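One way to confirm that a stuck process really is in uninterruptible sleep is to read the State: line of /proc/<pid>/status on Linux, which reports D (disk sleep) for that state. A minimal sketch follows; parseState and readStatus are hypothetical helper names, not part of any existing tool.

```cpp
#include <fstream>
#include <sstream>
#include <string>
#include <sys/types.h>

// Extract the one-letter state code from the text of /proc/<pid>/status.
// Returns '?' if no "State:" line is found.
char parseState(const std::string & statusText)
{
    std::istringstream in(statusText);
    std::string line;
    while (std::getline(in, line)) {
        if (line.compare(0, 6, "State:") == 0) {
            // The value follows the label after a tab, e.g. "State:\tD (disk sleep)".
            std::string::size_type pos = line.find_first_not_of(" \t", 6);
            return pos == std::string::npos ? '?' : line[pos];
        }
    }
    return '?';
}

// Read /proc/<pid>/status; returns an empty string if it cannot be opened.
std::string readStatus(pid_t pid)
{
    std::ifstream in("/proc/" + std::to_string(pid) + "/status");
    std::ostringstream out;
    out << in.rdbuf();
    return out.str();
}
```

A process for which parseState(readStatus(pid)) returns 'D' is in uninterruptible sleep and will not act on SIGKILL until the kernel operation it is waiting on finishes.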

@edolstra
Member

edolstra commented Nov 5, 2024

FWIW I couldn't reproduce this with Nix 2.24.9 on NixOS 24.05.20241009.d51c286 with kernel 6.1.112. Maybe it's a regression between Linux 6.1 and 6.6?

> Not really, the process is just stuck in "uninterruptible sleep". See e.g. Stack Overflow, LWN.

I mean, it's a bug if there is no good reason for the uninterruptible sleep (like waiting for a hard NFS mount). It would be interesting to know what syscall the unkillable process is in.
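To find out which syscall a blocked process is in, Linux exposes /proc/<pid>/syscall, whose first field is the syscall number (or -1 if the process is blocked outside a syscall, or the word running if it is not blocked at all). The sketch below only parses that first field. The helper names are hypothetical, the file is only present when the kernel was built with support for it, and reading it for another user's process generally needs root.

```cpp
#include <cstdlib>
#include <fstream>
#include <sstream>
#include <string>
#include <sys/types.h>

// Parse the first field of a /proc/<pid>/syscall line.
// Returns true and sets `nr` when the field is a number (-1 means the
// process is blocked, but not inside a syscall). Returns false for
// "running", empty input, or anything that is not a decimal number.
bool parseSyscallNumber(const std::string & line, long & nr)
{
    std::istringstream in(line);
    std::string first;
    if (!(in >> first) || first == "running") return false;
    char * end = nullptr;
    long value = std::strtol(first.c_str(), &end, 10);
    if (end == first.c_str() || *end != '\0') return false;
    nr = value;
    return true;
}

// Read the single line of /proc/<pid>/syscall; empty string on failure.
std::string readSyscallLine(pid_t pid)
{
    std::ifstream in("/proc/" + std::to_string(pid) + "/syscall");
    std::string line;
    std::getline(in, line);
    return line;
}
```

The number can then be looked up in the syscall table for the machine's architecture. As root, /proc/<pid>/stack may additionally show the kernel stack of the blocked task, if the kernel provides that file.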

@gador
Member Author

gador commented Nov 6, 2024

Closing. The issue turned out to be in the kernel, not in Nix. See NixOS/nixpkgs#353709 (comment) and NixOS/nixpkgs#353709 (comment)

@gador gador closed this as completed Nov 6, 2024