Desktop does not start if there is another GUI running in the host #25

Closed
Aerosane opened this issue Jun 13, 2022 · 14 comments

Comments

@Aerosane

Hello,

When I open the URL on port 8080 (selkies-gstreamer), I get a “502 Bad Gateway” error from nginx.

Thank you,

@Neoplanetz

Neoplanetz commented Jun 14, 2022

First of all, thank you for sharing the Docker image for nvidia-glx-desktop.

I'm seeing the same problem as @Aerosane.

My host PC server has 3 x Quadro RTX 8000 GPUs.

The NVIDIA graphics driver version is 510.73.05.

I'm running the container with the basic command from the README, with the noVNC environment variable enabled:

docker run --gpus 1 -it -e TZ=UTC -e SIZEW=1920 -e SIZEH=1080 -e REFRESH=60 -e DPI=96 -e CDEPTH=24 -e VIDEO_PORT=DFP -e PASSWD=mypasswd -e WEBRTC_ENCODER=nvh264enc -e BASIC_AUTH_PASSWORD=mypasswd  -e NOVNC_ENABLE=true -p 8080:8080 ghcr.io/ehfd/nvidia-glx-desktop:latest

After that, I try to access the web VNC, but I cannot connect. The error message is ERR_CONNECTION_REFUSED.

I have already added port 8080 to my Asus AP.

After a few minutes, the container stops automatically.

These are the logs from that container:

2022-06-10 07:12:31,571 INFO Set uid to user 1000 succeeded
2022-06-10 07:12:31,573 INFO supervisord started with pid 1
2022-06-10 07:12:32,576 INFO spawned: 'entrypoint' with pid 8
2022-06-10 07:12:32,580 INFO spawned: 'pulseaudio' with pid 9
2022-06-10 07:12:32,584 INFO spawned: 'selkies-gstreamer' with pid 10
2022-06-10 07:12:33,720 INFO success: entrypoint entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2022-06-10 07:12:33,720 INFO success: pulseaudio entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2022-06-10 07:12:33,720 INFO success: selkies-gstreamer entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2022-06-13 02:13:28,831 INFO Set uid to user 1000 succeeded
2022-06-13 02:13:28,834 INFO supervisord started with pid 1
2022-06-13 02:13:29,837 INFO spawned: 'entrypoint' with pid 8
2022-06-13 02:13:29,841 INFO spawned: 'pulseaudio' with pid 9
2022-06-13 02:13:29,844 INFO spawned: 'selkies-gstreamer' with pid 10
2022-06-13 02:13:29,901 INFO exited: pulseaudio (exit status 1; not expected)
2022-06-13 02:13:31,180 INFO success: entrypoint entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2022-06-13 02:13:31,183 INFO spawned: 'pulseaudio' with pid 82
2022-06-13 02:13:31,184 INFO success: selkies-gstreamer entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2022-06-13 02:13:32,259 INFO success: pulseaudio entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2022-06-13 02:19:32,673 WARN received SIGTERM indicating exit request
2022-06-13 02:19:32,674 INFO waiting for entrypoint, pulseaudio, selkies-gstreamer to die
2022-06-13 02:19:35,677 INFO waiting for entrypoint, pulseaudio, selkies-gstreamer to die
2022-06-13 02:19:38,681 INFO waiting for entrypoint, pulseaudio, selkies-gstreamer to die
2022-06-13 02:19:41,685 INFO waiting for entrypoint, pulseaudio, selkies-gstreamer to die

How can I resolve this problem? Thank you for your reply.

@ehfd
Member

ehfd commented Jun 14, 2022

Possibly some kind of upstream regression. I cannot check now, but I will make sure to fix it ASAP, by next week at the latest.

@ehfd
Member

ehfd commented Jun 14, 2022

Triggered a rebuild to see if it helps. Please pull the new container in an hour and check whether it works. If not, manual intervention will be required.

@ehfd
Member

ehfd commented Jun 14, 2022

And please upload all three of the .log files at /tmp to troubleshoot quickly.

@Neoplanetz

And please upload all three of the .log files at /tmp to troubleshoot quickly.

I removed the old nvidia-glx-desktop Docker image, pulled the new image, and ran the container.

But the result is the same as before, so I am uploading all the log files here. Thank you for the fast response.

glx_logs.zip

Additionally, the docker-nvidia-egl-desktop container runs smoothly with either selkies-gstreamer or noVNC.

@ehfd
Member

ehfd commented Jun 15, 2022

@Neoplanetz For you, this is a duplicate of #11.
PLEASE DO NOT START TWO X SERVERS (THE OTHER LIKELY ON YOUR HOST) ON ONE GPU
@Aerosane Please check this too
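
One way to check the host for a running X server before starting the container is sketched below. It assumes a Linux host with `ps` and `awk` available. The function name `filter_x_servers` is made up for illustration, and matching only the process names `Xorg` and `X` is an assumption that may miss other display servers.

```shell
# Keep only the lines of "PID COMMAND" input whose command is an X server.
# Assumption: the host X server process is named "Xorg" or "X".
filter_x_servers() {
    awk '$2 == "Xorg" || $2 == "X" { print }'
}

# Run on the host (not inside the container). Any output means an X server
# is already running and may be holding the GPU.
ps -e -o pid=,comm= 2>/dev/null | filter_x_servers
```

If this prints nothing, no process with one of those names was found on the host. It does not tell you which GPU a listed X server is using.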

@Neoplanetz

Sorry, I don't know much about X servers.
I referred to #11 and switched to a PC with only one RTX 3090 GPU.
I ran this command:

$ sudo nvidia-xconfig --no-probe-all-gpus --busid=BUS_ID --only-one-x-screen 

This is the /etc/X11/xorg.conf that the above command generated automatically:

# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig:  version 510.73.05

Section "ServerLayout"
    Identifier     "Layout0"
    Screen      0  "Screen0"
    InputDevice    "Keyboard0" "CoreKeyboard"
    InputDevice    "Mouse0" "CorePointer"
EndSection

Section "Files"
EndSection

Section "InputDevice"
    # generated from default
    Identifier     "Mouse0"
    Driver         "mouse"
    Option         "Protocol" "auto"
    Option         "Device" "/dev/psaux"
    Option         "Emulate3Buttons" "no"
    Option         "ZAxisMapping" "4 5"
EndSection

Section "InputDevice"
    # generated from default
    Identifier     "Keyboard0"
    Driver         "kbd"
EndSection

Section "Monitor"
    Identifier     "Monitor0"
    VendorName     "Unknown"
    ModelName      "Unknown"
    Option         "DPMS"
EndSection

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BusID          "PCI:1:0:0"
EndSection

Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Monitor        "Monitor0"
    DefaultDepth    24
    Option         "ProbeAllGpus" "False"
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection

Then I ran the nvidia-glx-desktop container, but the result is the same.

Is this the correct way to set only one X server on one GPU?

@Aerosane
Author

Aerosane commented Jun 15, 2022

Well, I already tried removing my old X server containers, and the results are still the same. I just tried again, but got this error:

{"type":"https://tools.ietf.org/html/rfc7231#section-6.5.1","title":"Bad Request","status":400,"traceId":"|ccebc03b-438d60027d25ace1."}

After updating the packages in the container, I restarted it, and the bad gateway error shows up again.

@ehfd
Member

ehfd commented Jun 15, 2022

@Neoplanetz The container is usable with multiple GPUs in one host, as long as you set up nvidia-container-toolkit.
However, you should first try disabling the GUI on your host (not the container) and check whether it works. Then, if you have multiple GPUs, do what you did for #11 on the host (again, not the container) and exclude the GPU with that PCI ID from nvidia-container-toolkit. You still have an X server running on the host and now you only have one GPU, so there is no way the container can work.
Therefore, you should either do what you just did on your 3x GPU node (then exclude the single GPU with the PCI ID in xorg.conf from Docker allocation), or turn off the GUI on your current 1x GPU node altogether.

You can stop the GUI on the host with sudo service lightdm stop or sudo service gdm stop, depending on your display manager. You can then work in the console on your monitor or ssh in.
The point here is to set aside one GPU for the host X server, then allocate the remaining GPUs to the container to prevent a collision.

https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/user-guide.html#gpu-enumeration
This is how you select a specific GPU ID (this may not be the PCI ID, check nvidia-smi) or the UUID for a container.

@Aerosane Probably the same issue for you (again, this is the host, not the container), but please upload all three of the .log files in /tmp inside the container so we can troubleshoot: run docker exec -it [container_id] /bin/bash, then cd /tmp && ls -l /tmp.
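
As a rough sketch, the log files can also be copied out of the container from the host instead of from an interactive shell. The function name `collect_container_logs` and the default destination `./container-logs` are hypothetical. It assumes the container is still running and that the logs match `/tmp/*.log`.

```shell
# Copy every /tmp/*.log file out of a running container into a local directory.
# $1: container ID or name; $2: destination directory (hypothetical default).
collect_container_logs() {
    local container="$1" dest="${2:-./container-logs}" log
    mkdir -p "$dest"
    # List the log files inside the container, one path per line,
    # then copy each of them to the host with docker cp.
    docker exec "$container" sh -c 'ls -1 /tmp/*.log 2>/dev/null' |
    while IFS= read -r log; do
        docker cp "${container}:${log}" "$dest/" </dev/null
    done
}

# Usage (replace the placeholder with your container ID):
#   collect_container_logs [container_id]
```

The copied files can then be zipped and attached to the issue.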

@Aerosane
Author

Still doesn’t work.

@ehfd
Member

ehfd commented Jun 18, 2022

Until I can work out why this issue persists, docker-nvidia-egl-desktop is still a good alternative. Depending on the container toolkit version or driver version, I see that Vulkan now works somehow.

@Aerosane
Author

Sure.

@ehfd
Member

ehfd commented Aug 18, 2022

To use an X server on the host for your monitor with one GPU and provision the other GPUs for containers, you need to change your /etc/X11/xorg.conf configuration.
First, use nvidia-xconfig --no-probe-all-gpus --busid=$BUS_ID --only-one-x-screen to generate /etc/X11/xorg.conf, where BUS_ID is generated with the script below. GPU_SELECT is the ID of the specific GPU you want to provision for the host X server.

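# Query the PCI bus ID of the selected GPU, e.g. 00000000:01:00.0 (line 2 skips the CSV header)
HEX_ID=$(nvidia-smi --query-gpu=pci.bus_id --id="$GPU_SELECT" --format=csv | sed -n 2p)
# Split on ":" and "." into domain, bus, device and function
IFS=":." ARR_ID=($HEX_ID)
unset IFS
# Convert the hexadecimal fields to the decimal form used in xorg.conf
BUS_ID=PCI:$((16#${ARR_ID[1]})):$((16#${ARR_ID[2]})):$((16#${ARR_ID[3]}))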
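
To see what the conversion does without a GPU at hand, here is a small sketch of the same hexadecimal-to-decimal step as a function. The name `pci_to_xorg_busid` and the sample bus IDs are made up for illustration. It uses `0x` arithmetic instead of the bash-only `16#` form above so that it also runs in a plain POSIX shell.

```shell
# Convert an nvidia-smi style PCI bus ID such as "00000000:01:00.0" into the
# decimal "PCI:bus:device:function" form.
pci_to_xorg_busid() {
    local domain bus device func
    # Split on ":" and "." into domain, bus, device, function (all hexadecimal).
    IFS=':.' read -r domain bus device func <<EOF
$1
EOF
    printf 'PCI:%d:%d:%d\n' "$((0x$bus))" "$((0x$device))" "$((0x$func))"
}

pci_to_xorg_busid "00000000:01:00.0"   # prints PCI:1:0:0
pci_to_xorg_busid "00000000:AF:00.0"   # prints PCI:175:0:0
```

The second call shows why the conversion matters: bus AF in hexadecimal has to be written as 175 in the BusID line.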

Then, edit /etc/X11/xorg.conf and add the following to the end.

Section "ServerFlags"
    Option "AutoAddGPU" "false"
EndSection
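
If you prefer to script this edit, a minimal sketch follows. The function name `add_autoaddgpu_off` is hypothetical, and the check for an existing `AutoAddGPU` option is a simple text match, not a real xorg.conf parser. The demonstration writes to a scratch file; editing the real /etc/X11/xorg.conf needs root.

```shell
# Append the ServerFlags section to the given xorg.conf file unless an
# AutoAddGPU option is already present in it.
add_autoaddgpu_off() {
    local conf="$1"
    if grep -q '"AutoAddGPU"' "$conf" 2>/dev/null; then
        echo "AutoAddGPU already present in $conf, leaving it unchanged"
        return 0
    fi
    cat >> "$conf" <<'EOF'

Section "ServerFlags"
    Option "AutoAddGPU" "false"
EndSection
EOF
}

# Demonstrate on a scratch file rather than the real configuration.
scratch=$(mktemp)
add_autoaddgpu_off "$scratch"
cat "$scratch"
rm -f "$scratch"
```

Running the function twice on the same file adds the section only once.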

Note: https://man.archlinux.org/man/extra/xorg-server/xorg.conf.d.5.en
After you restart your OS or the Xorg server, you will be able to use one GPU for your host X server and your real monitor, and use the rest of the GPUs for the containers.
Use docker run --gpus '"device=1,2"' to provision the GPUs with device IDs 1 and 2 to the container. Note that --gpus 1 means any one GPU, not the GPU with device ID 1. The same applies to podman.
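
The quoting of the device list is easy to get wrong, so here is a small sketch that builds the value from a list of device IDs or UUIDs. The helper name `gpus_device_arg` is made up for illustration. The `nvidia-smi` query only runs if the tool is installed.

```shell
# Build the value for the --gpus flag from a list of GPU indices or UUIDs.
# The double quotes are part of the value, as in '"device=1,2"'.
gpus_device_arg() {
    local IFS=','
    # "$*" joins all arguments with the first character of IFS (a comma).
    printf '"device=%s"\n' "$*"
}

gpus_device_arg 1 2   # prints "device=1,2"

# List index, UUID and PCI bus ID of each GPU to decide which ones to pass.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=index,uuid,pci.bus_id --format=csv
fi

# Usage with the image from this thread:
#   docker run --gpus "$(gpus_device_arg 1 2)" -it -p 8080:8080 ghcr.io/ehfd/nvidia-glx-desktop:latest
```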

@ehfd changed the title from "Web VNC error" to "Desktop does not start if there is another GUI running in the host" on Aug 18, 2022
@ehfd closed this as completed on Aug 22, 2022
@ehfd
Member

ehfd commented Aug 24, 2022

Added in Documentation.
