lxc
Restarting the container failed with the following output:
Name: pang
Remote: unix://
Architecture: x86_64
Created: 2022/11/02 06:47 UTC
Status: Stopped
Type: persistent
Profiles: default

Log:

lxc pangyindong 20221105051756.396 ERROR conf - conf.c:mount_entry:2019 - No such device or address - Failed to mount "/var/lib/lxd/devices/pangyindong/unix.gpu0.dev-nvidia-caps" on "/usr/lib/x86_64-linux-gnu/lxc/dev/nvidia-caps"
lxc pangyindong 20221105051756.396 ERROR conf - conf.c:lxc_setup:3611 - Failed to setup mount entries
lxc pangyindong 20221105051756.396 ERROR start - start.c:do_start:1263 - Failed to setup container "pangyindong"
lxc pangyindong 20221105051756.396 ERROR sync - sync.c:__sync_wait:62 - An error occurred in another process (expected sequence number 5)
lxc pangyindong 20221105051756.396 WARN network - network.c:lxc_delete_network_priv:2589 - Operation not permitted - Failed to remove interface "eth0" with index 18
lxc pangyindong 20221105051756.396 ERROR lxccontainer - lxccontainer.c:wait_on_daemonized_start:842 - Received container state "ABORTING" instead of "RUNNING"
lxc pangyindong 20221105051756.396 ERROR start - start.c:__lxc_start:1939 - Failed to spawn container "pangyindong"
lxc 20221105051756.397 WARN commands - commands.c:lxc_cmd_rsp_recv:132 - Connection reset by peer - Failed to receive response for command "get_state"
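The cause is that the container tries to mount unix.gpu0.dev-nvidia-caps at startup, but this entry cannot be mounted (the log reports "No such device or address"), so startup fails.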
We can run ls on /dev/nvidia* to take a look:
/dev/nvidia0 /dev/nvidia1 /dev/nvidiactl /dev/nvidia-modeset /dev/nvidia-uvm /dev/nvidia-uvm-tools

/dev/nvidia-caps:
nvidia-cap1 nvidia-cap2
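In the listing, ls expands /dev/nvidia-caps into its contents, which suggests it is a directory rather than a device node like the others. A small sketch to make that distinction explicit is shown here; `classify_entries` is a hypothetical helper name, not part of LXD or the NVIDIA tooling.

```shell
#!/bin/sh
# Hypothetical helper: print the type of each path given as an argument.
# Useful for spotting which /dev/nvidia* entries are directories
# rather than character devices.
classify_entries() {
    for p in "$@"; do
        if [ -d "$p" ]; then
            echo "directory $p"
        elif [ -c "$p" ]; then
            echo "chardev $p"
        elif [ -e "$p" ]; then
            echo "other $p"
        else
            echo "missing $p"
        fi
    done
}
```

On the host it could be called as `classify_entries /dev/nvidia*`; on a machine like the one above we would expect /dev/nvidia-caps to be the only entry reported as a directory.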
The current workaround is to manually remove the /dev/nvidia-caps directory. To be safe, though, rename the directory instead of deleting it.
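The rename could be sketched roughly as follows. `rename_caps_dir` and the `disabled-` prefix are illustrative choices of ours, not something the workaround prescribes. The prefix is used on the assumption that a name no longer starting with "nvidia" will not be picked up again as a GPU device, which we have not verified.

```shell
#!/bin/sh
# Hypothetical helper: move a directory aside instead of deleting it.
# Defaults to /dev/nvidia-caps; the "disabled-" prefix is an arbitrary
# choice made so that the new name no longer matches /dev/nvidia*.
rename_caps_dir() {
    dir="${1:-/dev/nvidia-caps}"
    parent=$(dirname -- "$dir")
    name=$(basename -- "$dir")
    backup="$parent/disabled-$name"

    if [ ! -d "$dir" ]; then
        echo "skip: $dir is not a directory" >&2
        return 1
    fi
    if [ -e "$backup" ]; then
        echo "skip: $backup already exists" >&2
        return 1
    fi
    # Print the new location on success so the change can be undone later.
    mv -- "$dir" "$backup" && echo "$backup"
}
```

Run as root on the host, `rename_caps_dir` followed by `lxc start pangyindong` should let the container start again if the diagnosis above is right. Since /dev is typically recreated at boot, the directory will probably reappear after a reboot or driver reload, so the step may need repeating.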
Last updated on November 5, 2022.