When I first wrote this article I had installed driver 390.25 + CUDA 9.0 + cuDNN 7.0.5. It is now updated to the versions I currently use: driver 418.56 + CUDA 10.1 + cuDNN 7.6.5.
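CUDA download: https://developer.nvidia.com/cuda-toolkit-archive
cuDNN download: https://developer.nvidia.com/rdp/cudnn-archive
NVIDIA driver download: https://www.nvidia.cn/Download/Find.aspx?lang=cn
Alternatively, use Tianyi Cloud, which is faster; the download link is at the end of this article.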
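1 Installing the NVIDIA graphics driver
Before installing the driver, check which driver versions work with the CUDA version you want. See the table below:
[Table: driver versions and the CUDA versions they support]
I recommend installing a fairly recent driver, otherwise upgrading later is a hassle. Everything below uses 418.56 as the example.
The file I installed is NVIDIA-Linux-x86_64-418.56.run.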
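The driver/CUDA match described above can be checked with a small version comparison. This is only a sketch: the minimum of 418.39 for CUDA 10.1 is my assumption, read off the installer file name cuda_10.1.105_418.39_linux, and the function names are made up for illustration. Confirm the real minimum against NVIDIA's compatibility table.

```shell
# version_ge A B: succeed when dotted version A >= B (relies on GNU `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumed minimum driver for CUDA 10.1, taken from the installer file name
# cuda_10.1.105_418.39_linux. Verify against NVIDIA's table before relying on it.
MIN_DRIVER_CUDA_10_1="418.39"

# check_driver VERSION: report whether VERSION is new enough for CUDA 10.1.
check_driver() {
    if version_ge "$1" "$MIN_DRIVER_CUDA_10_1"; then
        echo "ok: driver $1 >= $MIN_DRIVER_CUDA_10_1"
    else
        echo "too old: driver $1 < $MIN_DRIVER_CUDA_10_1"
        return 1
    fi
}

# Example: check_driver 418.56
```

With the two drivers mentioned in this article, 418.56 passes and 390.25 does not.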
First pick a directory, for example /home/ubuntu/nvidia/, put the run file there, and make it executable:
    cd /home/ubuntu/nvidia
    chmod +x NVIDIA-Linux-x86_64-418.56.run
Next, press Ctrl+Alt+F1 (any of F1 to F6) to switch to a text console, log in, and stop lightdm:
    sudo service lightdm stop
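If your system language is Chinese, the console will show garbled characters here. That is harmless; just type your password.
Then disable nouveau: open /etc/modprobe.d/blacklist.conf and append the following at the end: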
    blacklist nouveau
    options nouveau modeset=0
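If you rerun this step (for example after a failed install), it is easy to end up with duplicate lines in the blacklist file. A minimal sketch of an append that skips lines already present; `append_once` is a hypothetical helper name, and the example uses a scratch file rather than the real /etc/modprobe.d/blacklist.conf:

```shell
# append_once FILE LINE: append LINE to FILE unless an identical line exists.
append_once() {
    file=$1
    line=$2
    # -q quiet, -x match the whole line, -F treat LINE as a fixed string
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example on a scratch file. For the real file, run as root and pass
# /etc/modprobe.d/blacklist.conf instead:
#   append_once /tmp/blacklist.conf "blacklist nouveau"
#   append_once /tmp/blacklist.conf "options nouveau modeset=0"
```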
Then regenerate the initramfs so the change takes effect:
    sudo update-initramfs -u
Reboot, then run lsmod | grep nouveau to check whether nouveau is really disabled. No output at all means it has been disabled correctly.
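A plain grep also matches lines where nouveau only appears in the "Used by" column of another module. As a sketch, the check can be limited to the first column of the lsmod output; `module_loaded` is a name I made up for this example:

```shell
# module_loaded NAME: read `lsmod` output on stdin and succeed only when a
# module named exactly NAME appears in the first column.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}

# Usage on a real machine:
#   if lsmod | module_loaded nouveau; then
#       echo "nouveau is still loaded"
#   else
#       echo "nouveau is disabled"
#   fi
```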
Then switch to init 3 (text mode) and install the driver:
    sudo init 3   # or: sudo service lightdm stop
    sudo ./NVIDIA-Linux-x86_64-418.56.run
    # Note: the command above may fail. If you are already in init 3 and the installer still says the X server is running, use this instead:
    sudo ./NVIDIA-Linux-x86_64-418.56.run -no-x-check -no-nouveau-check -no-opengl-files
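-no-x-check: do not check for a running X server during installation
-no-nouveau-check: do not check for nouveau during installation
-no-opengl-files: install only the driver files, not the OpenGL files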
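The installer then asks a series of questions about whether you agree; accept all of them. When the installation finishes, reboot.
After rebooting, run nvidia-smi to check the driver: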
    (base) mcj@mcj2080:~$ nvidia-smi
    Mon Jan 18 16:42:28 2021
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 418.56       Driver Version: 418.56       CUDA Version: 10.1     |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |===============================+======================+======================|
    |   0  GeForce RTX 208...  Off  | 00000000:04:00.0  On |                  N/A |
    | 27%   34C    P8    21W / 250W |     53MiB / 10986MiB |      0%      Default |
    +-------------------------------+----------------------+----------------------+

    +-----------------------------------------------------------------------------+
    | Processes:                                                       GPU Memory |
    |  GPU       PID   Type   Process name                             Usage      |
    |=============================================================================|
    |    0      4591      G   /usr/lib/xorg/Xorg                            51MiB |
    +-----------------------------------------------------------------------------+
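If you want the driver version in a script rather than reading it off the table, one rough way is to pull it out of the banner line. The helper name below is hypothetical and the pattern assumes the banner format shown above; on a real machine, nvidia-smi --query-gpu=driver_version --format=csv,noheader prints the same value directly.

```shell
# driver_version_from_banner: read `nvidia-smi` output on stdin and print the
# value that follows "Driver Version:" (first match only).
driver_version_from_banner() {
    sed -n 's/.*Driver Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Usage on a real machine:
#   nvidia-smi | driver_version_from_banner
```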
The driver is now installed. If you ever want to uninstall it, run:
    sudo ./NVIDIA-Linux-x86_64-418.56.run --uninstall
2 Installing CUDA 10.1
First download CUDA; I used cuda_10.1.105_418.39_linux.
Put it in the same directory as before (purely for convenience), then install it the same way:
    cd /home/ubuntu/nvidia
    chmod +x cuda_10.1.105_418.39_linux
    sudo ./cuda_10.1.105_418.39_linux
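Before CUDA 10.x:
After a few seconds a license agreement appears with a percentage indicator. Press q to skip it, then choose accept. When asked whether to install the driver, answer N, because we already installed it. Answer yes to everything else and keep the default installation directories.
CUDA 10.x and later:
Note that the 10.x installer differs from earlier ones. Here we use 10.1. After running the command above, the installer screens appear in this order:
First type accept and press Enter.
On the component list I deselected the first option (the driver, which is already installed); press Space to deselect.
Then select Install and press Enter.
Select Yes and press Enter.
Once the final summary screen appears, the installation is done.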
Next, add the environment variables; otherwise nvcc -V will not work:
    sudo nano /etc/profile
    # append at the end:
    export PATH=/usr/local/cuda-10.1/bin:$PATH
    export LD_LIBRARY_PATH=/usr/local/cuda-10.1/lib64:$LD_LIBRARY_PATH
    # Ctrl+X, then Y, to save and exit
    source /etc/profile

    sudo nano /etc/ld.so.conf
    # append at the end:
    /usr/local/cuda-10.1/lib64
    # Ctrl+X, then Y, to save and exit
    sudo ldconfig
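Each time /etc/profile is sourced again, the export lines above prepend the same directory once more, so PATH slowly collects duplicates. A sketch of a duplicate-free prepend, assuming nothing beyond a POSIX shell; `path_prepend` is a hypothetical helper, not part of CUDA or Ubuntu:

```shell
# path_prepend DIR LIST: print the colon-separated LIST with DIR placed at the
# front; any existing copy of DIR and any empty entries are dropped.
path_prepend() {
    dir=$1
    list=$2
    result=$dir
    old_ifs=$IFS
    IFS=:
    for entry in $list; do
        if [ -n "$entry" ] && [ "$entry" != "$dir" ]; then
            result="$result:$entry"
        fi
    done
    IFS=$old_ifs
    printf '%s\n' "$result"
}

# Usage in /etc/profile or ~/.bashrc:
#   PATH=$(path_prepend /usr/local/cuda-10.1/bin "$PATH"); export PATH
#   LD_LIBRARY_PATH=$(path_prepend /usr/local/cuda-10.1/lib64 "$LD_LIBRARY_PATH")
#   export LD_LIBRARY_PATH
```

Dropping empty entries also avoids a trailing colon when LD_LIBRARY_PATH was previously unset.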
Now check the result by running ldconfig -v|grep cuda. The output should look like this:
    ubuntu@mcj:~$ ldconfig -v|grep cuda
    /sbin/ldconfig.real: Can't stat /usr/local/lib/x86_64-linux-gnu: No such file or directory
    /sbin/ldconfig.real: Can't stat /lib32: No such file or directory
    /sbin/ldconfig.real: Path `/lib/x86_64-linux-gnu' given more than once
    /sbin/ldconfig.real: Path `/usr/lib/x86_64-linux-gnu' given more than once
    /sbin/ldconfig.real: /lib/x86_64-linux-gnu/ld-2.27.so is the dynamic linker, ignoring

        libcuda.so.1 -> libcuda.so.418.56
        libicudata.so.55 -> libicudata.so.55.1
        libcuda.so.1 -> libcuda.so.418.56
    /sbin/ldconfig.real: Can't create temporary cache file /etc/ld.so.cache~: Permission denied
    /usr/local/cuda-10.1/lib64:
        libcudart.so.10.1 -> libcudart.so.10.1
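This shows that the dynamic libraries are set up.
Note: if you are only upgrading CUDA, you are done at this point. The remaining steps in this section are not needed; skip straight to the cuDNN installation in section 3.
If you want to compile the samples, a few required tools have to be installed first: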
    sudo apt-get update
    sudo apt-get install build-essential
    sudo apt-get install freeglut3-dev libx11-dev libxmu-dev libxi-dev libgl1-mesa-glx libglu1-mesa libglu1-mesa-dev
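Watch the last command: freeglut3-dev and libglu1-mesa-dev may report errors. Don't worry; install those two packages one at a time, which works fine. After that, go to the samples directory and test whether CUDA was installed successfully.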
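Installing the packages one at a time, as suggested above, can be wrapped in a small loop that reports which ones failed. This is only a sketch; `install_each` is a hypothetical helper, and the installer command is passed in as a string so the loop can be tried with a harmless stand-in first.

```shell
# install_each INSTALLER PKG...: run "INSTALLER PKG" once per package.
# Prints the packages that failed and returns non-zero if any did.
install_each() {
    installer=$1
    shift
    failed=""
    for pkg in "$@"; do
        # $installer is intentionally unquoted so "sudo apt-get install -y"
        # splits into separate words.
        if ! $installer "$pkg"; then
            failed="$failed $pkg"
        fi
    done
    if [ -n "$failed" ]; then
        echo "failed:$failed"
        return 1
    fi
    echo "all installed"
}

# Usage on a real machine:
#   install_each "sudo apt-get install -y" freeglut3-dev libglu1-mesa-dev
```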
You can compile all the samples or just one. Here we test the NVIDIA_CUDA-10.1_Samples/1_Utilities/deviceQuery sample.
    cd ~/NVIDIA_CUDA-10.1_Samples/1_Utilities/deviceQuery
    sudo make
After a short wait the build finishes. The output below is from my earlier 9.0 installation; 10.x looks similar.
    ubuntu@mcj:~/NVIDIA_CUDA-9.0_Samples/1_Utilities/deviceQuery$ sudo make
    "/usr/local/cuda-9.0"/bin/nvcc -ccbin g++ -I../../common/inc -m64 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_70,code=compute_70 -o deviceQuery.o -c deviceQuery.cpp
    "/usr/local/cuda-9.0"/bin/nvcc -ccbin g++ -m64 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_70,code=compute_70 -o deviceQuery deviceQuery.o
    mkdir -p ../../bin/x86_64/linux/release
    cp deviceQuery ../../bin/x86_64/linux/release
Then run it:
    ubuntu@mcj:~/NVIDIA_CUDA-9.0_Samples/1_Utilities/deviceQuery$ ./deviceQuery
    ./deviceQuery Starting...

     CUDA Device Query (Runtime API) version (CUDART static linking)

    Detected 2 CUDA Capable device(s)

    Device 0: "GeForce GTX 1080 Ti"
      CUDA Driver Version / Runtime Version          9.1 / 9.0
      CUDA Capability Major/Minor version number:    6.1
      Total amount of global memory:                 11175 MBytes (11718230016 bytes)
      (28) Multiprocessors, (128) CUDA Cores/MP:     3584 CUDA Cores
      GPU Max Clock rate:                            1582 MHz (1.58 GHz)
      Memory Clock rate:                             5505 Mhz
      Memory Bus Width:                              352-bit
      L2 Cache Size:                                 2883584 bytes
      Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
      Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
      Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
      Total amount of constant memory:               65536 bytes
      Total amount of shared memory per block:       49152 bytes
      Total number of registers available per block: 65536
      Warp size:                                     32
      Maximum number of threads per multiprocessor:  2048
      Maximum number of threads per block:           1024
      Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
      Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
      Maximum memory pitch:                          2147483647 bytes
      Texture alignment:                             512 bytes
      Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
      Run time limit on kernels:                     Yes
      Integrated GPU sharing Host Memory:            No
      Support host page-locked memory mapping:       Yes
      Alignment requirement for Surfaces:            Yes
      Device has ECC support:                        Disabled
      Device supports Unified Addressing (UVA):      Yes
      Supports Cooperative Kernel Launch:            Yes
      Supports MultiDevice Co-op Kernel Launch:      Yes
      Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
      Compute Mode:
         < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

    Device 1: "GeForce GTX 1080 Ti"
      CUDA Driver Version / Runtime Version          9.1 / 9.0
      CUDA Capability Major/Minor version number:    6.1
      Total amount of global memory:                 11178 MBytes (11721506816 bytes)
      (28) Multiprocessors, (128) CUDA Cores/MP:     3584 CUDA Cores
      GPU Max Clock rate:                            1582 MHz (1.58 GHz)
      Memory Clock rate:                             5505 Mhz
      Memory Bus Width:                              352-bit
      L2 Cache Size:                                 2883584 bytes
      Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
      Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
      Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
      Total amount of constant memory:               65536 bytes
      Total amount of shared memory per block:       49152 bytes
      Total number of registers available per block: 65536
      Warp size:                                     32
      Maximum number of threads per multiprocessor:  2048
      Maximum number of threads per block:           1024
      Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
      Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
      Maximum memory pitch:                          2147483647 bytes
      Texture alignment:                             512 bytes
      Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
      Run time limit on kernels:                     No
      Integrated GPU sharing Host Memory:            No
      Support host page-locked memory mapping:       Yes
      Alignment requirement for Surfaces:            Yes
      Device has ECC support:                        Disabled
      Device supports Unified Addressing (UVA):      Yes
      Supports Cooperative Kernel Launch:            Yes
      Supports MultiDevice Co-op Kernel Launch:      Yes
      Device PCI Domain ID / Bus ID / location ID:   0 / 2 / 0
      Compute Mode:
         < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
    > Peer access from GeForce GTX 1080 Ti (GPU0) -> GeForce GTX 1080 Ti (GPU1) : Yes
    > Peer access from GeForce GTX 1080 Ti (GPU1) -> GeForce GTX 1080 Ti (GPU0) : Yes

    deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.1, CUDA Runtime Version = 9.0, NumDevs = 2
    Result = PASS
If you see this output, ending in Result = PASS, CUDA has been installed successfully.
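The deviceQuery output is long, so for a scripted check it may be enough to look at two lines: the device count and the final result. A small sketch, assuming the output format shown above; `device_query_summary` is a name I made up:

```shell
# device_query_summary: read deviceQuery output on stdin, print the detected
# device count and the final result, and succeed only when the result is PASS.
device_query_summary() {
    awk '
        /^Detected [0-9]+ CUDA Capable device/ { devices = $2 }
        /^Result = /                           { result = $3 }
        END {
            printf "devices=%s result=%s\n", devices, result
            exit ((result == "PASS") ? 0 : 1)
        }'
}

# Usage on a real machine, from the deviceQuery directory:
#   ./deviceQuery | device_query_summary
```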
3 Installing cuDNN 7.6.5
The cuDNN version has to match the CUDA version. I chose cudnn-10.1-linux-x64-v7.6.5.32.tgz. Put it in the same directory as before and extract it:
    cd /home/ubuntu/nvidia
    tar -xvf cudnn-10.1-linux-x64-v7.6.5.32.tgz
    cd cuda
    sudo cp ./include/cudnn.h /usr/local/cuda-10.1/include
    # For cuDNN 8.x, change the line above to: sudo cp ./include/cudnn* /usr/local/cuda-11.1/include
    sudo cp -a ./lib64/libcudnn* /usr/local/cuda-10.1/lib64
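OK, cuDNN is now installed.
One thing to note: cuDNN installed this way is effectively a set of symbolic links. To avoid deleting something by accident later, I recommend putting the files directly into the corresponding directories.
To check that the installation worked and which version is installed: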
    cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
    # For 8.x, verify with this command instead:
    cat /usr/local/cuda/include/cudnn_version.h | grep CUDNN_MAJOR -A 2
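The command prints the CUDNN_MAJOR, CUDNN_MINOR, and CUDNN_PATCHLEVEL defines. On my earlier installation they read 7, 0, 5, meaning cuDNN 7.0.5; with 7.6.5 the result looks similar.
Alternatively, verify the installation by compiling a test example; see:
https://blog.csdn.net/caicaiatnbu/article/details/87626491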
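For the header check described above, the three defines can also be joined into a single version string. A minimal sketch; `cudnn_version` is a hypothetical helper, and it assumes the header contains plain `#define CUDNN_MAJOR 7` style lines:

```shell
# cudnn_version HEADER: print MAJOR.MINOR.PATCHLEVEL read from the #define
# lines of a cuDNN header (cudnn.h for 7.x, cudnn_version.h for 8.x).
cudnn_version() {
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { print major "." minor "." patch }' "$1"
}

# Usage on a real machine:
#   cudnn_version /usr/local/cuda/include/cudnn.h
```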
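4 Other versions
Installing other versions of CUDA and cuDNN works the same way. A simpler approach is to extract everything into a directory of your choice and then add the paths to /etc/profile and to the ldconfig configuration.
For example, in /etc/profile: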
    # CUDA configuration
    export PATH=$PATH:/A-pool/cuda/cuda-9.0/bin:/A-pool/cuda/cuda-9.0/include
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/A-pool/cuda/cuda-9.0/lib64
    export CUDA_HOME=/A-pool/cuda/cuda-9.0
    # cuDNN configuration
    export PATH=$PATH:/A-pool/cudnn/cuda-9.0-cudnn-7.0/include
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/A-pool/cudnn/cuda-9.0-cudnn-7.0/lib64
And in /etc/ld.so.conf:
    /A-pool/cuda/cuda-9.0/lib64
    /A-pool/cudnn/cuda-9.0-cudnn-7.0/lib64
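With several toolkits unpacked side by side, it can help to generate the export lines from the toolkit root instead of editing them by hand. The sketch below is one possible way to do that; `cuda_env_lines` is a hypothetical helper, and unlike the /etc/profile example it puts the chosen toolkit at the front of PATH so it wins over any other installed version.

```shell
# cuda_env_lines ROOT: print export lines for a toolkit unpacked at ROOT.
# The $PATH and $LD_LIBRARY_PATH references are printed literally so that
# they expand only when the lines are evaluated.
cuda_env_lines() {
    root=${1%/}   # drop one trailing slash, if any
    printf 'export CUDA_HOME=%s\n' "$root"
    printf 'export PATH=%s/bin:$PATH\n' "$root"
    # ${VAR:+:$VAR} adds the old value only when it is non-empty, which
    # avoids a trailing colon in LD_LIBRARY_PATH.
    printf 'export LD_LIBRARY_PATH=%s/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}\n' "$root"
}

# Usage: switch the current shell to the 9.0 toolkit from the example above.
#   eval "$(cuda_env_lines /A-pool/cuda/cuda-9.0)"
```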
Tianyi Cloud download (attachment)