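When this article was first written, it covered driver 390.25 + CUDA 9.0 + cuDNN 7.0.5. It has since been updated to the versions I currently use: driver 418.56 + CUDA 10.1 + cuDNN 7.6.5.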





1 Install the NVIDIA GPU driver





cd /home/ubuntu/nvidia
chmod 777 NVIDIA-Linux-x86_64-418.56.run


sudo service lightdm stop



blacklist nouveau
options nouveau modeset=0


sudo update-initramfs -u

Next, reboot the machine and run lsmod | grep nouveau to check that nouveau has been disabled. If the command prints nothing, nouveau is disabled.
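The nouveau steps above can be sketched as a small script. The blacklist file path is an assumption (the text does not say which file the two lines go into), and the helper names `blacklist_content` and `nouveau_loaded` are hypothetical. Commands that need root are left as comments.

```shell
# Assumed location for the blacklist file; adjust if your system uses another one.
BLACKLIST_FILE="${BLACKLIST_FILE:-/etc/modprobe.d/blacklist-nouveau.conf}"

# Prints the two lines that disable nouveau.
blacklist_content() {
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0'
}

# Takes lsmod output as its argument; returns 0 if the nouveau module is listed.
nouveau_loaded() {
    printf '%s\n' "$1" | grep -q '^nouveau[[:space:]]'
}

# Typical usage (requires root):
#   blacklist_content | sudo tee "$BLACKLIST_FILE" > /dev/null
#   sudo update-initramfs -u
#   sudo reboot
# After the reboot:
#   if nouveau_loaded "$(lsmod)"; then echo "nouveau still loaded"; else echo "nouveau disabled"; fi
```

The check only looks at module names at the start of a line, so other modules that merely list nouveau as a dependency do not count.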

Then switch to init 3 (text mode) and install the driver:

sudo init 3 # or: sudo service lightdm stop
sudo ./NVIDIA-Linux-x86_64-418.56.run
# Note: the command above may fail. If you are already in init 3 but the installer still reports that the X server is running, run this instead:
sudo ./NVIDIA-Linux-x86_64-418.56.run -no-x-check -no-nouveau-check -no-opengl-files

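-no-x-check do not check for a running X server during installation
-no-nouveau-check do not check for nouveau during installation
-no-opengl-files install only the driver files, not the OpenGL files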
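Before falling back to -no-x-check, it can help to see whether an X server process is actually still alive. A minimal sketch, assuming `pgrep` (from procps) is available and that the server process is named Xorg or X; the function name is hypothetical:

```shell
# Returns 0 if a process named exactly Xorg or X is running.
x_server_running() {
    pgrep -x Xorg > /dev/null 2>&1 || pgrep -x X > /dev/null 2>&1
}

if x_server_running; then
    x_status="X server is running: stop the display manager first (e.g. sudo service lightdm stop)"
else
    x_status="no X server process found: the installer's X check should pass"
fi
echo "$x_status"
```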

For more about init runlevels, see:



(base) mcj@mcj2080:~$ nvidia-smi
Mon Jan 18 16:42:28 2021       
| NVIDIA-SMI 418.56       Driver Version: 418.56       CUDA Version: 10.1     |
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|   0  GeForce RTX 208...  Off  | 00000000:04:00.0  On |                  N/A |
| 27%   34C    P8    21W / 250W |     53MiB / 10986MiB |      0%      Default |
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|    0      4591      G   /usr/lib/xorg/Xorg                            51MiB |
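To confirm the installed driver version from a script rather than by eye, the header line of the nvidia-smi output can be parsed. This is a sketch; `parse_driver_version` is a hypothetical helper, and the live check is skipped when nvidia-smi is not on the PATH.

```shell
# Reads nvidia-smi output on stdin and prints the value after "Driver Version:".
parse_driver_version() {
    sed -n 's/.*Driver Version: \([0-9.]*\).*/\1/p'
}

if command -v nvidia-smi > /dev/null 2>&1; then
    echo "driver version: $(nvidia-smi | parse_driver_version)"
else
    echo "nvidia-smi not found: the driver is not installed or not on PATH"
fi
```

nvidia-smi can also print the value directly with `nvidia-smi --query-gpu=driver_version --format=csv,noheader`.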


# To uninstall the driver:
sudo ./NVIDIA-Linux-x86_64-418.56.run --uninstall

2 Install CUDA 10.1



cd /home/ubuntu/nvidia
chmod 777 cuda_10.1.105_418.39_linux
sudo ./cuda_10.1.105_418.39_linux










Next, add the environment variables; otherwise nvcc -V will not work:

sudo nano /etc/profile
# Append at the end:
export PATH=/usr/local/cuda-10.1/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-10.1/lib64:$LD_LIBRARY_PATH
# Press Ctrl+X, then Y and Enter, to save and exit
source /etc/profile
sudo nano /etc/ld.so.conf
# Append at the end:
# Press Ctrl+X, then Y and Enter, to save and exit
sudo ldconfig
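The same configuration can be done without an interactive editor. This is only a sketch: `CUDA_DIR` and `cuda_env_lines` are hypothetical names, and it assumes that the line meant for the linker configuration is the toolkit's lib64 directory, written here to a separate file under /etc/ld.so.conf.d (also an assumption). The root-only steps are left as comments.

```shell
CUDA_DIR="${CUDA_DIR:-/usr/local/cuda-10.1}"

# Prints the two export lines for the chosen toolkit directory.
# $PATH and $LD_LIBRARY_PATH stay literal so they expand at login time.
cuda_env_lines() {
    printf 'export PATH=%s/bin:$PATH\n' "$CUDA_DIR"
    printf 'export LD_LIBRARY_PATH=%s/lib64:$LD_LIBRARY_PATH\n' "$CUDA_DIR"
}

cuda_env_lines

# Typical usage (requires root):
#   cuda_env_lines | sudo tee -a /etc/profile > /dev/null
#   echo "$CUDA_DIR/lib64" | sudo tee /etc/ld.so.conf.d/cuda.conf > /dev/null
#   sudo ldconfig
#   source /etc/profile && nvcc -V
```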

OK, now check the result. Run ldconfig -v|grep cuda; the output should look like this:

ubuntu@mcj:~$ ldconfig -v|grep cuda
/sbin/ldconfig.real: Can't stat /usr/local/lib/x86_64-linux-gnu: No such file or directory
/sbin/ldconfig.real: Can't stat /lib32: No such file or directory
/sbin/ldconfig.real: Path `/lib/x86_64-linux-gnu' given more than once
/sbin/ldconfig.real: Path `/usr/lib/x86_64-linux-gnu' given more than once
/sbin/ldconfig.real: /lib/x86_64-linux-gnu/ld-2.27.so is the dynamic linker, ignoring

        libcuda.so.1 -> libcuda.so.418.56
        libicudata.so.55 -> libicudata.so.55.1
        libcuda.so.1 -> libcuda.so.418.56
/sbin/ldconfig.real: Can't create temporary cache file /etc/ld.so.cache~: Permission denied
        libcudart.so.10.1 -> libcudart.so.10.1
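The "Can't stat" and "Permission denied" lines in the output above are warnings that ldconfig -v prints when run without root. A quieter check, sketched below, reads the existing linker cache with `ldconfig -p`; the helper name `has_lib` is hypothetical.

```shell
# Returns 0 if the linker cache lists a library whose name contains $1.
has_lib() {
    ldconfig -p 2> /dev/null | grep -qF "$1"
}

report=""
for lib in libcuda.so libcudart.so; do
    if has_lib "$lib"; then
        report="${report}${lib}: found
"
    else
        report="${report}${lib}: missing
"
    fi
done
printf '%s' "$report"
```

libcuda.so comes from the driver and libcudart.so from the toolkit, so a missing entry shows which of the two installs to revisit.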


sudo apt-get update
sudo apt-get install build-essential
sudo apt-get install freeglut3-dev libx11-dev libxmu-dev libxi-dev libgl1-mesa-glx libglu1-mesa libglu1-mesa-dev



cd ~/NVIDIA_CUDA-10.1_Samples/1_Utilities/deviceQuery
sudo make


ubuntu@mcj:~/NVIDIA_CUDA-9.0_Samples/1_Utilities/deviceQuery$ sudo make 
"/usr/local/cuda-9.0"/bin/nvcc -ccbin g++ -I../../common/inc  -m64    -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_70,code=compute_70 -o deviceQuery.o -c deviceQuery.cpp
"/usr/local/cuda-9.0"/bin/nvcc -ccbin g++   -m64      -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_70,code=compute_70 -o deviceQuery deviceQuery.o 
mkdir -p ../../bin/x86_64/linux/release
cp deviceQuery ../../bin/x86_64/linux/release


ubuntu@mcj:~/NVIDIA_CUDA-9.0_Samples/1_Utilities/deviceQuery$ ./deviceQuery 
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 2 CUDA Capable device(s)

Device 0: "GeForce GTX 1080 Ti"
  CUDA Driver Version / Runtime Version          9.1 / 9.0
  CUDA Capability Major/Minor version number:    6.1
  Total amount of global memory:                 11175 MBytes (11718230016 bytes)
  (28) Multiprocessors, (128) CUDA Cores/MP:     3584 CUDA Cores
  GPU Max Clock rate:                            1582 MHz (1.58 GHz)
  Memory Clock rate:                             5505 Mhz
  Memory Bus Width:                              352-bit
  L2 Cache Size:                                 2883584 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Supports Cooperative Kernel Launch:            Yes
  Supports MultiDevice Co-op Kernel Launch:      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

Device 1: "GeForce GTX 1080 Ti"
  CUDA Driver Version / Runtime Version          9.1 / 9.0
  CUDA Capability Major/Minor version number:    6.1
  Total amount of global memory:                 11178 MBytes (11721506816 bytes)
  (28) Multiprocessors, (128) CUDA Cores/MP:     3584 CUDA Cores
  GPU Max Clock rate:                            1582 MHz (1.58 GHz)
  Memory Clock rate:                             5505 Mhz
  Memory Bus Width:                              352-bit
  L2 Cache Size:                                 2883584 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Supports Cooperative Kernel Launch:            Yes
  Supports MultiDevice Co-op Kernel Launch:      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 2 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
> Peer access from GeForce GTX 1080 Ti (GPU0) -> GeForce GTX 1080 Ti (GPU1) : Yes
> Peer access from GeForce GTX 1080 Ti (GPU1) -> GeForce GTX 1080 Ti (GPU0) : Yes

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.1, CUDA Runtime Version = 9.0, NumDevs = 2
Result = PASS

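If you see this output ending in Result = PASS, CUDA has been installed successfully. (The log above was captured with the earlier CUDA 9.0 setup, so its version numbers and GPU model differ from the current install.)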
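The PASS check can also be scripted. A minimal sketch, assuming the samples were built in the home directory as in the log above; `SAMPLE_BIN` and `device_query_passed` are hypothetical names:

```shell
SAMPLE_BIN="${SAMPLE_BIN:-$HOME/NVIDIA_CUDA-10.1_Samples/1_Utilities/deviceQuery/deviceQuery}"

# Reads deviceQuery output on stdin; succeeds only if it contains the PASS verdict.
device_query_passed() {
    grep -q '^Result = PASS'
}

if [ -x "$SAMPLE_BIN" ]; then
    if "$SAMPLE_BIN" | device_query_passed; then
        echo "deviceQuery reported PASS"
    else
        echo "deviceQuery did not report PASS"
    fi
else
    echo "deviceQuery has not been built: $SAMPLE_BIN"
fi
```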

3 Install cuDNN 7.6.5


cd /home/ubuntu/nvidia
tar -xvf cudnn-10.1-linux-x64-v7.6.5.32.tgz
cd cuda
sudo cp ./include/cudnn.h /usr/local/cuda-10.1/include
# For cuDNN 8.x, change the command above to: sudo cp ./include/cudnn* /usr/local/cuda-11.1/include
sudo cp -a ./lib64/libcudnn* /usr/local/cuda-10.1/lib64




cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
# For 8.x, verify with the following command instead:
cat /usr/local/cuda/include/cudnn_version.h | grep CUDNN_MAJOR -A 2
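The two verification commands can be folded into one helper that prints the version as a single string. This is a sketch; `cudnn_version_from_header` is a hypothetical name, and the header paths are the ones used in the commands above. cudnn_version.h is tried first because the version macros live there in 8.x.

```shell
# Prints MAJOR.MINOR.PATCHLEVEL from a cuDNN header, or nothing if the macros are absent.
cudnn_version_from_header() {
    awk '$1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
         $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
         $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
         END { if (major != "") print major "." minor "." patch }' "$1"
}

for header in /usr/local/cuda/include/cudnn_version.h /usr/local/cuda/include/cudnn.h; do
    if [ -f "$header" ]; then
        version="$(cudnn_version_from_header "$header")"
        if [ -n "$version" ]; then
            echo "cuDNN $version (from $header)"
            break
        fi
    fi
done
```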




4 Other versions



# CUDA configuration
export PATH=$PATH:/A-pool/cuda/cuda-9.0/bin:/A-pool/cuda/cuda-9.0/include
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/A-pool/cuda/cuda-9.0/lib64
export CUDA_HOME=/A-pool/cuda/cuda-9.0
# cuDNN configuration
export PATH=$PATH:/A-pool/cudnn/cuda-9.0-cudnn-7.0/include
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/A-pool/cudnn/cuda-9.0-cudnn-7.0/lib6
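The exports above appear to point the shell at a CUDA install in a non-default prefix. If several versions sit side by side, a small function can switch between them per shell session. This is a sketch under that assumption; `use_cuda` is a hypothetical helper, and it prepends rather than appends so the chosen version wins over any other one already on the PATH.

```shell
# Point the current shell at one CUDA install; $1 is the toolkit prefix.
use_cuda() {
    export CUDA_HOME="$1"
    export PATH="$CUDA_HOME/bin:$PATH"
    # Avoid a trailing colon when LD_LIBRARY_PATH was previously empty.
    export LD_LIBRARY_PATH="$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
}

# Example with the prefix used in the exports above:
use_cuda /A-pool/cuda/cuda-9.0
echo "CUDA_HOME=$CUDA_HOME"
```

Running `nvcc -V` afterwards shows which toolkit the shell now resolves.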




