Installing a Reinforcement Learning Environment on Ubuntu 20.04 (CUDA, Conda)

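1. Disable the original driver

To prevent a black screen, open the GRUB configuration with either editor:

sudo vim /etc/default/grub

sudo gedit /etc/default/grub

# In the opened file, find the GRUB_CMDLINE_LINUX_DEFAULT line and append nomodeset after quiet splash, separated by a space (to be safe, leave one more space after nomodeset). Save the file, run sudo update-grub in a terminal, and reboot.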
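The edit described above can also be written as a small text transformation. This is only a sketch: add_nomodeset is a hypothetical helper, and it works on a string rather than on /etc/default/grub itself, so nothing on the system is changed by running it.

```python
import re

def add_nomodeset(grub_text: str) -> str:
    """Return grub_text with nomodeset appended to GRUB_CMDLINE_LINUX_DEFAULT."""
    pattern = re.compile(r'^(GRUB_CMDLINE_LINUX_DEFAULT=")([^"]*)(")', re.MULTILINE)

    def patch(match):
        options = match.group(2).split()
        if "nomodeset" not in options:  # do not add the option twice
            options.append("nomodeset")
        return match.group(1) + " ".join(options) + match.group(3)

    return pattern.sub(patch, grub_text)

# Hypothetical excerpt of /etc/default/grub, used only for illustration.
sample = 'GRUB_DEFAULT=0\nGRUB_CMDLINE_LINUX_DEFAULT="quiet splash"\n'
print(add_nomodeset(sample))
```

If you apply the same change to the real file, sudo update-grub and a reboot are still required afterwards.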

2. sudo gedit /etc/modprobe.d/blacklist.conf

Add the following at the end of the file

blacklist nouveau

options nouveau modeset=0

Then run in a terminal

sudo update-initramfs -u

sudo reboot

3. Verify

lsmod | grep nouveau

If nothing is printed, nouveau has been disabled successfully
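The same check can be done programmatically. A minimal sketch, assuming you have captured the text printed by lsmod; nouveau_loaded is a hypothetical helper and the sample strings below are made up for illustration.

```python
def nouveau_loaded(lsmod_output: str) -> bool:
    """True if a module named nouveau appears in the first column of lsmod output."""
    for line in lsmod_output.splitlines():
        fields = line.split()
        if fields and fields[0] == "nouveau":
            return True
    return False

# Made-up lsmod excerpts: one with nouveau still loaded, one without.
before = "Module                  Size  Used by\nnouveau              1949696  1\n"
after = "Module                  Size  Used by\nvideo                  49152  0\n"
print(nouveau_loaded(before), nouveau_loaded(after))
```

Unlike grep, this only looks at the module name column, so a module that merely lists nouveau under "Used by" does not count as a match.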

2. Install the driver

1. Find your GPU's device ID

lspci | grep -i vga

%%

0000:01:00.0 VGA compatible controller: NVIDIA Corporation Device 25a0 (rev a1)

%%

Look up the device ID 25a0 at https://admin.pci-ids.ucw.cz/mods/PC/10de?action=help?help=pci

Result:
Name: GA107M [GeForce RTX 3050 Ti Mobile]
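When lspci only prints "Device 25a0" instead of a model name, the four hex digits are what you paste into the lookup site. A small sketch for pulling them out of the line; pci_device_id is a hypothetical helper written for this example.

```python
import re

def pci_device_id(lspci_line: str):
    """Return the 4-digit hex device ID from an lspci line, or None if absent."""
    match = re.search(r"Device ([0-9a-fA-F]{4})\b", lspci_line)
    return match.group(1).lower() if match else None

line = "0000:01:00.0 VGA compatible controller: NVIDIA Corporation Device 25a0 (rev a1)"
print(pci_device_id(line))  # 25a0
```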

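2. Download the driver

Download The Official NVIDIA Drivers | NVIDIA (may require a proxy or VPN depending on your network)

Download the driver that matches the GPU model found in step 1

3. Install dependencies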
sudo apt-get update

sudo apt-get install g++

sudo apt-get install gcc

sudo apt-get install make

4. Grant execute permission

// make the installer executable

sudo chmod a+x ./NVIDIA-Linux-x86_64-570.144.run

5. Run the installer

sudo ./NVIDIA-Linux-x86_64-570.144.run

6. Installer prompts

1. Multiple kernel module types are available for this system. Which would you like to use?

NVIDIA Proprietary | MIT/GPL: choose the left option (NVIDIA Proprietary)

2. There appears to already be a driver installed on your system (version: 570.144). As part of installing this driver (version: 570.144), the existing driver will be uninstalled. Are you sure you want to continue?

Continue installation | Abort installation: choose the left option (Continue installation)

3. Install NVIDIA's 32-bit compatibility libraries?

Yes | No: choose the left option (Yes)

4. The initramfs will likely need to be rebuilt due to the following condition(s): * Nouveau is present in the initramfs. Would you like to rebuild the initramfs?

Do not rebuild initramfs | Rebuild initramfs: choose the right option (Rebuild initramfs)

5. Would you like to run the nvidia-xconfig utility to automatically update your X configuration file so that the NVIDIA X driver will be used when you restart X? Any pre-existing X configuration file will be backed up.

Yes | No: choose the left option (Yes)

6. Your X configuration file has been successfully updated. Installation of the NVIDIA Accelerated Graphics Driver for Linux-x86_64 (version: 570.144) is now complete.

OK: the installation is finished

7. Verify

nvidia-smi

Output like the following means the driver was installed successfully

Sun Apr 27 09:41:38 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.144                Driver Version: 570.144        CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3050 ...    Off |   00000000:01:00.0 Off |                  N/A |
| N/A   61C    P0             17W /   80W |       0MiB /   4096MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
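The two values that matter later in this guide are the driver version and the CUDA version in the header row. Here is a minimal sketch for extracting them from captured nvidia-smi text; parse_smi_header is a hypothetical helper, and the header string is copied from the output above.

```python
import re

def parse_smi_header(text: str) -> dict:
    """Extract the driver and CUDA versions from nvidia-smi's header text."""
    driver = re.search(r"Driver Version:\s*([\d.]+)", text)
    cuda = re.search(r"CUDA Version:\s*([\d.]+)", text)
    return {
        "driver": driver.group(1) if driver else None,
        "cuda": cuda.group(1) if cuda else None,
    }

header = "| NVIDIA-SMI 570.144 Driver Version: 570.144 CUDA Version: 12.8 |"
print(parse_smi_header(header))  # {'driver': '570.144', 'cuda': '12.8'}
```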

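3. CUDA and cuDNN setup
0. Uninstall procedure (only needed if an older CUDA is already installed)

Uninstall CUDA (replace xx.x with the installed version)

cd /usr/local/cuda-xx.x/bin

sudo ./cuda-uninstaller

sudo rm -rf /usr/local/cuda-xx.x
Uninstall cuDNN

sudo rm -rf /usr/local/cuda/include/cudnn*.h

sudo rm -rf /usr/local/cuda/lib64/libcudnn*
Verify

nvcc -V
If the command is not found, the uninstall succeeded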

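1. Get the installer from the CUDA Toolkit Archive | NVIDIA Developer site

Note that the version you download must not be higher than the CUDA Version reported by the earlier command

nvidia-smi

Mine shows 12.8, so I choose 12.8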
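The "not higher than" rule is a numeric comparison on major.minor, not a string comparison (as strings, "12.10" would sort before "12.8"). A small sketch under that assumption; both helper names are hypothetical.

```python
def version_tuple(version: str) -> tuple:
    """Turn '12.8.0' into (12, 8, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))

def toolkit_supported(toolkit: str, driver_cuda: str) -> bool:
    """True if the toolkit's major.minor does not exceed what nvidia-smi reports."""
    return version_tuple(toolkit)[:2] <= version_tuple(driver_cuda)[:2]

print(toolkit_supported("12.8.0", "12.8"))  # True
print(toolkit_supported("12.9.0", "12.8"))  # False
```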

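2. Check that the installed driver version is present

ls /usr/src | grep nvidia
nvidia-570.144

If the output lists the NVIDIA driver version you just installed, everything is in order and you can continue

3. Check the Ubuntu version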

lsb_release -a
%%

No LSB modules are available.

Distributor ID: Ubuntu

Description: Ubuntu 20.04.6 LTS

Release: 20.04

Codename: focal

%%

4. On the same site, choose the run file installer

Follow the commands shown on the page:

wget https://developer.download.nvidia.com/compute/cuda/12.8.0/local_installers/cuda_12.8.0_570.86.10_linux.run
# Run the installer once, using either of the two equivalent forms:
sudo sh cuda_12.8.0_570.86.10_linux.run
# or: chmod +x ./cuda_12.8.0_570.86.10_linux.run && sudo ./cuda_12.8.0_570.86.10_linux.run

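5. Run through the installer

Select Continue

Type accept

On the component selection screen, make sure to uncheck the NVIDIA driver entry, because the driver was already installed manually above and does not need to be installed again. The last entry, nvidia-fs, is a kernel object for the NVIDIA file system and can be left uninstalled for now.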
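The run file's name suggests why the driver entry is offered at all: it appears to carry both a toolkit version and a bundled driver version. The sketch below assumes the naming pattern cuda_<toolkit>_<driver>_linux.run, inferred only from the file name used in this guide; parse_runfile is a hypothetical helper.

```python
import re

def parse_runfile(name: str):
    """Split a CUDA run file name into toolkit and bundled driver versions.

    Assumes the pattern cuda_<toolkit>_<driver>_linux.run; returns None otherwise.
    """
    match = re.fullmatch(r"cuda_([\d.]+)_([\d.]+)_linux\.run", name)
    if match is None:
        return None
    return {"toolkit": match.group(1), "bundled_driver": match.group(2)}

print(parse_runfile("cuda_12.8.0_570.86.10_linux.run"))
# {'toolkit': '12.8.0', 'bundled_driver': '570.86.10'}
```

Under that reading, the bundled driver (570.86.10) is older than the manually installed 570.144, which is one more reason to leave the driver entry unchecked.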

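The installation takes a while.

When it finishes, output like the following indicates that the installation succeeded

[Screenshot: CUDA installer summary]

6. Configure environment variables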

gedit ~/.bashrc
Append the following two lines (replace 12.8 with your own CUDA version)

export PATH=/usr/local/cuda-12.8/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-12.8/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
Save the file, then reload it in the terminal:

source ~/.bashrc
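The ${VAR:+:${VAR}} form in the two export lines adds the colon separator only when the variable already has a value. That avoids a trailing colon, which would leave an empty entry in the search path. The sketch below mirrors that logic in Python for illustration; prepend_path is a hypothetical helper, and the paths are the CUDA 12.8 ones used elsewhere in this guide.

```python
def prepend_path(new_dir: str, current: str) -> str:
    """Mimic new_dir${VAR:+:${VAR}}: add ':' and the old value only if it is non-empty."""
    return new_dir + (":" + current if current else "")

# LD_LIBRARY_PATH is often unset, so no trailing colon should appear.
print(prepend_path("/usr/local/cuda-12.8/lib64", ""))
# PATH normally has a value, so the new directory goes in front of it.
print(prepend_path("/usr/local/cuda-12.8/bin", "/usr/bin:/bin"))
```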

7. Verify

nvcc -V

Output like the following confirms that the installation succeeded

[Screenshot: nvcc -V output]

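8. Set up cuDNN

Go to cuDNN Archive | NVIDIA Developer

Since CUDA was installed from a run file above, installing cuDNN from the tar package is recommended: Local Installer for Linux x86_64 (Tar)

Extract

tar -xvf cudnn-linux-x86_64-8.9.7.29_cuda12-archive.tar.xz

Copy the files and set permissions (note: replace the CUDA and cuDNN versions with your own)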

sudo cp cudnn-linux-x86_64-8.9.7.29_cuda12-archive/include/cudnn* /usr/local/cuda-12.8/include

sudo cp -P cudnn-linux-x86_64-8.9.7.29_cuda12-archive/lib/libcudnn* /usr/local/cuda-12.8/lib64

sudo chmod a+r /usr/local/cuda-12.8/include/cudnn*.h /usr/local/cuda-12.8/lib64/libcudnn*

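Verify

For cuDNN 8 and later, the version macros are in cudnn_version.h:

cat /usr/local/cuda/include/cudnn_version.h | grep CUDNN_MAJOR -A 2

For older cuDNN releases, they are in cudnn.h:

cat /usr/local/cuda-xx.x/include/cudnn.h | grep CUDNN_MAJOR -A 2
Output like the following indicates success

[Screenshot: CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL definitions]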
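The grep above prints three #define lines; joining them gives the cuDNN version. A minimal sketch of that step, assuming the header text has already been read into a string. cudnn_version is a hypothetical helper, and the sample is a made-up excerpt consistent with the 8.9.7 archive used in this guide.

```python
import re

def cudnn_version(header_text: str):
    """Build 'major.minor.patch' from the CUDNN_* macros, or None if one is missing."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(rf"^#define\s+{name}\s+(\d+)", header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(match.group(1))
    return ".".join(parts)

# Illustrative excerpt, not copied from a real header file.
sample = "#define CUDNN_MAJOR 8\n#define CUDNN_MINOR 9\n#define CUDNN_PATCHLEVEL 7\n"
print(cudnn_version(sample))  # 8.9.7
```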

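4. Conda setup
1. Download Anaconda

Download the Linux version from the Download Now | Anaconda site

2. Install

bash ./Anaconda3-2024.10-1-Linux-x86_64.sh
Press Enter to proceed; when the license is shown, press q to leave the viewer, type yes to accept, then keep pressing Enter to accept the defaults

A completion message from the installer indicates that the installation succeeded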

3. Finally, reload the shell configuration

source ~/.bashrc
4. Verify conda

Output like the following indicates that the installation succeeded

[Screenshot: conda verification output]

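5. If you get the error "conda: command not found"

gedit ~/.bashrc
Add the following at the end of the file, where /path/to/conda is your conda installation directory (~/anaconda3 by default)

export PATH="/path/to/conda/bin:$PATH"
In my case: export PATH="$HOME/anaconda3/bin:$PATH" (use $HOME rather than ~, because ~ is not expanded inside double quotes)

source ~/.bashrc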
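To see whether the PATH change took effect, you can check two things: whether the conda bin directory is one of the PATH entries, and whether the shell can resolve conda at all. A small sketch using only the standard library; dir_on_path is a hypothetical helper, and ~/anaconda3/bin is the default location assumed in this guide.

```python
import os
import shutil

def dir_on_path(directory: str, path_value: str) -> bool:
    """True if directory is one of the entries in a PATH-style string."""
    target = os.path.normpath(directory)
    entries = [os.path.normpath(entry) for entry in path_value.split(os.pathsep) if entry]
    return target in entries

conda_bin = os.path.expanduser("~/anaconda3/bin")  # assumed default install location
print(dir_on_path(conda_bin, os.environ.get("PATH", "")))
print(shutil.which("conda"))  # full path to conda, or None if it cannot be found
```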
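6. Basic commands

Create an environment (named env here, with Python 3.9)

conda create -n env python=3.9

List all environments

conda env list

Activate an environment

conda activate env

Leave the current environment

conda deactivate

Delete an environment

conda remove --name env --all

List installed packages

conda list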

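5. Install torch
Activate your conda environment, then install the torch build that matches your CUDA version from the site below

Previous PyTorch Versions | PyTorch

I have CUDA 12.8, so I install the newest build whose CUDA version does not exceed 12.8 (cu126 here)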

pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu126
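The wheel index URL ends in a tag such as cu126, meaning CUDA 12.6. Picking "the newest build not above the driver's CUDA version" can be sketched as below. The helper names are hypothetical, and the candidate list is illustrative; check the PyTorch page for the tags actually published for your torch version.

```python
def tag_version(tag: str) -> tuple:
    """Convert a wheel tag like 'cu126' into (12, 6)."""
    digits = tag[2:]  # drop the 'cu' prefix
    return int(digits[:2]), int(digits[2:])

def pick_cuda_tag(driver_cuda: str, available: list):
    """Return the highest tag whose CUDA version does not exceed driver_cuda."""
    limit = tuple(int(part) for part in driver_cuda.split(".")[:2])
    usable = [tag for tag in available if tag_version(tag) <= limit]
    return max(usable, key=tag_version) if usable else None

# Illustrative candidates only.
print(pick_cuda_tag("12.8", ["cu118", "cu124", "cu126"]))  # cu126
```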
Open a terminal

conda activate env
Start the Python interactive shell

python

Python 3.9.21 (main, Dec 11 2024, 16:24:11)
[GCC 11.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> print(torch.__version__)
2.6.0+cu126
>>> print(torch.cuda.is_available())
True
>>>
If your output matches the above (a version string followed by True), the installation succeeded
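The interactive check can also be saved as a script so it can be rerun after driver or CUDA updates. A minimal sketch: torch_report is a hypothetical helper that falls back gracefully when torch is not installed in the active environment.

```python
def torch_report() -> dict:
    """Summarize whether torch is importable and whether it can see a CUDA device."""
    try:
        import torch
    except ImportError:
        return {"installed": False, "version": None, "cuda_available": False}
    return {
        "installed": True,
        "version": str(torch.__version__),
        "cuda_available": bool(torch.cuda.is_available()),
    }

print(torch_report())
```

With the setup from this guide you would expect installed and cuda_available to both be True; if cuda_available is False, recheck the driver with nvidia-smi and the wheel tag used for the pip install.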
