Selecting the right version of Ubuntu is important. Deep learning software such as CUDA and NCCL ships builds for specific Ubuntu versions, and not every Ubuntu version is supported. As of 2025, Ubuntu 22.04 and Ubuntu 20.04 are the safe choices.
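As a quick sanity check before installing anything, the rule above can be sketched as a small shell helper. The function name is_supported_release and its hard-coded list are my own assumptions, taken only from the recommendation above; neither CUDA nor NCCL provides such a check, so always confirm against their download pages.

```shell
# Hypothetical helper: succeed only for the Ubuntu releases recommended above.
is_supported_release() {
    case "$1" in
        20.04|22.04) return 0 ;;
        *) return 1 ;;
    esac
}

# /etc/os-release defines VERSION_ID on Ubuntu (e.g. 22.04); fall back to empty.
current_release="$(. /etc/os-release 2>/dev/null && echo "${VERSION_ID:-}")" || current_release=""

if is_supported_release "$current_release"; then
    echo "Ubuntu $current_release is one of the recommended releases."
else
    echo "Release '$current_release' is not in the recommended list."
fi
```

The check only compares version strings; it says nothing about whether a particular CUDA or NCCL build exists for that release.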
After installing the OS, run a full upgrade of the kernel and packages. Otherwise, newly installed packages could conflict with outdated ones.
sudo apt update
sudo apt full-upgrade --yes
sudo apt autoremove --yes
sudo apt autoclean --yes
reboot
How to Install and Set Up the Fish Shell | by Saad Jamil | Medium
sudo apt-add-repository ppa:fish-shell/release-3
sudo apt-get update
sudo apt-get install fish
fish
To launch fish automatically in the Bash shell, add these lines to the end of ~/.bashrc:
# auto launch fish shell
fish
P.S. ~/.bashrc is executed whenever a new bash shell is launched. Please note that all commands after this line will only be executed after you exit the fish shell. Therefore, make sure that this line is at the end of ~/.bashrc.
I really love the Pastel Powerline preset of Starship. Install Starship first:
curl -sS https://starship.rs/install.sh | sh
Add the following to the end of ~/.config/fish/config.fish:
# init starship
starship init fish | source
Then open a new terminal and apply the preset:
starship preset pastel-powerline -o ~/.config/starship.toml
P.S. If you want to add conda & python information to the command prompt, download my starship.toml and use it to replace ~/.config/starship.toml.
Note that the preset requires a Nerd Font. I personally prefer CaskaydiaCove Nerd Font. After installing it, set it as the font used in the terminal: CaskaydiaCove Nerd Font.
Download VS Code's Linux version and then:
sudo apt install <file>.deb
sudo apt-get install git
Let's configure the default user name and email. Note that when you push commits to GitHub, the email is used to identify your GitHub account:
git config --global user.name "<name>"
git config --global user.email "<email>"
git config --global init.defaultBranch main
To authorize your operations on GitHub, you will also need to generate an SSH key:
ssh-keygen -t rsa -C "<email>"
Then add it to your account: Settings > SSH and GPG keys > Add SSH Key. Fill in the title as you like and paste the content of the generated id_rsa.pub (NOT id_rsa!). The content of id_rsa.pub can be printed from the command line:
cat ~/.ssh/id_rsa.pub
# Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
# Add the repository to Apt sources:
# P.S. Bash shell needed.
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
$(. /etc/os-release && echo "${UBUNTU_CODENAME:-$VERSION_CODENAME}") stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
for pkg in docker.io docker-doc docker-compose docker-compose-v2 podman-docker containerd runc; do sudo apt-get remove $pkg; done
P.S. Run this in the bash shell instead of the fish shell.
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
sudo docker run hello-world
sudo apt-get update
sudo apt-get install docker-compose-plugin
docker compose version
curl -sL "https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh" > "Miniconda3.sh"
bash Miniconda3.sh
P.S. If you enter the fish shell from the bash shell, the previous commands only initialize conda for the bash shell. To initialize conda for the fish shell as well, run the following command in the bash shell:
conda init fish
Install the newest NVIDIA driver compatible with your GPU. You don't need to worry about its compatibility with CUDA, since the driver is backward-compatible: a newer driver still runs older CUDA versions.
Download The Official NVIDIA Drivers | NVIDIA
NVIDIA drivers installation - Ubuntu Server documentation
sudo apt install nvidia-driver-550
reboot
Verify:
lsmod | grep nvidia
nvidia-smi
P.S. If you have multiple GPUs installed, you can inspect how they are connected via:
nvidia-smi topo -m
conda create -n pytorch python=3.10
conda activate pytorch
pip3 install torch torchvision torchaudio
It also installs the bundled CUDA runtime for you, so you don't have to install CUDA yourself. However, commands like nvcc will not be available. To verify the installation:
python
>>> import torch
>>> device = 'cuda' if torch.cuda.is_available() else 'cpu'
>>> torch.rand(5, 3).to(device)
P.S. If you just want to use PyTorch with CUDA, as noted above, you don't need to install CUDA yourself. However, if you want to compile PyTorch yourself or write custom CUDA code to boost performance, you will need to install the CUDA Toolkit.
P.S. Run nvidia-smi to see the highest CUDA version your current driver supports.
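The version appears in the header of the nvidia-smi output as a "CUDA Version: X.Y" field. As a minimal sketch, the hypothetical helper max_cuda_version below (my own name, not an NVIDIA tool) pulls that field out of whatever text is piped into it, assuming the header keeps this wording:

```shell
# Hypothetical helper: extract the "CUDA Version: X.Y" field from text on stdin.
max_cuda_version() {
    sed -n 's/.*CUDA Version: *\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Usage on a machine with the driver installed:
#   nvidia-smi | max_cuda_version
```

Compare the printed value with the toolkit version you plan to install; the toolkit should not be newer than what the driver reports.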
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/12.4.0/local_installers/cuda-repo-ubuntu2204-12-4-local_12.4.0-550.54.14-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2204-12-4-local_12.4.0-550.54.14-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu2204-12-4-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-4
Launch a new terminal and verify:
nvcc --version
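If nvcc is still not found, the toolkit's bin directory is probably not on your PATH. A minimal sketch, assuming the default install prefix /usr/local/cuda-12.4 for the packages above (check that the directory exists on your machine). Put these lines in ~/.bashrc before the line that launches fish, so that fish inherits them:

```shell
# Assumed default install prefix for cuda-toolkit-12-4; adjust if yours differs.
CUDA_HOME=/usr/local/cuda-12.4

# Prepend the CUDA binaries and libraries, keeping any existing entries.
export PATH="$CUDA_HOME/bin${PATH:+:$PATH}"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

The ${VAR:+:$VAR} form avoids leaving a stray trailing colon when the variable was empty.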
NCCL handles communication for multi-node and multi-GPU operation. Select the version that matches your CUDA version.
NVIDIA Collective Communications Library (NCCL) | NVIDIA Developer
Installation Guide | NVIDIA Deep Learning NCCL Documentation
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt install libnccl2=2.26.5-1+cuda12.4 libnccl-dev=2.26.5-1+cuda12.4
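The package version string pairs an NCCL release with a CUDA version, so both packages must carry the same suffix. The hypothetical helper nccl_pins below (my own name) only assembles that string from two arguments; the values shown are the ones used above. Run apt list -a libnccl2 to see which versions your repository actually offers.

```shell
# Hypothetical helper: build matching libnccl2/libnccl-dev version pins.
# $1 = NCCL package version (e.g. 2.26.5-1), $2 = CUDA version (e.g. 12.4)
nccl_pins() {
    local suffix="$1+cuda$2"
    echo "libnccl2=$suffix libnccl-dev=$suffix"
}

# Print the install command instead of running it, so it can be reviewed first.
echo "sudo apt install $(nccl_pins 2.26.5-1 12.4)"
```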
Tests:
sudo apt-get install openssh-server
code /etc/ssh/sshd_config
You could change the port number, say to 2222:
Port 2222
You may also want to enable the password login:
PasswordAuthentication yes
After altering the configuration, restart the ssh server:
sudo systemctl restart ssh
sudo apt install net-tools
ifconfig
Find the IP address. Then, from another device connected to the same network, you can SSH into the Ubuntu machine:
ssh <user>@<address>
P.S. If you've changed the port number:
ssh <user>@<address> -p <port>
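To avoid retyping the user, address, and port every time, you can store them under an alias in ~/.ssh/config on the client. The sketch below only prints such an entry; the helper name ssh_host_entry and the alias gpu-box are hypothetical, while Host, HostName, User, and Port are standard ssh_config keywords.

```shell
# Hypothetical helper: print an ssh_config entry for a host alias.
# $1 = alias, $2 = user, $3 = address, $4 = port
ssh_host_entry() {
    printf 'Host %s\n    HostName %s\n    User %s\n    Port %s\n' "$1" "$3" "$2" "$4"
}

# Review the output first, then append it to ~/.ssh/config yourself, e.g.:
#   ssh_host_entry gpu-box "<user>" "<address>" 2222 >> ~/.ssh/config
ssh_host_entry gpu-box "<user>" "<address>" 2222
```

Once the entry is in ~/.ssh/config with real values, ssh gpu-box is enough to connect.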
How To Configure SSH Key-Based Authentication on a Linux Server | DigitalOcean
Having to enter the password for each ssh login is not very convenient. We can make our life a little easier by setting up key authentication.
First generate a key:
ssh-keygen
It will prompt you to input the <key path> and the passphrase. Then you need to add your key to your local ssh client:
ssh-add <key path>
Note that you will need to add it again after you reboot your local machine. You may add this command to ~/.bashrc (or ~/.zshrc if you use macOS) for convenience.
Then send the key to the remote server:
ssh-copy-id -i <key path> -p <port> <user>@<address>
Note that the -p argument is the port number of the remote server you set before. Moreover, <address> is the IP address of the remote server, which can be checked with ifconfig.
Before you can log in to the remote server without entering the password, you will need to enable key authentication first:
code /etc/ssh/sshd_config
Then uncomment the line PubkeyAuthentication yes. For security, it's preferable to disable password authentication once key authentication is set up: change the line PasswordAuthentication yes to PasswordAuthentication no.
And then:
sudo systemctl restart sshd
If everything is set up correctly, you will no longer need to enter the password the next time you ssh into the remote machine.
ngrok
To access the ssh host from the internet, we need to expose it. One simple and free way is to use ngrok:
Sign up for an ngrok account and install ngrok following its official guide.
Configuration File | ngrok documentation
Version 3 | ngrok documentation
Add an ssh tunnel setting to ~/.config/ngrok/ngrok.yml:
tunnels:
  ssh:
    proto: tcp
    addr: 22
ngrok service install --config ~/.config/ngrok/ngrok.yml
ngrok service start
tmux
When using ssh for remote development, all running processes are terminated once you disconnect from the host. This is frustrating when you have something that could take hours or days to complete (e.g. training a neural network) or when your network is unstable. In that case, you need tmux.
sudo apt update
sudo apt install tmux
Enable mouse support in tmux:
touch ~/.tmux.conf
echo "set -g mouse on" >> ~/.tmux.conf
First start a tmux session:
tmux
Or start with a custom name:
tmux new -s mysession
It launches a terminal just like an ordinary one. When you disconnect from the host, the processes running in that terminal continue to run. To view the status of all running sessions:
tmux ls
When you reconnect to the host, you can re-enter a tmux session (the most recent one, or a specific one by name):
tmux attach
tmux attach -t <name>
Kill a specific session:
tmux kill-session -t <name>
Kill all sessions (except the session you are in):
tmux kill-session -a
I use the PyVista package quite a lot for 3D mesh rendering. When it runs on a remote ssh host, things can get a little difficult. Here is a recipe that works:
Installation — PyVista 0.45.2 documentation
python - PyVista plotting issue in Visual Studio Code using WSL 2 with Ubuntu 22.04.4 - Stack Overflow
conda create -n vtk python=3.9
conda activate vtk
pip install "pyvista[jupyter]" ipykernel
Then in the .ipynb notebook in VS Code connected to the host via SSH:
import pyvista as pv
pv.set_jupyter_backend('html')
# example
pl = pv.Plotter()
pl.add_mesh(
    mesh=pv.read('output/x.obj'),
    texture=pv.read_texture('output/texture.png'),
)
pl.show()
wget https://glados.one/tools/clash-verge_1.3.8_amd64.deb
sudo apt install ./clash-verge_1.3.8_amd64.deb
Enable System Proxy and Auto Launch in the settings.
Add these lines to ~/.bashrc:
# proxy setting
export https_proxy=http://127.0.0.1:7890 http_proxy=http://127.0.0.1:7890 all_proxy=socks5://127.0.0.1:7890
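If you don't want the proxy enabled in every session, one option is to wrap the same variables in a pair of toggle functions. This is a minimal sketch: proxy_on and proxy_off are hypothetical names of my own, and 127.0.0.1:7890 is simply the local address used above.

```shell
# Hypothetical helpers for ~/.bashrc: switch the proxy variables on and off.
proxy_on() {
    export http_proxy=http://127.0.0.1:7890
    export https_proxy=http://127.0.0.1:7890
    export all_proxy=socks5://127.0.0.1:7890
}

proxy_off() {
    unset http_proxy https_proxy all_proxy
}
```

These are bash functions, so they are not visible inside a fish shell started from bash. Call proxy_on before the line that launches fish if fish should inherit the variables.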
apt with proxy
Add this line to /etc/apt/apt.conf:
Acquire::http::Proxy "http://<address>:<port>";
git with proxy
git config --global http.proxy http://<address>:<port>
Download and run:
./cpuburn
P.S. Monitor CPU frequency:
watch "cat /proc/cpuinfo | grep 'MHz'"
For a more visual monitoring of the system's resources including CPU, RAM, disk, & network usage:
sudo apt install btop
git clone git@github.com:wilicc/gpu-burn.git
cd gpu-burn
make
General test:
./gpu_burn 3600
Tensor core test:
./gpu_burn -tc 3600
P.S. Monitor GPU status:
watch -n 1 nvidia-smi
Or for a more visual monitoring:
sudo add-apt-repository ppa:flexiondotorg/nvtop
sudo apt install nvtop
GitHub - Syllo/nvtop: GPU & Accelerator process monitoring for AMD, Apple, Huawei, Intel, NVIDIA and Qualcomm
A quiet, cost-effective dual-GPU build [10-Billion-Parameter Model Project]