Self-Hosted LLMs for Enterprise #2
In the previous part, we installed the NVIDIA GPU driver on an EC2 instance (g5g.xlarge) running Ubuntu 24.04 LTS. In this part, we'll make the GPU available inside Docker containers, in preparation for building an LLM API with llama.cpp.
If you haven't read it yet, start with Part 1.
Install Docker Engine
1. Add Docker's GPG key
sudo apt-get update
sudo apt-get install -y ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
2. Add Docker repository to apt sources
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
$(. /etc/os-release && echo "${UBUNTU_CODENAME:-$VERSION_CODENAME}") stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
# Update repo again
sudo apt-get update
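The `${UBUNTU_CODENAME:-$VERSION_CODENAME}` expression in the command above is a shell fallback: it prefers `UBUNTU_CODENAME` from /etc/os-release and falls back to `VERSION_CODENAME` when the former is unset or empty (some Ubuntu derivatives only set the latter). A minimal sketch with hypothetical values:

```shell
# Hypothetical /etc/os-release values, for illustration only
VERSION_CODENAME="noble"

UBUNTU_CODENAME="noble"
echo "${UBUNTU_CODENAME:-$VERSION_CODENAME}"   # prints: noble

UBUNTU_CODENAME=""    # e.g. a derivative distro that omits this field
echo "${UBUNTU_CODENAME:-$VERSION_CODENAME}"   # prints: noble (fallback)
```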
3. Install Docker Engine and Docker CLI
sudo apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
Test installation with:
sudo docker run hello-world
Install NVIDIA Container Toolkit
1. Add NVIDIA Container Toolkit Repository
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
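To see what the sed filter in that pipeline actually does, here's a standalone sketch on a representative repository line (the exact entries in the downloaded list may differ): it rewrites plain `deb https://` entries so apt verifies the repository against the NVIDIA keyring.

```shell
# A representative line from the downloaded .list file (for illustration);
# the $(ARCH) token is apt's own variable syntax and stays literal here
line='deb https://nvidia.github.io/libnvidia-container/stable/deb/$(ARCH) /'

# Same substitution as in the pipeline above: insert a signed-by option
echo "$line" | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g'
```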
2. Update the package index
sudo apt-get update
3. Install NVIDIA Container Toolkit
sudo apt-get install -y nvidia-container-toolkit
4. Configure Docker to Use the NVIDIA Runtime
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
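Under the hood, `nvidia-ctk runtime configure` registers the NVIDIA runtime in `/etc/docker/daemon.json`. After the command runs, the file should contain an entry along these lines (shown for reference; your file may carry additional settings):

```json
{
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "args": []
    }
  }
}
```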
5. Test with Docker
sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
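If you prefer managing containers with Docker Compose (the plugin was installed earlier), GPU access can also be requested declaratively. A minimal sketch, where the service name and image are placeholders:

```yaml
services:
  gpu-test:               # hypothetical service name
    image: ubuntu
    command: nvidia-smi
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all             # or an integer / a device_ids list
              capabilities: [gpu]
```

Running `docker compose up` with this file should print the same nvidia-smi table as the command above.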
Part 2 Summary
- Installed Docker Engine from official repository
- Installed NVIDIA Container Toolkit to enable Docker GPU access
- Tested that a container can run nvidia-smi successfully
In the next part, we'll start serving LLMs with llama.cpp inside containers and expose them as APIs.