Self-Hosted LLMs for Enterprise #1
Generative AI has become a daily assistant for many people, whether for writing code, answering questions, or summarizing reports. As a result, many organizations are looking into running LLMs internally for privacy, flexibility, and cost control. This series walks you through setting up such a system step by step, from installing drivers to serving LLMs via an API on your own Ubuntu machine.
The infrastructure in this article comes from AWS, and we'll use an EC2 instance as the demo machine.
For the instance type, we'll use g5g.xlarge, which comes with an NVIDIA T4G GPU on an arm64 (Graviton2) platform.
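If you prefer launching the instance from the command line, here is a minimal sketch using the AWS CLI; the AMI ID, key pair, and security group are placeholders to replace with your own (pick an Ubuntu 24.04 arm64 AMI):
# Hypothetical example: every ID below is a placeholder
aws ec2 run-instances \
  --image-id ami-0123456789abcdef0 \
  --instance-type g5g.xlarge \
  --key-name my-keypair \
  --security-group-ids sg-0123456789abcdef0 \
  --count 1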
1. Find the $distro and $arch values matching your system
Open the comparison table in the official NVIDIA CUDA Installation Guide for Linux (https://docs.nvidia.com/cuda/cuda-installation-guide-linux/).
From our demo machine example:
- OS: Ubuntu 24.04 LTS
- Architecture: arm64
We get the values:
$distro = ubuntu2404
$arch = sbsa
$arch_ext = sbsa
If your machine has different specs, look up the values that match your distribution and architecture.
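Not sure what your machine reports? Two standard commands reveal the distribution and the CPU architecture (on an arm64 machine, uname prints aarch64):
# Show the distribution name and version
grep PRETTY_NAME /etc/os-release
# Show the CPU architecture (aarch64 = arm64)
uname -m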
2. Install the NVIDIA keyring using the $distro and $arch values from the previous step
# Example: If using Ubuntu 24.04 + ARM64 (from step 1)
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/sbsa/cuda-keyring_1.1-1_all.deb
# Install keyring
sudo dpkg -i cuda-keyring_1.1-1_all.deb
# Update apt index
sudo apt update
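The download URL follows a fixed pattern, so the same step works for other distro/architecture combinations; substitute the values from step 1 (note that the keyring version, 1.1-1 here, may change over time):
# General pattern: fill in $distro and $arch from step 1
wget https://developer.download.nvidia.com/compute/cuda/repos/$distro/$arch/cuda-keyring_1.1-1_all.deb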
3. Install NVIDIA Proprietary Driver and CUDA Toolkit
sudo apt install cuda-drivers
sudo apt install cuda-toolkit
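The driver package builds kernel modules that typically aren't loaded until the next boot, so a reboot before testing is the safest route. The toolkit installs under /usr/local/cuda by default, which gives a quick way to confirm it landed:
# Reboot so the newly built kernel modules are loaded
sudo reboot
# After reconnecting, confirm the CUDA compiler is present
/usr/local/cuda/bin/nvcc --version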
4. Check Driver Operation
nvidia-smi
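A successful run prints a table showing the driver version, the CUDA version it supports, and the detected GPU (an NVIDIA T4G on a g5g instance). For a terser check, or for use in scripts, the query flags come in handy:
# Print only the GPU name, driver version, and total memory
nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv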
Part 1 Summary
In this part, you have:
- Checked system information to select the correct driver version
- Connected Ubuntu to NVIDIA Repository
- Installed the NVIDIA proprietary GPU driver with the apt command
- Verified GPU operation with nvidia-smi
If you followed this, your machine is now ready for GPU use.
Next: Using GPU with Docker Container
In the next part, we'll look at how to:
- Configure Docker to use GPU correctly
- Install nvidia-container-toolkit
- Prepare the environment for running an LLM API, whether working from home or within an organization
Don't miss the next part!