GPU Platform Overview
Float16 Cloud's GPU Instance service provides dedicated GPU resources with full SSH access for AI and machine learning workloads.
Available GPU Types
| GPU | Provider | Region | On-Demand | Spot | Storage |
|---|---|---|---|---|---|
| H100 | Float16 | Thailand | $4.32/hr | $2.16/hr | $1.00/GB/mo |
View current pricing at GPU Instance > Pricing in the dashboard.
Creating an Instance
Via Dashboard
- Navigate to GPU Instance > Create Instance
- Choose deployment type:
  - Base VM - Full SSH access to a GPU instance
  - One-Click Deployment - Deploy vLLM models instantly
- Configure your instance:
  - Project Name (optional) - A friendly name for your project
  - Instance Type - Select GPU type (e.g., H100)
  - Volume Size - 50GB to 10,000GB persistent storage
- Click Create Instance
Base VM
Base VM provides full SSH access to a GPU instance with pre-configured CUDA and ML frameworks. Ideal for:
- Custom development environments
- Training jobs
- Running custom services
One-Click Deployment
Deploy vLLM models instantly with preset or custom models. See One-Click Deployment for details.
Instance Lifecycle
GPU instances support lifecycle management:
| Action | Description |
|---|---|
| Start | Launch a new instance |
| Stop | Pause compute (only storage cost charged) |
| Resume | Continue from where you left off |
| Terminate | Permanently delete instance and resources |
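The lifecycle actions above can be sketched as a small state map (illustrative only; the state names mirror the table, not an official Float16 API):

```python
# Illustrative sketch of the instance lifecycle from the table above.
# These states and transitions mirror the dashboard actions; they are
# not an official Float16 API.
TRANSITIONS = {
    "created": {"Start"},
    "running": {"Stop", "Terminate"},
    "stopped": {"Resume", "Terminate"},  # only storage is billed here
    "terminated": set(),                 # permanent: nothing follows
}

NEXT_STATE = {
    "Start": "running",
    "Stop": "stopped",
    "Resume": "running",
    "Terminate": "terminated",
}

def apply(state: str, action: str) -> str:
    """Return the next state, or raise if the action is not allowed."""
    if action not in TRANSITIONS[state]:
        raise ValueError(f"cannot {action} from state {state!r}")
    return NEXT_STATE[action]
```

The key point the map captures: Stop and Resume cycle between running and stopped, while Terminate is one-way.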
Cost Savings with Stop/Resume
When an instance is stopped:
- No compute cost is charged
- Only volume storage cost applies ($1.00/GB/mo)
- Your data and environment are preserved
This allows you to save costs while preserving your work.
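To make the savings concrete, here is a rough monthly cost comparison using the rates quoted above ($4.32/hr on-demand H100 compute, $1.00/GB/mo storage). This is a sketch that assumes a 30-day month:

```python
# Rough monthly cost comparison for a 100 GB H100 instance, using the
# rates quoted above: $4.32/hr on-demand compute, $1.00/GB/mo storage.
# Assumes a 30-day month for simplicity.
COMPUTE_PER_HOUR = 4.32      # H100 on-demand, USD
STORAGE_PER_GB_MONTH = 1.00  # USD
HOURS_PER_MONTH = 24 * 30

def monthly_cost(volume_gb: int, hours_running: float) -> float:
    """Compute + storage cost for one month, in USD."""
    return hours_running * COMPUTE_PER_HOUR + volume_gb * STORAGE_PER_GB_MONTH

running_all_month = monthly_cost(100, HOURS_PER_MONTH)  # 720 h running
stopped_all_month = monthly_cost(100, 0)                # stopped: storage only
print(round(running_all_month, 2))  # 3210.4
print(round(stopped_all_month, 2))  # 100.0
```

Stopping the same 100 GB instance for a month costs $100 in storage instead of over $3,000 in compute plus storage.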
Storage
Persistent Volumes
Each instance includes persistent storage that survives instance restarts:
- Size: 50GB to 10,000GB
- Backed by: NetApp Trident
- Cost: $1.00/GB/month
Volume Management
Manage volumes at GPU Instance > Volume:
- View total volumes and storage usage
- Create standalone volumes
- Monitor volume health status
Connecting to Your Instance
After creating a Base VM instance:
- Go to GPU Instance > Instances
- Find your running instance
- Copy the SSH command provided
- Connect via terminal:
  `ssh root@<your-instance-ip>`
Endpoint Proxy
Access services running on your GPU instances via secure proxy endpoints:
- Format: `https://proxy-instance.float16.cloud/{task_id}/{port}/{path}`
- Ports: 3000-4000 supported
- Compatible with: vLLM, custom APIs, Jupyter, and more
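A small helper for building proxy URLs in the format above (a sketch; the `task_id` value shown is a placeholder for the ID your instance displays in the dashboard):

```python
# Build a proxy endpoint URL in the documented format:
#   https://proxy-instance.float16.cloud/{task_id}/{port}/{path}
# The task_id used below is a placeholder; substitute the ID shown
# for your instance in the dashboard.
BASE = "https://proxy-instance.float16.cloud"

def proxy_url(task_id: str, port: int, path: str = "") -> str:
    """Return the proxy URL for a service listening on the given port."""
    if not 3000 <= port <= 4000:
        raise ValueError("only ports 3000-4000 are supported")
    return f"{BASE}/{task_id}/{port}/{path.lstrip('/')}"

print(proxy_url("my-task-id", 3000, "v1/models"))
# https://proxy-instance.float16.cloud/my-task-id/3000/v1/models
```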
For vLLM deployments, use the OpenAI Python SDK:
```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://proxy-instance.float16.cloud/{task_id}/3000/v1",
)
```
Billing
- Billing starts when the instance is created
- Billing stops when the instance is terminated
- Minimum increment: 1 minute for compute
- Stopped instances: Only storage cost charged
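As a sketch of how per-minute billing works out in practice, the pro-rating below applies the on-demand H100 rate with the stated 1-minute minimum increment (an illustration, not Float16's exact metering logic):

```python
import math

COMPUTE_PER_HOUR = 4.32  # H100 on-demand rate from the pricing table

def compute_charge(seconds_running: float) -> float:
    """Approximate compute charge in USD, billed in 1-minute increments.

    This pro-rating illustrates the stated 1-minute minimum increment;
    it is not Float16's exact metering logic.
    """
    minutes = max(1, math.ceil(seconds_running / 60))
    return minutes * COMPUTE_PER_HOUR / 60

print(round(compute_charge(90 * 60), 2))  # 90 minutes of H100: 6.48
```

Even a 30-second run is billed for one full minute under the minimum increment.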
Next Steps
- One-Click Deployment - Deploy LLM models instantly
- Volumes & Storage - Learn more about storage options