GPU Platform Overview

Create and manage dedicated GPU instances on Float16 Cloud

Float16 Cloud's GPU Instance service provides dedicated GPU resources with full SSH access for AI and machine learning workloads.

Available GPU Types

GPU  | Provider | Region   | On-Demand | Spot     | Storage
H100 | Float16  | Thailand | $4.32/hr  | $2.16/hr | $1.00/GB/mo

View current pricing at GPU Instance > Pricing in the dashboard.

Creating an Instance

Via Dashboard

  1. Navigate to GPU Instance > Create Instance
  2. Choose deployment type:
    • Base VM - Full SSH access to a GPU instance
    • One-Click Deployment - Deploy vLLM models instantly
  3. Configure your instance:
    • Project Name (optional) - A friendly name for your project
    • Instance Type - Select GPU type (e.g., H100)
    • Volume Size - 50GB to 10,000GB persistent storage
  4. Click Create Instance
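The configuration above reduces to a few simple fields. As an illustrative sketch only (the field names are assumptions, not the Float16 API), a client-side check of those options might look like:

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative sketch -- field names are assumptions, not the Float16 API.
@dataclass
class InstanceConfig:
    instance_type: str                  # e.g. "H100"
    volume_size_gb: int                 # persistent storage, 50GB-10,000GB
    project_name: Optional[str] = None  # optional friendly name

    def validate(self) -> None:
        # Volume size must fall within the documented 50GB-10,000GB range.
        if not 50 <= self.volume_size_gb <= 10_000:
            raise ValueError("volume_size_gb must be between 50 and 10000")

cfg = InstanceConfig(instance_type="H100", volume_size_gb=100)
cfg.validate()  # passes silently for a valid configuration
```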

Base VM

Base VM provides full SSH access to a GPU instance with pre-configured CUDA and ML frameworks. Ideal for:

  • Custom development environments
  • Training jobs
  • Running custom services

One-Click Deployment

Deploy vLLM models instantly with preset or custom models. See One-Click Deployment for details.

Instance Lifecycle

GPU instances support lifecycle management:

Action    | Description
Start     | Launch a new instance
Stop      | Pause compute (only storage cost charged)
Resume    | Continue from where you left off
Terminate | Permanently delete instance and resources

Cost Savings with Stop/Resume

When an instance is stopped:

  • No compute cost is charged
  • Only volume storage cost applies ($1.00/GB/mo)
  • Your data and environment are preserved

Stopping instances during idle periods avoids compute charges without losing your work.
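As a rough illustration of the savings, using the H100 on-demand and storage rates from the pricing table above, compare a month of continuous compute against a stopped instance that pays only for its volume:

```python
HOURS_PER_MONTH = 730   # average hours in a month
ON_DEMAND_RATE = 4.32   # H100 on-demand, $/hr
STORAGE_RATE = 1.00     # persistent volume, $/GB/month
volume_gb = 100

# Running continuously: compute plus storage.
running_cost = ON_DEMAND_RATE * HOURS_PER_MONTH + STORAGE_RATE * volume_gb

# Stopped: no compute cost, storage only.
stopped_cost = STORAGE_RATE * volume_gb

print(f"running: ${running_cost:,.2f}/mo, stopped: ${stopped_cost:,.2f}/mo")
# → running: $3,253.60/mo, stopped: $100.00/mo
```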

Storage

Persistent Volumes

Each instance includes persistent storage that survives instance restarts:

  • Size: 50GB to 10,000GB
  • Backed by: NetApp Trident
  • Cost: $1.00/GB/month

Volume Management

Manage volumes at GPU Instance > Volume:

  • View total volumes and storage usage
  • Create standalone volumes
  • Monitor volume health status

Connecting to Your Instance

After creating a Base VM instance:

  1. Go to GPU Instance > Instances
  2. Find your running instance
  3. Copy the SSH command provided
  4. Connect via terminal:
ssh root@<your-instance-ip>

Endpoint Proxy

Access services running on your GPU instances via secure proxy endpoints:

  • Format: https://proxy-instance.float16.cloud/{task_id}/{port}/{path}
  • Ports: 3000-4000 supported
  • Compatible with: vLLM, custom APIs, Jupyter, and more
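Following the format above, building a proxy URL is simple string assembly. A small helper (illustrative, with a made-up task ID) might look like:

```python
def proxy_url(task_id: str, port: int, path: str = "") -> str:
    """Build an endpoint-proxy URL for a service on a GPU instance."""
    # Only ports 3000-4000 are exposed through the proxy.
    if not 3000 <= port <= 4000:
        raise ValueError("proxy supports ports 3000-4000 only")
    return f"https://proxy-instance.float16.cloud/{task_id}/{port}/{path.lstrip('/')}"

# Hypothetical task ID for illustration:
print(proxy_url("abc123", 3000, "v1"))
# → https://proxy-instance.float16.cloud/abc123/3000/v1
```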

For vLLM deployments, use the OpenAI Python SDK:

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",  # your Float16 API key
    # Replace {task_id} with your instance's task ID; vLLM listens on port 3000 here.
    base_url="https://proxy-instance.float16.cloud/{task_id}/3000/v1"
)

Billing

  • Billing starts when the instance is created
  • Billing stops when the instance is terminated
  • Minimum increment: 1 minute for compute
  • Stopped instances: Only storage cost charged
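One reading of the 1-minute minimum increment is that runtime is rounded up to whole minutes before charging; under that assumption (and using the H100 on-demand rate from the table above), a short session's compute charge can be estimated like this:

```python
import math

ON_DEMAND_RATE = 4.32  # H100 on-demand, $/hr

def compute_cost(runtime_seconds: float, hourly_rate: float = ON_DEMAND_RATE) -> float:
    # Assumes billing rounds up to the 1-minute minimum increment;
    # actual rounding behavior may differ.
    billed_minutes = max(1, math.ceil(runtime_seconds / 60))
    return billed_minutes * hourly_rate / 60

# 90.5 minutes of runtime bills as 91 minutes:
print(round(compute_cost(90.5 * 60), 2))
# → 6.55
```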

Last updated: February 1, 2025