Serverless GPU

Run AI workloads on-demand without managing infrastructure

Float16 Serverless GPU lets you run AI workloads on powerful GPUs, paying only for the time you use. No infrastructure management required.

How It Works

  1. Create a project - Set up a project in the dashboard or via CLI
  2. Write your code - Use Python to run your AI workloads
  3. Run tasks - Execute your code on GPU instances
  4. Pay per second - Only pay for actual compute time
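
As an illustration of step 2, a task script could be as small as the sketch below. It uses only the Python standard library and assumes that the `nvidia-smi` tool is on the PATH of the GPU instance, which this page does not state. `describe_gpus` is a hypothetical helper name, not part of any Float16 API.

```python
import shutil
import subprocess


def describe_gpus() -> list[str]:
    """Return one line per visible NVIDIA GPU, or an empty list if none is found."""
    # Assumption: nvidia-smi is installed on the instance running the task.
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    gpus = describe_gpus()
    if gpus:
        for gpu in gpus:
            print(f"GPU: {gpu}")
    else:
        print("No NVIDIA GPU visible")
```

Running a check like this first is a cheap way to confirm that the task landed on a GPU instance before starting a longer workload.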

Getting Started

Install Float16 CLI

The easiest way to get started is with the VS Code extension:

  1. Open VS Code
  2. Install the Float16 CLI Tools extension
  3. The extension handles authentication and project setup automatically

Alternatively, install via command line:

Platform | Installation
macOS | brew install float16
Windows | Download from website
npm | npm install -g float16

Create a Project

  1. Navigate to Serverless GPU > Projects
  2. Click Create Project
  3. Select a GPU instance type
  4. Enter a project name (optional)
  5. Click Create Project to confirm

Pricing

Instance | On-Demand | Spot (Save 50%) | Storage
H100 | $4.32/hr ($0.0012/sec) | $2.16/hr ($0.0006/sec) | $1.00/GB/mo

View current pricing at Serverless GPU > Pricing.
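
To make the per-second rates concrete, the sketch below estimates compute cost from a task's duration using the H100 figures in the table above. It assumes billing is proportional to whole seconds with no minimum charge, which this page does not confirm, and it leaves out storage. The function and constant names are illustrative only.

```python
from decimal import Decimal

# Per-second H100 rates in USD, copied from the pricing table.
RATES_PER_SECOND = {
    "on_demand": Decimal("0.0012"),
    "spot": Decimal("0.0006"),
}


def compute_cost(seconds: int, mode: str = "on_demand") -> Decimal:
    """Return the estimated compute cost in USD for a task lasting `seconds`."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    # Assumption: cost scales linearly with seconds and has no minimum charge.
    return RATES_PER_SECOND[mode] * seconds


if __name__ == "__main__":
    for duration in (90, 600, 3600):
        on_demand = compute_cost(duration)
        spot = compute_cost(duration, "spot")
        print(f"{duration:>5}s  on-demand ${on_demand}  spot ${spot}")
```

At 3600 seconds the on-demand estimate matches the $4.32 hourly rate, and the spot estimate is half of that. Decimal is used instead of float so that small per-second amounts do not pick up rounding noise.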

Blueprint Templates

Get started quickly with pre-configured templates at Serverless GPU > Blueprint:

Template | Category | Description
Qwen3 30B A3B | Text | Advanced text generation with 30B parameters
Gemma3-27b | Text | High-performance NLP for summarization, translation, Q&A
Typhoon2.1-gemma3-12b | Text | Thai-English bilingual model
Thai Document OCR | OCR | Extract text from Thai/English documents
Parabricks fq2bam | Genomics | GPU-accelerated FASTQ to BAM alignment

Managing Projects

Projects

View and manage your serverless GPU projects at Serverless GPU > Projects.

Tasks

Track running and completed tasks at Serverless GPU > Tasks:

  • View task history
  • Monitor execution status
  • Access task outputs

Tokens

Manage API tokens for CLI authentication at Serverless GPU > Tokens.

Storage Logs

Track storage operations at Serverless GPU > Storage Logs:

  • Upload/download history
  • Operation statistics
  • 7-day activity charts

Features

Streaming Responses

Get real-time output as your code executes in manual mode.
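
If the streamed output is the task's standard output (an assumption; this page does not say which stream is forwarded), flushing after each progress line helps it show up promptly instead of sitting in Python's buffer. A minimal sketch, with `report_progress` as a hypothetical helper:

```python
import sys
import time


def report_progress(steps: int, delay: float = 0.0, stream=sys.stdout) -> int:
    """Print one progress line per step, flushing so each line is emitted at once."""
    for step in range(1, steps + 1):
        time.sleep(delay)  # stand-in for real work
        print(f"step {step}/{steps} done", file=stream, flush=True)
    return steps


if __name__ == "__main__":
    report_progress(5, delay=0.5)
```

Without `flush=True`, Python may hold output in a buffer when stdout is not a terminal, so lines can arrive in bursts rather than as each step finishes.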

VS Code Integration

Run and manage your Serverless GPU projects directly from VS Code with the Float16 CLI Tools extension.

Tags: serverless, gpu, compute
Last updated: February 1, 2025