- Zero Infrastructure Management.
- Forget about provisioning, drivers, or Docker builds. Just upload your code or model, and we handle the setup and runtime on high-performance NVIDIA H100s.
- Instant Start, No Cold Boots.
- Launch jobs in seconds with near-zero init time. All runtimes are optimized to skip cold starts — perfect for both real-time inference and quick experiments.
- Efficient, Pay-Per-Use Execution.
- Only pay when your code is running. Whether it’s a 10-second inference or multi-hour training, serverless scheduling with on-demand and spot pricing keeps costs under control.
Serverless, But Actually Built for AI
Most serverless platforms weren’t designed for the needs of AI workloads. Float16 changes that with GPU-native, developer-friendly serverless that just works.
| Feature | Traditional Serverless | Float16 Serverless |
|---|---|---|
| Startup Time | Cold starts (slow, minutes) | ⚡ Instant start, no cold boots |
| GPU Access | Limited or unavailable | ✅ High-performance NVIDIA H100s |
| Code Compatibility | Requires container/image setup | 🧠 Run `.py` scripts directly |
| Model & Weight Handling | Manual load in every run | 🪄 Pre-loaded weights and cache |
| Pricing Model | Flat rate / idle costs | 💰 True pay-per-use |
| Dev Workflow | Designed for generic workloads | 🎯 Built for AI training & inference |
| Batch / Spot Job Support | Rare / manual configuration | 🖥️ Built-in spot mode support |
How It Works
Get Started with Simple Steps
Deploy Mode: Get a persistent endpoint for continuous access

```shell
float16 deploy app.py
```

Run Mode: Run a one-off compute job and get the results back

```shell
float16 run app.py
```
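As an illustration of what these commands operate on, `app.py` can be an ordinary Python script; the body below is a toy sketch, since the platform runs plain `.py` files directly, and the `predict` function is a hypothetical stand-in for a real model call, not part of any Float16 SDK:

```python
# app.py - a plain Python script of the kind `float16 run` executes.
# predict() is a toy stand-in for a model forward pass; no Float16-specific
# imports or GPU code are assumed here.

def predict(values):
    """Pretend inference: scale each input by a fixed 'weight'."""
    weight = 2.0
    return [v * weight for v in values]

if __name__ == "__main__":
    # `float16 run app.py` would execute this entry point and return the output.
    print(predict([1.0, 2.0, 3.0]))
```

The same file works for both modes: `float16 run` executes it once, while `float16 deploy` keeps it available behind an endpoint.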
Perfect For
AI Development & Testing
Fast iteration on your ML experiments. Perfect for quick model adjustments and rapid testing cycles without infrastructure overhead.
Periodic Model Inference
Run predictions exactly when needed. Ideal for batch processing and occasional inference tasks without paying for idle time.
Research Projects
Focus on research, not infrastructure. Great for academic work and experiments with varying computational demands.
Prototype Deployment
Test ideas without long-term commitments. Suitable for MVPs and proof-of-concepts that need professional-grade GPU power.
Serverless GPUs with True Pay-Per-Use Pricing
Start instantly with per-second billing on H100 GPUs and pay only for what you use — no setup, no idle costs. Whether you're deploying LLMs or running batch training jobs, our pricing is designed to scale with your workload.
Price

| GPU Type | On-demand | Spot |
|---|---|---|
| H100 | $0.006 / sec | $0.0012 / sec |

- Storage: $5.184 / GB / month
- CPU & Memory: included
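With per-second billing, costs are easy to estimate up front. The sketch below assumes a hypothetical 10-minute job and simply multiplies its duration by the H100 rates listed above:

```python
# Back-of-the-envelope cost estimate using the published per-second H100 rates.
ON_DEMAND_PER_SEC = 0.006   # $ / sec, from the pricing table
SPOT_PER_SEC = 0.0012       # $ / sec, from the pricing table

def job_cost(duration_sec, rate_per_sec):
    """Pay-per-use: you are billed only for the seconds the job actually runs."""
    return duration_sec * rate_per_sec

ten_minutes = 10 * 60  # 600 seconds
print(f"on-demand: ${job_cost(ten_minutes, ON_DEMAND_PER_SEC):.2f}")  # $3.60
print(f"spot:      ${job_cost(ten_minutes, SPOT_PER_SEC):.2f}")       # $0.72
```

At these rates a spot job costs one fifth of the on-demand price, which is why batch and fault-tolerant workloads are a natural fit for spot mode.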