Quick Start Guide
This guide will help you deploy your first GPU instance on Float16 Cloud in just a few minutes.
Prerequisites
Before you begin, make sure you have:
- A Google account to sign in to Float16 Cloud
- Basic familiarity with terminal/command line (for GPU Instance SSH access)
- An SSH key pair for GPU Instance access (if you don't have one, generate it with ssh-keygen -t ed25519)
Step 1: Create Your Account
- Visit app.float16.cloud
- Click Sign in with Google to create your account using Google SSO
- Authorize Float16 Cloud to access your Google account
- You'll be automatically redirected to the dashboard
Step 2: Explore the Dashboard
After signing in, you'll see the Float16 dashboard with two main services:
Services
- Serverless GPU - Instantly access powerful GPUs, pay only for usage, and run AI workloads without managing infrastructure
- GPU Instance - Create dedicated GPU instances with SSH access for development and training
Resources
- Colab - Online tool to write and execute Python code through the browser
- Quantize - Compare LLM inference speeds with different quantization techniques
- Chatbot - Chat interface to interact with LLM models
- Prompt - Create, run and share prompts with your colleagues
- Apply Coupon - Redeem coupon codes to get credits
Step 3: Create Your First GPU Instance
Option A: Serverless GPU (Recommended for Quick Start)
- Click Serverless GPU in the sidebar
- Go to Projects tab
- Click Create Project
- Select a GPU instance type from the dropdown
- Enter a project name (optional)
- Click Create Project
Option B: GPU Instance with SSH Access
- Click GPU Instance in the sidebar
- Go to Create Instance tab
- Choose between:
  - Base VM - Full SSH access to a GPU instance
  - One-Click Deployment - Deploy vLLM models instantly
- Select an instance type (e.g., H100)
- Configure volume size (50-10000 GB)
- Click Create Instance
Step 4: Connect to GPU Instance (Base VM)
If you created a Base VM instance with SSH access:
- Go to GPU Instance > Instances
- Find your running instance
- Copy the SSH command provided
- Connect via terminal:

```shell
ssh root@<your-instance-ip>
```
Step 5: Run Your First GPU Workload
Test that everything is working:
```shell
nvidia-smi
```
You should see your GPU information displayed.
Run a simple PyTorch test:

```python
import torch

# Check if CUDA is available before querying the GPU
print(f"CUDA available: {torch.cuda.is_available()}")

if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")

    # Run a simple tensor operation on the GPU
    x = torch.randn(1000, 1000).cuda()
    y = torch.matmul(x, x)
    print(f"Tensor shape: {y.shape}")
```
Alternative Quick Starts
Deploy an LLM with One-Click Deployment
Deploy vLLM models instantly without managing infrastructure:
- Navigate to GPU Instance > Create Instance
- Select the One-Click Deployment tab
- Choose from preset models:
  - Qwen - Alibaba's large language models (8B to 235B)
  - Llama - Meta's Llama 3.3 70B
  - Typhoon - SCB10X models including OCR
  - GLM - ZAI's GLM 4.7 Flash
- Or add a custom HuggingFace model
- Configure volume size and click Create Instance
Access your model via the proxy endpoint:
```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://proxy-instance.float16.cloud/{task_id}/3000/v1"
)

response = client.chat.completions.create(
    model="your-model-name",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
```
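Before wiring up a client, you can sanity-check that the deployment is live. A minimal stdlib sketch of such a check, assuming the proxy's OpenAI-compatible API exposes the standard /models route (as vLLM servers typically do); the task_id and API key are placeholders:

```python
import json
import urllib.request

def models_request(task_id: str, api_key: str) -> urllib.request.Request:
    # An OpenAI-compatible server lists its deployed models at GET {base}/models;
    # the returned model id is what you pass as `model` in chat requests.
    url = f"https://proxy-instance.float16.cloud/{task_id}/3000/v1/models"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})

req = models_request("YOUR_TASK_ID", "YOUR_API_KEY")
# json.load(urllib.request.urlopen(req)) would return the model list.
```

If this request returns a model entry, the endpoint is ready for chat completions.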
Extract Text with OCR
Process documents with Typhoon OCR:
- Navigate to AI Services > OCR
- Upload a document or image
- View extracted text
Or use the API:
```shell
curl -X POST https://api.float16.cloud/v1/ocr/extract \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@document.pdf"
```
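The same call can be made from Python using only the standard library. A sketch, where the endpoint and the "file" form field come from the curl example and the rest is generic multipart encoding:

```python
import mimetypes
import urllib.request
import uuid

def build_ocr_request(api_key: str, filename: str, file_bytes: bytes) -> urllib.request.Request:
    # Build a multipart/form-data POST equivalent to the curl example above.
    boundary = uuid.uuid4().hex
    content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    body = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode() + file_bytes + f"\r\n--{boundary}--\r\n".encode()
    return urllib.request.Request(
        "https://api.float16.cloud/v1/ocr/extract",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )

req = build_ocr_request("YOUR_API_KEY", "document.pdf", b"%PDF-")
# urllib.request.urlopen(req) would send it and return the extracted text.
```

In practice you would read file_bytes from disk (e.g., open("document.pdf", "rb").read()) before building the request.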
Train with ML Frameworks
Launch a pre-configured training environment:
- Navigate to Instances > Create
- Select a framework image:
  - float16/tao:6.0 - Computer vision
  - float16/monai:1.5.1 - Medical imaging
  - float16/nemo:2.6.1 - Speech/NLP
- Choose your GPU and launch
Next Steps
Now that you have your first instance running, explore:
- GPU Platform Guide - Learn advanced features
- Serverless Deployment - Deploy models as APIs
- AI Services - LLMs, OCR, and more
- ML Training - TAO, MONAI, NeMo
- One-Click Deployment - Deploy models instantly
- API Reference - Automate with our API
Troubleshooting
Can't connect via SSH?
- Verify your SSH key is added correctly
- Check that the instance is in "Running" state
- Ensure your IP isn't blocked by firewall rules
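To rule out a network problem quickly, check whether the instance's SSH port is reachable at all. A small stdlib helper (the host is a placeholder for your instance IP):

```python
import socket

def ssh_port_open(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    # Return True if a TCP connection to host:port succeeds.
    # A False result points at a firewall block or a stopped instance,
    # rather than an SSH key problem.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example: ssh_port_open("<your-instance-ip>")
```

If the port is open but authentication still fails, re-check the SSH key rather than the network.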
GPU not detected?
- Try rebooting the instance
- Check the instance logs in the dashboard
- Contact support if the issue persists