
Quick Start Guide

Get up and running with Float16 Cloud in under 5 minutes

This guide will help you deploy your first GPU instance on Float16 Cloud in just a few minutes.

Prerequisites

Before you begin, make sure you have:

  • A Google account to sign in to Float16 Cloud
  • Basic familiarity with terminal/command line (for GPU Instance SSH access)
  • An SSH key pair for GPU Instance access (we'll show you how to create one if needed)
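
If you don't already have an SSH key pair, you can generate one with the standard OpenSSH tooling (the email in the command is just a label for the key):

ssh-keygen -t ed25519 -C "you@example.com"

This writes a private key to ~/.ssh/id_ed25519 and a public key to ~/.ssh/id_ed25519.pub; the public key is the one you provide when configuring SSH access for your instance.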

Step 1: Create Your Account

  1. Visit app.float16.cloud
  2. Click Sign in with Google to create your account using Google SSO
  3. Authorize Float16 Cloud to access your Google account
  4. You'll be automatically redirected to the dashboard

Step 2: Explore the Dashboard

After signing in, you'll see the Float16 dashboard with two main services and a set of supporting resources:

Services

  • Serverless GPU - Instantly access powerful GPUs, pay only for usage, and run AI workloads without managing infrastructure
  • GPU Instance - Create dedicated GPU instances with SSH access for development and training

Resources

  • Colab - Online tool to write and execute Python code through the browser
  • Quantize - Compare LLM inference speeds with different quantization techniques
  • Chatbot - Chat interface to interact with LLM models
  • Prompt - Create, run and share prompts with your colleagues
  • Apply Coupon - Redeem coupon codes to get credits

Step 3: Create Your First Project or Instance

Choose one of the two options below.

Option A: Serverless GPU Project

  1. Click Serverless GPU in the sidebar
  2. Go to the Projects tab
  3. Click Create Project
  4. Select a GPU instance type from the dropdown
  5. Enter a project name (optional)
  6. Click Create Project

Option B: GPU Instance with SSH Access

  1. Click GPU Instance in the sidebar
  2. Go to the Create Instance tab
  3. Choose between:
    • Base VM - Full SSH access to a GPU instance
    • One-Click Deployment - Deploy vLLM models instantly
  4. Select an instance type (e.g., H100)
  5. Configure volume size (50-10000 GB)
  6. Click Create Instance

Step 4: Connect to GPU Instance (Base VM)

For GPU Instance with SSH access:

  1. Go to GPU Instance > Instances
  2. Find your running instance
  3. Copy the SSH command provided
  4. Connect via terminal:
ssh root@<your-instance-ip>
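
If your key isn't in the default location, point ssh at it explicitly (the path below assumes the ed25519 key generated in the Prerequisites section):

ssh -i ~/.ssh/id_ed25519 root@<your-instance-ip>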

Step 5: Run Your First GPU Workload

Test that everything is working:

nvidia-smi

You should see your GPU information displayed.

Run a simple PyTorch test:

import torch

# Check that the GPU is visible to PyTorch
print(f"CUDA available: {torch.cuda.is_available()}")

if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")

    # Run a simple tensor operation on the GPU
    x = torch.randn(1000, 1000).cuda()
    y = torch.matmul(x, x)
    print(f"Tensor shape: {y.shape}")

Alternative Quick Starts

Deploy an LLM with One-Click Deployment

Deploy vLLM models instantly without managing infrastructure:

  1. Navigate to GPU Instance > Create Instance
  2. Select the One-Click Deployment tab
  3. Choose from preset models:
    • Qwen - Alibaba's large language models (8B to 235B)
    • Llama - Meta's Llama 3.3 70B
    • Typhoon - SCB10X models including OCR
    • GLM - ZAI's GLM 4.7 Flash
  4. Or add a custom HuggingFace model
  5. Configure volume size and click Create Instance

Access your model via the proxy endpoint:

from openai import OpenAI

# Point the client at the proxy endpoint for your deployment.
# Replace {task_id} with the task ID shown for your instance.
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://proxy-instance.float16.cloud/{task_id}/3000/v1"
)

# The endpoint is OpenAI-compatible, so the usual chat completion call works
response = client.chat.completions.create(
    model="your-model-name",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
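
vLLM's OpenAI-compatible server also supports streaming, so you can receive tokens as they are generated. A minimal sketch reusing the client above (the model name and prompt are placeholders):

stream = client.chat.completions.create(
    model="your-model-name",
    messages=[{"role": "user", "content": "Write a haiku about GPUs."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()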

Full LLM Deployment Guide

Extract Text with OCR

Process documents with Typhoon OCR:

  1. Navigate to AI Services > OCR
  2. Upload a document or image
  3. View extracted text

Or use the API:

curl -X POST https://api.float16.cloud/v1/ocr/extract \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@document.pdf"

OCR Documentation

Train with ML Frameworks

Launch a pre-configured training environment:

  1. Navigate to Instances > Create
  2. Select a framework image:
    • float16/tao:6.0 - Computer vision
    • float16/monai:1.5.1 - Medical imaging
    • float16/nemo:2.6.1 - Speech/NLP
  3. Choose your GPU and launch
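
Once the instance is running, a quick sanity check like the sketch below confirms that the image's framework imports cleanly. The package names are assumptions inferred from the image names above; adjust them for the image you chose:

import importlib

for pkg in ("torch", "monai", "nemo"):  # hypothetical package list
    try:
        mod = importlib.import_module(pkg)
        print(f"{pkg}: {getattr(mod, '__version__', 'installed')}")
    except ImportError:
        print(f"{pkg}: not installed in this image")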

ML Training Overview

Next Steps

Now that you have your first instance running, explore the guides linked throughout this page and the rest of the Float16 Cloud documentation.

Troubleshooting

Can't connect via SSH?

  • Verify your SSH key is added correctly
  • Check that the instance is in "Running" state
  • Ensure your IP isn't blocked by firewall rules
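
Running ssh in verbose mode shows which keys are offered and where the handshake stops, which usually narrows the problem down:

ssh -v root@<your-instance-ip>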

GPU not detected?

  • Try rebooting the instance
  • Check the instance logs in the dashboard
  • Contact support if the issue persists
Tags: quickstart, beginner, tutorial

Last updated: February 1, 2025