Simple, Transparent Pricing

Choose the plan that fits your needs. All plans include our core features with no hidden costs.

Available Models

Llama 2: General Purpose
Mistral: Efficient
CodeLlama: Code Generation
Vicuna: Conversational
Alpaca: Instruction Tuned
WizardLM: Advanced Reasoning
MPT: High Performance
Falcon: Open Source

Infrastructure Options

Cloud Providers

AWS
Google Cloud
Microsoft Azure
DigitalOcean

Popular Regions

US East (N. Virginia): us-east-1
US West (Oregon): us-west-2
Europe (Ireland): eu-west-1
Asia Pacific (Tokyo): ap-northeast-1

Choose Your Plan

Starter

$49 per month
CPU: 4 vCPUs
RAM: 16 GB
Storage: 100 GB
Requests/sec: 100
Auto-scaling
Health monitoring
Basic support

Professional (Most Popular)

$149 per month
CPU: 8 vCPUs
RAM: 32 GB
Storage: 500 GB
Requests/sec: 500
Everything in Starter
Advanced monitoring
Priority support
Custom domains

Enterprise

$499 per month
CPU: 16 vCPUs
RAM: 64 GB
Storage: 2 TB
Requests/sec: 2000+
Everything in Professional
Dedicated support
SLA guarantees
Custom integrations

Feature Comparison

Feature | Starter | Professional | Enterprise
Hardware Specs | 4 vCPU, 16 GB RAM | 8 vCPU, 32 GB RAM | 16 vCPU, 64 GB RAM
Storage | 100 GB | 500 GB | 2 TB
Performance | 100 req/s | 500 req/s | 2000+ req/s
Auto-scaling | Included | Included | Included
Health Monitoring | Basic | Advanced | Enterprise
Support | Email | Priority | Dedicated
SLA | - | - | 99.9%

Frequently Asked Questions

What's the difference between plans?

Plans differ in hardware specifications, performance limits, and support levels. Starter is great for development and testing, Professional for production workloads, and Enterprise for high-scale applications.
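As a rough illustration of how the plan limits compare, the sketch below picks the cheapest plan that meets a set of requirements. The figures are copied from the plan cards on this page. The `Plan` class and `cheapest_plan` function are hypothetical names used only for this example and are not part of any product API. It also assumes 2 TB means 2000 GB and treats "2000+" requests/sec as 2000.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Plan:
    name: str
    monthly_usd: int
    vcpus: int
    ram_gb: int
    storage_gb: int
    requests_per_sec: int


# Values taken from the plan cards. Assumptions: 2 TB is treated as
# 2000 GB, and "2000+" requests/sec is treated as 2000.
PLANS = [
    Plan("Starter", 49, 4, 16, 100, 100),
    Plan("Professional", 149, 8, 32, 500, 500),
    Plan("Enterprise", 499, 16, 64, 2000, 2000),
]


def cheapest_plan(
    min_rps: int = 0,
    min_ram_gb: int = 0,
    min_storage_gb: int = 0,
) -> Optional[Plan]:
    """Return the lowest-priced plan meeting every minimum, or None."""
    candidates = [
        p
        for p in PLANS
        if p.requests_per_sec >= min_rps
        and p.ram_gb >= min_ram_gb
        and p.storage_gb >= min_storage_gb
    ]
    return min(candidates, key=lambda p: p.monthly_usd, default=None)
```

For example, `cheapest_plan(min_rps=300)` selects Professional, since Starter tops out at 100 requests/sec.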

Is there a free tier?

We don't offer a free tier. All plans are paid and start immediately, ensuring you get the full performance and features from day one.

Can I change plans later?

Yes, you can upgrade or downgrade your plan at any time. When you change plans, charges are prorated on an hourly basis.
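To make the hourly proration concrete, here is a minimal sketch of the arithmetic. The exact billing formula is not stated on this page, so the 730-hour month and the function names `prorated_charge` and `month_with_plan_change` are assumptions for illustration only.

```python
# Assumption: a billing month is 730 hours (roughly 365 * 24 / 12).
# The page does not state the actual billing period or rounding rules.
HOURS_PER_MONTH = 730


def prorated_charge(monthly_usd: float, hours_used: float) -> float:
    """Charge for part of a month, capped at the full monthly price."""
    if hours_used < 0:
        raise ValueError("hours_used must be non-negative")
    billable_hours = min(hours_used, HOURS_PER_MONTH)
    return round(monthly_usd * billable_hours / HOURS_PER_MONTH, 2)


def month_with_plan_change(
    old_monthly_usd: float,
    new_monthly_usd: float,
    switch_hour: float,
) -> float:
    """Total for a month in which the plan changes at switch_hour."""
    before = prorated_charge(old_monthly_usd, switch_hour)
    after = prorated_charge(new_monthly_usd, HOURS_PER_MONTH - switch_hour)
    return round(before + after, 2)
```

Under these assumptions, moving from Starter ($49) to Professional ($149) exactly halfway through the month gives `month_with_plan_change(49, 149, 365)`, which is 24.50 + 74.50 = 99.00.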

What cloud providers do you support?

We support AWS, Google Cloud, Microsoft Azure, and DigitalOcean. You can choose your preferred provider and region for deployment.
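The provider and region choices listed on this page can be captured in a small lookup, sketched below. This is not a real client or configuration format: `describe_target` is a hypothetical helper, and the page does not say whether every listed region code is available on every provider, so the sketch only checks each value against the lists shown here.

```python
# Providers and popular regions as listed on this page.
PROVIDERS = {"AWS", "Google Cloud", "Microsoft Azure", "DigitalOcean"}

POPULAR_REGIONS = {
    "us-east-1": "US East (N. Virginia)",
    "us-west-2": "US West (Oregon)",
    "eu-west-1": "Europe (Ireland)",
    "ap-northeast-1": "Asia Pacific (Tokyo)",
}


def describe_target(provider: str, region: str) -> str:
    """Validate a provider/region pair against the lists above.

    Assumption: provider and region are checked independently, because
    the page does not state which regions each provider offers.
    """
    if provider not in PROVIDERS:
        raise ValueError(f"unsupported provider: {provider}")
    if region not in POPULAR_REGIONS:
        raise ValueError(f"region not in the listed set: {region}")
    return f"{provider} / {POPULAR_REGIONS[region]} ({region})"
```

For instance, `describe_target("AWS", "eu-west-1")` returns `"AWS / Europe (Ireland) (eu-west-1)"`.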

How does scaling work?

All plans include auto-scaling. You can scale your infrastructure up or down based on demand, and we'll automatically adjust your billing accordingly.

Ready to get started?

Deploy your first model in under 2 minutes