Are your LLMs too slow? Clarifai them.
Benchmark your models on the world's fastest inference engine with a free 14-day trial. Our self-optimizing technology is built to accelerate complex reasoning tasks on GPUs.
Pay As You Go
Explore AI using dedicated deployments, serverless pre-trained models in the cloud, our robust API, and low-code UIs.
- Run models directly or via a dedicated compute instance
- Clarifai Reasoning Engine: accelerates agentic AI workloads and large reasoning models
- Full platform access
- Promotional access to Local Runners
- Up to 100 requests per second
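To stay under the plan's request-per-second ceiling, a client-side throttle is useful. Below is a minimal token-bucket sketch; the 100 rps figure comes from the plan above, while the class and its API are illustrative and not part of Clarifai's SDK:

```python
import time

class TokenBucket:
    """Client-side throttle: allow at most `rate` requests per second."""

    def __init__(self, rate, capacity=None):
        self.rate = rate                                   # tokens refilled per second
        self.capacity = capacity if capacity is not None else rate
        self.tokens = self.capacity                        # start with a full bucket
        self.last = time.monotonic()

    def try_acquire(self):
        """Return True if a request may be sent now, False otherwise."""
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# The Pay As You Go plan allows up to 100 requests per second.
bucket = TokenBucket(rate=100)
```

Call `try_acquire()` before each request and back off briefly when it returns `False`.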
Hybrid-Cloud AI Enterprise
Unlimited SaaS or VPC AI development and production workloads.
- Unlimited API calls
- Clarifai’s SaaS or private control plane
- Multi-cloud and multi-region compute planes with top GPUs
- Optional air-gapped deployments and private data planes
- Full model exports and leaderboards
- Custom rate limits
- Multiple Organizations
- Role-based access and Teams
- Enterprise 99.99% SLAs
- 24/7 dedicated support
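For context, the 99.99% SLA above leaves very little room for downtime. A quick back-of-the-envelope check (365-day year assumed):

```python
# Downtime budget implied by an availability SLA.
def downtime_minutes_per_year(sla):
    """Minutes of permitted downtime per 365-day year at a given availability."""
    return 365 * 24 * 60 * (1 - sla)

print(round(downtime_minutes_per_year(0.9999), 1))  # prints 52.6
```

In other words, 99.99% availability allows under an hour of downtime per year.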
Compare Plans
Compare features and benefits across every plan.
| | Pay As You Go | Enterprise |
|---|---|---|
| **Usage & limits** | | |
| Monthly requests | 100,000 | Unlimited |
| Requests per second | 100 | 1000+ |
| SDK & API access | | |
| **Compute** | | |
| Deployment types | SaaS, Local Dev, Hybrid Cloud (Self-Hosted) | SaaS, Local Dev, Hybrid Cloud (Self-Hosted), VPC, On-Prem, Air Gapped |
| NVIDIA GPUs | A10G, L4, L40S, A100 | H100, H200, B200 |
| Intel & AMD CPUs | | |
| **Inference** | | |
| Pre-trained model access | | |
| Batch requests | | |
| Realtime bi-directional streaming | | |
| GPU fractioning | | |
| Scale to zero | | |
| Spot instances | | |
| **Development & training** | | |
| Custom model training | Train & deploy | Enterprise AI |
| Model evaluation | | |
| Model upload | | |
| Model export | | |
| Dataset management | | |
| Vector search | | |
| Automated data labeling | | |
Compare Plans
Compare features and benefits across every plan.
| | Community | Essential | Professional | Hybrid AI Enterprise | Private AI Enterprise |
|---|---|---|---|---|---|
| **Usage & limits** | | | | | |
| Monthly requests | Limited | 30,000 | 100,000 | Unlimited | Unlimited |
| Requests per second | 1 | 15 | 100 | 1000+ | 1000+ |
| SDK & API access | | | | | |
| **Compute** | | | | | |
| Deployment types | SaaS, Local Dev | + Hybrid Cloud (Self-Hosted) | + Hybrid Cloud (Self-Hosted) | + Hybrid Cloud (Self-Hosted) | + VPC, On-Prem, Air Gapped |
| NVIDIA GPUs | Serverless | A10G, L4 | + L40S | + A100, H100, H200, B200 | + A100, H100, H200, B200 |
| Intel & AMD CPUs | | | | | |
| **Inference** | | | | | |
| Pre-trained model access | | | | | |
| Batch requests | | | | | |
| Realtime bi-directional streaming | | | | | |
| GPU fractioning | | | | | |
| Scale to zero | | | | | |
| Spot instances | | | | | |
| **Development & training** | | | | | |
| Custom model training | | Fine-tune | Train & deploy | Full training | Enterprise AI |
| Model evaluation | | | | | |
| Model upload | | | | | |
| Model export | | | | | |
| Dataset management | | | | | |
| Vector search | | | | | |
| Automated data labeling | | | | | |
Inference Pricing
Over 500 leading open-source and closed-source language, multimodal, image, code, and embedding models are available for serverless and dedicated inference.
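For reference, a predict request against a hosted model can be assembled as below. The endpoint path, header format, and model ID are assumptions based on Clarifai's v2 REST API conventions; verify them against the current API reference before use:

```python
import json

# Sketch of a Clarifai v2 predict request. The endpoint shape, auth header,
# and model ID here are assumptions -- check the current API docs before use.
CLARIFAI_PAT = "YOUR_PAT_HERE"     # personal access token (placeholder)
MODEL_ID = "example-text-model"    # hypothetical model ID

url = f"https://api.clarifai.com/v2/models/{MODEL_ID}/outputs"
headers = {
    "Authorization": f"Key {CLARIFAI_PAT}",
    "Content-Type": "application/json",
}
payload = {
    "inputs": [
        {"data": {"text": {"raw": "Summarize: Clarifai pricing plans."}}}
    ]
}
body = json.dumps(payload)
# Send with any HTTP client, e.g. requests.post(url, headers=headers, data=body)
```

The same payload shape extends to image or multimodal inputs by swapping the `data` field's contents.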
Amazon Web Services (us-east-1, us-west-2)

| Node name | Cloud instance name | Price per minute |
|---|---|---|
| NVIDIA T4 16GB XL | g4dn.xlarge | $0.011 |
| NVIDIA A10G 24GB XL | g5.xlarge | $0.021 |
| NVIDIA A10G 24GB 2XL | g5.2xlarge | $0.0253 |
| NVIDIA L4 24GB XL | g6.xlarge | $0.0168 |
| NVIDIA L4 24GB 2XL | g6.2xlarge | $0.0204 |
| NVIDIA L40S 48GB XL | g6e.xlarge | $0.0388 |
| NVIDIA L40S 192GB 12XL | g6e.12xlarge | $0.2186 |
| AMD EPYC 7000 M | t3a.medium | $0.0012 |
| AMD EPYC 7000 L | t3a.large | $0.0016 |
| AMD EPYC 7000 XL | t3a.xlarge | $0.0031 |
| AMD EPYC 7000 2XL | t3a.2xlarge | $0.0063 |

Google Cloud (us-east4)

| Node name | Cloud instance name | Price per minute |
|---|---|---|
| NVIDIA L4 24GB XL | g2-standard-4 | $0.0147 |
| NVIDIA L4 24GB 2XL | g2-standard-8 | $0.0178 |
| NVIDIA L4 24GB 3XL | g2-standard-12 | $0.0208 |
| NVIDIA L4 24GB 4XL | g2-standard-16 | $0.0239 |
| NVIDIA L4 24GB 5XL | g2-standard-32 | $0.0361 |
| NVIDIA A100 80GB XL | a2-ultragpu-1g | $0.1189 |
| NVIDIA H100 80GB XL | a3-highgpu-1g | $0.2304 |
| Intel ICL/CSL CPU 8GB | n2-standard-2 | $0.0023 |
| Intel ICL/CSL CPU 16GB | n2-standard-4 | $0.0046 |
| Intel ICL/CSL CPU 32GB | n2-standard-8 | $0.0091 |
| Intel ICL/CSL CPU 64GB | n2-standard-16 | $0.0182 |

Vultr (new-york)

| Node name | Cloud instance name | Price per minute |
|---|---|---|
| NVIDIA A16 M 16GB | vcg-a16-6c-64g-16vram | $0.0098 |
| NVIDIA L40S XL 48GB | vcg-l40s-16c-180g-48vram | $0.0348 |
| NVIDIA A100 XL 80GB | vcg-a100-12c-120g-80vram | $0.0499 |
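Since the prices above are quoted per minute, estimating a monthly bill for an always-on dedicated node is simple multiplication. A sketch using two prices from the AWS table (assumes continuous use with no scale-to-zero):

```python
# Per-minute prices taken from the pricing table above.
PRICE_PER_MIN = {
    "NVIDIA T4 16GB XL (g4dn.xlarge)": 0.011,
    "NVIDIA L40S 48GB XL (g6e.xlarge)": 0.0388,
}

def monthly_cost(node, hours_per_day=24, days=30):
    """Estimated cost of running one node for a month at the given duty cycle."""
    return PRICE_PER_MIN[node] * 60 * hours_per_day * days

for node in PRICE_PER_MIN:
    print(f"{node}: ${monthly_cost(node):,.2f}/month")
```

Features such as scale-to-zero or spot instances would reduce the effective hours billed, so treat these figures as upper bounds.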
Frequently Asked Questions
How do plan credits and overages work?
Each plan provides a monthly credit that can be applied toward various operations, such as model predictions, training, and data storage. If your usage exceeds the included credit, additional charges apply based on the specific operations performed.

What are the monthly usage limits?
Each plan caps the number of operations and inputs that can be stored per month. The Essential plan allows up to 30,000 operations, while the Professional plan permits up to 100,000. To exceed these limits, contact sales@clarifai.com for a custom package that includes committed-volume discounts.

How do I cancel my subscription?
You can cancel your subscription at any time by downgrading to the Community plan.

Do you offer professional services?
Clarifai has a team of AI experts who will help you implement AI projects. We can also offer custom development, depending on the project's needs. Please contact us to share more about your project.
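The credit-and-overage model described above amounts to a simple calculation. The base fee, included operations, and per-operation rate below are placeholders for illustration only, not Clarifai's actual prices:

```python
# Hypothetical overage estimate. All numbers here are placeholders,
# not Clarifai's actual rates -- see your plan for real figures.
def estimated_bill(base_fee, included_ops, price_per_extra_op, ops_used):
    """Base subscription fee plus charges for operations beyond the included amount."""
    extra = max(0, ops_used - included_ops)
    return base_fee + extra * price_per_extra_op

# e.g. a plan with 100,000 included operations, billed for 120,000 used
print(estimated_bill(base_fee=30.0, included_ops=100_000,
                     price_per_extra_op=0.0012, ops_used=120_000))
```

Usage at or below the included amount incurs only the base fee; only the excess is billed per operation.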