Llama 3.3 (70B) is a multilingual instruction-tuned LLM optimized for dialogue, trained on 15T+ tokens, supporting 8 languages, and incorporating strong safety measures
Llama-3.3-70B-Instruct is a multilingual large language model (LLM) developed by Meta. It is a pretrained and instruction-tuned generative model optimized for multilingual dialogue and text generation. The model achieves state-of-the-art performance across multiple industry benchmarks, surpassing many open-source and proprietary chat models.
Llama-3.3-70B-Instruct Model Details
Model Developer: Meta
Architecture: Auto-regressive transformer model, aligned with supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF)
```python
from clarifai.client.model import Model

prompt = "What's the future of AI?"
inference_params = dict(temperature=0.7, max_tokens=200, top_k=50, top_p=0.95)

# Model Predict
model_prediction = Model(
    "https://clarifai.com/meta/Llama-3/models/llama-3_3-70b-instruct"
).predict_by_bytes(prompt.encode(), input_type="text", inference_params=inference_params)

print(model_prediction.outputs[0].data.text.raw)
```
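The `inference_params` above control sampling. As an illustration only (the hosted API applies these parameters server-side; this sketch is not part of the SDK), here is roughly what temperature scaling, top-k, and top-p (nucleus) filtering do to a toy logit vector:

```python
import numpy as np

def sample_top_k_top_p(logits, temperature=0.7, top_k=50, top_p=0.95, rng=None):
    """Illustrative decoding step: temperature scaling, then top-k, then top-p filtering."""
    rng = rng or np.random.default_rng(0)
    scaled = logits / temperature
    probs = np.exp(scaled - scaled.max())  # stable softmax
    probs /= probs.sum()
    # keep only the top_k most likely tokens
    order = np.argsort(probs)[::-1][:top_k]
    p = probs[order]
    # keep the smallest prefix whose cumulative mass reaches top_p
    cut = np.searchsorted(np.cumsum(p), top_p) + 1
    order, p = order[:cut], p[:cut]
    p /= p.sum()
    return order[rng.choice(len(order), p=p)]

logits = np.array([2.0, 1.0, 0.5, -1.0])
token = sample_top_k_top_p(logits)  # index of the sampled token
```

Lower `temperature` and `top_p` make outputs more deterministic; higher values increase diversity.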
Use Cases
Llama-3.3-70B-Instruct is designed for both commercial and research applications. Some primary use cases include:
Conversational AI: Enhanced chatbot interactions across multiple languages
Content Generation: Generating high-quality text for various domains
Code Generation: Supporting developers in writing and debugging code
Multilingual Assistance: Providing language-specific responses for different regions
Synthetic Data Generation: Facilitating model distillation and fine-tuning
Knowledge-based Question Answering: Answering domain-specific and general knowledge questions
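For conversational use against the raw model (rather than a hosted endpoint that applies chat templating for you), prompts follow the Llama 3 instruct format with special header tokens. A minimal sketch of that template, using an illustrative helper name:

```python
def llama3_chat_prompt(system: str, user: str) -> str:
    """Build a Llama 3 instruct-format prompt (special-token layout per Meta's docs)."""
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = llama3_chat_prompt("You are a helpful assistant.", "What's the future of AI?")
```

The model then generates the assistant turn and emits `<|eot_id|>` when finished.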
Out-of-Scope Uses
Applications violating laws, trade compliance regulations, or the Acceptable Use Policy
Deployment in unsupported languages without additional fine-tuning
Evaluation and Benchmark Results
Llama-3.3-70B-Instruct demonstrates significant improvements over earlier Llama models on key benchmarks:
| Category | Benchmark | # Shots | Metric | Llama 3.1 8B Instruct | Llama 3.1 70B Instruct | Llama 3.3 70B Instruct | Llama 3.1 405B Instruct |
|---|---|---|---|---|---|---|---|
| General | MMLU (CoT) | 0 | macro_avg/acc | 73.0 | 86.0 | 86.0 | 88.6 |
| General | MMLU Pro (CoT) | 5 | macro_avg/acc | 48.3 | 66.4 | 68.9 | 73.3 |
| Steerability | IFEval | - | - | 80.4 | 87.5 | 92.1 | 88.6 |
| Reasoning | GPQA Diamond (CoT) | 0 | acc | 31.8 | 48.0 | 50.5 | 49.0 |
| Code | HumanEval | 0 | pass@1 | 72.6 | 80.5 | 88.4 | 89.0 |
| Code | MBPP EvalPlus (base) | 0 | pass@1 | 72.8 | 86.0 | 87.6 | 88.6 |
| Math | MATH (CoT) | 0 | sympy_intersection_score | 51.9 | 68.0 | 77.0 | 73.8 |
| Tool Use | BFCL v2 | 0 | overall_ast_summary/macro_avg/valid | 65.4 | 77.5 | 77.3 | 81.1 |
| Multilingual | MGSM | 0 | em | 68.9 | 86.9 | 91.1 | 91.6 |
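The pass@1 metric used for HumanEval and MBPP is conventionally computed with the unbiased pass@k estimator (n generated samples, c of them passing the tests); a short sketch of that calculation:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: probability that at least one of k
    samples drawn from n generations (c correct) passes."""
    if n - c < k:
        return 1.0  # fewer failures than draws: some draw must succeed
    return 1.0 - comb(n - c, k) / comb(n, k)

# For k=1 this reduces to the plain success rate c/n:
print(pass_at_k(10, 7, 1))  # → 0.7
```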
Dataset
Llama-3.3-70B-Instruct was trained on a new mix of publicly available online data.
Pretraining Data: ~15 trillion tokens from publicly available sources
Fine-tuning Data: Over 25 million synthetically generated instruction examples
Data Freshness: Training data cutoff in December 2023
Advantages
State-of-the-art performance on multilingual benchmarks
Extended context length of 128k tokens for improved long-form reasoning
Advanced instruction tuning using RLHF for better alignment with human intent
Improved multilingual capabilities in 8 languages
Optimized for dialogue and task-specific prompting
Efficient inference with GQA for scalable deployments
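Grouped-query attention (GQA) cuts the key/value cache by letting several query heads share one key/value head. A minimal NumPy sketch of the idea (not Meta's implementation; head counts are illustrative):

```python
import numpy as np

def gqa(q, k, v, n_kv_heads):
    """Grouped-query attention: n_heads query heads share n_kv_heads K/V heads."""
    n_heads, seq_len, d = q.shape
    group = n_heads // n_kv_heads
    # broadcast each K/V head across its group of query heads
    k = np.repeat(k, group, axis=0)
    v = np.repeat(v, group, axis=0)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v

q = np.random.randn(8, 4, 16)  # 8 query heads
k = np.random.randn(2, 4, 16)  # only 2 K/V heads stored
v = np.random.randn(2, 4, 16)
out = gqa(q, k, v, n_kv_heads=2)  # shape (8, 4, 16)
```

With 8 query heads and 2 K/V heads, the K/V cache is 4x smaller than full multi-head attention at the same query-head count.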
Limitations
Limited to 8 officially supported languages, though it may generate text in other languages with varying quality
Potential for hallucination, especially on topics beyond its training data
Not designed for real-time updating, as it is a static model with a fixed knowledge cutoff
Requires external safeguards when integrated into production systems to mitigate risks
Biases in training data may lead to unintended outputs, requiring careful evaluation before deployment
ID:
Model Type ID: Text To Text
Input Type: text
Output Type: text
Description: Llama 3.3 (70B) is a multilingual instruction-tuned LLM optimized for dialogue, trained on 15T+ tokens, supporting 8 languages, and incorporating strong safety measures