The maximum number of tokens to generate. Lower values return responses faster.
A decimal number that controls the degree of randomness in the response; higher values produce more varied output.
An alternative to sampling with temperature, where the model samples from the smallest set of tokens whose cumulative probability mass reaches top_p.
The top-k parameter limits the model's predictions to the top k most probable tokens at each step of generation.
The num_beams parameter controls beam search, a decoding method that affects the quality and diversity of generated text.
Template for formatting the prompt. Can be an arbitrary string, but must contain the substring `{prompt}`.
A system prompt sets the behavior and context for an AI assistant in a conversation, such as modifying its personality.
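To make the top-k and top-p parameters concrete, here is a minimal, self-contained sketch of how the two filters combine over a toy next-token distribution (plain Python, no SDK required; the vocabulary and probabilities are illustrative only, not actual model outputs):

```python
def top_k_top_p_filter(probs, k=50, p=0.95):
    """Keep only the top-k tokens, then the smallest prefix of those
    whose cumulative probability reaches p, and renormalize."""
    # Rank tokens from most to least probable, truncating to the top k.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    kept, total = [], 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        total += prob
        if total >= p:  # nucleus (top_p) cutoff reached
            break
    # Renormalize so the kept probabilities sum to 1.
    return {token: prob / total for token, prob in kept}

# Toy next-token distribution (illustrative values only).
probs = {"the": 0.5, "a": 0.3, "an": 0.15, "cat": 0.05}
filtered = top_k_top_p_filter(probs, k=3, p=0.9)
```

With `k=3` and `p=0.9`, the long-tail token `"cat"` is dropped and sampling proceeds over the renormalized remainder; lowering either parameter shrinks the candidate set further.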
Introduction
Llama-3.2-3B-Instruct SLM is part of Meta's Llama 3.2 collection of multilingual large language models (LLMs). This instruction-tuned model is optimized for a wide range of multilingual dialogue applications, including retrieval, summarization, and conversational agents. With 3.21 billion parameters, this model is tuned for tasks requiring natural language understanding and generation in multiple languages, outperforming many other models on industry benchmarks. It supports both text input and output in various languages, making it highly versatile for global applications.
Llama-3.2-11B-Vision-Instruct
Although the Llama-3.2 family also includes multimodal models such as the Llama-3.2-11B-Vision-Instruct, which handles text and image inputs, the Llama-3.2-3B-Instruct SLM is a text-only model. It specializes in multilingual natural language processing tasks, allowing developers to fine-tune it for text-based use cases that require high-quality, nuanced language understanding and generation.
Run Llama 3.2 with an API
Running the API with Clarifai's Python SDK
You can run the Llama 3.2 Model API using Clarifai’s Python SDK.
Export your PAT as an environment variable. Then, import and initialize the API Client.
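As a sketch of the export step, the Clarifai Python SDK reads the Personal Access Token from the `CLARIFAI_PAT` environment variable (the token value below is a placeholder; substitute your own PAT from the Clarifai portal):

```shell
# Placeholder value; replace with your own Personal Access Token.
export CLARIFAI_PAT="YOUR_PERSONAL_ACCESS_TOKEN"
```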
from clarifai.client.model import Model

prompt = "what's the future of AI?"
inference_params = dict(temperature=0.7, max_tokens=200, top_k=50, top_p=0.95)

# Model Predict
model_prediction = Model(
    "https://clarifai.com/meta/Llama-3/models/llama-3_2-3b-instruct"
).predict_by_bytes(prompt.encode(), input_type="text", inference_params=inference_params)

print(model_prediction.outputs[0].data.text.raw)
You can also run the Llama 3.2-3b-Instruct API using other Clarifai client libraries, such as Java, cURL, NodeJS, and PHP.
Aliases: Llama 3.2-3b-Instruct, llama 3.2
Use Cases
Llama-3.2-3B-Instruct SLM is suitable for a variety of use cases, especially in multilingual environments:
Multilingual Customer Support: Real-time responses to customer inquiries across multiple languages.
Summarization: Condensing large text bodies in multiple languages into concise, coherent summaries.
Knowledge Retrieval: Extracting relevant information from large text databases for research, customer service, or academic use.
Conversational Agents: Powering intelligent chatbots capable of interacting in multiple languages with coherent and contextually appropriate responses.
Content Rewriting and Translation: Rewriting or translating text content into different languages while maintaining the original meaning and style.
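As a minimal sketch of the summarization use case, the prompt template mechanism described above (an arbitrary string containing the `{prompt}` placeholder) can be filled in with plain Python string formatting; the template text and article below are illustrative only:

```python
# Illustrative template; any string containing "{prompt}" works.
template = "Summarize the following text in two sentences:\n\n{prompt}"

article = "Llama 3.2 is a collection of multilingual models released by Meta."

# Substitute the user's text into the placeholder to build the final prompt.
full_prompt = template.format(prompt=article)
```

The resulting `full_prompt` string is what would be sent to the model (for example, via `predict_by_bytes` as shown earlier), with the system prompt supplied separately if the use case calls for one.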
Evaluation and Benchmark Results
English Text Benchmarks (Base Pretrained Models)
Category | Benchmark | # Shots | Metric | Llama 3.2 1B | Llama 3.2 3B | Llama 3.1 8B
General | MMLU | 5 | macro_avg/acc_char | 32.2 | 58.0 | 66.7
General | AGIEval English | 3-5 | average/acc_char | 23.3 | 39.2 | 47.8
General | ARC-Challenge | 25 | acc_char | 32.8 | 69.1 | 79.7
Reading comprehension | SQuAD | 1 | em | 49.2 | 67.7 | 77.0
Reading comprehension | QuAC | 1 | f1 | 37.9 | 42.9 | 44.9
Reading comprehension | DROP | 3 | f1 | 28.0 | 45.2 | 59.5
Long Context | Needle in Haystack | 0 | em | 96.8 | 1.0 | 1.0
Dataset
Llama-3.2 models were trained on a new mix of publicly available online data. The text-only version of the model uses up to 9 trillion tokens across various languages, enabling it to excel in multilingual understanding. The dataset includes multilingual text and code, making the model proficient in generating both natural language and programming languages.
Advantages
Multilingual Support: Officially supports 8 languages, with the ability to fine-tune for more. This allows for extensive applicability in multilingual dialogue systems and global contexts.
Optimized for Instruction Following: Llama-3.2-3B-Instruct SLM excels at tasks involving clear instruction-following, making it useful for tasks like question-answering, summarization, and content rewriting.
High Performance on Industry Benchmarks: Outperforms several open-source and proprietary models on common benchmarks, demonstrating its high utility in both general and instruction-tuned scenarios.
Scalable: The 3.21B parameter architecture balances performance with scalability, making it ideal for developers and businesses looking for a high-performing yet efficient model.
Limitations
Knowledge Cutoff: As the training data is current up to December 2023, the model may not be aware of events or data after this period, limiting its use in real-time knowledge applications.
Multimodal Limitations: While the Llama-3.2 family includes multimodal models, the 3B-Instruct SLM is a text-only model, limiting its utility in applications requiring image or multimodal input.
Language Coverage: Though officially supporting 8 languages, performance may degrade for languages that fall outside this core set. Fine-tuning may be required for optimal performance in additional languages.
ID:
Model Type ID: Text To Text
Input Type: text
Output Type: text
Description: Llama-3.2-3B-Instruct SLM is a multilingual, instruction-tuned LLM optimized for dialogue and text generation.