The Llama 3 instruction tuned llm are optimized for dialogue use cases and outperform many of the available open source chat llm on common industry benchmarks
The maximum number of tokens to generate. Shorter token lengths will provide faster performance.
A decimal number that determines the degree of randomness in the response
An alternative to sampling with temperature, where the model considers the results of the tokens with top_p probability mass.
The top-k parameter limits the model's predictions to the top k most probable tokens at each step of generation.
num_beams parameter is integral to a method called beam search, which impacts the quality and diversity of generated text
Template for formatting the prompt. Can be an arbitrary string, but must contain the substring `{prompt}`.
A system prompt sets the behavior and context for an AI assistant in a conversation, such as modifying its personality.
ResetModel loading...
Output
Notes
ID
Model Type ID
Text To Text
Input Type
text
Output Type
text
Description
The Llama 3 instruction tuned llm are optimized for dialogue use cases and outperform many of the available open source chat llm on common industry benchmarks