Notes
Introduction
GPT-4o Mini is the most cost-efficient small model released by OpenAI. Designed to significantly expand the range of applications built with AI, GPT-4o Mini makes intelligence much more affordable. It achieves an impressive 82% score on the MMLU benchmark and currently outperforms GPT-4 in chat preferences on the LMSYS leaderboard.
GPT-4o Mini LLM
GPT-4o Mini is a small model with superior textual intelligence and multimodal reasoning capabilities. It surpasses GPT-3.5 Turbo and other small models on academic benchmarks across both textual intelligence and multimodal reasoning. It supports the same range of languages as GPT-4o. The model supports text and vision in the API. It features a context window of 128K tokens, supports up to 16K output tokens per request, and possesses knowledge up to October 2023.
Run GPT-4o Mini with an API
Running the API with Clarifai's Python SDK
You can run the GPT-4o Mini model API using Clarifai's Python SDK.
Export your PAT as an environment variable. Then, import and initialize the API Client.
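For example (a minimal sketch: the Clarifai Python SDK typically reads the token from the CLARIFAI_PAT environment variable, and the placeholder value below is hypothetical):

import os

# Set the Personal Access Token (PAT) the SDK will use. In practice, export
# CLARIFAI_PAT in your shell instead of hard-coding it; the value below is a placeholder.
os.environ["CLARIFAI_PAT"] = "YOUR_PAT_HERE"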
from clarifai.client.model import Model
from clarifai.client.input import Inputs

prompt = "What time of day is it?"
image_url = "https://samples.clarifai.com/metro-north.jpg"

# temperature: degree of randomness in the response; max_tokens: cap on the number
# of generated tokens (shorter lengths return faster); top_p: nucleus-sampling
# alternative to temperature.
inference_params = dict(temperature=0.2, max_tokens=100, top_p=0.9)

model_prediction = Model(
    "https://clarifai.com/openai/chat-completion/models/gpt-4o-mini"
).predict(
    inputs=[Inputs.get_multimodal_input(input_id="", image_url=image_url, raw_text=prompt)],
    inference_params=inference_params,
)
print(model_prediction.outputs[0].data.text.raw)
You can also run the GPT-4o Mini API using other Clarifai client libraries, such as Java, cURL, NodeJS, and PHP.
Predict via Local Image
from clarifai.client.model import Model
from clarifai.client.input import Inputs

IMAGE_FILE_LOCATION = 'LOCAL IMAGE PATH'
# Read the local image into bytes so it can be sent as part of the multimodal input.
with open(IMAGE_FILE_LOCATION, "rb") as f:
    file_bytes = f.read()

prompt = "What time of day is it?"
inference_params = dict(temperature=0.2, max_tokens=100, top_p=0.9)

model_prediction = Model(
    "https://clarifai.com/openai/chat-completion/models/gpt-4o-mini"
).predict(
    inputs=[Inputs.get_multimodal_input(input_id="", image_bytes=file_bytes, raw_text=prompt)],
    inference_params=inference_params,
)
print(model_prediction.outputs[0].data.text.raw)
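The only difference from the hosted-image example above is that the image is passed as raw bytes (image_bytes) rather than as a URL (image_url); the prompt, inference parameters, and response handling are unchanged.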
Use Cases
GPT-4o Mini's low cost and latency enable a broad range of tasks, including:
Applications that chain or parallelize multiple model calls (e.g., calling multiple APIs); a parallelization sketch follows this list
Handling large volumes of context (e.g., full code base or conversation history)
Interacting with customers through fast, real-time text responses (e.g., customer support chatbots)
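As an illustrative sketch of the first use case, the multimodal predict pattern from the earlier examples can be fanned out over several prompts with Python's standard concurrent.futures. The prompts, worker count, and the assumption that one Model client can be shared across threads are illustrative choices, not part of the official examples:

from concurrent.futures import ThreadPoolExecutor

from clarifai.client.input import Inputs
from clarifai.client.model import Model

model = Model("https://clarifai.com/openai/chat-completion/models/gpt-4o-mini")
image_url = "https://samples.clarifai.com/metro-north.jpg"
prompts = [
    "What time of day is it?",
    "How many people are in the scene?",
    "Describe the weather in one sentence.",
]

def ask(prompt):
    # Same multimodal predict call as in the examples above, one prompt per call.
    prediction = model.predict(
        inputs=[Inputs.get_multimodal_input(input_id="", image_url=image_url, raw_text=prompt)],
        inference_params=dict(temperature=0.2, max_tokens=100, top_p=0.9),
    )
    return prediction.outputs[0].data.text.raw

# Issue the calls concurrently; the model's low latency keeps total wall time
# close to that of a single call.
with ThreadPoolExecutor(max_workers=3) as pool:
    for answer in pool.map(ask, prompts):
        print(answer)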
Today, GPT-4o Mini supports text and vision in the API, with future support for text, image, video, and audio inputs and outputs.
Evaluation and Benchmark Results
GPT-4o Mini has been evaluated across several key benchmarks, demonstrating its proficiency in reasoning tasks, math, coding, and multimodal reasoning:
Reasoning tasks: Scored 82.0% on MMLU, outperforming Gemini Flash (77.9%) and Claude Haiku (73.8%).
Math and coding proficiency: Scored 87.0% on MGSM (math reasoning), compared to Gemini Flash (75.5%) and Claude Haiku (71.7%). Scored 87.2% on HumanEval (coding performance), compared to Gemini Flash (71.5%) and Claude Haiku (75.9%).
Multimodal reasoning: Scored 59.4% on MMMU, compared to Gemini Flash (56.1%) and Claude Haiku (50.2%).
Advantages
Cost-efficiency: Significantly more affordable than previous models, making AI applications accessible to a broader audience.
High performance: Excels in textual intelligence, math, coding, and multimodal reasoning tasks.
Wide range of languages: Supports the same languages as GPT-4o, making it versatile for global applications.
Long-context handling: Improved performance in handling long-context tasks compared to GPT-3.5 Turbo.
Limitations
Knowledge cut-off: The model's knowledge extends only to October 2023, which may limit its effectiveness on tasks that require more recent information.
Limited modalities: The API currently supports only text and vision (image) inputs with text outputs; support for additional image, video, and audio inputs and outputs is still in development.
Built-in Safety Measures
Safety is a core focus in GPT-4o Mini’s development:
Pre-Training Filtering: Excludes harmful content such as hate speech and spam.
Post-Training Alignment: Uses techniques like RLHF to ensure the model aligns with safety policies.
Expert Evaluations: More than 70 external experts have tested the model to identify and mitigate potential risks.
Instruction Hierarchy Method: Improves resistance to jailbreaks and prompt injections, making the model safer for scalable applications.
ID: gpt-4o-mini
Model Type ID: Multimodal To Text
Input Type: image
Output Type: text
Description: GPT-4o Mini: An affordable, high-performing small model excelling in text and vision tasks with extensive context support.