gpt-4o-mini

GPT-4o Mini: An affordable, high-performing small model excelling in text and vision tasks with extensive context support

Input

The model accepts a text prompt (optionally paired with an image) along with the following inference parameters:

  • max_tokens — The maximum number of tokens to generate. Shorter token lengths will provide faster performance.
  • temperature — A decimal number that determines the degree of randomness in the response.
  • top_p — An alternative to sampling with temperature, where the model samples from the top p percentage of most likely tokens.

Output

The model returns a text response to the submitted prompt.

Notes

Introduction

GPT-4o Mini is the most cost-efficient small model released by OpenAI. Designed to significantly expand the range of applications built with AI, GPT-4o Mini makes intelligence much more affordable. It achieves an impressive 82% score on the MMLU benchmark and currently outperforms GPT-4 in chat preferences on the LMSYS leaderboard.

GPT-4o Mini LLM

GPT-4o Mini is a small model with strong textual intelligence and multimodal reasoning capabilities. It surpasses GPT-3.5 Turbo and other small models on academic benchmarks across both textual intelligence and multimodal reasoning, and it supports the same range of languages as GPT-4o. The model accepts text and vision inputs in the API. It features a context window of 128K tokens, supports up to 16K output tokens per request, and has a knowledge cutoff of October 2023.

Run GPT-4o Mini with an API

Running the API with Clarifai's Python SDK

You can run the GPT-4o Mini model API using Clarifai's Python SDK.

Export your PAT as an environment variable. Then, import and initialize the API Client.

Find your PAT in your security settings.

export CLARIFAI_PAT={your personal access token}
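
As a quick sanity check before running predictions, the sketch below (an illustrative snippet, not part of the official docs) reads the PAT from the environment and initializes the same Model client used in the examples that follow:

import os
from clarifai.client.model import Model

# The SDK picks up CLARIFAI_PAT from the environment; fail early if it is missing.
assert os.environ.get("CLARIFAI_PAT"), "Export CLARIFAI_PAT before running predictions."

# Initialize the client once; it can be reused across multiple predict calls.
model = Model("https://clarifai.com/openai/chat-completion/models/gpt-4o-mini")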

Predict via Image URL

from clarifai.client.model import Model
from clarifai.client.input import Inputs

prompt = "What time of day is it?"
image_url = "https://samples.clarifai.com/metro-north.jpg"
inference_params = dict(temperature=0.2, max_tokens=100, top_p=0.9)

# Build a single multimodal input pairing the image URL with the text prompt
model_prediction = Model(
    "https://clarifai.com/openai/chat-completion/models/gpt-4o-mini"
).predict(
    inputs=[Inputs.get_multimodal_input(input_id="", image_url=image_url, raw_text=prompt)],
    inference_params=inference_params,
)

print(model_prediction.outputs[0].data.text.raw)

You can also call the GPT-4o Mini API using Clarifai's other client libraries, such as Java, cURL, NodeJS, and PHP.

Predict via local Image

from clarifai.client.model import Model
from clarifai.client.input import Inputs

IMAGE_FILE_LOCATION = 'LOCAL IMAGE PATH'

# Read the local image file into bytes
with open(IMAGE_FILE_LOCATION, "rb") as f:
    file_bytes = f.read()

prompt = "What time of day is it?"
inference_params = dict(temperature=0.2, max_tokens=100, top_p=0.9)

# Pass the raw image bytes instead of a URL
model_prediction = Model(
    "https://clarifai.com/openai/chat-completion/models/gpt-4o-mini"
).predict(
    inputs=[Inputs.get_multimodal_input(input_id="", image_bytes=file_bytes, raw_text=prompt)],
    inference_params=inference_params,
)
print(model_prediction.outputs[0].data.text.raw)
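
Both examples make the same predict call; the only difference is supplying image_bytes (the raw file contents) instead of image_url, which is useful when the image is not publicly hosted.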

Use Cases

GPT-4o Mini's low cost and latency enable a broad range of tasks, including:

  • Applications that chain or parallelize multiple model calls (e.g., calling multiple APIs); see the sketch after this list
  • Handling large volumes of context (e.g., full code base or conversation history)
  • Interacting with customers through fast, real-time text responses (e.g., customer support chatbots)
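
As a minimal sketch of the parallelization pattern (illustrative only; the prompts are hypothetical, and Inputs.get_text_input is assumed for text-only requests), several independent prompts can be fanned out to GPT-4o Mini with a thread pool:

import concurrent.futures

from clarifai.client.model import Model
from clarifai.client.input import Inputs

MODEL_URL = "https://clarifai.com/openai/chat-completion/models/gpt-4o-mini"

# Hypothetical prompts standing in for independent subtasks
prompts = ["Summarize ticket A", "Summarize ticket B", "Summarize ticket C"]

def ask(prompt):
    # Each call sends one text-only input to the model
    prediction = Model(MODEL_URL).predict(
        inputs=[Inputs.get_text_input(input_id="", raw_text=prompt)],
        inference_params=dict(temperature=0.2, max_tokens=100),
    )
    return prediction.outputs[0].data.text.raw

# The model's low per-call latency makes thread-level fan-out effective
with concurrent.futures.ThreadPoolExecutor(max_workers=3) as pool:
    for answer in pool.map(ask, prompts):
        print(answer)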

Today, GPT-4o Mini supports text and vision in the API, with future support for text, image, video, and audio inputs and outputs.

Evaluation and Benchmark Results

GPT-4o Mini has been evaluated across several key benchmarks, demonstrating its proficiency in reasoning tasks, math, coding, and multimodal reasoning:

  • Reasoning tasks: Scored 82.0% on MMLU, outperforming Gemini Flash (77.9%) and Claude Haiku (73.8%).
  • Math and coding proficiency: Scored 87.0% on MGSM (math reasoning), compared to Gemini Flash (75.5%) and Claude Haiku (71.7%). Scored 87.2% on HumanEval (coding performance), compared to Gemini Flash (71.5%) and Claude Haiku (75.9%).
  • Multimodal reasoning: Scored 59.4% on MMMU, compared to Gemini Flash (56.1%) and Claude Haiku (50.2%).

Advantages

  • Cost-efficiency: Significantly more affordable than previous models, making AI applications accessible to a broader audience.
  • High performance: Excels in textual intelligence, math, coding, and multimodal reasoning tasks.
  • Wide range of languages: Supports the same languages as GPT-4o, making it versatile for global applications.
  • Long-context handling: Improved performance in handling long-context tasks compared to GPT-3.5 Turbo.

Limitations

  • Knowledge cut-off: The model's knowledge is up to date only until October 2023, which may limit its effectiveness for more recent information.
  • Limited modalities: It currently supports text and vision inputs with text outputs; support for image, video, and audio inputs and outputs is still in development.

Built-in Safety Measures

Safety is a core focus in GPT-4o Mini’s development:

  • Pre-Training Filtering: Excludes harmful content such as hate speech and spam.
  • Post-Training Alignment: Uses techniques like RLHF to ensure the model aligns with safety policies.
  • Expert Evaluations: More than 70 external experts have tested the model to identify and mitigate potential risks.
  • Instruction Hierarchy Method: Improves resistance to jailbreaks and prompt injections, making the model safer for scalable applications.

Model Details

  • Model Type ID: Multimodal To Text
  • Input Type: image
  • Output Type: text
  • Last Updated: Oct 17, 2024
  • Privacy: PUBLIC