MiniCPM3-4B

MiniCPM3-4B is the 3rd generation of the MiniCPM series. Its overall performance surpasses Phi-3.5-mini-Instruct and GPT-3.5-Turbo-0125, and it is comparable with many recent 7B~9B models.

Input

Prompt: the text prompt to send to the model. The following inference parameters can also be set:

  • Max Tokens: the maximum number of tokens to generate. Shorter token limits provide faster responses.
  • Temperature: a decimal number that determines the degree of randomness in the response.
  • Top P: an alternative to sampling with temperature, where the model considers only the tokens whose cumulative probability mass reaches top_p.
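
To make the Top P setting concrete, here is a toy sketch of nucleus (top_p) sampling over a hand-made distribution. It is purely illustrative and not the service's actual sampler; the token names and probabilities are made up.

# Toy illustration of nucleus (top_p) sampling -- not the service's actual sampler.
import random

def top_p_filter(token_probs, top_p=0.9):
  # Keep the smallest set of highest-probability tokens whose cumulative mass >= top_p.
  ranked = sorted(token_probs.items(), key=lambda kv: kv[1], reverse=True)
  kept, cumulative = [], 0.0
  for token, prob in ranked:
    kept.append((token, prob))
    cumulative += prob
    if cumulative >= top_p:
      break
  total = sum(p for _, p in kept)
  return {token: prob / total for token, prob in kept}  # renormalize over the kept set

probs = {"cat": 0.5, "dog": 0.3, "fish": 0.15, "axolotl": 0.05}
filtered = top_p_filter(probs, top_p=0.9)  # keeps "cat", "dog", "fish" (0.5 + 0.3 + 0.15 >= 0.9)
print(random.choices(list(filtered), weights=list(filtered.values()))[0])

Lower top_p values restrict sampling to fewer, more likely tokens; higher values allow more diverse output.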

Output

The model returns the generated text for the submitted prompt.

Notes

MiniCPM3-4B

Model source

Introduction

MiniCPM3-4B is the 3rd generation of the MiniCPM series. Its overall performance surpasses Phi-3.5-mini-Instruct and GPT-3.5-Turbo-0125, and it is comparable with many recent 7B~9B models.

Compared to MiniCPM 1.0 and MiniCPM 2.0, MiniCPM3-4B has a more powerful and versatile skill set that enables more general usage. MiniCPM3-4B supports function calling and a code interpreter. Please refer to Advanced Features for usage guidelines.

MiniCPM3-4B has a 32k context window. Equipped with LLMxMapReduce, MiniCPM3-4B can theoretically handle unlimited context without requiring a huge amount of memory.

Usage

Set your PAT

Find your PAT in your security settings. Export it as an environment variable, then import and initialize the API client (see the sketch after the export commands below):

  • Linux/Mac: export CLARIFAI_PAT="your personal access token"

  • Windows (PowerShell): $env:CLARIFAI_PAT="your personal access token"
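
As a quick check that the token is visible to your script, the short sketch below (assuming the clarifai package is installed) reads the exported variable and initializes the client. The SDK picks up CLARIFAI_PAT from the environment; the client can also be given the token directly.

# Quick sanity check (optional): confirm the PAT is set before calling the API.
import os

from clarifai.client import Model

pat = os.environ.get("CLARIFAI_PAT")
if not pat:
  raise RuntimeError("CLARIFAI_PAT is not set; export it as shown above.")

# Initialize a client for this model by its Clarifai URL; the SDK reads the PAT from the environment.
model = Model(url="https://clarifai.com/openbmb/miniCPM/models/MiniCPM3-4B")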

Running the API with Clarifai's Python SDK

# Please run `pip install -U clarifai` before running this script

from clarifai.client import Model
from clarifai_grpc.grpc.api.status import status_code_pb2


# Point the client at the MiniCPM3-4B model hosted on Clarifai.
model = Model(url="https://clarifai.com/openbmb/miniCPM/models/MiniCPM3-4B")
prompt = "What's the future of AI?"

# generate_by_bytes streams the response back in chunks as they are produced.
results = model.generate_by_bytes(prompt.encode("utf-8"), "text")

for res in results:
  if res.status.code == status_code_pb2.SUCCESS:
    # Each successful chunk carries the next piece of generated text; print it as it arrives.
    print(res.outputs[0].data.text.raw, end='', flush=True)
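
To control the generation settings described under Input (max tokens, temperature, top_p), they can be passed as inference parameters. The snippet below is a sketch of the non-streaming call; the parameter names used here are assumptions based on the settings listed above, so check the model's page if a value appears to be ignored.

# Sketch: single (non-streaming) prediction with inference parameters.
# The parameter names (max_tokens, temperature, top_p) are assumed to match the
# settings listed under Input; verify them against the model's documentation.
from clarifai.client import Model

model = Model(url="https://clarifai.com/openbmb/miniCPM/models/MiniCPM3-4B")
prompt = "What's the future of AI?"

inference_params = dict(max_tokens=512, temperature=0.7, top_p=0.8)

prediction = model.predict_by_bytes(
  prompt.encode("utf-8"),
  input_type="text",
  inference_params=inference_params,
)
print(prediction.outputs[0].data.text.raw)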

Evaluation Results

| Benchmark | Qwen2-7B-Instruct | GLM-4-9B-Chat | Gemma2-9B-it | Llama3.1-8B-Instruct | GPT-3.5-Turbo-0125 | Phi-3.5-mini-Instruct (3.8B) | MiniCPM3-4B |
|---|---|---|---|---|---|---|---|
| English | | | | | | | |
| MMLU | 70.5 | 72.4 | 72.6 | 69.4 | 69.2 | 68.4 | 67.2 |
| BBH | 64.9 | 76.3 | 65.2 | 67.8 | 70.3 | 68.6 | 70.2 |
| MT-Bench | 8.41 | 8.35 | 7.88 | 8.28 | 8.17 | 8.60 | 8.41 |
| IFEVAL (Prompt Strict-Acc.) | 51.0 | 64.5 | 71.9 | 71.5 | 58.8 | 49.4 | 68.4 |
| Chinese | | | | | | | |
| CMMLU | 80.9 | 71.5 | 59.5 | 55.8 | 54.5 | 46.9 | 73.3 |
| CEVAL | 77.2 | 75.6 | 56.7 | 55.2 | 52.8 | 46.1 | 73.6 |
| AlignBench v1.1 | 7.10 | 6.61 | 7.10 | 5.68 | 5.82 | 5.73 | 6.74 |
| FollowBench-zh (SSR) | 63.0 | 56.4 | 57.0 | 50.6 | 64.6 | 58.1 | 66.8 |
| Math | | | | | | | |
| MATH | 49.6 | 50.6 | 46.0 | 51.9 | 41.8 | 46.4 | 46.6 |
| GSM8K | 82.3 | 79.6 | 79.7 | 84.5 | 76.4 | 82.7 | 81.1 |
| MathBench | 63.4 | 59.4 | 45.8 | 54.3 | 48.9 | 54.9 | 65.6 |
| Code | | | | | | | |
| HumanEval+ | 70.1 | 67.1 | 61.6 | 62.8 | 66.5 | 68.9 | 68.3 |
| MBPP+ | 57.1 | 62.2 | 64.3 | 55.3 | 71.4 | 55.8 | 63.2 |
| LiveCodeBench v3 | 22.2 | 20.2 | 19.2 | 20.4 | 24.0 | 19.6 | 22.6 |
| Function Call | | | | | | | |
| BFCL v2 | 71.6 | 70.1 | 19.2 | 73.3 | 75.4 | 48.4 | 76.0 |
| Overall | | | | | | | |
| Average | 65.3 | 65.0 | 57.9 | 60.8 | 61.0 | 57.2 | 66.3 |

Statement

  • As a language model, MiniCPM3-4B generates content by learning from a vast amount of text.
  • However, it does not possess the ability to comprehend or express personal opinions or value judgments.
  • Any content generated by MiniCPM3-4B does not represent the viewpoints or positions of the model developers.
  • Therefore, when using content generated by MiniCPM3-4B, users should take full responsibility for evaluating and verifying it on their own.

LICENSE

  • This repository is released under the Apache-2.0 License.
  • The usage of MiniCPM3-4B model weights must strictly follow MiniCPM Model License.md.
  • The models and weights of MiniCPM3-4B are completely free for academic research. After filling out a questionnaire for registration, they are also available for free commercial use.

Citation

@article{hu2024minicpm,
  title={MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies},
  author={Hu, Shengding and Tu, Yuge and Han, Xu and He, Chaoqun and Cui, Ganqu and Long, Xiang and Zheng, Zhi and Fang, Yewei and Huang, Yuxiang and Zhao, Weilin and others},
  journal={arXiv preprint arXiv:2404.06395},
  year={2024}
}
  • Model Type ID: Text To Text
  • Input Type: text
  • Output Type: text
  • Last Updated: Mar 14, 2025
  • Privacy: PUBLIC