lmdeploy-minicpm2d6-visual-classifier

Notes

MiniCPM-V 2.6

MiniCPM-V 2.6 is the latest and most capable model in the MiniCPM-V series. It is built on SigLip-400M and Qwen2-7B, for a total of 8 billion parameters. Compared with previous versions, it improves benchmark performance and adds new functionality such as multi-image and video understanding.

Key Features

  • High Performance
    Averages 65.2 on OpenCompass (an aggregate over 8 popular benchmarks), outperforming models such as GPT-4V and Gemini 1.5 Pro on single-image understanding.

  • Multi-Image Reasoning
    Supports conversations and reasoning across multiple images with strong results on Mantis-Eval, BLINK, and more.

  • Video Understanding
    Processes video inputs for dialogue and detailed spatial-temporal captions, exceeding leading models on Video-MME.

  • Advanced OCR
    Handles high-resolution and varied aspect ratio images with best-in-class results on OCRBench and robust multilingual support.

  • Efficient Tokenization
    Encodes a 1.8M-pixel image into 640 visual tokens, about 75% fewer than most comparable models, enabling faster inference and lower resource use.

  • Flexible Deployment
    Available as int4 and GGUF quantized models, runs on CPU through llama.cpp, and supports vLLM serving, fine-tuning, and a quick WebUI setup for easy integration.
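
As a rough back-of-the-envelope check on the Efficient Tokenization claim, the sketch below assumes the figures given in the upstream MiniCPM-V 2.6 model card: a 1344x1344 image (about 1.8M pixels) encoded into 640 visual tokens. The helper names are made up for illustration and are not part of any library.

```python
# Assumed figures from the upstream MiniCPM-V 2.6 model card.
MAX_PIXELS = 1344 * 1344   # about 1.8M pixels
VISUAL_TOKENS = 640        # visual tokens reported for an image of that size
REDUCTION = 0.75           # the "75% fewer tokens" claim


def implied_baseline_tokens(tokens: int, reduction: float) -> int:
    """Token count a comparison model would need if `tokens` is `reduction` fewer."""
    return round(tokens / (1.0 - reduction))


def token_density(pixels: int, tokens: int) -> float:
    """Average number of pixels represented by one visual token."""
    return pixels / tokens


baseline = implied_baseline_tokens(VISUAL_TOKENS, REDUCTION)
density = token_density(MAX_PIXELS, VISUAL_TOKENS)

print(f"implied baseline: {baseline} tokens")
print(f"token density: {density:.0f} pixels per token")
```

Under these assumptions, a comparison model would spend about four times as many tokens on the same image, which is where the speed and memory savings come from.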

Evaluation

Figure: single-image results on OpenCompass, MME, MMVet, OCRBench, MMMU, MathVista, MMB, AI2D, TextVQA, DocVQA, HallusionBench, and Object HalBench.

Model Usage

The MiniCPM-V 2.6 model provided in this application is specifically configured for open-world object classification.

Running the API with Clarifai's Python SDK

You can run the MiniCPM-V 2.6 model API using Clarifai's Python SDK. First, find your personal access token (PAT) in your account's security settings and export it as an environment variable. Then import and initialize the API client.

export CLARIFAI_PAT={your personal access token}
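
A missing or empty token usually surfaces only as an authentication error at request time. A minimal sketch of an early check is shown here; `get_pat` is a hypothetical helper written for this page, not a function in the Clarifai SDK.

```python
import os


def get_pat(env=None):
    """Return the token stored in CLARIFAI_PAT, or raise a readable error."""
    # Allow a plain dict to be passed in for testing; default to the real environment.
    env = os.environ if env is None else env
    pat = env.get("CLARIFAI_PAT", "").strip()
    if not pat:
        raise RuntimeError(
            "CLARIFAI_PAT is not set; export it before calling the model."
        )
    return pat
```

Calling `get_pat()` once at startup and passing the result to the client keeps the token out of the source code.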

Predict via Image URL

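import os

from clarifai.client.model import Model

# Sample image hosted by Clarifai
image_url = "https://s3.amazonaws.com/samples.clarifai.com/people_walking2.jpeg"

model_url = "https://clarifai.com/clarifai/open-world/models/lmdeploy-minicpm2d6-visual-classifier"

# Read the PAT exported above rather than passing an empty string
model = Model(url=model_url, pat=os.environ["CLARIFAI_PAT"])
model_prediction = model.predict_by_url(image_url)

# Each concept carries a predicted label (name) and a confidence score (value)
print(model_prediction.outputs[0].data.concepts)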
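
The printed concepts are a raw list of protobuf messages, which is hard to scan. The sketch below shows one way to reduce them to the highest-scoring labels. It assumes each concept exposes `name` and `value` attributes, as Clarifai concept messages do. `top_concepts` is a hypothetical helper, and the sample labels and scores are invented stand-ins, not real model output.

```python
from types import SimpleNamespace


def top_concepts(concepts, k=5, min_value=0.0):
    """Return up to k (name, value) pairs with value >= min_value, highest first."""
    kept = [(c.name, float(c.value)) for c in concepts if c.value >= min_value]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:k]


# Stand-in objects shaped like response concepts (made-up values for illustration).
sample = [
    SimpleNamespace(name="street", value=0.81),
    SimpleNamespace(name="person", value=0.97),
    SimpleNamespace(name="bicycle", value=0.12),
]

for name, value in top_concepts(sample, k=2, min_value=0.5):
    print(f"{name}: {value:.2f}")
```

With a real response, pass `model_prediction.outputs[0].data.concepts` in place of `sample`.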

  • Model Type ID
    Visual Classifier
  • Last Updated
    May 02, 2025
  • Privacy
    PUBLIC