lmdeploy-minicpm2d6-visual-classifier
Notes
MiniCPM-V 2.6
MiniCPM-V 2.6 is the latest and most capable model in the MiniCPM-V series. It is built on SigLip-400M and Qwen2-7B, with 8 billion parameters in total, and delivers major improvements over previous versions in both performance and functionality, especially for vision-language tasks.
Key Features
High Performance
Achieves top scores on benchmarks like OpenCompass, outperforming models such as GPT-4V and Gemini 1.5 Pro for single-image understanding.
Multi-Image Reasoning
Supports conversations and reasoning across multiple images, with strong results on Mantis-Eval, BLINK, and more.
Video Understanding
Processes video inputs for dialogue and detailed spatial-temporal captions, exceeding leading models on Video-MME.
Advanced OCR
Handles high-resolution images of varied aspect ratios, with best-in-class results on OCRBench and robust multilingual support.
Efficient Tokenization
Uses 75% fewer tokens than most models for high-resolution images, enabling faster inference and lower resource use.
Flexible Deployment
Available as int4 and GGUF quantized models, runs with llama.cpp, and supports vLLM, fine-tuning, and a quick WebUI setup for easy integration.
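To make the token-efficiency figure concrete, the sketch below works through the arithmetic of a 75% reduction. The baseline token count is a hypothetical number chosen for illustration, not a measurement of any particular model, and the helper name is likewise made up for this example.

```python
def tokens_after_reduction(baseline_tokens: int, reduction: float) -> int:
    """Return the visual token count left after a fractional reduction."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return round(baseline_tokens * (1.0 - reduction))


# Hypothetical baseline: a model that spends 2560 visual tokens on one image.
baseline = 2560
reduced = tokens_after_reduction(baseline, 0.75)
print(f"{baseline} tokens -> {reduced} tokens")
```

Fewer visual tokens per image means a shorter sequence for the language model to process, which is where the faster inference and lower memory use come from.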
Evaluation
Single-image results are reported on OpenCompass, MME, MMVet, OCRBench, MMMU, MathVista, MMB, AI2D, TextVQA, DocVQA, HallusionBench, and Object HalBench.
Model Usage
The MiniCPM-V 2.6 model provided in this application is specifically configured for open-world object classification.
Running the API with Clarifai's Python SDK
You can call the MiniCPM-V 2.6 model through Clarifai’s Python SDK. First, find your personal access token (PAT) in your account's security settings and export it as an environment variable. Then import the SDK's Model client and initialize it with the model URL.
export CLARIFAI_PAT={your personal access token}
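Before creating the client, it can help to confirm that CLARIFAI_PAT is actually set, so a missing token fails with a clear message rather than an authentication error later on. The helper below is a minimal sketch using only the standard library; the name read_pat is hypothetical and not part of the Clarifai SDK.

```python
import os


def read_pat(env=None, var_name="CLARIFAI_PAT"):
    """Return the token stored in var_name, or raise if it is missing or empty."""
    # Fall back to the real process environment when no mapping is supplied.
    env = os.environ if env is None else env
    pat = env.get(var_name, "").strip()
    if not pat:
        raise RuntimeError(f"{var_name} is not set; export it before running the client")
    return pat
```

Calling read_pat() with no arguments reads from the process environment; passing a dictionary makes the check easy to exercise without touching real credentials.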
Predict via Image URL
import os
from clarifai.client.model import Model

image_url = "https://s3.amazonaws.com/samples.clarifai.com/people_walking2.jpeg"
model_url = "https://clarifai.com/clarifai/open-world/models/lmdeploy-minicpm2d6-visual-classifier"
# Pass the PAT exported above instead of an empty string
model_prediction = Model(url=model_url, pat=os.environ["CLARIFAI_PAT"]).predict_by_url(image_url)
# The predicted concepts are attached to the first output
print(model_prediction.outputs[0].data.concepts)
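The printed concepts list can be long, so a small post-processing step is often useful. The sketch below assumes each concept exposes name and value attributes, as Clarifai's Concept message does. The helper top_concepts, the threshold, and the sample labels and scores are all hypothetical, with stand-in objects used in place of a live response.

```python
from types import SimpleNamespace


def top_concepts(concepts, min_value=0.5, limit=5):
    """Return (name, value) pairs at or above min_value, highest value first."""
    kept = [(c.name, c.value) for c in concepts if c.value >= min_value]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:limit]


# Stand-in objects with made-up labels and scores, in place of a live response.
sample = [
    SimpleNamespace(name="person", value=0.97),
    SimpleNamespace(name="street", value=0.81),
    SimpleNamespace(name="bicycle", value=0.32),
]
for name, value in top_concepts(sample):
    print(f"{name}: {value:.2f}")
```

With a real response, the same helper would be called as top_concepts(model_prediction.outputs[0].data.concepts), with min_value tuned to how permissive the classification should be.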
- Model Type ID: Visual Classifier
- Last Updated: May 02, 2025
- Privacy: PUBLIC