multilingual-multimodal-clip-embed model by clarifai

multilingual-multimodal-clip-embed

CLIP-based multilingual multimodal embedding model.

Notes

Multilingual Multimodal CLIP

From https://huggingface.co/sentence-transformers/clip-ViT-B-32-multilingual-v1

This is a multilingual version of the OpenAI CLIP-ViT-B32 model. It maps text (in 50+ languages) and images into a common dense vector space in which an image and its matching text lie close together. The model can be used for image search (users search through a large collection of images) and for multilingual zero-shot image classification (image labels are defined as text).
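The zero-shot classification idea described above can be sketched in a few lines: embed the image, embed each candidate label, and pick the label whose vector is closest to the image vector. The snippet below is only an illustration under stated assumptions. The 3-dimensional vectors are made-up placeholders rather than real model outputs, and `cosine_similarity` and `rank_labels` are hypothetical helper names, not part of any Clarifai or sentence-transformers API. Cosine similarity is assumed as the closeness measure.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_labels(image_embedding, label_embeddings):
    """Return (label, score) pairs sorted from most to least similar."""
    scores = {
        label: cosine_similarity(image_embedding, embedding)
        for label, embedding in label_embeddings.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Placeholder vectors (assumption): in practice these would be the
# embeddings the model returns for one image and for each text label.
image_embedding = [0.9, 0.1, 0.2]
label_embeddings = {
    "a dog": [0.8, 0.2, 0.1],     # English label
    "un gato": [0.1, 0.9, 0.3],   # Spanish label
    "ein Auto": [0.2, 0.1, 0.9],  # German label
}

ranking = rank_labels(image_embedding, label_embeddings)
best_label = ranking[0][0]
print(best_label)
```

Image search is the same computation in the other direction: embed the text query once, then rank the stored image embeddings by their similarity to it.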

This model is used in the Universal-Multilingual workflow. Set that workflow as your app's base workflow to enable vector search across different languages.

ID:
Model Type ID: Multimodal Embedder
Input Type: any
Output Type: embeddings
Description: CLIP-based multilingual multimodal embedding model.
Last Updated: Oct 25, 2024
Privacy: PUBLIC
License: