
multilingual-multimodal-clip-embed

CLIP-based multilingual multimodal embedding model.

Notes

Multilingual Multimodal CLIP

From https://huggingface.co/sentence-transformers/clip-ViT-B-32-multilingual-v1

This is a multilingual version of the OpenAI CLIP-ViT-B32 model. It maps text (in 50+ languages) and images to a common dense vector space so that images and their matching texts lie close together. The model can be used for image search (querying a large collection of images with text) and for multilingual zero-shot image classification (where the image labels are defined as text). A minimal usage sketch follows below.
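The sketch below shows how the upstream sentence-transformers checkpoint can be paired with the original CLIP ViT-B/32 image encoder, following the linked model card. The image filename is hypothetical; any PIL-readable image works.

```python
from sentence_transformers import SentenceTransformer, util
from PIL import Image

# Text encoder: multilingual model aligned to CLIP's embedding space
text_model = SentenceTransformer("sentence-transformers/clip-ViT-B-32-multilingual-v1")
# Image encoder: the original CLIP ViT-B/32 model
image_model = SentenceTransformer("clip-ViT-B-32")

# Embed a local image (hypothetical file name)
img_emb = image_model.encode(Image.open("two_dogs_in_snow.jpg"))

# Embed the same query in several languages
text_emb = text_model.encode([
    "Two dogs playing in the snow",    # English
    "Zwei Hunde spielen im Schnee",    # German
    "Dos perros jugando en la nieve",  # Spanish
])

# Cosine similarity between the image and each text query
print(util.cos_sim(img_emb, text_emb))
```

Because both encoders project into the same vector space, the multilingual queries should all score similarly against the matching image.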

This model is used in the Universal-Multilingual workflow. Use that as the app's base workflow to enable vector search across different languages.

  • Name
    Multilingual Multimodal Clip Embedder
  • Model Type ID
    Multimodal Embedder
  • Description
    CLIP-based multilingual multimodal embedding model.
  • Last Updated
    Oct 25, 2024
  • Privacy
    PUBLIC