salesforce/blip/multimodal-embedder-blip-2
BLIP-2 is a scalable multimodal pre-training method that enables any Large Language Model (LLM) to ingest and understand images, unlocking zero-shot image-to-text generation. BLIP-2 is fast, efficient, and accurate.
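Since the Toolkit field below lists HuggingFace, the model can presumably be loaded with the transformers library. The following is a minimal sketch of zero-shot image captioning with a public BLIP-2 checkpoint; the checkpoint name (Salesforce/blip2-opt-2.7b) and the example image URL are assumptions for illustration, not values taken from this card.

```python
# Sketch: zero-shot image-to-text generation with BLIP-2 via Hugging Face
# transformers. The checkpoint and image URL are assumptions; substitute the
# checkpoint backing this model as needed. A GPU is recommended for speed.
import requests
import torch
from PIL import Image
from transformers import Blip2Processor, Blip2ForConditionalGeneration

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

processor = Blip2Processor.from_pretrained("Salesforce/blip2-opt-2.7b")
model = Blip2ForConditionalGeneration.from_pretrained(
    "Salesforce/blip2-opt-2.7b", torch_dtype=dtype
).to(device)

# Example image (COCO validation set).
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# No text prompt: the model generates a caption for the image directly.
inputs = processor(images=image, return_tensors="pt").to(device, dtype)
generated_ids = model.generate(**inputs, max_new_tokens=30)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0].strip())
```

Because the card's Type is multimodal-embedder, the same family of checkpoints can also be loaded through transformers' Blip2Model to extract image or Q-Former features as embeddings instead of generating captions.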
  • ID
    multimodal-embedder-blip-2
  • Type
    multimodal-embedder
  • Updated
    Oct 17, 2024
  • Input
  • Output
  • Config
  • Privacy
    Public
  • License
  • Toolkit
    HuggingFace
  • Use Case