
multimodal-embedder-blip-2

BLIP-2 is a scalable multimodal pre-training method that enables any Large Language Model (LLM) to ingest and understand images, unlocking zero-shot image-to-text generation. Because it trains only a lightweight bridging module between a frozen image encoder and a frozen LLM, BLIP-2 is fast to train, compute-efficient, and accurate.
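As a concrete illustration of zero-shot image-to-text generation, the sketch below uses the Hugging Face `transformers` implementation of BLIP-2 with the `Salesforce/blip2-opt-2.7b` checkpoint and a sample COCO image URL; these are assumptions for the example, and this community model may expose a different interface.

```python
# Minimal zero-shot image-captioning sketch with BLIP-2 via Hugging Face
# transformers. Assumes the Salesforce/blip2-opt-2.7b checkpoint and a
# sample COCO image; adapt to your own checkpoint and inputs.
import requests
from PIL import Image
from transformers import Blip2Processor, Blip2ForConditionalGeneration

processor = Blip2Processor.from_pretrained("Salesforce/blip2-opt-2.7b")
model = Blip2ForConditionalGeneration.from_pretrained("Salesforce/blip2-opt-2.7b")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# With no text prompt, the model generates a free-form caption for the image.
inputs = processor(images=image, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=30)
caption = processor.decode(out[0], skip_special_tokens=True)
print(caption)
```

Passing a text prompt alongside the image (e.g. `processor(images=image, text="Question: what is in the photo? Answer:", return_tensors="pt")`) switches the same model from captioning to prompted visual question answering.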