- cohere-multilingual-text-to-embeddings
Notes
Note
This model has special pricing applied.
Introduction
The embedding model generates embeddings from text. Embeddings can be used to estimate the semantic similarity between two sentences, to choose the sentence most likely to follow another, or to categorize user feedback.
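As a minimal sketch of the semantic-similarity use case, the snippet below compares embedding vectors with cosine similarity. The vectors are made-up three-dimensional toy values, not real model output, and `cosine_similarity` is an illustrative helper rather than part of any Cohere or platform API.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for sentence embeddings (assumed values).
greeting_en = [0.9, 0.1, 0.0]
greeting_fr = [0.8, 0.2, 0.1]
weather_en = [0.0, 0.2, 0.9]

sim_same = cosine_similarity(greeting_en, greeting_fr)
sim_diff = cosine_similarity(greeting_en, weather_en)

# Sentences with the same meaning should score higher than unrelated ones.
print(sim_same > sim_diff)
```

With real embeddings the same comparison applies; only the vector length changes.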
Model
embed-multilingual-v2.0
The multilingual representation model embeddings have 768 dimensions.
Multilingual Sentence Embeddings
Most word and sentence embeddings depend on the language the model was trained on. If you tried to embed the French sentence “Bonjour, comment ça va?” (meaning: hello, how are you?) with a model trained only on English, it would struggle to place it close to the English sentence “Hello, how are you?”. To unify many languages in a single embedding space and understand text in all of them, Cohere has trained a large multilingual model that has shown strong results in more than 100 languages. Here is a small example with the following sentences in English, French, and Spanish.
- The bear lives in the woods
- El oso vive en el bosque
- L’ours vit dans la forêt
- The world cup is in Qatar
- El mundial es en Qatar
- La coupe du monde est au Qatar
- An apple is a fruit
- Una manzana es una fruta
- Une pomme est un fruit
- El cielo es azul
- The sky is blue
- Le ciel est bleu
When these sentences are embedded, the model groups them by topic: the sentences about the bear, soccer, an apple, and the sky each land close to one another, even though they are written in different languages.
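The grouping described here can be checked by finding, for each sentence, the other sentence whose embedding is most similar. The sketch below is only an illustration: `nearest_neighbor` is a hypothetical helper, and the two-dimensional vectors are placeholder values chosen by hand, not output from embed-multilingual-v2.0 (which returns 768-dimensional vectors).

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero, equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_neighbor(index, embeddings):
    """Return the index of the vector most similar to embeddings[index]."""
    best_index, best_score = None, -2.0  # cosine similarity is always >= -1
    for j, other in enumerate(embeddings):
        if j == index:
            continue  # skip the sentence itself
        score = cosine_similarity(embeddings[index], other)
        if score > best_score:
            best_index, best_score = j, score
    return best_index


sentences = [
    "The bear lives in the woods",
    "El oso vive en el bosque",
    "The sky is blue",
    "Le ciel est bleu",
]

# Placeholder vectors (assumed values), one per sentence, in the same order.
toy_embeddings = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.1, 0.9],
    [0.2, 0.8],
]

pairs = {
    sentence: sentences[nearest_neighbor(i, toy_embeddings)]
    for i, sentence in enumerate(sentences)
}
for sentence, match in pairs.items():
    print(f"{sentence!r} -> {match!r}")
```

With these placeholder values each sentence pairs with its translation; whether real embeddings behave the same way for a given sentence set has to be verified against actual model output.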
Supported Languages
The multilingual embedding model supports over 100 languages, including Chinese, Spanish, and French. For the full list of languages it supports, see Cohere's documentation.
Disclaimer
Please be advised that this model utilizes wrapped Artificial Intelligence (AI) provided by Cohere (the "Vendor"). These AI models may collect, process, and store data as part of their operations. By using our website and accessing these AI models, you hereby consent to the data practices of the Vendor. We do not have control over the data collection, processing, and storage practices of the Vendor. Therefore, we cannot be held responsible or liable for any data handling practices, data loss, or breaches that may occur. It is your responsibility to review the privacy policies and terms of service of the Vendor to understand their data practices. You can access the Vendor's privacy policy and terms of service at https://cohere.city/privacy-policy/.
We disclaim all liability with respect to the actions or omissions of the Vendor, and we encourage you to exercise caution and to ensure that you are comfortable with these practices before utilizing the AI models hosted on our site.
- Name: cohere-multilingual-text-to-embeddings
- Model Type ID: Text Embedder
- Description: Cohere's Multilingual embedding model empowers language generation in LLM, capturing semantic relationships for coherent and contextually relevant text. It enhances generative power, improving the quality of generated content.
- Last Updated: Oct 17, 2024
- Privacy: PUBLIC