October 18, 2023

Run Zephyr 7B with an API

Zephyr-7B-alpha is a new open-source language model from Hugging Face and is based on Mistral-7B. This model surpasses Llama 2 70B Chat on the MT Bench.

You can now try out zephyr-7B-alpha in the Clarifai Platform and access it through the API.

Table of Contents

  1. Introduction
  2. Prompt Template
  3. Running Zephyr 7B with Python
  4. Running Zephyr 7B with JavaScript
  5. Best Use Cases
  6. Limitations

Introduction

Zephyr-7B-alpha is the first model in the Zephyr series and is based on Mistral-7B. It has been fine-tuned using Direct Preference Optimization (DPO) on a mix of publicly available and synthetic datasets. Notably, the in-built alignment of these datasets was removed to boost performance on the MT Bench and make the model more helpful.

Prompt Template

To interact effectively with the Zephyr-7B-alpha model, use the prompt template below.

<|system|>
{system_prompt}</s>
<|user|>
{prompt}</s>
<|assistant|>

Here's an example of how to use the prompt template:

<|system|>
You are a friendly chatbot who always responds in the style of a pirate.</s>
<|user|>
What's the easiest way to peel all the cloves from a head of garlic?</s>
<|assistant|>
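To make the template concrete, here is a minimal Python sketch that fills in the system and user slots. The build_prompt helper is hypothetical (not part of any SDK); only the tag names come from the template above.

```python
# Hypothetical helper that fills the Zephyr chat template shown above.
# The model generates its reply after the trailing <|assistant|> tag.
def build_prompt(system_prompt: str, prompt: str) -> str:
    return (
        f"<|system|>\n{system_prompt}</s>\n"
        f"<|user|>\n{prompt}</s>\n"
        f"<|assistant|>\n"
    )

example = build_prompt(
    "You are a friendly chatbot who always responds in the style of a pirate.",
    "What's the easiest way to peel all the cloves from a head of garlic?",
)
print(example)
```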

Running Zephyr 7B with Python

You can run Zephyr 7B with our Python SDK with just a few lines of code.

To get started, sign up for Clarifai here and get your Personal Access Token (PAT) under the Security section in Settings.

Export your PAT as an environment variable:

export CLARIFAI_PAT={your personal access token}
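The SDK picks this environment variable up automatically. If you want your script to fail fast when it is missing, a quick check like this (a small sketch; the exact handling is up to you) works:

```python
import os

# Read the PAT exported above; warn early if it was never set.
pat = os.environ.get("CLARIFAI_PAT", "")
if not pat:
    print("CLARIFAI_PAT is not set; export it before running the examples below")
```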

Check out the code below:

from clarifai.client.model import Model
system_message = "You are a friendly chatbot who always responds in the style of a pirate."
prompt = "Write a tweet on future of AI"
prompt_template = (
    f"<|system|>\n"
    f"{system_message}</s>\n"
    f"<|user|>\n"
    f"{prompt}</s>\n"
    f"<|assistant|>"
)
# Model Predict
model_url = "https://clarifai.com/huggingface-research/zephyr/models/zephyr-7B-alpha"
model_prediction = Model(model_url).predict_by_bytes(prompt_template.encode(), "text")
print(model_prediction.outputs[0].data.text.raw)
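The SDK's predict methods also accept an inference_params dict for controlling generation. The sketch below assumes that parameter; the supported keys (temperature, max_tokens here) depend on the deployed model version, and the API call is guarded so it only runs when a PAT is configured:

```python
import os

prompt_template = (
    "<|system|>\nYou are a concise assistant.</s>\n"
    "<|user|>\nWrite a tweet on the future of AI</s>\n"
    "<|assistant|>"
)

# Assumed inference parameters; check the model listing for supported keys.
params = {"temperature": 0.7, "max_tokens": 256}

if os.environ.get("CLARIFAI_PAT"):  # only call the API when a PAT is configured
    from clarifai.client.model import Model  # requires `pip install clarifai`

    model_url = "https://clarifai.com/huggingface-research/zephyr/models/zephyr-7B-alpha"
    prediction = Model(model_url).predict_by_bytes(
        prompt_template.encode(), "text", inference_params=params
    )
    print(prediction.outputs[0].data.text.raw)
```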

Running Zephyr 7B with JavaScript

////////////////////////////////////////////////////////////////////////////////////////////////////
// In this section, we set the user authentication, user and app ID, model details, and the URL
// of the text we want as an input. Change these strings to run your own example.
///////////////////////////////////////////////////////////////////////////////////////////////////
// Your PAT (Personal Access Token) can be found in the portal under Authentication
const PAT = '';
// Specify the correct user_id/app_id pairings
// Since you're making inferences outside your app's scope
const USER_ID = 'huggingface-research';
const APP_ID = 'zephyr';
// Change these to whatever model and text URL you want to use
const MODEL_ID = 'zephyr-7B-alpha';
const MODEL_VERSION_ID = '806443491944476a9ac1678e8a9c4a9b';
const RAW_TEXT = 'I love your product very much';
// To use a hosted text file, assign the url variable
// const TEXT_FILE_URL = 'https://samples.clarifai.com/negative_sentence_12.txt';
// Or, to use a local text file, assign the url variable
// const TEXT_FILE_BYTES = 'YOUR_TEXT_FILE_BYTES_HERE';
///////////////////////////////////////////////////////////////////////////////////
// YOU DO NOT NEED TO CHANGE ANYTHING BELOW THIS LINE TO RUN THIS EXAMPLE
///////////////////////////////////////////////////////////////////////////////////
const raw = JSON.stringify({
  "user_app_id": {
    "user_id": USER_ID,
    "app_id": APP_ID
  },
  "inputs": [
    {
      "data": {
        "text": {
          "raw": RAW_TEXT
          // url: TEXT_URL, allow_duplicate_url: true
          // raw: fileBytes
        }
      }
    }
  ]
});

const requestOptions = {
  method: 'POST',
  headers: {
    'Accept': 'application/json',
    'Authorization': 'Key ' + PAT
  },
  body: raw
};

// NOTE: MODEL_VERSION_ID is optional; you can also call prediction with the MODEL_ID only:
// https://api.clarifai.com/v2/models/{YOUR_MODEL_ID}/outputs
// This defaults to the latest version_id.
fetch("https://api.clarifai.com/v2/models/" + MODEL_ID + "/versions/" + MODEL_VERSION_ID + "/outputs", requestOptions)
  .then((response) => response.json())
  .then((data) => {
    if (data.status.code !== 10000) console.log(data.status);
    else console.log(data['outputs'][0]['data']['text']['raw']);
  })
  .catch(error => console.log('error', error));
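The same REST call can be made from any HTTP client. As an illustration, here is the equivalent request using only Python's standard library, guarded so the network call only fires when CLARIFAI_PAT is set (omitting the version segment defaults to the latest model version, as noted above):

```python
import json
import os
import urllib.request

USER_ID = "huggingface-research"
APP_ID = "zephyr"
MODEL_ID = "zephyr-7B-alpha"
RAW_TEXT = "I love your product very much"

# Same JSON body as the JavaScript example above.
payload = json.dumps({
    "user_app_id": {"user_id": USER_ID, "app_id": APP_ID},
    "inputs": [{"data": {"text": {"raw": RAW_TEXT}}}],
}).encode()

pat = os.environ.get("CLARIFAI_PAT")
if pat:  # only hit the API when a PAT is configured
    req = urllib.request.Request(
        f"https://api.clarifai.com/v2/models/{MODEL_ID}/outputs",
        data=payload,
        headers={"Accept": "application/json", "Authorization": "Key " + pat},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    print(data["outputs"][0]["data"]["text"]["raw"])
```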

You can also run the Zephyr model using other Clarifai client libraries, such as Java, cURL, Node.js, and PHP, here.

Model Demo in the Clarifai Platform:

Try out the zephyr-7B-alpha model here: https://clarifai.com/huggingface-research/zephyr/models/zephyr-7B-alpha

Best Use Cases

Chat applications

The Zephyr-7B-alpha model is well-suited for chat applications. It was initially fine-tuned on a version of the UltraChat dataset, which includes synthetic dialogues generated by ChatGPT. It was further refined with Hugging Face TRL's DPOTrainer on the openbmb/UltraFeedback dataset, which contains prompts and model completions ranked by GPT-4. This training process makes the model perform particularly well in chat applications.

Limitations

Zephyr-7B-alpha has not been aligned to human preferences using techniques like Reinforcement Learning from Human Feedback (RLHF). As a result, it can produce outputs that may be problematic, especially when intentionally prompted.

Keep up to speed with AI

  • Follow us on X (Twitter) to get the latest on LLMs

  • Join us in our Discord to talk LLMs