gemma-2b-it

Gemma-2b-it, part of Google DeepMind's lightweight Gemma family of LLMs, delivers strong performance across diverse tasks by leveraging a training dataset of 6 trillion tokens, with a focus on safety and responsible output.

Input

Prompt: the text prompt to send to the model.

Inference parameters:

  • max_tokens: The maximum number of tokens to generate. Shorter token lengths will provide faster performance.
  • temperature: A decimal number that determines the degree of randomness in the response.
  • top_k: Limits the model's predictions to the top k most probable tokens at each step of generation.
  • top_p: An alternative to sampling with temperature, where the model considers only the tokens comprising the top_p probability mass.

Output

The model returns the generated text for the submitted prompt.

Notes

  • Datatype: 4-bit bitsandbytes NF4 quantised model
  • Framework: flash_attention_2
  • Model Source
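
For reference, here is a minimal sketch of what this configuration can look like when loading the model yourself with Hugging Face transformers and bitsandbytes (a self-hosted assumption for illustration; none of this is required to use the Clarifai API):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit NF4 quantization via bitsandbytes, matching the datatype noted above.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b-it")
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2b-it",
    quantization_config=quant_config,
    attn_implementation="flash_attention_2",  # assumes the flash-attn package is installed
    device_map="auto",
)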

Introduction

The Gemma-2b-it Language Model is part of the Gemma family of lightweight, state-of-the-art open models developed by Google DeepMind and other teams across Google. Inspired by the Gemini models, Gemma models aim to provide best-in-class performance while adhering to rigorous standards for safety and responsible AI development.

Gemma-2b-it Model

Gemma-2b-it is the instruction-tuned version of the 2B-parameter Gemma model, one of the two sizes released alongside Gemma-7b. Both sizes come with pre-trained and instruction-tuned variants, offering state-of-the-art performance relative to their sizes. The Gemma models share technical and infrastructure components with Gemini, enabling them to achieve high performance directly on developer laptops or desktop computers.

Run Gemma with an API

Running the API with Clarifai's Python SDK

You can run the Gemma-2b-it Model API using Clarifai's Python SDK.

Export your PAT as an environment variable. Then, import and initialize the API Client.

Find your PAT in your security settings.

export CLARIFAI_PAT={your personal access token}
from clarifai.client.model import Model
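# The SDK reads your PAT from the CLARIFAI_PAT environment variable exported above.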

prompt = '''<start_of_turn>user
What will be the future of AI?<end_of_turn>
<start_of_turn>model'''

inference_params = dict(temperature=0.7, max_tokens=200, top_k=50, top_p=0.95)

# Model Predict
model_prediction = Model("https://clarifai.com/gcp/generate/models/gemma-2b-it").predict_by_bytes(
    prompt.encode(), input_type="text", inference_params=inference_params
)

print(model_prediction.outputs[0].data.text.raw)
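
The response contains one entry in outputs per input; model_prediction.outputs[0].data.text.raw holds the generated text for the single prompt submitted here.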

You can also run the Gemma-2b-it API using other Clarifai client libraries, such as Java, cURL, NodeJS, and PHP.

Aliases: Gemma-2b, gemma, gemma 2b

Prompt Format

This format must be strictly respected; otherwise, the model will generate sub-optimal outputs.

The template used to build a prompt for the Instruct model is defined as follows:

<start_of_turn>user
{prompt}<end_of_turn>
<start_of_turn>model

Each turn is preceded by a <start_of_turn> delimiter followed by the role of the entity (either user, for content supplied by the user, or model, for LLM responses). Turns finish with the <end_of_turn> token.
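
As an illustration, here is a small helper (the function name and structure are hypothetical, not part of any SDK) that assembles a multi-turn conversation in this format, leaving an open model turn for generation:

def build_gemma_prompt(turns):
    """Build a Gemma chat prompt from (role, text) pairs, where role is "user" or "model"."""
    parts = [f"<start_of_turn>{role}\n{text}<end_of_turn>" for role, text in turns]
    # End with an open model turn so the model generates the next reply.
    parts.append("<start_of_turn>model")
    return "\n".join(parts)

prompt = build_gemma_prompt([("user", "What will be the future of AI?")])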

Use Cases

The Gemma-2b-it model is versatile and capable of handling a wide array of tasks, including but not limited to:

  • Natural language understanding and generation
  • Code generation and interpretation
  • Addressing mathematical queries
  • Text summarization and translation

Its ability to process diverse types of information makes it a valuable tool for developers, researchers, and creators across various disciplines.

Evaluation

The Gemma-2b-it model has been rigorously evaluated against key benchmarks to ensure its performance meets the high standards expected of modern AI models.

For specific metrics and performance details, please refer to the Gemma technical report.

Dataset

The Gemma models were trained on a rich dataset comprising 6 trillion tokens from a variety of sources, including:

  • Web Documents: Ensuring exposure to a wide range of linguistic styles, topics, and vocabularies.
  • Code: To facilitate learning of programming languages' syntax and patterns.
  • Mathematics: For enhancing the model's capabilities in logical reasoning and symbolic representation.

The dataset's diversity is crucial for the model's ability to tackle a broad spectrum of tasks and text formats effectively.

Advantages

  • Accessibility: Can run on standard developer hardware.
  • Versatility: Handles a broad range of tasks efficiently.
  • State-of-the-Art Performance: Achieves top-notch results for its size.

Limitations

  • Language Bias: Primarily trained on English-language content, which may limit its effectiveness with other languages.
  • Data Sensitivity: Despite rigorous filtering, the potential for unforeseen biases or sensitivities in the data cannot be entirely eliminated.

Model Details

  • Model Type ID: Text To Text
  • Input Type: text
  • Output Type: text
  • Last Updated: Oct 17, 2024
  • Privacy: PUBLIC