GitHub Repository Chat

A Streamlit application that allows you to chat with any GitHub repository using Retrieval-Augmented Generation (RAG).

Features

Repository Loading: Load any public GitHub repository by its URL.
Intelligent Chat: Ask questions about code, structure, or functionality of the repository.
Persistent Context: The application keeps your conversation history in session state, so follow-up questions build on earlier ones.
Single Instance Architecture: A single Embedchain instance is reused for the whole session, which improves performance and reduces resource usage.

How It Works

This application uses:

Embedchain: For creating a knowledge base from GitHub repositories
Clarifai: For AI model hosting and embeddings
Streamlit: For the web interface
DeepSeek-R1-Distill-Qwen-32B: As the large language model backend
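
One way these components might be wired together is through an Embedchain configuration dictionary. The sketch below is an assumption, not the app's actual configuration: the `clarifai` provider names and the `chroma` vector store settings follow Embedchain's documented config layout, while the function name `build_config`, the collection name, and both model URLs are hypothetical placeholders that would need to be replaced with real Clarifai model URLs.

```python
def build_config(llm_model_url, embedder_model_url, db_dir):
    """Build an Embedchain-style config dict (hypothetical helper).

    The model URLs are supplied by the caller; nothing here is taken
    from the app's real settings.
    """
    return {
        # LLM served through Clarifai (assumed provider name: "clarifai")
        "llm": {"provider": "clarifai", "config": {"model": llm_model_url}},
        # Embedding model, also served through Clarifai
        "embedder": {"provider": "clarifai", "config": {"model": embedder_model_url}},
        # Local vector store; "repo_chat" is an invented collection name
        "vectordb": {
            "provider": "chroma",
            "config": {"collection_name": "repo_chat", "dir": db_dir, "allow_reset": True},
        },
    }


# Placeholder URLs: substitute the real model pages from Clarifai.
config = build_config(
    "https://clarifai.com/<user>/<app>/models/<llm-model>",
    "https://clarifai.com/<user>/<app>/models/<embedding-model>",
    "/tmp/repo_chat_db",
)
# With Embedchain installed, a dict like this would typically be passed to
# App.from_config(config=config).
```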

The app uses RAG to:

Index the repository contents
Retrieve relevant information when you ask questions
Generate informative responses based on the repository's code and documentation
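
The three steps above can be illustrated with a toy example. This is not the app's implementation: it swaps the Clarifai embeddings and Embedchain's vector store for bag-of-words vectors and cosine similarity, and the generation step only assembles the prompt that would be sent to the LLM. All function names and the sample chunks are invented for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(chunks):
    """Step 1: index each chunk by storing it next to its vector."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(index, question, k=2):
    """Step 2: return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, context_chunks):
    """Step 3: assemble the prompt an LLM would answer from."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Made-up repository chunks, for illustration only.
chunks = [
    "app.py starts the Streamlit interface and the chat loop",
    "requirements.txt lists streamlit, embedchain and clarifai",
    "LICENSE contains the license text",
]
question = "Which file starts the Streamlit interface?"
index = build_index(chunks)
top = retrieve(index, question, k=1)
prompt = build_prompt(question, top)
```

A real embedding model captures meaning rather than shared words, so it can match a question to code that uses different vocabulary; the overall index, retrieve, generate flow stays the same.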

Getting Started

Prerequisites

Python 3.8+
Streamlit
Embedchain
Clarifai Python package
GitHub Personal Access Token (for repository access)
Clarifai Personal Access Token (PAT)

Installation

# Install required packages
pip install streamlit embedchain clarifai

# Run the application
streamlit run app.py

Environment Setup

The application requires the following environment variables:

CLARIFAI_PAT: Your Clarifai Personal Access Token
GITHUB_TOKEN: Your GitHub Personal Access Token with repository read access
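
A start-up check along these lines could make a missing token fail fast with a readable message. This is a minimal sketch under the assumption that both variables are read from the process environment; the helper names `missing_env_vars` and `require_env` are hypothetical and not part of the app.

```python
import os

# The two variables named in this section.
REQUIRED_VARS = ("CLARIFAI_PAT", "GITHUB_TOKEN")


def missing_env_vars(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


def require_env(env=None):
    """Raise a clear error if any required variable is missing."""
    missing = missing_env_vars(env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Calling `require_env()` before any repository is loaded would surface the problem at launch instead of partway through indexing.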

Usage

1. Enter a GitHub repository URL in the sidebar (e.g., https://github.com/owner/repo)
2. Click "Load Repository"
3. Wait for the repository to be processed (this may take a few moments)
4. Start asking questions about the repository in the chat interface
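
Before loading, the URL entered in the sidebar has to be reduced to an owner and a repository name. The sketch below shows one way this could be done with the standard library; `parse_repo_url` is a hypothetical helper, and how the app itself validates the URL is not documented here.

```python
from urllib.parse import urlparse


def parse_repo_url(url):
    """Return (owner, repo) for a GitHub repository URL, or raise ValueError."""
    parsed = urlparse(url.strip())
    host = parsed.netloc.lower()
    if parsed.scheme not in ("http", "https") or host not in ("github.com", "www.github.com"):
        raise ValueError(f"Not a GitHub URL: {url!r}")
    # Drop empty segments so trailing slashes are tolerated.
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"Expected https://github.com/owner/repo, got: {url!r}")
    owner, repo = parts[0], parts[1]
    # Accept clone-style URLs ending in ".git".
    if repo.endswith(".git"):
        repo = repo[:-4]
    if not repo:
        raise ValueError(f"Missing repository name in: {url!r}")
    return owner, repo
```

For example, `parse_repo_url("https://github.com/owner/repo")` returns `("owner", "repo")`, and a URL on another host raises `ValueError`.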

Example Questions

"What are the main components of this repository?"
"How does the authentication system work?"
"Explain the data flow in this application"
"What dependencies does this project have?"
"Show me how error handling is implemented"

Technical Details

The application uses a single Embedchain App instance throughout its lifecycle to improve performance and reduce resource usage. The vector database is stored in a temporary directory, and session state preserves your conversation history.
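
The create-once, reuse-afterwards pattern described above can be sketched without any framework. The code below is an illustration under assumptions, not the app's source: a plain `dict` stands in for the session state (in a Streamlit app, `st.session_state` supports the same `in` check and item assignment), and the helper names and keys are hypothetical.

```python
import tempfile
from typing import Any, Callable, MutableMapping


def get_or_create(state: MutableMapping[str, Any], key: str, factory: Callable[[], Any]) -> Any:
    """Create the object on first use, then return the stored one."""
    if key not in state:
        state[key] = factory()
    return state[key]


def make_db_dir() -> str:
    """Create a temporary directory for the vector database files."""
    return tempfile.mkdtemp(prefix="repo_chat_db_")


def append_message(state: MutableMapping[str, Any], role: str, content: str) -> list:
    """Add one chat turn to the history kept in session state."""
    history = get_or_create(state, "messages", list)
    history.append({"role": role, "content": content})
    return history


state = {}  # stand-in for st.session_state
first = get_or_create(state, "db_dir", make_db_dir)
second = get_or_create(state, "db_dir", make_db_dir)  # reuses the first directory
append_message(state, "user", "What does app.py do?")
append_message(state, "assistant", "It starts the chat interface.")
```

Because the factory runs only when the key is absent, expensive objects such as the Embedchain App would be built once per session rather than on every rerun of the script.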

Limitations

Very large repositories take longer to load, because every file must be indexed before you can chat
The app works best with well-documented repositories; when documentation is limited, responses may be less detailed

Module ID: chat_with_github
Latest Version ID: 0_0_1
Description: Chat with Github
Last Updated: Mar 06, 2025
Repository: github.com/Sumanth077/chat_with_github
Commit: f56d15f