GitHub Repository Chat
A Streamlit application that allows you to chat with any GitHub repository using Retrieval-Augmented Generation (RAG).
Features
- Repository Loading: Load any public GitHub repository by its URL.
- Intelligent Chat: Ask questions about code, structure, or functionality of the repository.
- Persistent Context: The application maintains context throughout your conversation.
- Single Instance Architecture: Reuses one Embedchain instance throughout the app's lifecycle to reduce resource usage.
How It Works
This application uses:
- Embedchain: For creating a knowledge base from GitHub repositories
- Clarifai: For AI model hosting and embeddings
- Streamlit: For the web interface
- DeepSeek-R1-Distill-Qwen-32B: As the large language model backend
The app uses RAG to:
- Index the repository contents
- Retrieve relevant information when you ask questions
- Generate informative responses based on the repository's code and documentation
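The three steps above can be illustrated with a toy sketch. It is not how Embedchain or Clarifai implement them: it assumes plain word-count vectors in place of real embeddings, stops at building the prompt instead of calling a language model, and uses made-up file contents. All function names here are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(files):
    """Step 1: index each (path, text) pair as a word-count vector."""
    return [(path, text, Counter(tokenize(text))) for path, text in files.items()]


def retrieve(index, question, k=1):
    """Step 2: return the k entries most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(index, key=lambda item: cosine(q, item[2]), reverse=True)
    return [(path, text) for path, text, _ in ranked[:k]]


def build_prompt(question, chunks):
    """Step 3: assemble the prompt that would be sent to the language model."""
    context = "\n".join(f"[{path}] {text}" for path, text in chunks)
    return f"Context:\n{context}\n\nQuestion: {question}"


# Hypothetical repository contents, for illustration only.
files = {
    "auth.py": "def login(user, password): verify the password hash and issue a session token",
    "README.md": "This project exposes a REST API for managing tasks",
}
question = "How does login verify the password?"
index = build_index(files)
top = retrieve(index, question)
prompt = build_prompt(question, top)
```

In the real application the similarity search runs over learned embeddings stored in a vector database, so it can match questions to code that shares meaning rather than exact words.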
Getting Started
Prerequisites
- Python 3.8+
- Streamlit
- Embedchain
- GitHub Personal Access Token (for repository access)
- Clarifai API Token
Installation
# Install required packages
pip install streamlit embedchain clarifai
# Run the application
streamlit run app.py
Environment Setup
The application requires the following environment variables:
- CLARIFAI_PAT: Your Clarifai Personal Access Token
- GITHUB_TOKEN: Your GitHub Personal Access Token with repository read access
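A small pre-flight check can catch a missing token before the app starts. The sketch below is an assumption about how such a check might look, not code from the application; `missing_env_vars` is a hypothetical helper.

```python
import os

# The two variables named in this section.
REQUIRED_VARS = ("CLARIFAI_PAT", "GITHUB_TOKEN")


def missing_env_vars(env=None):
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


missing = missing_env_vars()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
```

Passing a plain dictionary instead of `os.environ` makes the helper easy to exercise without touching the real environment.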
Usage
1. Enter a GitHub repository URL in the sidebar (e.g., https://github.com/owner/repo)
2. Click "Load Repository"
3. Wait for the repository to be processed (this may take a few moments)
4. Start asking questions about the repository in the chat interface
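The URL entered in the first step has the form https://github.com/owner/repo. As a minimal sketch, assuming the app validates the URL before loading (the source does not say whether it does), a hypothetical `parse_repo_url` helper could extract the owner and repository name like this:

```python
import re

# Matches https://github.com/<owner>/<repo>, allowing an optional ".git" suffix
# or a trailing slash. Deeper paths such as /tree/main are rejected.
_REPO_URL = re.compile(r"^https://github\.com/([\w.-]+)/([\w.-]+?)(?:\.git)?/?$")


def parse_repo_url(url):
    """Return (owner, repo) for a GitHub repository URL, or None if it does not match."""
    match = _REPO_URL.match(url.strip())
    return match.groups() if match else None


result = parse_repo_url("https://github.com/owner/repo")
```

Rejecting malformed input up front gives the user an immediate error instead of a failure partway through processing.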
Example Questions
- "What are the main components of this repository?"
- "How does the authentication system work?"
- "Explain the data flow in this application"
- "What dependencies does this project have?"
- "Show me how error handling is implemented"
Technical Details
The application uses a single Embedchain App instance throughout its lifecycle to improve performance and reduce resource usage. The vector database is stored in a temporary directory, and session state preserves your conversation history.
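The create-once, reuse-afterwards pattern described here can be sketched as follows. This is an illustration under stated assumptions, not the application's code: `KnowledgeBase` is a hypothetical stand-in for the Embedchain App, and a plain dictionary stands in for Streamlit's dictionary-like `st.session_state`.

```python
import tempfile


class KnowledgeBase:
    """Hypothetical stand-in for the Embedchain App; it only records its storage path."""

    def __init__(self, db_dir):
        self.db_dir = db_dir


def get_knowledge_base(state):
    """Create the knowledge base on first use, then return the stored instance."""
    if "kb" not in state:
        # The vector database lives in a fresh temporary directory.
        state["kb"] = KnowledgeBase(tempfile.mkdtemp(prefix="repo_chat_"))
    return state["kb"]


def record_turn(state, role, content):
    """Append one chat message to the conversation history kept in the state."""
    state.setdefault("messages", []).append({"role": role, "content": content})


state = {}  # in a Streamlit app, st.session_state would play this role
first = get_knowledge_base(state)
second = get_knowledge_base(state)  # same object, no second setup
record_turn(state, "user", "What does this repository do?")
```

Because both the instance and the message list live in the same state object, they survive Streamlit's script reruns for as long as the browser session does.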
Limitations
- Very large repositories may take longer to process
- The app works best with well-documented repositories; responses may be less detailed when documentation is limited
Module Details
- Module ID: chat_with_github
- Latest Version ID: 0_0_1
- Description: Chat with GitHub
- Last Updated: Mar 06, 2025