Author: Zeev Weizmann
This project implements an API service that takes a public GitHub repository URL and returns a structured summary of the project using an LLM (Nebius Token Factory API).
The API extracts key repository information (README, file structure, recent commits) and sends a condensed context to the LLM to generate a human-readable summary.
This high-level repository metadata provides a strong signal for the LLM to produce an informative overview. Inspecting the full contents of source files was intentionally avoided, as it would significantly increase complexity and token usage without proportionally improving summary quality for this task.
Another reason for prioritizing key repository information is the preference for human-written content, which typically provides clearer intent and higher-level semantics than auto-generated artifacts.
Assume Python 3.10+ is installed.
```bash
git clone https://github.com/ZeevWeizmann/github-repo-summarizer.git
cd github-repo-summarizer
python -m venv venv
source venv/bin/activate   # macOS/Linux
# On Windows:
# venv\Scripts\activate
pip install -r requirements.txt
```
macOS/Linux:

```bash
export NEBIUS_API_KEY=your_api_key_here
export OPENAI_BASE_URL=https://api.studio.nebius.ai/v1
export MODEL=meta-llama/Meta-Llama-3.1-8B-Instruct
```

Windows (PowerShell):

```powershell
setx NEBIUS_API_KEY "your_api_key_here"
setx OPENAI_BASE_URL "https://api.studio.nebius.ai/v1"
setx MODEL "meta-llama/Meta-Llama-3.1-8B-Instruct"
```

Note that `setx` only affects new sessions, so open a fresh terminal before starting the server.
Do not hardcode API keys.
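As a minimal sketch of this practice (the helper name `load_config` is illustrative, not part of the project), configuration can be read from the environment at startup:

```python
import os

# Illustrative helper: read settings from environment variables instead of
# hardcoding secrets in source. Variable names match the exports shown above.
def load_config() -> dict:
    api_key = os.environ.get("NEBIUS_API_KEY")
    if not api_key:
        raise RuntimeError("NEBIUS_API_KEY is not set")
    return {
        "api_key": api_key,
        "base_url": os.environ.get("OPENAI_BASE_URL", "https://api.studio.nebius.ai/v1"),
        "model": os.environ.get("MODEL", "meta-llama/Meta-Llama-3.1-8B-Instruct"),
    }
```

Failing fast on a missing key surfaces misconfiguration at startup rather than on the first LLM request.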
```bash
uvicorn main:app --reload
```

The server will start at http://localhost:8000.
```bash
curl -X POST http://localhost:8000/summarize \
  -H "Content-Type: application/json" \
  -d '{"github_url": "https://github.com/psf/requests"}'
```
The API returns a structured JSON response containing a concise project summary, detected technologies, and repository structure overview.
Expected response:

```json
{
  "summary": "Requests is a simple, yet elegant, HTTP library for Python that allows you to send HTTP/1.1 requests extremely easily.",
  "technologies": ["Python", "HTTP", "GitHub"],
  "structure": "The repository contains a Makefile, a README.md, and various directories for documentation, tests, and source code."
}
```
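The endpoint can also be called from Python. A standard-library sketch (the helper names `build_summarize_request` and `summarize` are hypothetical, not part of the project):

```python
import json
import urllib.request

# Hypothetical helper: build a POST request for the /summarize endpoint.
def build_summarize_request(github_url: str,
                            api_base: str = "http://localhost:8000") -> urllib.request.Request:
    payload = json.dumps({"github_url": github_url}).encode("utf-8")
    return urllib.request.Request(
        f"{api_base}/summarize",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def summarize(github_url: str) -> dict:
    # Requires the server started in the previous step to be running.
    with urllib.request.urlopen(build_summarize_request(github_url)) as resp:
        return json.loads(resp.read())
```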
The model used is meta-llama/Meta-Llama-3.1-8B-Instruct via Nebius Token Factory.
The 8B parameter size offers enough representational capacity for accurate high-level summarization while maintaining reasonable cost for an API service.
Repositories can be large, so sending their full contents to the LLM is not feasible. Instead, the service condenses the extracted metadata (README, file structure, recent commits) into a compact context, which keeps the prompt within the model's context window while preserving the most informative signals.
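The condensation step can be sketched as follows; the section limits are illustrative defaults, not the service's actual values:

```python
# Illustrative sketch: cap each metadata section so the combined prompt
# stays within the model's context window. Limits here are assumptions.
def build_context(readme: str, tree: list[str], commits: list[str],
                  max_readme_chars: int = 4000,
                  max_tree_entries: int = 50,
                  max_commits: int = 10) -> str:
    parts = [
        "README:\n" + readme[:max_readme_chars],          # truncate long READMEs
        "FILES:\n" + "\n".join(tree[:max_tree_entries]),  # cap file-tree entries
        "RECENT COMMITS:\n" + "\n".join(commits[:max_commits]),
    ]
    return "\n\n".join(parts)
```

Truncating per section, rather than truncating the final prompt, guarantees that every signal (README, files, commits) survives in some form.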
The API handles error scenarios gracefully: all errors are returned as structured JSON responses in the following format:
```json
{
  "status": "error",
  "message": "Description of what went wrong"
}
```
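One way to produce this envelope is to map internal exceptions to it in a single place. A sketch, where the exception classes and messages are illustrative rather than the project's actual ones:

```python
# Hypothetical exception types for the two main failure modes.
class RepoNotFoundError(Exception):
    pass

class UpstreamError(Exception):
    pass

# Map any exception to the JSON error envelope shown above.
def to_error_payload(exc: Exception) -> dict:
    if isinstance(exc, RepoNotFoundError):
        message = "Repository not found or not public"
    elif isinstance(exc, UpstreamError):
        message = "Upstream API request failed"
    else:
        message = "Internal server error"
    return {"status": "error", "message": message}
```

Centralizing the mapping keeps the response shape consistent across all endpoints.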
In cases where partial GitHub metadata cannot be retrieved (e.g., commits or file tree), the service degrades gracefully by returning empty sections instead of failing entirely.
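This degradation can be sketched by wrapping each metadata fetcher so a failure yields an empty section instead of aborting the whole request (the helper names below are illustrative):

```python
from typing import Callable, TypeVar

T = TypeVar("T")

# Run a fetcher; on any failure, fall back to the given empty value.
def fetch_or_empty(fetch: Callable[[], T], empty: T) -> T:
    try:
        return fetch()
    except Exception:
        return empty

# Gather all repository metadata, tolerating per-section failures.
def gather_metadata(get_readme, get_tree, get_commits) -> dict:
    return {
        "readme": fetch_or_empty(get_readme, ""),
        "tree": fetch_or_empty(get_tree, []),
        "commits": fetch_or_empty(get_commits, []),
    }
```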
- main.py – FastAPI application entry point and /summarize endpoint
- models.py – Pydantic request schema for validating input
- github_service.py – GitHub API interaction and repository filtering logic
- llm_service.py – LLM integration via Nebius Token Factory (OpenAI-compatible API)
- requirements.txt – Project dependencies
- .gitignore – Excludes virtual environments, cache files, and sensitive data

GitHub Repository:
https://github.com/ZeevWeizmann/github-repo-summarizer
Project Page (GitHub Pages):
https://zeevweizmann.github.io/github-repo-summarizer/