RepoScope

GitHub Repo Analyzer

AI-Powered GitHub Repository Intelligence & Architecture Visualizer

Overview

RepoScope is a production-deployed developer tool that analyzes GitHub repositories and generates structured architectural insights, API mappings, database schemas, dependency graphs, and AI-generated documentation. It helps contributors get oriented in unfamiliar codebases quickly.

Problem

Understanding a new repository requires manually exploring:

  1. Routes and controllers
  2. Frontend–backend connections
  3. Database models and ORM usage
  4. Environment variables
  5. External integrations
  6. Contribution opportunities

This process is slow and error-prone, especially in large projects.

Solution

I built an AI-assisted repository analysis engine that:

  1. Fetches source code using the GitHub API
  2. Extracts structural insights from file trees and key files
  3. Uses structured LLM output with strict JSON schema enforcement
  4. Visualizes architecture using ReactFlow
  5. Generates README and Mermaid diagrams automatically
  6. Suggests contribution opportunities based on code analysis
  7. Caches analyzed repositories per user
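To illustrate the step of extracting structure from a file tree, here is a minimal sketch that sorts repository paths into a few categories before any file contents are read. The category names, the regular expressions, and the `categorizeFiles` function are all hypothetical and chosen for illustration; they are not taken from RepoScope's code.

```typescript
// Hypothetical heuristics: the categories and patterns below are assumptions
// for illustration, not RepoScope's actual rules.
type FileCategory = "routes" | "models" | "config" | "env";

const CATEGORY_PATTERNS: Record<FileCategory, RegExp[]> = {
  routes: [/(^|\/)(routes?|controllers?)\//i, /\.routes?\.(ts|js)$/i],
  models: [/(^|\/)(models?|schema)\//i, /(^|\/)schema\.(ts|js|prisma)$/i],
  config: [/(^|\/)package\.json$/, /(^|\/)tsconfig\.json$/],
  env: [/(^|\/)\.env(\.example|\.sample)?$/],
};

// Directories that are skipped entirely (vendored code and build output).
const IGNORED = /(^|\/)(node_modules|dist|build|\.git)\//;

function categorizeFiles(paths: string[]): Record<FileCategory, string[]> {
  const result: Record<FileCategory, string[]> = {
    routes: [],
    models: [],
    config: [],
    env: [],
  };
  const categories = Object.keys(CATEGORY_PATTERNS) as FileCategory[];
  for (const path of paths) {
    if (IGNORED.test(path)) continue;
    // A path is assigned to the first category whose pattern matches.
    for (const category of categories) {
      if (CATEGORY_PATTERNS[category].some((re) => re.test(path))) {
        result[category].push(path);
        break;
      }
    }
  }
  return result;
}
```

Only the files selected this way would then need to be fetched and passed to the model, which keeps the prompt smaller than sending the whole repository.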

System Architecture

[Diagram: System Architecture]

Core Features

  • Automatic API route and endpoint mapping from source code
  • Database schema extraction with relationship visualization
  • Frontend-backend connection mapping and data flow analysis
  • Dependency graph generation showing project structure
  • AI-generated README and technical documentation
  • Interactive architecture visualization using ReactFlow
  • Mermaid diagram generation for architecture and ER diagrams
  • Contribution opportunity suggestions based on code analysis
  • Environment variable detection and documentation
  • External integration discovery (APIs, services, libraries)
  • Repository caching per user for instant re-access
  • Export diagrams to PNG, SVG, PDF, and Mermaid formats
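As a rough sketch of how Mermaid output can be produced from analysis results, the function below turns a list of connections (for example component to endpoint to table) into Mermaid flowchart text. The `Edge` shape and the `toMermaidFlowchart` name are assumptions made for this example and do not reflect RepoScope's internal data model.

```typescript
// Hypothetical edge shape: not RepoScope's actual analysis output.
interface Edge {
  from: string;
  to: string;
  label?: string;
}

// Mermaid uses entity codes such as #quot; for quotes inside quoted text.
function escapeLabel(text: string): string {
  return text.replace(/"/g, "#quot;");
}

function toMermaidFlowchart(edges: Edge[]): string {
  const ids = new Map<string, string>();
  // Assign each distinct node name a stable short id (n0, n1, ...).
  const idFor = (name: string): string => {
    let id = ids.get(name);
    if (id === undefined) {
      id = "n" + ids.size;
      ids.set(name, id);
    }
    return id;
  };

  const edgeLines: string[] = [];
  for (const edge of edges) {
    const a = idFor(edge.from);
    const b = idFor(edge.to);
    const arrow = edge.label ? `-->|"${escapeLabel(edge.label)}"|` : "-->";
    edgeLines.push(`  ${a} ${arrow} ${b}`);
  }

  const nodeLines: string[] = [];
  ids.forEach((id, name) => {
    nodeLines.push(`  ${id}["${escapeLabel(name)}"]`);
  });

  return ["flowchart LR"].concat(nodeLines, edgeLines).join("\n");
}
```

Using generated ids with quoted labels means node names that contain spaces or slashes, such as `GET /api/users`, do not have to be valid Mermaid identifiers.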

Key Engineering Decisions

  • Enforced application/json response MIME type to ensure deterministic parsing.
  • Designed a strict JSON schema contract for LLM output.
  • Implemented sanitization for malformed JSON (trailing commas, markdown fences).
  • Built token-limit detection and automatic retry with reduced context.
  • Implemented partial JSON recovery for truncated responses.
  • Decoupled repository analysis logic from visualization.
  • Cached repository analyses in PostgreSQL to avoid redundant processing.
  • Added export support (PNG, SVG, PDF, Mermaid) for sharing architecture diagrams.
  • Integrated an AI-driven contribution suggestion engine.
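The sanitization and partial-recovery decisions above can be sketched as follows. This is one possible approach under stated assumptions, not the project's actual implementation: the function names are hypothetical, and the recovery strategy (cutting back to the last complete element and closing any open brackets) is one technique among several.

```typescript
// Hypothetical helpers: names and strategy are assumptions for illustration.

/** Strip a surrounding markdown fence and trailing commas from model output. */
function sanitizeJsonText(raw: string): string {
  const text = raw
    .trim()
    .replace(/^`{3}[a-zA-Z]*\s*/, "") // leading fence, optional language tag
    .replace(/\s*`{3}$/, ""); // trailing fence
  // Caveat: not string-aware, so a literal ",}" inside a string value
  // would also be rewritten.
  return text.replace(/,(\s*[}\]])/g, "$1");
}

/** Parse JSON, falling back to the longest recoverable prefix if truncated. */
function recoverTruncatedJson(text: string): unknown {
  const stack: string[] = []; // closers for currently open brackets
  let inString = false;
  let escaped = false;
  let best: string | undefined; // last prefix known to end on a full element
  const closers = (): string => stack.slice().reverse().join("");

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === "{") stack.push("}");
    else if (ch === "[") stack.push("]");
    else if (ch === "}" || ch === "]") {
      stack.pop();
      best = text.slice(0, i + 1) + closers();
    } else if (ch === ",") {
      // Everything before a comma is a complete element or key/value pair.
      best = text.slice(0, i) + closers();
    }
  }

  // Try the full text (closed if needed) first, then the last safe prefix.
  const candidates: string[] = [];
  if (!inString) candidates.push(text + closers());
  if (best !== undefined) candidates.push(best);
  for (const candidate of candidates) {
    try {
      return JSON.parse(candidate);
    } catch {
      // try the next candidate
    }
  }
  return undefined; // nothing recoverable
}

function parseModelJson(raw: string): unknown {
  return recoverTruncatedJson(sanitizeJsonText(raw));
}
```

A recovered result drops the incomplete trailing element, so it would still need to be validated against the schema before use; a caller could also treat a recovery as a signal to retry with reduced context.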

Tech Stack

Frontend

React, ReactFlow, Tailwind CSS, TanStack Query, TypeScript

Backend

Express, Drizzle ORM

Database

PostgreSQL (Neon)

AI

Google Gemini (Structured JSON Output Mode)

Integration

GitHub API (Octokit)

Deployment

Render

Demo

[Video: demo]

Interested in the system design or implementation details? Check out the source code or try the live app.