REPOMIND

Open-source repo-scale coding agent on AMD MI300X.

Ingest a git repository (up to 256K tokens, FP8) on a single GPU and reason across the whole codebase with multi-step tool use.

📦 GitHub: SRKRZ23/repomind · MIT · 🏆 Built for the AMD Developer Hackathon 2026 · 🤗 HF Special Prize candidate · 🛡 Conservative claim discipline applied

Why AMD MI300X (verified 2026-05-05 on real hardware)

  • Qwen3-Coder-Next-FP8 weights = 77.29 GiB in VRAM (verified)
  • KV cache @ FP8 = 94.58 GiB available, enough for 2,065,744 tokens (verified), so a 256K context fits with room to spare
  • Activations + framework overhead → peak 176/191.7 GiB ≈ 92% utilization
  • By VRAM accounting, a single NVIDIA H100 80 GB card cannot hold this workload (~143 GB > 80 GB); the MI300X's 192 GB leaves headroom
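
The figures above can be cross-checked with a few lines of arithmetic. This is a sketch that only re-derives ratios from the numbers already listed; it assumes "256K" means 262,144 tokens, and the derived values are back-of-envelope estimates, not additional measurements. It does not attempt to reconstruct the ~143 GB figure.

```python
GIB = 2**30

# Figures taken from the list above.
weights_gib = 77.29          # Qwen3-Coder-Next-FP8 weights in VRAM
kv_pool_gib = 94.58          # memory available for the FP8 KV cache
kv_pool_tokens = 2_065_744   # tokens that pool can hold
peak_gib = 176.0             # observed peak usage
total_gib = 191.7            # usable VRAM on the card

# Assumption: "256K" is 256 * 1024 tokens.
context_tokens = 256 * 1024

# Peak utilization of the card.
utilization = peak_gib / total_gib

# Implied KV cache cost per token, from pool size and pool capacity.
kv_bytes_per_token = kv_pool_gib * GIB / kv_pool_tokens

# KV cache needed for one full 256K-token context.
context_kv_gib = kv_bytes_per_token * context_tokens / GIB

# Whatever remains of the peak after weights and the KV pool
# is activations plus framework overhead.
overhead_gib = peak_gib - weights_gib - kv_pool_gib

if __name__ == "__main__":
    print(f"utilization:         {utilization:.1%}")
    print(f"KV bytes per token:  {kv_bytes_per_token:,.0f}")
    print(f"KV for 256K context: {context_kv_gib:.2f} GiB")
    print(f"activations+overhead at peak: {overhead_gib:.2f} GiB")
```

On these numbers the KV cache costs roughly 48 KiB per token, so a single 256K-token context uses about 12 GiB of the 94.58 GiB pool.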

Status

Backend right now: 🟡 Mock backend (CPU-basic, demo mode)

Set the Space secrets VLLM_BASE_URL + MODEL_NAME to wire a real MI300X backend.
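
A minimal sketch of how the two secrets could switch the backend. The function name `resolve_backend` and the returned dictionary shape are hypothetical; only the variable names VLLM_BASE_URL and MODEL_NAME come from this page, and the app's actual selection logic may differ.

```python
import os


def resolve_backend(env=None):
    """Pick the real vLLM backend only when both secrets are set.

    Hypothetical helper: falls back to the mock backend otherwise.
    """
    env = os.environ if env is None else env
    base_url = env.get("VLLM_BASE_URL", "").strip()
    model = env.get("MODEL_NAME", "").strip()

    if base_url and model:
        # Strip a trailing slash so later path joins stay predictable.
        return {"kind": "vllm", "base_url": base_url.rstrip("/"), "model": model}

    # One or both secrets missing: stay in demo mode.
    return {"kind": "mock", "base_url": None, "model": None}


if __name__ == "__main__":
    print(resolve_backend())
```

Requiring both values avoids a half-configured state where a URL is set but no model name is sent with requests.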

Paste any GitHub URL or owner/repo shorthand. REPOMIND clones it, parses the source files, and chunks them into priority-ranked sections (README first, then top-level symbols, then nested code, then tests).
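
The input handling and ranking described above can be sketched as follows. Everything here is illustrative: `normalize_repo`, `Chunk`, `classify`, and `rank_chunks` are hypothetical names, the path-based rules for spotting READMEs and tests are assumptions, and REPOMIND's real parser may classify files differently.

```python
import re
from dataclasses import dataclass
from pathlib import PurePosixPath

# Priority tiers in the order given above; lower number = ingested first.
PRIORITY = {"readme": 0, "top_level_symbol": 1, "nested_code": 2, "test": 3}

_SHORTHAND = re.compile(r"^[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+$")


def normalize_repo(ref: str) -> str:
    """Turn 'owner/repo' or a GitHub repo URL into a clone URL."""
    ref = ref.strip().rstrip("/")
    ref = re.sub(r"^https?://github\.com/", "", ref)
    if ref.endswith(".git"):
        ref = ref[: -len(".git")]
    if not _SHORTHAND.match(ref):
        # Deep links such as .../tree/main are rejected in this sketch.
        raise ValueError(f"not a GitHub repo reference: {ref!r}")
    return f"https://github.com/{ref}.git"


@dataclass
class Chunk:
    path: str   # file path inside the repository
    depth: int  # 0 = top-level symbol, >0 = nested inside another symbol
    text: str


def classify(chunk: Chunk) -> str:
    """Assign a chunk to one of the four priority tiers (assumed rules)."""
    p = PurePosixPath(chunk.path)
    if p.name.lower().startswith("readme"):
        return "readme"
    if "tests" in p.parts or p.name.startswith("test_"):
        return "test"
    return "top_level_symbol" if chunk.depth == 0 else "nested_code"


def rank_chunks(chunks):
    """Order chunks by tier; sorted() is stable, so file order is kept within a tier."""
    return sorted(chunks, key=lambda c: PRIORITY[classify(c)])
```

For example, `normalize_repo("pallets/flask")` and `normalize_repo("https://github.com/pallets/flask")` both yield the same clone URL.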


Examples that work on a single MI300X: pallets/flask (408K tokens, trimmed by priority chunking to fit the 256K window) · pytorch/vision (1.3M tokens, trimmed by the chunker to the 180K highest-priority tokens) · this repo, SRKRZ23/repomind (~68K tokens, fits whole).
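
Trimming a large repository to a token budget can be sketched as a greedy fill in priority order. This is an assumed strategy for illustration, not a description of the chunker's actual algorithm; `trim_to_budget` is a hypothetical name and the chunk sizes in the usage example are made up.

```python
def trim_to_budget(chunks, budget):
    """Keep chunks in priority order until the token budget is used up.

    chunks: iterable of (priority, token_count, name); lower priority = more important.
    Returns (kept_names, tokens_used).
    """
    kept, used = [], 0
    # sorted() is stable, so chunks with equal priority keep their input order.
    for priority, tokens, name in sorted(chunks, key=lambda c: c[0]):
        if used + tokens > budget:
            # Skip a chunk that does not fit; a smaller one later still might.
            continue
        kept.append(name)
        used += tokens
    return kept, used


if __name__ == "__main__":
    # Illustrative sizes only, not measured from any real repository.
    demo = [
        (3, 120_000, "tests/"),
        (0, 4_000, "README.md"),
        (1, 90_000, "top-level symbols"),
        (2, 150_000, "nested code"),
    ]
    kept, used = trim_to_budget(demo, budget=256 * 1024)
    print(kept, used)
```

Skipping an oversized chunk instead of stopping at it lets smaller, lower-priority chunks use the remaining budget, at the cost of a less strict priority order.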