
Jiho Kim
MULTILANG
Benchmark & Infra
PhD at Georgia Tech

Patching Vulnerabilities with Coding Agents in 2026
LLM-based patch generation has become a practical approach to fixing software vulnerabilities. Tools like Codex, Claude Code, and Gemini can read code, reason about bugs, and produce patches — often in seconds. But how well do they actually perform, in 2026? To find out, we (Team Atlanta folks at Georgia Tech) tested 10 agent configurations — combining four agent frameworks with five frontier models — on 63 real crashes from the DARPA AIxCC final competition.