Introducing SWE-Atlas. We built SWE-Atlas as the next evolution of SWE-Bench Pro, expanding agent evaluation beyond change accuracy to better reflect the real, interactive workflows that define software development. Results for Codebase QnA, the first eval under SWE-Atlas that
Mar 4, 2026

