From 8fb5a54cd971f2f76465f52571c87b7893dc8c4c Mon Sep 17 00:00:00 2001
From: John Yang
Date: Thu, 27 Jun 2024 14:10:02 -0400
Subject: [PATCH] Add SWE-bench eval upgrade announcement

---
 index.html             | 6 +++---
 template/template.html | 6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/index.html b/index.html
index bd19131..3dbf830 100644
--- a/index.html
+++ b/index.html
@@ -94,9 +94,9 @@

ICLR 2024

- 🎉 Check out our latest work,
- SWE-agent,
- which achieves a 12.47% resolve rate on SWE-bench!
+ 🔥 Evaluating on SWE-bench just became a lot more reliable!
+ SWE-bench evaluation now uses Docker for easier, containerized, reproducible evaluation.
+ [Report]
diff --git a/template/template.html b/template/template.html
index 0ffdee6..ba94286 100644
--- a/template/template.html
+++ b/template/template.html
@@ -94,9 +94,9 @@

ICLR 2024

- 🎉 Check out our latest work,
- SWE-agent,
- which achieves a 12.47% resolve rate on SWE-bench!
+ 🔥 Evaluating on SWE-bench just became a lot more reliable!
+ SWE-bench evaluation now uses Docker for easier, containerized, reproducible evaluation.
+ [Report]