diff --git a/img/swe-bench_lite_results.png b/img/swe-bench_lite_results.png
new file mode 100644
index 0000000..9abe172
Binary files /dev/null and b/img/swe-bench_lite_results.png differ
diff --git a/lite.html b/lite.html
index ed0bc18..e909b97 100644
--- a/lite.html
+++ b/lite.html
@@ -69,7 +69,12 @@

A Canonical Subset for Efficient Evaluation of Language Models as Software E

- SWE-bench lite distribution across repositories. Compare to the full SWE-bench in Figure 3 of the SWE-bench paper.
+ SWE-bench Lite distribution across repositories. Compare to the full SWE-bench in Figure 3 of the SWE-bench paper.
+

+
+
+

+ SWE-bench Lite performance for our baselines. Compare to the full SWE-bench baseline performance in Table 5 of the SWE-bench paper.