<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8" />
<title>SWE-bench</title>
<meta
name="description"
content="SWE-bench Multimodal: Do AI Systems Generalize to Visual Software Domains?"
/>
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1" />
<meta
name="viewport"
content="width=device-width, initial-scale=1, maximum-scale=1, user-scalable=no"
/>
<meta property="og:image" content="/logo.png" />
<link rel="shortcut icon" href="favicon_mm.ico" type="image/x-icon" />
<link rel="icon" href="favicon_mm.ico" type="image/x-icon" />
<link rel="stylesheet" href="css/normalize.css" />
<link rel="stylesheet" href="css/fonts.css" />
<link rel="stylesheet" href="css/styles.css" />
<link
rel="stylesheet"
href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.2.0/css/all.min.css"
integrity="..."
crossorigin="anonymous"
/>
<!-- Google tag (gtag.js) -->
<script
async
src="https://www.googletagmanager.com/gtag/js?id=G-H9XFCMDPNS"
></script>
<script>
window.dataLayer = window.dataLayer || [];
function gtag() {
dataLayer.push(arguments);
}
gtag("js", new Date());
gtag("config", "G-H9XFCMDPNS");
</script>
</head>
<body>
<div style="padding-bottom: 50px">
<section style="
background:
radial-gradient(circle at 20% 30%, rgba(255, 99, 71, 1), transparent 40%),
radial-gradient(circle at 80% 40%, rgba(0, 255, 127, 1), transparent 40%),
radial-gradient(circle at 20% 70%, rgba(70, 130, 180, 1), transparent 40%),
radial-gradient(circle at 70% 90%, rgba(255, 165, 0, 0.8), transparent 40%),
radial-gradient(circle at 50% 50%, rgba(138, 43, 226, 0.8), transparent 40%);
background-blend-mode: screen;
">
<div
class="content-wrapper title-wrapper"
style="flex-direction: column"
>
<div
style="
display: flex;
flex-direction: row;
align-items: center;
padding-bottom: 15px;
"
>
<h1 style="font-size: 60px; padding-top: 0.4em; color: #2F4F4F;">SWE-bench Multimodal</h1>
<img
src="img/swellamamm.png"
style="height: 100px; padding-top: 0em; padding-left: 0.5em"
/>
</div>
<h3 style="color: #2F4F4F;">Do AI Systems Generalize to Visual Software Domains?</h3>
<h3 style="font-size: 20px; padding-top: 1.2em">ICLR 2025</h3>
<p style="text-align: center;margin-top:1em; color: #2F4F4F;">
John Yang*, Carlos E. Jimenez*,<br />
Alex L. Zhang, Kilian Lieret, Joyce Yang, Xindi Wu, Ori Press, Niklas Muennighoff,<br />
Gabriel Synnaeve, Karthik Narasimhan, Diyi Yang, Sida I. Wang, Ofir Press
</p>
<div class="content-wrapper" style="margin-top: 2em;">
<a href="index.html">
<button
class="outline multimodal"
style="flex-direction: row; display: flex; justify-content: center; align-items: center;">
<img
src="img/swellama.png"
style="height: 1.3em; margin-right: 0.4em; margin-bottom: 0.3em;" />
Home
</button>
</a>
<a href="https://arxiv.org/abs/2410.03859">
<button class="outline multimodal">
<i class="fa fa-paperclip"></i> Paper
</button>
</a>
<a href="https://huggingface.co/datasets/princeton-nlp/SWE-bench_Multimodal">
<button class="outline multimodal">
<i class="fa fa-database"></i> Dataset
</button>
</a>
</div>
</div>
</section>
<section class="main-container">
<div class="content-wrapper">
<div class="content-box">
<h2 class="text-title">About</h2>
<img src="img/teaser_mm.png" style="width:80%;margin:auto;display:block;"/>
<p class="text-content">
SWE-bench Multimodal is a dataset for evaluating AI systems on visual software engineering tasks.
It contains 619 task instances from 17 popular JavaScript repositories, each featuring images that are crucial to solving the problem.
The dataset covers a range of challenges, including UI glitches, map rendering problems, and data visualization bugs.
SWE-bench Multimodal challenges AI systems to tackle the diverse, multimodal nature of modern software development.
</p>
<h3 class="text-title" style="margin-bottom:0.5em">Citation</h3>
<pre id="citation" style="border-color: #2F4F4F;"><code>@misc{yang2024swebenchmultimodalaisystems,
title={SWE-bench Multimodal: Do AI Systems Generalize to Visual Software Domains?},
author={John Yang and Carlos E. Jimenez and Alex L. Zhang and Kilian Lieret and Joyce Yang and Xindi Wu and Ori Press and Niklas Muennighoff and Gabriel Synnaeve and Karthik R. Narasimhan and Diyi Yang and Sida I. Wang and Ofir Press},
year={2024},
eprint={2410.03859},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2410.03859},
}</code></pre>
<p class="text-content" style="margin-bottom: 0;">
<b>Disclaimer:</b> SWE-bench Multimodal is for research purposes only. Models
trained and evaluated on SWE-bench Multimodal can produce unexpected results.
We are not responsible for any damages caused by the use of
SWE-bench Multimodal, including, but not limited to, any loss of profit, data,
or use of data.
</p>
<p style="line-height: 1.6667em;">
<b>Usage:</b> If you would like to use this website template for your
own leaderboard, please send Carlos &amp; John an email requesting permission.
If granted, please make sure to acknowledge the SWE-bench team and link to
this leaderboard on the home page of the website.
</p>
<p class="text-content">
Correspondence to: <a href="mailto:carlosej@princeton.edu">carlosej@princeton.edu</a>,
<a href="mailto:johnby@stanford.edu">johnby@stanford.edu</a>
</p>
<div class="content-wrapper" style="display: flex; flex-direction: row; margin-top: 0.5em;">
<a href="https://www.cs.stanford.edu/">
<img src="img/stanford_seal.png" style="height: 3em;padding-top:0.5em;padding-right: 1em" />
</a>
<a href="https://princeton-nlp.github.io/">
<img src="img/princeton_seal.svg" style="height: 3em;padding-top:0.5em;padding-right: 1em" />
</a>
<a href="https://pli.princeton.edu/">
<img src="img/pli_logo.svg" style="height: 3em;padding-top:0.5em;padding-right: 1em" />
</a>
<a href="https://www.cs.cornell.edu/">
<img src="img/cornell_seal.png" style="height: 3em;padding-top:0.5em;padding-right: 1em" />
</a>
<a href="https://uni-tuebingen.de/en/faculties/faculty-of-science/departments/computer-science/department/">
<img src="img/tubingen_logo.png" style="height: 2.5em;padding-top:0.5em;padding-right: 1em" />
</a>
</div>
</div>
</div>
</section>
</div>
</body>
</html>