<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>CodeMind Leaderboard</title>
<script src="https://cdnjs.cloudflare.com/ajax/libs/PapaParse/5.3.0/papaparse.min.js"></script>
<script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/echarts.min.js"></script>
<!-- favicon.svg -->
<!-- <link rel="icon" href="data:image/svg+xml,<svg xmlns=%22http://www.w3.org/2000/svg%22 viewBox=%220 0 100 100%22><text y=%22.9em%22 font-size=%2290%22>π</text></svg>"> -->
<!-- <link rel="icon" href="/favicon.svg" /> -->
<link rel="icon" href="https://images.emojiterra.com/google/noto-emoji/unicode-15/color/1024px/1f9d1-1f4bb.png">
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/[email protected]/dist/css/bootstrap.min.css">
<style>
#content {
width: 85%;
}
th, td {
text-align: center;
}
.citation-box {
border-left: 5px solid #333;
padding: 10px;
margin: 10px;
background-color: #f9f9f9;
font-style: italic;
}
#notes {
font-size: 1em;
}
#notes h3 {
margin-top: 1em;
font-size: 2em;
text-align: center;
}
#notes li {
font-size: 1.2em;
font-weight: 300;
margin: 1em;
}
</style>
</head>
<body>
<div id="content" class="container-fluid d-flex flex-column align-items-center gap-3">
<h1 class="text-nowrap mt-5">CodeMind Leaderboard</h1>
<h3 class="fw-light text-nowrap"><small id="warning">A Framework to Challenge Large Language Models for Code Reasoning.<br></small>
</h3>
<div class="d-flex flex-row justify-content-center gap-3">
<a href="https://github.com/Intelligent-CAT-Lab/CodeMind"><img
src="https://img.shields.io/badge/github-%23121011.svg?style=for-the-badge&logo=github&logoColor=white"
alt="github" class="img-fluid"></a>
<a href="https://arxiv.org/abs/2402.09664"><img
src="https://img.shields.io/badge/arXiv-2402.09664-b31b1b.svg"
alt="paper"
class="img-fluid"></a>
</div>
<img src="CodeMind-Logo.jpg" alt="CodeMind logo" width="20%">
<div class="container-fluid d-flex flex-column align-items-center">
<h3 class="fw-light justify-content-center">Please cite our paper if you use this leaderboard in your work.</h3>
</div>
<code class="citation-box">
<strong><span style="color: rgb(0, 0, 130);">@article</span><span style="color: black;">{</span><span style="color: darkred;">liu2024codemind</span><span style="color: black;">,</span><br>
<span style="color: teal;">title</span><span style="color: black;"> = </span>{CodeMind: A Framework to Challenge Large Language Models for Code Reasoning}<span style="color: black;">,</span><br>
<span style="color: teal;">author</span><span style="color: black;"> = </span>{Liu, Changshu and Zhang, Shizhuo Dylan and Jabbarvand, Reyhaneh}<span style="color: black;">,</span><br>
<span style="color: teal;">journal</span><span style="color: black;"> = </span>{arXiv preprint arXiv:2402.09664}<span style="color: black;">,</span><br>
<span style="color: teal;">year</span><span style="color: black;"> = </span>{2024}<span style="color: black;">,</span><br>
<span style="color: black;">}</span></strong>
</code>
<div class="d-flex flex-row justify-content-center gap-3">
<label for="dataset">Dataset:</label>
<select id="dataset" onchange="filterData()">
<option value="CodeNet">CodeNet</option>
<option value="MBPP">MBPP</option>
<option value="HumanEval">HumanEval</option>
<option value="Avatar">Avatar</option>
<option value="CruxEval">CruxEval</option>
</select>
<!-- Task Dropdown -->
<label for="task">Task:</label>
<select id="task" onchange="handleDropDown(); filterData()">
<option value="ier">IER</option>
<option value="der">DER</option>
<option value="sr">SR</option>
</select>
<label for="gtask" style="display: none;">Generation Task:</label>
<select id="gtask" style="display: none;" onchange="handleDropDown(); filterData()">
<!-- <option value="">Select a Code Generation Task</option> -->
<option value="synthesis">Code Synthesis</option>
<option value="translate">Code Translation</option>
</select>
<label for="source" style="display: none;">Source PL:</label>
<select id="source" style="display: none;" onchange="filterData()">
<!-- <option value="">Select the source PL</option> -->
<option value="Java">Java</option>
<option value="Python">Python</option>
</select>
<label for="target" style="display: none;">Target PL:</label>
<select id="target" style="display: none;" onchange="filterData()">
<!-- <option value="">Select the target PL</option> -->
<option value="Java">Java</option>
<option value="Python">Python</option>
</select>
<a href="detailsPage.html" id="detailsLink">More Details</a>
</div>
<table id="data-table"
class="table table-responsive table-striped table-bordered flex-shrink-1 border border-dark border-3">
<thead>
<tr></tr> <!-- Headers are set dynamically -->
</thead>
<tbody></tbody>
</table>
<div id="notes">
<h3>More Leaderboards</h3>
<p style="font-size: large;">In addition to the <strong>CodeMind</strong> leaderboard, we recommend consulting a diverse set of benchmarks and leaderboards
to build a comprehensive picture of LLM coding ability, such as:</p>
<div class="inline-block mt-3">
<ol>
<li><a href="https://codetlingua.github.io/leaderboard.html">Code Lingua</a></li>
<li><a href="https://evalplus.github.io/leaderboard.html">EvalPlus</a></li>
<li><a href="https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard">Big Code Models Leaderboard</a></li>
<li><a href="https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard">Chatbot Arena Leaderboard</a></li>
<li><a href="https://github.com/amazon-science/cceval">CrossCodeEval</a></li>
<li><a href="https://fudanselab-classeval.github.io/">ClassEval</a></li>
<li><a href="https://crux-eval.github.io/leaderboard.html">CRUXEval</a></li>
<li><a href="https://evo-eval.github.io/">Evo-Eval</a></li>
<li><a href="https://github.com/01-ai/HumanEval.jl">HumanEval.jl</a></li>
<li><a href="https://infi-coder.github.io/inficoder-eval/">InfiCoder-Eval</a></li>
<li><a href="https://livecodebench.github.io/leaderboard.html">LiveCodeBench</a></li>
<li><a href="https://github.com/Leolty/repobench">RepoBench</a></li>
<li><a href="https://leaderboard.tabbyml.com/">TabbyML Leaderboard</a></li>
</ol>
</div>
</div>
</div>
<script src="script.js"></script>
</body>
</html>