
Commit

Small fixes for EMNLP paper
yahskapar committed Sep 20, 2024
1 parent d7d9db1 commit 7529a22
Showing 1 changed file with 3 additions and 3 deletions.
6 changes: 3 additions & 3 deletions index.html
@@ -131,8 +131,8 @@
<br>
<p>This website is a work in progress. Please see my <a href="https://scholar.google.com/citations?hl=en&user=tc0M7WwAAAAJ&view_op=list_works&sortby=pubdate">Google Scholar</a> for a full, more up-to-date list of my publications!</p>

<div class="row content-summary pt-4 pb-2"></div>

<div class="row content-summary pt-4 pb-2">
<div class="d-none d-sm-block col-sm-3 m-0 p-0">

<img src=/media/papers/prob_reasoning_in_LLMs/prob_reasoning_in_LLMs.png class="img-fluid summary-image drop-shadow" alt="teaser img">
@@ -146,7 +146,7 @@
EMNLP 2024 (Main)
</p>

<p class="summary-text">Probabilistic reasoning is a key challenge for large language models (LLMs) that requires understanding and interpreting numerical data across distributions. In our paper, we systematically evaluate LLMs on three core tasks—percentile estimation, sampling, and probability calculation—across both real-world and idealized distributions. By incorporating techniques such as within-distribution anchoring, real-world context, and simplifying assumptions (e.g., Normal approximations), we demonstrate performance improvements of up to 70% over baseline methods. We will release our benchmark dataset to encourage further development of the reasoning capabilities of LLMs, allowing them to become more useful, safer, and more reliable.</p>
<p class="summary-text">Probabilistic reasoning is a key challenge for large language models (LLMs). Our paper evaluates LLMs on three tasks: estimating percentiles, drawing samples, and calculating probabilities, using both real-world and idealized distributions. Techniques such as within-distribution anchoring, real-world context, and simplifying assumptions (e.g., Normal approximations) improved performance by up to 70%.</p>

<div class="d-flex flex-row flex-wrap">

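The "Normal approximations" simplifying assumption mentioned in the revised summary can be illustrated with a short sketch. This is a hypothetical example, not code from the paper's benchmark; the sample data, function names, and parameters below are all illustrative assumptions. It compares an empirical percentile computed directly from data against the percentile implied by a fitted Normal distribution.

```python
# Illustrative sketch only: percentile estimation with and without a
# Normal simplifying assumption. Not the paper's benchmark code.
import math
import random

random.seed(0)
# Hypothetical "real-world" sample, roughly Normal(170, 10).
data = [random.gauss(170, 10) for _ in range(10_000)]

def empirical_percentile(xs, value):
    """Percentile of `value`: the percentage of observations <= value."""
    return 100 * sum(x <= value for x in xs) / len(xs)

def normal_percentile(mean, std, value):
    """Percentile of `value` under a Normal(mean, std) approximation,
    computed from the standard Normal CDF via math.erf."""
    z = (value - mean) / (std * math.sqrt(2))
    return 100 * 0.5 * (1 + math.erf(z))

# Fit the Normal approximation from the sample's mean and std deviation.
mean = sum(data) / len(data)
std = math.sqrt(sum((x - mean) ** 2 for x in data) / len(data))

# For a value one standard deviation above the mean, both estimates
# should land near the ~84th percentile.
print(empirical_percentile(data, 180))
print(normal_percentile(mean, std, 180))
```

Under a setup like this, the two estimates agree closely because the data really is Normal; on skewed real-world distributions the Normal approximation trades some accuracy for a much simpler calculation, which is the kind of simplification the summary refers to.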
