From 7529a2266d598b8168a7b1a9732c288dde2d271d Mon Sep 17 00:00:00 2001
From: Akshay Paruchuri
Date: Fri, 20 Sep 2024 17:15:05 -0400
Subject: [PATCH] Small fixes for EMNLP paper

---
 index.html | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/index.html b/index.html
index 753cd90..2b351a3 100644
--- a/index.html
+++ b/index.html
@@ -131,8 +131,8 @@

This website is a work in progress. Please see my Google Scholar for a full, more up-to-date list of my publications!

-
-
+
+
teaser img
@@ -146,7 +146,7 @@
EMNLP 2024 (Main)

-Probabilistic reasoning is a key challenge for large language models (LLMs) that requires understanding and interpreting numerical data across distributions. In our paper, we systematically evaluate LLMs on three core tasks—percentile estimation, sampling, and probability calculation—across both real-world and idealized distributions. By incorporating techniques such as within-distribution anchoring, real-world context, and simplifying assumptions (e.g., Normal approximations), we demonstrate performance improvements of up to 70% over baseline methods. We will release our benchmark dataset to encourage further development of the reasoning capabilities of LLMs, allowing them to become more useful, safer, and more reliable.
+Probabilistic reasoning is a key challenge for large language models (LLMs). Our paper evaluates LLMs on three tasks (estimating percentiles, drawing samples, and calculating probabilities) using real-world and idealized distributions. Techniques such as within-distribution anchoring, real-world context, and simplifying assumptions (e.g., Normal approximations) improved performance by up to 70%.
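
For readers unfamiliar with the tasks named in the abstract, the sketch below illustrates what a percentile query under a Normal approximation looks like. It is not code from the paper or the site; the heart-rate framing and the mean/std values are made-up placeholders.

from statistics import NormalDist

# Hypothetical real-world quantity (placeholder values, not from the paper):
# adult resting heart rate, approximated as Normal with mean 70 bpm, std 10 bpm.
dist = NormalDist(mu=70.0, sigma=10.0)
value = 85.0

# Percentile estimation under the Normal approximation: P(X <= value) * 100.
percentile = dist.cdf(value) * 100
print(f"{value} bpm sits at roughly the {percentile:.0f}th percentile")

# The other two tasks follow the same idea under this simplifying assumption:
samples = dist.samples(5, seed=42)            # sampling
prob_60_to_80 = dist.cdf(80) - dist.cdf(60)   # probability calculation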