diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index ecb8144..a095f9e 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -53,7 +53,10 @@ jobs:
         run: |
           git config --global user.name "github-actions"
           git config --global user.email "github-actions@github.com"
-          git add README.md paper_by_env paper_by_author update_template_or_data/statistics
+          git add README.md
+          git add paper_by_env
+          git add paper_by_author
+          git add update_template_or_data/statistics
           git commit -m "Auto Updates"
           git push
         env:
diff --git a/update_template_or_data/update_paper_list.md b/update_template_or_data/update_paper_list.md
index b31dda7..ddd16fc 100644
--- a/update_template_or_data/update_paper_list.md
+++ b/update_template_or_data/update_paper_list.md
@@ -1,5 +1,12 @@
-
+- [Proposer-Agent-Evaluator (PAE): Autonomous Skill Discovery For Foundation Model Internet Agents](https://arxiv.org/abs/2412.13194)
+    - Yifei Zhou, Qianlan Yang, Kaixiang Lin, Min Bai, Xiong Zhou, Yu-Xiong Wang, Sergey Levine, Erran Li
+    - 🏛️ Institutions: MSR, UIUC, Amazon
+    - 📅 Date: December 17, 2024
+    - 📑 Publisher: arXiv
+    - 💻 Env: [Web]
+    - 🔑 Key: [framework], [reinforcement learning], [skill discovery], [PAE]
+    - 📖 TLDR: This paper introduces the Proposer-Agent-Evaluator (PAE) system, enabling foundation model agents to autonomously discover and practice skills in real-world web environments. PAE comprises a context-aware task proposer, an agent policy for task execution, and a vision-language model-based success evaluator. Validated on vision-based web navigation tasks, PAE significantly enhances zero-shot generalization capabilities of vision-language model Internet agents, achieving over 30% relative improvement on unseen tasks and websites, and surpassing state-of-the-art open-source agents by more than 10%.
 - [UI Layout Generation with LLMs Guided by UI Grammar](https://arxiv.org/abs/2310.15455)
     - Yuwen Lu, Ziang Tong, Qinyi Zhao, Chengzhi Zhang, Toby Jia-Jun Li