rllm-project/rllm_datasets

SJTUTables: A Benchmark for Relational Table Learning

SJTUTables is a collection of benchmark datasets for Relational Table Learning (RTL). Released as part of the rLLM project, it comprises three enhanced relational table datasets: TML1M, TLF2K, and TACM12K. Each is derived from a well-known classical dataset and paired with a standard classification task. Their simple, well-organized structure makes them well suited to quickly developing and evaluating RTL methods.

  • TML1M is derived from the classical MovieLens1M dataset and contains three relational tables related to movie recommendation: users, movies, and ratings.
  • TLF2K is derived from the classical LastFM2K dataset and includes three relational tables related to music preferences: artists, user-artist interactions, and user-friend relationships.
  • TACM12K is derived from the ACM heterogeneous graph dataset and contains four relational tables for academic publications: papers, authors, writing relationships, and citation relationships.
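
To illustrate what "relational tables" means here, the sketch below links three toy tables shaped like the TML1M description (users, movies, ratings) through shared keys. It is only an illustration: the column names (`user_id`, `movie_id`, `age_group`, `title`, `rating`), the rows, and the helper `join_ratings` are all hypothetical, and the actual schema and loading API of the released datasets may differ.

```python
# Toy rows with HYPOTHETICAL column names and values;
# the real TML1M schema may differ.
users = [
    {"user_id": 1, "age_group": "18-24"},
    {"user_id": 2, "age_group": "25-34"},
]
movies = [
    {"movie_id": 10, "title": "Movie A"},
    {"movie_id": 11, "title": "Movie B"},
]
ratings = [
    {"user_id": 1, "movie_id": 10, "rating": 5},
    {"user_id": 1, "movie_id": 11, "rating": 3},
    {"user_id": 2, "movie_id": 10, "rating": 4},
]


def join_ratings(users, movies, ratings):
    """Resolve each rating's foreign keys against the users and movies tables."""
    # Index the two entity tables by their primary keys for O(1) lookup.
    users_by_id = {u["user_id"]: u for u in users}
    movies_by_id = {m["movie_id"]: m for m in movies}

    joined = []
    for r in ratings:
        user = users_by_id[r["user_id"]]
        movie = movies_by_id[r["movie_id"]]
        # Merge the user row, the movie row, and the rating value into one record.
        joined.append({**user, **movie, "rating": r["rating"]})
    return joined


joined_rows = join_ratings(users, movies, ratings)
for row in joined_rows:
    print(row)
```

The ratings table acts as the link between the two entity tables, which is the kind of cross-table structure an RTL method is expected to exploit rather than flatten away.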

Citation

@article{rllm2024,
      title={rLLM: Relational Table Learning with LLMs}, 
      author={Weichen Li and Xiaotong Huang and Jianwu Zheng and Zheng Wang and Chaokun Wang and Li Pan and Jianhua Li},
      year={2024},
      eprint={2407.20157},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2407.20157}, 
}
