- University of Macau
- Macau
- https://ruiyang-061x.github.io/
- @Ruiyang_061X
Stars
AutoHallusion Codebase (EMNLP 2024)
[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
[ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning
Code for "AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling"
An Open-source Framework for Data-centric, Self-evolving Autonomous Language Agents
[AAAI 2024] Official PyTorch implementation of "BAT: Behavior-Aware Human-Like Trajectory Prediction for Autonomous Driving".
✨ A curated list of papers on uncertainty in multi-modal large language models (MLLMs).
[ACL 2024] Logical Closed Loop: Uncovering Object Hallucinations in Large Vision-Language Models. Detects and mitigates object hallucinations in LVLMs through logical closed loops.
Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models
Benchmarking LLMs via Uncertainty Quantification
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs
[ACL 2023 Area Chair Award] Official repo for the paper "Tell2Design: A Dataset for Language-Guided Floor Plan Generation".
[CVPR 2024 Highlight] Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding
✨✨The Curse of Multi-Modalities (CMM): Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio
Code and data of "Unified Triplet-Level Hallucination Evaluation for Large Vision-Language Models".
😎 Awesome lists about all kinds of interesting topics
🚀 [NeurIPS'24] Make vision matter in Visual Question Answering (VQA)! Introducing NaturalBench, a vision-centric VQA benchmark that challenges vision-language models with simple questions…
Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis.
Up-to-date curated list of state-of-the-art research on hallucinations in large vision-language models: papers & resources.
🔎 Official code for our paper: "VL-Uncertainty: Detecting Hallucination in Large Vision-Language Model via Uncertainty Estimation".
Image augmentation for machine learning experiments.
[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters
PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838
[ICLR24] Official implementation of the paper “MagicDrive: Street View Generation with Diverse 3D Geometry Control”
An open-source framework for training large multimodal models.
[ECCV 2024] ShareGPT4V: Improving Large Multi-modal Models with Better Captions
[NeurIPS'24 Oral] HydraLoRA: An Asymmetric LoRA Architecture for Efficient Fine-Tuning
List of papers on hallucination detection in LLMs.