
Xuehai He
Xuehai He is a Ph.D. in Computer Science at the University of California, Santa Cruz, working with Xin Eric Wang. His research focuses on Multimodal Learning and Machine Learning. Before this, he was at the University of California, San Diego, working with Prof. Pengtao Xie. Xuehai began his research at the University of Electronic Science and Technology of China.
Selected Publications
Jiachen Li, Qiaozi Gao, Michael Johnston, Xiaofeng Gao, Xuehai He, Suhaila Shakiah, Hangjie Shi, Reza Ghanadan, William Yang Wang. Mastering robot manipulation with multimodal prompts through pretraining and multi-task fine-tuning. ICML, 2024.
Kenan Jiang*, Xuehai He*, Ruize Xu, Xin Eric Wang. ComCLIP: Training-Free Compositional Image and Text Matching. NAACL, 2024.
Kaizhi Zheng, Xiaotong Chen, Xuehai He, Jing Gu, Linjie Li, Zhengyuan Yang, Kevin Lin, Jianfeng Wang, Lijuan Wang, Xin Eric Wang. EditRoom: LLM-parameterized Graph Diffusion for Composable 3D Room Layout Editing. ICLR, 2025.
Xuehai He, Weixi Feng, Kaizhi Zheng, Yujie Lu, Wanrong Zhu, Jiachen Li, Yue Fan, Jianfeng Wang, Linjie Li, Zhengyuan Yang, Kevin Lin, William Yang Wang, Lijuan Wang, Xin Eric Wang. MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos. ICLR, 2025.
Xuehai He, Jian Zheng, Jacob Zhiyuan Fang, Robinson Piramuthu, Mohit Bansal, Vicente Ordonez, Gunnar A Sigurdsson, Nanyun Peng, Xin Eric Wang. FlexEControl: Flexible and Efficient Multimodal Control for Text-to-Image Generation. TMLR, 2024.
Pengtao Xie, Xingchen Zhao, Xuehai He. Simultaneous Selection and Adaptation of Source Data via Four-Level Optimization. TACL, 2024.
Xuehai He, Weixi Feng, Tsu-Jui Fu, Varun Jampani, Arjun Akula, Pradyumna Narayana, Sugato Basu, William Yang Wang, Xin Eric Wang. Discffusion: Discriminative Diffusion Models as Few-shot Vision and Language Learners. TMLR, 2024.
Weixi Feng*, Wanrong Zhu*, Tsu-Jui Fu, Varun Jampani, Arjun Reddy Akula, Xuehai He, Sugato Basu, Xin Eric Wang, William Yang Wang. LayoutGPT: Compositional Visual Planning and Generation with Large Language Models. NeurIPS, 2023.
Pengtao Xie, Xingchen Zhao, Xuehai He. Improve the Performance of CT-based Pneumonia Classification via Source Data Reweighting. Nature Scientific Reports.
Xuehai He, Xin Eric Wang. Multimodal Graph Transformer for Multimodal Question Answering. EACL, 2023.
Weixi Feng, Xuehai He, Tsu-Jui Fu, Varun Jampani, Arjun Reddy Akula, Pradyumna Narayana, Sugato Basu, Xin Eric Wang, William Yang Wang. Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis. ICLR, 2023.
Xuehai He, Chunyuan Li, Pengchuan Zhang, Jianwei Yang, Xin Eric Wang. Parameter-efficient Model Adaptation for Vision Transformers. AAAI, 2023.
Xuehai He, Diji Yang, Weixi Feng, Tsu-Jui Fu, Arjun Akula, Varun Jampani, Pradyumna Narayana, Sugato Basu, William Yang Wang, Xin Eric Wang. CPL: Counterfactual Prompt Learning for Vision and Language Models. EMNLP, 2022.
Tarun Gupta, Xuehai He, Mostofa Rafid Uddin, Xiangrui Zeng, Andrew Zhou, Jing Zhang, Zachary Freyberg, Min Xu. Self-supervised learning for macromolecular structure classification based on cryo-electron tomograms. Frontiers in Physiology.
Xuehai He*, Zhuo Cai*, Wenlan Wei, Yichen Zhang, Luntian Mou, Eric Xing, Pengtao Xie. Towards Visual Question Answering on Pathology Images. ACL, 2021.
Wenmian Yang, Guangtao Zeng, Bowen Tan, Zeqian Ju, Subrato Chakravorty, Xuehai He, Shu Chen, Xingyi Yang, Qingyang Wu, Zhou Yu, Eric Xing, Pengtao Xie. On the Generation of Medical Dialogues for COVID-19. ACL, 2021.
Selected Preprints
Xuehai He, Shuohang Wang, Jianwei Yang, Xiaoxia Wu, Yiping Wang, Kuan Wang, Zheng Zhan, Olatunji Ruwase, Yelong Shen, Xin Eric Wang. Mojito: Motion Trajectory and Intensity Control for Video Generation. Under review. [PDF]
Kaizhi Zheng*, Xuehai He*, Xin Eric Wang. MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens. Under review. [PDF]
Kaizhi Zheng*, Kaiwen Zhou*, Jing Gu*, Yue Fan*, Jialu Wang*, Zonglin Di, Xuehai He, Xin Eric Wang. JARVIS: A Neuro-Symbolic Commonsense Reasoning Framework for Conversational Embodied Agents. Under review.
Xuehai He*, Xingyi Yang*, Shanghang Zhang, Jinyu Zhao, Yichen Zhang, Eric Xing, Pengtao Xie. Sample-Efficient Deep Learning for COVID-19 Diagnosis Based on CT Scans. Under review. [PDF]
Xuehai He*, Xingyi Yang*, Yue Yang, Ruofan Guo, Yuxiao Liang, Shanghang Zhang, Li Du, Pengtao Xie. Supervised Pretraining or Self-supervised Pretraining? A Tale of Two Transfer Learning Paradigms. Preprint arXiv:2007.04234. [PDF]
Xingyi Yang, Xuehai He, Jinyu Zhao, Yichen Zhang, Shanghang Zhang, Pengtao Xie. COVID-CT Dataset: A CT Scan Dataset about COVID-19. Preprint arXiv:2003.13865. [PDF]
Academic Activities
- Conference Reviewer: ICASSP'19, IJCAI'21, AAAI'21, CVPR'21, ICCV'21, CVPR'22, ECCV'22, NeurIPS'22, EMNLP'22, EACL'23, CVPR'23, ACL'23, ICML'23, ICCV'23, NeurIPS'23, EMNLP'23, ACL Rolling'24, ICML'24.
- Journal Reviewer:
  - IEEE Access'19'20;
  - Transactions on Pattern Analysis and Machine Intelligence'24.
- Program Committee Member: NeurIPS 2021 Workshop: Self-Supervised Learning -- Theory and Practice.
- Workshop Co-organizer:
  - AAAI 2021 Workshop -- Trustworthy AI for Healthcare;
  - ECCV 2022 Workshop -- Workshop on Computer Vision in the Wild;
  - CVPR 2024 Workshop -- 4th Workshop on Computer Vision in the Wild.
- Workshop Reviewer: