Zhaoxuan Tan

Zhaoxuan Tan 「谭兆轩」

How to pronounce my name?

Zhaoxuan -> jow sh-yen.
Tan -> tæn.
You can just call me by my last name, Tan, which is much easier to pronounce.

Hi there, thanks for visiting my website! I'm a (Now() - 08/2023).ceil().ordinal()-year CSE PhD student at the University of Notre Dame, working with Prof. Meng Jiang. Prior to that, I obtained my bachelor's degree in computer science and technology at Xi'an Jiaotong University (2019-2023) and had a wonderful time doing research at the LUD lab, where I worked closely with Shangbin Feng and Prof. Minnan Luo.

I play with user data, including user-generated content and user behavior data, to personalize and enhance Large Language Models, as well as to detect suspicious user behavior.

Please feel free to drop me an email for any form of communication or collaboration!☘️

Email: ztan3 [at] nd [dot] edu / tanzx9 [at] gmail [dot] com

CV / Google Scholar / Semantic Scholar / X (Twitter) / Github / LinkedIn

🔥What's New

[2025.05] 4 papers were accepted to ACL 2025!👏 Main: PUGC, StepCo, MANU. Findings: CodeTaxo. See you in Vienna!🇦🇹
[2025.02] Excited to join Microsoft Office of Applied Research as a part-time research intern this spring and work with Dr. Pei Zhou and Dr. Mengting Wan!
[2025.01] PerRecBench is live on arXiv! We factor out user rating bias and item quality from user rating prediction and focus specifically on evaluating personalization. Come check out our work!
[2025.01] 2 papers were accepted to NAACL 2025 Main!👏 MLLMU-Bench, IHEval. See you in New Mexico!
[2024.09] 4 papers were accepted to EMNLP 2024!👏 Main: OPPU, Per-Pcs, ProCo. Findings: NLGift. See you in Miami!🏖️
[2024.07] Chain-of-Layer was accepted to CIKM 2024!👏
[2024.06] Per-Pcs is live on arXiv. Come check out our framework for personalizing LLMs through collaborative efforts!
[2024.05] 4 papers were accepted to ACL 2024!👏 Main: BotSay. Findings: SKU, DELL, and K-Crosswords.
[2024.02] OPPU is live on arXiv. Come check out our work!
[2024.01] KGQuiz was accepted to WWW 2024. Huge congrats to Yuyang!👏
[2024.01] I will join Amazon as an applied scientist intern this summer. See you in Palo Alto!
[2023.12] LLM-UM was accepted to DEBULL and is live on arXiv. Come check out our work and the reading list!🤗
[2023.10] LMBot was accepted to WSDM 2024. Huge congrats to Zijian!👏
[2023.10] BotPercent and MVSD were accepted to EMNLP 2023. Congrats to co-authors!
[2023.09] NLGraph was accepted to NeurIPS 2023 as a spotlight. Huge congrats to Heng!👏
[2023.07] I graduated from XJTU and stepped down as the LUD lab director. Be the Light of the World!🎓
[2023.05] KALM was accepted to ACL 2023, kudos to coauthors!👏
[2023.04] MVSD is live on arXiv. Come check out our work!
[2023.04] BotMoE was accepted to SIGIR 2023. Huge congrats to Yuhan!
[2023.03] I will join the University of Notre Dame to work with Prof. Meng Jiang this fall. Thank you for seeing my potential; I am looking forward to the upcoming PhD journey!🥳
[2023.02] BotPercent is live on arXiv. Come check out our work!
[2023.01] KRACL was accepted by WWW 2023, cheers!🍻
[2022.11] Our team CogDL-kgTransformer won 4th place in the WikiKG90Mv2 track of the OGB-LSC@NeurIPS2022 competition!

Selected Publications (* indicates equal contribution) [Google Scholar]

2025

Aligning Large Language Models with Implicit Preferences from User-Generated Content
Zhaoxuan Tan, Zheng Li, Tianyi Liu, Haodong Wang, Hyokun Yun, Ming Zeng, Pei Chen, Zhihan Zhang, Yifan Gao, Ruijie Wang, Priyanka Nigam, Bing Yin, Meng Jiang
Proceedings of ACL 2025.
project page

We proposed PUGC, a framework leveraging implicit human preferences in unlabeled user-generated content (UGC) for scalable and domain-specific alignment of LLMs, achieving significant performance improvements over traditional methods on alignment benchmarks.

Can Large Language Models Understand Preferences in Personalized Recommendation?
Zhaoxuan Tan, Zinan Zeng, Qingkai Zeng, Zhenyu Wu, Zheyuan Liu, Fengran Mo, Meng Jiang
arXiv preprint 2025.

We proposed PerRecBench, a benchmark that isolates user rating bias and item quality to evaluate recommendation techniques in a grouped ranking manner. It reveals that while larger LLMs perform better overall, they still struggle with personalized recommendation, which underscores the need for improved fine-tuning strategies and a better understanding of user preferences.

2024

Personalized Pieces: Efficient Personalized Large Language Models through Collaborative Efforts
Zhaoxuan Tan, Zheyuan Liu, Meng Jiang
Proceedings of EMNLP 2024.

We proposed Personalized Pieces (Per-Pcs) for personalizing large language models, where users can safely share and assemble personalized PEFT modules efficiently through collaborative efforts. Per-Pcs outperforms non-personalized and PEFT retrieval baselines, offering performance comparable to OPPU with significantly lower resource use, promoting safe sharing and making LLM personalization more efficient, effective, and widely accessible.

Democratizing Large Language Models via Personalized Parameter-Efficient Fine-tuning
Zhaoxuan Tan, Qingkai Zeng, Yijun Tian, Zheyuan Liu, Bing Yin, Meng Jiang
Proceedings of EMNLP 2024.

We proposed One PEFT Per User (OPPU) for personalizing large language models, where each user is equipped with a personal PEFT module that can be plugged into the base LLM to obtain their personal LLM. OPPU exhibits model ownership and enhanced generalization in capturing user behavior patterns compared to existing prompt-based LLM personalization methods.

LMBot: Distilling Graph Knowledge into Language Model for Graph-less Deployment in Twitter Bot Detection
Zijian Cai, Zhaoxuan Tan, Zhenyu Lei, Zifeng Zhu, Hongrui Wang, Qinghua Zheng, Minnan Luo
Proceedings of WSDM, 2024.

We propose LMBot, which utilizes a language model with graph-aware knowledge distillation to act as a proxy for graph-less Twitter bot detection inference. This approach effectively resolves graph data dependency and sampling bias issues.

2023

User Modeling in the Era of Large Language Models: Current Research and Future Directions
Zhaoxuan Tan, Meng Jiang
IEEE Data Engineering Bulletin (DEBULL), 2023.
reading list

We summarize existing research on how and why LLMs are great tools for modeling and understanding UGC. We then review several categories of large language models for user modeling (LLM-UM) approaches that integrate LLMs with text- and graph-based methods in different ways. Next, we introduce specific LLM-UM techniques for a variety of UM applications. Finally, we present the remaining challenges and future directions in LLM-UM research.

BotPercent: Estimating Bot Populations in Twitter Communities
Zhaoxuan Tan, Shangbin Feng, Melanie Sclar, Herun Wan, Minnan Luo, Yejin Choi, Yulia Tsvetkov
Proceedings of EMNLP-Findings, 2023.
demo / tweet

We introduce the concept of community-level Twitter bot detection and develop BotPercent, a multi-dataset, multi-model Twitter bot detection pipeline. Using BotPercent, we investigate the presence of bots in various Twitter communities and discover that bot distribution is heterogeneous in both space and time.

Can Language Models Solve Graph Problems in Natural Language?
Heng Wang, Shangbin Feng, Tianxing He, Zhaoxuan Tan, Xiaochuang Han, Yulia Tsvetkov
Proceedings of NeurIPS 2023 (Spotlight).
code

Are language models graph reasoners? We propose the NLGraph benchmark, a test bed for graph-based reasoning designed for language models in natural language. We find that LLMs are preliminary graph thinkers, while the most advanced graph reasoning tasks remain an open research question.

HOFA: Twitter Bot Detection with Homophily-Oriented Augmentation and Frequency Adaptive Attention
Sen Ye, Zhaoxuan Tan, Zhenyu Lei, Ruijie He, Hongrui Wang, Qinghua Zheng, Minnan Luo
arXiv preprint 2023.

We identify the heterophilous disguise challenge in Twitter bot detection and propose HOFA, a novel framework equipped with Homophily-Oriented Augmentation and Frequency Adaptive Attention to address this challenge.

BotMoE: Twitter Bot Detection with Community-Aware Mixtures of Modal-Specific Experts
Yuhan Liu, Zhaoxuan Tan, Heng Wang, Shangbin Feng, Qinghua Zheng, Minnan Luo
Proceedings of SIGIR 2023.

We propose a community-aware mixture-of-experts to address two challenges in detecting advanced Twitter bots: manipulated features and diverse communities.

KRACL: Contrastive Learning with Graph Context Modeling for Sparse Knowledge Graph Completion
Zhaoxuan Tan, Zilong Chen, Shangbin Feng, Qingyue Zhang, Qinghua Zheng, Jundong Li, Minnan Luo
Proceedings of The Web Conference (WWW), 2023.
code / talk

We adopt contrastive learning and a knowledge relational attention network to alleviate the widespread sparsity problem in knowledge graphs.

2022

TwiBot-22: Towards Graph-Based Twitter Bot Detection
Shangbin Feng*, Zhaoxuan Tan*, Herun Wan*, Ningnan Wang*, Zilong Chen*, Binchi Zhang*, Qinghua Zheng, Wenqian Zhang, Zhenyu Lei, Shujie Yang, Xinshun Feng, Qingyue Zhang, Hongrui Wang, Yuhan Liu, Yuyang Bai, Heng Wang, Zijian Cai, Yanbo Wang, Lijing Zheng, Zihan Ma, Jundong Li, Minnan Luo
Proceedings of NeurIPS, Datasets and Benchmarks Track, 2022.
website / GitHub / bibtex / poster

We present TwiBot-22, the largest graph-based Twitter bot detection benchmark to date, which provides diversified entities and relations in the Twittersphere and has considerably better annotation quality.

Heterogeneity-Aware Twitter Bot Detection with Relational Graph Transformers
Shangbin Feng, Zhaoxuan Tan, Rui Li, Minnan Luo
Proceedings of AAAI 2022.
slides / code / bibtex

We propose relational graph transformers, a GNN architecture that leverages the intrinsic relation heterogeneity and influence heterogeneity in the Twitter network.

Industrial Experience

Google, 2025.10 - 2025.12
Student Researcher @ Core ML
Hosts: Dr. Ao Liu, Dr. Yan Zhu
Remote

Amazon Science, 2025.06 - 2025.10
Applied Scientist Intern @ Rufus
Hosts: Dr. Zixuan Zhang, Dr. Zheng Li
Palo Alto, CA

Microsoft, 2025.03 - 2025.06
Research Intern @ Office of Applied Research
Hosts: Dr. Pei Zhou, Dr. Mengting Wan
Remote

Amazon Science, 2024.05 - 2024.10
Applied Scientist Intern @ Rufus
Hosts: Dr. Zheng Li, Dr. Tianyi Liu
Palo Alto, CA

Education

University of Notre Dame, 2023.08 - present
Ph.D. in Computer Science and Engineering
Advisor: Prof. Meng Jiang

Xi'an Jiaotong University, 2019.08 - 2023.07
B.E. in Computer Science and Technology
GPA: 89.1 (+3) / 100.0 [top 5%]
Advisor: Prof. Minnan Luo

Service

Area Chair: ARR (Feb 2025-)
Reviewer: TMLR (2025), ICML (2025), AISTATS (2025), COLM (2024, 2025), KDD (2024, 2025), ARR (Dec 2023-), WWW (2024, 2025), ICLR (2024, 2025), TKDE (2023-), TNNLS (2023-), ICWSM (2024), NeurIPS (2023-2025), NeurIPS D&B (2022), LoG (2022, 2023, 2024), SRW@ACL (2025), AGI@ICLR (2024), TGL@NeurIPS (2023), GCLR@AAAI (2024), KnowledgeNLP@ACL (2024), WiNLP@EMNLP (2024).
Volunteer: EMNLP 2023 (virtual), EMNLP 2022 (virtual).
Director of the LUD lab (promoting undergraduate research), 2022-2023.

Miscellaneous

I have had the good fortune to work with brilliant mentors, collaborators, and advisors during my research journey, and I am truly grateful for their guidance and help. If you feel I can be of some help to your research career, feel free to reach out!☕
I enjoy playing the trumpet🎺 and served as the principal trumpet player in my primary and high school bands🎼.
I am a big fan of Jules Verne, and I am especially fascinated by In Search of the Castaways, From the Earth to the Moon, and Five Weeks in a Balloon.
I also love jogging🏃 and playing table tennis🏓.
My favorite singers are Khalil Fong🕯️ and Jonathan Lee.

Template courtesy: Jon Barron.