Zhaoxuan Tan

How to pronounce my name?

Zhaoxuan -> jow sh-yen.
Tan -> tæn.
You can call me Joshua, which is pronounced similarly to Zhaoxuan.

Hi there, thanks for visiting my website! I'm a first-year CSE PhD student at the University of Notre Dame, where I am fortunate to be advised by Prof. Meng Jiang and affiliated with the DM2 lab. Prior to that, I obtained my bachelor's degree in computer science and technology at Xi'an Jiaotong University (2019-2023) and had a wonderful time conducting research at the LUD lab, where I worked closely with Shangbin Feng and Prof. Minnan Luo.

My primary research interest lies at the intersection of graph mining (especially knowledge graphs and social networks) and natural language processing, with a particular focus on computation for social good and LLMs for user modeling.

Please feel free to drop me an email or book a chat for any form of communication or collaboration!☘️

Email:  ztan3 [at] nd [dot] edu  /  tanzx9 [at] gmail [dot] com

CV  /  Google Scholar  /  Semantic Scholar  /  Twitter  /  Github  /  LinkedIn

🔥What's New
  • [2024.06] Per-Pcs is live on arXiv. Come check out our framework for personalizing LLMs through collaborative efforts!
  • [2024.05] 4 papers accepted to ACL 2024👏! Main: Bot-LLM, Findings: SKU, DELL, and K-Crosswords.
  • [2024.02] OPPU is live on arXiv. Check out our work!
  • [2024.01] KGQuiz was accepted to WWW 2024. Huge congrats to Yuyang!👏
  • [2024.01] I will join Amazon as an applied scientist intern this summer. See you in Palo Alto!
  • [2023.12] LLM-UM was accepted to DEBULL and is live on arXiv. Check out our work and the reading list!🤗
  • [2023.10] LMBot was accepted to WSDM 2024. Huge congrats to Zijian!👏
  • [2023.10] BotPercent and MVSD were accepted to EMNLP 2023. Congrats to co-authors!
  • [2023.09] NLGraph was accepted to NeurIPS 2023 as a spotlight. Huge congrats to Heng!👏
  • [2023.07] I graduated from XJTU and stepped down as the LUD lab director. Be the Light of the World!🎓
  • [2023.05] KALM was accepted to ACL 2023, kudos to coauthors!👏
  • [2023.04] MVSD is live on arXiv. Check out our work!
  • [2023.04] BotMoE was accepted to SIGIR 2023. Huge congrats to Yuhan!
  • [2023.03] I will join the University of Notre Dame to work with Prof. Meng Jiang this fall. Thank you for seeing my potential; I am looking forward to the upcoming PhD journey!🥳
  • [2023.02] BotPercent is live on arXiv. Check out our work!
  • [2023.01] KRACL was accepted by WWW 2023, cheers!🍻
  • [2022.11] Our team CogDL-kgTransformer won 4th place in the WikiKG90Mv2 track of the OGB-LSC@NeurIPS2022 competition!
Selected Publications (* indicates equal contribution) [Google Scholar]
Personalized Pieces: Efficient Personalized Large Language Models through Collaborative Efforts
Zhaoxuan Tan, Zheyuan Liu, Meng Jiang
arXiv preprint 2024.

We propose Personalized Pieces (Per-Pcs) for personalizing large language models, where users can safely share and assemble personalized PEFT modules efficiently through collaborative efforts. Per-Pcs outperforms non-personalized and PEFT retrieval baselines and offers performance comparable to OPPU with significantly lower resource use. It promotes safe sharing and makes LLM personalization more efficient, effective, and widely accessible.

Democratizing Large Language Models via Personalized Parameter-Efficient Fine-tuning
Zhaoxuan Tan, Qingkai Zeng, Yijun Tian, Zheyuan Liu, Bing Yin, Meng Jiang
arXiv preprint 2024.

We propose One PEFT Per User (OPPU) for personalizing large language models, where each user is equipped with a personal PEFT module that can be plugged into the base LLM to obtain their personal LLM. OPPU exhibits model ownership and enhanced generalization in capturing user behavior patterns compared to existing prompt-based LLM personalization methods.

LMBot: Distilling Graph Knowledge into Language Model for Graph-less Deployment in Twitter Bot Detection
Zijian Cai, Zhaoxuan Tan, Zhenyu Lei, Zifeng Zhu, Hongrui Wang, Qinghua Zheng, Minnan Luo
Proceedings of WSDM, 2024.

We propose LMBot, which utilizes a language model with graph-aware knowledge distillation to act as a proxy for graph-less Twitter bot detection inference. This approach effectively resolves graph data dependency and sampling bias issues.

User Modeling in the Era of Large Language Models: Current Research and Future Directions
Zhaoxuan Tan, Meng Jiang
IEEE Data Engineering Bulletin (DEBULL), 2023.
reading list

We summarize existing research on how and why LLMs are great tools for modeling and understanding user-generated content (UGC). We then review several categories of large language models for user modeling (LLM-UM) approaches, which integrate LLMs with text- and graph-based methods in different ways, and introduce specific LLM-UM techniques for a variety of UM applications. Finally, we present the remaining challenges and future directions in LLM-UM research.

BotPercent: Estimating Bot Populations in Twitter Communities
Zhaoxuan Tan*, Shangbin Feng*, Melanie Sclar, Herun Wan, Minnan Luo, Yejin Choi, Yulia Tsvetkov
Proceedings of EMNLP-Findings, 2023.
demo / tweet

We introduce the concept of community-level Twitter bot detection and develop BotPercent, a multi-dataset, multi-model Twitter bot detection pipeline. Using BotPercent, we investigate the presence of bots in various Twitter communities and discover that bot distribution is heterogeneous in both space and time.

Can Language Models Solve Graph Problems in Natural Language?
Heng Wang*, Shangbin Feng*, Tianxing He, Zhaoxuan Tan, Xiaochuang Han, Yulia Tsvetkov
Proceedings of NeurIPS 2023 (Spotlight)

Are language models graph reasoners? We propose the NLGraph benchmark, a test bed for graph-based reasoning designed for language models in natural language. We find that LLMs are preliminary graph thinkers while the most advanced graph reasoning tasks remain an open research question.

HOFA: Twitter Bot Detection with Homophily-Oriented Augmentation and Frequency Adaptive Attention
Sen Ye, Zhaoxuan Tan, Zhenyu Lei, Ruijie He, Hongrui Wang, Qinghua Zheng, Minnan Luo
arXiv preprint 2023.

We identify the heterophilous disguise challenge in Twitter bot detection and propose HOFA, a novel framework equipped with Homophily-Oriented Augmentation and Frequency Adaptive Attention to address it.

BotMoE: Twitter Bot Detection with Community-Aware Mixtures of Modal-Specific Experts
Yuhan Liu, Zhaoxuan Tan, Heng Wang, Shangbin Feng, Qinghua Zheng, Minnan Luo
Proceedings of SIGIR 2023.

We propose a community-aware mixture-of-experts framework to address two challenges in detecting advanced Twitter bots: manipulated features and diverse communities.

KRACL: Contrastive Learning with Graph Context Modeling for Sparse Knowledge Graph Completion
Zhaoxuan Tan, Zilong Chen, Shangbin Feng, Qingyue Zhang, Qinghua Zheng, Jundong Li, Minnan Luo
Proceedings of The Web Conference (WWW), 2023.
code / talk

We adopt contrastive learning and a knowledge relational attention network to alleviate the widespread sparsity problem in knowledge graphs.

TwiBot-22: Towards Graph-Based Twitter Bot Detection
Shangbin Feng*, Zhaoxuan Tan*, Herun Wan*, Ningnan Wang*, Zilong Chen*, Binchi Zhang*, Qinghua Zheng, Wenqian Zhang, Zhenyu Lei, Shujie Yang, Xinshun Feng, Qingyue Zhang, Hongrui Wang, Yuhan Liu, Yuyang Bai, Heng Wang, Zijian Cai, Yanbo Wang, Lijing Zheng, Zihan Ma, Jundong Li, Minnan Luo
Proceedings of NeurIPS, Datasets and Benchmarks Track, 2022.
website / GitHub / bibtex / poster

We present TwiBot-22, the largest graph-based Twitter bot detection benchmark to date, which provides diversified entities and relations in the Twittersphere and has considerably better annotation quality.

Heterogeneity-Aware Twitter Bot Detection with Relational Graph Transformers
Shangbin Feng, Zhaoxuan Tan, Rui Li, Minnan Luo
Proceedings of AAAI 2022.
slides / code / bibtex

We propose relational graph transformers, a GNN architecture that leverages the intrinsic relation heterogeneity and influence heterogeneity in the Twitter network.

Industrial Experience
Amazon Science
2024.05 - 2024.08

Applied Scientist Intern @ Rufus
Host: Dr. Zheng Li
Palo Alto, CA
Education
University of Notre Dame
2023.08 - present

Ph.D. in Computer Science and Engineering
Advisor: Prof. Meng Jiang
Xi'an Jiaotong University
2019.08 - 2023.07

B.E. in Computer Science and Technology
GPA: 89.1 (+3) / 100.0 [top 5%]
Advisor: Prof. Minnan Luo
Service
  • Reviewer: COLM (2024), KDD (2024), ARR (Dec 2023-), WWW (2024), ICLR (2024), TKDE (2023-), TNNLS (2023-), ICWSM (2024), NeurIPS (2023, 2024), NeurIPS Datasets and Benchmarks Track (2022), LoG (2022, 2023, 2024), AGI@ICLR (2024), TGL@NeurIPS (2023), GCLR@AAAI (2024), KnowledgeNLP@ACL (2024), WiNLP@EMNLP (2024).
  • Volunteer: EMNLP 2023 (virtual), EMNLP 2022 (virtual).
  • Director of the LUD lab (promoting undergraduate research), 2022-2023.
Miscellaneous
  • I have had the good fortune to work with brilliant mentors, collaborators, and advisors during my research journey, and I am truly grateful for their guidance and help. If you feel I can be of some help to your research career, feel free to reach out!☕
  • My Chinese name is 谭兆轩 (Tan, Zhaoxuan).
  • I enjoy playing the trumpet🎺 and served as the principal trumpet player in my primary and high school bands🎼.
  • I am a big fan of Jules Verne and am especially fascinated by In Search of the Castaways, From the Earth to the Moon, and Five Weeks in a Balloon.
  • I also love jogging🏃 and playing table tennis🏓.

Template courtesy: Jon Barron.