Publications

$^\dagger$: Corresponding author.

You can also find my publications on Google Scholar.

2026

6DAttack: Backdoor Attacks in the 6DoF Pose Estimation

Jihui Guo, Zongmin Zhang, Zhen Sun, Yuhao Yang, Jinlin Wu, Fu Zhang$^\dagger$, and Xinlei He$^\dagger$; AAAI 2026 (Oral)

An Improved Privacy and Utility Analysis of Differentially Private SGD with Bounded Domain and Smooth Losses

Hao Liang, Wanrong Zhang, Xinlei He, Kaishun Wu, and Hong Xing; AAAI 2026 (Poster)

2025

CHASM: Unveiling Covert Advertisements on Chinese Social Media

Jingyi Zheng, Tianyi Hu, Yule Liu, Zhen Sun, Zongmin Zhang, Wenhan Dong, Zifan Peng, and Xinlei He; NeurIPS 2025

FacLens: Transferable Probe for Foreseeing Non-Factuality in Fact-Seeking Question Answering of Large Language Models

Yanling Wang, Haoyang Li, Hao Zou, Jing Zhang, Xinlei He, Qi Li, and Ke Xu; EMNLP 2025

FC-Attack: Jailbreaking Multimodal Large Language Models via Auto-Generated Flowcharts

Ziyi Zhang, Zhen Sun, Zongmin Zhang, Jihui Guo, Xinlei He$^\dagger$; EMNLP 2025 (Findings)

Zeren Luo, Zifan Peng, Yule Liu, Zhen Sun, Mingchen Li, Jingyi Zheng, Xinlei He$^\dagger$; USENIX Security 2025

On the Generalization and Adaptation Ability of Machine-Generated Text Detectors in Academic Writing

Yule Liu, Zhiyuan Zhong, Yifan Liao, Zhen Sun, Jingyi Zheng, Jiaheng Wei, Qingyuan Gong, Fenghua Tong, Yang Chen, Yang Zhang, and Xinlei He$^\dagger$; KDD 2025 (Datasets and Benchmarks Track)

TH-Bench: Evaluating Evading Attacks via Humanizing AI Text on Machine-Generated Text Detector

Jingyi Zheng, Junfeng Wang, Zhen Sun, Wenhan Dong, Yule Liu, and Xinlei He$^\dagger$; KDD 2025 (Datasets and Benchmarks Track)

Are We in the AI-Generated Text World Already? Quantifying and Monitoring AIGT on Social Media

Zhen Sun, Zongmin Zhang, Xinyue Shen, Ziyi Zhang, Yule Liu, Michael Backes, Yang Zhang, and Xinlei He$^\dagger$; ACL 2025

Beyond the Tip of Efficiency: Uncovering the Submerged Threats of Jailbreak Attacks in Small Language Models

Sibo Yi, Tianshuo Cong, Xinlei He, Qi Li, and Jiaxing Song; ACL 2025 (Findings)

Neeko: Model Hijacking Attacks Against Generative Adversarial Networks

Junjie Chu, Yugeng Liu, Xinlei He, Michael Backes, Yang Zhang, and Ahmed Salem; IEEE ICME 2025

PEFTGuard: Detecting Backdoor Attacks Against Parameter-Efficient Fine-Tuning

Zhen Sun, Tianshuo Cong, Yule Liu, Chenhao Lin, Xinlei He$^\dagger$, Rongmao Chen, Xingshuo Han, and Xinyi Huang; IEEE S&P 2025

From Purity to Peril: Backdooring Merged Models From “Harmless” Benign Components

Lijin Wang, Jingjing Wang, Tianshuo Cong$^\dagger$, Xinlei He$^\dagger$, Zhan Qin, and Xinyi Huang; USENIX Security 2025

Safety Misalignment Against Large Language Models

Yichen Gong, Delong Ran, Xinlei He, Tianshuo Cong, Anyu Wang, and Xiaoyun Wang; NDSS Symposium 2025

(AR: 211/1311=16.1%, AR Fall: 14.5%) 🎖️ Artifact Badges: Available, Functional, Reproduced

CL-Attack: Textual Backdoor Attacks via Cross-Lingual Triggers

Jingyi Zheng, Tianyi Hu, Tianshuo Cong, and Xinlei He$^\dagger$; AAAI 2025

Have You Merged My Model? On The Robustness of Large Language Model IP Protection Methods Against Model Merging

Tianshuo Cong, Delong Ran, Zesen Liu, Xinlei He, Jinyuan Liu, Yichen Gong, Qi Li, Anyu Wang, Xiaoyun Wang; 1st ACM CCS Workshop on Large AI Systems and Models with Privacy and Safety Analysis (LAMPS)

🏆 Best Paper Award

2024

MGTBench: Benchmarking Machine-Generated Text Detection

Xinlei He, Xinyue Shen, Zeyuan Chen, Michael Backes, Yang Zhang; CCS 2024

SecurityNet: Assessing Machine Learning Vulnerabilities on Public Models

Boyang Zhang, Zheng Li, Ziqing Yang, Xinlei He, Michael Backes, Mario Fritz, Yang Zhang; USENIX Security 2024

You Only Prompt Once: On the Capabilities of Prompt Learning on Large Language Models to Tackle Toxic Content

Xinlei He, Savvas Zannettou, Yun Shen, Yang Zhang; S&P 2024

Test-Time Poisoning Attacks Against Test-Time Adaptation Models

Tianshuo Cong, Xinlei He, Yun Shen, Yang Zhang; S&P 2024

Yixin Wu, Xinlei He, Pascal Berrang, Mathias Humbert, Michael Backes, Neil Zhenqiang Gong, Yang Zhang; PoPETs 2024

2023

Unsafe Diffusion: On the Generation of Unsafe Images and Hateful Memes From Text-To-Image Models

Yiting Qu, Xinyue Shen, Xinlei He, Michael Backes, Savvas Zannettou, Yang Zhang; CCS 2023

Data Poisoning Attacks Against Multimodal Encoders

Ziqing Yang, Xinlei He, Zheng Li, Michael Backes, Mathias Humbert, Pascal Berrang, Yang Zhang; ICML 2023

Generated Graph Detection

Yihan Ma, Zhikun Zhang, Ning Yu, Xinlei He, Michael Backes, Yun Shen, Yang Zhang; ICML 2023

Can't Steal? Cont-Steal! Contrastive Stealing Attacks Against Image Encoders

Zeyang Sha, Xinlei He, Ning Yu, Michael Backes, Yang Zhang; CVPR 2023

A Plot is Worth a Thousand Words: Model Information Stealing Attacks via Scientific Plots

Boyang Zhang, Xinlei He, Yun Shen, Tianhao Wang, Yang Zhang; USENIX Security 2023

On the Evolution of (Hateful) Memes by Means of Multimodal Contrastive Learning

Yiting Qu, Xinlei He, Shannon Pierson, Michael Backes, Yang Zhang, Savvas Zannettou; S&P 2023

2022

Semi-Leak: Membership Inference Attacks Against Semi-supervised Learning

Xinlei He, Hongbin Liu, Neil Zhenqiang Gong, Yang Zhang; ECCV 2022

SSLGuard: A Watermarking Scheme for Self-supervised Learning Pre-trained Encoders

Tianshuo Cong, Xinlei He, Yang Zhang; CCS 2022

Auditing Membership Leakages of Multi-Exit Networks

Zheng Li, Yiyong Liu, Xinlei He, Ning Yu, Michael Backes, Yang Zhang; CCS 2022

Model Stealing Attacks Against Inductive Graph Neural Networks

Yun Shen*, Xinlei He*, Yufei Han, Yang Zhang (* Equal Contribution); S&P 2022

ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine Learning Models

Yugeng Liu, Rui Wen, Xinlei He, Ahmed Salem, Zhikun Zhang, Michael Backes, Emiliano De Cristofaro, Mario Fritz, Yang Zhang; USENIX Security 2022

On Xing Tian and the Perseverance of Anti-China Sentiment Online

Xinyue Shen, Xinlei He, Michael Backes, Jeremy Blackburn, Savvas Zannettou, Yang Zhang; ICWSM 2022

2021

Quantifying and Mitigating Privacy Risks of Contrastive Learning

Xinlei He, Yang Zhang; CCS 2021

Xinlei He, Jinyuan Jia, Michael Backes, Neil Zhenqiang Gong, Yang Zhang; USENIX Security 2021

Trimming Mobile Applications for Bandwidth-Challenged Networks in Developing Regions

Qinge Xie, Qingyuan Gong, Xinlei He, Yang Chen, Xin Wang, Haitao Zheng, Ben Y. Zhao; IEEE Transactions on Mobile Computing (TMC)

DatingSec: Detecting Malicious Accounts in Dating Apps Using a Content-Based Attention Network

Xinlei He, Qingyuan Gong, Yang Chen, Yang Zhang, Xin Wang, Xiaoming Fu; IEEE Transactions on Dependable and Secure Computing (TDSC)

Cross-Site Prediction on Social Influence for Cold-Start Users in Online Social Networks

Qingyuan Gong, Yang Chen, Xinlei He, Yu Xiao, Pan Hui, Xin Wang, Xiaoming Fu; ACM Transactions on the Web (TWEB)

DeepScan: Exploiting Deep Learning for Malicious Account Detection in Location-Based Social Networks

Qingyuan Gong, Yang Chen, Xinlei He, Zhou Zhuang, Tianyi Wang, Hong Huang, Xin Wang, Xiaoming Fu; IEEE Communications Magazine
