I am currently a Research Fellow at City University of Hong Kong working with Prof. Cong WANG I have received my Ph.D. degree with “Outstanding Graduate of Zhejiang University” from Zhejiang University in 2024, advised by Prof. Zhan QIN. My research interests include Trustworthy AI, LLM Security and Safety. My recent work is mainly about LLM watermarking and privacy protection of LLM.
🔥 News
- 2025.04: 🎉 Our paper “Artificial Intelligence Security and Privacy: a Survey” is accepted by SCIENCE CHINA Information Sciences
- 2024.12: 🎉 Our paper FDINet is accepted by IEEE TDSC 2024
- 2024.08: 🎉 Our paper Explanation as a Watermark is accepted by NDSS 2025
- 2024.03: 🔥 We release AIcert Platform,Media
- 2023.12: 🎉 Our paper PoisonPrompt is accepted by IEEE ICASSP 2024
- 2023.10: 🎉 Our paper PromptCARE is accepted by IEEE S&P 2024
- 2023.09: 🔥 Our work is promoted by New Scientist
- 2023.09: 🎉 Our paper RemovalNet is accepted by IEEE TDSC 2023
📝 Publications
🎙 LLM

TAPI: Towards Target-Specific and Adeversarial Prompt Injection against Code LLMs
Yuchen Yang, Hongwei Yao, Zhan Qin, Kui Ren,
- This paper proposes a new attack paradigm, i.e., targetspecific and adversarial prompt injection (TAPI), against Code LLMs. TAPI generates unreadable comments containing information about malicious instructions and hides them as triggers in the external source code.
- This paper successfully attacks some famous deployed code completion integrated applications, including CodeGeex and Github Copilot.

PromptCARE: Prompt Copyright Protection by Watermark Injection and Verification
Hongwei Yao, Jian Lou, Zhan Qin, Kui Ren,
[Code]
- In this paper, we propose PromptCARE, the first framework for prompt copyright protection through watermark injection and verification.
- Academic Impact: Our work are promoted by media and forums, such as New Scientist、GOSSIP、隐者联盟、安全内参.

PoisonPrompt: Backdoor Attack on Prompt-based Large Language Models
Hongwei Yao, Jian Lou, Zhan Qin,
[Code]
- In this paper, we present PoisonPrompt, a novel backdoor attack capable of successfully compromising both hard and soft prompt-based LLMs. We evaluate the effectiveness, fidelity, and robustness of PoisonPrompt through extensive experiments on three popular prompt methods, using six datasets and three widely used LLMs.
arxiv 2025BadReward: Clean-Label Poisoning of Reward Models in Text-to-Image RLHF, Kaiwen Duan, Hongwei Yao, Yufei Chen, Ziyun Li, Tong Qiao, Zhan Qin, Cong Wangarxiv 2025Quantifying Conversation Drift in MCP via Latent Polytope, Haoran Shi, Hongwei Yao, Tong Qiao, Zhan Qin, Cong Wangarxiv 2025ControlNET: A Firewall for RAG-based LLM System, Hongwei Yao, Haoran Shi, Yidou Chen, Zhan Qin, Cong Wang,arxiv 2024TAPI: Towards Target-Specific and Adeversarial Prompt Injection against Code LLMs, Yuchen Yang, Hongwei Yao, Zhan Qin, Kui Renarxiv 2024Eguard: Mitigating Privacy Risks in LLM Embeddings from Embedding Inversion, Tiantian Liu, Hongwei Yao, Zhan Qin, Feng Lin, Kui RenIEEE TDSC 2024FDINet: Protecting against DNN Model Extraction via Feature Distortion Index, Hongwei Yao, Zheng Li, Haiqin Weng, Feng Xue, Zhan Qin, Kui RenNDSS 2025Explanation as a Watermark: Harmless and Multi-bit Model Ownership Verification via Watermarking Feature Attribution, Shuo Shao, Yiming Li, Hongwei Yao, Yiling He, Zhan Qin, Kui RenIEEE S&P 2024PromptCARE: Prompt Copyright Protection by Watermark Injection and Verification, Hongwei Yao, Jian Lou, Zhan Qin, Kui RenIEEE ICASSP 2024PoisonPrompt: Backdoor Attack on Prompt-based Large Language Models, Hongwei Yao, Jian Lou, Zhan Qin, Kui RenIEEE TDSC 2023RemovalNet: DNN Fingerprint Removal Attacks, Hongwei Yao, Zheng Li, Kunzhe Huang, Jian Lou, Zhan Qin, Kui Ren, IEEE Transactions on Dependable and Secure Computing, 2023Elsevier 2022Classifying Between Computer Generated and Natural Images: An Empirical Study from RAW to JPEG Format, Tong Qiao, Xiangyang Luo, Hongwei Yao, Elsevier Journal of Visual Communication and Image Representation, 2022Sensors 2020Image Forgery Detection and Localization via a Reliability Fusion Map, Hongwei Yao, Ming Xu, Tong Qiao, Ling Zheng, MDPI Sensors, 2020IEEE Access 2018Robust Multi-classifier for Camera Model Identification based on Convolution Neural Network, Hongwei Yao, Tong Qiao, Ming Xu, Ling Zheng, IEEE Access, 2018
🔏 Patents
- 姚宏伟,娄坚,秦湛,任奎. 一种大模型提示词版权验证方法及装置(发明专利,已进入实质审查,CN202311744252.0)
- 姚宏伟,秦湛,任奎. 一种深度神经网络模型指纹鲁棒性评估方法(发明专利,已进入实质审查,CN202311144816.7)
- 姚宏伟,任奎,秦湛,王志波,屠春来,牛文杰. 一种基于特征失真指数的模型窃取防御方法及装(发明专利,已进入实质审查,CN202211524887.5)
📚 Books and Technical Reports
- 《人工智能安全白皮书(2020)》 Media1/Media2/Media3
- 《人工智能安全》
🎖 Honors and Awards
- [2024] Outstanding Graduate of Zhejiang University, by Zhejiang University
- [2023] Award of Honor for Graduate, by Zhejiang University
- [2022] Outstanding Graduate Student, by Zhejiang University
- [2021] Award of Honor for Graduate, by Zhejiang University
- [2021] Graduate of Merit, by Zhejiang University
- [2020] Ph.D Freshman Scholarship, by Zhejiang University
- [2020] Outstanding Graduate of Hangzhou Dianzi University, by Hangzhou Dianzi University
- [2019] Zhejiang Province 16th “The Challenge Cup” College Students Science and Technology Competition, (First Prize), by Zhejiang Province
- [2018] China Internet Development Foundation Cyberspace Security Scholarship, by China Internet Development Foundation
- [2018] Huawei Scholarship, by Huawei
- [2018] Hack {China} Hackathon Competition, (First Prize), by Hangzhou Dianzi University
- [2018] Unique Hackathon Competition, (First Prize), by Huazhong University
💬 Invited Talks
- 2023.11, Deep Copyright Protection
🧑🎨 Services
- Reviewer of ICLR, AAAI 2026
- Reviewer of ICLR, NeurIPS, ICML, IEEE ICASSP, AISTATS 2025
- Reviewer of IEEE ICASSP 2025
- Reviewer of IEEE Transactions on Dependable and Secure Computing (TDSC)
- Reviewer of IEEE Access
- Reviewer of ACM Multimedia Systems
- Reviewer of The Journal of Supercomputing