Ph.D. student at The Hong Kong Polytechnic University
Researching LLMs, Security, Privacy, and Alignment
I specialize in analyzing the potential risks inherent in language models, with a focus on understanding why and how neural networks function and identifying vulnerabilities within them. My research is driven by a deep curiosity to uncover the mechanisms behind these models and to address the security challenges they present.
My work can be categorized into two main areas:
- Uncovering New Threats and Developing Defenses: I conduct comprehensive evaluations of popular AI services and techniques, combining in-depth theoretical analysis with practical experimentation.
- Enhancing Understanding of Models and Learning Processes: I aim to explain the root causes of safety issues in AI systems, examining how these problems arise during model training and inference, and what they imply for the broader field of machine learning.
In addition to my research, I have extensive experience in natural language processing (NLP), particularly in building conversational AI systems, an area I have worked in since 2019. Since 2024, I have also developed a strong interest in the future of AI, especially in applying reinforcement learning (RL) to advance the capabilities and safety of intelligent systems.
Project | Venue | Description |
---|---|---|
LoRA-sSecurity | ICML'25 | Explores vulnerabilities introduced by LoRA-based fine-tuning, proposing both theoretical analysis and practical attacks. |
LoRD-MEA | ACL'25 | Proposes an RL-based model extraction attack tailored for alignment-aware LLMs. |
PromptExtractionEval | Preprint | Analyzes why your prompts get leaked and proposes practical defenses. |
MERGE | AAAI'24 | Lightweight MPC + HE framework for privacy-preserving, fast text generation in real-world scenarios. |
ISAAC | AAAI'25 | Investigates implicit alignment signals in training corpora and their influence on downstream tuning. |
Project | Description |
---|---|
zhouyi | A playful and philosophical implementation of divination using the ancient Chinese classic Zhouyi (易经). Reflects a blend of symbolic computation, randomness, and interpretation. |
easy-collections | A lightweight Emacs package for collecting fragments of important information in Emacs (e.g., code snippets, paragraphs, terminal output) into a single file. It is quick and does not interrupt your workflow. |
.emacs.d | My Emacs configuration. |
PolyU's Poster Theme | An unofficial LaTeX poster template for PolyU students. |
This README was automatically generated by GPT-4o and manually refined by Zi Liang.