Model Extraction Attack and Defense for Large Language Models (Oct 7th)

Speaker: Lincan (Kelsey) Li

Date: Oct 7, 11:45 am – 12:45 pm

Abstract: Model extraction attacks pose significant security threats to deployed language models, potentially compromising intellectual property and user privacy. This survey provides a comprehensive taxonomy of LLM-specific extraction attacks and defenses, categorizing attacks into functionality extraction, training data extraction, and prompt-targeted attacks. We analyze various attack methodologies including API-based knowledge distillation, direct querying, parameter recovery, and prompt stealing techniques that exploit transformer architectures. We then examine defense mechanisms organized into model protection, data privacy protection, and prompt-targeted strategies, evaluating their effectiveness across different deployment scenarios. We propose specialized metrics for evaluating both attack effectiveness and defense performance, addressing the specific challenges of generative language models. Through our analysis, we identify critical limitations in current approaches and propose promising research directions, including integrated attack methodologies and adaptive defense mechanisms that balance security with model utility. This work serves NLP researchers, ML engineers, and security professionals seeking to protect language models in production environments.

Biographical Sketch: Lincan (Kelsey) Li is a first-year PhD student in the Department of Computer Science at Florida State University, advised by Dr. Yushun Dong in the Reliable AI (RAI) Lab. Her research focuses on Trustworthy AI, spatial-temporal data mining, graph neural networks, and privacy and security. Prior to joining FSU, Lincan conducted research at the University of New South Wales and Zhejiang University. She is the co-first author of a KDD 2025 survey on model extraction attacks and defenses and will serve as the lead presenter for the KDD tutorial. Her research has been published in top venues such as SIGKDD, SIGSPATIAL, ICASSP, and SMC. She is also a core contributor to open-source projects such as STG-Mamba and PyGIP. Lincan actively serves as a reviewer for major AI conferences including NeurIPS, ICML, IJCAI, AAAI, and SIGKDD. She is passionate about applying AI methods to real-world problems through interdisciplinary collaborations.

Location: LOV 307 (In Person Only)
