Curriculum Vitae
Education
- M.Phil., The Hong Kong University of Science and Technology (Guangzhou)
- AI Thrust & INFO Hub
- Majoring in Artificial Intelligence (AI, GPA: 3.323/4)
2024-Present
- B.S., China University of Petroleum, Beijing
- Department of Mathematics, School of Arts and Sciences
- Majoring in Statistics (STAT, GPA: 3.58/4)
2020-2024
Research Experience
Postgraduate (2024-Present):
LLM-Based Intelligent Companion AI Digital Humans with Consciousness and Memory
Project summary: LLM-based companion agents in fields such as education and healthcare face challenges including prompt-injection attacks, data leakage, low interpretability, and insufficient emotional engagement. We constructed a comprehensive framework that strengthens security protection, improves reasoning capabilities, quantifies uncertainty, and customizes emotional support, with the goal of creating a safe, reliable, and empathetic AI assistant. I was responsible for the post-training and interpretability modules of the LLM.
Outcomes
- Agent Module: Adopted Qwen3-14B as the pre-trained model and performed single-GPU fine-tuning (A800, 80 GB) on the CBT-bench dataset.
- Interpretability Module:
- Explored potential relationships among conceptual annotations in the dataset using the Apriori algorithm for association rule learning, enhancing the interpretability of causal variables.
- Designed an innovative framework, Latent Disentanglement-Concept Bottleneck Models (LDCBMs). Compared with previous models on the CUB, AwA2, and CelebA datasets, the concept alignment rate increased by 0.1242 ± 0.0576 and the label accuracy improved by 0.1316 ± 0.0203, achieving SOTA performance; moreover, the entire input-concept-output process remains interpretable.
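The association-rule step above can be illustrated with a minimal, self-contained Apriori sketch. The concept names and the dataset here are made-up toy data, not the project's actual annotations; the function name is likewise illustrative.

```python
from itertools import combinations

def apriori_frequent_itemsets(transactions, min_support):
    """Return {frozenset: support} for all itemsets meeting min_support.

    transactions: list of sets, each the active concept annotations of one sample.
    """
    n = len(transactions)
    # Level 1: candidate itemsets are the individual concepts.
    items = {i for t in transactions for i in t}
    level = [frozenset([i]) for i in items]
    frequent = {}
    k = 1
    while level:
        # Count how many samples contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in level}
        survivors = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(survivors)
        # Join frequent k-itemsets into (k+1)-itemset candidates (Apriori join step).
        keys = list(survivors)
        level = list({a | b for a, b in combinations(keys, 2) if len(a | b) == k + 1})
        k += 1
    return frequent

# Toy concept annotations (hypothetical), one set per sample.
data = [{"beak", "wings"}, {"beak", "wings", "tail"},
        {"wings", "tail"}, {"beak", "wings"}]
freq = apriori_frequent_itemsets(data, min_support=0.5)
```

Frequent co-occurring concept sets like these are the raw material for association rules, which is how co-occurrence patterns among concept annotations can be surfaced.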
Backdoor Attack Defense for Concept Bottleneck Models
Project Summary: In explainable AI (XAI), Concept Bottleneck Models (CBMs) enhance interpretability via understandable underlying concepts, but they are vulnerable to concept-level backdoor attacks, in which hidden triggers embedded in concepts cause hard-to-detect misbehavior. We proposed Conceptguard, the first defense of this kind: construct poisoned datasets, partition the data into subsets, and apply majority voting to mitigate data-driven backdoor effects.
Key Contributions:
- Theoretically proved a minimum trigger-size threshold above which Conceptguard effectively defends against attacks (average backdoor success rate reduced by ~30%).
- Led the CBM baseline and Conceptguard experiments on the CUB dataset, demonstrating improved concept accuracy. Identified a clustering mechanism as an unsupervised way to avoid concept-level category conflicts and enhance feature learning, better capturing concept correlations.
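The subset-partition-and-vote idea described above can be sketched as follows. This is a hypothetical simplification, not the project's actual implementation: the helper names and toy data are assumptions, and real Conceptguard operates on trained CBM sub-models rather than raw label lists.

```python
import random

def partition(dataset, n_subsets, seed=0):
    """Randomly split the dataset into n disjoint subsets.

    A backdoor trigger poisoning a small fraction of the data can then
    influence only the few sub-models trained on the affected subsets.
    """
    rng = random.Random(seed)
    shuffled = dataset[:]
    rng.shuffle(shuffled)
    return [shuffled[i::n_subsets] for i in range(n_subsets)]

def majority_vote(predictions):
    """Return the label predicted by the most sub-models."""
    return max(set(predictions), key=predictions.count)

# Toy example: predictions from 5 sub-models, one flipped by a hypothetical trigger.
votes = [1, 1, 0, 1, 1]
final = majority_vote(votes)  # the single poisoned vote is outvoted
```

Because each sub-model sees a disjoint slice of the data, the vote stays correct as long as fewer than half the sub-models are compromised, which is the intuition behind the trigger-size threshold mentioned above.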
Publications
Awards
- Sep. 2022: First Prize, China Undergraduate Mathematical Contest in Modeling (CUMCM)
- Jun. 2022: Grand Prize (1/4983), "Renzheng Cup" National Student Mathematical Modeling Competition
- Nov. 2021: H Prize, "Shuwei Cup" International Student Mathematical Modeling Competition
- Nov. 2021: Third Prize, "Huajiao Cup" National University Mathematics Competition
- Feb. 2022: H Prize, Interdisciplinary Contest in Modeling (MCM/ICM)
- Aug. 2022: Second Prize, "MathorCup" National Student Mathematical Modeling Competition
- May 2022: Second Prize, "Renzheng Cup" National Student Mathematical Modeling Competition
- Jun. 2022: Second Prize, China Student Computer Design Competition
- Sep. 2022: Second Prize, National University Math Network Championship
