Curriculum Vitae
Education
- M.Phil., The Hong Kong University of Science and Technology (Guangzhou)
- AI Thrust & INFO Hub
- Majoring in Artificial Intelligence (AI, GPA: 3.323/4)
2024-Present
- B.S., China University of Petroleum, Beijing
- Department of Mathematics, School of Arts and Sciences
- Majoring in Statistics (STAT, GPA: 3.58/4)
2020-2024
Research Experience
Postgraduate (2024-Present):
LLM-Based Intelligent Companion AI Digital Humans with Consciousness and Memory
Project summary: LLM-based companion agents in fields such as education and healthcare face challenges including prompt-injection attacks, data leakage, low interpretability, and insufficient emotional engagement. We constructed a comprehensive framework that strengthens security protection, improves reasoning capabilities, quantifies uncertainty, and customizes emotional support, with the goal of creating a safe, reliable, and empathetic AI assistant. I was responsible for the post-training and interpretability modules of the LLM.
Outcomes
- Agent Module: Adopted Qwen3-14B as the pre-trained model and performed single-GPU fine-tuning (A800, 80 GB) on the CBT-bench dataset.
- Interpretability Module:
- Explored potential relationships among conceptual annotations in the dataset using the Apriori algorithm for association rule learning, enhancing the interpretability of causal variables.
- Designed an innovative framework, Latent Disentanglement-Concept Bottleneck Models (LDCBMs). Compared with previous models on the CUB, AwA2, and CelebA datasets, the concept alignment rate increased by 0.1242 ± 0.0576 and the label accuracy improved by 0.1316 ± 0.0203, achieving SOTA performance; moreover, the entire input-concept-output process remains interpretable.
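The association-rule step above can be illustrated with a minimal, self-contained Apriori sketch. The concept names and the dataset here are made-up toy data, not the project's actual annotations; the function name is likewise illustrative.

```python
from itertools import combinations

def apriori_frequent_itemsets(transactions, min_support):
    """Return {frozenset: support} for all itemsets meeting min_support.

    transactions: list of sets, each the active concept annotations of one sample.
    """
    n = len(transactions)
    # Level 1: candidate itemsets are the individual concepts.
    items = {i for t in transactions for i in t}
    level = [frozenset([i]) for i in items]
    frequent = {}
    k = 1
    while level:
        # Count how many samples contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in level}
        survivors = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(survivors)
        # Join frequent k-itemsets into (k+1)-itemset candidates (Apriori join step).
        keys = list(survivors)
        level = list({a | b for a, b in combinations(keys, 2) if len(a | b) == k + 1})
        k += 1
    return frequent

# Toy concept annotations (hypothetical), one set per sample.
data = [{"beak", "wings"}, {"beak", "wings", "tail"},
        {"wings", "tail"}, {"beak", "wings"}]
freq = apriori_frequent_itemsets(data, min_support=0.5)
```

Frequent co-occurring concept sets like these are the raw material for association rules, which is how co-occurrence patterns among concept annotations can be surfaced.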
Backdoor Attack Defense for Concept Bottleneck Models
Project Summary: In explainable AI (XAI), Concept Bottleneck Models (CBMs) enhance interpretability via understandable underlying concepts, but they are vulnerable to concept-level backdoor attacks, in which hidden triggers embedded in concepts cause hard-to-detect misbehavior. We proposed Conceptguard, the first defense of this kind: construct poisoned datasets, partition the data into subsets, and apply majority voting to mitigate data-driven backdoor effects.
Key Contributions:
- Theoretically proved a minimum trigger-size threshold above which Conceptguard effectively defends against attacks (average backdoor success rate reduced by ~30%).
- Led the CBM baseline and Conceptguard experiments on the CUB dataset, demonstrating improved concept accuracy. Identified a clustering mechanism as an unsupervised way to avoid concept-level category conflicts and enhance feature learning, better capturing concept correlations.
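The subset-partition-and-vote idea described above can be sketched as follows. This is a hypothetical simplification, not the project's actual implementation: the helper names and toy data are assumptions, and real Conceptguard operates on trained CBM sub-models rather than raw label lists.

```python
import random

def partition(dataset, n_subsets, seed=0):
    """Randomly split the dataset into n disjoint subsets.

    A backdoor trigger poisoning a small fraction of the data can then
    influence only the few sub-models trained on the affected subsets.
    """
    rng = random.Random(seed)
    shuffled = dataset[:]
    rng.shuffle(shuffled)
    return [shuffled[i::n_subsets] for i in range(n_subsets)]

def majority_vote(predictions):
    """Return the label predicted by the most sub-models."""
    return max(set(predictions), key=predictions.count)

# Toy example: predictions from 5 sub-models, one flipped by a hypothetical trigger.
votes = [1, 1, 0, 1, 1]
final = majority_vote(votes)  # the single poisoned vote is outvoted
```

Because each sub-model sees a disjoint slice of the data, the vote stays correct as long as fewer than half the sub-models are compromised, which is the intuition behind the trigger-size threshold mentioned above.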
Publications
Awards
- Sep. 2022: First Prize, China Undergraduate Mathematical Contest in Modeling (CUMCM)
- Jun. 2022: Grand Prize (1/4983), "Renzheng Cup" National Student Mathematical Modeling Competition
- Nov. 2021: H Prize, "Shuwei Cup" International Student Mathematical Modeling Competition
- Nov. 2021: Third Prize, "Huajiao Cup" National University Mathematics Competition
- Feb. 2022: H Prize, Interdisciplinary Contest in Modeling (MCM/ICM)
- Aug. 2022: Second Prize, "MathorCup" National Student Mathematical Modeling Competition
- May 2022: Second Prize, "Renzheng Cup" National Student Mathematical Modeling Competition
- Jun. 2022: Second Prize, China Student Computer Design Competition
- Sep. 2022: Second Prize, National University Math Network Championship
