Ph.D. Candidate
CAS Key Laboratory of AI Safety
Institute of Computing Technology, Chinese Academy of Sciences
No. 6 Kexueyuan South Road, Zhongguancun, Haidian District, Beijing, China
Homepage: https://yuanyige.github.io
Email: yuanyige20z [AT] ict.ac.cn or yuanyige923 [AT] gmail.com
Github: https://github.com/yuanyige
Research Interests: Trustworthy Machine Learning, Generalization and Robustness, Deep Generative Models
I am currently pursuing my Ph.D. at the CAS Key Laboratory of AI Safety, advised by Prof. Xueqi Cheng and Assoc. Prof. Bingbing Xu. I obtained my B.S. degree in Information Security from Xidian University in 2020.
My research interests span machine learning, especially generalization, robustness, physics-inspired machine learning, and deep generative models. I am also considering expanding my research to the exploration of large language models (LLMs).
TL;DR: In this paper, we investigate generalization from an energy-based perspective and introduce TEA, a test-time adaptation method that transforms a trained classifier into an energy-based model and aligns the model's distribution with that of the test data, enhancing its ability to perceive test distributions and thus improving overall generalizability.
TL;DR: In this paper, we investigate generalization from a partial differential equation (PDE) perspective and propose the PDE-ADD framework. We introduce adaptive distributional diffusion into the transport equation to enhance the smoothness of its solution, thereby improving generalization directly through the underlying function of the neural network.
TL;DR: In this paper, we point out that the traditional InfoNCE loss inevitably enlarges the distribution gap between different domains and impairs OOD generalization, because it restricts cross-domain pairs to be negative samples only. To address this issue, we treat the most semantically similar cross-domain negative pairs as positives.
TL;DR: In this paper, we identify a circular dependency issue in GNN sampling: the sampling probability of the minimum-variance sampler is determined by node embeddings, yet node embeddings cannot be computed until sampling is finished. We propose HDSGNN, which estimates the minimum-variance sampler from historical node embeddings, breaking the circular dependency and improving the scalability of GNNs.
TL;DR: In this paper, we propose a generalization-ability metric for graph contrastive learning (GCL) and prove a mutual-information upper bound for it from an information-theoretic perspective. Guided by this bound, we design the InfoAdv framework, which can be applied to existing GCL models and achieves state-of-the-art performance.