Ze-Feng Gao (高泽峰)

Lecturer, School of Physics, Renmin University of China; member of the university's Young Talents program. His research spans numerical methods in quantum physics, pre-trained model compression, and AI-assisted discovery and generation of functional crystalline materials. Building on the matrix product operator (MPO) representation, he has developed a unified set of methods along three lines: MPO-based neural networks, model compression and miniaturization for speech enhancement, and lightweight fine-tuning and model expansion for pre-trained models. He also applies artificial intelligence to the discovery of new functional materials. He has published more than twenty papers in leading AI venues such as ACL, NeurIPS, EMNLP, and COLING and in major SCI journals including National Science Review and Physical Review Research. Among them, his work on over-parameterizing pre-trained models via matrix product operators received an ACL 2023 Best Paper nomination. His results have been cited by experts from the University of Cambridge, Stanford University, Meta, and other institutions. Over the past three years he has led seven funded projects, including an NSFC Young Scientists Fund project, an NSFC General Program project, a sub-project of an NSFC Key Program, a sub-project of the National Key R&D Program special project on "Physical State Control", and three industry-funded projects.

Email: zfgao@ruc.edu.cn

GitHub  /  Google Scholar  /  DBLP


Education

  • 2016.09-2021.06, Renmin University of China, Ph.D. in Theoretical Physics. Advisor: Prof. Zhong-Yi Lu.
  • 2012.09-2016.06, Renmin University of China, B.S. in Physics.
Work Experience

  • 2021.07-2024.06, Renmin University of China, Postdoctoral Researcher. Advisors: Ji-Rong Wen, Wayne Xin Zhao, Hao Sun.
  • 2021.07-present, Renmin University of China, Key Laboratory of Quantum Measurement and Control (Ministry of Education), Associate Research Fellow



Invited Talks

  • Compressing deep neural network by matrix product operators, The 9th International Conference on Quantum Many-Body Computational Physics, Renmin University of China, Beijing, 2019.04
  • Parameter-Efficient Mixture-of-Experts Architecture for Pre-trained Language Models, BAAI Conference 2022, Beijing, 2022.05
  • Matrix Product Operator and Natural Language Processing, Capital Normal University, Beijing, 2023.04
  • Small Pre-trained Language Models Can be Fine-tuned as Large Models via Over-Parameterization, KDD Summer School, Southwest Jiaotong University, Chengdu, 2023.07
  • Artificial Intelligence Techniques and Applications of Tensor Methods, Beijing Normal University, Beijing, 2024.05
  • A First Exploration of the Intersection of Artificial Intelligence and Physics, international graduate school course, Zhengzhou University, Henan, 2024.08
  • AI-driven Inverse Design of Materials, The 2nd Workshop on Quantum Intelligent Computing, Chengdu, 2025.01
Research Grants

  • National Natural Science Foundation of China (NSFC), General Program, Research on efficient compression methods for large-scale pre-trained language models (No. 62476278), PI
  • NSFC, Young Scientists Fund, Research on lightweight fine-tuning and model-expansion methods for large-scale pre-trained language models (No. 62206299), PI
  • NSFC, Key Program (sub-project), First-principles many-body computational methods for the ultrafast dynamics of strongly correlated electron materials (No. 12434009), PI
  • Beijing Academy of Artificial Intelligence (BAAI), industry-funded project, Lightweight fine-tuning of multimodal models based on matrix product operators, PI
  • CCF-Zhipu Large Model Fund, industry-funded project, Tensor-decomposition-based compression strategies for pre-trained models, PI
  • University of Chinese Academy of Sciences, subcontracted project, Physics-inspired deep graph learning theory and techniques for modeling nonlinear dynamical systems, PI
  • Projects participated in:
  • NSFC, General Program, Physics-embedded deep graph learning theory and algorithms for scientific computing of complex spatiotemporal systems (No. 62276269)
  • NSFC, General Program, Combining, developing, and applying tensor networks and neural networks in strongly correlated systems (No. 11874421)
  • NSFC, General Program, Natural-orbital renormalization group study of several problems in quantum impurity systems (No. 11774422)
  • NSFC, General Program, Study of several problems in many-body localization (No. 11874421)
  • Beijing Natural Science Foundation, key research topic, Research on trapped-ion quantum computing technologies and algorithms



Teaching

  • Spring 2022-2023: Introduction to Artificial Intelligence (undergraduate)
  • Fall 2022-2023: Artificial Intelligence and Physics (undergraduate)
  • Fall 2024-2025: Artificial Intelligence and Physics (undergraduate)
  • Spring 2025-2026: The Nobel Prize in Physics and Modern Physics (undergraduate)



Publications

    2025


    Strong phonon-mediated high temperature superconductivity in Li2AuH6 under ambient pressure


    Zhenfeng Ouyang, Bo-Wen Yao, Xiao-Qi Han, Peng-Jie Guo, Ze-Feng Gao#, Zhong-Yi Lu#
    arXiv, 2025
    paper / arxiv / code /

    We used our AI search engine (InvDesFlow) to perform an extensive investigation of superconducting hydrides that are stable at ambient pressure. A cubic Li2AuH6 structure with Au-H octahedral motifs is identified as a candidate. After a thermodynamic analysis, we provide a feasible route to synthesizing this material experimentally from the known LiAu and LiH compounds at ambient pressure. Further first-principles calculations suggest that Li2AuH6 has a high superconducting transition temperature (Tc) of about 140 K at ambient pressure.


    AI-accelerated Discovery of Altermagnetic Materials


    Ze-Feng Gao*, Shuai Qu*, Bocheng Zeng*, Yang Liu, Ji-Rong Wen, Hao Sun#, Peng-Jie Guo#, Zhong-Yi Lu#
    National Science Review, 2025
    paper / arxiv / code /

    In this paper, we successfully discovered 50 new altermagnetic materials that cover metals, semiconductors, and insulators confirmed by the first-principles electronic structure calculations. The wide range of electronic structural characteristics reveals that various novel physical properties manifest in these newly discovered altermagnetic materials, e.g., anomalous Hall effect, anomalous Kerr effect, and topological property.




    2024


    Type-II Dirac nodal chain semimetal CrB4


    Xiao-Yao Hou, Ze-Feng Gao, Peng-Jie Guo#, Jian-Feng Zhang, Zhong-Yi Lu#
    arXiv, 2024
    paper / arxiv /

    In this study, based on symmetry analysis and first-principles electronic structure calculations, we predict that CrB4 is an ideal type-II Dirac nodal chain semimetal protected by mirror symmetry. Moreover, there are two nodal rings protected by both space-inversion and time-reversal symmetries in CrB4. More importantly, in CrB4 the topologically protected drumhead surface states span the entire Brillouin zone at the Fermi level.


    AI-driven inverse design of materials: Past, present and future


    Xiao-Qi Han, Xin-De Wang, Meng-Yuan Xu, Zhen Feng, Bo-Wen Yao, Peng-Jie Guo, Ze-Feng Gao#, Zhong-Yi Lu#
    Chinese Physics Letters, 2024
    paper / arxiv / link /

    In this survey, we look back on the latest advancements in AI-driven inverse design of materials by introducing the background, key findings, and mainstream technological development routes. In addition, we summarize the remaining issues for future directions. This survey provides the latest overview of AI-driven inverse design of materials, which can serve as a useful resource for researchers.


    Over-parameterized Student Model via Tensor Decomposition Boosted Knowledge Distillation


    Yu-Liang Zhan, Zhong-Yi Lu, Hao Sun#, Ze-Feng Gao#
    38th Conference on Neural Information Processing Systems (NeurIPS 2024), 2024
    paper / arxiv / code / link /

    In this paper, we scale up the parameters of the student model during training, to benefit from over-parameterization without increasing the inference latency. In particular, we propose a tensor decomposition strategy that effectively over-parameterizes the relatively small student model through an efficient and nearly lossless decomposition of its parameter matrices into higher-dimensional tensors.


    Crystal valley Hall effect


    Chao-Yang Tan, Ze-Feng Gao, Huan-Cheng Yang, Zheng-Xin Liu, Kai Liu, Peng-Jie Guo#, Zhong-Yi Lu#
    arXiv, 2024
    paper / arxiv /

    In this paper, based on symmetry analysis and first-principles electronic structure calculations, we demonstrate that the valley Hall effect without time-reversal symmetry can be realized in the two-dimensional altermagnetic materials Fe2WSe4 and Fe2WS4. Because crystal symmetry is required, the valley Hall effect without time-reversal symmetry is called the crystal valley Hall effect. In addition, under uniaxial strain, both monolayer Fe2WSe4 and Fe2WS4 can realize the piezomagnetic effect.


    AI-accelerated discovery of high critical temperature superconductors


    Xiao-Qi Han, Zhenfeng Ouyang, Peng-Jie Guo, Hao Sun, Ze-Feng Gao#, Zhong-Yi Lu#
    arXiv, 2024
    paper / arxiv / code /

    In this paper, we develop an AI search engine that integrates deep model pre-training and fine-tuning techniques, diffusion models, and physics-based approaches (e.g., first-principles electronic structure calculation) for the discovery of high-Tc superconductors. Using this AI search engine, we have obtained 74 dynamically stable materials with critical temperatures predicted by the AI model to be Tc ≥ 15 K, based on a very small set of samples.


    YuLan: An Open-source Large Language Model


    Yutao Zhu, Kun Zhou, Kelong Mao, Wentong Chen, Yiding Sun, Zhipeng Chen, Qian Cao, Yihan Wu, Yushuo Chen, Feng Wang, Lei Zhang, Junyi Li, Xiaolei Wang, Lei Wang, Beichen Zhang, Zican Dong, Xiaoxue Cheng, Yuhan Chen, Xinyu Tang, Yupeng Hou, Qiangqiang Ren, Xincheng Pang, Shufang Xie, Wayne Xin Zhao, Zhicheng Dou, Jiaxin Mao, Yankai Lin, Ruihua Song, Jun Xu, Xu Chen, Rui Yan, Zhewei Wei, Di Hu, Wenbing Huang, Ze-Feng Gao, Yueguo Chen, Weizheng Lu, Ji-Rong Wen
    arXiv, 2024
    paper / arxiv /

    In this paper, we design a three-stage pre-training method to enhance YuLan's overall capabilities. Subsequent phases of training incorporate instruction-tuning and human alignment, employing a substantial volume of high-quality synthesized data. To facilitate the learning of complex and long-tail knowledge, we devise a curriculum-learning framework across these stages, which helps LLMs learn knowledge in an easy-to-hard manner.


    Do Emergent Abilities Exist in Quantized Large Language Models: An Empirical Study


    Peiyu Liu, Zikang Liu, Ze-Feng Gao, Dawei Gao, Wayne Xin Zhao#, Yaliang Li, Bolin Ding, Ji-Rong Wen
    Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (COLING 2024), 2024
    paper / arxiv / code / link /

    This work aims to investigate the impact of quantization on emergent abilities, which are important characteristics that distinguish LLMs from small language models. Specifically, we examine the abilities of in-context learning, chain-of-thought reasoning, and instruction-following in quantized LLMs.


    Discovering symbolic expressions with parallelized tree search


    Kai Ruan, Ze-Feng Gao, Yike Guo, Hao Sun#, Ji-Rong Wen, Yang Liu
    arXiv, 2024
    paper / arxiv /

    In this paper, we introduce a parallelized tree search (PTS) model to efficiently distill generic mathematical expressions from limited data. Through a series of extensive experiments, we demonstrate the superior accuracy and efficiency of PTS for equation discovery, which greatly outperforms the state-of-the-art baseline models on over 80 synthetic and experimental datasets (e.g., lifting its performance by up to 99% accuracy improvement and one-order of magnitude speed up).


    Sequential Recommendation Model Based on Matrix Product Operator Representation


    Peiyu Liu, Bo-Wen Yao, Ze-Feng Gao#, Wayne Xin Zhao#
    Journal of Shandong University (Natural Science), 2024
    paper / link /

    Sequential recommendation in recommender systems faces the challenges of high complexity and great diversity. Pre-training and fine-tuning are widely adopted for item representation learning on sequential data, yet existing methods usually ignore the under-fitting and over-fitting problems that fine-tuning may encounter in a new domain. To address this, we construct a neural network architecture based on the matrix product operator (MPO) representation and implement two flexible fine-tuning strategies.


    Bipolarized Weyl semimetals and quantum crystal valley Hall effect in two-dimensional altermagnetic materials


    Chao-Yang Tan, Ze-Feng Gao, Huan-Cheng Yang, Kai Liu, Peng-Jie Guo#, Zhong-Yi Lu#
    arXiv, 2024
    paper / arxiv /

    In this paper, we predict four ideal two-dimensional type-I altermagnetic bipolarized Weyl semimetals: Fe2WTe4 and Fe2MoZ4 (Z = S, Se, Te). More significantly, we introduce the quantum crystal valley Hall effect, a phenomenon achievable in three of these materials, namely Fe2WTe4, Fe2MoS4, and Fe2MoTe4, when spin-orbit coupling is considered. Furthermore, these materials have the potential to transition from a quantum crystal valley Hall phase to a Chern insulator phase under strain.


    Extremely strong spin-orbit coupling effect in light element altermagnetic materials


    Shuai Qu, Ze-Feng Gao, Hao Sun, Kai Liu, Peng-Jie Guo#, Zhong-Yi Lu#
    arXiv, 2024
    paper / arxiv /

    In this paper, we demonstrate that a strong spin-orbit coupling effect can be realized in light-element altermagnetic materials, and we propose a mechanism for realizing the corresponding effective spin-orbit coupling. This mechanism reveals the cooperative effect of crystal symmetry, electron occupation, electronegativity, electron correlation, and intrinsic spin-orbit coupling. Our work not only advances the understanding of light-element compounds with a strong spin-orbit coupling effect but also provides an alternative route to realizing light-element compounds with an effective strong spin-orbit coupling.


    Unlocking Data-free Low-bit Quantization with Matrix Decomposition for KV Cache Compression


    Peiyu Liu, Ze-Feng Gao, Wayne Xin Zhao#, Yipeng Ma, Tao Wang, Ji-Rong Wen
    Annual Meeting of the Association for Computational Linguistics (ACL2024), 2024
    paper / arxiv / link /

    In this paper, we introduce DecoQuant, a novel data-free low-bit quantization technique based on tensor decomposition methods, to effectively compress the KV cache. Our core idea is to adjust the outlier distribution of the original matrix by performing tensor decomposition, so that the quantization difficulties are migrated from the matrix to the decomposed local tensors. Specifically, we find that outliers mainly concentrate on small local tensors, while large tensors tend to have a narrower value range.
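    The decompose-then-quantize idea can be sketched in a few lines. The snippet below is illustrative only: it stands in a plain truncated SVD for the paper's tensor decomposition and a simple uniform affine quantizer for the low-bit scheme; `quantize` and `decompose_then_quantize` are hypothetical names, not DecoQuant's API.

```python
import numpy as np

def quantize(x, bits=4):
    """Uniform affine quantization: snap x onto 2**bits evenly spaced levels,
    then return the dequantized values (round-trip error <= scale / 2)."""
    lo, hi = x.min(), x.max()
    scale = (hi - lo) / (2 ** bits - 1)
    q = np.round((x - lo) / scale)
    return q * scale + lo

def decompose_then_quantize(W, rank, bits=4):
    """Quantize low-rank SVD factors instead of W itself, so a few outlier
    entries in W no longer stretch one global quantization grid."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]   # (m, rank) factor absorbs singular values
    B = Vt[:rank]                # (rank, n) factor
    return quantize(A, bits) @ quantize(B, bits)
```

    Each factor gets its own quantization grid, which is the point of moving quantization from the full matrix to the decomposed local tensors.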


    Enhancing Parameter-efficient Fine-tuning with Simple Calibration Based on Stable Rank


    Peiyu Liu, Ze-Feng Gao, Xiao Zhang, Wayne Xin Zhao#, Ji-Rong Wen
    Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), 2024
    paper / code / link /

    In this paper, we propose both theoretical analyses and experimental verification of the proposed calibration strategy. For efficiency, we further propose time-aware and structure-aware strategies to determine the most crucial time to commence the fine-tuning procedure and to selectively apply parameter matrices for lightweight fine-tuning, respectively.




    2023


    Altermagnetic ferroelectric LiFe2F6 and spin-triplet excitonic insulator phase


    Peng-Jie Guo, Yuhao Gu, Ze-Feng Gao, Zhong-Yi Lu#
    arXiv, 2023
    paper / arxiv /

    In this paper, we predict that LiFe2F6 is a d-wave altermagnetic and charge-ordering-mediated ferroelectric material. Moreover, the LiFe2F6 transforms into a ferrimagnetic and ferroelectric phase with strong magnetoelectric coupling under biaxial compressive strain. Interestingly, the spins of the valence band and the conduction band are opposite in ferrimagnetic LiFe2F6, which facilitates a simultaneous spin-triplet excitonic insulator phase.


    Compression Image Dataset Based on Multiple Matrix Product States


    Ze-Feng Gao*, Peiyu Liu*, Wayne Xin Zhao#, Zhi-Yuan Xie, Ji-Rong Wen and Zhong-Yi Lu
    Future of Information and Communication Conference (FICC2024), 2023
    paper / link /

    In this paper, we present an effective dataset compression approach based on matrix product states (MPS for short) and knowledge distillation. MPS can decompose image samples into a sequential product of tensors, achieving task-agnostic image compression by preserving the low-rank information of the images. Based on this property, we use multiple MPS to represent the image dataset samples. Meanwhile, we also design a task-related component based on knowledge distillation to enhance the generality of the compressed dataset.


    Small Pre-trained Language Models Can be Fine-tuned as Large Models via Over-Parameterization


    Ze-Feng Gao, Kun Zhou, Peiyu Liu, Wayne Xin Zhao#, Ji-Rong Wen
    Annual Meeting of the Association for Computational Linguistics (ACL2023), Oral (Nominated for Best Paper Award), 2023
    paper / code / link /

    In this paper, we focus on scaling up the parameters of PLMs only during fine-tuning, to benefit from over-parameterization without increasing the inference latency. Extensive experiments have demonstrated that our approach can significantly boost the fine-tuning performance of small PLMs and even help small PLMs outperform larger ones with 3x the parameters.


    Enhancing Scalability of Pre-trained Language Models via Efficient Parameter Sharing


    Peiyu Liu*, Ze-Feng Gao*, Yushuo Chen, Wayne Xin Zhao#, Ji-Rong Wen
    Association for Computational Linguistics: EMNLP 2023, 2023
    paper / arxiv / code / link /

    In this paper, we propose a parameter-efficient pre-training approach that utilizes matrix decomposition and parameter-sharing strategies to scale PLMs. Extensive experiments have demonstrated the effectiveness of the proposed model in reducing the model size while achieving highly competitive performance (e.g., with fewer parameters than BERT-base, we successfully scale the model depth by a factor of 4x and even score 0.1 points higher than BERT-large on GLUE).




    2022


    Parameter-Efficient Mixture-of-Experts Architecture for Pre-trained Language Models


    Ze-Feng Gao*, Peiyu Liu*, Wayne Xin Zhao#, Zhong-Yi Lu, Ji-Rong Wen
    International Conference on Computational Linguistics (COLING2022), Oral Presentation, 2022
    paper / arxiv / code /

    In this paper, we reduce the parameters of the original MoE architecture by sharing a global central tensor across experts while keeping expert-specific auxiliary tensors. We also design a gradient-mask strategy for the tensor structure of the MPO to alleviate the overfitting problem.
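    The sharing scheme can be pictured with a toy factorization in which every expert's weight matrix is rebuilt from one globally shared central factor and two small expert-specific factors. This is a sketch of the parameter-counting argument only — the actual model shares an MPO central tensor rather than a plain matrix, and all names below are illustrative.

```python
import numpy as np

class SharedCoreMoE:
    """Toy MoE layer: each expert's weight is A_e @ C @ B_e, with the
    central factor C shared by all experts and only the small auxiliary
    factors A_e, B_e stored per expert."""

    def __init__(self, num_experts, dim, core_rank, rng):
        self.C = rng.standard_normal((core_rank, core_rank))  # shared
        self.A = [rng.standard_normal((dim, core_rank)) for _ in range(num_experts)]
        self.B = [rng.standard_normal((core_rank, dim)) for _ in range(num_experts)]

    def expert_weight(self, e):
        # Rebuild the dense (dim, dim) weight of expert e on demand.
        return self.A[e] @ self.C @ self.B[e]

    def num_params(self):
        return self.C.size + sum(a.size for a in self.A) + sum(b.size for b in self.B)
```

    With 8 experts, dim = 64, and core_rank = 16, the factored layer stores 16640 numbers versus 32768 for eight independent dense experts; the gap widens as the number of experts grows, since the central factor is paid for once.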


    A Numerical Study of Optimizing FermiNet with the Self-Attention Mechanism


    Jia-Qi Wang, Ze-Feng Gao#, Yong-Feng Li, Lu Wang#
    Journal of Tianjin Normal University (Natural Science Edition), 2022
    paper / arxiv /

    To explore how the ground-state properties of many-electron systems can be studied without assuming a specific form of trial state, we solve many-electron systems with neural networks, taking small molecules of up to about ten atoms as examples. We further improve FermiNet with a Transformer structure that incorporates the self-attention mechanism; the results show that Transformer-FermiNet matches the accuracy of the original FermiNet while reducing the number of network parameters to three quarters of the original.




    2021


    Enabling Lightweight Fine-tuning for Pre-trained Language Model Compression based on Matrix Product Operators


    Peiyu Liu*, Ze-Feng Gao*, Wayne Xin Zhao#, Zhi-Yuan Xie, Zhong-Yi Lu#, Ji-Rong Wen
    Annual Meeting of the Association for Computational Linguistics (ACL2021), Poster, 2021
    paper / arxiv / code / slides / link /

    This paper presents a novel pre-trained language model (PLM) compression approach based on the matrix product operator (MPO for short) from quantum many-body physics.


    Compressing LSTM Networks by Matrix Product Operators


    Ze-Feng Gao*, Xingwei Sun*, Lan Gao, Junfeng Li#, Zhong-Yi Lu#
    arXiv, 2020
    paper / arxiv /

    We propose an alternative LSTM model that significantly reduces the number of parameters by representing the weight matrices with matrix product operators (MPO), which are used in physics to characterize the local correlations in quantum states.




    2020


    A Model Compression Method With Matrix Product Operators for Speech Enhancement


    Xingwei Sun*, Ze-Feng Gao*, Zhong-Yi Lu#, Junfeng Li#, Yonghong Yan
    IEEE/ACM Transactions on Audio, Speech, and Language Processing 28, 2837-2847, 2020
    paper / arxiv / link /

    In this paper, we propose a model compression method based on matrix product operators (MPO) to substantially reduce the number of parameters in DNN models for speech enhancement.


    Compressing deep neural networks by matrix product operators


    Ze-Feng Gao*, Song Cheng*, Rong-Qiang He, Zhi-Yuan Xie#, Hui-Hai Zhao#, Zhong-Yi Lu#, Tao Xiang#
    Physical Review Research 2 (2), 023300, 2020
    paper / arxiv / code / link /

    In this paper, we show that deep neural networks can be effectively compressed by representing their linear transformations with matrix product operators (MPOs), a tensor network originally proposed in physics to characterize the short-range entanglement in one-dimensional quantum states.
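    The MPO construction used throughout these papers can be illustrated in a few lines of numpy: reshape a weight matrix into a higher-order tensor and split it into a chain of small cores by sequential truncated SVDs. This is a minimal sketch of the idea, not the authors' released implementation; the dimensions, the index interleaving, and the single `max_bond` truncation rule are choices made here for clarity.

```python
import numpy as np

def mpo_decompose(W, in_dims, out_dims, max_bond):
    """Factor a weight matrix W of shape (prod(in_dims), prod(out_dims))
    into a chain of 4-index MPO cores via sequential truncated SVDs."""
    n = len(in_dims)
    # Reshape to a 2n-index tensor and interleave (in_k, out_k) index pairs,
    # so that core k carries exactly one input and one output index.
    T = W.reshape(*in_dims, *out_dims)
    T = T.transpose([ax for pair in zip(range(n), range(n, 2 * n)) for ax in pair])
    cores, bond = [], 1
    for k in range(n - 1):
        M = T.reshape(bond * in_dims[k] * out_dims[k], -1)
        U, S, Vt = np.linalg.svd(M, full_matrices=False)
        r = min(max_bond, len(S))  # truncate the bond dimension
        cores.append(U[:, :r].reshape(bond, in_dims[k], out_dims[k], r))
        T = S[:r, None] * Vt[:r]
        bond = r
    cores.append(T.reshape(bond, in_dims[-1], out_dims[-1], 1))
    return cores

def mpo_to_matrix(cores, in_dims, out_dims):
    """Contract the MPO cores back into a dense matrix."""
    n = len(in_dims)
    T = cores[0]
    for c in cores[1:]:
        T = np.tensordot(T, c, axes=([-1], [0]))
    T = T.reshape([d for pair in zip(in_dims, out_dims) for d in pair])
    # Un-interleave: gather all input indices first, then all output indices.
    T = T.transpose(list(range(0, 2 * n, 2)) + list(range(1, 2 * n, 2)))
    return T.reshape(int(np.prod(in_dims)), int(np.prod(out_dims)))
```

    With the full bond dimension the contraction reproduces W exactly; truncating `max_bond` trades accuracy for a smaller total core size, which is the parameter-counting argument behind the compression results above.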

    * Equal contribution           # Corresponding author




    Professional Service

  • Program Committee Member: WSDM, CCIR
  • Area Chair: ACL
  • Reviewer: WSDM, ACCV, COLING, ACL, EMNLP



Honors and Awards

  • ACL 2023 Best Paper nomination, 2023.05
  • AI4SCup-OpenLAM Challenge (full periodic table crystal-structure collection competition), First Prize, AI for Science Institute, Beijing, 2024.12
  • 2024 Beijing Universities Science and Technology Association Alliance Innovation Scenario Challenge, Top 10, Beijing Universities Science and Technology Association Alliance, 2024.12