Jing ZHANG

Professor
Department of Computer Science and Technology
School of Information
Renmin University of China

Email: zhang-jing AT ruc DOT edu DOT cn

Address: Room 427, Information Building, Renmin University of China, Beijing


Publications | Students | Talks | Services | Teaching


I am a professor at the School of Information, Renmin University of China. Prior to that, I received my Ph.D. degree from the Department of Computer Science and Technology, Tsinghua University, under the supervision of Professor Jie Tang and Professor Juanzi Li. My research focuses on data mining and knowledge discovery, with an emphasis on tailoring large language models (LLMs) for structured data processing to advance their application in data science. Specifically, I investigate model alignment methods, including data synthesis and learning from human-AI feedback, to enhance LLMs' capabilities in querying, manipulating, and analyzing structured data stored in databases, spreadsheets, and APIs. Additionally, I explore model compression techniques and efficient inference methods to improve the deployment of these models. More information about my experience and research can be found on Google Scholar.

I am looking for highly motivated students to work with me. If you are interested, please send me an email.


Publications

    1. SpreadsheetBench: Towards Challenging Real World Spreadsheet Manipulation.
      Zeyao Ma, Bohan Zhang, Jing Zhang*, Jifan Yu, Xiaokang Zhang, Xiaohan Zhang, Sijia Luo, Xi Wang, Jie Tang
      NeurIPS'24.
      code&data
    2. PowerPM: Foundation Model for Power Systems.
      Shihao Tu, Yupeng Zhang, Jing Zhang, Zhendong Fu, Yin Zhang, and Yang Yang.
      NeurIPS'24.
    3. PCQPR: Proactive Conversational Question Planning with Reflection.
      Shasha Guo, Lizi Liao, Jing Zhang, Cuiping Li, Hong Chen
      EMNLP'24.
    4. Authorship style transfer with inverse transfer data augmentation.
      Zhonghui Shao, Jing Zhang*, Haoyang Li, Xinmei Huang, Chao Zhou, Yuanchun Wang, Jibing Gong, Cuiping Li, Hong Chen
      AI Open'24.
    5. R-Eval: A Unified Toolkit for Evaluating Domain Knowledge of Retrieval Augmented Large Language Models.
      Shangqing Tu, Yuanchun Wang, Jifan Yu, Yuyang Xie, Yaran Shi, Xiaozhi Wang, Jing Zhang*, Lei Hou, Juanzi Li
      KDD'24.
    6. Transferable and Efficient Non-Factual Content Detection via Probe Training with Offline Consistency Checking.
      Xiaokang Zhang, Zijun Yao, Jing Zhang*, Kaifeng Yun, Jifan Yu, Juanzi Li, Jie Tang
      ACL'24.
    7. AlignBench: Benchmarking Chinese Alignment of Large Language Models.
      Xiao Liu, Xuanyu Lei, Shengyuan Wang, Yue Huang, Zhuoer Feng, Bosi Wen, Jiale Cheng, Pei Ke, Yifan Xu, Weng Lam Tam, Xiaohan Zhang, Lichao Sun, Hongning Wang, Jing Zhang, Minlie Huang, Yuxiao Dong, Jie Tang
      ACL'24.
    8. SP^3: Enhancing Structured Pruning via PCA Projection.
      Yuxuan Hu, Jing Zhang*, Zhe Zhao, Chen Zhao, Xiaodong Chen, Cuiping Li, Hong Chen.
      Findings of ACL'24.
    9. LLMTune: Accelerate Database Knob Tuning with Large Language Models.
      Xinmei Huang, Haoyang Li, Jing Zhang*, Xinxin Zhao, Zhiming Yao, Yiyan Li, Zhuohao Yu, Tieying Zhang, Hong Chen, Cuiping Li
      arXiv:2404.11581.
    10. Hidden Question Representations Tell Non-Factuality Within and Across Large Language Models.
      Yanling Wang, Haoyang Li, Hao Zou, Jing Zhang, Xinlei He, Qi Li, Ke Xu
      arXiv:2406.05328.
    11. A Solution-based LLM API-using Methodology for Academic Information Seeking.
      Yuanchun Wang, Jifan Yu, Zijun Yao, Jing Zhang*, Yuyang Xie, Shangqing Tu, Yiyang Fu, Youhe Feng, Jinkai Zhang, Jingyao Zhang, Bowen Huang, Yuanyao Li, Huihui Yuan, Lei Hou, Juanzi Li, Jie Tang
      arXiv:2405.15165.
    12. TableLLM: Enabling Tabular Data Manipulation by LLMs in Real Office Usage Scenarios.
      Xiaokang Zhang, Jing Zhang*, Zeyao Ma, Yang Li, Bohan Zhang, Guanlin Li, Zijun Yao, Kangli Xu, Jinchang Zhou, Daniel Zhang-Li, Jifan Yu, Shu Zhao, Juanzi Li, Jie Tang
      arXiv:2403.19318.
      code&model&data&demo
    13. Compressing Large Language Models by Streamlining the Unimportant Layer.
      Xiaodong Chen, Yuxuan Hu, Jing Zhang*
      arXiv:2403.19135.
    14. Large Language Model for Table Processing: A Survey.
      Weizheng Lu, Jiaming Zhang, Jing Zhang*, Yueguo Chen
      arXiv:2402.05121.
    15. CodeS: Towards Building Open-source Language Models for Text-to-SQL.
      Haoyang Li, Jing Zhang*, Hanbing Liu, Ju Fan, Xiaokang Zhang, Jun Zhu, Renjie Wei, Hongyan Pan, Cuiping Li, Hong Chen.
      SIGMOD'24.
      code&model&data
    16. Open-World Semi-Supervised Learning for Node Classification.
      Yanling Wang, Jing Zhang*, Lingxi Zhang, Lixin Liu, Yuxiao Dong, Cuiping Li, Hong Chen, Hongzhi Yin.
      ICDE'24.
    17. A Cause-Effect Look at Alleviating Hallucination of Knowledge-grounded Dialogue Generation.
      Jifan Yu, Xiaohan Zhang, Yifan Xu, Xuanyu Lei, Zijun Yao, Jing Zhang, Lei Hou, Juanzi Li
      LREC-COLING'24.
    18. Diversifying Question Generation over Knowledge Base via External Natural Questions.
      Shasha Guo, Jing Zhang*, Xirui Ke, Cuiping Li, Hong Chen
      LREC-COLING'24.
    19. SGSH: Stimulate Large Language Models with Skeleton Heuristics for Knowledge Base Question Generation.
      Shasha Guo, Lizi Liao, Jing Zhang*, Yanling Wang, Cuiping Li, Hong Chen
      Findings of NAACL'24.
    20. A Generation-based Deductive Method for Math Word Problems.
      Yuxuan Hu, Jing Zhang*, Haoyang Li, Cuiping Li, Hong Chen.
      EMNLP'23.
    21. FFAEval: Evaluating Dialogue System via Free-For-All Ranking.
      Zeyao Ma, Zijun Yao, Jing Zhang*, Jifan Yu, Xiaohan Zhang, Juanzi Li, Jie Tang.
      Findings of EMNLP'23.
    22. GLM-Dialog: Noise-tolerant Pre-training for Knowledge-grounded Dialogue Generation.
      Jing Zhang, Xiaokang Zhang, Daniel Zhang-Li, Jifan Yu, Zijun Yao, Zeyao Ma, Yiqi Xu, Haohua Wang, Xiaohan Zhang, Nianyi Lin, Sunrui Lu, Juanzi Li, Jie Tang.
      KDD'23 (ADS).
    23. Web-Scale Academic Name Disambiguation: the WhoIsWho Benchmark, Leaderboard, and Toolkit.
      Bo Chen, Jing Zhang*, Fanjin Zhang, Tianyi Han, Yuqing Cheng, Xiaoyan Li, Yuxiao Dong, Jie Tang.
      KDD'23 (ADS).
    24. FC-KBQA: A Fine-to-Coarse Composition Framework for Knowledge Base Question Answering.
      Lingxi Zhang, Jing Zhang*, Yanling Wang, Shulin Cao, Xinmei Huang, Cuiping Li, Hong Chen, Juanzi Li.
      ACL'23.
    25. Chain of Thought Prompting Elicits Knowledge Augmentation.
      Dingjun Wu, Jing Zhang*, Xinmei Huang.
      Findings of ACL'23.
    26. RESDSQL: Decoupling Schema Linking and Skeleton Parsing for Text-to-SQL.
      Haoyang Li, Jing Zhang*, Cuiping Li, Hong Chen.
      AAAI'23.
    27. A survey on complex factual question answering.
      Lingxi Zhang, Jing Zhang*, Xirui Ke, Haoyang Li, Xinmei Huang, Zhonghui Shao, Shulin Cao, Xin Lv.
      AI Open'22.
    28. Graph Contrastive Learning for Anomaly Detection.
      Bo Chen, Jing Zhang*, Xiaokang Zhang, Yuxiao Dong, Jian Song, Peng Zhang, Kaibo Xu, Evgeny Kharlamov, Jie Tang.
      TKDE'22.
    29. DSM: Question Generation over Knowledge Base via Modeling Diverse Subgraphs with Meta-learner.
      Shasha Guo, Jing Zhang*, Yanling Wang, Qianyi Zhang, Cuiping Li and Hong Chen.
      EMNLP'22.
    30. Knowledge-augmented Self-training of A Question Rewriter for Conversational Knowledge Base Question Answering.
      Xirui Ke, Jing Zhang*, Xin Lv, Yiqi Xu, Shulin Cao, Cuiping Li, Hong Chen and Juanzi Li.
      Findings of EMNLP'22 (Findings of the Association for Computational Linguistics: EMNLP 2022).
    31. XDAI: A Tuning-free Framework for Exploiting Pre-trained Language Models in Knowledge Grounded Dialogue Generation.
      Jifan Yu, Xiaohan Zhang, Yifan Xu, Xuanyu Lei, Xinyu Guan, Jing Zhang, Lei Hou, Juanzi Li, and Jie Tang.
      KDD'22.
    32. Subgraph Retrieval Enhanced Model for Multi-hop Knowledge Base Question Answering.
      Jing Zhang, Xiaokang Zhang, Jifan Yu, Jian Tang, Jie Tang, Cuiping Li, Hong Chen.
      ACL'22.
    33. HOSMEL: A Hot-Swappable Modulized Entity Linking Toolkit for Chinese.
      Daniel Zhang-Li, Jing Zhang*, Jifan Yu, Xiaokang Zhang, Peng Zhang, Jie Tang, Juanzi Li.
      ACL'22 (Demo).
    34. ClusterSCL: Cluster-Aware Supervised Contrastive Learning on Graphs.
      Yanling Wang, Jing Zhang*, Haoyang Li, Yuxiao Dong, Hongzhi Yin, Cuiping Li, Hong Chen.
      WWW'22.
    35. CODE: Contrastive Pre-training with Adversarial Fine-tuning for Zero-shot Expert Linking.
      Bo Chen, Jing Zhang*, Xiaokang Zhang, Xiaobin Tang, Lingfan Cai, Hong Chen, Cuiping Li, Peng Zhang, Jie Tang.
      AAAI'22.
    36. A Pretraining Numerical Reasoning Model for Ordinal Constrained Question Answering on Knowledge Base.
      Yu Feng, Jing Zhang*, Gaole He, Wayne Xin Zhao, Lemao Liu, Quan Liu, Cuiping Li, Hong Chen.
      Findings of EMNLP'21 (Findings of the Association for Computational Linguistics: EMNLP 2021).
    37. P-INT: A Path-based Interaction Model for Few-shot Knowledge Graph Completion.
      Jingwen Xu, Jing Zhang*, Xirui Ke, Yuxiao Dong, Hong Chen, Cuiping Li, Yongbin Liu.
      Findings of EMNLP'21 (Findings of the Association for Computational Linguistics: EMNLP 2021).
    38. Neural, symbolic and neural-symbolic reasoning on knowledge graphs.
      Jing Zhang, Bo Chen, Lingxi Zhang, Xirui Ke and Haipeng Ding.
      AI Open'21.
    39. Decoupling Representation Learning and Classification for GNN-based Anomaly Detection.
      Yanling Wang, Jing Zhang*, Hongzhi Yin, Cuiping Li, and Hong Chen.
      SIGIR'21 (Proceedings of the 44th ACM International SIGIR Conference on Research and Development in Information Retrieval).
    40. OAG_know: Self-supervised Learning for Linking Knowledge Graphs.
      Xiao Liu, Li Mian, Yuxiao Dong, Fanjin Zhang, Jing Zhang, Jie Tang, Peng Zhang, Jibing Gong, and Kuansan Wang.
      TKDE'21 (IEEE Transactions on Knowledge and Data Engineering).
    41. GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training.
      Jiezhong Qiu, Qibin Chen, Yuxiao Dong, Jing Zhang, Hongxia Yang, Ming Ding, Kuansan Wang, and Jie Tang.
      KDD'20 (Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining).
    42. BERT-INT: A BERT-based Interaction Model For Knowledge Graph Alignment.
      Xiaobin Tang, Jing Zhang*, Bo Chen, Yang Yang, Hong Chen, Cuiping Li.
      IJCAI'20 (Proceedings of the 29th International Joint Conference on Artificial Intelligence).
    43. CONNA: Addressing Name Disambiguation on The Fly.
      Bo Chen, Jing Zhang*, Jie Tang, Lingfan Cai, Zhaoyu Wang, Shu Zhao, Hong Chen, and Cuiping Li.
      TKDE'20 (IEEE Transactions on Knowledge and Data Engineering).
    44. Robust Network Enhancement from Flawed Networks.
      Jiarong Xu, Yang Yang, Chunping Wang, Zongtao Liu, Jing Zhang, Lei Chen and Jiangang Lu.
      TKDE'20 (IEEE Transactions on Knowledge and Data Engineering).
    45. JarKA: Modeling Attribute Interactions for Cross-lingual Knowledge Alignment.
      Bo Chen, Jing Zhang*, Xiaobin Tang, Hong Chen, Cuiping Li.
      PAKDD'20 (Proceedings of the 24th Pacific-Asia Conference on Knowledge Discovery and Data Mining).
    46. Graph Convolutional Network using a Reliability-based Feature Aggregation Mechanism.
      Yanling Wang, Cuiping Li, Jing Zhang, Peng Ni, Hong Chen.
      DASFAA'20 (Proceedings of the 25th International Conference on Database Systems for Advanced Applications).
    47. Trust Relationship Prediction in Alibaba E-Commerce Platform.
      Yukuo Cen, Jing Zhang*, Gaofei Wang, Yujie Qian, Chuizheng Meng, Zonghong Dai, Hongxia Yang, Jie Tang.
      TKDE'19 (IEEE Transactions on Knowledge and Data Engineering).
    48. MEgo2Vec: Embedding Matched Ego Networks for User Alignment Across Social Networks.
      Jing Zhang, Bo Chen, Xianming Wang, Hong Chen, Cuiping Li, Fengmei Jin, Guojie Song and Yutao Zhang.
      CIKM'18 (Proceedings of the International Conference on Information and Knowledge Management).
    49. Fast and Flexible Top-k Similarity Search on Large Networks.
      Jing Zhang, Jie Tang, Cong Ma, Hanghang Tong, Yu Jing, Juanzi Li, Walter Luyten, and Marie-Francine Moens.
      TOIS'17 (ACM Transactions on Information Systems).
    50. Panther: Fast Top-k Similarity Search on Large Networks.
      Jing Zhang, Jie Tang, Cong Ma, Hanghang Tong, Yu Jing, and Juanzi Li.
      SIGKDD'15 (Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining).
      code&data Slides_PPT Slides_PDF Poster
    51. A Unified Probabilistic Framework for Name Disambiguation in Digital Library.
      Jie Tang, A.C.M. Fong, Bo Wang, and Jing Zhang.
      TKDE'12.

    Students

    I'm very lucky to have the opportunity to work with these brilliant students (ordered by year of enrollment).

    Graduate Students

    1. Zhiming Yao, Ph.D. student, Fall 2024 -
    2. Yang Li, Ph.D. student, Fall 2024 -
    3. Bohan Zhang, M.S. student, Fall 2024 -
    4. Xiaodong Chen, M.S. student, Fall 2024 -
    5. Guanlin Li, M.S. student, Fall 2024 -
    6. Shang Wu, M.S. student, Fall 2024 -
    7. Yuanchun Wang, Ph.D. student, Fall 2023 -
    8. Xinxin Zhao, Ph.D. student, Fall 2023 - (Co-supervised with Prof. Cuiping Li)
    9. Zeyao Ma, M.S. student, Fall 2023 -
    10. Xinmei Huang, M.S. student, Fall 2023 -
    11. Yuxuan Hu, Ph.D. student, Fall 2022 -
    12. Xiaokang Zhang, M.S. student, Fall 2022 -
    13. Zhonghui Shao, M.S. student, Fall 2022 -
    14. Haoyang Li, Ph.D. student, Fall 2021 - (Co-supervised with Prof. Cuiping Li)
    15. Shasha Guo, Ph.D. student, Fall 2020 - (Co-supervised with Prof. Cuiping Li)
    Alumni
    1. Yiqi Xu, M.S., 2022-2024. Thesis Title: Factuality detection method combining inside and outside information of Large Language Models, First Employment: Agricultural Bank of China
    2. Xirui Ke, M.S., 2021-2024. Thesis Title: Knowledge Base Question Answering Based on Pre-trained Language Models, First Employment: JingDong
    3. Lingxi Zhang, M.S., 2021-2024. Thesis Title: Generalization of knowledge base question answering, First Employment: Pursuing a Ph.D. in Computer Science at Rice University
    4. Yanling Wang, Ph.D., 2018-2023. Dissertation Title: Research on Graph Contrastive Learning for Complex Scenes (Co-supervised with Prof. Cuiping Li), First Employment: Zhongguancun Laboratory
    5. Jingwen Xu, M.S., 2020-2023. Thesis Title: Few-shot Knowledge Graph Completion Based on Path Interaction Supervised Learning and Large Language Model Prompt Learning, First Employment: China Construction Bank
    6. Chenxu Hu, M.S., 2020-2022. Thesis Title: Metadata Management under Multi-model Big Data Systems (Co-supervised with Prof. Feng Zhang), First Employment: Haizhixingtu
    7. Xiaobin Tang, M.S., 2019-2022. Thesis Title: Interaction and Self-training based Methods for Entity Alignment, First Employment: China Tobacco
    8. Xiangying Cao, M.S., 2019-2021. Thesis Title: Entity Linking-based Disambiguation of Academic Institution Names, First Employment: People's Bank of China
    9. Lingfan Cai, M.S., 2018-2021. Thesis Title: The optimization of zero-shot entity linking, First Employment: CITIC Securities
    10. Bowen Hao, Ph.D., 2017-2022. Dissertation Title: A Research on User Profile Denoising and Cold-start Issue in Recommender System (Co-supervised with Prof. Cuiping Li), First Employment: Assistant Professor, Capital Normal University
    11. Bo Chen, M.S., 2017-2020. Thesis Title: Knowledge Graph Integration and Entity Linking from Multi-Sources (Co-supervised with Prof. Hong Chen), First Employment: Pursuing a Ph.D. in Computer Science and Technology at Tsinghua University

    Invited Talks

    1. 2024: Invited Talk about StructureDataLLM at WAIC 2024. slides
    2. 2024: Invited Talk about LLM Alignment at YOCSEF 2024. slides
    3. 2024: Invited Talk about LLM4DB at Huawei.
    4. 2023: Invited Talk about StructuredDataLLM at the 11th China National Conference on Social Media Processing.
    5. 2023: Invited Talk about Integrating Structured Data With LLM at School of Computer Science & Technology, Anhui University.
    6. 2023: Invited Talk about Integrating Structured Data With LLM by CCF Database Special Committee.
    7. 2023: Invited Talk about ChatGPT at PingCAP.
    8. 2022: Invited Talk about Graph Contrastive Learning at BAAI Seminar 2022. slides
    9. 2022: Invited Talk about Knowledge Graph Question Answering at WAIC 2022. slides
    10. 2021: Invited Talk about Neural-Symbolic Reasoning on Knowledge Graphs at CCKS 2021. slides
    11. 2021: Invited Talk about Knowledge Graph Question Answering at CNCC 2021. slides
    12. 2021: Invited Talk about Graph Self-supervised Learning at CCAI 2021. slides
    13. 2021: Invited Talk about Neural-Symbolic Reasoning on Knowledge Graphs at BAAI Seminar. slides
    14. 2021: Invited Talk about Graph Self-supervised Learning at School of Computer Science & Technology, HUST.
    15. 2020: Invited Talk at NLPCC 2020 Student Workshop. slides
    16. 2019: Invited Talk at Sino-German International Seminar. slides
    17. 2018: Invited Talk at ACML 2018 Workshop on Machine Learning in Education. slides

    Professional Services

    Journal Editors:

    1. Associate Editor of IEEE TBD, 2023--
    2. Associate Editor of AI Open, 2020--

    Conference PC members:

    1. 2025: KDD, WWW, ARR, ICLR
    2. 2024: KDD, WWW, AAAI, ARR, NeurIPS
    3. 2023: KDD, WWW (SPC), AAAI, ECML/PKDD (SPC)
    4. 2022: KDD, WWW, IJCAI, AAAI, WSDM
    5. 2021: KDD, WWW, IJCAI, ECML/PKDD (SPC)
    6. 2020: KDD, WWW, IJCAI, ECML/PKDD

    Journal Reviewers:

    1. TKDE, IEEE Transactions on Knowledge and Data Engineering
    2. TOIS, ACM Transactions on Information Systems
    3. TPAMI, IEEE Transactions on Pattern Analysis and Machine Intelligence
    4. TKDD, ACM Transactions on Knowledge Discovery from Data
    5. TWEB, ACM Transactions on the Web
    6. TBD, IEEE Transactions on Big Data
    7. JCST, Journal of Computer Science and Technology
    8. JASIST, Journal of the Association for Information Science and Technology
    9. SCIENCE CHINA Information Sciences

    Teaching

    1. 2021- Data Structure and Algorithm
    2. 2020- Deep Learning
