Zihao Fu

University of Oxford



Zihao Fu (付子豪)

I am a Postdoctoral Researcher at the University of Oxford. Previously, I was a Research Associate in the Language Technology Lab at the University of Cambridge, working with Prof. Nigel Collier. My research mainly focuses on Natural Language Processing, Large Language Models, Text Generation, Machine Learning, and Biomedical Applications. Before I came to Cambridge, I received my Ph.D. degree from The Chinese University of Hong Kong under the supervision of Prof. Wai Lam. I was also a visiting student at the NLP Lab of Tsinghua University, working with Prof. Zhiyuan Liu. Before starting my Ph.D. studies, I spent three years developing large-scale distributed parallel algorithms for the PAI platform at Alibaba Cloud.

Education

2024–Now      PostDoc      University of Oxford, Oxford Internet Institute
2021–2024      PostDoc      University of Cambridge, Language Technology Lab, Advisor: Prof. Nigel Collier
2017–2021      Ph.D.          The Chinese University of Hong Kong, Department of Systems Engineering and Engineering Management, Advisor: Prof. Wai Lam
2012–2015      M. E.          Beihang University, National Laboratory for Aeronautics and Astronautics, Advisor: Prof. Guanghong Gong
2008–2012      B. E.           Beihang University, School of Automation Science and Electrical Engineering

Work

2015–2017      Machine Learning Algorithm Engineer          IDST, Alibaba Cloud

Publications

Zihao Fu, Meiru Zhang, Zaiqiao Meng, Yannan Shen, David Buckeridge, Nigel Collier. BAND: Biomedical Alert News Dataset. The 38th AAAI Conference on Artificial Intelligence (AAAI 2024).
Zihao Fu, Anthony Man-Cho So, Nigel Collier. A Stability Analysis of Fine-Tuning a Pre-Trained Model. Preprint.
Zihao Fu, Wai Lam, Qian Yu, Anthony Man-Cho So, Shengding Hu, Zhiyuan Liu, Nigel Collier. Decoder-Only or Encoder-Decoder? Interpreting Language Model as a Regularized Encoder-Decoder. Preprint.
Zihao Fu, Yixuan Su, Zaiqiao Meng, Nigel Collier. Biomedical Named Entity Recognition via Dictionary-based Synonym Generalization. The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023).
Zihao Fu, Haoran Yang, Anthony Man-Cho So, Wai Lam, Lidong Bing, Nigel Collier. On the Effectiveness of Parameter-Efficient Fine-Tuning. The 37th AAAI Conference on Artificial Intelligence (AAAI 2023).
Zihao Fu. Open Domain Text Generation. (PhD Thesis, The Chinese University of Hong Kong, 2021).
Zihao Fu, Wai Lam, Anthony Man-Cho So, Bei Shi. A Theoretical Analysis of the Repetition Problem in Text Generation. The 35th AAAI Conference on Artificial Intelligence (AAAI 2021).
Zihao Fu, Lidong Bing, Wai Lam. Open Domain Event Text Generation. The 34th AAAI Conference on Artificial Intelligence (AAAI 2020).
Zihao Fu, Bei Shi, Wai Lam, Lidong Bing, Zhiyuan Liu. Partially-Aligned Data-to-Text Generation with Distant Supervision. The 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020).
Zihao Fu, Lidong Bing, Wai Lam, Shoaib Jameel. Dynamic Topic Tracker for KB-to-Text Generation. The 28th International Conference on Computational Linguistics (COLING 2020).
Zihao Fu, Bei Shi, Lidong Bing, Wai Lam. Unsupervised KB-to-Text Generation with Auxiliary Triple Extraction using Dual Learning. The 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing (AACL-IJCNLP 2020).
Zihao Fu, Yankai Lin, Zhiyuan Liu, Wai Lam. Fact Discovery from Knowledge Base via Facet Decomposition. The 2019 Conference of the North American Chapter of the Association for Computational Linguistics - Human Language Technologies (NAACL-HLT 2019).
Shoaib Jameel, Zihao Fu, Bei Shi, Wai Lam, Steven Schockaert. Word Embedding as Maximum A Posteriori Estimation. The 33rd AAAI Conference on Artificial Intelligence (AAAI 2019).
Bei Shi, Zihao Fu, Lidong Bing, Wai Lam. Learning Domain-Sensitive and Sentiment-Aware Word Embeddings. The 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018).
Tszhang Guo, Bowen Li, Zihao Fu, Tao Wan, Zengchang Qin. Learning Sentimental Weights of Mixed-gram Terms for Classification and Visualization. Pacific Rim International Conference on Artificial Intelligence (PRICAI 2016).
Yanan Zhou, Zihao Fu, Guanghong Gong. Pilot Behavior Modeling Using LSTM Network: A Case Study. Asian Simulation Conference (ASC 2016).
Zihao Fu, Guanghong Gong. Explicit moment integration algorithm and its application. Journal of Beijing University of Aeronautics and Astronautics (JBUAA 2015).
Zihao Fu. Research on the Optimization Methods of the Blended-Wing-Body Aircraft. (Master Thesis 2015).

Invited Talks

Introduction to Large Language Model: Technology, Challenges, and Prospects. CSSA Cambridge. 2023. [Slide]

Language Model for Science. Cambridge Centre for Data-Driven Discovery. 2023.

Service

Reviewer : EMNLP, ACL, AAAI, TKDE, ICML, EACL, ICASSP, ICDM
External Reviewer : TACL, CIKM, ICDM, EMNLP, COLING, AAAI, IJCAI, SIGIR

Teaching

2022–2023  Li18, Computational Linguistics, Guest Lecturer, University of Cambridge
2021–2022  FTEC 5510, Advanced Financial Infrastructure, Teaching Assistant, CUHK
2020–2021  FTEC 5530, Quantitative and Algorithmic Trading, Teaching Assistant, CUHK
2020–2021  ENGG 2780B, Statistics for Engineers, Teaching Assistant, CUHK
2020–2020  SEEM 4610, Supply Chain Management, Teaching Assistant, CUHK
2017–2020  SEEM 3460, Computer Processing System Concepts, Teaching Assistant, CUHK
2017–2019  SEEM 4540, Open Systems for E-Commerce, Teaching Assistant, CUHK

Awards

Outstanding postgraduate students (2015); Outstanding graduate students (2012); Second Prize of National Undergraduate Electronic Design Contest (2012); Second Prize of National Mathematical Contest in Modeling (2011); Beihang Scholarship (four times)

Course Notes

2019 IERG 6300 : Probability Theory. By Prof. Chandra Nair [My Notes]
2019 SEEM 5380 : Optimization Methods for High-Dimensional Statistics. By Prof. Anthony Man-Cho So [My Notes]
2018 IERG 5130 : Probabilistic Models and Inference Algorithms for Machine Learning. By Prof. Dahua Lin [My Notes]
2018 ENGG 5781 : Matrix Analysis and Computations. By Prof. Wing-Kin (Ken) Ma [My Notes]
2017 ENGG 5501 : Foundations of Optimization. By Prof. Anthony Man-Cho So [My Notes]

Projects

BioCaster  The BioCaster website at the University of Cambridge presents structured information about disease outbreaks, with links to Internet news and related knowledge resources, for people interested in public health and safety.
PaperArxiv  A paper management tool that helps to organize the mind. It focuses on taking notes, organizing, and archiving papers.
unillm  A unified large language model interface for ChatGPT, LLaMA(2, 3), Mistral, Claude, RAG, CommandRPlus, etc.
StreamTask  A lightweight Python parallel framework for parallelizing computationally intensive pipelines.
CAM-Tool  A Cloud Assignment Manager tool that helps manage tasks across different machines.
BitTrader  A real-time quantitative crypto coin trading, backtesting, and strategy development platform based on Backtrader.
CSTL  A C++ STL containers wrapper for Python to partially solve the Copy-on-Write issue.
ChatGPTHelper A plugin that automatically saves ChatGPT history and predefined prompts to local disk. This allows us to disable the official history feature, preventing our data from being used in training processes.