Zihao Fu (付子豪)
I am a Postdoc Researcher at the Oxford Internet Institute, University of Oxford, collaborating with Prof. Chris Russell, Prof. Brent Mittelstadt, and Prof. Sandra Wachter. I also serve as an Affiliated Lecturer at the Language Technology Lab (LTL) at the University of Cambridge. Previously, I was a Research Associate at LTL, University of Cambridge, working with Prof. Nigel Collier. My research focuses mainly on Natural Language Processing, Large Language Models, Text Generation, Machine Learning, Biomedical Applications, and LLM Policy. Before coming to Cambridge, I received my Ph.D. degree from The Chinese University of Hong Kong under the supervision of Prof. Wai Lam. I have also been a visiting student at the NLP Lab of Tsinghua University, working with Prof. Zhiyuan Liu. Before starting my Ph.D., I spent three years developing large-scale distributed parallel algorithms for the PAI platform at Alibaba Cloud.
Education & Research
2024–Now PostDoc University of Oxford, Oxford Internet Institute
2021–2024 PostDoc University of Cambridge, Language Technology Lab, Advisor: Prof. Nigel Collier
2017–2021 Ph.D. The Chinese University of Hong Kong, Department of Systems Engineering and Engineering Management, Advisor: Prof. Wai Lam
2012–2015 M. E. Beihang University, National Laboratory for Aeronautics and Astronautics, Advisor: Prof. Guanghong Gong
2008–2012 B. E. Beihang University, School of Automation Science and Electrical Engineering
Work
2015–2017 Machine Learning Algorithm Engineer IDST, Alibaba Cloud
Publications
Eoin Delaney, Zihao Fu, Sandra Wachter, Brent Mittelstadt, Chris Russell. OxonFair: A Flexible Toolkit for Algorithmic Fairness. (PrePrint 2024).
Zihao Fu, Meiru Zhang, Zaiqiao Meng, Yannan Shen, David Buckeridge, Nigel Collier. BAND: Biomedical Alert News Dataset. The 38th AAAI Conference on Artificial Intelligence (AAAI 2024).
Zihao Fu, Anthony Man-Cho So, Nigel Collier. A Stability Analysis of Fine-Tuning a Pre-Trained Model. PrePrint
Zihao Fu, Wai Lam, Qian Yu, Anthony Man-Cho So, Shengding Hu, Zhiyuan Liu, Nigel Collier. Decoder-Only or Encoder-Decoder? Interpreting Language Model as a Regularized Encoder-Decoder. PrePrint
Zihao Fu, Yixuan Su, Zaiqiao Meng, Nigel Collier. Biomedical Named Entity Recognition via Dictionary-based Synonym Generalization. The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023).
Zihao Fu, Haoran Yang, Anthony Man-Cho So, Wai Lam, Lidong Bing, Nigel Collier. On the Effectiveness of Parameter-Efficient Fine-Tuning. The 37th AAAI Conference on Artificial Intelligence (AAAI 2023).
Zihao Fu. Open Domain Text Generation. (PhD Thesis, The Chinese University of Hong Kong, 2021).
Zihao Fu, Wai Lam, Anthony Man-Cho So, Bei Shi. A Theoretical Analysis of the Repetition Problem in Text Generation. The 35th AAAI Conference on Artificial Intelligence (AAAI 2021).
Zihao Fu, Lidong Bing, Wai Lam. Open Domain Event Text Generation. The 34th AAAI Conference on Artificial Intelligence (AAAI 2020).
Zihao Fu, Bei Shi, Wai Lam, Lidong Bing, Zhiyuan Liu. Partially-Aligned Data-to-Text Generation with Distant Supervision. The 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020).
Zihao Fu, Lidong Bing, Wai Lam, Shoaib Jameel. Dynamic Topic Tracker for KB-to-Text Generation. The 28th International Conference on Computational Linguistics (COLING 2020).
Zihao Fu, Bei Shi, Lidong Bing, Wai Lam. Unsupervised KB-to-Text Generation with Auxiliary Triple Extraction using Dual Learning. The 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing (AACL-IJCNLP 2020).
Zihao Fu, Yankai Lin, Zhiyuan Liu, Wai Lam. Fact Discovery from Knowledge Base via Facet Decomposition. The 2019 Conference of the North American Chapter of the Association for Computational Linguistics - Human Language Technologies (NAACL-HLT 2019).
Shoaib Jameel, Zihao Fu, Bei Shi, Wai Lam, Steven Schockaert. Word Embedding as Maximum A Posteriori Estimation. The 33rd AAAI Conference on Artificial Intelligence (AAAI 2019).
Bei Shi, Zihao Fu, Lidong Bing, Wai Lam. Learning Domain-Sensitive and Sentiment-Aware Word Embeddings. The 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018).
Tszhang Guo, Bowen Li, Zihao Fu, Tao Wan, Zengchang Qin. Learning Sentimental Weights of Mixed-gram Terms for Classification and Visualization. Pacific Rim International Conference on Artificial Intelligence (PRICAI 2016).
Yanan Zhou, Zihao Fu, Guanghong Gong. Pilot Behavior Modeling Using LSTM Network: A Case Study. Asian Simulation Conference (ASC 2016).
Zihao Fu, Guanghong Gong. Explicit moment integration algorithm and its application. Journal of Beijing University of Aeronautics and Astronautics (JBUAA 2015).
Zihao Fu. Research on the Optimization Methods of the Blended-Wing-Body Aircraft. (Master Thesis 2015).
Invited Talks
Introduction to Large Language Model: Technology, Challenges, and Prospects. CSSA Cambridge. 2023. [Slide]
Language Model for Science. Cambridge Centre for Data-Driven Discovery. 2023.
Service
Reviewer: AAAI, ACL, EACL, EMNLP, ICASSP, ICDM, ICLR, ICML, NeurIPS, TKDE
External Reviewer: AAAI, CIKM, COLING, EMNLP, ICDM, IJCAI, SIGIR, TACL
Teaching
2022–2023 Li18, Computational Linguistics, Guest Lecturer, University of Cambridge
2021–2022 FTEC 5510, Advanced Financial Infrastructure, Teaching Assistant, CUHK
2020–2021 FTEC 5530, Quantitative and Algorithmic Trading, Teaching Assistant, CUHK
2020–2021 ENGG 2780B, Statistics for Engineers, Teaching Assistant, CUHK
2020–2020 SEEM 4610, Supply Chain Management, Teaching Assistant, CUHK
2017–2020 SEEM 3460, Computer Processing System Concepts, Teaching Assistant, CUHK
2017–2019 SEEM 4540, Open Systems for E-Commerce, Teaching Assistant, CUHK
Course Notes
2019 IERG 6300 : Probability Theory. By Prof. Chandra Nair [My Notes]
2019 SEEM 5380 : Optimization Methods for High-Dimensional Statistics. By Prof. Anthony Man-Cho So [My Notes]
2018 IERG 5130 : Probabilistic Models and Inference Algorithms for Machine Learning. By Prof. Dahua Lin [My Notes]
2018 ENGG 5781 : Matrix Analysis and Computations. By Prof. Wing-Kin (Ken) Ma [My Notes]
2017 ENGG 5501 : Foundations of Optimization. By Prof. Anthony Man-Cho So [My Notes]
Projects
BioCaster The BioCaster website at the University of Cambridge presents structured information about disease outbreaks, with links to Internet news and related knowledge resources, for people interested in public health and safety.
PaperArxiv A paper management tool that helps to organize the mind. It focuses on taking notes, organizing, and archiving papers.
unillm A unified large language model interface for ChatGPT, LLaMA(2, 3), Mistral, Claude, RAG, CommandRPlus, etc.
StreamTask A lightweight Python framework for parallelizing computationally intensive pipelines.
CAM-Tool A Cloud Assignment Manager tool that helps manage tasks across different machines.
BitTrader A real-time quantitative cryptocurrency trading, backtesting, and strategy development platform based on Backtrader.
CSTL A C++ STL containers wrapper for Python to partially solve the Copy-on-Write issue.
ChatGPTHelper A plugin that automatically saves ChatGPT history and predefined prompts to local disk. This allows us to disable the official history feature, preventing our data from being used in training processes.