Publications
“*” indicates equal contribution.
Preprints
Marco-MoE: Open Multilingual Mixture-of-Expert Language Models with Efficient Upcycling [PDF]
Fan Jiang, Yu Zhao, Chenyang Lyu, Tianqi Shi, Yichao Du, Feihu Jiang, Longyue Wang, Weihua Luo
CulturALL: Benchmarking Multilingual and Multicultural Competence of LLMs on Grounded Tasks [PDF]
Peiqin Lin, Chenyang Lyu, Wenjiang Luo, Haotian Ye, Md Mehrab Hossain, Chunlan Ma, Shaoxiong Ji, Younes Samih, Bo Zeng, Fan Jiang, Yuanbin Cao, Dilda Duisenbek, Adrian Neo Sau Xun, Daria Pozdniakova, Liubou Misevich, Nevena Marinković, Ngoc Gia Linh Nguyen, Thi Khanh Linh Do, Sarakmatak Sophy, Baotian Hu, Guanhua Chen, Gongbo Tang, Alham Fikri Aji, Longyue Wang, Weihua Luo
Difficulty-Estimated Policy Optimization [PDF]
Yu Zhao, Fan Jiang*, Tianle Liu, Bo Zeng, Yu Liu, Longyue Wang, Weihua Luo
2026
Tokenizer-Aware Cross-Lingual Adaptation of Decoder-Only LLMs through Embedding Relearning and Swapping [PDF]
Fan Jiang, Honglin Yu, Grace Chung, Trevor Cohn. In Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics. EACL’26
2025
Few-Shot Multilingual Open-Domain QA from 5 Examples [PDF]
Fan Jiang, Tom Drummond and Trevor Cohn. Transactions of the Association for Computational Linguistics. TACL’25
2024
Language Bias in Multilingual Information Retrieval: The Nature of the Beast and Mitigation Methods [PDF]
Jinrui Yang, Fan Jiang and Timothy Baldwin. In Proceedings of the Fourth Workshop on Multilingual Representation Learning. MRL’24
Pre-training Cross-lingual Open Domain Question Answering with Large-scale Synthetic Supervision [PDF] [Code]
Fan Jiang, Tom Drummond and Trevor Cohn. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing. EMNLP’24
2023
Boot and Switch: Alternating Distillation for Zero-Shot Dense Retrieval [PDF] [Code]
Fan Jiang, Qiongkai Xu, Tom Drummond and Trevor Cohn. In Findings of the Association for Computational Linguistics: EMNLP 2023. EMNLP-Findings’23
Noisy Self-Training with Synthetic Queries for Dense Retrieval [PDF] [Code]
Fan Jiang, Tom Drummond and Trevor Cohn. In Findings of the Association for Computational Linguistics: EMNLP 2023. EMNLP-Findings’23
Don’t Mess with Mister-in-Between: Improved Negative Search for Knowledge Graph Completion [PDF] [Code]
Fan Jiang, Tom Drummond and Trevor Cohn. In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics. EACL’23
2022
Incorporating Constituent Syntax for Coreference Resolution [PDF] [Code]
Fan Jiang and Trevor Cohn. In Proceedings of the 36th AAAI Conference on Artificial Intelligence. AAAI’22
2021
Incorporating Syntax and Semantics in Coreference Resolution with Heterogeneous Graph Attention Network [PDF] [Code]
Fan Jiang and Trevor Cohn. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. NAACL’21 (short)
