Yaru Hao (郝雅茹)

Researcher @ Microsoft Research

Email / Google Scholar / GitHub / LinkedIn

I am currently a researcher in the General Artificial Intelligence (GenAI) group at Microsoft Research Asia (MSRA). Before that, I received my bachelor's and master's degrees in computer science from Beihang University in 2019 and 2022, advised by Dr. Ke Xu. During my master's studies, I spent most of my time as a research intern at MSRA, mentored by Dr. Li Dong and Dr. Furu Wei. I am interested in developing simple yet effective methods to enhance foundation models and in exploring the science behind them.

Publications

Data Selection via Optimal Control for Language Models

Yuxian Gu, Li Dong, Hongning Wang, Yaru Hao, Qingxiu Dong, Furu Wei, Minlie Huang.

International Conference on Learning Representations (ICLR), Oral, 2025.

Kosmos-2: Grounding Multimodal Large Language Models to the World

Zhiliang Peng*, Wenhui Wang*, Li Dong*, Yaru Hao, Shaohan Huang, Shuming Ma, Furu Wei.

International Conference on Learning Representations (ICLR), 2024. [code]

Optimizing Prompts for Text-to-Image Generation

Yaru Hao*, Zewen Chi*, Li Dong, Furu Wei.

Neural Information Processing Systems (NeurIPS), Spotlight, 2023. [code]

Language Is Not All You Need: Aligning Perception with Language Models

Shaohan Huang*, Li Dong*, Wenhui Wang*, Yaru Hao*, Saksham Singhal*, Shuming Ma*, Tengchao Lv, Lei Cui, Owais Khan Mohammed, Barun Patra, Qiang Liu, Kriti Aggarwal, Zewen Chi#, Johan Bjorck, Vishrav Chaudhary, Subhojit Som, Xia Song, Furu Wei.

Neural Information Processing Systems (NeurIPS), 2023.

Prototypical Calibration for Few-shot Learning of Language Models

Zhixiong Han, Yaru Hao, Li Dong, Yutao Sun, Furu Wei.

International Conference on Learning Representations (ICLR), 2023.

Prototypical Fine-Tuning: Towards Robust Performance under Varying Data Sizes

Yiqiao Jin, Xiting Wang, Yaru Hao, Yizhou Sun, Xing Xie.

AAAI Conference on Artificial Intelligence (AAAI), 2023.

Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers

Damai Dai, Yutao Sun, Li Dong, Yaru Hao, Shuming Ma, Zhifang Sui, Furu Wei.

Findings of the Association for Computational Linguistics (Findings of ACL), 2023.

Large Language Model for Science: A Study on P vs. NP

Qingxiu Dong*, Li Dong*, Ke Xu*, Guangyan Zhou, Yaru Hao, Zhifang Sui, Furu Wei.

arXiv preprint arXiv:2309.05689, 2023.

Structured Prompting: Scaling In-Context Learning to 1,000 Examples

Yaru Hao*, Yutao Sun*, Li Dong, Zhixiong Han, Yuxian Gu, Furu Wei.

arXiv preprint arXiv:2212.06713, 2022. [code]

Language Models are General-Purpose Interfaces

Yaru Hao*, Haoyu Song*, Li Dong*, Shaohan Huang, Zewen Chi, Wenhui Wang, Shuming Ma, Furu Wei.

arXiv preprint arXiv:2206.06336, 2022.

Knowledge Neurons in Pretrained Transformers

Damai Dai, Li Dong, Yaru Hao, Zhifang Sui, Furu Wei.

Association for Computational Linguistics (ACL), 2022.

Self-Attention Attribution: Interpreting Information Interactions Inside Transformer

Yaru Hao, Li Dong, Furu Wei, Ke Xu.

Best Paper Runner-Up

AAAI Conference on Artificial Intelligence (AAAI), 2021. [code]

Learning to Sample Replacements for ELECTRA Pre-Training

Yaru Hao, Li Dong, Hangbo Bao, Ke Xu, Furu Wei.

Findings of the Association for Computational Linguistics (Findings of ACL), 2021.

Investigating Learning Dynamics of BERT Fine-Tuning

Yaru Hao, Li Dong, Hangbo Bao, Ke Xu, Furu Wei.

Asia-Pacific Chapter of the Association for Computational Linguistics (AACL), 2020.

Visualizing and Understanding the Effectiveness of BERT

Yaru Hao, Li Dong, Furu Wei, Ke Xu.

Empirical Methods in Natural Language Processing (EMNLP), 2019.

Education