I am a Senior Algorithm Expert at Alibaba Cloud’s Apsara Lab and Tongyi Lab. I received my Ph.D. and B.S. degrees from Zhejiang University in 2018 and 2012, respectively.
My current research focuses on large language models for reasoning and content safety. Prior to this, I contributed to the nationwide implementation of the City Brain Project in multiple cities across China, with a focus on applying artificial intelligence to urban governance to enhance administrative efficiency. I have published over 20 papers in top-tier conferences such as NeurIPS / ICLR / ICML / CVPR / ECCV / ACL / NAACL / KDD / ACM MM and leading journals such as TIP / TCSVT / TMM / TKDE / TOMM.
We are recruiting interns who are interested in large language model reasoning. Please feel free to contact me.
Aug 21, 2025 | Two papers were accepted by EMNLP 2025. |
May 15, 2025 | One paper was accepted by ACL 2025 main conference. |
Jan 23, 2025 | Two papers were accepted by ICLR 2025. |
Jan 23, 2025 | Two papers were accepted by NAACL 2025 with one oral presentation. |
Sep 26, 2024 | One paper was accepted by NeurIPS 2024. |
May 17, 2024 | One paper was accepted by KDD 2024 - Research Track. |
May 02, 2024 | One paper was accepted by ICML 2024. |
(*Corresponding Author, †Project Lead)
-
Don’t Take Things Out of Context: Attention Intervention for Enhancing Chain-of-Thought Reasoning in Large Language Models
Shaotian Yan, Chen Shen*, Wenxiao Wang, Liang Xie, Junjie Liu, and Jieping Ye
In The Thirteenth International Conference on Learning Representations, 2025
Few-shot Chain-of-Thought (CoT) prompting significantly enhances the reasoning capabilities of large language models (LLMs), functioning as a whole to guide these models in generating reasoning steps toward final answers. However, we observe that isolated segments, words, or tokens within CoT demonstrations can unexpectedly disrupt the generation process of LLMs. The model may overly concentrate on certain local information present in the demonstration, introducing irrelevant noise into the reasoning process and potentially leading to incorrect answers. In this paper, we investigate the underlying mechanism of CoT by dynamically tracing and manipulating the inner workings of LLMs at each output step. This analysis demonstrates that tokens exhibiting specific attention characteristics are more likely to induce the model to take things out of context: such tokens attend directly to the hidden states tied to prediction, without substantial integration of non-local information. Building on these insights, we propose a Few-shot Attention Intervention method (FAI) that dynamically analyzes the attention patterns of demonstrations to accurately identify these tokens and then makes targeted adjustments to the attention weights, effectively suppressing their distracting effect on LLMs. Comprehensive experiments across multiple benchmarks demonstrate consistent improvements over baseline methods, including a notable 5.91% gain on the AQuA dataset, further highlighting the effectiveness of FAI.
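The intervention idea above can be sketched in a few lines. This is an illustrative toy, not the paper's actual FAI implementation: the thresholds, the local-window heuristic for "little non-local integration", and the function name are all assumptions made for the sketch.

```python
import numpy as np

def attention_intervention(attn, local_window=2, attend_thresh=0.2,
                           local_mass_thresh=0.8, scale=0.1):
    """Toy sketch of attention intervention (hypothetical, not the real FAI).

    attn: (L, L) lower-triangular row-stochastic matrix; row i is token i's
    attention distribution over tokens 0..i. A demonstration token t is
    flagged when (a) the final query position attends to it strongly and
    (b) t's own attention mass is mostly local, i.e. it integrated little
    non-local information. Flagged tokens' weights in the final row are
    scaled down and the row is renormalized.
    """
    out = attn.copy()
    q = attn.shape[0] - 1  # the prediction (final query) position
    flagged = []
    for t in range(q):
        lo = max(0, t - local_window)
        local_mass = attn[t, lo:t + 1].sum()  # mass on t's local window
        if attn[q, t] > attend_thresh and local_mass > local_mass_thresh:
            flagged.append(t)
    out[q, flagged] *= scale          # suppress distracting tokens
    out[q] /= out[q].sum()            # renormalize the final row
    return out, flagged
```

In a real model this adjustment would be applied inside the attention layers during decoding; the sketch only shows the flag-and-rescale step on a single attention matrix.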
-
Improving Complex Reasoning with Dynamic Prompt Corruption: A Soft Prompt Optimization Approach
Sinan Fan, Liang Xie, Chen Shen*†, Ge Teng, Xiaosong Yuan, Xiaofeng Zhang, Chenxi Huang, and 3 more authors
In The Thirteenth International Conference on Learning Representations, 2025
-
Concise and Organized Perception Facilitates Reasoning in Large Language Models
Junjie Liu, Shaotian Yan, Chen Shen*, Zhengdong Xiao, Liang Xie, Wenxiao Wang, and Jieping Ye
In Findings of the Association for Computational Linguistics: NAACL 2025, 2025
-
From Redundancy to Relevance: Information Flow in LVLMs Across Reasoning Tasks
Xiaofeng Zhang, Yihao Quan, Chen Shen†, Xiaosong Yuan, Shaotian Yan, Liang Xie, Wenxiao Wang, and 3 more authors
In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2025
-
Instance-adaptive Zero-shot Chain-of-Thought Prompting
Xiaosong Yuan, Chen Shen†, Shaotian Yan, Xiaofeng Zhang, Liang Xie, Wenxiao Wang, Renchu Guan, and 2 more authors
In Advances in Neural Information Processing Systems, 2024
Please feel free to contact me via email or WeChat.