Chen Shen (申晨)

Senior Algorithm Expert | Tongyi Lab

I received my Ph.D. and B.S. degrees from Zhejiang University in 2018 and 2012, respectively.

My research focuses on LLM post-training, bridging the gap between cutting-edge academic research and large-scale industrial deployment. My current interests include:

Reasoning & Agents: Enhancing model reasoning and agent capabilities through knowledge distillation and reinforcement learning.
Model Safety: Building LLM safety guardrails to intercept multi-dimensional risks—including content violations, prompt injection, jailbreak attacks, and model hallucinations—ensuring end-to-end security across the lifecycle, from AIGC to AI Agent operations.

We are recruiting self-motivated interns with a strong LLM background. Please feel free to contact me via Email or WeChat.

news

Apr 07, 2026	Two papers were accepted to ACL 2026.
Jan 26, 2026	Three papers were accepted to ICLR 2026, including one oral.
Sep 25, 2025	One paper was accepted to NeurIPS 2025.
Sep 04, 2025	Two papers were accepted to EMNLP 2025, including one oral.
May 16, 2025	One paper was accepted to ACL 2025.
Jan 23, 2025	Two papers were accepted to ICLR 2025.
Jan 23, 2025	Two papers were accepted to NAACL 2025, including one oral.
Sep 26, 2024	One paper was accepted to NeurIPS 2024.
May 17, 2024	One paper was accepted to KDD 2024.
May 02, 2024	One paper was accepted to ICML 2024.

featured projects

Distribution-Aligned Sequence Distillation for Superior Long-CoT Reasoning

We released the technical report "Distribution-Aligned Sequence Distillation for Superior Long-CoT Reasoning", along with two reasoning models trained on Qwen base (DASD-4B-Thinking and DASD-30B-A3B-Thinking-Preview) and the corresponding training data. After open-sourcing, the work drew a strong response from the Hugging Face community: the training data quickly topped the Hugging Face Dataset Trending leaderboard and held the #1 spot for more than ten consecutive days (Jan 20-31). Total downloads exceeded 70K within three weeks, with the models and their derivatives accumulating over 20K downloads.

ArXiv GitHub HF Dataset HF Model

AI Safety Guardrail based on Tongyi-Skynet LLM

I led the R&D for the Tongyi-Skynet LLM-based AI Safety Guardrail project, which was honored as a 2025 Alibaba & Ant Group Outstanding Technical Project. It was one of only three projects selected from Alibaba Cloud.

On Alibaba Cloud, our model provides security protection for hundreds of millions of calls daily, covering multimodal scenarios (text, image, etc.) and addressing multi-dimensional risks such as content compliance and prompt injection. This helps enterprises on the cloud deploy highly available, highly compliant closed-loop AI applications.

selected publications

(*Corresponding Author or Project Lead)

arXiv

Distribution-Aligned Sequence Distillation for Superior Long-CoT Reasoning

Shaotian Yan, Kaiyuan Liu, Chen Shen^*, Bing Wang, Sinan Fan, Jun Zhang, Yue Wu, and 2 more authors

arXiv preprint arXiv:2601.09088, 2026

arXiv Code

ICLR

Where Did This Sentence Come From? Tracing Provenance in LLM Reasoning Distillation

Kaiyuan Liu, Shaotian Yan, Rui Miao, Bing Wang, Chen Shen^*, Jun Zhang, and Jieping Ye

ICLR 2026, arXiv preprint arXiv:2512.20908, 2026

arXiv

ICLR Oral

Hallucination Begins Where Saliency Drops

Xiaofeng Zhang, Yuanchao Zhu, Chaochen Gu, Xiaosong Yuan, Qiyan Zhao, Jiawei Cao, Feilong Tang, and 4 more authors

ICLR 2026 Oral, arXiv preprint arXiv:2601.20279, 2026

arXiv

ICLR

Differential Fine-Tuning Large Language Models Towards Better Diverse Reasoning Abilities

Xiaosong Yuan, Chen Shen^*, Shaotian Yan, Kaiyuan Liu, Xiaofeng Zhang, Liang Xie, Wenxiao Wang, and 3 more authors

ICLR 2026

ICLR

Don’t Take Things Out of Context: Attention Intervention for Enhancing Chain-of-Thought Reasoning in Large Language Models

Shaotian Yan, Chen Shen^*, Wenxiao Wang, Liang Xie, Junjie Liu, and Jieping Ye

In The Thirteenth International Conference on Learning Representations, 2025

Abs arXiv

Few-shot Chain-of-Thought (CoT) significantly enhances the reasoning capabilities of large language models (LLMs), functioning as a whole to guide these models in generating reasoning steps toward final answers. However, we observe that isolated segments, words, or tokens within CoT demonstrations can unexpectedly disrupt the generation process of LLMs. The model may overly concentrate on certain local information present in the demonstration, introducing irrelevant noise into the reasoning process and potentially leading to incorrect answers. In this paper, we investigate the underlying mechanism of CoT through dynamically tracing and manipulating the inner workings of LLMs at each output step, which demonstrates that tokens exhibiting specific attention characteristics are more likely to induce the model to take things out of context; these tokens directly attend to the hidden states tied with prediction, without substantial integration of non-local information. Building upon these insights, we propose a Few-shot Attention Intervention method (FAI) that dynamically analyzes the attention patterns of demonstrations to accurately identify these tokens and subsequently make targeted adjustments to the attention weights to effectively suppress their distracting effect on LLMs. Comprehensive experiments across multiple benchmarks demonstrate consistent improvements over baseline methods, with a remarkable 5.91% improvement on the AQuA dataset, further highlighting the effectiveness of FAI.

ICLR

Improving Complex Reasoning with Dynamic Prompt Corruption: A Soft Prompt Optimization Approach

Sinan Fan, Liang Xie, Chen Shen^*, Ge Teng, Xiaosong Yuan, Xiaofeng Zhang, Chenxi Huang, and 3 more authors

In The Thirteenth International Conference on Learning Representations, 2025

arXiv Poster

NeurIPS

Instance-adaptive Zero-shot Chain-of-Thought Prompting

Xiaosong Yuan, Chen Shen^*, Shaotian Yan, Xiaofeng Zhang, Liang Xie, Wenxiao Wang, Renchu Guan, and 2 more authors

In Advances in Neural Information Processing Systems, 2024

arXiv