Wenjie Du

Welcome to my homepage!🤗 I'm Wenjie Du (杜文杰), an MSc student at Nanyang Technological University (2026.01 - now). Previously, I obtained a BEng in Computer Science and Technology from Sichuan University in 2024.

I have been working as a research intern in the ENCODE Lab of Prof. Huan Wang at Westlake University since Fall 2025. I was also fortunate to work as a research assistant at the Hong Kong University of Science and Technology in 2025, as a research intern at the Institute for AI Industry Research (AIR), Tsinghua University in 2024, and as a software developer intern at ByteDance in 2023.

I work closely with Yuhao at NTU and Keda at Westlake. I am grateful to all my supervisors, mentors, and friends for their kindness and support throughout my journey!

Email / CV / Scholar / LinkedIn / GitHub

I am seeking a visiting student position in Summer 2026 and PhD positions starting in Fall 2027.

Research

My research interests lie in large language models and machine learning systems. I focus on understanding the inner mechanisms of these models to improve their efficiency and effectiveness. I am also working towards efficient training and inference systems.

OmniAgent: Audio-Guided Active Perception Agent for Omnimodal Audio-Video Understanding
Keda Tao, Wenjie Du, Bohan Yu, Weiqiang Wang, Jian Liu, Huan Wang
arXiv, 2025
page / arXiv
An audio-guided active perception agent that dynamically orchestrates specialized tools to achieve more fine-grained audio-visual reasoning, surpassing leading open-source and proprietary models by substantial margins of 10% - 20% accuracy.

Which Heads Matter for Reasoning? RL-Guided KV Cache Compression
Wenjie Du, Li Jiang, Keda Tao, Xue Liu, Huan Wang
arXiv, 2025
page / arXiv
An RL-based method identifies reasoning-critical KV heads in LLMs to enable KV cache compression, achieving 20-50% KV cache reduction with minimal performance loss.

AutoDroid-V2: Boosting SLM-based GUI Agents via Code Generation
Hao Wen, Shizuo Tian, Borislav Pavlov, Wenjie Du, Yixuan Li, Ge Chang, Shanhui Zhao, Jiacheng Liu, Yunxin Liu, Ya-Qin Zhang, Yuanchun Li
MobiSys, 2025 (Best Artifact Award)
code / arXiv / pdf
A document-centered framework that converts mobile UI automation into code generation, enabling the script-based mobile GUI agent to achieve higher success rates and lower latency than step-wise agents.

LLM-Explorer: Towards Efficient and Affordable LLM-based Exploration for Mobile Apps
Shanhui Zhao, Hao Wen, Wenjie Du, Cheng Liang, Yunxin Liu, Xiaozhou Ye, Ye Ouyang, Yuanchun Li
MobiCom, 2025
code / arXiv
An efficient mobile app exploration agent that uses LLMs for knowledge maintenance rather than action generation, achieving the highest coverage with 148x lower cost than baselines.

This webpage is built upon the source code of Jon Barron.