|
Wenjie Du
Welcome to my homepage!🤗
I'm Wenjie Du (杜文杰), an MSc student at Nanyang Technological University (since January 2026).
Previously, I obtained a BEng in Computer Science and Technology from Sichuan University in 2024.
I have been working as a research intern in the ENCODE Lab, led by Prof. Huan Wang, at Westlake University since Fall 2025.
I was also fortunate to work as a research assistant at the Hong Kong University of Science and Technology in 2025, a research intern at the Institute for AI Industry Research (AIR), Tsinghua University in 2024, and a software developer intern at ByteDance in 2023.
I work closely with Yuhao at NTU and Keda at Westlake.
I am grateful to all my supervisors, mentors, and friends for their kindness and support throughout my journey!
Email / CV / Scholar / LinkedIn / GitHub
I am seeking a visiting student position in Summer 2026 and PhD positions starting in Fall 2027.
|
|
Research
My research interests lie in large language models and machine learning systems.
I focus on understanding the inner mechanisms of large language models to improve their efficiency and effectiveness.
I am also working on efficient training and inference systems.
|
|
OmniAgent: Audio-Guided Active Perception Agent for Omnimodal Audio-Video Understanding
Keda Tao, Wenjie Du, Bohan Yu, Weiqiang Wang, Jian Liu, Huan Wang
arXiv, 2025
page / arXiv
An audio-guided active perception agent that dynamically orchestrates specialized tools for finer-grained audio-visual reasoning, outperforming leading open-source and proprietary models by 10-20% in accuracy.
|
|
Which Heads Matter for Reasoning? RL-Guided KV Cache Compression
Wenjie Du, Li Jiang, Keda Tao, Xue Liu, Huan Wang
arXiv, 2025
page / arXiv
An RL-based method that identifies reasoning-critical KV heads in LLMs to guide KV cache compression, achieving a 20-50% reduction in KV cache size with minimal performance loss.
|
|
AutoDroid-V2: Boosting SLM-based GUI Agents via Code Generation
Hao Wen, Shizuo Tian, Borislav Pavlov, Wenjie Du, Yixuan Li, Ge Chang, Shanhui Zhao, Jiacheng Liu, Yunxin Liu, Ya-Qin Zhang, Yuanchun Li
MobiSys, 2025 (Best Artifact Award)
code / arXiv / pdf
A document-centered framework that recasts mobile UI automation as code generation, enabling a script-based mobile GUI agent to achieve higher success rates and lower latency than step-wise agents.
|
|
LLM-Explorer: Towards Efficient and Affordable LLM-based Exploration for Mobile Apps
Shanhui Zhao, Hao Wen, Wenjie Du, Cheng Liang, Yunxin Liu, Xiaozhou Ye, Ye Ouyang, Yuanchun Li
MobiCom, 2025
code / arXiv
An efficient mobile app exploration agent that uses LLMs for knowledge maintenance rather than action generation, achieving the highest coverage among the compared baselines at 148x lower cost.
|
|