Haoyu Li is an undergraduate student in Computer Science at Wuhan University. His research interests include world models, diffusion models, multimodal learning, efficient inference, and embodied AI. He has been a research assistant at the MARS Lab, Wuhan University, and a remote research intern at the Intelligent Interface Center, Harbin Institute of Technology.
Contact: haoyuli404@outlook.com | +86-158-2700-2669
News
- 2026.05: Released Diffusion Models from Zero to Hero as an open-source course; contributions are welcome.
- 2026.04: Project approved, National College Students’ Innovation and Entrepreneurship Training Program.
- 2026.04: S3Mamba-Pan accepted by IEEE TGRS (DOI: 10.1109/TGRS.2026.3686021).
- 2025.10: Received Meritorious Student Leader, Wuhan University Merit Student, and related honors.
- 2025.08: National Third Prize, National Computer System Capability Competition.
- 2025.05: Central-South Regional First Prize, National Undergraduate Computer Design Competition.
- 2025.05: Honorable Mention, Mathematical Contest in Modeling.
- 2024.11: Hubei Province Second Prize, National College Student Mathematics Competition.
Publications

S3Mamba-Pan: Spectral-Spatial-Scale Mamba With Frequency-Decoupled Dual-Stream for Pansharpening
Zishun Song; Yao Zhang; Haoyu Li; Yanlin He; Jiawei Zhao; Yi Yang; Wei Zhang; Dezhen Wang
IEEE Transactions on Geoscience and Remote Sensing, 2026. DOI: 10.1109/TGRS.2026.3686021
Honors and Awards
- 2025 Meritorious Student Leader, Wuhan University
- 2025 Wuhan University Merit Student
- 2025 National Third Prize, National Computer System Capability Competition
- 2025 Central-South Regional First Prize, National Undergraduate Computer Design Competition
- 2025 Hubei Province Second Prize, National College Student Mathematics Competition
- 2025 Honorable Mention, Mathematical Contest in Modeling
- 2026 Project Approved, National College Students’ Innovation and Entrepreneurship Training Program
Education
- Sept. 2023 - June 2027 (expected), Bachelor of Engineering in Computer Science, Wuhan University, Wuhan, China
- GPA: 3.67 / 4.0
- Selected coursework: Data Structures (97), Algorithm Design and Analysis (92), Computer Graphics (92), Fundamentals of Software Construction (94), Advanced Mathematics (96), Probability Theory and Mathematical Statistics (92)
Research Experience
- MARS Lab, Wuhan University - Research Assistant | Apr. 2025 - Present | Wuhan, China | Advisor: Prof. Mang Ye
  - Studied federated prototype learning for multi-center, multimodal psychiatric diagnosis, focusing on non-IID data, privacy constraints, and cross-domain generalization; designed topology-aware prototype modeling to reduce prototype shift and semantic mixing across heterogeneous clinical centers.
  - Designed a multimodal agentic diagnosis framework that organizes clinical text, structured scales, and multimodal evidence into a hierarchical reasoning chain.
  - Co-developed a multidimensional psychiatric benchmark, focusing on interpretability, robustness, and failure-mode diagnosis in complex settings; the dataset is released on Harvard Dataverse.
- Intelligent Interface Center, Harbin Institute of Technology - Remote Research Intern | Dec. 2025 - Present | Remote | Advisor: Prof. Tiejun Zhao
  - Contributed to S3Mamba-Pan, an efficient visual modeling project, studying the trade-offs among spatial detail, spectral consistency, and inference efficiency in remote-sensing multimodal image fusion through frequency-decoupled dual-stream Mamba, Haar-wavelet decomposition, spectral anchors, and adaptive distribution recalibration.
  - Time-series generation: Developed ChronoRect, which uses rectified flow to model continuous distribution transport in clinical time-series data, and proposed EHR-TriDiT to improve synthetic fidelity, downstream utility, and empirical privacy safety. Gained hands-on understanding of flow matching, generative sampling, and conditional modeling.
  - Controllable video generation: Studied region and trajectory conditioning in object-centric text-to-video diffusion, aiming to improve object motion control, identity preservation, attribute binding, and temporal consistency in generated videos.
Projects
- Diffusion Models from Zero to Hero | Oct. 2025 - Present | PyTorch, Diffusers, Jupyter Notebook
  - Systematically organized and maintained a hands-on diffusion-model course covering DDPM, DDIM, Diffusers, Stable Diffusion, CFG, LoRA, ControlNet, SDXL, DiT, Flow Matching, and video generation, with a documentation site, learning path, environment setup notes, and practice notebooks.
- Happy-LLM: Building Large Language Models from Scratch | Aug. 2025 - Apr. 2026 | Datawhale | Decoder-only | Pretraining -> SFT
  - Co-developed Happy-LLM (30k+ stars), organizing runnable teaching and practice materials around the core pipeline of large language models, including attention mechanisms, tokenization, RoPE, and KV Cache.
  - Connected pretraining, instruction tuning, PEFT, and inference optimization into an end-to-end engineering workflow, emphasizing data construction, training objectives, optimization strategies, memory management, stability practices, and extensions to evaluation, RAG, and agents.
- CyberMars: Intelligent CyberDog Robotics Control | Apr. 2025 - Aug. 2025 | ROS, reinforcement learning
  - Built an embodied control system for Xiaomi CyberDog, using ROS to coordinate task scheduling, real-time visual perception, lane following, QR / marker recognition, and obstacle-aware navigation, and converting fused sensor observations into executable motion decisions.
  - Mapped visual recognition outputs into task states and action constraints, designed low-level motion interfaces and an LCMT trajectory-tracking pipeline, and combined reinforcement-learning-guided policy tuning with real-time feedback correction to improve closed-loop stability under sensor noise, action delay, and scene disturbance.
- SURS: Structured Ultrasound Reporting System | Dec. 2025 - Apr. 2026 | WPF, MVVM, QuestPDF
  - Designed a structured desktop reporting system for real gynecological ultrasound workflows, decomposing clinical forms into composable state units and using lesion-option dependencies with O-RADS grading assistance to reduce repetitive input and rule omissions.
  - Organized business logic, UI state, and PDF generation with MVVM and an App / Core / Infrastructure layered architecture; integrated QuestPDF for real-time preview and A4 medical-report export, serving 10,000+ patients and clinicians.
Leadership & Service
- Class Monitor, Wuhan University | Sept. 2023 - Present
  - Responsible for a class of 31 students; coordinated academic, administrative, and collective affairs; organized 16 themed activities across class, college, and inter-college settings.
- Committee Member, External Liaison Department | Sept. 2023 - Present
  - Facilitated 2 campus-enterprise collaborations; planned events reaching 5,000+ participants cumulatively.
Skills
- Languages: Mandarin (native), English (fluent)
- Programming: Python, Java, C#, C++, SQL, Go
- Tools & frameworks: PyTorch, Linux, Git, ROS, Docker