Reinforcement Learning

  • A Robust Model-Based Approach for Continuous-Time Policy Evaluation with Unknown Lévy Process Dynamics [pdf]

Qihao Ye, Xiaochuan Tian, Yuhua Zhu

Preprint, 2025.

  • Optimal-PhiBE: A PDE-based Model-free framework for Continuous-time Reinforcement Learning [pdf]

Yuhua Zhu, Yuming Paul Zhang, Haoyu Zhang.

Preprint, 2025.

  • Variance Reduction via Resampling and Experience Replay [pdf][code]

Jiale Han, Xiaowu Dai, Yuhua Zhu.

Fortieth AAAI Conference on Artificial Intelligence (AAAI), 2026 (oral presentation).

  • On Bellman Equations for Continuous-time Policy Evaluation: High-order Discretization and Function Approximation [pdf]

Wenlong Mou, Yuhua Zhu*.

SIAM Journal on Mathematics of Data Science, to appear, 2025.

  • PhiBE: A PDE-based Bellman Equation for Continuous Time Policy Evaluation. [pdf]

Yuhua Zhu.

Preprint, 2024.

  • Continuous-in-time Limit for Bayesian Bandits. [pdf]

Yuhua Zhu, Zachary Izzo and Lexing Ying.

Journal of Machine Learning Research (JMLR), 2023.

Matlab code

  • Operator Augmentation for Model-based Policy Evaluation. [pdf]

Xun Tang, Lexing Ying and Yuhua Zhu*.

Communications in Mathematical Sciences, 2023.

  • Variational Actor-Critic Algorithms. [pdf]

Yuhua Zhu and Lexing Ying.

ESAIM: Control, Optimisation and Calculus of Variations, 2023.

  • A Note on Optimization Formulations of Markov Decision Processes. [pdf]

Lexing Ying and Yuhua Zhu.

Communications in Mathematical Sciences, 2021, to appear.

  • Borrowing From the Future: Addressing Double Sampling in Model-free Control. [pdf]

Yuhua Zhu, Zachary Izzo and Lexing Ying.

Mathematical and Scientific Machine Learning, PMLR, 2021.

  • Borrowing From the Future: An Attempt to Address Double Sampling. [pdf]

Yuhua Zhu and Lexing Ying.

Mathematical and Scientific Machine Learning, PMLR 107:246-268, 2020.

*: Alphabetical authorship.

© 2025 Yuhua Zhu