​​​
Reinforcement Learning
​​​
​
-
A Robust Model-Based Approach for Continuous-Time Policy Evaluation with Unknown Lévy Process Dynamics [pdf]
Qihao Ye, Xiaochuan Tian, Yuhua Zhu.
Preprint, 2025.
​​​​​
​
-
Optimal-PhiBE: A PDE-based Model-free framework for Continuous-time Reinforcement Learning [pdf]
Yuhua Zhu, Yuming Paul Zhang, Haoyu Zhang.
Preprint, 2025.
​​
​
Jiale Han, Xiaowu Dai, Yuhua Zhu.
Fortieth AAAI Conference on Artificial Intelligence (AAAI), 2026 (oral presentation).
​​​​
​
-
On Bellman Equations for Continuous-time Policy Evaluation: High-order Discretization and Function Approximation [pdf]
Wenlong Mou, Yuhua Zhu*.
SIAM Journal on Mathematics of Data Science, to appear, 2025.
​​​
​
-
PhiBE: A PDE-based Bellman Equation for Continuous Time Policy Evaluation. [pdf]
Yuhua Zhu.
Preprint, 2024.
​​
​
-
Continuous-in-time Limit for Bayesian Bandits. [pdf]
Yuhua Zhu, Zachary Izzo and Lexing Ying.
Journal of Machine Learning Research (JMLR), 2023.
MATLAB code
​​
-
Operator Augmentation for Model-based Policy Evaluation. [pdf]
Xun Tang, Lexing Ying and Yuhua Zhu*.
Communications in Mathematical Sciences, 2023.
​
-
Variational Actor-Critic Algorithms. [pdf]
Yuhua Zhu and Lexing Ying.
ESAIM: Control, Optimisation and Calculus of Variations, 2023.
​​
​
-
A Note on Optimization Formulations of Markov Decision Processes. [pdf]
Lexing Ying and Yuhua Zhu.
Communications in Mathematical Sciences, 2021, to appear.
​​
​
-
Borrowing From the Future: Addressing Double Sampling in Model-free Control. [pdf]
Yuhua Zhu, Zachary Izzo and Lexing Ying.
Mathematical and Scientific Machine Learning, PMLR, 2021.
​​
​
-
Borrowing From the Future: An Attempt to Address Double Sampling. [pdf]
Yuhua Zhu and Lexing Ying.
Mathematical and Scientific Machine Learning, PMLR 107:246-268, 2020.
​​
​
*: Alphabetical authorship.