In recent years, Deep Reinforcement Learning methods integrated with planning methods have been very successful in solving complex Sequential Decision Problems, e.g., in games such as Go. These methods can handle very large state spaces but are less effective for large action spaces.
The proposed PhD research project aims to develop new hybrid solution approaches for (stochastic) Sequential Decision Problems with discrete, high-dimensional, and linearly constrained decision spaces, as often encountered in Operations Research. To this end, the PhD student will integrate (Deep) Reinforcement Learning methods into traditional Operations Research methods, such as Rolling Horizon approaches using Stochastic Programming. In this way, at the time of decision-making, we combine planning by looking ahead with learning from previous experience. We aim to provide theoretical and empirical results showing that the new methods outperform the state of the art in terms of both computational effort at the time of decision-making and solution quality. We will transition from problems with known dynamics to problems with (partly) unknown dynamics. Accordingly, we will explore solution strategies that transition from robust optimization to distributionally robust optimization and finally to stochastic optimization.
The PhD student will implement the algorithms and apply them to real-world use cases in Healthcare Logistics, such as multi-appointment scheduling, surgery scheduling, and resource allocation in times of scarce healthcare capacity. Therefore, the PhD student will also be part of the inter-faculty group CHOIR (Centre for Healthcare Operations Improvement and Research). CHOIR is a research center within the UT, and it is currently one of the most active and productive research groups in the field of Operations Research and Management in Healthcare. Through Research, Education, and Valorization, we help healthcare practitioners face their complex logistical challenges.