Research by Dr. Guang Wang, an Assistant Professor in the Computer Science Department, has been published at the prestigious Thirty-Eighth AAAI Conference on Artificial Intelligence (AAAI-24). AAAI is recognized as one of the top conferences in the field of artificial intelligence. The paper is titled “NondBREM: Nondeterministic Offline Reinforcement Learning for Large-scale Order Dispatching,” and Dr. Wang is a co-first author.

This paper is the first work to use the offline reinforcement learning framework to address the real-world, large-scale order dispatching problem. The authors designed a new Nondeterministic Offline Reinforcement Learning algorithm, called NondBREM, that learns a policy solely from accumulated logged data, avoiding costly and unsafe interactions with the environment. NondBREM has two main components. A Nondeterministic Batch-Constrained Q-learning module reduces the algorithm's extrapolation error, and a Random Ensemble Mixture module, which integrates multiple value networks through multi-head networks, improves the model's generalization and robustness. Extensive experiments on large-scale real-world ride-hailing datasets show the superiority of the design.
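For readers unfamiliar with the two ideas named above, the following is a minimal sketch of their generic textbook forms, not the authors' NondBREM implementation. It assumes a discrete action space, a small array of per-head Q-values standing in for a multi-head value network, and a hypothetical threshold of 0.3 for the batch constraint. The function names and all numbers are illustrative, and the paper's nondeterministic variant is not reproduced here.

```python
import numpy as np


def rem_q_values(head_q, rng):
    """Random Ensemble Mixture: mix K Q-value heads with random convex weights.

    head_q has shape (K, A): one row of Q-values per head, one column per action.
    """
    k = head_q.shape[0]
    alpha = rng.uniform(0.0, 1.0, size=k)  # random non-negative weights
    alpha = alpha / alpha.sum()            # normalize so the weights sum to 1
    return alpha @ head_q                  # shape (A,)


def batch_constrained_action(q_values, behavior_probs, threshold=0.3):
    """Batch-constrained greedy choice for discrete actions.

    Only actions that the logged (behavior) data supports well enough are
    eligible, which limits extrapolation to rarely seen actions.
    The threshold value is an assumption made for this sketch.
    """
    ratio = behavior_probs / behavior_probs.max()
    allowed = ratio >= threshold
    masked = np.where(allowed, q_values, -np.inf)  # rule out unsupported actions
    return int(np.argmax(masked))


# Illustrative numbers only: two heads, three candidate dispatch actions.
rng = np.random.default_rng(0)
head_q = np.array([[1.0, 5.0, 2.0],
                   [1.5, 6.0, 2.5]])
behavior_probs = np.array([0.60, 0.02, 0.38])  # how often each action appears in the logs

q = rem_q_values(head_q, rng)
action = batch_constrained_action(q, behavior_probs)
print(action)  # 2
```

In this toy example, action 1 has the highest estimated value under every head, but it almost never appears in the logged data, so the batch constraint excludes it and the best well-supported action is chosen instead.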

The paper was presented at the AAAI 2024 conference in Vancouver, Canada, in February and is set to be published by the AAAI Press in the conference proceedings.