[RSS 2023] Bridging Active Exploration and Uncertainty-Aware Deployment

Bridging Active Exploration and Uncertainty-Aware Deployment Using Probabilistic Ensemble Neural Network Dynamics
Taekyung Kim*, Jungwi Mun*, Junwon Seo, Beomsu Kim, Seongil Hong
(* equal contribution)
Agency for Defense Development, Korea
Overview diagram of our unified model-based reinforcement learning framework with dynamics learning. In the exploration phase, a parallelized ensemble neural network serves as the robot dynamics model and outputs the estimated posterior distribution of the next state. To enable active exploration, we quantify epistemic uncertainty by measuring the ensemble disagreement via the Jensen-Rényi Divergence (JRD). In the deployment phase, the neural network dynamics trained during the active exploration phase are applied directly, with minimal modification, to perform uncertainty-aware control.
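
As a rough illustration of the disagreement measure named in the overview, the sketch below computes a Jensen-Rényi divergence with quadratic (α = 2) Rényi entropy for an ensemble whose members each predict a diagonal Gaussian over the next state. The choice of α = 2, uniform member weights, diagonal covariances, and the function names `gaussian_overlap` and `jensen_renyi_divergence` are assumptions made for this example, not details confirmed by this page.

```python
import numpy as np


def gaussian_overlap(mu_i, var_i, mu_j, var_j):
    """Integral of the product of two diagonal Gaussians.

    Equals N(mu_i - mu_j; 0, var_i + var_j), evaluated in log space for stability.
    """
    var = var_i + var_j
    diff = mu_i - mu_j
    log_val = -0.5 * np.sum(np.log(2.0 * np.pi * var) + diff ** 2 / var)
    return np.exp(log_val)


def jensen_renyi_divergence(means, variances):
    """Quadratic-entropy JRD of a uniformly weighted ensemble (assumed setup).

    means, variances: array-like of shape (n_members, state_dim).
    Returns H2(mixture) - mean_i H2(member_i), with H2(p) = -log(integral of p^2).
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    n = means.shape[0]
    w = 1.0 / n  # uniform ensemble weights (assumption)

    # Integral of the squared mixture density: sum over all member pairs.
    mixture_integral = 0.0
    for i in range(n):
        for j in range(n):
            mixture_integral += w * w * gaussian_overlap(
                means[i], variances[i], means[j], variances[j]
            )
    h_mixture = -np.log(mixture_integral)

    # Quadratic Renyi entropy of each individual member.
    h_members = [
        -np.log(gaussian_overlap(m, v, m, v)) for m, v in zip(means, variances)
    ]
    return h_mixture - w * sum(h_members)


# Two members that agree exactly, then two that disagree on the mean.
agree = jensen_renyi_divergence([[0.0, 0.0], [0.0, 0.0]], [[1.0, 1.0], [1.0, 1.0]])
disagree = jensen_renyi_divergence([[0.0], [2.0]], [[1.0], [1.0]])
```

When all members predict the same Gaussian the value is zero, and for equal variances it grows as the predicted means move apart, which is the property that lets ensemble disagreement act as an exploration signal.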


Main Supplementary Video

Real-World Demo

We successfully transferred our algorithm to real-world settings for uncertainty-aware deployment tasks. These are our initial real-world experiments using zero-shot transfer, without sim-to-real techniques such as domain adaptation or domain randomization. We integrated our algorithm with global path planning and online traversability map generation using a LiDAR sensor. These experiments were conducted on our off-road testbeds. See the ‘Future Work’ section below for more details.

(1) Uphill

Sim-to-Real Demo (1)

(2) Downhill

Sim-to-Real Demo (2)

(3) Circular Track

Sim-to-Real Demo (3)

Demo Results

Preview Experiments

Scatter plots of the data collected during active exploration after 10, 30, 100, and 300 iterations, respectively. At every iteration count, our method, which uses JRD information gain, covers the largest region of the sideslip angle and yaw rate state space compared to the other methods.
(a) The race track designed in the IPG CarMaker simulator for direct deployment experiments. We visualize the vehicle trajectory taken by our method at 300 iterations while driving 10 laps in a counter-clockwise direction. (b) The number of times each method completes a whole lap, counted over every 50 iterations. (c) The average speed cost for each saved model along exploration iterations. The shaded areas denote a 95% confidence interval. (d) The number of trials of each method that violate the stabilizing constraints at least once, i.e., reach sideslip angles larger than 0.3 rad, counted over every 50 iterations.
Visualization of uncertainty-aware navigation results on the vehicle simulator. We display the rotational impacts exerted on the vehicle onto the trajectories. The maximum value of rotational impact during a three-second window is used for visual clarity.
Additional experiments using a 1:5 scale wheeled robotics testbed in (a)-(c) Gazebo and (d)-(f) Nvidia Isaac Sim. (a),(d) The simulated environments during the exploration phase. (b),(e) The simulated environments during the deployment phase. (c),(f) The vehicle trajectories taken by the robot at 300 iterations for driving 10 laps in a counter-clockwise direction.

Future Work

We are actively working on agile off-road autonomous driving by transferring our work to the real world. Recently, we have successfully addressed the issue of steering fluctuations, which can be seen in Sim-to-Real Demo (3), the circular track. Our experiments on both the uphill and downhill testbeds have achieved an average speed of 30 km/h. In our next paper, we will provide a comprehensive analysis of these results, with a specific focus on the detailed design choices of the neural network dynamics and the sim-to-real transfer process. Furthermore, we will present the hardware specifications and elaborate on our approaches to global path planning and traversability map generation using a LiDAR sensor.


@inproceedings{kim2023bridging,
  author    = {Taekyung Kim and Jungwi Mun and Junwon Seo and Beomsu Kim and Seongil Hong},
  title     = {{Bridging Active Exploration and Uncertainty-Aware Deployment Using Probabilistic Ensemble Neural Network Dynamics}},
  booktitle = {Robotics: Science and Systems (RSS)},
  year      = {2023},
  month     = {July}
}