Towards Explainable Robots: Developing Consensus Reaching Mechanisms for Co-Robots in Haptic Shared Control Paradigms
Abstract
Human-automation teaming (HAT) is gaining importance in military and commercial applications with autonomous vehicles because it promises to improve performance, reduce the costs of designing and operating platforms, and increase adaptability to new situations. Because both humans and automation systems are subject to misses, faults, and errors, ensuring HAT performance in unpredictable conditions requires addressing the hand-off problem: how to transition control between a human driver and an automation system. Current solutions for control transfer in semi-automated ground vehicles face issues such as prolonged transfer time, misinterpretation or misappropriation of responsibility, and incomplete or inaccurate understanding of the vehicle and environment state. Transitions involving such issues are often "bumpy" and implicated in safety compromises. This dissertation addresses these issues by designing and testing an adaptive haptic shared control (HSC) scheme in which a driver and an automation system are physically connected through a motorized steering wheel. We model the structure of the automation system after that of the human driver, including a higher-level intent generator and a lower-level impedance controller. In the first part of this dissertation, we developed a nonlinear stochastic model predictive control (SMPC) approach to determine how the automation's impedance should be modulated in different interaction modes to enable a smooth and dynamic transition of control authority between the human and the automation system. The cost function in this MPC is defined to maximize task performance and minimize the disagreement between the human and the automation within different interaction modes. To solve the optimal control problem, we first employed the polynomial chaos (PC) approach to construct a deterministic surrogate for the stochastic MPC problem of adaptive HSC.
Then, we employed the continuation/generalized minimum residual (C/GMRES) solver, which provides an iterative algorithm for solving the nonlinear model predictive control problem. We report a set of numerical and experimental results that evaluate the performance of the proposed adaptive haptic shared control framework. The numerical results demonstrate that when the human control command is sufficient for avoiding the obstacle, the disagreement between the human and the automation system can be reduced by adopting smaller values for the impedance controller parameters. On the other hand, when the human control command is insufficient, the automation system gains control authority by adopting larger values for the impedance controller parameters, which ensures the safety of the obstacle avoidance task. We also performed processor-in-the-loop (PIL) tests to show that the proposed predictive controller can compute the optimal modulation policy in real time. The PIL results show high computational speed and numerical accuracy for the proposed method on low-cost microcontrollers. Finally, we quantified the performance of the adaptive haptic shared control through a set of human-subject studies using a fixed-base driving simulator. We invited 27 participants to drive a simulated vehicle through a course with obstacles. For forty percent of these obstacles, the driver was instructed to avoid the obstacle in the same direction as the automation system. For the other sixty percent, the driver was instructed to take the opposite direction from the automation system. We compared the performance of the adaptive haptic shared control with two other shared control schemes, named the assistive and active-safety haptic shared control schemes. In the Assistive mode, the automation system weights the error term between the steering angle and the driver's desired steering command.
This mode represents a case where the automation has relatively high confidence in the driver. In the Active-Safety mode, the automation system weights the error term between the steering angle and the automation's desired steering command. This mode represents a case where the automation has relatively low confidence in the driver. In the adaptive haptic shared control, the automation adaptively assigns different weights to the error terms based on the human impedance. Here, we used the driver's grip force as a proxy to estimate the human impedance on the steering wheel. We compared the performance of these three shared control schemes by analyzing five metrics, including obstacle hits and metrics related to driving maneuvers around the obstacles that were avoided. Our statistical analysis indicated that the adaptive haptic shared control paradigm supports the best overall team performance in resolving a conflict between the driver and the automation system while keeping the vehicle safe. In the second part of this dissertation, we studied the principles of convention formation in a haptic shared control framework to narrow down the many possible strategies for resolving a conflict to those that a driver is more likely to gravitate toward. To this end, we proposed a modular platform that separates partner-specific conventions from task-dependent representations, and we used this platform to learn various forms of conventions between a human driver and an automation system. In this platform, we assumed that the human and automation steering commands could be determined by optimizing a set of cost functions. For each agent, the cost function is defined as a combination of hand-coded features and a vector of weights. We argue that the hand-coded features can be selected to describe task-dependent representations, whereas the weight distributions over these features can be used to determine the partner-specific conventions.
Using this platform, we created a map from human-automation interaction outcomes to the space of conventions. Finally, we designed an adaptable automation system that uses this convention map to reach a desirable shared convention. In particular, we developed a reinforcement-learning-based model predictive controller that enables the automation system to learn complex policies and adapt its behavior accordingly. To this end, we designed an episode-based policy search using a Deep Deterministic Policy Gradient (DDPG) agent to determine the optimal distribution of the weight vector in the automation's cost function. We applied the proposed platform to the problem of intent negotiation for resolving a conflict. Specifically, we considered a scenario where both the human and the automation detect an obstacle but choose different paths to maneuver around it. The simulation results demonstrate that the convention-based handover strategies can successfully resolve a conflict and improve the performance of human-automation teaming.
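
The idea of weighting two steering error terms according to grip force can be illustrated with a minimal one-step sketch. It is not the dissertation's SMPC formulation: the quadratic cost, the linear grip-to-weight mapping, the direction of that mapping (a firmer grip shifts weight toward the driver), the force thresholds, and all function names are assumptions made only for illustration.

```python
def blend_weights(grip_force, grip_min=5.0, grip_max=40.0):
    """Map grip force (N) to (w_driver, w_auto), which sum to 1.

    Assumption: weight on the driver's term grows linearly with grip
    force between two hypothetical thresholds, grip_min and grip_max.
    """
    alpha = (grip_force - grip_min) / (grip_max - grip_min)
    alpha = min(1.0, max(0.0, alpha))  # clamp to [0, 1]
    return alpha, 1.0 - alpha


def shared_cost(steer, steer_driver, steer_auto, w_driver, w_auto):
    """Weighted sum of squared errors to each agent's desired steering angle."""
    return (w_driver * (steer - steer_driver) ** 2
            + w_auto * (steer - steer_auto) ** 2)


def best_steer(steer_driver, steer_auto, w_driver, w_auto):
    """Closed-form minimizer of shared_cost: a weighted average of the two commands."""
    return (w_driver * steer_driver + w_auto * steer_auto) / (w_driver + w_auto)


if __name__ == "__main__":
    # Driver wants to steer left (+0.2 rad), automation right (-0.2 rad).
    for grip in (5.0, 22.5, 40.0):
        w_d, w_a = blend_weights(grip)
        print(grip, w_d, w_a, best_steer(0.2, -0.2, w_d, w_a))
```

Under these assumptions, the two fixed schemes correspond to the extremes of the same cost: all weight on the driver's term resembles the Assistive mode, all weight on the automation's term resembles the Active-Safety mode, and the adaptive scheme moves between them as the grip-force proxy changes.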