Safe and Efficient Reinforcement Learning for Behavioural Planning in Autonomous Driving

Edouard Leurent 1, 2, 3
1 SEQUEL - Sequential Learning
Inria Lille - Nord Europe, CRIStAL - Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189
2 VALSE - Finite-time control and estimation for distributed systems
Inria Lille - Nord Europe, CRIStAL - Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189
Abstract : In this Ph.D. thesis, we study how autonomous vehicles can learn to act safely and avoid accidents, despite sharing the road with human drivers whose behaviours are uncertain. To account for this uncertainty explicitly, we use online observations of the environment to construct a high-confidence region over the system dynamics, which we propagate through time to bound the possible trajectories of nearby traffic. To ensure safety under such uncertainty, we resort to robust decision-making and act by always considering the worst-case outcomes. This approach guarantees that the performance reached during planning is at least achieved for the true system, and we show by end-to-end analysis that the overall sub-optimality is bounded. Tractability is preserved at all stages by leveraging sample-efficient tree-based planning algorithms. Another contribution is motivated by the observation that this pessimistic approach tends to produce overly conservative behaviours: if you wish to overtake a vehicle, what certainty do you have that it will not change lanes at the very last moment, causing an accident? Such reasoning makes it difficult for robots to drive amidst other drivers, merge onto a highway, or cross an intersection, an issue colloquially known as the “freezing robot problem”. Thus, the presence of uncertainty induces a trade-off between two contradictory objectives: safety and efficiency. How does one arbitrate this conflict? The question can be temporarily circumvented by reducing uncertainty as much as possible. For instance, we propose an attention-based neural network architecture that better accounts for interactions between traffic participants to improve predictions. But to actively embrace this trade-off, we draw on constrained decision-making to consider both the task completion and safety objectives independently.
Rather than a unique driving policy, we train a whole continuum of behaviours, ranging from conservative to aggressive. This provides the system designer with a slider allowing them to adjust the level of risk assumed by the vehicle in real time.
Document type :
Theses

https://hal.inria.fr/tel-03035705
Contributor : Edouard Leurent
Submitted on : Wednesday, December 2, 2020 - 12:21:34 PM
Last modification on : Friday, December 11, 2020 - 6:44:08 PM

File

PhD_thesis__Edouard_Leurent.pd...
Files produced by the author(s)

Identifiers

  • HAL Id : tel-03035705, version 1

Citation

Edouard Leurent. Safe and Efficient Reinforcement Learning for Behavioural Planning in Autonomous Driving. Computer Science [cs]. Université de Lille, 2020. English. ⟨tel-03035705⟩

Metrics

Record views : 235
File downloads : 847