Hamilton-Jacobi-Bellman Equation for a Time-Optimal Control Problem in the Space of Probability Measures

In this paper we formulate a time-optimal control problem in the space of probability measures endowed with the Wasserstein metric, as a natural generalization of the corresponding classical problem in R^d, where the controlled dynamics is given by a differential inclusion. The main motivation is to model situations in which we have only a probabilistic knowledge of the initial state. In particular, we first prove a Dynamic Programming Principle and then we give a Hamilton-Jacobi-Bellman equation in the space of probability measures, which is solved by a generalization of the minimum time function in a suitable viscosity sense.


Introduction
The controlled dynamics of a classical time-optimal control problem in finite dimension can be presented by means of a differential inclusion as follows:

ẋ(t) ∈ F(x(t)), for a.e. t > 0,   (1)

where F is a set-valued map from R^d to R^d. The problem in this setting is to minimize the time needed to steer x_0 to a given closed target set S ⊆ R^d, S ≠ ∅, defining the minimum time function T : R^d → [0, +∞] by

T(x_0) := inf{T > 0 : ∃ x(·) solving (1) such that x(T) ∈ S}.   (2)

The main motivation of this work is to model situations in which the knowledge of the starting position x_0 is only probabilistic (for example when it is affected by noise), and this can happen even if the evolution of the system is deterministic.
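As a toy illustration of the classical minimum time problem (hypothetical, not taken from the paper), consider the inclusion ẋ(t) ∈ B(0,1), the closed unit ball, with target S a closed ball of radius r: the optimal strategy steers radially toward the origin at unit speed, so T(x_0) = max(|x_0| − r, 0). A minimal numerical sketch:

```python
import numpy as np

def minimum_time_ball(x0, r=1.0, dt=1e-3):
    """Numerically approximate T(x0) for the toy inclusion xdot in B(0,1)
    with target S = closed ball of radius r (a hypothetical example).
    The optimal selection of F(x) = B(0,1) points radially inward at
    unit speed, so the exact value is max(|x0| - r, 0)."""
    x = np.asarray(x0, dtype=float)
    t = 0.0
    while np.linalg.norm(x) > r:
        # choose the selection v = -x/|x| of F(x): full speed toward the origin
        x = x - dt * x / np.linalg.norm(x)
        t += dt
    return t

# e.g. for x0 = (3, 4), |x0| = 5 and r = 1, so T(x0) = 4
```

The explicit Euler loop is only a sketch; any selection of F with smaller radial speed would give a longer, suboptimal time.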
We thus consider as state space the space of Borel probability measures with finite p-moment, endowed with the p-Wasserstein metric W_p(·, ·), denoted by (P_p(R^d), W_p). In [2] the reader can find a detailed treatment of the Wasserstein distance.
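For intuition about the metric: in one dimension, the p-Wasserstein distance between two empirical measures with equally many, equally weighted atoms reduces to matching sorted samples, since the monotone coupling is optimal in 1D. A small sketch (our own illustration, not part of the paper):

```python
import numpy as np

def wasserstein_p_1d(xs, ys, p=2):
    """p-Wasserstein distance between two empirical measures on R with
    the same number of equally weighted atoms.  In 1D the optimal
    transport plan is monotone, so W_p is computed by pairing the
    sorted samples."""
    xs, ys = np.sort(xs), np.sort(ys)
    return float(np.mean(np.abs(xs - ys) ** p) ** (1.0 / p))

# translating a measure by c moves it exactly distance c in W_p
```

For instance, W_p between an empirical measure and its translate by c equals c, for every p ≥ 1.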
Following this idea, we choose to describe the initial state by a probability measure μ_0 ∈ P_p(R^d), and for its evolution in time we take a time-dependent probability measure on R^d, μ := {μ_t}_{t∈[0,T]} ⊆ P_p(R^d), μ_{|t=0} = μ_0. In order to preserve the total mass μ_0(R^d) during the evolution, the process will be described by a (controlled) continuity equation

∂_t μ_t + div(v_t μ_t) = 0,   (3)

where the time-dependent Borel velocity field v_t : R^d → R^d has to be chosen in the set of L^2_{μ_t}-selections of F, in order to respect also the classical underlying control problem (1), which is the characteristic system of (3) in the smooth case.
It is well known that if v_t(·) is sufficiently regular, then the solution of the continuity equation is characterized as the push-forward of μ_0 through the unique solution of the characteristic system.
In Theorem 8.2.1 of [2] and Theorem 5.8 of [4], the so-called Superposition Principle states that, if we conversely require much milder assumptions on v_t, the solution μ_t of the continuity equation can be characterized as the push-forward (e_t)_# η, where e_t : R^d × Γ_T → R^d, (x, γ) ↦ γ(t), Γ_T := C^0([0, T]; R^d), and η is a probability measure on the infinite-dimensional space R^d × Γ_T concentrated on those pairs (x, γ) ∈ R^d × Γ_T such that γ is an integral solution of the underlying characteristic system, i.e. of an ODE of the form γ̇(t) = v_t(γ(t)) with γ(0) = x. We refer the reader to the survey [1] and the references therein for a deep analysis of this approach, which is at the basis of the present work.
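In the smooth setting, the push-forward characterization can be emulated numerically: sample points x from μ_0, solve the characteristic ODE for each sample, and read off an empirical approximation of μ_t = (e_t)_# η. A minimal particle sketch (function name and the explicit Euler scheme are our own choices):

```python
import numpy as np

def push_forward_flow(samples, v, t, dt=1e-3):
    """Monte Carlo version of mu_t = (e_t)_# eta: represent eta by
    particles (x, gamma_x), where gamma_x solves gamma' = v(s, gamma),
    gamma(0) = x, and return the particle positions at time t, i.e. an
    empirical approximation of mu_t.  The characteristics are integrated
    by explicit Euler (a sketch, not the paper's construction)."""
    x = np.asarray(samples, dtype=float)
    s = 0.0
    while s < t:
        x = x + dt * v(s, x)
        s += dt
    return x
```

For the linear field v(s, x) = −x, each particle follows γ(t) = e^{−t} γ(0), so the empirical measure contracts toward the origin as expected.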
Pursuing the goal of facing control systems involving measures, we define a generalization of the target set S by duality. We consider an observer who is interested in measuring some quantities φ(·) ∈ Φ; the results of these measurements are the averages of those quantities w.r.t. the state of the system. The elements of the generalized target S^Φ_p are the states for which the results of all these measurements are below a fixed threshold.
Once the admissible trajectories in this framework are defined, the definition of the generalized minimum time function follows the classical one in a straightforward way.
Since the classical minimum time function can be characterized as the unique viscosity solution of a Hamilton-Jacobi-Bellman equation, it is quite interesting to study a similar formulation in the generalized setting. Several authors have treated similar problems in the space of probability measures, or in a general metric space, giving different definitions of sub-/superdifferentials and viscosity solutions (see e.g. [2,3,8-10]). For example, the theory presented in [10] is quite complete: indeed, results on time-dependent problems are also proved there, together with comparison principles granting uniqueness of the viscosity solutions under very reasonable assumptions.
However, when we consider as metric space the space P_2(R^d), we notice that the class of equations that can be solved is quite small: the general metric space structure of [10] allows one to rely only on the metric gradient, while P_2(R^d) enjoys a much richer structure in the tangent space (which is a subset of L^2).
Dealing with the definition of sub-/superdifferential given in [8], the main constraint is that the "perturbed" measure is assumed to be of the form (Id_{R^d} + φ)_# μ, in which a (rescaled) transport map is used. It is well known that, by Brenier's Theorem, if μ ≪ L^d then in this way we can describe all the measures near μ; however, in general this is not true. Thus, if the set of admissible trajectories contains curves whose points are not all absolutely continuous w.r.t. the Lebesgue measure (as in our case), the definition in [8] cannot be used.
In order to fully exploit the richer structure of the tangent space of P_2(R^d), recalling that absolutely continuous curves in P_2(R^d) are characterized as weak solutions of the continuity equation (Theorem 8.3.1 in [2]), we consider a definition different from the one presented in [8], based on the Superposition Principle.
The paper is structured as follows: in Section 2 we give the definitions of the generalized objects, together with the proof of a Dynamic Programming Principle in this setting. In Section 3 we focus on the main result of this work: we outline a Hamilton-Jacobi-Bellman equation in P_2(R^d) and we solve it in a suitable viscosity sense by means of the generalized minimum time function, assuming some regularity on the velocity field. Finally, in Section 4 we illustrate future research lines on the subject.

Generalized Minimum Time Function
Definition 1 (Standing Assumptions). We will say that a set-valued function F : R^d ⇒ R^d satisfies the assumption (F_j), j = 0, 1, 2, if the following hold true: continuous with respect to the Hausdorff metric, i.e. given x ∈ X, for every ε > 0, such that the following property holds:

(T_E) there exists x_0 ∈ R^d with φ(x_0) ≤ 0 for all φ ∈ Φ.
Definition 2 (Generalized target). We define the generalized target S^Φ_p as follows:

S^Φ_p := { μ ∈ P_p(R^d) : ∫_{R^d} φ(x) dμ(x) ≤ 0 for all φ ∈ Φ }.

For an analysis of the properties of the generalized target, see [5], or [6] for deeper results.
Definition 3 (Admissible trajectories). We say that a Borel family of probability measures μ = {μ_t}_{t∈I} ⊆ P_p(R^d) is an admissible trajectory (curve) defined in I for the system Σ_F joining α and β, if there exists a family of Borel vector fields v = {v_t(·)}_{t∈I} such that

1. μ is a narrowly continuous solution, in the distributional sense, of the continuity equation (3), where v_t is an L^2_{μ_t}-selection of F.

In this case, we will also say, shortly, that μ is driven by v.
When J_F(·) is finite, this value expresses the time needed by the system to steer α to β along the trajectory μ with family of velocity vector fields v.

Definition 4 (Generalized minimum time). Given p ≥ 1, let Φ ⊆ C^0(R^d; R) and let S^Φ_p be the corresponding generalized target defined in Definition 2. In analogy with the classical case, we define the generalized minimum time function T^Φ_p : P_p(R^d) → [0, +∞] by

T^Φ_p(μ_0) := inf{ T > 0 : ∃ an admissible trajectory μ = {μ_t}_{t∈[0,T]} driven by v, with μ_{|t=0} = μ_0 and μ_T ∈ S^Φ_p },

where, by convention, inf ∅ = +∞.
Some interesting results concerning the generalized minimum time function together with comparisons with the classical definition are proved in the proceedings [5] and in the forthcoming paper [6].
Here we will focus our attention on the problem of finding a Hamilton-Jacobi-Bellman equation for our time-optimal control problem.
First of all we need to state and prove a Dynamic Programming Principle; to this aim, the gluing result for solutions of the continuity equation stated in Lemma 4.4 in [7] will be used.
Proposition 1 (Dynamic Programming Principle). Let μ_0 ∈ P_p(R^d) and let μ = {μ_t}_t be an admissible trajectory with μ_{|t=0} = μ_0. Then T^Φ_p(μ_0) ≤ s + T^Φ_p(μ_s) for every s ≥ 0; moreover, if μ is optimal for μ_0 and T^Φ_p(μ_0) < +∞, equality holds for every 0 ≤ s ≤ T^Φ_p(μ_0).

Proof. The proof is based on the fact that, by Lemma 4.4 in [7], the juxtaposition of admissible curves is an admissible curve. Thus, for every ε > 0 we consider the curve obtained by following μ up to time s, and then following an admissible curve steering μ_s to the generalized target in time T^Φ_p(μ_s) + ε. We obtain an admissible curve steering μ_0 to the generalized target in time s + T^Φ_p(μ_s) + ε, and so, by letting ε → 0^+, we have T^Φ_p(μ_0) ≤ s + T^Φ_p(μ_s).
Finally, assume that μ is optimal for μ_0 and T^Φ_p(μ_0) < +∞. Starting from μ_0, we follow μ up to time s. Since the remaining part of μ is still an admissible curve, steering μ_s to the generalized target in time T^Φ_p(μ_0) − s, we must have T^Φ_p(μ_s) ≤ T^Φ_p(μ_0) − s, and so T^Φ_p(μ_s) + s = T^Φ_p(μ_0), since the reverse inequality always holds true.
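The two steps of the proof can be observed on a toy model (hypothetical, chosen by us for illustration): take d = 1, F ≡ [−1, 1], and a single observable φ(x) = x − r, so the generalized target consists of measures with mean at most r. For an initial measure with mean above r, the mean can decrease at rate at most 1, hence T^Φ_p(μ) = mean(μ) − r, and along the optimal curve the Dynamic Programming equality T^Φ_p(μ_0) = s + T^Φ_p(μ_s) holds:

```python
import numpy as np

def gen_min_time(samples, r=1.0):
    """Generalized minimum time in the toy model F = [-1, 1],
    Phi = {x -> x - r} (a hypothetical example): the mean of the measure
    moves at speed at most 1, so the minimum time needed to reach
    mean <= r is max(mean - r, 0)."""
    return max(float(np.mean(samples)) - r, 0.0)

def follow_optimal(samples, s):
    """Optimal trajectory: every atom drifts left at unit speed for time s,
    so the mean decreases exactly at rate 1."""
    return np.asarray(samples, dtype=float) - s

# DPP along the optimal curve: T(mu_0) = s + T(mu_s)
```

Following any suboptimal admissible curve instead (atoms moving with speed below 1) gives the strict inequality T^Φ_p(μ_0) < s + T^Φ_p(μ_s), matching the first part of the proof.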

Hamilton-Jacobi-Bellman Equation
In this section we will prove that, under some assumptions, the generalized minimum time function T^Φ_2 is a viscosity solution, in a sense we will make precise, of a suitable Hamilton-Jacobi-Bellman equation on P_2(R^d). In this paper we assume the velocity field to be continuous, for simplicity. In the forthcoming paper [6] we prove an approximation result for L^2_μ-selections of F by continuous and bounded ones in the L^2_μ-norm, which allows us to treat a more general case.
We recall that, given T ∈ ]0, +∞], the evaluation operator e_t : R^d × Γ_T → R^d is defined as e_t(x, γ) = γ(t) for all 0 ≤ t < T. We set T_F(μ_0) to be the set of probability measures η ∈ P(R^d × Γ_T), concentrated on pairs (x, γ) such that γ is an absolutely continuous solution of γ̇(t) ∈ F(γ(t)) with γ(0) = x, and satisfying e_0 # η = μ_0. It is not hard to prove the following result.
Lemma 1 (Properties of the evaluation operator). Assume (F_0) and (F_1), and let L_1, L_2 > 0 be the constants as in (F_1). For any η ∈ T_F(μ_0): (iii) there exists C > 0, depending only on L_1 and L_2, such that for all t ∈ [0, T] we have

In the case we are considering, where the trajectory t ↦ e_t # η is driven by a sufficiently smooth velocity field, we recover as initial velocity what we expected.
Lemma 2 (Regular driving vector fields). Let μ = {μ_t}_{t∈[0,T]} be an absolutely continuous solution of the continuity equation (3).

The proof is based on the boundedness of v and on the fact that, by hypothesis, the driving vector field is sufficiently regular; the conclusion follows by applying Lebesgue's Dominated Convergence Theorem.
We now give the definitions of viscosity sub-/superdifferential and viscosity solutions that suit our problem. As presented in the Introduction, these concepts are different from the ones treated in [2,3,8-10], due mainly to the structure of the tangent space of P_2(R^d).

Definition 6 (Viscosity solutions). Let V : P_2(R^d) → R be a function. We say that V is a

1. viscosity supersolution of H(μ, DV(μ)) = 0 if there exists C > 0, depending only on H, such that H(μ, q_μ) ≥ −Cδ for all

3. viscosity solution of H(μ, DV(μ)) = 0 if it is both a viscosity subsolution and a viscosity supersolution.
Proof. The proof is split into two claims.
We recall that, since by definition p_{μ_0} ∈ L^2_{μ_0}, we have p_{μ_0} ∘ e_0 ∈ L^2_η. Dividing the left-hand side by s > 0, we observe that we can use Lemma 2: indeed, the velocity field v(·) associated to η ∈ T_F(μ_0) satisfies all the hypotheses (the boundedness comes from (F_2)), and so we have

lim sup

Recalling that p_{μ_0} ∈ D^+_δ T^Φ_2(μ_0) and using Lemma 1(iii), we have

lim sup

where C > 0 is a suitable constant (we can take twice the upper bound on F given by (F_2)).
We thus obtain the above estimate for all η ∈ T_F(μ_0). By passing to the infimum over η ∈ T_F(μ_0), and then using Lemma 2 and Lemma 1, we obtain the claim for all s > 0.

Conclusion
In this work we have studied a Hamilton-Jacobi-Bellman equation solved by a generalized minimum time function in a regular case. In the forthcoming paper [6], an existence result for optimal trajectories is proved, as well as attainability properties in the space of probability measures. Furthermore, a suitable approximation result allows us to give a sense to a Hamilton-Jacobi-Bellman equation in a more general case.
We plan to study whether it is possible to prove a comparison principle for a Hamilton-Jacobi equation solved by the generalized minimum time function, as well as to give a Pontryagin maximum principle for our problem.