On the Approximation Error of Mean-Field Models

Mean-field models have been used to study large-scale and complex stochastic systems, such as large-scale data centers and dense wireless networks, using simple deterministic models (dynamical systems). This paper analyzes the approximation error of mean-field models for continuous-time Markov chains (CTMCs), focusing on mean-field models that are represented as finite-dimensional dynamical systems with a unique equilibrium point. By applying Stein's method and perturbation theory, the paper shows that, under mild conditions, if the mean-field model is globally asymptotically stable and locally exponentially stable, then the mean-square difference between the stationary distribution of the stochastic system of size M and the equilibrium point of the corresponding mean-field system is O(1/M). This result yields a general theorem for establishing the convergence and the approximation error (i.e., the rate of convergence) of a large class of CTMCs to their mean-field limits by examining mainly the stability of the mean-field model, which is a deterministic system and is often easier to analyze than the CTMCs. Two applications of mean-field models in data center networks are presented to demonstrate the novelty of our results.


INTRODUCTION
The mean-field method studies large-scale and complex stochastic systems using simple deterministic models. The idea of the mean-field method is to assume that the states of the nodes in a large-scale system are independently and identically distributed (i.i.d.). Based on this i.i.d. assumption, in a large-scale system, the interaction of a node with the rest of the system can be replaced with an "average" interaction, and the evolution of the system can then be modeled as a deterministic dynamical system, called a mean-field model. The macroscopic behavior of the stochastic system can then be approximated using the mean-field model; e.g., the stationary distribution of the stochastic system may be approximated by the equilibrium point of the mean-field model.

SIGMETRICS '16, June 14-18, 2016, Antibes Juan-les-Pins, France

The mean-field method has important applications in various areas including statistical physics, epidemiology, communication networks, queueing theory, and game theory (e.g., [15,4,3,32,22,19,12,7,1,21,11]). In particular, over the last few years, it has also emerged as a powerful method for analyzing large-scale cloud computing systems and data center networks. For example, in [22,32], mean-field analysis has been used to show that routing each incoming task to the shorter of two randomly sampled servers can significantly reduce queueing delays, a phenomenon called the power-of-two-choices (Po2). The result has been extended to heavy-tailed service-time distributions [9] and to heterogeneous servers [23]. In [31], a mean-field model has been used to quantify the significant benefit of resource pooling. [28] established the asymptotic optimality of join-idle-queue (JIQ), proposed in [20], using a mean-field model. In [34], a novel randomized load balancing algorithm, named Batch-Filling, has been developed for cloud computing systems with batch arrivals. The algorithm achieves delay performance similar to that of the power-of-two-choices with a sampling ratio slightly larger than one (i.e., it samples only slightly more than one server on average for each incoming task). In [33], mean-field analysis has been used to study the virtual machine placement problem in data center networks. In these applications, the systems under consideration are modeled as CTMCs, and the solutions (equilibrium points) of the corresponding mean-field models are then used to approximate the stationary distributions of the CTMCs.
To justify the mean-field analysis, a critical step is to prove that the stationary distribution of the CTMC indeed converges to the equilibrium point of the mean-field model as the size of the system increases. Consider a family of CTMCs. The Mth CTMC is an M-dimensional continuous-time Markov chain W^(M) ∈ U^M, where the superscript M is the number of nodes (also called particles) in the system and U^M ⊆ R^M is the state space of the CTMC. We assume U is a finite state space and the CTMC is irreducible. Without loss of generality, let U = {1, ..., n}. We further define

x_i^(M)(t) = (1/M) Σ_{m=1}^M 1_{W_m^(M)(t) = i},

where 1 is the indicator function, so x_i^(M)(t) is the fraction of nodes in state i at time t. This paper focuses on the case in which x^(M) = {x^(M)(t), t ≥ 0} is also an (n-dimensional) CTMC, i.e., the CTMC is a population process [17,18]. We remark that many applications of the mean-field method, such as those in queueing networks and epidemiology, are for population processes.
Now let x^(M)(∞) denote the stationary distribution of the Mth CTMC. Furthermore, let x(t) denote the solution of an associated mean-field model and x* denote its equilibrium point. Existing approaches for proving the convergence of x^(M)(∞) to x* often involve the following three components.
(1) The first component is to show the convergence of the CTMCs to the trajectory of the mean-field model over any finite time interval [0, t], i.e.,

lim_{M→∞} sup_{0≤s≤t} d(x^(M)(s), x(s)) = 0,    (1)

where d(·, ·) is some measure of distance. This can be proved using different techniques including Kurtz's theorem [17,18,22,34], propagation of chaos [30,2,9], or the convergence of the transition semigroups of the CTMCs [32,23].
(2) The second component is to prove the asymptotic stability of the mean-field model, i.e.,

lim_{t→∞} x(t) = x*.

The Lyapunov theorem or LaSalle's invariance principle can often be used for proving the stability.
(3) After establishing the previous two results, we obtain

lim_{t→∞} lim_{M→∞} x^(M)(t) = lim_{t→∞} x(t) = x*.

The convergence of the stationary distributions can then be concluded if we can prove the interchange of the limits, i.e.,

lim_{M→∞} x^(M)(∞) = lim_{M→∞} lim_{t→∞} x^(M)(t) =(a) lim_{t→∞} lim_{M→∞} x^(M)(t) = x*,

where step (a) is called the interchange of the limits.
Since these approaches are all based on the interchange of the limits and use the finite-time convergence (equality (1)) as the stepping stone, they are indirect methods of proving lim_{M→∞} x^(M)(∞) = x*. For this reason, these approaches can only establish the convergence of mean-field models and the asymptotic behavior of the systems (i.e., for M = ∞). The approximation error (also called the rate of convergence) of mean-field models for finite-size systems (e.g., ||x^(M)(∞) − x*|| for a fixed M) is difficult to obtain using these indirect methods.
This paper tackles this fundamental problem and directly studies the approximation error of a large class of mean-field models using Stein's method [26,27,6], a method for bounding the distance between two probability distributions. Our use of Stein's method for the rate of convergence was inspired by the work of Braverman and Dai [10], who developed a modular framework with three components for steady-state diffusion approximations and established the rate of convergence to diffusion models for M/Ph/n+M queueing systems. The results in this paper also share a similar spirit with the work of Gurvich [14], which establishes the rate of convergence of diffusion models for steady-state approximations of exponentially ergodic Markovian queues. This paper differs from both works in that it considers mean-field models instead of diffusion approximations.
To establish the approximation error, the paper identifies a fundamental connection between the perturbation theory for nonlinear systems and the convergence of mean-field models. The perturbation theory shows that, for a nonlinear system with an exponentially stable equilibrium point, the error of the first-order approximation of the nonlinear system is of order O(ε²), where ε is the scaling factor of the perturbation. It turns out that the mean-square difference between the stationary distribution of the Mth CTMC and the equilibrium point of the mean-field model is related to the cumulative error (integrated over an infinite time horizon) of the first-order approximation of the mean-field model. After quantifying the cumulative error, we establish the following results for finite-dimensional mean-field models.
• If the mean-field model is perfect (see the definition in Section 2), globally asymptotically stable, and locally exponentially stable, then the stationary distributions of the CTMCs converge in the mean-square sense to the equilibrium point of the mean-field model with rate O(1/M) (Theorem 1). Specifically, we have the following result on the approximation error:

E [ Σ_{i=1}^n (x_i^(M)(∞) − x_i*)² ] = O(1/M).    (2)

• If the mean-field model is not perfect, sufficient conditions that guarantee the convergence of the stationary distributions are given in Corollary 1.
We remark that these results differ from the celebrated law of large numbers for Markov chains established by Kurtz [17,18], where the convergence is established for sample paths of the CTMCs over a finite time interval, or for a sequence t_M that increases as M increases [24], not for the stationary distributions of the CTMCs. The contributions of our results are twofold. First, they provide a direct method for studying the convergence of stationary distributions of stochastic systems to their mean-field limits. The method connects the convergence of the CTMCs with the stability of the mean-field model. Note that the mean-field model is a deterministic system, so it is often easier to analyze than the CTMCs. Second, the method quantifies the rate of convergence and provides bounds on the approximation error when the mean-field limit is used to approximate the performance of finite-size systems.
We finally comment that the convergence of stationary distributions of one-dimensional discrete-time Markov chains has been studied in [25]. The approximation error of mean-field models for discrete-time Markov chains has been studied in [8], which, however, focuses on numerical methods for computing the error bounds and does not establish a general analytic answer like (2). Furthermore, an approach similar to Stein's method has been used in [29] to prove the tightness of diffusion-scaled stationary distributions for a two-queue system with many servers. The tightness result in [29] establishes an approximation error of the fluid limit that is of the same order as the approximation error established in this paper. The key differences are that this paper considers mean-field models for population processes (i.e., with many queues instead of two queues) and establishes sufficient conditions for the convergence and the rate of convergence of a large class of systems instead of only for a specific system.

MEAN-FIELD MODELS
Consider an M-dimensional continuous-time Markov chain W^(M) ∈ U^M, where the superscript M is the number of nodes (also called particles) in the system and U^M ⊆ R^M is the state space of the CTMC. We assume U is a finite state space and the CTMC is irreducible. Without loss of generality, we assume U = {1, ..., n}. We further define

x_i^(M)(t) = (1/M) Σ_{m=1}^M 1_{W_m^(M)(t) = i},

where 1 is the indicator function, so x_i^(M)(t) ∈ [0, 1] represents the fraction of nodes in state i at time t. In this paper, we assume x^(M) = {x^(M)(t), t ≥ 0} is an (n-dimensional) CTMC. We use x^(M)(∞) to denote its stationary distribution. Furthermore, we have a mean-field model described by the following autonomous dynamical system:

ẋ = f(x),  x(0) = x ∈ D,    (3)

where D is a compact set. Here, we abuse the notation and use x to denote the initial condition, which simplifies the notation in the analysis later without causing too much confusion. Assume the system has a unique equilibrium point, and let x* denote the equilibrium point. The key idea of the mean-field analysis is to use the solution of this deterministic dynamical system to approximate the behavior of the CTMC when M is large; for example, to use x* to approximate x^(M)(∞).

Let Q_{x^(M), y^(M)} denote the transition rate of the CTMC from state x^(M) to state y^(M). A family of CTMCs is called a density-dependent family of CTMCs if the normalized transition rate depends only on x^(M) and y^(M) but is independent of M (see the detailed definition in [22]). For a density-dependent family of CTMCs, the mean-field model can often be obtained by choosing

f(x) = Σ_{y: y≠x} q_{x,y} (y − x),

because q_{x,y} is the transition rate from x to y and y − x is the change of the system state when such a transition occurs. We next illustrate the idea using an SIS (susceptible-infected-susceptible) model with an external infection source, which is a variation of the original SIS model. Each node is either susceptible (state 0) or infected (state 1), so x_1^(M)(t) is the fraction of infected individuals. We assume the recovery time of an individual follows an exponential distribution with mean 1.
Each infected node randomly selects a node after waiting for a random time that is exponentially distributed with mean 1/β. If the selected node is a susceptible node, it gets infected. Each susceptible node, after it becomes susceptible, gets infected by an external infection source after a random time period that is exponentially distributed with mean 1/α. Therefore, W^(M) and x^(M) are CTMCs. Specifically, writing x = (x0, x1) with x0 + x1 = 1, x^(M) has the following transition rates:

x → x + (1/M)(−1, 1) with rate M(β x1 x0 + α x0) (a susceptible node gets infected);
x → x + (1/M)(1, −1) with rate M x1 (an infected node recovers).

Note that for a given M, computing the stationary distribution of x^(M) is not easy because it has a large state space and the transition rates are nonlinear functions of the states. The SIS model considered above is a density-dependent CTMC, so we consider the following mean-field model:

ẋ0 = x1 − β x1 x0 − α x0,  ẋ1 = β x1 x0 + α x0 − x1.

To solve the mean-field model above, we notice that x0 + x1 = 1 always holds, so we only need to consider

ẋ1 = β x1 (1 − x1) + α (1 − x1) − x1.

The equilibrium point can then be obtained by setting ẋ1 = 0. For example, if α = β = 0.5, then

x* = (x0*, x1*) = (2 − √2, √2 − 1),

which can be used to approximate the fractions of susceptible and infected populations when M is large, i.e., the stationary distribution of x^(M). The simulation results for the fraction of the susceptible population with M = 100, 1,000, and 100,000 are shown in Figure 1, from which the convergence of x0^(M) to 2 − √2 can be seen clearly.
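As a quick numerical check of the equilibrium above, the root of the one-dimensional drift β x1(1 − x1) + α(1 − x1) − x1 can be found by bisection. A minimal sketch (the function names are ours), using α = β = 0.5 as in the example:

```python
import math

def sis_drift(x1, alpha=0.5, beta=0.5):
    """Drift of the infected fraction: beta*x1*x0 + alpha*x0 - x1, with x0 = 1 - x1."""
    x0 = 1.0 - x1
    return beta * x1 * x0 + alpha * x0 - x1

def equilibrium(alpha=0.5, beta=0.5, tol=1e-10):
    """Bisection for the unique root of the drift in (0, 1)."""
    lo, hi = 0.0, 1.0   # drift > 0 at x1 = 0 and drift < 0 at x1 = 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sis_drift(mid, alpha, beta) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x1_star = equilibrium()
x0_star = 1.0 - x1_star
print(x0_star)  # ~0.5858, i.e., 2 - sqrt(2)
```

The computed susceptible fraction matches the closed-form value 2 − √2 from solving the quadratic drift equation.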

STEIN'S METHOD FOR QUANTIFYING THE APPROXIMATION ERROR
In this section, we study the convergence and the approximation error (the rate of convergence) of the CTMCs to a mean-field model using Stein's method and the perturbation theory. Throughout this paper, ||·|| denotes the 2-norm, i.e., ||x|| = √(Σ_i x_i²), and |·| denotes the absolute value. For two vectors a, b ∈ R^n, a · b is the dot product. Furthermore, ∇g(x) denotes the gradient of g(x), ∇x_i(t, x) denotes the gradient of x_i(t, x) with respect to the initial condition x, and ẋ is the derivative with respect to time.
Recall the mean-field model defined in equation (3): ẋ = f(x), x(0) = x. The mean-field model is said to be globally asymptotically stable if, given any initial condition x(0) ∈ D and any ε > 0, there exists t(x(0), ε) such that

||x(t) − x*|| ≤ ε for all t ≥ t(x(0), ε).

The mean-field model is said to be locally exponentially stable if there exist positive constants ε, α, and κ such that, starting from any initial condition with ||x(0) − x*|| ≤ ε,

||x(t) − x*|| ≤ κ ||x(0) − x*|| e^{−αt}.

Let g(x) be the solution to the Poisson equation

∇g(x) · f(x) = Σ_{i=1}^n (x_i − x_i*)².    (4)

Then, the solution has the following form when the integral is finite (see [5,13]):

g(x) = −∫_0^∞ Σ_{i=1}^n (x_i(t, x) − x_i*)² dt,

where x(t, x) is the trajectory of the dynamical system with x as the initial condition. The integral is finite when the mean-field model is asymptotically stable and locally exponentially stable, which will become clear in Section 5. Note that −g(x) can be viewed as the cumulative square deviation of the system state from the equilibrium point when the initial condition is x.

Now let G_{x^(M)} denote the generator of the Mth CTMC; then

G_{x^(M)} g(x) = Σ_{y: y≠x} q_{x,y} (g(y) − g(x)).

Since x^(M) is irreducible and has a finite state space, x^(M) has a stationary distribution. Initializing x^(M)(0) according to its stationary distribution, and using E_{x^(M)}[·] throughout to denote the expectation taken over the stationary distribution, we have

E_{x^(M)} [ G_{x^(M)} g(x^(M)(∞)) ] = 0.    (5)

Then, by taking the expectation of the Poisson equation (4) over the stationary distribution x^(M)(∞) and adding (5) to the equation, we obtain

E_{x^(M)} [ Σ_i (x_i − x_i*)² ] = E_{x^(M)} [ Σ_{y: y≠x} q_{x,y} (g(y) − g(x)) − ∇g(x) · f(x) ].    (6)

Now adding and subtracting ∇g(x) · Σ_{y: y≠x} q_{x,y} (y − x) yields

E_{x^(M)} [ Σ_i (x_i − x_i*)² ] = E_{x^(M)} [ Σ_{y: y≠x} q_{x,y} (g(y) − g(x) − ∇g(x) · (y − x)) ] + E_{x^(M)} [ ∇g(x) · ( Σ_{y: y≠x} q_{x,y} (y − x) − f(x) ) ].    (7)

From the equality above, intuitively, the convergence of E_{x^(M)}[Σ_i (x_i − x_i*)²] to zero as M → ∞ can be established if the following conditions hold:

• Bounded gradient of g(x): ||∇g(x)|| is bounded by a constant independent of M.
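For intuition, the Poisson-equation solution g can be computed numerically for the one-dimensional SIS example of Section 2: g(x) = −∫_0^∞ (x1(t, x) − x1*)² dt, and its derivative times the drift should recover the squared deviation. A rough sketch (step sizes, horizon, and the test point are arbitrary choices; forward-Euler integration is used only for illustration):

```python
import math

X_STAR = math.sqrt(2) - 1  # equilibrium infected fraction for alpha = beta = 0.5

def f(x1):
    # SIS drift of the infected fraction, alpha = beta = 0.5
    return 0.5 * x1 * (1 - x1) + 0.5 * (1 - x1) - x1

def g(x1, dt=1e-3, horizon=40.0):
    """g(x) = -int_0^inf (x1(t, x) - x1*)^2 dt, by forward-Euler integration."""
    total, x, t = 0.0, x1, 0.0
    while t < horizon:
        total += (x - X_STAR) ** 2 * dt
        x += f(x) * dt
        t += dt
    return -total

# Finite-difference check of the Poisson equation: g'(x) * f(x) = (x - x*)^2
x, h = 0.8, 1e-4
lhs = (g(x + h) - g(x - h)) / (2 * h) * f(x)
rhs = (x - X_STAR) ** 2
print(lhs, rhs)
```

The two printed values agree up to discretization error, illustrating that g is well defined and smooth when the mean-field model is exponentially stable.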
• Convergence of the generator: E_{x^(M)} [ ∇g(x) · ( Σ_{y: y≠x} q_{x,y} (y − x) − f(x) ) ] → 0 as M → ∞.

• Bounded transition rate of the CTMC: E_{x^(M)} [ Σ_{y: y≠x} q_{x,y} ] grows at most linearly in M.
• Diminishing first-order approximation error: E_{x^(M)} [ Σ_{y: y≠x} q_{x,y} (g(y) − g(x) − ∇g(x) · (y − x)) ] → 0 as M → ∞. Note that g(x) + ∇g(x) · (y − x) is the first-order Taylor approximation of g(y).
For many CTMCs and the associated mean-field models, the first three conditions mentioned above can be easily verified. In the following theorem, we will prove that the last condition holds when the mean-field model is globally asymptotically stable and locally exponentially stable (see inequality (11)), and then establish the rate of convergence based on that. The following theorem presents the main result of this paper.
Theorem 1. The stationary distributions of the CTMCs converge in the mean-square sense to the equilibrium point of the mean-field model with

E [ Σ_{i=1}^n (x_i^(M)(∞) − x_i*)² ] = O(1/M)

when the following conditions hold:

• Bounded transition-rate condition: There exists a constant c > 0 independent of M such that Σ_{y: y≠x} q_{x,y} ≤ cM for all x.

• Bounded state transition condition: There exists a constant c̄ independent of M such that ||x − y|| ≤ c̄/M for any x and y such that q_{x,y} ≠ 0.
• Perfect mean-field model condition: The mean-field model (3) is given by

f(x) = Σ_{y: y≠x} q_{x,y} (y − x).

• Partial derivative condition: The function f(x) is twice continuously differentiable.
• Stability condition: The mean-field model is globally asymptotically stable and is locally exponentially stable.
Remark 1. The first four conditions are easy to verify, so only the stability condition requires nontrivial work. Since a dynamical system has an exponentially stable equilibrium point if and only if the linearized system (at the equilibrium) is exponentially stable (see Theorem 4.15 in [16]), the local exponential stability can be verified by proving that the linearized system is exponentially stable (e.g., using the Lyapunov method) or verified numerically by calculating the eigenvalues of the state matrix of the linearized mean-field model. Global asymptotic stability is in general studied using the Lyapunov theorem. Two applications of this theorem in data center networks will be presented in Section 4.
Remark 2. It is worth pointing out that if the mean-field model is unstable but the perfect mean-field model assumption holds, then Kurtz's theorem [17,18] indicates that the sample paths of the CTMCs converge to the trajectory of the mean-field model over any finite time interval, which implies that the CTMCs are "unstable" as well.
Proof. We first prove the theorem assuming that the mean-field model is globally exponentially stable, and then extend the argument to the general case. Under the perfect mean-field model assumption, the second term in (7) vanishes, so equation (7) becomes

E_{x^(M)} [ Σ_i (x_i − x_i*)² ] = E_{x^(M)} [ Σ_{y: y≠x} q_{x,y} (g(y) − g(x) − ∇g(x) · (y − x)) ].    (8)

We next focus on the term g(y) − g(x) − ∇g(x) · (y − x), which, using the integral form of g, equals

−∫_0^∞ Σ_i [ (x_i(t, y) − x_i*)² − (x_i(t, x) − x_i*)² − 2 (x_i(t, x) − x_i*) ∇x_i(t, x) · (y − x) ] dt.

Note that we exchanged the order of integration and differentiation for the third term. This can be done because

∫_0^∞ Σ_i | (x_i(t, x) − x_i*) ∇x_i(t, x) · (y − x) | dt

is finite, which can be proved using the fact that both (x_i(t, x) − x_i*) and ∇x_i(t, x) decay exponentially fast to zero as t increases (apply inequalities (20) and (31) with z = 1), and the fact that ||y − x|| is bounded due to the bounded state transition condition.
We next define e(t) to be the error of the first-order approximation of the perturbed trajectory, i.e.,

e(t) = x(t, y) − x(t, x) − ε x^(1)(t),

where ε = 1/M, z = (1/ε)(y − x), and x^(1)(t) is the first-order term defined in Section 5 with x^(1)(0) = z. According to the perturbation theory, in particular inequality (34), when the system is exponentially stable, ||e(t)|| decays exponentially in t and is of order ||y − x||². According to the bounded state transition condition, ||x − y|| ≤ c̄/M.
Furthermore, both ∇x_i(t, x) and x_i(t, x) are bounded (see inequalities (20) and (31)) by constants independent of M and t. Therefore, we can choose a constant b and a sufficiently large M̄ such that, for any M ≥ M̄,

|g(y) − g(x) − ∇g(x) · (y − x)| ≤ b ∫_0^∞ Σ_i |e_i(t)| dt ≤ b √n ∫_0^∞ ||e(t)|| dt,

where the last inequality is based on the following relation between the 1-norm and the 2-norm: Σ_i |e_i(t)| ≤ √n ||e(t)||. In Section 5 (in particular, inequality (35)), we will show that under the exponential stability assumption,

∫_0^∞ ||e(t)|| dt = O(||y − x||²).

From the bounded state transition condition, ||y − x||² ≤ c̄²/M². Now, according to inequality (31), there exist positive constants b₁ and b₂, both independent of M, such that the cumulative error is bounded by (b₁ + b₂)/M². Therefore, we can conclude that

g(y) − g(x) − ∇g(x) · (y − x) = O(1/M²),

which implies that

E_{x^(M)} [ Σ_{y: y≠x} q_{x,y} (g(y) − g(x) − ∇g(x) · (y − x)) ] ≤ E_{x^(M)} [ Σ_{y: y≠x} q_{x,y} ] · O(1/M²).

Finally, using the bounded transition-rate condition, we conclude

E [ Σ_i (x_i^(M)(∞) − x_i*)² ] = O(M) · O(1/M²) = O(1/M).

Now consider the case in which the mean-field model is not globally exponentially stable, but is globally asymptotically stable and locally exponentially stable. Recall that D ⊆ [0, 1]^n is compact. According to the definition of global asymptotic stability (Definition 4.4 in [16]), given any ε > 0, there exists a finite time t′ such that ||x(t) − x*|| ≤ ε for any t ≥ t′. For any finite t, following an analysis similar to that in Section 5 (or Section 10.1 in [16]), ||e(t, x)|| = O(1/M²) holds. Therefore, we can bound the term in (8) by separating the integration into two intervals: from 0 to t′ and from t′ to ∞, where t′ is chosen such that x(t) converges exponentially to the equilibrium point after t′. Since ||e(t′, x)|| = O(1/M²), the analysis above applies to the integration over [t′, ∞). Hence, the result holds.
Example: Let us go back to the SIS model introduced in Section 2. A closed-form solution can be obtained for x0(t). Again assume α = β = 0.5; then the solution of the ordinary differential equation converges to 2 − √2 as t → ∞, independent of x0(0). Therefore, it is easy to verify that the system is globally asymptotically stable. Furthermore, the linearized system at the equilibrium is

ε̇0 = −√2 ε0,

where ε0 = x0 − x0* and x0* is the equilibrium value, so the equilibrium point is locally exponentially stable. Furthermore, the mean-field model is perfect in this case, and it can be easily verified that all other conditions in Theorem 1 hold. So, in the mean-square sense, the stationary distributions converge to the equilibrium point x* = (2 − √2, √2 − 1). The quantity M E_{x^(M)} [ Σ_{i=1}^n (x_i − x_i*)² ] is shown in Figure 2, where M varies from 100 to 1,000. We can see that M E_{x^(M)} [ Σ_{i=1}^n (x_i − x_i*)² ] varies within the interval [0.21, 0.27] while the size of the system increases by 10 times (from 100 to 1,000). The standard deviation (deviation from 2 − √2) is 0.02177 (3.72% × (2 − √2)) when M = 100 and is 0.0068 (1.16% × (2 − √2)) when M = 1,000. From this example, we can see that the mean-field limit is a good approximation of the system when the size of the system is moderately large, and the mean-square approximation error in this example is around 1/(4M).

Theorem 1 requires a perfect mean-field model and bounded state transitions. Both conditions can be relaxed, but the rate of convergence will be different.

Corollary 1. Assume the partial derivative condition and the stability condition in Theorem 1 hold. The stationary distributions of the CTMCs converge (in the mean-square sense) to the equilibrium point of the mean-field model, i.e.,

E [ Σ_{i=1}^n (x_i^(M)(∞) − x_i*)² ] → 0 as M → ∞,

when the additional conditions (14)-(16) also hold. We say that the mean-field model is asymptotically accurate when condition (14) holds, which replaces the perfect mean-field model condition. Conditions (15) and (16) replace the bounded state transition condition.
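The roughly 1/(4M) scaling reported above can be reproduced with a short steady-state simulation of the SIS CTMC. A sketch under our reading of the model (with k infected nodes, the total infection rate is β k(M − k)/M + α(M − k) and the total recovery rate is k); run length, burn-in, and seed are arbitrary choices:

```python
import random

def simulate_sis(M, alpha=0.5, beta=0.5, t_end=2000.0, t_burn=200.0, seed=1):
    """Gillespie simulation of the SIS CTMC; returns the time-averaged
    squared deviation of the infected fraction from x1* = sqrt(2) - 1."""
    rng = random.Random(seed)
    x1_star = 2 ** 0.5 - 1
    k = M // 2                 # number of infected nodes
    t, acc, tot = 0.0, 0.0, 0.0
    while t < t_end:
        up = beta * k * (M - k) / M + alpha * (M - k)  # infections
        down = float(k)                                # recoveries
        rate = up + down
        dwell = rng.expovariate(rate)
        if t > t_burn:
            w = min(dwell, t_end - t)
            acc += (k / M - x1_star) ** 2 * w
            tot += w
        t += dwell
        k += 1 if rng.random() < up / rate else -1
    return acc / tot

M = 200
est = simulate_sis(M)
print(M * est)  # stays roughly constant as M grows
```

The scaled error M·E[(x1 − x1*)²] stays of the same order as M varies, consistent with the O(1/M) bound of Theorem 1.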
Proof. First recall equality (7). By choosing z = 1 in Section 5, it is easy to verify from inequality (31) that max_x ||∇g(x)|| is upper bounded by a constant independent of M. Therefore, under condition (14), the term (17) → 0 as M → ∞.
A careful examination of inequality (35) shows that the same bound on the cumulative error still holds in this setting. So under condition (16), the cumulative error vanishes as M → ∞. When condition (16) holds, following the analysis that leads to inequality (10), we can again show that there exists a constant b̄ independent of M bounding the first-order approximation error of g. According to inequality (31), the gradient bound also holds. Therefore, the remaining term converges to zero according to condition (15). Hence, the corollary holds.
Remark 3. When the mean-field model is asymptotically accurate, the convergence rate depends on the convergence rates of (14) and (15).

APPLICATIONS IN DATA CENTER NETWORKS
In this section, we demonstrate the novelty of Theorem 1 by considering two applications in data center networks: the power-of-two-choices [22,32] and the virtual machine placement problem [33]. For both problems, mean-field models have been used to analyze the performance of the systems in the infinite-server regime, but the approximation errors of the mean-field limits for systems with a finite number of servers were unknown.

The power-of-two-choices for servers with finite buffer
In [22,32], the authors considered a data center network with M identical servers, as shown in Figure 3. Assume tasks arrive at the data center following a Poisson process with rate λM and the processing time of each task is exponentially distributed with mean processing time µ = 1. Each server maintains a queue, and Qm(t) denotes the queue size of server m at time t. For each incoming task, the router (also called the scheduler) randomly samples two servers and dispatches the task to the server with the smaller queue size. In this setting, Q(t) is a CTMC and is a population process.

Figure 3: The system has M servers. When a task comes in, the scheduler samples two servers and routes the task to the server with the shorter queue. In this example, the scheduler probes server 1 and server M and routes the task to server 1.
Let s_k^(M)(t) denote the fraction of servers with queue size at least k. Based on the mean-field analysis, it has been shown in [22,32] that s_k^(M)(∞) converges to s_k* = λ^(2^k − 1), where s_k* is the equilibrium point of the following mean-field system:

ṡ_k = λ(s_{k−1}² − s_k²) − (s_k − s_{k+1}),  k ≥ 1,  with s_0 = 1.

The mean-field model above is an infinite-dimensional system, so Theorem 1 does not apply. We instead consider finite-buffer servers with buffer size B, for which the following mean-field model is a perfect mean-field model for the finite-buffer system:

ṡ_k = λ(s_{k−1}² − s_k²) − (s_k − s_{k+1}),  1 ≤ k ≤ B,  with s_0 = 1 and s_{B+1} = 0,

and the equilibrium point satisfies the conditions

λ((s_{k−1}*)² − (s_k*)²) = s_k* − s_{k+1}*,  1 ≤ k ≤ B.

The existence and uniqueness of the solution has been proved in [22].
Define the Lyapunov function to be

V(t) = Σ_{k=1}^B w_k |s_k(t) − s_k*|.

The existence of such w_k > 0 and δ > 0 for the infinite-dimensional mean-field model has been proved in [22]. The same w_k and δ can be used in the finite-dimensional system as well. Following an analysis similar to that in [22], we obtain V̇(t) ≤ −δV(t), which implies that

V(t) ≤ V(0) e^{−δt}.

So the system is globally exponentially stable. The other conditions in Theorem 1 can be easily verified, so the approximation error bound of Theorem 1 applies.
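The finite-buffer equilibrium can also be computed numerically by integrating the mean-field ODE to its fixed point. A sketch assuming the drift ṡ_k = λ(s_{k−1}² − s_k²) − (s_k − s_{k+1}) with s_0 = 1 and s_{B+1} = 0 as above; λ, B, and the step size are arbitrary choices:

```python
def po2_fixed_point(lam=0.7, B=10, dt=0.05, steps=40_000):
    """Forward-Euler integration of the finite-buffer power-of-two-choices
    mean-field ODE until (near) equilibrium; s[k-1] stores s_k."""
    s = [lam] * B                        # any starting point in [0, 1]^B works
    for _ in range(steps):
        drift = []
        for k in range(B):
            prev = 1.0 if k == 0 else s[k - 1]
            nxt = 0.0 if k == B - 1 else s[k + 1]
            drift.append(lam * (prev ** 2 - s[k] ** 2) - (s[k] - nxt))
        s = [x + dt * d for x, d in zip(s, drift)]
    return s

lam, B = 0.7, 10
s_star = po2_fixed_point(lam, B)
# For large B the fixed point is close to the infinite-buffer value lam^(2^k - 1).
print(s_star[0])  # close to lam = 0.7
```

Because the equilibrium tail λ^(2^k − 1) decays doubly exponentially, the finite-buffer fixed point is numerically indistinguishable from the infinite-buffer one for moderate B.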

Virtual machine placement in cloud computing systems
In [33], the authors considered a data center network with M identical servers, where each server has B units of resources and can host at most B virtual machines (VMs), as shown in Figure 4. Assume VM requests arrive according to a Poisson process with rate λM and the lifetime of each VM is exponentially distributed with mean lifetime µ = 1. Let Qm(t) denote the number of VMs hosted at server m at time t. For each incoming request, the router (also called the scheduler) randomly samples two servers and dispatches the request to the server with the smaller number of VMs. If both servers already host B VMs, the request is blocked. In this setting, Q(t) again is a CTMC and is a population process.
Let s_k^(M)(t) denote the fraction of servers with at least k VMs. Based on the mean-field analysis, it has been shown in [33] that s_k^(M)(∞) weakly converges to s_k*, where s_k* is the equilibrium point of the corresponding mean-field system.

Figure 4: The system has M servers, and each server can host at most three VMs. The scheduler samples two servers and routes the VM request to the server with the smaller number of VMs. In this example, the scheduler routes the VM request to server 1. The request is blocked if both servers are full.

This is a finite-dimensional mean-field model, so Theorem 1 can be applied. The equilibrium point in this case can be solved recursively, but a closed-form expression is difficult to obtain. The asymptotic stability of the system has been proved in [33]. We now consider the linearized system at the equilibrium, written in terms of x_k = s_k − s_k*, and define a weighted Lyapunov function of the absolute deviations |x_k|. Bounding its derivative separately in the cases x_k > 0 and x_k < 0 yields the needed decay estimates.
It is not difficult to verify that the same inequalities hold when x_k = 0. Therefore, we have V̇(t) ≤ −δV(t) for some δ > 0, so the equilibrium point is (locally) exponentially stable. The other conditions in Theorem 1 again can be easily checked, so the approximation bound applies.

For both systems, the convergence of the stationary distributions to the mean-field limits had been proved in the literature based on the interchange of the limits, but the approximation errors (or the rates of convergence) were unknown. The result in this paper not only establishes the approximation errors but also significantly reduces the required analysis; in particular, both the convergence in finite time and the interchange of the limits are no longer needed. The purpose of presenting these two applications is to demonstrate the novelty of our result. The mean-field analysis of these two systems has been established in the literature and is not the focus of this paper. We therefore omitted the details and presented only the key steps. The simulation results that demonstrate the convergence of the finite systems can be found in the original papers, so they are not presented here due to the page limit.
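A fixed point for a VM-placement-style mean-field model can be computed the same way as in the previous subsection. The drift below, ṡ_k = λ(s_{k−1}² − s_k²) − k(s_k − s_{k+1}), is our assumed form (each of the k hosted VMs departs at rate 1), not necessarily the exact system of [33]; λ, B, and the step size are arbitrary:

```python
def vm_fixed_point(lam=2.0, B=5, dt=0.02, steps=100_000):
    """Forward-Euler integration of an assumed VM-placement mean-field drift
    ds_k/dt = lam*(s_{k-1}^2 - s_k^2) - k*(s_k - s_{k+1}),
    with s_0 = 1 and s_{B+1} = 0, to (near) equilibrium."""
    s = [0.5] * B
    for _ in range(steps):
        new = []
        for k in range(B):
            prev = 1.0 if k == 0 else s[k - 1]
            nxt = 0.0 if k == B - 1 else s[k + 1]
            d = lam * (prev ** 2 - s[k] ** 2) - (k + 1) * (s[k] - nxt)
            new.append(s[k] + dt * d)
        s = new
    return s

s_star = vm_fixed_point()
print(s_star)
```

As noted in the text, the equilibrium can be solved recursively but has no simple closed form; direct integration avoids the recursion entirely.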

THE PERTURBATION THEORY
In this section, we summarize results from the perturbation theory for nonlinear systems. These results are special cases of those presented in [16], because we only need to consider a perturbation to the initial condition. Furthermore, the mean-field model considered in this paper is an autonomous system, which again is a special case of the nonlinear systems considered in [16]. For these reasons, the analysis of the perturbation results can be simplified. On the other hand, the perturbation method introduced in [16] only states that the 2-norm of the error ||e(t)|| is of order ||y − x||², independent of t (under certain conditions). Our result on the rate of convergence, however, requires an upper bound on the cumulative error, i.e., an upper bound on ∫_0^∞ ||e(t)|| dt.
Therefore, it is necessary to go through the detailed analysis for the system considered in this paper to establish the result for the cumulative error. For completeness and for the reader's easy reference, we next present the perturbation results of [16] with a more detailed calculation of ||e(t)||, which shows not only that the approximation error is bounded, but also that the upper bound decays exponentially to zero as t increases. The analysis closely follows [16].
Consider the system

ẋ = f(x),    (19)

where f : D ⊆ [0, 1]^n → R^n. Without loss of generality, we assume x* = 0. We are interested in comparing the solution of this nominal system with that of the system with a perturbed initial condition x(0) = x + εz, where z = (1/ε)(y − x) is an n-dimensional vector. For the mean-field analysis considered in this paper, ε = 1/M, and under the conditions of Theorem 1, ||z|| is bounded for any neighboring states x and y. Let x(t, ε) denote the solution of the dynamical system with initial perturbation ε. Note that the dependence of the solution on y − x is omitted to simplify the notation; the analysis holds for any y and x. We first repeat the assumptions on the nominal dynamical system.

Assumption 1. For any i, the function f_i(x) is twice continuously differentiable. Therefore, the Jacobian matrix of f(x), denoted by ∂f/∂x, is Lipschitz; in other words, there exists a constant L > 0 such that

||(∂f/∂x)(x) − (∂f/∂x)(y)|| ≤ L ||x − y||.

Assumption 2. The dynamical system (19) has a unique equilibrium point and is exponentially stable; in other words, there exist positive constants α and κ such that, starting from any initial condition x(0) ∈ D,

||x(t)|| ≤ κ ||x(0)|| e^{−αt}.    (20)

Under this assumption, according to Theorem 4.14 in [16], there exist a Lyapunov function V(x) and positive constants c_u, c_l, c_d, and c_p such that, for any x ∈ D, the following inequalities hold:

c_l ||x||² ≤ V(x) ≤ c_u ||x||²,  ∇V(x) · f(x) ≤ −c_d V(x),  ||∇V(x)|| ≤ c_p ||x||.

We first consider the finite Taylor series for x(t, ε) in terms of ε:

x(t, ε) = x^(0)(t) + ε x^(1)(t) + e(t),    (21)

where e(t) is the remainder. Substituting (21) into the dynamical system equation and matching powers of ε for x^(≤1) = (x^(0), x^(1)), the zero-order term is given by

ẋ^(0) = f(x^(0)),  x^(0)(0) = x,

which is the nominal system without the perturbation on the initial condition. The first-order term is given by

ẋ^(1) = (∂f/∂x)(x^(0)(t)) x^(1),  x^(1)(0) = z,    (25)

where ∂f/∂x is the Jacobian matrix. We next study e(t) = x(t, ε) − x^(0)(t) − ε x^(1)(t). Combining the results above, we obtain

ė(t) = f(x(t, ε)) − f(x^(0)(t)) − ε (∂f/∂x)(x^(0)(t)) x^(1)(t),

which can be decomposed as f(e(t)) plus residual terms ρ(t) and γ(t); note that both ρ and γ are n-dimensional vectors.
It is easy to see that ρ(t) and γ(t) are second-order residuals of the Taylor expansion: according to Taylor's theorem and the mean value theorem, each component can be written in terms of the Hessian matrix of the corresponding f_l(x), evaluated at a point ẽ(t) = a e(t) for some 0 ≤ a ≤ 1 (by the mean value theorem and (27)). According to the Lipschitz condition in Assumption 1 and the Cauchy-Schwarz inequality, these residuals can be bounded in terms of ||x^(0)(t)||, ||x^(1)(t)||, and ||e(t)||.

Now we utilize the assumption that the nominal system (19) converges to the equilibrium point exponentially fast from any initial condition in the domain, and use the Lyapunov function in Assumption 2 to bound ||e(t)||. We start from

V̇(e(t)) = ∇V(e(t)) · ė(t) = ∇V(e(t)) · f(e(t)) + ∇V(e(t)) · (ė(t) − f(e(t))) ≤(a) −c_d V(e(t)) + ||∇V(e(t))|| ||ė(t) − f(e(t))||,

where inequality (a) is due to Assumption 2 and the last step is a result of the Cauchy-Schwarz inequality. Note that, based on Assumption 1 and the mean value theorem, ||ė(t) − f(e(t))|| is bounded in terms of the trajectories above. Summarizing the results, we get

V̇(e(t)) ≤ −c_d V(e(t)) + L ||∇V(e(t))|| ( ||x^(0)(t)|| + ε ||x^(1)(t)|| + 2 ||e(t)|| ) ||e(t)||.

Define W(t) = √(V(e(t))); then we obtain a linear differential inequality for W. By the comparison lemma in [16], we have

W(t) ≤ φ(t, 0) W(0) + ∫_0^t φ(t, τ) A(τ) dτ = ∫_0^t φ(t, τ) A(τ) dτ,

where φ(t, τ) is the transition function of the comparison system and the equality holds because e(0) = 0.
The following lemma proves that the first-order system (25) converges exponentially starting from any initial condition in D.
Lemma 1. The first-order system (25) is exponentially stable for any solution x (0) (t) that starts from D.
Consider the set of times on which ||e(t)|| ≤ c_d c_l / (4 c_p L); on this set, the stated bound holds, where the last inequality follows from (20) and (31). Recall inequality (29). Substituting the bounds on φ(t, τ) and A(τ), we obtain an exponentially decaying bound on ||e(t)|| on this set, and a corresponding bound otherwise.
It is easy to see that, with properly defined constants α1, α2, α3, and α4, we obtain an upper bound on ∫_0^∞ ||e(t)|| dt. We keep the terms ||x(0)|| and α to show that the cumulative error depends on the initial condition and the convergence rate of the mean-field model. Furthermore, p = z^T P z ≤ λ_max(P) ||z||².
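The O(ε²) behavior of the first-order approximation error e(t) can be verified numerically on the one-dimensional SIS drift (for the infected fraction, α = β = 0.5): integrate the nominal trajectory, the first-order (variational) system (25), and the perturbed trajectory together, and check that halving ε roughly quarters the error. A sketch with arbitrarily chosen parameters:

```python
def f(x):
    # SIS drift for the infected fraction, alpha = beta = 0.5
    return -0.5 * x * x - x + 0.5

def df(x):
    # derivative of the drift (1-d Jacobian)
    return -x - 1.0

def first_order_error(x0, z, eps, t_end=5.0, dt=1e-3):
    """sup_t |x(t, x0 + eps*z) - x0(t) - eps*x1(t)|, where x1 solves the
    variational equation x1' = f'(x0(t)) * x1 with x1(0) = z."""
    xa, x_nom, x1 = x0 + eps * z, x0, z
    worst, t = 0.0, 0.0
    while t < t_end:
        worst = max(worst, abs(xa - x_nom - eps * x1))
        xa += f(xa) * dt
        x1 += df(x_nom) * x1 * dt   # Jacobian evaluated on the nominal trajectory
        x_nom += f(x_nom) * dt
        t += dt
    return worst

e1 = first_order_error(0.8, 1.0, 0.02)
e2 = first_order_error(0.8, 1.0, 0.01)
print(e1 / e2)  # close to 4: halving eps quarters the error
```

Since the first-order term is the exact derivative of the (discrete) flow map with respect to the initial condition, the residual is second order in ε regardless of the integration step.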

CONCLUSION
This paper studies the approximation error of a large class of mean-field models. When the mean-field model is perfect, the mean-square difference between the stationary distribution and the mean-field equilibrium (whose decay in M is also called the rate of convergence) has been proved to be O(1/M). Based on Stein's method for bounding the distance between probability distributions and the perturbation theory for nonlinear systems, a fundamental connection between the convergence to the mean-field limit and the stability of the mean-field model has been established. Two applications of mean-field models for large-scale data center networks were discussed to demonstrate the novelty of our results.