Investigating Fluid-Flow Semantics of Asynchronous Tuple-Based Process Languages for Collective Adaptive Systems

Recently, there has been growing interest in nature-inspired interaction paradigms for Collective Adaptive Systems, for modelling and implementation of adaptive and context-aware coordination, among which the promising pheromone-based interaction paradigm. System modelling in the context of such a paradigm may be facilitated by the use of languages in which adaptive interaction is decoupled in time and space through asynchronous buffered communication, e.g. asynchronous, repository- or tuple-based languages. In this paper we propose a differential semantics for such languages. In particular, we consider an asynchronous, repository-based modelling kernel-language which is a restricted version of LINDA, extended with stochastic information about action duration. We provide stochastic formal semantics for both an agent-based view and a population-based view. We then derive an ordinary differential equation semantics from the latter, which provides a fluid-flow deterministic approximation for the mean behaviour of large populations. We show the application of the language and the ODE analysis on a benchmark example of foraging ants.


Introduction and Related Work
Collective Adaptive Systems (CAS) are systems typically composed of a large number of heterogeneous agents with decentralised control and varying degrees of complex autonomous behaviour. Agents may compete for shared resources and, at the same time, collaborate to reach common goals. The pervasive nature of CAS, together with the importance of the role they play, for instance at the very core of the ICT support for smart cities, implies that a serious a priori analysis (and, consequently, modelling) of the design of any such system must be performed and that all critical aspects of its behaviour must be carefully investigated before the system is deployed.
This research has been partially funded by the EU project QUANTICOL (nr. 600708), and the IT MIUR project CINA.
Recently, there has been growing interest in nature-inspired interaction paradigms for CAS, for enforcing adaptive and context-aware coordination. Among these, those based on the metaphor of pheromones seem promising. System modelling in the context of such a paradigm may be facilitated by the use of languages in which adaptive interaction is decoupled in time and space through asynchronous buffered communication, e.g. tuple-based languages, à la LINDA [5]. For systems of limited size, several languages have already been proposed in the literature and have proven useful for modelling, as well as programming, autonomic adaptive coordination. Examples include KLAIM [10], which extends LINDA with, among others, a notion of space; the TOTA framework [21], which additionally provides explicit adaptive tuple propagation mechanisms and a sort of force-field view of tuples; and SCEL [8], where the basic interaction paradigm is enriched with a flexible, predicate-based addressing mechanism, with a framework for defining policies, and with a notion of tuple-space which is extended to a more general knowledge-space. Additionally, quantitative extensions of both KLAIM and SCEL have been developed, namely StoKLAIM [11] and StocS [20], where the quantity of interest is the duration of (the execution of) process actions. Such durations are assumed to be continuous random variables with negative exponential distributions, as commonly used in stochastic process algebra [17]. Consequently, each such random variable is fully characterised by its rate, a positive real value equal to the inverse of the mean duration of the execution of the action. This choice for action durations gives rise to a Markovian semantics for the languages: the behaviour of each agent of a system is modelled by a continuous time Markov chain (CTMC). The collective behaviour of a system of agents is also modelled by a CTMC, obtained as a suitable combination of those of the component agents.
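The relation between rates and mean durations, and the way independent exponentially distributed actions compete, can be illustrated with a minimal Python sketch. The function names and the action labels `out_a` and `out_b` are hypothetical; the sketch only encodes the standard facts that the mean of an exponential duration is the inverse of its rate and that, in a race, the exit rate is the sum of the rates.

```python
def mean_duration(rate: float) -> float:
    """Mean of an exponentially distributed duration with the given rate."""
    if rate <= 0:
        raise ValueError("a rate must be a positive real value")
    return 1.0 / rate


def race(rates: dict) -> tuple:
    """Race between independent exponentially distributed actions.

    Returns the exit rate (the sum of all rates) and, for each action,
    the probability that it is the first one to complete.
    """
    exit_rate = sum(rates.values())
    return exit_rate, {action: r / exit_rate for action, r in rates.items()}


# Hypothetical example: two competing actions with rates 9 and 4.
exit_rate, win = race({"out_a": 9.0, "out_b": 4.0})
```

Under these assumptions the time until the first of the two actions completes is again exponentially distributed, with rate `exit_rate`.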
Unfortunately, as soon as the size of the systems under consideration grows, the infamous combinatorial state space explosion problem makes system modelling and analysis essentially infeasible. On the other hand, one of the key features of CAS is the large size of their component populations. Consequently, scalability of modelling (and, most of all, analysis) techniques and tools becomes a must in the context of CAS design and development. It is thus essential to develop alternative approaches for modelling systems with large populations of agents, possibly based on, and formally linked to, process algebra. In this way, one can try to extend to such alternative approaches the modelling and analysis techniques which have proven effective for standard stochastic process algebra, such as stochastic model-checking of probabilistic temporal logics. One way to deal with large population systems is the so-called fluid-flow approach, which consists in computing a deterministic approximation of the mean behaviour of the large population [2]. The first step is to abstract from agent identity and to look only at the number of agents in a particular state, for each of the possible states of the agents in the population and at any point in time. Then, a further step is performed by approximating the average values of such numbers by means of a deterministic, continuous function of time, which is obtained as the solution of an initial value problem where the set of ordinary differential equations (ODE) is derived from the system model and the initial condition is the initial distribution of the population over the set of local states of the agents. Prominent examples of the fluid-flow approach are the differential (i.e. ODE-based) semantics version of the Performance Evaluation Process Algebra (PEPA) [23], which we will call ODE-PEPA, Bio-PEPA [7] and, more recently, PALOMA [13].
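The first step of the fluid-flow approach, abstracting from agent identity, can be sketched in a few lines of Python. The helper name `population_view` and the example states are illustrative assumptions, not part of any of the cited languages.

```python
from collections import Counter


def population_view(agent_states, local_states):
    """Counting abstraction: forget which agent is in which state and keep
    only the number of agents in each local state, in a fixed order."""
    counts = Counter(agent_states)
    return [counts[s] for s in local_states]


# Hypothetical system of five agents, each in one of two local states.
agents = ["Reader", "Comp", "Reader", "Reader", "Comp"]
x = population_view(agents, ["Reader", "Comp"])
```

Any permutation of the agents yields the same vector, which is why the size of the population-level state does not grow with the number of agents.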
The advantage of a fluid-flow approach is that the transient average behaviour of the system can be analysed orders of magnitude faster than by stochastic simulation, where the mean of a usually large number of simulation traces must be computed. The cost of fluid-flow analysis is independent of the size of the involved populations, as long as this size is large enough to provide a good deterministic approximation [2].
In this paper we explore the possibility of a differential semantics for languages with an asynchronous buffered communication interaction paradigm, e.g. data-repository-/tuple-based ones. We present OdeLinda, a simple experimental language based on a LINDA-like asynchronous paradigm, where processes interact only via a data repository by means of out, in and read operations for, respectively, inserting data values into, withdrawing them from, and reading them from the repository. In particular, we present a quantitative, Markovian language; the behaviour of each agent is modelled by a Markov process.
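The effect of the three operations on a repository can be illustrated by a minimal Python sketch that treats the repository as a multiset of ground values. The class and method names are hypothetical, timing is ignored, and an operation that is not enabled simply returns `False`.

```python
from collections import Counter


class Repository:
    """A data repository as a multiset of ground values (illustrative only)."""

    def __init__(self, items=()):
        self._items = Counter(items)

    def out(self, value):
        """Insert one more copy of value; always enabled."""
        self._items[value] += 1

    def in_(self, value) -> bool:
        """Withdraw one copy of value; enabled only if a copy is present."""
        if self._items[value] == 0:
            return False
        self._items[value] -= 1
        return True

    def read(self, value) -> bool:
        """Test for the presence of value without consuming it."""
        return self._items[value] > 0

    def count(self, value) -> int:
        """Multiplicity of value in the repository."""
        return self._items[value]


repo = Repository(["a"])
repo.out("a")            # two copies of "a"
seen = repo.read("a")    # enabled, nothing is consumed
taken = repo.in_("a")    # enabled, one copy is consumed
blocked = repo.in_("b")  # not enabled: no "b" in the repository
```

Since no templates or pattern matching are involved, `in_` and `read` act as pure synchronisations on the presence of a value, with and without consumption respectively.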
In most stochastic process languages, each action is decorated with its rate, which is typically a constant. In OdeLinda, instead, action rates are allowed to depend on the global state of the complete system; thus they are functions from global system states to positive real values, in a similar way as in Bio-PEPA. We provide a formal definition of the Markovian semantics using State-to-Function Labeled Transition Systems (FuTS) [9], an approach that provides for a simple and concise treatment of transition multiplicities (an issue closely related to the CTMC principle of race condition) and a clean and compact definition of the semantics.
We follow the fluid-flow approach for making the language scalable, in order to be able to deal with CAS. We define a population semantics for OdeLinda from which a differential (ODE) semantics is derived, in a similar way as proposed in [13] for PALOMA and in [23] for ODE-PEPA. The interaction paradigm underlying OdeLinda is fundamentally different from those of ODE-PEPA, Bio-PEPA and PALOMA: ODE-PEPA is based on the well-known PEPA process interaction paradigm, with processes synchronising on common, specific activities; Bio-PEPA is based on the chemical-reaction paradigm; PALOMA agents use message multicasting. Additionally, both Bio-PEPA and PALOMA provide some simple means for spatial modelling. Spatial information is currently not incorporated in OdeLinda.
In tuple-based approaches, data repositories are typically multisets of values, and adding/withdrawing a value to/from the repository increases/decreases the multiplicity of that value in the repository. In a "population"-oriented view, this means that the total system population size may change during the computation, i.e. we are dealing with a birth-death type of system. This is the case for OdeLinda and constitutes another distinguishing feature when compared with e.g. ODE-PEPA. In this respect, our proposal is more similar to sCCP [3], although, from a technical point of view, for the actual definition of the differential semantics we followed the approach used in [23,13] rather than that presented in [3]. Finally, our work is also related to PALPS [1] and MASSPA [15]. PALPS is a language for ecological models, for which only an individual-based semantics is available. The language is thus usable only in the specific domain of ecological models and, furthermore, suffers from a serious lack of scalability. MASSPA [15] shares some features with PALOMA, e.g. a multicast-like interaction paradigm, but lacks a Markovian, individual-based semantics.
It is worth noting that the language we present here is a minimal kernel language; we intended to address only the basic issues which arise when defining a differential semantics for tuple-based asynchronous languages. For this reason, operations on data, and in particular templates and pattern-matching, are not considered, so that in and read operations result in pure synchronisation actions (with or without value consumption, respectively). The unconstrained use of templates and pattern-matching, as well as the use of general operations on data types, could result in an unbounded number of distinct values in a model, which, in turn, would require an unbounded number of differential equations in the differential semantics. Consequently, only ground terms are allowed in model specifications. This does not imply that we allow only finite computations or that there are bounds on the multiplicity of each piece of data or on the resulting state spaces. In fact, the number of copies of any given value which can be stored in a repository by means of repeated executions of out actions in a computation, by one or more processes, is unbounded (and may be infinite for infinite computations). One should also keep in mind that OdeLinda is intended to be a process modelling language rather than a programming language and that, differently from most process modelling languages, which typically do not provide any feature for dealing with data, it offers some means, although primitive, for data storage, withdrawal and retrieval. For the sake of simplicity, we also refrain from considering process spawning, although this would not cause particular problems given that the semantic model we use deals with dynamic population sizes in a natural way. The objective of the present paper is to show that the basic notion of ODE semantics for asynchronous, shared-repository based languages is well founded.
Additionally, by revisiting the benchmark example of Foraging Ants, we show that even in the restricted form we present in this paper, OdeLinda can be useful for actual system modelling and analysis.
The present paper is organised as follows: the syntax and Markovian, individual-based semantics of OdeLinda are presented in Section 2; the differential semantics of the language are presented in Section 3. An example of model specification as well as ODE analysis is given in Section 4. Finally, some conclusions and considerations for future work are discussed in Section 5.

Syntax and Markovian Semantics of OdeLinda
We recall that the main purpose of this paper is to show the basic principles for the definition of a differential semantics of asynchronous repository-based languages rather than the definition of a complete, high-level, ready-to-use process language. Consequently, the language we present here is a very minimal one, although, as pointed out in Section 1 and shown in Section 4, it can be used for the effective modelling of typical CAS such as foraging ants.

Syntax
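Let $\mathcal{D}$ be a denumerable non-empty set of data values, ranged over by $d, d', \ldots$; let $A_o = \{\mathrm{out}(d) \mid d \in \mathcal{D}\}$, $A_i = \{\mathrm{in}(d) \mid d \in \mathcal{D}\}$ and $A_r = \{\mathrm{read}(d) \mid d \in \mathcal{D}\}$ be the sets of output, input and read actions, respectively, and let $A = A_o \cup A_i \cup A_r$ be the set of actions, ranged over by $\alpha, \alpha', \ldots$; finally, let $\mathcal{P}$ be a denumerable non-empty set of state constants (or states), ranged over by $C, C', C_1, \ldots$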
A system model is the result of the parallel composition of agents, i.e. processes, which are finite state machines. Thus the language has the following two-level grammar for the sets Agents of agents and Proc of processes:

$$A ::= (R, \alpha).C \;\mid\; A + A \qquad\qquad P ::= C \;\mid\; P \,||\, P$$

where for each used constant $C$ there is a definition of the form $C := A$, which, in the sequel, will be written as $C := \sum_{j \in J} (R_j, \alpha_j).C_j$, for some finite index set $J$, with obvious meaning. In action prefix, $(R, \alpha).\_$, $R$ is the name of a rate function under the scope of a suitable definition $R := E$; $E$ is a numerical expression where the special operator $\#C$ can be used which, for state name $C$, yields the number of agents which are in state $C$ in the current global system state. We refrain from giving further details on the syntax of expressions $E$. A process definition is the collection of definitions for the states of the process.

A system state is a pair $(P, D)$, where $P$ is a process and $D$ is an element of the set Reps of data repositories, i.e. a multiset of data values. The language of expressions $E$ for rate function definitions is extended with $\#d$, for values $d \in \mathcal{D}$, with the obvious meaning. A system (model) specification is composed of the set of definitions for its processes, the set of definitions for the rate functions used therein, and an initial global state $(P_0, D_0)$. It is required that for each state in the system specification there is exactly one definition.

For the sake of simplicity, in the present paper we impose two further requirements. First, writing the state definitions as $C_i := \sum_{j \in J_i} (R_{ij}, \alpha_{ij}).C_{ij}$, for all $i \in I$ and $x \in \{o, i, r\}$, if $\alpha_{ij} \in A_x$ for some $j \in J_i$, then $\alpha_{ih} \in A_x$ for all $h \in J_i$ (no mixed choice). Second, for $C \neq C'$, if $C := \sum_{j \in J} (R_j, \alpha_j).C_j$ and $C' := \sum_{j \in J'} (R'_j, \alpha'_j).C'_j$ are both state definitions appearing in the system definition, then $\{\alpha_j\}_{j \in J} \cap \{\alpha'_j\}_{j \in J'} \neq \emptyset$ implies $\{\alpha_j\}_{j \in J} = \{\alpha'_j\}_{j \in J'}$ (in order not to incur the possibility of circular definitions in the ODE). In the sequel we let $S$ denote the set of global system states.
Rate Functions:
RA = 10 · #Reader · #a
RB = 5 · #Reader · #b
RR = 10 · #Comp · #r
WA = 9 · #AWriter
WB = 4 · #BWriter

Fig. 1. A simple model of Readers and Writers
As a simple running example we consider the specification of a readers/writers model given in Figure 1, together with an initial state. Two kinds of writers are considered (those writing messages of type a and those writing messages of type b), and readers perform some computation using some resources r before reading the next item, modelled by synchronisation on r.
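To make the role of state-dependent rate functions concrete, the following Python sketch evaluates the rate functions listed with Fig. 1 on a global state represented as a dictionary of counts. The state used here is a hypothetical one chosen for illustration; it is not the initial state of the running example.

```python
# Rate functions of Fig. 1: each maps a global state (counts of agents per
# state and of data items per value) to a non-negative real number.
RATE_FUNCTIONS = {
    "RA": lambda s: 10 * s["Reader"] * s["a"],
    "RB": lambda s: 5 * s["Reader"] * s["b"],
    "RR": lambda s: 10 * s["Comp"] * s["r"],
    "WA": lambda s: 9 * s["AWriter"],
    "WB": lambda s: 4 * s["BWriter"],
}

# HYPOTHETICAL global state, for illustration only.
state = {"Reader": 100, "Comp": 0, "AWriter": 5, "BWriter": 5,
         "a": 0, "b": 0, "r": 10}


def rates(s):
    """Evaluate every rate function in global state s."""
    return {name: f(s) for name, f in RATE_FUNCTIONS.items()}


before = rates(state)
after = rates({**state, "a": 1})  # the same state after one out(a)
```

In the hypothetical state no reader can proceed, since no item of type a or b is present and RA and RB evaluate to 0; a single out(a) raises RA to 10 · 100 · 1.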

Stochastic Semantics
The stochastic semantics are given in Figure 2 using the FuTS framework [9], which is an alternative to the classical approach based on Labelled Transition Systems (LTS). In LTS, a transition is a triple $(s, \alpha, s')$ where $s$ and $\alpha$ are the source state and the label of the transition, respectively, while $s'$ is the target state reached from $s$ via a transition labelled with $\alpha$. In FuTS, a transition is a triple of the form $(s, \alpha, \mathscr{F})$. The first and second components are the source state and the label of the transition, as in LTS, while the third component $\mathscr{F}$ is a continuation function (or simply a continuation in the sequel), which associates a value from an appropriate semiring with each state $s'$. In the case of Markovian process algebra, the relevant semiring is that of non-negative real numbers. If $\mathscr{F}$ maps $s'$ to 0, then state $s'$ cannot be reached from $s$ via this transition. A positive value for state $s'$ represents the rate for the jump of the system from $s$ to $s'$. Any FuTS over $\mathbb{R}_{\geq 0}$ uniquely defines a CTMC, which can be built by successive application of the continuations to the set of states. Below we recall the main notions on FuTS. The reader interested in further details is referred to [9].
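Given a denumerable non-empty set $V$, we let $\mathrm{FS}(V, \mathbb{R}_{\geq 0})$ denote the class of finitely supported functions from $V$ to $\mathbb{R}_{\geq 0}$, i.e. functions that are non-zero on at most finitely many elements of $V$. For $v_1, \ldots, v_n$ in $V$ and $r_1, \ldots, r_n \in \mathbb{R}_{\geq 0}$, we let $[v_1 \mapsto r_1, \ldots, v_n \mapsto r_n]$ denote the function that maps $v_i$ to $r_i$ and every other element of $V$ to 0. For $\mathscr{F}_1, \mathscr{F}_2 \in \mathrm{FS}(V, \mathbb{R}_{\geq 0})$ we let $(\mathscr{F}_1 + \mathscr{F}_2)\, v = (\mathscr{F}_1\, v) + (\mathscr{F}_2\, v)$, and we extend $(\mathscr{F}_1 + \mathscr{F}_2)$ to the n-ary version $\sum_{j \in J} \mathscr{F}_j$, in the obvious way, for finite index set $J$. For $r \in \mathbb{R}_{\geq 0}$ we let $\mathscr{F}/r \in \mathrm{FS}(V, \mathbb{R}_{\geq 0})$ be defined as $(\mathscr{F}/r)\, v = (\mathscr{F}\, v)/r$ if $r \neq 0$ and $(\mathscr{F}/r)\, v = 0$ otherwise. We let $\oplus\mathscr{F}$ be defined as $\oplus\mathscr{F} = \sum_{v \in V} (\mathscr{F}\, v)$; note that $\oplus\mathscr{F}$ is finite, and thus well-defined, for $\mathscr{F} \in \mathrm{FS}(V, \mathbb{R}_{\geq 0})$.
We recall the standard structural congruence $\equiv$ on Reps, which identifies repositories up to commutativity and associativity of their composition. In the sequel, when dealing with data repositories, we will implicitly assume them modulo $\equiv$. For the sake of notational simplicity we will keep $D, D', \ldots$ in the notation (but actually the representatives of their equivalence classes are intended). A similar structural congruence $\equiv$ is assumed for processes, with $P_1 \,||\, P_2 \equiv P_2 \,||\, P_1$ and $(P_1 \,||\, P_2) \,||\, P_3 \equiv P_1 \,||\, (P_2 \,||\, P_3)$, as well as similar conventions concerning notation.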
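The operations on finitely supported functions recalled above can be sketched in Python by representing a function as a dictionary that stores only its non-zero entries. The function names `fs`, `add`, `divide`, `total` and `apply` are hypothetical and chosen for this illustration; they are not taken from [9].

```python
def fs(*pairs):
    """Finitely supported function given by (element, value) pairs with
    distinct elements; every element not listed is mapped to 0."""
    for _, r in pairs:
        if r < 0:
            raise ValueError("values must be non-negative")
    return {v: r for v, r in pairs if r > 0}


def apply(f, v):
    """Value of f at v (0 outside the support)."""
    return f.get(v, 0.0)


def add(f1, f2):
    """Pointwise sum of two finitely supported functions."""
    result = dict(f1)
    for v, r in f2.items():
        result[v] = result.get(v, 0.0) + r
    return result


def divide(f, r):
    """Pointwise division by r; the constant-0 function if r is 0."""
    if r == 0:
        return {}
    return {v: x / r for v, x in f.items()}


def total(f):
    """Sum of all values of f; finite because the support is finite."""
    return sum(f.values())


f = add(fs(("s1", 2.0)), fs(("s1", 1.0), ("s2", 3.0)))
normalised = divide(f, total(f))
```

Dividing a continuation by its total, as in `normalised`, turns rates into a probability distribution over target states.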
The idea is that the rate of the action is the full responsibility of the modeller, being equal to $R(P, D)$; in fact $\left(\frac{\mathscr{P}}{\oplus\mathscr{P}}, \frac{\mathscr{D}}{\oplus\mathscr{D}}\right)(P', D')$ is equal to 1 if $(P', D')$ is reachable in one transition from $(P, D)$, and to 0 if it is not.

Differential Semantics of OdeLinda
In this section we define the differential semantics for the language introduced in Sect. 2. We follow a similar approach as in [13,23]: we first define a population semantics for the language and then we define the differential semantics by means of deriving, from the population semantics, suitable ODEs for the mean-field model.

Population Semantics
Assume we are given a system specification. With the definition of the update vector $\delta_\tau$ in place, the population-based transitions take the form $X \to X + \delta_\tau$, and $x_0 \in \mathbb{Z}^m$ is the initial state of the resulting population CTMC (PCTMC).

Mean-Field Model
The dynamics of the above PCTMC is as follows: if the PCTMC is currently in state $X$, then, every $1/r_\tau(X)$ time units, on average, a change $\delta_\tau$ in the population level of some agents and data items occurs. We can approximate such a discrete change in a continuous way, so that for a small finite time interval $\Delta t$ the change in the population level is $X(t + \Delta t) = X(t) + r_\tau(X(t)) \cdot \Delta t \cdot \delta_\tau$, from which, for $\Delta t \to 0$, we get the ODE $\frac{dX(t)}{dt} = r_\tau(X(t)) \cdot \delta_\tau$. Taking all enabled transitions into account, the ODE describing the approximated transient evolution of the complete population-level system dynamics is given by the initial value problem

$$\frac{dX(t)}{dt} = \sum_{\tau} r_\tau(X(t)) \cdot \delta_\tau, \qquad X(0) = x_0,$$

for large populations and under suitable scalability assumptions (on the rate functions); the interested reader is referred to [2] for the technical details.
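The construction of the right-hand side of the ODE from rates and update vectors can be sketched generically in Python. The representation of a transition as a pair (rate function, update vector) and the small birth-death example are assumptions made for this illustration; the explicit Euler scheme mirrors the finite-step approximation used in the derivation and is not meant as a recommended solver.

```python
def drift(x, transitions):
    """Right-hand side of the ODE: the sum, over all transitions, of the
    rate evaluated in x times the corresponding update vector."""
    dx = [0.0] * len(x)
    for rate, delta in transitions:
        r = rate(x)
        for i, d in enumerate(delta):
            dx[i] += r * d
    return dx


def euler(x0, transitions, dt, steps):
    """Explicit Euler integration: X(t + dt) = X(t) + dt * drift(X(t))."""
    x = list(x0)
    for _ in range(steps):
        dx = drift(x, transitions)
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
    return x


# HYPOTHETICAL one-dimensional birth-death system: items are produced at
# constant rate 2 and each item is consumed at rate 0.5.
transitions = [
    (lambda x: 2.0, [1]),          # birth: population + 1
    (lambda x: 0.5 * x[0], [-1]),  # death: population - 1
]
x_end = euler([0.0], transitions, dt=0.01, steps=5000)
```

For this example the ODE is dx/dt = 2 - 0.5x, whose solution approaches the equilibrium value 4.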
With reference to our running example of Figure 1 we get the equations of Figure 3. Note that there is no dynamics for the writer processes in this example, since each of these agents has just a single state (and a self-loop). Similarly for the resource r.

Fig. 3. ODE for the simple model of Readers and Writers of Figure 1
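A numerical reading of the readers/writers ODE can be sketched as follows. The process definitions assumed here are not taken from Fig. 1 but inferred from the rate functions and the surrounding description: a Reader withdraws an item of type a (rate RA) or b (rate RB) and becomes Comp, Comp reads r (rate RR) and becomes Reader again, and the writers output a and b with rates WA and WB. The population sizes are hypothetical as well.

```python
def derivative(x, writers_a=1.0, writers_b=1.0, resources=1.0):
    """Drift for the vector (Reader, Comp, a, b). Writers and the resource
    r have constant populations and appear only as parameters."""
    reader, comp, a, b = x
    ra = 10 * reader * a      # RA
    rb = 5 * reader * b       # RB
    rr = 10 * comp * resources  # RR
    wa = 9 * writers_a        # WA
    wb = 4 * writers_b        # WB
    return [rr - ra - rb,     # Reader: gains from Comp, loses on in(a)/in(b)
            ra + rb - rr,     # Comp: the opposite flow
            wa - ra,          # a: produced by writers, consumed by readers
            wb - rb]          # b: likewise


def integrate(x0, dt=1e-4, steps=20000):
    """Explicit Euler integration over steps * dt time units."""
    x = list(x0)
    for _ in range(steps):
        dx = derivative(x)
        x = [xi + dt * di for xi, di in zip(x, dx)]
    return x


x0 = [10.0, 0.0, 0.0, 0.0]  # HYPOTHETICAL: ten readers, empty repository
x_end = integrate(x0)
```

Under these assumptions the number of Reader plus Comp agents is constant over time, whereas the numbers of items of type a and b are not bounded a priori, which reflects the birth-death nature of the data population.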

Example -Foraging Ants
As an example, we revisit a somewhat simplified model of a colony of foraging ants inspired by earlier work in the literature [14,12,22]. The ants initially reside at a Nest and will move between the Nest and a Food site. There are two bidirectional paths connecting the Nest to the Food site (and vice-versa), the Fast path and the Slow path. Each path is composed of a finite sequence of (path) stages: the number $F$ of stages of the Fast path is smaller than the number $S$ of stages of the Slow path. The average time it takes an ant to traverse a stage is the same for each stage, regardless of whether it is situated on the Slow or the Fast path; such traversal times are modelled by exponentially distributed random variables. The situation is depicted in Fig. 4, where FP$_j$ stands for the $j$-th of the $F$ stages of the Fast path and SP$_j$ stands for the $j$-th of the $S$ stages of the Slow path, for $F < S$.
A model for foraging ants is specified in Fig. 5. The set of data values occurring in the model specification is the finite set $\{\mathrm{Phe@FP}_j \mid j = 1, \ldots, F\} \cup \{\mathrm{Phe@SP}_j \mid j = 1, \ldots, S\}$, where tuple Phe@FP$_j$ (Phe@SP$_j$, respectively) represents a unit of pheromone in stage $j$ of the Fast path (Slow path, respectively). There are two process types, one modelling an ant and one modelling the expiration, i.e. decay, of pheromones; the (finite) set of states consists of Ant@Nest and Ant@Food, together with AntToFood@FP$_j$, AntToNest@FP$_j$ and ExpPhe@FP$_j$ for $j = 1, \ldots, F$, and AntToFood@SP$_j$, AntToNest@SP$_j$ and ExpPhe@SP$_j$ for $j = 1, \ldots, S$. Among the state definitions of Fig. 5 are:
Ant@Nest := (NFF, read(Phe@FP1)).AntToFood@FP1 + (NFS, read(Phe@SP1)).AntToFood@SP1
AntToFood@FPj := (NFFj, out(Phe@FPj)).AntToFood@FPj+1
AntToNest@SP1 := (FNS1, out(Phe@SP1)).Ant@Nest
State Ant@Nest (Ant@Food, respectively) represents an ant at the Nest (at the Food site, respectively). State AntToFood@FP$_j$ (AntToFood@SP$_j$, respectively) represents an ant in stage $j$ of the Fast (Slow, respectively) path, when travelling from the Nest to the Food. State AntToNest@FP$_j$ (AntToNest@SP$_j$, respectively) represents an ant in stage $j$ of the Fast (Slow, respectively) path, when travelling from the Food to the Nest.
For the sake of simplicity, in this model, once an ant leaves the Nest, it can only proceed to the Food and then come back to the Nest (i.e. an ant cannot change its mind half-way along a path or get stuck there). This is common in foraging ants models (see [22] and references therein). Finally, processes ExpPhe@FP$_j$ and ExpPhe@SP$_j$ are used for modelling pheromone decay. The definitions of rate functions NFF, NFS, NFF$_j$, NFS$_j$, PHF, and PHS are given below, where parameters $k$, $m$ and $p$ will be discussed later on in this section. The expressions for the definitions of NFF and NFS are written in accordance with results from experimental studies on colonies of Argentine ants, as discussed in [14,12,22]. The definitions of functions FNF, FNS, FNF$_j$, and FNS$_j$ are similar to those of NFF, NFS, NFF$_j$ and NFS$_j$ due to the symmetry of the model.
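To give an idea of how pheromone levels may bias the choice between the two paths, the following Python sketch implements a choice function of the kind commonly used in the literature on Argentine ants. It is an assumption made for illustration: the exact expressions of NFF and NFS are not reproduced in this text, and both the quadratic exponent and the way k enters the formula are taken from that general literature rather than from the model of Fig. 5.

```python
def choice_probabilities(phe_fast, phe_slow, k, exponent=2):
    """ASSUMED choice function: probability of choosing the Fast and the
    Slow path, given the pheromone on the first stage of each path.

    k damps the influence of small pheromone amounts; a larger exponent
    makes the choice more sharply biased towards the stronger trail.
    """
    w_fast = (k + phe_fast) ** exponent
    w_slow = (k + phe_slow) ** exponent
    return w_fast / (w_fast + w_slow), w_slow / (w_fast + w_slow)


unbiased = choice_probabilities(1, 1, k=10)   # equal pheromone levels
biased = choice_probabilities(10, 0, k=10)    # more pheromone on Fast
```

With equal pheromone levels both paths are equally likely; with k = 10, ten units on the Fast path and none on the Slow path, the weights are 400 and 100, so the Fast path is chosen with probability 0.8.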
In the following we present some analysis results for two specific instantiations of the model and its parameters. We consider a model where the Fast path is composed of two stages while the Slow path is seven stages long, i.e. F = 2 and S = 7.
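A mean-field computation for this instantiation can be sketched in Python as follows. This is not the model of Fig. 5: the rate functions are ASSUMED (ants leave the Nest and the Food site at rate m per ant and split over the paths according to a quadratic choice function on the pheromone of the adjacent stages; each ant on a stage moves on at rate m and deposits one unit of pheromone there; each unit of pheromone decays at rate p). The values k = 10, m = 0.1, p = 0 and 1000 ants are those of the experiment without decay reported in this section.

```python
F, S = 2, 7  # stages of the Fast and of the Slow path


def prob_fast(phe_fast, phe_slow, k):
    """ASSUMED choice function: probability of taking the Fast path."""
    wf, ws = (k + phe_fast) ** 2, (k + phe_slow) ** 2
    return wf / (wf + ws)


def initial_state(ants=1000.0):
    """All ants at the Nest; one unit of pheromone on the stage of each
    path closest to the Nest."""
    return {
        "nest": ants, "food": 0.0,
        "to_food": {"F": [0.0] * F, "S": [0.0] * S},
        "to_nest": {"F": [0.0] * F, "S": [0.0] * S},
        "phe": {"F": [1.0] + [0.0] * (F - 1), "S": [1.0] + [0.0] * (S - 1)},
    }


def euler_step(s, dt, k=10.0, m=0.1, p=0.0):
    """One explicit Euler step of the assumed mean-field equations."""
    pf_nest = prob_fast(s["phe"]["F"][0], s["phe"]["S"][0], k)
    pf_food = prob_fast(s["phe"]["F"][-1], s["phe"]["S"][-1], k)
    leave_nest, leave_food = m * s["nest"], m * s["food"]
    new = {"nest": s["nest"] - dt * leave_nest,
           "food": s["food"] - dt * leave_food,
           "to_food": {}, "to_nest": {}, "phe": {}}
    for path, p_nest, p_food in (("F", pf_nest, pf_food),
                                 ("S", 1 - pf_nest, 1 - pf_food)):
        out_f = [m * x for x in s["to_food"][path]]  # flow towards the Food
        out_n = [m * x for x in s["to_nest"][path]]  # flow towards the Nest
        last = len(out_f) - 1
        to_food, to_nest, phe = [], [], []
        for j in range(last + 1):
            in_f = p_nest * leave_nest if j == 0 else out_f[j - 1]
            in_n = p_food * leave_food if j == last else out_n[j + 1]
            to_food.append(s["to_food"][path][j] + dt * (in_f - out_f[j]))
            to_nest.append(s["to_nest"][path][j] + dt * (in_n - out_n[j]))
            old = s["phe"][path][j]
            phe.append(old + dt * (out_f[j] + out_n[j] - p * old))
        new["to_food"][path], new["to_nest"][path] = to_food, to_nest
        new["phe"][path] = phe
        new["food"] += dt * out_f[last]
        new["nest"] += dt * out_n[0]
    return new


def total_ants(s):
    """Number of ants summed over all locations."""
    on_paths = (sum(sum(v) for v in s["to_food"].values())
                + sum(sum(v) for v in s["to_nest"].values()))
    return s["nest"] + s["food"] + on_paths


state = initial_state()
for _ in range(5000):  # 5000 steps of size 0.1, i.e. 500 time units
    state = euler_step(state, dt=0.1)
```

The sketch is not meant to reproduce the figures of this section; it only shows the shape of the computation. Under the stated assumptions the total number of ants is preserved by every step, while the amount of pheromone is not, as expected for a model in which only the data population changes size.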
We first assume that pheromones do not decay: they accumulate in the path stages so that their total amount grows larger and larger. This is achieved by setting the decay rate $p$ to zero. The value chosen for $k$ is 10, while the ants' rate of movement $m$ from one path stage to the next one is set to 0.1. Fig. 6 shows the solution of the equations for the first 500 time units for an initial number of 1000 ants in the Nest, while initially no ants are assumed present in any path stage or at the Food site. One unit of pheromone is assumed to be present at time 0 at the stages of the Slow and the Fast paths closest to the Nest. Fig. 6 (left) shows that there is a quick drop in the number of ants at the Nest and that for a brief time frame of about 50 time units the cumulative number of ants on the Slow path is actually higher than that on the Fast path. This situation changes rapidly when ants start to return from the Food to the Nest, implicitly providing feedback to the system by reinforcing the pheromone trace on the Fast path. This leads to a rather quick convergence of ants on the Fast path. The cumulative amount of pheromones on the Fast and the Slow path is shown in Fig. 6 (right).
Fig. 7 shows the results for a variant of the model where pheromone decays with constant rate $p = 0.03$. The evolution of both the cumulative number of ants in various locations and the amount of pheromone on the paths is shown over a time interval of 500 time units. Also in this case the ants converge on the Fast path, and they do so in a shorter time than in the case without decay of pheromones. Fig. 7 (left) has been obtained using Octave 4 for solving the ODE for the model specification. Fig. 7 (right) shows the results obtained via stochastic simulation for the same model with 1000 ants, taking the average over 100 runs.
We close this section by showing the application of the mean field model-checker FlyFast [19] to the foraging ants example.
Fluid model-checking techniques have recently been proposed as scalable techniques for the verification of properties of one (or a few) agents in the context of large populations [4]. These techniques are based on differential semantics, or on difference equations, when considering their discrete time counterparts, as is the case for FlyFast. The input language of the model-checker does not support the specification of models with dynamic population size, but if we can assume sufficiently large upper bounds on the sizes of the data sub-populations for the time horizon of interest, it is rather straightforward to translate the model of foraging ants shown in Fig. 5 into such a language, modelling data by two-state (i.e. "present" and "absent") processes and using an appropriate scaling of rates to turn the stochastic model into an equivalent probabilistic one [18]. We briefly illustrate the results for two properties for the ants model in Fig. 8. Property A shows how the probability of an ant in the nest to move to the short (i.e. Fast) path within 30 time units changes over time due to the pheromones left behind by other ants. Property B shows the probability to reach a system state within $t$ time units, where $t$ ranges from 0 to 500, in which an ant in the nest moves to the short path within 30 time units with a probability of more than 0.95. Both properties can be expressed using the standard PCTL logic, a probabilistic extension of CTL [16]. The model-checking time for property A is 10 ms, whereas that for property B is 41,047 ms.
The purpose of the foraging ants example is to illustrate the OdeLinda language and the mean field analysis approach for asynchronous, tuple-based languages. Results better matching those of the original experiments in [14,12] can be obtained by a somewhat more complicated model in which ants leave the nest at a constant rate and in which the length of the paths and the constant traversal times are more accurately modelled by adding further path stages on each path, thus implicitly using an Erlang distribution with more stages to approximate the constant traversal times. We omitted this here for the sake of simplicity.
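The effect of adding stages can be quantified with a short Python sketch based on the standard formulas for the Erlang distribution (the sum of n independent exponential stages with a common rate). The function names are hypothetical; the mean traversal time of 20 time units corresponds to the Fast path of the model above, which has two stages with rate m = 0.1.

```python
import math


def erlang_stage_rate(stages: int, mean_total_time: float) -> float:
    """Common rate of each stage such that the whole path takes
    mean_total_time on average."""
    return stages / mean_total_time


def erlang_mean_and_variance(stages: int, rate: float):
    """Mean and variance of the sum of `stages` independent exponential
    durations with the same rate."""
    return stages / rate, stages / rate ** 2


def coefficient_of_variation(stages: int) -> float:
    """Standard deviation divided by mean; it depends only on the number
    of stages and tends to 0 as the number of stages grows."""
    return 1.0 / math.sqrt(stages)


coarse = erlang_mean_and_variance(2, erlang_stage_rate(2, 20.0))
fine = erlang_mean_and_variance(20, erlang_stage_rate(20, 20.0))
```

Both variants have mean 20, but the variance drops from 200 with two stages to 20 with twenty stages, so the traversal time becomes increasingly concentrated around its mean.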

Conclusions and Future Work
In this paper we have provided a differential semantics for languages with an asynchronous buffered communication interaction paradigm, e.g. data-repository-/tuple-based ones. In particular, we have defined an individual-based Markovian semantics as well as a population-based differential semantics for OdeLinda, a simple data-repository-based language. As an example of use of the language we have shown a benchmark model of Foraging Ants and some results of its ODE-based analysis.
There are several lines of research we plan to follow for moving from a simple experimental kernel language like OdeLinda to a complete, fully fledged population modelling language. One line of research focuses on the introduction of an appropriate notion of space. One possibility is to take StoKLAIM [11] as a starting point, thus using a simple, locality-based approach. Another, perhaps more interesting, possibility is to use a richer, predicate-based addressing mechanism, like (a possibly restricted version of) the addressing mechanism of StocS [20], where the location is just one of the agents' attributes and its values are instances of an appropriate data type, namely space. This can take different forms, from topological spaces (including bi- or tri-dimensional continuous space) to more general closure spaces (including generic graphs), as in [6]. Another issue is the inclusion of richer data and operations, including templates and pattern-matching. This implies the definition of syntactical restrictions, or static analysis techniques, for guaranteeing that in all computations of a model specification the set of distinct data values is bounded, while the multiplicity of each item can of course be unbounded. This also holds for the inclusion of process spawning and of processes to be stored/retrieved to/from repositories. Finally, we plan to adapt fluid model-checking techniques to tuple-based languages.