A Multidisciplinary Predictive Model for Managing Critical Infrastructure Disruptions

When communities are subjected to disruptive events, their response structure is composed of two interconnected systems: (i) a formal professional system that includes emergency services and auxiliary services professionals; and (ii) an ad hoc system formed by community members when the professional response is delayed or is inadequate. The community system typically persists until the professional system is able to take over completely. As the role of the community as responder is not well understood, community systems are often underutilized or even discouraged; this reduces the overall response efficacy. Improved understanding of the interplay between these systems could help ensure an effective overall response to disruptions. This chapter describes an integrated, multidisciplinary model of the interactions between the two systems during disruptive events and their influence on capacity and recovery. The model studies how the systems influence and enable community resilience in the context of three Department of Homeland Security defined sectors: emergency services, information technology and communications. The methodology combines agent-based modeling with cellular automata and illustrates the interplay between and among the people and systems that make up a community, the role of the community as responder and the impact of varying community resources and response capabilities. The model is designed to be transferable to a variety of disaster types and a hierarchy of jurisdictions (local, regional, state, national and international).


Introduction
A comprehensive and effective response structure in communities subjected to disruptive events may be viewed as being composed of two interconnected systems: (i) a formal professional system comprising emergency services professionals such as police and fire, and auxiliary services professionals such as the Red Cross and National Guard that are called as needed; and (ii) an ad hoc community system that is formed by the community itself when the professional response is delayed or is inadequate. This community system typically persists until the professional system can take over completely or until the community is operational in some capacity.
However, the capacity and effectiveness of a community response system is neither well understood nor well utilized. For instance, during a November 2014 snow disruption, residents of Buffalo (New York) worked together to clear four to seven feet of snow from their neighborhood streets before city and county plows could reach them [10]. Had the area emergency managers known of this available and functional capacity, they could have assigned resources more effectively, reflecting the increased capability and the change in the environment. For example, they could have assigned smaller plows to the area because much of the cleanup work had been done, diverted the assigned resources to a different area where such efforts had not taken place, or taken over the response from the residents. At present, such community resource systems are underutilized or even actively discouraged by professional responders because their capacity is unknown, but a failure to take these resources into account reduces the overall response efficacy.
A better understanding of the interplay between these two systems is needed to improve and ensure effective overall responses to disruptions. This chapter describes an agent-based model integrated with cellular automata that is designed to study how the interrelationships between professional and community systems affect changes in response and recovery processes during disruptions. As a proof-of-concept, the model focuses on how these systems influence and support community resilience in the context of three of the sixteen critical infrastructure/key resource (CI/KR) sectors defined by the U.S. Department of Homeland Security (DHS) [18]: emergency services, information technology and communications.

Background
Although the professional system is characterized by known capabilities (such as specialized, systematic training and structured communications networks) and defined resources (human and equipment), community systems have unknown capabilities and uncertain resources. Professional systems, with their defined procedures and protocols, have relatively predictable responses to disruptions whereas community systems display more stochastic responses with probabilities of various actions dependent on the socioeconomic backgrounds of the actors. The U.S. Federal Emergency Management Agency (FEMA) has a whole community approach, which acknowledges that technological advances in community infrastructure, especially communications and information access, cultural diversity and grassroots engagement, affect how communities thrive and react to disasters [4].
Several studies have explored the changing landscape of emergency response in the light of social change and localized problem solving [8,9]. Trainor and Barsky [16] note that it is in the "best interest of the community" to use trained and untrained human resources. Currently, the Federal Emergency Management Agency uses the "all hazards" planning taxonomy [7], based on a set of 37 target capabilities, to provide an operational framework for response as part of the National Incident Management System (NIMS). The taxonomy was created to assess and document potential gaps in local response and recovery within a standard framework.
The Federal Emergency Management Agency requires communities to measure response capabilities, which is a start but certainly not the whole picture, particularly for communities that rely on a mix of paid and volunteer responders. As the initial reaction of a community to a disruptive event is socially driven, especially when professional response is delayed or absent [21], further investigation of this system is warranted. Community responses are often no less effective, and possibly more effective, than professional responses. The effectiveness of community systems is tied to the social capital of the community as well as its perception of its own abilities [5], in conjunction with the traditional capabilities assessed through professional emergency planning requirements [12,14].
As the Federal Emergency Management Agency "all hazards" taxonomy is linked to the probability of event occurrence, experience plays a major role in identifying, understanding and mitigating risk [3]. To the extent that the residents or responders understand the scope of the event, they can then increase their capacity to act whether or not the professional response plan is initiated. The ability to self-organize and act is driven by many factors, including the knowledge of the event and the need, in conjunction with local geography, economics, demographics, natural resources and imminent threats [2,6,20]. The need to act is driven by the degree to which normal community systems have broken down and cannot cope with, or be resilient to, the hazard. These systems include infrastructure systems (physical, operational and virtual) as well as human systems (collective capabilities) that make up a functional community; in other words, they are systems of systems.

Proposed Model
A major goal of the proposed model is to support the measurement and assessment of the interactions between professional and community system responses to disruptive events. It integrates agent-based modeling with cellular automata and uses examples based on emergency management, information technology and communications flows. The examples draw on demographic data from the Rochester (New York) metropolitan statistical area (MSA) [19], which has more than one million people with a range of ethno-racial, wealth and age demographics, and a high youth population in the inner city [17], making it a reasonable representative of a mid-sized U.S. urban area. The proposed model is designed to be flexible and generalizable to similar regions.

Model Design
Agent-based modeling can help capture the highly integrative and complex nature of interactions between professional and community responders within the context of concurrent emergency management, information technology and communications systems. This versatile modeling approach has been employed in similar studies of human systems [1,15] and addresses the weaknesses of other techniques while maintaining their strengths. For example, the geospatial distribution of responders, resources and population might be assessed through a system of partial differential equations or the high importance of communications could be assessed using graph theory. Such mathematical modeling techniques, however, tend to separate the modelers from the subject matter experts. In contrast, agent-based modeling uses agents whose behaviors are governed by rules created in concert with subject matter experts. For instance, communications experts help frame the rules governing communications modes and connections between agents that represent people, while socioeconomic experts create decision rules to be used by community responders from different demographic backgrounds. Agent-based modeling thus supports simple, but required, connections between subject matter experts and mathematical models. Complex systems and behaviors arise out of multidisciplinary, albeit simple, rules obeyed by the agents; this forms the backbone of the proposed model.
In the proposed model, agents are distributed in a two-dimensional domain that incorporates the geospatial aspects of a metropolitan setting, along with heterogeneous distributions of demographics, needs, abilities and resources. The geographic region is divided into cells distributed on a grid. The cells have differing attributes such as built environment characteristics, proximity to major roads, demographics and population density.
Each cell has needs, resources and abilities, and communications links to other cells. The needs vector elements measure quantities such as injuries and hazards. The resources vector elements measure supplies and equipment such as winches, trucks and generators. The abilities vector elements measure essential skills such as doctors, electricians, law enforcement and heavy equipment licenses. The state of each cell is a composite of these three vectors. Figure 1 presents an example of a cellular grid. Cell 1,1 is negatively affected by a disruptive event and receives resources and assistance from Cell 1,2 and Cell 2,1 that are adjacent to it. Note that N denotes needs, R resources and A ability.
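As an illustration of the cell state described above, the following minimal Python sketch stores the three vectors per cell and looks up adjacent cells. The class name Cell, the dictionary keys and the choice of edge-adjacent (von Neumann) neighbors are assumptions made for illustration; the chapter does not prescribe a data layout or a neighborhood definition.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    """One grid cell. Vector entries are illustrative, not taken from the source."""
    needs: dict[str, float] = field(default_factory=dict)      # e.g., injuries, hazards
    resources: dict[str, float] = field(default_factory=dict)  # e.g., trucks, generators
    abilities: dict[str, float] = field(default_factory=dict)  # e.g., doctors, electricians

    def state(self):
        """The cell state is the composite of the three vectors."""
        return (dict(self.needs), dict(self.resources), dict(self.abilities))


def neighbors(row, col, n_rows, n_cols):
    """Edge-adjacent cells, 1-indexed as in 'Cell 1,1' (assumed neighborhood)."""
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates if 1 <= r <= n_rows and 1 <= c <= n_cols]


# A 2 x 2 grid in which Cell 1,1 is affected and its adjacent cells can help.
grid = {(r, c): Cell() for r in range(1, 3) for c in range(1, 3)}
grid[(1, 1)].needs["injuries"] = 3.0
grid[(1, 2)].resources["generators"] = 2.0
grid[(2, 1)].abilities["doctors"] = 1.0
helpers = neighbors(1, 1, 2, 2)
```

In this toy grid, the helpers of Cell 1,1 are Cell 1,2 and Cell 2,1, matching the adjacency used in the example above.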
At each time step (unit of time in the model), the agents in a cell respond to the needs and requests from its neighboring cells according to a set of rules.

Example rules are:
If a cell has an emergency but no emergency resources, then the need can be addressed using emergency resources located at an adjacent cell. For example, a fire truck will move beyond its designated area to help fight a fire in a neighboring area.
After receiving official communications from the authorities, cells containing personnel with heavy equipment licenses (abilities) may transfer the abilities to nearby cells with heavy equipment resources. For example, a local fire chief may ask individuals who can drive bulldozers to go to where the bulldozers are parked.
Cells with medical needs do not mobilize their heavy equipment operators even after communications from the authorities. For example, an individual whose child is injured stays with the child and does not report to the heavy equipment yard.
Cells with varying characteristics such as population density and socioeconomic status respond differently to disruptive events and recovery efforts; their agents draw their behavior choices from separate probabilistic sets. The purpose of the rules is to simulate the real-life decision-making patterns of response behavior, while using the model to assess how alternative response mitigation plans and resource activations may alter the behavior patterns to optimize response and expedite recovery.
One of the powerful features of the model is that, during each time step, the needs vector, resource vector and ability vector are updated according to rules informed by local situations and subject matter experts. Rules may contain a stochastic element, enabling the cellular automata to operate as a large Monte Carlo simulation.
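The stochastic update can be sketched as follows. This is a toy version of the first example rule only (a cell with a need but no resources draws on adjacent cells), with a single help probability p_help standing in for the demographic-dependent probability sets. The function names, the three-cell layout and all numeric values are hypothetical.

```python
import random


def step(needs, resources, adjacency, p_help, rng):
    """One time step: each needy cell with no resources asks its neighbors.

    Cells that hold their own resources are left out of this sketch.
    """
    new_needs = dict(needs)
    new_resources = dict(resources)
    for cell, need in needs.items():
        if need <= 0 or resources[cell] > 0:
            continue
        for nb in adjacency[cell]:
            if new_needs[cell] <= 0:
                break
            # Stochastic element: the neighbor helps with probability p_help.
            if new_resources[nb] > 0 and rng.random() < p_help:
                new_resources[nb] -= 1
                new_needs[cell] -= 1
    return new_needs, new_resources


def mean_steps_to_clear(p_help, trials=200, max_steps=50, seed=0):
    """Monte Carlo estimate of the time steps needed to clear the need at cell A."""
    rng = random.Random(seed)
    adjacency = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}
    total = 0
    for _ in range(trials):
        needs = {"A": 2, "B": 0, "C": 0}
        resources = {"A": 0, "B": 1, "C": 1}
        steps = 0
        while needs["A"] > 0 and steps < max_steps:
            needs, resources = step(needs, resources, adjacency, p_help, rng)
            steps += 1
        total += steps
    return total / trials
```

Running mean_steps_to_clear over a range of p_help values gives a rough picture of how willingness to help affects recovery time in this toy setting.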
The distribution of resources and abilities is based on the critical infrastructure/key resource assets in the region and knowledge of the characteristics of the metropolitan statistical area such as population density, vacant housing percentage and per capita income. A database containing this information was created during a previous Department of Homeland Security project [12].
To represent the official, or professional, response to a disaster event, a three-layered network spans the entire grid. Each layer of the network corresponds to one of the three critical infrastructure sectors selected in this investigation: (i) emergency management services; (ii) information technology; and (iii) communications.
Emergency Management Services Layer: This layer represents the network of first responders (police, fire and emergency medical services), hospitals, clinics and critical personnel that make up the emergency management system of a region. The proper operation of this layer is highly dependent on information technology and communications because the correct resources must be routed to the exact locations as quickly as possible during an incident. Conversely, a lag in, or interruption of, information flow from the scene of an incident back to the centralized emergency operations center leaves decision makers with incomplete knowledge about the state of the event or scene.
Information Technology Layer: Security and privacy management in dynamic information management environments is challenging, yet critical, for an effective response. For example, data integrity has to be maintained for professional as well as community responses. Errors in geographical coordinates or loss of key pieces of information can prove to be life-threatening. Without a robust information technology infrastructure, data modification and loss (both accidental and malicious) cannot be prevented; this can lead to catastrophic situations.
Communications Layer: Robust communications reduce the time to bring the right resources to the right locations while poor communications limit the quality of information available to those who have the resources. Individuals at the scene of a disruptive event have rich information over a small footprint while professional responders and incident command have a broad, but relatively shallow, view of the situation.
Interdependencies Between Layers: The three sectors were chosen for model development because they are highly dependent on each other, especially during disruptive events. In addition, when disruptions occur, all three sectors can be supported by professional and community response systems.

Modeling Process
The metropolitan statistical area of a mid-sized city like Rochester covers a much larger geographical area relative to population than that of a major city such as New York, Los Angeles or Chicago. Therefore, to study the problem of regional resilience for the approximately 62 metropolitan statistical areas in the United States that are anchored by a mid-sized city, several cellular models of varying sizes are needed to represent municipalities in the counties that neighbor the anchoring city. Figure 2 shows an example of how such a regional resilience study of the Rochester metropolitan statistical area might be configured as a graph whose edges are the connections between large towns and county seats within the metropolitan statistical area and the anchoring city. Note that all the nodes are connected to the major city, but are not necessarily connected to every other node. In essence, the region can also be expressed as a cellular agent-based model where the cells represent municipalities. Since a neighborhood is essentially a subset of a community, each community cell can be divided into n neighborhoods as shown in Figure 2.
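The regional configuration can be sketched as a small graph in which every municipality is linked to the anchoring city and only some municipalities are linked to each other. The town names, the extra edge and the neighborhood labels are hypothetical placeholders, not data from the Rochester study.

```python
def build_region(anchor, towns, extra_edges=()):
    """Undirected graph: every town connects to the anchor city;
    town-to-town links exist only where listed in extra_edges."""
    graph = {anchor: set()}
    for town in towns:
        graph[town] = {anchor}
        graph[anchor].add(town)
    for a, b in extra_edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def split_into_neighborhoods(community, n):
    """Divide one community cell into n neighborhood labels."""
    return [f"{community}/N{k}" for k in range(1, n + 1)]


towns = ["Town A", "Town B", "Town C"]  # hypothetical municipalities
region = build_region("Rochester", towns, extra_edges=[("Town A", "Town B")])
town_a_neighborhoods = split_into_neighborhoods("Town A", 9)
```

Here every town is adjacent to the anchoring city, while Town C has no direct link to Town A or Town B.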
The proposed model helps increase the understanding of how social and economic characteristics such as ethno-racial, class, gender and age distributions impact recovery performance for various critical infrastructure/key resources, especially when the socioeconomic characteristics shape community response capabilities.
The independent variables in the model include:
Level of interaction between the two response systems.
Types of critical infrastructure/key resource assets involved in the event.
Amount of disruption to the performance of the assets.
Geographical scope of the crisis.
Socioeconomic network factors, community capabilities and resources that affect community response capacity.
The approach provides for the testing of key dependent variables such as:
Transfer time from community to professional systems (to assess critical infrastructure/key resource assets and human capacity).
Changes in the operational capacity of the involved critical infrastructure/key resource assets.
Resource allocation by the professional system.

Model Viability and Validation
A two-part approach was used to establish the viability and validity of the model. First, experiments were conducted at a granular level and the results were analyzed to check if the model is viable. Second, approaches were explored to establish the validity of the model in real-world critical infrastructure protection scenarios.

Model Viability
To establish a baseline, the dependent variables were first assessed with professional responders only, a fully functional critical infrastructure and the following three levels of event severity:
Level 1: The event is highly isolated, easily contained and needs only a few responders (e.g., a car accident).
Level 2: The event has a large impact area and more victims or a few hard-to-reach victims (e.g., a building collapse or multi-car pileup).
Level 3: The event impacts more than half of a community (e.g., a flood event, power outage in combination with another event or events where first responders are already responding en masse at the site of another major event).
After a baseline is established, the model parameters (see Table 1) may be modified to analyze the effects of each component on resilience and recovery under various initial conditions. For instance, it may be necessary to learn how community systems respond in the absence of the professional response. If first responders are unable or slow to respond, community networks may be formed to deal with the event. The model is used to test group formation and efficiency of response at the same levels as the baseline, but without the professional response element. During a higher-level disruptive event, response capacity is anticipated to rapidly diminish as the community becomes overwhelmed. Efficient crisis management is highly dependent on a functioning critical infrastructure. In the proposed model, the critical infrastructure is represented by the emergency services, information technology and communications sectors. By varying the capabilities of the sector layers, it is possible to understand the effects of critical infrastructure loss.
The model may also be extended to study the effects of uneven resource distribution. For example, an event that occurs in a working-class neighborhood may, in fact, have a more effective response given the higher density of first responders who live in the neighborhood. Similarly, an event that requires physicians may have a fast response time if it occurs near a hospital. A rural area may have fewer professional responders, but a more robust community response. By focusing on how certain types of disruptive events in specific locations in the modeled region can affect the dependent variables, it is possible to create the foundation for policies that encourage even access to response and recovery systems in the face of social, economic and resilience inequalities in the region.

Model Validation
As discussed above, several elements involved in modeling professional and community emergency responses require a multi-pronged approach to validation. Qualitative and quantitative approaches have been developed for model validation. Validation is intended to demonstrate that the proof-of-concept model and the results for subsequent variations of the model provide reasonable representations of real-world professional and community responses.
Approaches that are used traditionally for model validation include the use of expert intuition, data analytics, empirical analyses and theoretical analyses. The first two model validation approaches are used for the most part in this work. However, as discussed below, other approaches may be employed where appropriate and feasible:
Expert Intuition: To informally validate the model, a set of use cases is developed that provides a narrative of common, previously-known interactions between the professional and community responses. These use cases establish that the model can effectively describe the identified functions; this is verified using experts from the stakeholder communities.
Identifying and using stakeholders as subject matter experts is the key to this model validation approach. The authors of this chapter have strong relationships with local, state and national emergency response organizations, as well as community organizations that typically provide community response. These relationships have helped identify experts who guide the development of the model and help validate it.
Data Analytical Techniques: Previous research using 911 emergency call data involved the analysis of typical disaster events that occur in the Rochester metropolitan statistical area, including time-based features, resource allocation and event classification [13]. The scenario outputs of the agent-based model are compared against expected outcomes for these events at reasonable confidence levels.
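One simple way to make such a comparison concrete is to check whether an expected value derived from historical data falls inside a confidence interval around the simulated outcomes. The sketch below uses a normal-approximation interval for the mean; the function name, the choice of statistic and the sample values are assumptions for illustration and are not the validation procedure or data used in the study.

```python
import statistics


def within_confidence(simulated, expected_mean, z=1.96):
    """Return (inside, (low, high)) for a normal-approximation interval
    around the mean of the simulated outcomes (z = 1.96 is roughly 95%)."""
    mean = statistics.mean(simulated)
    sem = statistics.stdev(simulated) / len(simulated) ** 0.5
    low, high = mean - z * sem, mean + z * sem
    return low <= expected_mean <= high, (low, high)


# Synthetic response times (in time steps) from repeated model runs.
simulated_times = [10, 12, 11, 13, 9, 11, 10, 12]
consistent, interval = within_confidence(simulated_times, expected_mean=11.5)
inconsistent, _ = within_confidence(simulated_times, expected_mean=14.0)
```

A scenario whose expected value falls outside the interval would be flagged for review of the corresponding rules or parameters.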
The validation of complex models, especially where real-world systems with real-time constraints are involved, has many open research questions regarding utility and correctness. A thorough discussion of validation methods is outside the scope of this work; however, the analysis and comparison of validation approaches is a topic for future research.

Base Model Framework
This section presents a limited scope model using existing critical infrastructure asset data and a set of simple rules based on extensive experience with the Rochester area and its emergency response environment. The limited model can be incrementally built up to a full-scale model with reasonable confidence that the system works as it should. Figure 3 shows the basic model consisting of two communities, each divided into nine neighborhoods. Both communities are centered on an event at $N_{2,2}$. All the parameter values in the model are between zero and one.
The communications layer response capacity at time t is denoted by $C_t$. Let $I_t$ and $E_t$ be the independent capacity values at time t for the information technology and emergency management layers, respectively. Then, $IL_t$, the information technology layer response capacity at time t, is computed as:
and $EL_t$, the emergency layer response capacity at time t, is computed as:
The link between the community system and the professional system is represented by two parameters that modify the emergency response levels. The first parameter $K_t$ is the amount of information that the professional responders receive about community efforts at each time step. The second parameter $Tr_t$ expresses the level of trust that the first responders have in the accuracy of the community network information.
The community system parameters reflect the neighborhood resource levels at time t ($N_{i,j,t}$), the probability that a neighborhood will assist the affected area at time t ($H_{i,j,t}$) and the percentage of resources that neighborhood $N_{i,j}$ will share at time t ($S_{i,j,t}$):
$N_{i,j,t}$: Neighborhood $N_{i,j}$ resource level at time t.
$H_{i,j,t}$: Probability that neighborhood $N_{i,j}$ will assist $N_{2,2}$ at time t.
$S_{i,j,t}$: Maximum amount of resources that neighborhood $N_{i,j}$ will share with $N_{2,2}$ at time t.
Note that, in the base model, the sharing parameter $S_{i,j,t}$ is treated as a constant, although it would vary dynamically in a real-world model.
The two cells in the model operate using the same rules for how neighborhoods respond to an affected area and how the professional system responds. Table 2 presents a partial list of rules used in the base model; these rules must be satisfied for the community or professional network to respond.
If the rules are satisfied, then a neighborhood $N_{i,j}$ responds with $S_{i,j,t}$ level of assistance. The professional network also responds with one unit of assistance (10% of capacity) when there are no communications between the layers.
Here, AF is an adjustment factor that represents the effectiveness of neighborhood response. For example, an AF value of 2 indicates that two units of community medical response equal one unit of professional medical response. This is because medical response by community members may not be as complete or effective as response by trained emergency medical technicians with the appropriate equipment.
Table 3 shows the progression of response to an event by professional responders (with and without communications) and by neighbors. The following thresholds were used in the model execution:
Minimum neighborhood health threshold = 0.5.
Minimum help probability threshold = 0.5.
Initial values for all the other parameters were randomly generated. The information technology and communications layer capacity values at each time step were also random values. Table 3 shows the event values for the two cells in the model. In the example, the overall neighborhood response was computed using an adjustment factor AF = 2. The first two columns of the table track the capacity of the emergency response layer with and without adjusting for neighborhood response. Figure 4 shows the results. In this short-duration event, it can be seen that the emergency management capacity is preserved when there are communications between the layers.
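The neighborhood side of one time step can be sketched as follows, assuming that a neighborhood responds only when both its health and its help probability meet the stated thresholds of 0.5, that it then sends its share, and that the total community response is divided by AF to express it in professional-equivalent units. The exact rules in Table 2 are not reproduced here, so the comparison operators, the function name and the sample values are assumptions.

```python
HEALTH_MIN = 0.5  # minimum neighborhood health threshold (from the text)
HELP_MIN = 0.5    # minimum help probability threshold (from the text)


def neighborhood_step(health, help_prob, share, af):
    """One time step for the neighborhoods around the affected cell.

    health, help_prob and share are dicts keyed by (i, j).
    Returns (new_health, community_units, professional_equivalent).
    """
    new_health = dict(health)
    community_units = 0.0
    for key in health:
        # Assumed rule: both thresholds must be met for a neighborhood to respond.
        if health[key] >= HEALTH_MIN and help_prob[key] >= HELP_MIN:
            sent = min(share[key], health[key])
            new_health[key] -= sent  # health decreases only when resources are shared
            community_units += sent
    return new_health, community_units, community_units / af


# Hypothetical values for three neighbors of the affected neighborhood (2, 2).
health = {(1, 2): 0.75, (1, 3): 0.5, (2, 3): 0.25}
help_prob = {(1, 2): 0.75, (1, 3): 0.25, (2, 3): 1.0}
share = {(1, 2): 0.25, (1, 3): 0.25, (2, 3): 0.25}
new_health, units, equivalent = neighborhood_step(health, help_prob, share, af=2)
```

With these values only neighborhood (1, 2) meets both thresholds, so it sends 0.25 units, which counts as 0.125 units of professional response when AF = 2.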
The basic model example makes some assumptions that are definitely not valid in the real world. For example, the model assumes that the emergency management capacity is "used up" and is not renewable. It also assumes that the neighborhood health values only decrease when resources are shared with an affected area. Additionally, the health of the affected area does not decrease in the model.
Figure 5 shows the first five time steps of one cell in the base model. The figure is formatted according to the rules in Table 2. Bold text indicates that a rule is satisfied while normal text indicates otherwise. Since no rules are specified for the information technology and communications layers, the corresponding values are not formatted; neither is the central neighborhood $N_{2,2}$ where the event has occurred. In this run of the model, the professional responders only sent assistance at time t = 0. The neighborhoods $N_{1,2}$, $N_{1,3}$, $N_{2,3}$ and $N_{3,3}$ show decreasing health values, reflecting that assistance was sent to the affected cell $N_{2,2}$.
The example involves a two-cell model. The complete model contains many more cells and rules conditioned on the data available for the Rochester area. The methodology is, however, generalizable to any region or response hierarchy. Indeed, the model framework is generic; only the rules and underlying geospatial and critical asset distributions are specific to a region. Other metropolitan statistical areas can easily create model frameworks based on their own data.

Conclusions
The proposed methodology is designed to assess the interplay between professional and community response networks for varying levels of community disruption. Central to the methodology is the approach used to understand how emergency response is enhanced or hampered by regional characteristics and community behaviors. The major contributions are the application of the base model and its use to create comprehensive models for real-world scenarios. It is hoped that this work will stimulate renewed efforts at ensuring effective, timely and efficient emergency response and management during disruptions ranging from short, minor perturbations to long-term, major disasters.