A Probabilistic Network Forensic Model for Evidence Analysis

Modern-day attackers use sophisticated multi-stage and/or multi-host attack techniques and anti-forensic tools to cover their attack traces. Due to the limitations of current intrusion detection systems and forensic analysis tools, evidence often has false positive errors or is incomplete. Additionally, because of the large number of security events, discovering an attack pattern is much like finding a needle in a haystack. Consequently, reconstructing attack scenarios and holding attackers accountable for their activities are major challenges. This chapter describes a probabilistic model that applies Bayesian networks to construct evidence graphs. The model helps address the problems posed by false positive errors, analyze the reasons for missing evidence and compute the posterior probabilities and false positive rates of attack scenarios constructed using the available evidence. A companion software tool for network forensic analysis was used in conjunction with the probabilistic model. The tool, which is written in Prolog, leverages vulnerability databases and an anti-forensic database similar to the NIST National Vulnerability Database (NVD). The experimental results demonstrate that the model is useful for constructing the most-likely attack scenarios and for managing errors encountered in network forensic analysis.


Introduction
Digital forensic investigators use evidence and contextual facts to formulate attack hypotheses and assess the probability that the facts support or refute the hypotheses [5]. However, due to the limitations of forensic tools and expert knowledge, formulating a hypothesis about a multi-step, multi-host attack launched on an enterprise network and using quantitative measures to support the hypothesis are major challenges. This chapter describes a model that helps automate the process of constructing and analyzing quantitatively-supportable attack scenarios based on the available evidence. The applicability and utility of the model are demonstrated using a network attack case study.
The proposed method uses a Bayesian network to estimate the likelihood and false positive rates of potential attack scenarios that fit the discovered evidence. Although several researchers have used Bayesian networks for digital evidence modeling [3,5,12,13], their approaches construct Bayesian networks in an ad hoc manner. This chapter shows how the proposed method can help automate the process of organizing evidence in a graphical structure (called a logical evidence graph) and apply Bayesian analysis to the entire graph. The method provides attack scenarios with acceptable false positive error rates and dynamically updates the joint posterior probabilities and false positive error rates of attack paths when new items of evidence for the attack paths are presented.

Background and Related Work
Bayesian networks have been used to express the credibility and relative weights of digital and non-digital evidence [2,3,5,12,13]. Several researchers have used Bayesian networks to model dependencies between hypotheses and crime scene evidence, and have employed these models to update the belief probabilities of newly-discovered evidence given the previous evidence [2,3,4,12,13,14].
Digital forensic researchers have used Bayesian networks to reason about evidence and quantify the reliability and traceability of the corresponding hypotheses [5]. However, these Bayesian networks were custom-built without using a uniform model. In contrast, the proposed model is generic and helps address the problems posed by false positive errors, analyze the reasons for missing evidence and compute the posterior probabilities and false positive rates of attack scenarios constructed using the available evidence.
Meanwhile, few, if any, tools directly support the automated construction of Bayesian networks based on the available evidence and estimate belief probabilities and potential error rates. A software tool for network forensic analysis was developed for use with the proposed probabilistic model. The tool, which is written in Prolog, leverages the MulVAL reasoning system [1,10] and employs system vulnerability databases and an anti-forensic database similar to the NIST National Vulnerability Database (NVD). The experimental results demonstrate that the tool facilitates the construction of most-likely attack scenarios and the management of errors encountered in network forensic analysis.

Logical Evidence Graphs
This section defines logical evidence graphs and shows how rules are designed to correlate attack scenarios with the available evidence. Because logical reasoning is used to link observed attack events and the collected evidence, the evidence graphs are referred to as logical evidence graphs.
Definition 1 (Logical Evidence Graph (LEG)): A logical evidence graph $LEG = (N_f, N_r, N_c, E, L, G)$ is a six-tuple where $N_f$, $N_r$ and $N_c$ are three disjoint sets of nodes in the graph (called fact, rule and consequence fact nodes, respectively), $E \subseteq ((N_f \cup N_c) \times N_r) \cup (N_r \times N_c)$ is the evidence, $L$ is a mapping from nodes to labels and $G \subseteq N_c$ is a set of observed attack events.
Every rule node has one or more fact nodes or consequence fact nodes from prior attack steps as its parents and a consequence fact node as its only child. Node labels consist of instantiations of rules or sets of predicates specified as follows:
1. A node in $N_f$ is an instantiation of predicates that codify system states, including access privileges, network topology and known vulnerabilities associated with host computers.
2. A node in $N_r$ describes a single rule of the form $p \leftarrow p_1 \wedge p_2 \wedge \cdots \wedge p_n$. The rule head $p$ is an instantiation of a predicate from $N_c$, which is the child node of $N_r$ in the logical evidence graph. The rule body comprises $p_i$ ($i = 1..n$), which are predicate instantiations of $N_f$ from the current attack step and $N_c$ from one or more prior attack steps that comprise the parent nodes of $N_r$.
3. A node in $N_c$ represents the predicate that codifies the post-attack state as the consequence of an attack step. The two predicates execCode(host, user) and netAccess(machine, protocol, port) are used to model the attacker's capability after an attack step. Valid instantiations of these predicates after an attack update valid instantiations of the predicates listed in (1).
Figure 1 shows an example logical evidence graph; Table 1 describes the nodes in Figure 1. In Figure 1, fact, rule and consequence fact nodes are represented as boxes, ellipses and diamonds, respectively. Facts (Nodes 5, 6, 7 and 8) include network topology (Nodes 5 and 6), computer configuration (Node 7) and software vulnerabilities obtained by analyzing evidence captured by forensic tools (Node 8). Rule nodes (Nodes 2 and 4) represent rules that change the attack status using attack steps. These rules, which are based on expert knowledge, are used to link chains of evidence as consequences of attack steps. Linking a chain of evidence using a rule creates an investigator's hypothesis of an attack step given the evidence. Consequence fact nodes (Nodes 1 and 3) codify the attack status obtained from event logs and other forensic tools that record the postconditions of attack steps.
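The six-tuple of Definition 1 can be illustrated with a small data structure. The following is a minimal Python sketch, not the companion tool's implementation (which is written in Prolog). The class name `LogicalEvidenceGraph`, the node identifiers and the example labels are assumptions made for illustration only.

```python
from dataclasses import dataclass, field

# Node kinds from Definition 1: fact (N_f), rule (N_r), consequence fact (N_c).
FACT, RULE, CONSEQUENCE = "fact", "rule", "consequence"


@dataclass
class LogicalEvidenceGraph:
    """Hypothetical container for the six-tuple (N_f, N_r, N_c, E, L, G)."""
    kind: dict = field(default_factory=dict)    # node id -> FACT / RULE / CONSEQUENCE
    label: dict = field(default_factory=dict)   # L: node id -> label
    edges: set = field(default_factory=set)     # E: (parent, child) pairs
    observed: set = field(default_factory=set)  # G: observed attack events

    def add_node(self, node, kind, label):
        self.kind[node] = kind
        self.label[node] = label

    def add_edge(self, parent, child):
        # Rule nodes take fact or consequence fact nodes as parents and
        # have a consequence fact node as their child.
        pk, ck = self.kind[parent], self.kind[child]
        into_rule = pk in (FACT, CONSEQUENCE) and ck == RULE
        out_of_rule = pk == RULE and ck == CONSEQUENCE
        if not (into_rule or out_of_rule):
            raise ValueError(f"illegal edge {pk} -> {ck}")
        self.edges.add((parent, child))

    def parents(self, node):
        return sorted(p for p, c in self.edges if c == node)

    def children(self, node):
        return sorted(c for p, c in self.edges if p == node)


# A one-step example with made-up instantiations of the predicates.
leg = LogicalEvidenceGraph()
leg.add_node("f1", FACT, "attackerLocated(internet)")
leg.add_node("f2", FACT, "hacl(internet, webServer, tcp, 80)")
leg.add_node("r1", RULE, "direct network access")
leg.add_node("c1", CONSEQUENCE, "netAccess(webServer, tcp, 80)")
for parent, child in [("f1", "r1"), ("f2", "r1"), ("r1", "c1")]:
    leg.add_edge(parent, child)
leg.observed.add("c1")
```

The edge check mirrors the structural constraint stated above: a fact node can never point directly at a consequence fact node, so every derived attack state must pass through a rule node.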
Lines 9 through 17 in Figure 2 describe Rules 1 and 2 in Table 1. The rules use the Prolog notation ":-" to separate the head (consequence fact) and the body (facts). Lines 1 through 8 in Figure 2 list the fact and consequence fact predicates of the two rules. Rule 1 in Lines 9 through 12 represents an attack step that states: if (i) the attacker is located in a "Zone" such as the "internet" (Line 10: attackerLocated(Zone)); and (ii) a host computer "H" can be accessed from the "Zone" using "Protocol" at "Port" (Line 11: hacl(Zone, H, Protocol, Port)); then (iii) host "H" can be accessed from the "Zone" using "Protocol" at "Port" (Line 9: netAccess(H, Protocol, Port)) via (iv) "direct network access" (Line 12: rule description).
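To make the reading of Rule 1 concrete, the sketch below mimics its effect in Python: it derives netAccess facts from attackerLocated and hacl facts. It is an illustration only and is not how the Prolog-based tool evaluates rules. The function name `direct_network_access`, the tuple encoding of facts and the host names are assumptions.

```python
def direct_network_access(facts):
    """Derive netAccess(H, Protocol, Port) in the manner of Rule 1.

    Each fact is encoded as (predicate_name, argument_tuple); this encoding
    is an assumption made for the sketch.
    """
    # Zones in which the attacker is located (body condition i).
    zones = {args[0] for name, args in facts if name == "attackerLocated"}
    derived = set()
    for name, args in facts:
        if name == "hacl":
            zone, host, protocol, port = args
            # Body condition ii: the host is reachable from the attacker's zone.
            if zone in zones:
                derived.add(("netAccess", (host, protocol, port)))
    return derived


# Made-up facts: only the first hacl entry is reachable from the attacker's zone.
facts = {
    ("attackerLocated", ("internet",)),
    ("hacl", ("internet", "webServer", "tcp", 80)),
    ("hacl", ("intranet", "dbServer", "tcp", 3306)),
}
derived = direct_network_access(facts)
```

Both body conditions must hold for the head to be derived, which is why the second hacl fact produces no netAccess fact in this example.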

Computing Probabilities
Bayesian networks can be represented as directed acyclic graphs whose nodes represent random variables (events or evidence in this work) and arcs model direct dependencies between random variables [11]. Every node has a table that provides the conditional probability of the node's variable given the combination of the states of its parent variables.
Definition 2 (Bayesian Network (BN)): Let $X_1, X_2, \cdots, X_n$ be $n$ random variables connected in a directed acyclic graph. Then, the joint probability distribution of $X_1, X_2, \cdots, X_n$ can be computed using the Bayesian formula:
$$P(X_1, X_2, \cdots, X_n) = \prod_{j=1}^{n} P(X_j \mid \mathrm{parent}(X_j)) \tag{1}$$
A Bayesian network helps model and visualize dependencies between a hypothesis and evidence, and calculate the revised probability when new evidence is presented [9]. Figure 3 presents a causal view of hypothesis $H$ and evidence $E$. Bayes' theorem can be used to update an investigator's belief about hypothesis $H$ when evidence $E$ is observed:
$$P(H \mid E) = \frac{P(H) \cdot P(E \mid H)}{P(E)} \tag{2}$$
where $P(H \mid E)$ is the posterior probability of an investigator's belief in hypothesis $H$ given evidence $E$. $P(E \mid H)$, which is based on expert knowledge, is a likelihood function that assesses the probability of evidence assuming the truth of $H$. $P(H)$ is the prior probability of $H$ when the evidence has not been discovered and $P(E) = P(E \mid H) \cdot P(H) + P(E \mid \neg H) \cdot P(\neg H)$ is the probability of the evidence regardless of expert knowledge about $H$ and is referred to as a normalizing constant [5,9].
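The belief update described above can be sketched in a few lines of Python. The function name `posterior` and the numeric inputs are illustrative assumptions; in practice the prior and the two likelihoods would come from expert knowledge.

```python
def posterior(prior_h, p_e_given_h, p_e_given_not_h):
    """Return P(H|E) using Bayes' theorem with P(E) as the normalizing constant."""
    # P(E) = P(E|H) P(H) + P(E|not H) P(not H)
    p_e = p_e_given_h * prior_h + p_e_given_not_h * (1.0 - prior_h)
    if p_e == 0.0:
        raise ValueError("P(E) is zero, so the posterior is undefined")
    return prior_h * p_e_given_h / p_e


# Illustrative values only: an uninformative prior and evidence that is
# nine times more likely under H than under not H.
belief = posterior(prior_h=0.5, p_e_given_h=0.9, p_e_given_not_h=0.1)
```

With these inputs P(E) evaluates to 0.5, so the investigator's belief in H rises from the prior of 0.5 to a posterior of 0.9.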

Computing P (H|E)
A logical evidence graph involves the serial application of attack steps that are mapped to a Bayesian network as follows:
$N_c$ as the child of the corresponding $N_r$ shows that an attack step has occurred.
$N_r$ is the hypothesis of the attack step and is denoted by $H$. $N_f$ from the current attack step and $N_c'$ from the previous attack step as the parents of $N_r$ correspond to the attack evidence, showing the exploited vulnerability and the privilege the attacker used to launch the attack step.
$N_c$ propagates the dependency between the current attack step and the next attack step. $N_c$ is also the precondition of the next attack step.
Computing P(H|E) for a Consequence Fact Node. Equation (2) can be used to compute $P(H \mid E)$ for a consequence fact node of a single attack step when the previous attack step has not been considered. Because the rule node $N_r$ provides the hypothesis $H$ and both the fact node $N_f$ and the consequence fact node from a previous attack step $N_c'$ provide evidence $E$, the application of Bayes' theorem yields:
$$P(H \mid E) = P(N_r \mid E) = \frac{P(N_r) \cdot P(E \mid N_r)}{P(E)} \tag{3}$$
The fact nodes from the current attack step and the consequence fact node from a previous attack step are independent of each other. They constitute the body of a rule, deriving the consequence fact node for the current attack step as the head of the rule. Consequently, their logical conjunction provides the conditions that are used to arrive at the rule conclusion. Accordingly, if a rule node has $k$ parents $N_{p1}, N_{p2}, \cdots, N_{pk}$ that are independent, then $P(E) = P(N_{p1}, N_{p2}, \cdots, N_{pk}) = P(N_{p1} \cap N_{p2} \cap \cdots \cap N_{pk}) = P(N_{p1}) \cdot P(N_{p2}) \cdots P(N_{pk})$ (note that $\cap$ denotes the AND operator). Due to the independence, given rule $N_r$, $P(E \mid N_r) = P(N_{p1}, N_{p2}, \cdots, N_{pk} \mid N_r) = P(N_{p1} \mid N_r) \cdot P(N_{p2} \mid N_r) \cdots P(N_{pk} \mid N_r)$. Hence, by applying Equation (3), $P(H \mid E)$ for a consequence fact node is computed as:
$$P(H \mid E) = P(N_r \mid E) = \frac{P(N_r) \cdot P(N_{p1} \mid N_r) \cdot P(N_{p2} \mid N_r) \cdots P(N_{pk} \mid N_r)}{P(N_{p1}) \cdot P(N_{p2}) \cdots P(N_{pk})} \tag{4}$$
However, because $P(E \mid N_r)$ represents the subjective judgment of a forensic investigator, it would be difficult for human experts to assign $P(N_{p1} \mid N_r), P(N_{p2} \mid N_r), \cdots, P(N_{pk} \mid N_r)$ separately. Therefore, the forensic investigator has the discretion to use Equation (3) to compute $P(E \mid N_r)$ directly.
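The single-step computation with independent parents can be sketched as follows. This is an illustrative Python sketch under the independence assumption stated above; the function name `rule_posterior` and all probability values are assumptions, not values from the case study.

```python
from math import prod


def rule_posterior(prior_rule, parent_priors, parent_likelihoods):
    """P(N_r|E) for a rule node with k independent parents.

    parent_priors[j]      stands for P(N_pj)
    parent_likelihoods[j] stands for P(N_pj | N_r)
    """
    if len(parent_priors) != len(parent_likelihoods):
        raise ValueError("one prior and one likelihood are needed per parent")
    p_e = prod(parent_priors)                  # P(E) as a product over parents
    p_e_given_rule = prod(parent_likelihoods)  # P(E | N_r) as a product over parents
    if p_e == 0.0:
        raise ValueError("P(E) is zero, so the posterior is undefined")
    return prior_rule * p_e_given_rule / p_e


# Illustrative numbers for a rule node with two parents.
step_posterior = rule_posterior(
    prior_rule=0.5,
    parent_priors=[0.8, 0.9],
    parent_likelihoods=[0.9, 0.95],
)
```

Here P(E) = 0.72 and P(E|N_r) = 0.855, giving a posterior of 0.59375. An investigator who prefers to judge P(E|N_r) as a whole can pass a single-element list for each argument instead of per-parent values.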
Computing P(H|E) for the Entire Graph. Next, it is necessary to compute $P(H \mid E)$ for the entire logical evidence graph comprising the attack paths. Any chosen attack path in a logical evidence graph is a serial application of attack steps. An attack step only depends on its direct parent attack steps and is independent of all the ancestor attack steps in the attack path. Upon applying Definition 2, the following equation is obtained:
$$P(H \mid E) = P(H_1, H_2, \cdots, H_n \mid E_1, E_2, \cdots, E_n) = P(S_1) \cdot P(S_2 \mid S_1) \cdots P(S_n \mid S_{n-1}) \tag{5}$$
where $S_i$ ($i = 1..n$) denotes the $i$th attack step in an attack path.
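Let $N_{i,f}$, $N_{i,r}$ and $N_{i,c}$ be the fact, rule and consequence fact nodes, respectively, at the $i$th attack step. Then, Equation (5) may be written as:
$$P(H \mid E) = P(N_{1,r} \mid N_{1,f}) \cdot P(N_{2,r} \mid N_{1,c}, N_{2,f}) \cdots P(N_{n,r} \mid N_{n-1,c}, N_{n,f}) = \frac{P(N_{1,r}) \cdot P(N_{1,f} \mid N_{1,r})}{P(N_{1,f})} \cdots \frac{P(N_{n,r}) \cdot P(N_{n-1,c}, N_{n,f} \mid N_{n,r})}{P(N_{n-1,c}, N_{n,f})} \tag{6}$$
The product of the first $i$ factors is the joint posterior probability of the first $i$ attack steps (i.e., 1..i) given all the evidence from the attack steps (the evidence for attack step 1 is $N_{1,f}$; the evidence for attack step $i$ includes $N_{i-1,c}$ and $N_{i,f}$, where $i = 2..n$). This probability is propagated to the $(i+1)$th attack step by the consequence fact node $N_{i,c}$, which is also the precondition of the $(i+1)$th attack step. Algorithm 1 formalizes the computation of $P(H \mid E)$ for the entire logical evidence graph.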
Because a logical evidence graph may have several attack paths, to compute the posterior probability of each attack path, all the nodes are marked as WHITE (Lines 2 through 4 in Algorithm 1) and all the fact nodes from the first attack steps of all the attack paths are pushed to an empty queue (Lines 1 and 5). If the queue is not empty (Line 7), a fact node is taken out of the queue (Line 8) and a check is made to see if its child that is a rule node is WHITE (Lines 9 and 10). If the rule node is WHITE, a new attack path is created (Line 11), upon which Equation (6) is used recursively to compute the joint posterior probability of the entire attack path (Lines 16 through 30) and the node is marked as BLACK (Line 13) after the computation of the function PATH($N_{1,r}$) in Line 12 is complete. The above process is repeated until the queue holding the fact nodes from the first attack steps of all the attack paths is empty.
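The traversal described above can be sketched in Python as follows. This is a simplified illustration rather than a transcription of Algorithm 1: it assumes every node has at most one child (so attack paths do not branch), it takes the per-step prior, likelihood and evidence probability as given, and all names (`path_posterior`, `all_path_posteriors`, the node identifiers) are hypothetical.

```python
from collections import deque

WHITE, BLACK = "WHITE", "BLACK"


def path_posterior(first_rule, child, params):
    """Multiply the per-step posteriors along one attack path."""
    joint = 1.0
    rule = first_rule
    while rule is not None:
        prior, likelihood, p_evidence = params[rule]
        joint *= prior * likelihood / p_evidence  # Bayes' theorem for this step
        consequence = child[rule]                 # consequence fact node of the step
        rule = child.get(consequence)             # rule node of the next step, if any
    return joint


def all_path_posteriors(first_step_facts, child, params):
    """Start one path per untraversed rule node reachable from a first-step fact."""
    color = {rule: WHITE for rule in params}      # mark all rule nodes as WHITE
    queue = deque(first_step_facts)
    posteriors = []
    while queue:
        fact = queue.popleft()
        rule = child[fact]
        if color[rule] == WHITE:                  # a new attack path
            posteriors.append(path_posterior(rule, child, params))
            color[rule] = BLACK
    return posteriors


# Made-up two-step path: f1, f2 -> r1 -> c1 -> r2 -> c2, with f3 also feeding r2.
child = {"f1": "r1", "f2": "r1", "r1": "c1", "c1": "r2", "f3": "r2", "r2": "c2"}
# params[rule] = (P(N_r), P(E|N_r), P(E)); the values are illustrative only.
params = {"r1": (0.5, 0.9, 0.5), "r2": (0.5, 0.8, 0.5)}
posteriors = all_path_posteriors(["f1", "f2"], child, params)
```

Both first-step facts lead to the same rule node, so the coloring ensures that the path is evaluated once; its joint posterior is the product of the two step posteriors, 0.9 and 0.8.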

Computing the False Positive Rate
False positive and false negative errors exist in logical evidence graphs. A false negative arises when the investigator believes that an event was not caused by an attack, but it was in fact the result of an attack. A false positive arises when the investigator believes that an event was caused by an attack, but it was not. Clearly, it is necessary to estimate both types of errors.
Because a logical evidence graph is constructed using attack evidence chosen by the forensic investigator, there is always the possibility of false positive errors. Therefore, the cumulative false positive rate of the constructed attack paths must be computed. False negative errors are not computed in this work. The individual false positive estimate for an attack step is expressed as $P(E \mid \neg H)$, where $\neg H$ is the alternative hypothesis, usually written as "not H," and the value of $P(E \mid \neg H)$ can be obtained from expert knowledge. To demonstrate the computation of the cumulative false positive rate of an entire attack path, let $N_{i,f}$, $N_{i,r}$ and $N_{i,c}$ correspond to the fact, rule and consequence fact nodes, respectively, of the $i$th attack step. Then, the cumulative false positive rate of the entire attack path is computed as follows:
$$P(E \mid \neg H) = P(E_1 \cup E_2 \cup \cdots \cup E_n \mid \neg(H_1, H_2, \cdots, H_n)) = P(E_1 \mid \neg H_1) \cup P(E_2 \mid \neg H_2) \cup \cdots \cup P(E_n \mid \neg H_n) = 1 - (1 - P(E_1 \mid \neg H_1)) \cdot (1 - P(E_2 \mid \neg H_2)) \cdots (1 - P(E_n \mid \neg H_n)) \tag{7}$$
Note that all the evidence supporting an attack step is independent of the evidence supporting the other attack steps.
As mentioned above, $E_1$ in Equation (7) is $N_{1,f}$ and $E_i$ includes $N_{i-1,c}$ and $N_{i,f}$ ($i = 2..n$). The symbol $\cup$ denotes the noisy-OR operator [7]. For a serial connection, if any of the attack steps is a false positive, then the entire attack path is considered to be a false positive. Algorithm 2 formalizes the computation of $P(E \mid \neg H)$ for the entire evidence graph.
Lines 1 through 15 in Algorithm 2 are the same as in Algorithm 1 (i.e., they find a new attack path). Lines 16 through 29 use Equation (7) to recursively compute the cumulative false positive rate of an entire attack path.
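The noisy-OR combination of per-step false positive rates can be sketched as follows. This Python fragment covers only the arithmetic along one attack path, not the graph traversal of Algorithm 2; the function name `cumulative_false_positive` and the rates are illustrative assumptions.

```python
from math import prod


def cumulative_false_positive(step_rates):
    """Noisy-OR of the per-step false positive rates P(E_i | not H_i)."""
    for rate in step_rates:
        if not 0.0 <= rate <= 1.0:
            raise ValueError("each rate must be a probability in [0, 1]")
    # The path is a true positive only if every step is a true positive.
    return 1.0 - prod(1.0 - rate for rate in step_rates)


# Illustrative rates for a three-step attack path.
path_fp = cumulative_false_positive([0.1, 0.2, 0.05])
```

For these rates the result is 1 - 0.9 * 0.8 * 0.95 = 0.316. The cumulative rate can only grow as attack steps are added, which reflects the statement that a single false positive step makes the whole serial path a false positive.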

Case Study
This case study demonstrates how probabilistic attack scenarios can be reconstructed using Bayesian analysis [13].

Experimental Network
Figure 4 shows the experimental network [6] used to generate a logical evidence graph from post-attack evidence. In the network, the external Firewall 1 controls Internet access to a network containing a Portal Web Server and Product Web Server. The internal Firewall 2 controls access to a SQL Database Server that can be accessed from the web servers and workstations. The Administrator Workstation has administrative privileges to the Portal Web Server that supports a forum for users to chat with the administrator. In the experiment, the Portal and Product Web Servers and the Database Server were configured to log all accesses and queries as events and Snort was used as the intrusion detection system. The evidence in the case study constituted the logged events and intrusion alerts.
By exploiting vulnerabilities in a Windows workstation and a web server with access to the Database Server, the attacker was able to successfully launch two attacks on the Database Server and a cross-site scripting (XSS) attack on the Administrator Workstation. One of the attacks involved using a compromised workstation to access the Database Server.
The logging system and intrusion detection system captured evidence of network attack activities. Table 2 presents the processed data. Table 3 presents the post-attack evidence obtained using forensic tools.

Constructing the Graph
To employ the Prolog-based rules for evidence graph construction, the evidence and system state were codified as instantiations of the rule predicates as shown in Figure 5. In Figure 5, Lines 1 through 3 model evidence related to the post-attack status (Table 3), Lines 4 through 10 model the network topology (system setup), Lines 11 through 14 model system configurations and Lines 15 through 21 model vulnerabilities obtained from the captured evidence (Table 2).
The input file with rules representing generic attack techniques was submitted to the reasoning system along with two databases, including an anti-forensic database [6] and MITRE's CVE [8], to remove irrelevant evidence and obtain explanations for any missing evidence.
The results are: (i) according to the CVE database, Workstation 2, which is a Linux machine using Firefox as the web browser, rendered an attack using CVE-2009-1918 unsuccessful because the exploit only succeeds on Windows Internet Explorer; (ii) a new attack path expressing that the attacker launched phishing attacks at the clients using the Administrator's stolen session ID was found; and (iii) an attack path between the compromised Workstation 1 and the Database Server was found.
The network forensic analysis tool created the logical evidence graph shown in Figure 6. The nodes in Figure 6 are described in Tables 4 and 5. The third column of each table lists the logical operators used to distinguish fact nodes, rule nodes and consequence fact nodes. A fact node is marked as LEAF, a rule node is marked as AND (all of its parents must hold) and a consequence fact node is marked as OR (it can be derived by any one of its parent rules).
Figure 6 has three attack paths:
The attacker used an XSS attack to steal the Administrator's session ID and obtain administrator privileges to send phishing emails to clients (Nodes: 11 ...).
The attacker used a buffer overflow vulnerability (CVE-2009-1918) to compromise a workstation and then obtain access to the Database Server (Nodes: 34 → 33 → 32 → 31 → 30 → 28 → 18 → 17 → 16) (Middle).

Computations
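This section uses Algorithms 1 and 2 to compute $P(H \mid E_1, E_2, \cdots, E_n)$ and $P(E \mid \neg H)$ for the attack paths.
Using Algorithm 1 to Compute P(H|E1, E2..En). Algorithm 1 requires $P(N_{1,r})$ and the other prior, likelihood and evidence probabilities that appear in Equation (6). All these probabilities are derived from expert knowledge. To minimize subjectivity, the average value of the probability based on the judgments of multiple experts should be computed [5].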
Because the case study is intended to demonstrate the computations, for simplicity, all $P(H_i) = P(\neg H_i) = 50\%$ and $P(E_i) = k \in [0, 1]$ ($k$ obviously would differ for different evidence in a real scenario). Also, the $P(E_i \mid H_i)$ values were assigned based on the judgment of the authors of this chapter; the probability values of $P(E_i \mid H_i)$ are listed in Table 6. Under these assumptions, Bayes' theorem gives:
$$P(H_i \mid E_i) = \frac{P(H_i) \cdot P(E_i \mid H_i)}{P(E_i)} = c \cdot P(E_i \mid H_i)$$
where $c = \frac{1}{2k}$. Algorithm 1 is used to compute $P(H \mid E_1, E_2, \cdots, E_n)$ as shown in the last column of Table 6.
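The effect of these simplifying assumptions can be checked with exact arithmetic. The sketch below uses Python's `fractions` module; the value chosen for k and the likelihoods are made up for illustration and are not the values in Table 6.

```python
from fractions import Fraction

# Simplifying assumptions from the case study: P(H_i) = 1/2 and P(E_i) = k.
# The value of k below is hypothetical.
k = Fraction(4, 5)
c = Fraction(1, 2) / k          # c = 1 / (2k)


def step_posterior(p_e_given_h):
    """P(H_i|E_i) = c * P(E_i|H_i) under the assumptions above."""
    return c * p_e_given_h


def joint_posterior(likelihoods):
    """Joint posterior of an n-step path: c**n times the product of likelihoods."""
    result = Fraction(1)
    for p in likelihoods:
        result *= step_posterior(p)
    return result


one_step = step_posterior(Fraction(9, 10))
two_steps = joint_posterior([Fraction(9, 10), Fraction(4, 5)])
```

With k = 4/5 the constant is c = 5/8, so a likelihood of 9/10 yields a step posterior of 9/16. Each additional attack step contributes one more factor of c, which is why joint posteriors of longer paths are naturally expressed as multiples of a power of c.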
Note that Node 17 has two joint posterior probabilities, which are from the middle path and right path, respectively. Note also that the middle attack path has a lower probability than the right attack path. This is because the attacker destroyed the evidence obtained from the middle path that involved using a compromised workstation to gain access to the database. Additionally, the $P(E \mid H)$ value is lower. Therefore, the corresponding hypothesized attack path has a much lower probability. In reality, it is unlikely that the same attacker would attempt a different attack path to attack the same target if the previous attack had already succeeded. A possible scenario is that the first attack path was not anticipated, so the attacker attempted to launch the attack via the second attack path. The joint posterior probability $P(H \mid E_1, E_2, \cdots, E_n)$ could help an investigator select the most pertinent attack path.
The right attack path is the most convincing attack path because it has the largest $P(H \mid E)$ value and smallest $P(E \mid \neg H)$ value. The left attack path is not convincing because its joint posterior probability is less than $0.5c^4$. The middle path is not so convincing because it has a higher cumulative false positive rate, suggesting that the attack path should be re-evaluated to determine if it corresponds to a real attack scenario.

Conclusions
The principal contribution of this research is a method that automates the construction of a logical evidence graph using rules and mapping the graph to a Bayesian network so that the joint posterior probabilities and false positive rates corresponding to the constructed attack paths can be computed automatically. The case study demonstrates how the method can guide forensic investigators to identify the most likely attack scenarios that fit the available evidence. Also, the case study shows that the method and the companion tool can reduce the time and effort involved in network forensic investigations. However, the method cannot deal with zero-day attacks; future research will attempt to extend the underlying model to address this deficiency.
This paper is not subject to copyright in the United States. Commercial products are identified in order to adequately specify certain procedures. In no case does such an identification imply a recommendation or endorsement by the National Institute of Standards and Technology, nor does it imply that the identified products are necessarily the best available for the purpose.

Table 1. Descriptions of the nodes in Figure 1.

Algorithm 1: Computing P(H|E) for the entire graph.
1: Qg ← Ø ▷ set Qg to empty
2: for each node n ∈ LEG do
...
5: ENQUEUE(Qg, N1,f) ▷ push all fact nodes from the first attack step to queue Qg
6: j ← 0 ▷ use j to identify the attack path being computed
7: while Qg ≠ Ø do ▷ when queue Qg is not empty
...
9: N1,r ← child[n] ▷ find a rule node as the child node of n
10: if (color[N1,r] ≡ WHITE) then ▷ if the rule node is not traversed (white)
11: j ← j + 1 ▷ must be a new attack path
12: P[j] ← PATH(N1,r) ▷ compute joint posterior probability of the path
13: color[N1,r] ← BLACK ▷ mark the rule node as black
...
23: Ni,r ← child[Ni−1,c] ▷ rule node as H of the i-th attack step
24: E ← parents[Ni,r] ▷ evidence for the i-th attack step
25: Ni,c ← child[Ni,r] ▷ consequence fact node of the i-th attack step
26: P[Ni,c] ← P(Ni,r|E) ← P(Ni,r)P(E|Ni,r)/P(E) ▷ posterior probability of the i-th attack step

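Algorithm 2: Computing P(E|¬H) for the entire graph.
1: Qg ← Ø ▷ set Qg to empty
2: for each node n ∈ LEG do
...
5: ENQUEUE(Qg, N1,f) ▷ push all fact nodes from the first attack step to queue Qg
6: j ← 0 ▷ use j to identify the attack path being computed
7: while Qg ≠ Ø do ▷ when queue Qg is not empty
...
10: if (color[N1,r] ≡ WHITE) then ▷ if the rule node is not traversed (white)
...
23: Ni,r ← child[Ni−1,c] ▷ rule node as H of the i-th attack step
24: Ni,c ← child[Ni,r] ▷ consequence fact node of the i-th attack step
25: E ← parents[Ni,r] ▷ evidence for the i-th attack step
26: ... ▷ cumulative false positive rate
27: color[E] ← BLACK ▷ mark all traversed evidence as black
28: end for
29: return Pf ▷ return the cumulative false positive rate of the attack path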

Table 2. Evidence comprising logged events and alerts.
Figure 5. Input file for generating the logical evidence graph.

Table 4. Descriptions of the nodes in Figure 6.

Table 5. Descriptions of the nodes in Figure 6 (continued).