An Improved Technique for Logic Gate Susceptibility Evaluation of Single Event Transient Faults



Introduction
Manufacturing precision limitations and supply voltage reduction, combined with higher operating frequency and power dissipation, are the primary concerns for technology scaling in nanoscale designs. These challenges severely impact the reliability of a system and consequently increase the need for reliable design. Circuit reliability has been pointed out as one of the major challenges in deep sub-micron CMOS circuits [1]. Meanwhile, with the omnipresence of electronics in our daily lives, there is even more demand for reliable system design. Despite these difficulties and the fact that chips cannot be retested at the factory, users expect the system to remain reliable and to continue to deliver the rated performance [2].
These limitations in the fabrication process may increase the number of faults in circuits, reducing their reliability. To mitigate this problem, different kinds of redundancy have been explored. Redundancy allows circuits to produce correct outputs even in the presence of errors [3]. These techniques are usually based on redundancy in time, hardware, and/or information [4]. Although any redundancy-based strategy imposes extra overhead, it is still of high interest since the fabrication yield is predicted to become extremely low in nanoscale designs [5].
To avoid overdesign and guarantee the best option for the fabricated circuit, many reliability evaluation methods may be used. An accurate method is the Probabilistic Transfer Matrix (PTM) [6], which is the basis for other methods, such as the Signal Probability Reliability Multi-Pass (SPR-MP) [7]. Since testing a circuit is a high-cost task, probabilistic methods that estimate the reliability of a circuit from the reliability of its gates are of even greater interest. Besides, these methods are suited to reliability analysis under multiple-fault scenarios. A known limitation of these methods is the simplifying assumption that all logic gates have the same error probability. The work proposed in [8] presents a transistor-level method to create the logic gate probabilistic matrices. The resulting matrices show that it is essential to observe the transistor arrangements to produce more accurate matrices for the logic gates that feed the probabilistic methods.
The models for evaluating the reliability of logic gates proposed so far consider transistor arrangement information to calculate the reliability of a gate given a type of fault. As a transient fault occurs on the sensitive nodes of the gates, stick diagram information becomes necessary for an accurate estimation. Therefore, this work proposes a probabilistic method capable of evaluating the susceptibility of logic gates to SET without the need for electrical simulations. It is important to emphasize that the evaluation is independent of technology, since the stick diagrams are evaluated according to the number of sensitive areas. Another important point is that the specific effects of charge sharing in the transistors are not considered: given the incidence of a particle, the fault is always assumed to be present at the affected node.
This chapter is organized as follows: Section 2 presents a brief overview of basic reliability concepts and introduces three methods that explore PTMs to estimate circuit reliability. Section 3 introduces the Single Event Transient effect and its definitions. In Section 4, the methodology proposed to calculate the susceptibility of a logic gate described as a stick diagram is explained, together with a case study using a two-input NAND. Section 5 presents results and characteristics of the impact of the layouts on single event transient susceptibility. Finally, Section 6 presents the conclusions.

Background
This section first introduces reliability metrics and the PTM concept. Then, three reliability estimation methods are discussed. These methods were chosen for their accuracy or runtime, and because all of them explore the PTM concept to estimate circuit reliability.

Reliability Concepts
Metrics The reliability (R or Q) of a circuit is defined as the probability that the circuit operates correctly during a time interval. Its complement, the probability of a failure, is defined as the fault probability (P), as shown in Eq. 1.
The failure rate (λ), calculated using Eq. 2, is one of the metrics used for digital circuit reliability estimation. It indicates the number of failures that a circuit can present in one hour of operation. Similarly, the Mean Time Between Failures (MTBF) represents, as the name indicates, the time between failures in the evaluated circuit. Equation 3 shows how this metric is obtained. Since the MTBF corresponds to the mean time between failures, the higher this value, the more reliable the circuit. Both are important metrics used to compare the reliabilities of different systems.
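The displayed forms of Eqs. 1 to 3 are not reproduced in this text. As a reading aid only, the standard textbook forms consistent with the definitions above can be sketched as follows. The symbols N_f (number of observed failures) and T (operating time in hours) are assumptions introduced here, and the original equations may use a different notation.

```latex
\begin{align}
P &= 1 - R \\
\lambda &= \frac{N_f}{T} \\
\mathrm{MTBF} &= \frac{1}{\lambda}
\end{align}
```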
Probabilistic Transfer Matrix The probabilistic transfer matrix, abbreviated as PTM, represents the probability of success or failure of a logic gate for each input vector. This representation is very important in reliability analysis since it is used in several circuit estimation methods, such as the three discussed later. The matrix maps the possible inputs to the respective outputs of a given circuit. To understand how the PTM is generated, it is necessary to know the ideal transfer matrix (ITM), which represents the behavior of a logic gate or circuit in a fault-free scenario. From the truth table of a given logic gate, it is possible to determine the ITM and consequently the output that is supposed to be correct; the PTM is then filled by assigning the chosen probability to that output. In the presence of faults, the correct output does not always occur. If we know how frequently this happens, it is possible to map all possible conditions of the gate using a PTM. Fig. 1 shows how to generate the PTM of a two-input NAND gate based on its truth table and ITM. In this case, the PTM considers that the correct output occurs with probability "q". In the same way, the erroneous output occurs with the complementary probability, "1-q".
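The construction described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the usual layout with one row per input vector and one column per output value; the function names `build_itm` and `build_ptm` and the value q = 0.95 are hypothetical choices made only for this example.

```python
def nand2(a, b):
    """Truth table of a two-input NAND gate."""
    return 1 - (a & b)


def build_itm(gate, n_inputs):
    """Ideal transfer matrix: rows are input vectors, columns are output 0 and output 1."""
    rows = []
    for v in range(2 ** n_inputs):
        # Bits of the input vector, most significant input first (assumed ordering).
        bits = [(v >> (n_inputs - 1 - k)) & 1 for k in range(n_inputs)]
        out = gate(*bits)
        rows.append([1 if col == out else 0 for col in (0, 1)])
    return rows


def build_ptm(itm, q):
    """Replace each ITM entry: the correct output gets q, the wrong output gets 1 - q."""
    return [[q if entry == 1 else 1 - q for entry in row] for row in itm]


itm = build_itm(nand2, 2)
ptm = build_ptm(itm, 0.95)  # q = 0.95 is an illustrative value, not taken from the text
for row_itm, row_ptm in zip(itm, ptm):
    print(row_itm, row_ptm)
```

Each row of the resulting PTM sums to one, since for every input vector the gate produces either the correct or the erroneous output.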

Circuit Reliability Estimation Methods
With the basic reliability concepts reviewed, this section introduces three circuit reliability estimation methods. All three methods explore the PTM concept. The first one shares its name with the concept. To avoid misunderstandings, we always use "PTM Method" to refer to the estimation method and only "PTM" to refer to the concept.
Probabilistic Transfer Matrices Method - PTM Method Many methods to estimate the reliability of a circuit have been proposed in the literature [9]. The Probabilistic Transfer Matrix Method (PTM Method), proposed by Patel et al. [10], is able to produce an exact reliability evaluation of a logic circuit in a straightforward process [11]. The method was extensively explored by Krishnaswamy et al. [12]. In the PTM Method, the reliability of a circuit is obtained by combining the reliability of the individual gates with the circuit's topology. The reliability of the individual gates and of the circuit is represented by PTM and ITM matrices.
In a simplified way, each gate can be modeled by a PTM, and the PTM of a larger circuit can be computed by multiplying the PTMs of logic functions in series and applying the Kronecker (tensor) product to the PTMs of logic gates that are at the same depth level of the circuit. The circuit reliability is extracted according to Equation 4, where p(i) denotes the probability of input vector i [13]. If all input vectors have the same probability, Equation 4 simplifies to Equation 5.
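The composition rule above can be illustrated with a small sketch. The example circuit (two NAND2 gates in parallel feeding a third NAND2) is hypothetical, and the reliability expression used, the sum of p(i)·PTM(i,j) over the entries where ITM(i,j) = 1, is the form commonly given for the PTM Method; it is assumed here to correspond to Equations 4 and 5, which are not reproduced in this text.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def matmul(a, b):
    """Plain matrix product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def nand2_matrices(q):
    """ITM and PTM of a NAND2 gate whose output is correct with probability q."""
    itm = [[0, 1], [0, 1], [0, 1], [1, 0]]
    ptm = [[q if e else 1 - q for e in row] for row in itm]
    return itm, ptm


def reliability(ptm, itm, p=None):
    """Sum p(i) * PTM(i, j) over the entries where ITM(i, j) = 1 (uniform p by default)."""
    n_rows = len(itm)
    if p is None:
        p = [1.0 / n_rows] * n_rows
    return sum(p[i] * ptm[i][j]
               for i in range(n_rows)
               for j in range(len(itm[i]))
               if itm[i][j] == 1)


def tree_circuit(q):
    """Hypothetical circuit: two parallel NAND2 gates (level 1) feeding one NAND2 (level 2)."""
    itm, ptm = nand2_matrices(q)
    circuit_ptm = matmul(kron(ptm, ptm), ptm)  # parallel gates: Kronecker; series levels: product
    circuit_itm = matmul(kron(itm, itm), itm)
    return circuit_ptm, circuit_itm


circuit_ptm, circuit_itm = tree_circuit(0.9)
print(len(circuit_ptm), "x", len(circuit_ptm[0]))  # 16 input vectors, 2 output values
print(reliability(circuit_ptm, circuit_itm))
```

Even for this four-input example the level-1 matrix already has 16 rows, which hints at the growth in matrix size discussed next.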
The main limitation of the PTM Method is the size of the matrices that must be stored and manipulated. Each level in a logic circuit is represented by a PTM. The size of a PTM is a function of the number of inputs and outputs being modeled. The number of rows in a PTM is equal to 2^n, where n is the number of inputs of the circuit level. The number of columns is equal to 2^m, where m is the number of outputs of the circuit level. For a circuit level with 24 inputs and 12 outputs, for example, the PTM of the level has 2^24 rows by 2^12 columns, that is, 2^36 entries, or 512 GB of storage space with an 8-byte floating-point representation of probabilities. Given this scenario, the application of the PTM Method is limited to small circuits, even with techniques that improve the efficiency of the method [13].
Signal Probability Reliability - SPR The SPR method is another method that explores PTM and ITM matrices to map the reliability behavior of logic gates in a circuit. The method, proposed in [14], introduces the concept of the signal probability matrix. This concept avoids the generation of large matrices to represent the intermediate circuit states, as required in the PTM Method.
The signal probability matrix is a 2 × 2 matrix. It represents the 4 possible states of a signal: a correct 0 (#0), a correct 1 (#3), an incorrect 0 (#2) and an incorrect 1 (#1), as shown in Figure 2. The probability matrix of a gate output signal is computed by multiplying the probability matrices of the input signals by the PTM of the logic gate. Consequently, the complexity of SPR is linear in the number of gates [15]. This makes the method scalable and applicable to circuits with thousands of logic gates. The reliability of the entire circuit, R_C, can be extracted according to Equation 6, where R_j is the reliability of each circuit output signal and m is the number of circuit outputs [14]. Despite these advantages, the SPR method does not take into account the probability dependence of reconvergent signals, producing reliability values whose inaccuracy depends on the number of reconvergent fanout signals in the circuit [16].
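A minimal sketch of the four-state propagation described above is given below. Each signal is stored as a dictionary keyed by an (ideal value, actual value) pair, which is an implementation choice made here rather than the notation of [14]; the function names and the product form assumed for Equation 6 are likewise illustrative assumptions. Input signals are treated as independent, which is exactly the simplification that makes SPR inaccurate under reconvergent fanout.

```python
from itertools import product


def nand(*bits):
    return 0 if all(bits) else 1


def fault_free(p_one):
    """Signal without errors. Keys are (ideal value, actual value)."""
    return {(0, 0): 1.0 - p_one,  # correct 0
            (1, 1): p_one,        # correct 1
            (1, 0): 0.0,          # incorrect 0 (should have been 1)
            (0, 1): 0.0}          # incorrect 1 (should have been 0)


def propagate(gate, q, inputs):
    """Output signal states of a gate that is correct with probability q."""
    out = {(0, 0): 0.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 0.0}
    for states in product(*(sig.items() for sig in inputs)):
        prob = 1.0
        for _, p in states:
            prob *= p
        ideal = gate(*(key[0] for key, _ in states))      # output of the fault-free circuit
        computed = gate(*(key[1] for key, _ in states))   # gate function on the actual inputs
        out[(ideal, computed)] += prob * q                # gate itself works
        out[(ideal, 1 - computed)] += prob * (1.0 - q)    # gate itself flips its output
    return out


def signal_reliability(sig):
    return sig[(0, 0)] + sig[(1, 1)]


def circuit_reliability(outputs):
    """Assumed form of Equation 6: product of the output signal reliabilities."""
    r = 1.0
    for sig in outputs:
        r *= signal_reliability(sig)
    return r


a = fault_free(0.5)
y1 = propagate(nand, 0.9, [a, a])
y2 = propagate(nand, 0.9, [a, a])
z = propagate(nand, 0.9, [y1, y2])
print(signal_reliability(y1), circuit_reliability([z]))
```

Because each gate is visited once and only four states per signal are stored, the cost grows with the number of gates rather than with the number of input vectors.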
Signal Probability Reliability Multi-Pass - SPR-MP Considering the accuracy limitations of the SPR method, which is a straightforward algorithm, an alternative based on multiple passes of probability propagation was proposed in [17] and is referred to as SPR Multi-Pass, or SPR-MP. In the SPR-MP method, the probabilities associated with each reconvergent signal are propagated 4 times, with a single signal state being propagated at a time. The values computed at each pass of the algorithm are accumulated to produce the final value.
As with the SPR method, there is no memory limitation associated with the SPR-MP method, but processing time depends on the number of reconvergent fanout signals [16]. Equation 7 represents the number of passes of the algorithm needed to compute the reliability of a circuit with F reconvergent fanouts. The main advantage of the SPR-MP method is the possibility of restricting the number of fanouts (and thus the number of passes of the algorithm) considered in the reliability computation. This characteristic allows a tradeoff between processing time and accuracy, leading to better scalability than the PTM Method and better accuracy than the SPR method [18].
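Equation 7 is not reproduced in this text. Since each reconvergent signal is propagated once for each of its 4 states, the number of passes for F reconvergent fanouts presumably grows as shown below; the symbol N_passes is an assumption introduced here for illustration.

```latex
\[
N_{\mathrm{passes}} = 4^{F}
\]
```

Under this reading, considering only 5 of the reconvergent fanouts would already require 1024 passes, which illustrates why restricting F is the main lever for trading accuracy against runtime.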

Single Event Transient
Many advances in integrated circuits have been achieved thanks to technology scaling. The fabrication of ever more capable computing architectures has been enabled by smaller, faster, and cheaper fundamental microelectronic building blocks. However, supply voltages have also been scaled down. This reduces the amount of charge that represents stored information, increasing the sensitivity of CMOS devices to single-particle charge collection transients. Also, the higher operating frequencies can intensify soft errors due to the reduction in timing masking.
A Single Event Transient (SET) is caused by the charge generated when a single particle passes through a sensitive node in a combinational circuit. Such a strike can produce a wrong output value during a time interval. The pulse generated by the particle strike can have a positive or negative magnitude, depending on whether the particle hits the sensitive node of an NMOS or a PMOS transistor.
In the literature, the definition of a sensitive node for CMOS circuits is subject to some misunderstandings. For example, [19], [20], and [21] consider the drain of the OFF transistors to be the sensitive node, taking an inverter gate as an example. This assumption is not mistaken for that example, as shown in Fig. 3. The inverter gate biased with the logic value "1" presents as its only sensitive node the drain of the PMOS OFF-transistor. It is therefore possible to affirm that the sensitive PN junction of the gate is the drain of the OFF-transistor, but only in the case of the inverter. The more general statement is that a sensitive node is a reverse-biased PN junction [22] [23]. When particles hit the silicon bulk, minority carriers are created. If they are collected by the source/drain diffusion regions, the voltage of those nodes changes [24].
Fig. 3. Single Event Transient Mechanism: Inverter example of a particle strike at a sensitive node [25]
Besides, consider the NAND2 gate shown in Fig. 2 as an example. The output node G, which belongs to transistor M1, is sensitive when the input vector DE="10" is applied, although M1 is an ON-transistor. Furthermore, as the behavior of SET faults differs between PMOS and NMOS particle strikes, it is assumed that the primary condition of a reverse-biased PN junction is satisfied in the complementary OFF-plane of the gate, rather than in the OFF-transistors.
Moreover, some internal nodes of a gate are not always sensitive to a particle strike. The pulse generated by a particle strike at an internal node may not propagate if there is no logically sensitized path to the output. The pulse propagation from a sensitive node to the output therefore depends on the state of the inputs [20]. Fig. 4 shows an example of a sensitive node and pulse propagation in a NAND gate. When DE="10", there is a sensitized path between N3 and G, making N3 a sensitive node for this specific input vector. The input vector "11" also creates a sensitized path between N3 and G; in this condition, however, the node is not reverse-biased.

SET susceptibility analysis
The reliability of a circuit is related to the probability that the circuit performs the function for which it was designed, under certain conditions, during a given time interval [26]. The results for error probability (EP) in [8] indicate that the equal EP values traditionally assumed for logic gates in reliability evaluation underestimate the real EPs of the logic gates, which in turn affects the estimated circuit reliability. This chapter presents a method able to evaluate Single Event Transient fault susceptibility in a logic gate. A preliminary version of our work appeared in [27]. The previous work was extended with a more detailed evaluation at the stick diagram level and an electrical validation of the results. The method proposed in [8], which evaluates logic gates at the transistor level, does not evaluate precisely the case in which a parallel transistor association results in two or more nodes at the layout level. Also, since a logic cell can be designed in different ways, a stick-level analysis is needed.
This section presents the method proposed to evaluate the susceptibility of logic gates to transient faults. First, a definition of fault as a probabilistic event is presented, which is the basis of the method that analyzes stick diagrams.

Definition of fault as a probabilistic event
A logic gate is defined as X, which has a set of nodes N. The probability of a particle incidence on a node i ∈ N is defined as p, that is, P(i) = p.
The incidence of a particle on a specific node of a logic gate is treated, in this case, as an independent event. Consequently, the probability that a particle causes an error is calculated as the probability that it strikes any sensitive node, given an input vector. Particle incidences are considered independent in the sense of probability theory: two events are independent when the occurrence of one does not affect the probability of the other.
Therefore, considering that a logic gate has k sensitive nodes, the output error probability is defined as the probability of the union of the particle incidences on the sensitive nodes of the gate. Following probability theory, for N events A_1, ..., A_N that cause a fault at the output, the probability of this union is given by Eq. 8,
\[ P\left(A_1 \cup A_2 \cup \dots \cup A_N\right) = \sum_{k=1}^{N} (-1)^{k+1} S_k \qquad (8) \]
where S_k is defined by Eq. 9 [28],
\[ S_k = \sum_{1 \le i_1 < \dots < i_k \le N} P\left(A_{i_1} \cap \dots \cap A_{i_k}\right) \qquad (9) \]
Note that k represents the number of elements in each intersection. For instance, assuming k=2 and three events (A_1, A_2 and A_3), the value S_2 corresponds to the sum of the probabilities of the intersection of each possible pair of elements, that is, S_2 = P(A_1 ∩ A_2) + P(A_1 ∩ A_3) + P(A_2 ∩ A_3).
With these equations defined, it is possible to apply the method considering the number of sensitive nodes for an input vector of a logic gate. The next subsections present the method to evaluate the susceptibility of logic gates, which depends on the probability of a particle incidence.
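The union probability described by Eq. 8 and Eq. 9 can be checked numerically with a short sketch. It assumes, as stated above, that the particle incidences are independent, so that the probability of an intersection is the product of the individual probabilities; the function names and the sample value 0.1 are illustrative only.

```python
from itertools import combinations
from math import prod


def s_k(probs, k):
    """Sum of the intersection probabilities over all groups of k events.

    Independence is assumed, so each intersection is a product of probabilities.
    """
    return sum(prod(group) for group in combinations(probs, k))


def union_probability(probs):
    """Probability that at least one event occurs (inclusion-exclusion)."""
    n = len(probs)
    return sum((-1) ** (k + 1) * s_k(probs, k) for k in range(1, n + 1))


def union_closed_form(probs):
    """Equivalent closed form under independence: 1 - prod(1 - p_i)."""
    return 1.0 - prod(1.0 - p for p in probs)


events = [0.1, 0.1, 0.1]  # three events with an illustrative probability of 0.1 each
print(s_k(events, 2))     # sum over the three possible pairs
print(union_probability(events), union_closed_form(events))
```

For k sensitive nodes with the same strike probability p, both forms reduce to 1 - (1 - p)^k, which is approximately k·p when p is small.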

Simplified method
The stick diagram method relies on the theory previously presented for its operation.The flowchart described in Fig. 5 represents the analysis of the stick model.
Consider, for example, the stick diagram in Fig. 6 for a two-input NAND function. This diagram has six nodes in total, two connected to VDD (n1 and n3) and one connected to GND (n4). None of these three nodes is considered sensitive by the method, as they are connected to the circuit power supplies. The other nodes (n2, n5, and n6) may be sensitive depending on the input vector. Also, consider that the probability of occurrence of a particle in a sensitive node is set to p.
For the input vector AB = 00, the expected output of the logic function is the logical value "1". It means that the pull-up plane of the gate is conducting and that a transient fault may only affect the circuit if it occurs in the pull-down plane. Node 6 becomes sensitive as it is reverse-biased. Node 5 is not sensitized due to the lack of a conductive path to the output. Thus, the susceptibility is given by the probability of the incidence of a particle in node 6.
For the input vector AB = 01, the expected output of the logic function is the logical value "1". That is, just like the previous state, the fault may only affect the circuit if it occurs in the pull-down plane. As in the previous vector, only node 6 is sensitive because it is reverse-biased. Thus, the susceptibility is given by the probability of the incidence of a particle in node 6.
For the input vector AB = 10, the expected output of the logic function is the logical value "1". That is, just like the previous states, the fault may only affect the circuit if it occurs in the pull-down plane. In this vector, nodes 5 and 6 are reverse-biased and have a conductive path to the output. Thus, the susceptibility is given by the probability of the incidence of a particle at node 5 or node 6.
For the input vector AB = 11, the expected output of the logic function is the logical value "0". That is, the fault may only affect the circuit if it occurs in the pull-up plane. In this vector, node 2 is reverse-biased. Thus, the susceptibility is given by the probability of the incidence of a particle at node 2. Table 1 summarizes the values and equations for each vector.
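The per-vector analysis above can be tabulated with a short sketch. The sensitive nodes per input vector are taken from the walkthrough; the susceptibility is computed as the union probability of independent strikes, 1 - (1 - p)^k for k sensitive nodes, and the value p = 1.98 × 10^-6 is the estimate used later in the Results section. The names `SENSITIVE_NODES` and `susceptibility` are hypothetical.

```python
P_STRIKE = 1.98e-6  # particle strike probability per sensitive node (value from the Results section)

# Sensitive nodes of the NAND2 stick diagram for each input vector AB, as described above.
SENSITIVE_NODES = {
    "00": ["n6"],
    "01": ["n6"],
    "10": ["n5", "n6"],
    "11": ["n2"],
}


def susceptibility(num_sensitive, p):
    """Probability that a particle hits at least one of the sensitive nodes."""
    return 1.0 - (1.0 - p) ** num_sensitive


table = {vector: susceptibility(len(nodes), P_STRIKE)
         for vector, nodes in SENSITIVE_NODES.items()}

for vector, value in table.items():
    nodes = ", ".join(SENSITIVE_NODES[vector])
    print(f"AB={vector}  sensitive: {nodes:<7}  susceptibility: {value:.3e}")
```

With two sensitive nodes (AB = 10) the result is 2p - p^2, slightly below 2p because the case where both nodes are struck is counted only once.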

Method Validation
In this section, the methodology used to validate the proposed method is explained. The method is based on two main rules to determine the sensitive nodes of a logic gate. For a node to be sensitive, it must be in a reverse-biased condition. Furthermore, a low-resistance path must exist between the affected node and the output of the gate. The flowchart of the method validation is presented in Fig. 7.
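The two rules can be illustrated with a switch-level sketch for the NAND2 gate. Everything structural in it is an assumption made for illustration: the mapping of nodes n1 to n6 to electrical nets, the choice of which series NMOS transistor is driven by input A, and the simplification that an n-type diffusion is reverse-biased when it sits at logic "1" while a p-type diffusion is reverse-biased when it sits at logic "0". It is not the authors' implementation.

```python
from itertools import product

VDD, GND, OUT = "VDD", "GND", "OUT"

# Diffusion nodes: name -> (diffusion type, electrical net). ASSUMED topology.
NODES = {
    "n1": ("p", VDD), "n2": ("p", OUT), "n3": ("p", VDD),
    "n4": ("n", GND), "n5": ("n", "MID"), "n6": ("n", OUT),
}
# Transistors: (type, gate input, net on one side, net on the other side). ASSUMED.
TRANSISTORS = [
    ("p", "A", VDD, OUT), ("p", "B", VDD, OUT),
    ("n", "A", OUT, "MID"), ("n", "B", "MID", GND),
]


def is_on(kind, gate_value):
    return gate_value == 1 if kind == "n" else gate_value == 0


def connected_nets(start, inputs):
    """Nets reachable from `start` through ON transistors (low-resistance paths)."""
    seen, stack = {start}, [start]
    while stack:
        net = stack.pop()
        if net in (VDD, GND) and net != start:
            continue  # do not conduct through a supply rail
        for kind, gate, a, b in TRANSISTORS:
            if net in (a, b) and is_on(kind, inputs[gate]):
                other = b if net == a else a
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
    return seen


def sensitive_nodes(inputs):
    result = set()
    for name, (kind, net) in NODES.items():
        if net in (VDD, GND):
            continue  # nodes tied to the power supplies are never sensitive
        nets = connected_nets(net, inputs)
        if OUT not in nets:
            continue  # rule 2: no low-resistance path to the output
        level = 1 if VDD in nets else 0 if GND in nets else None
        # Rule 1 (simplified): n-type diffusion at "1" or p-type diffusion at "0".
        if (kind == "n" and level == 1) or (kind == "p" and level == 0):
            result.add(name)
    return result


for a, b in product((0, 1), repeat=2):
    print(f"AB={a}{b}:", sorted(sensitive_nodes({"A": a, "B": b})))
```

Under these assumptions the sketch yields the same sensitive nodes per input vector as the NAND2 analysis of Fig. 6, without any electrical simulation.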
Based on the conditions mentioned above and on the behavior of Single Event Effects in NMOS and PMOS transistors, the methodology to evaluate the proposed method is presented. A total of eighteen logic gates from FREEPDK45 were used in the validation. The first step searches for the minimum energy required (LET threshold) to produce a bit flip for any input vector of any logic gate. The NGSPICE electrical simulator was used in this step to evaluate the gates. The search guarantees that a particle incidence on a sensitive node (a node that satisfies the two rules previously described) produces a voltage change on the output. Based on this information, the minimum LET value capable of producing an error in any logic gate of this cell set is found at the output node of the NOR4 gate when ABCD=1111, with LET = 15.46 MeV. This LET value is therefore used as the particle energy to evaluate which nodes of the gates are sensitive.
Then, each logic gate is selected to evaluate its sensitive nodes. The node evaluation is performed for each input vector by analyzing the list of candidate sensitive nodes, and each node is evaluated individually. The NGSPICE electrical simulator was used to perform the analysis. One particle incidence is performed on each sensitive node candidate at a time. For example, a logic gate containing five sensitive node candidates and three inputs is simulated 2^3 × 5 = 40 times to evaluate whether each node is sensitive for each input vector. For each particle insertion, the output node is observed to verify whether the particle incidence has changed the output of the gate. For an error to be observed at the output, the pulse must reach 50% of the supply voltage of the gate. For the 45nm technology node, 50% of the supply voltage corresponds to 0.5V. Fig. 8 and Fig. 9 show two conditions of the validation methodology. The first one, on the left, is the particle incidence on a node that does not meet the specified conditions (not reverse-biased and no low-resistance path to the output). The second, on the right, is the result of the same particle incidence on a reverse-biased node.
The NAND2 gate presented in Fig. 6 is used to exemplify the flowchart of the validation. The gate has a total of six nodes (named n1 to n6). First of all, the nodes n1, n3, and n4 are not sensitive because they are connected to the VDD or GND terminals. The nodes n2, n5, and n6 can present a reverse-biased condition depending on the input vector. When the input vector AB=00 is applied, the only sensitive node is n6, which presents a reverse-biased condition. The same behavior is observed when the input vector AB=01 is applied.
When the input vector is AB=10, there are two nodes in the pull-down network that are reverse-biased and present a low-resistance path to the output. For this input vector, the sensitive nodes at which a particle strike causes a voltage change on the output are n5 and n6.
Finally, when the input vector AB=11 is applied, the only node at which a particle strike produces a voltage change is n2. This was expected since, in the pull-up network, only one node can be sensitive, because n1 and n3 are connected to VDD and do not present the reverse-biased condition. Figure 10 presents the sensitive nodes for each input vector of this logic gate. As expected, the validation process produced the same sensitive nodes as the method for all logic gates analyzed. This means that the defined conditions for a node to be sensitive are correct. Also, the method does not need any electrical simulation to perform its analysis, producing the same result as the electrical simulation in less time.

Results
The results produced by the proposed method are shown considering a particle strike probability p = 1.98 × 10^-6, used as an estimate. This value defines the probability of the incidence of a particle on a sensitive node with sufficient energy to cause a voltage change. For the inputs of the gates, a probability of being "1" equal to 50% was used. The method was then applied to a total of 19 logic gates. The results are reported as the mean susceptibility and the standard deviation (σ) obtained from the values of each input vector of each function. The results obtained from the application of the method to the 45nm library are presented in Table 2. Observing the results obtained by the proposed method, it is possible to notice that the INVERTER logic gate was the only gate that presented a zero standard deviation. It means that this gate was the only one within the cell library that showed no difference in the calculated susceptibility among its input vectors.
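The two reported statistics can be reproduced in outline as follows. The sketch assumes a population standard deviation weighted by the input vector probabilities (the text does not state which estimator was used), takes the NAND2 sensitive-node counts from the case study of Section 4, and assumes one sensitive node per input vector for the inverter, consistent with its zero standard deviation. It is not meant to reproduce the values of Table 2.

```python
from math import sqrt

P_STRIKE = 1.98e-6  # particle strike probability used in this section


def susceptibility(num_sensitive, p=P_STRIKE):
    return 1.0 - (1.0 - p) ** num_sensitive


def mean_and_std(values, weights=None):
    """Weighted mean and population standard deviation (uniform weights by default)."""
    if weights is None:
        weights = [1.0 / len(values)] * len(values)
    mean = sum(w * v for w, v in zip(weights, values))
    variance = sum(w * (v - mean) ** 2 for w, v in zip(weights, values))
    return mean, sqrt(variance)


# Number of sensitive nodes per input vector.
INVERTER = [1, 1]      # ASSUMPTION: one sensitive node for each input value
NAND2 = [1, 1, 2, 1]   # vectors 00, 01, 10, 11 from the case study

for name, counts in (("INVERTER", INVERTER), ("NAND2", NAND2)):
    mean, std = mean_and_std([susceptibility(k) for k in counts])
    print(f"{name:<9} mean = {mean:.3e}  sigma = {std:.3e}")
```

A gate whose input vectors all expose the same number of sensitive nodes has σ = 0, whereas a single vector with an extra sensitive node is enough to produce a nonzero σ.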
Table 3 shows the susceptibility values obtained by applying the proposed method on the inverter logic gate.As can be observed, there is no difference in the obtained values between both input vectors of this gate, resulting in a zero standard deviation.
Another important point in the results is the behavior observed among the logic gates with complementary planes. For example, the NAND2 and NOR2 gates obtained the same mean susceptibility. Finally, it is also important to note that the average susceptibility values tend to increase with the number of transistors in the logic gates. Another critical detail is the standard deviation of these gates. Logic gates with high standard deviation values are more sensitive to different probabilities of the input vectors. A high standard deviation means that the susceptibility of the gate can decrease or increase considerably when different input vector probabilities are applied.
To observe the difference between the input vectors that results in higher standard deviation values, take as an example an AOI21 logic gate. The susceptibility calculated for each input vector is shown in Fig. 11. Note that the most susceptible conditions of this gate are observed for input vectors 001, 011, and 101. Considering this information, three scenarios with different input probabilities were evaluated for this logic gate. Fig. 12 shows the probability of each input vector for the three situations:
a) The probability of being logical one for each input is B2=B1=A= 25%.
b) The probability of being logical one for each input is B2=B1= 50% and A= 75%.
c) The probability of being logical one for each input is B2=B1=A= 75%.
In the baseline evaluation, with a 50% probability of each input being logical one, the mean susceptibility of this logic gate is 3.70. In the first situation, presented in Fig. 12a, the calculated mean susceptibility is 3.18. In this scenario, the critical vectors are less probable, which results in a lower susceptibility for the gate.
The second situation, presented in Fig. 12b, shows the sensitivity of the gate to pin assignment, with B2=B1=50% and A=75%. In this situation, the gate presents a mean susceptibility equal to 4.32. This small difference in the input probabilities causes an increase of almost 36% in the mean susceptibility of the gate relative to scenario a.
Finally, in the last scenario, presented in Fig. 12c, each input has a 75% probability of being logical one. This scenario results in a probability of occurrence of input vector 111 equal to 42%, and for this input vector the gate behaves well in terms of susceptibility. The resulting mean susceptibility is 3.36, a difference of less than 6% compared to scenario "a". Together, the three scenarios show that the mean susceptibility of this gate can be highly dependent on the pin assignment.
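The input vector probabilities of the three scenarios follow directly from the per-input probabilities, assuming the inputs are independent. The sketch below computes them and provides a helper for the probability-weighted mean; the ordering of the bits in each vector string (B2, B1, A) is an assumption, and the per-vector susceptibility values of Fig. 11 are not reproduced here, so no mean is computed for the AOI21 gate.

```python
from itertools import product


def vector_probabilities(p_one):
    """Probability of each input vector, given the probability of '1' for each input.

    `p_one` maps input names to probabilities; inputs are assumed independent and the
    bit order of each vector string follows the order of the dictionary keys.
    """
    names = list(p_one)
    probs = {}
    for bits in product((0, 1), repeat=len(names)):
        pr = 1.0
        for name, bit in zip(names, bits):
            pr *= p_one[name] if bit else 1.0 - p_one[name]
        probs["".join(str(b) for b in bits)] = pr
    return probs


def weighted_mean(values, probs):
    """Mean susceptibility: per-vector values weighted by the vector probabilities."""
    return sum(probs[v] * values[v] for v in probs)


scenario_a = vector_probabilities({"B2": 0.25, "B1": 0.25, "A": 0.25})
scenario_b = vector_probabilities({"B2": 0.50, "B1": 0.50, "A": 0.75})
scenario_c = vector_probabilities({"B2": 0.75, "B1": 0.75, "A": 0.75})

print(f"P(111) in scenario c: {scenario_c['111']:.4f}")
```

In scenario c the vector 111 has probability 0.75^3, which is approximately 42% and agrees with the value quoted above; passing the per-vector susceptibilities of Fig. 11 to `weighted_mean` would give the corresponding mean for each scenario.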

Conclusions
This work proposes a method to predict Single Event Transient susceptibility for logic gates. The results show that the susceptibility of a gate can be highly dependent on its implementation. Moreover, the proposed method can be used to generate probabilistic matrices for several logic gates. These matrices can in turn be used by probabilistic methods, such as the PTM Method or SPR-MP, to estimate the reliability of a circuit. In the proposed method, it is not necessary to consider the possible masking conditions of SET, since they are taken into account by the reliability estimation techniques for circuits.
The proposed method can calculate the susceptibility of any single-stage logic function implementation, requiring only the stick diagram, the input probability of being "1", and the value of the particle strike probability. The susceptibility value can be an important measure for choosing the best candidate implementation of a logic function. The results for a set of logic gates have shown the importance of considering the stick implementation in order to evaluate the susceptibility of logic gates. [29]

Fig. 10. Sensitive areas identified for the NAND2 gate for each input vector

Table 1. NAND2 analysis provided by the stick diagram model

Fig. 6. Stick representation of a NAND2 logic gate

Table 2. Average susceptibility (in 10^-6) and standard deviation (σ) calculated by the method for the 45nm library cells

Table 3. Susceptibility calculated for the inverter logic gate when the proposed method was applied

The NAND and NOR gates both have a network with n transistors in series, which is the sensitive network for most of the gates' input vectors. Likewise, AOI/OAI gates also exhibit this behavior. Table 4 presents the susceptibility calculated for NAND2 and NOR2. Note that the same mean is obtained for both cells only because the input vectors have the same occurrence probability.