## Microelectronics Reliability 52 (2012) 1848-1852


# Microelectronics Reliability

journal homepage: www.elsevier.com/locate/microrel



# Failure and reliability analysis of STT-MRAM

W.S. Zhao<sup>a,b,\*</sup>, Y. Zhang<sup>a,b</sup>, T. Devolder<sup>a,b</sup>, J.O. Klein<sup>a,b</sup>, D. Ravelosona<sup>a,b</sup>, C. Chappert<sup>a,b</sup>, P. Mazoyer<sup>c</sup>

<sup>a</sup> IEF, Univ. Paris-Sud 11, Orsay 91405, France

<sup>b</sup> UMR8622, CNRS, Orsay 91405, France

<sup>c</sup> STMicroelectronics, 850 Rue Jean Monnet Crolles, Grenoble 38026, France

#### ARTICLE INFO

Article history: Received 3 June 2012 Accepted 15 June 2012 Available online 6 July 2012

#### ABSTRACT

Spin Transfer Torque Magnetic RAM (STT-MRAM) promises low power, strong scaling prospects (e.g. 22 nm) and easy integration with CMOS processes. It has become a strong non-volatile memory candidate for both embedded and standalone applications. However, STT-MRAM suffers from significant failure and reliability issues compared with conventional solutions based on magnetic field switching. For example, a read current can erroneously overwrite the stored data, the variability of the ultra-thin oxide barrier causes high resistance variation, and the current injected into the nanopillar shortens its lifetime. This paper first classifies the possible failures of STT-MRAM into "soft errors" and "hard errors" and analyzes their impact on memory reliability. Based on this analysis, we identify efficient design solutions that address each of the two types of errors and improve the reliability of STT-MRAM.

© 2012 Elsevier Ltd. All rights reserved.

## 1. Introduction

Spin Transfer Torque Magnetic RAM (STT-MRAM) is regarded as a promising non-volatile memory candidate featuring fast speed, infinite endurance and great scalability (e.g. 22 nm) [1,2]. Only a low bidirectional current (<100 µA at 65 nm) passing through the MRAM storage element, the Magnetic Tunnel Junction (MTJ), is needed for the switching operation (see Fig. 1), which greatly simplifies integration with CMOS circuits. These features have attracted much R&D attention: a number of pre-industrial prototypes have been demonstrated since 2005 [3,4], and commercialization is expected in the next few years. Its non-volatility, infinite endurance and logic compatibility also make STT-MRAM based logic circuits possible [5–7], such as the Magnetic Look-Up Table (MLUT), Magnetic Flip-Flop (MFF), Magnetic Full-Adder (MFA) and magnetic shift register. These circuits show great potential for low power dissipation, small die area and fast speed, and are considered able to replace other types of current logic circuits. However, unlike memory chips, which usually include error correction code (ECC) circuits, magnetic logic circuits face a reliability challenge. The STT switching mechanism normally causes a much higher failure rate than the conventional approach based on magnetic field switching, which leads to significant reliability degradation such as erroneous writing by the read current [8,9] and high resistance variation due to the variability of the thin oxide barrier. These issues limit the interest of STT-MRAM for practical applications requiring a good trade-off among speed, density, power and error rate. They have become major obstacles for STT-MRAM, yet they have not been analyzed systematically in the literature.

E-mail address: weisheng.zhao@u-psud.fr (W.S. Zhao).

This paper first presents a global failure analysis of STT-MRAM based on its physical nature and classifies the errors into "soft errors" (i.e. wrong signal) and "hard errors" (i.e. device damage). The former are mostly related to the parameters of the free layer (see Fig. 1), such as the thermal stability factor $\Delta$ and the current density $J_c$ [10]; these errors can be corrected by a new signal. The latter are mainly caused by the parameters of the oxide barrier, such as its thickness $t_{ox}$ and the TMR ratio [11] (see Fig. 1); these errors cannot be corrected, but methods can be proposed to tolerate them. This work is important for future R&D on STT-MRAM, as it helps memory and system designers find efficient solutions for each type of failure.

In Sections 2–5, we categorize the failures of STT-MRAM according to their nature and analyze their impact on memory reliability. Efficient design solutions addressing each of the two types of errors are presented in Section 6.

## 2. Soft errors due to stochastic switching

Although STT switching has proven sub-nanosecond potential [12], the operation is intrinsically stochastic, and some data may fail to be stored correctly in the MTJs [13]. The reversal duration of the STT writing mechanism can vary significantly from one event to the next, with a standard deviation almost as large as the average switching duration and sigmoidal distributions with exponential tails [14], as exemplified in Fig. 2 for an MgO barrier based MTJ [15]. This results from unavoidable thermal fluctuations of the magnetization [14], which randomly act to trigger or slow down magnetization reversal.

<sup>\*</sup> Corresponding author at: IEF, Univ. Paris-Sud 11, Orsay 91405, France. Tel.: +33 16915 6292; fax: +33 16915 4000.

0026-2714/\$ - see front matter © 2012 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.microrel.2012.06.035

**Fig. 1.** A Magnetic Tunnel Junction (MTJ) is mainly composed of three thin films: two ferromagnetic layers and one oxide barrier (e.g. CoFeB (1.2 nm)/MgO (0.85 nm)/CoFeB (2 nm)). It presents two resistance values ($R_P$ or $R_{AP}$) depending on the relative magnetization of the two ferromagnetic layers (Parallel or Anti-Parallel). The resistance difference is characterized by the Tunnel Magnetoresistance Ratio, TMR = ($R_{AP} - R_P$)/$R_P$ [11]. Through the spin transfer torque mechanism, a low bidirectional current $I_{WR}$ higher than the threshold current $I_{C0}$ can switch the MTJ between the two states [1].

According to the experimental measurements shown in Fig. 2 and the theoretical model (Eqs. (1) and (2)) [9,15], increasing the write current $I_{WR}$ (e.g. $I_{WR} = 2 \times I_{C0}$, where $I_{C0}$ is the threshold switching current) or adding extensive margins to the driver pulse duration $t_{pulse}$ (e.g. 20 ns) are the most efficient ways to avoid write failures; however, they may lead to significant power, speed and area overhead. For instance, a cell area of 56 F<sup>2</sup> is required in [4], which is not suitable for high-density storage. Furthermore, the strengthened write pulses could cause breakdown or damage of the oxide barrier, which is detailed in Section 4.

$$\Pr(t_{pulse}) = 1 - \exp\left(-\frac{t_{pulse}}{Duration}\right) \tag{1}$$

$$\frac{1}{Duration} = \left[\frac{2}{C + \ln\left(\frac{\pi^2 \Delta}{4}\right)}\right] \frac{\mu_B P}{e m (1+P^2)} (I_{WR} - I_{C0}) \tag{2}$$

where *Duration* is the mean duration of STT switching, $t_{pulse}$ the driver pulse duration, $\Pr(t_{pulse})$ the switching probability for a pulse of duration $t_{pulse}$, $C \approx 0.577$ Euler's constant, $\mu_B$ the Bohr magneton, *P* the tunneling spin polarization of the ferromagnetic layers, *e* the magnitude of the electron charge, *m* the magnetic moment of the free layer, $\Delta = E/k_BT$ the thermal stability factor, *E* the energy barrier, $k_B$ the Boltzmann constant and *T* the temperature.
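As a rough numerical illustration of Eqs. (1) and (2), the following Python sketch evaluates the switching probability for a given pulse and inverts Eq. (1) to estimate the pulse margin needed for a target write error rate. The function names and every value in `EXAMPLE` (currents, $\Delta$, *P* and the free layer moment *m*) are hypothetical placeholders chosen for illustration; they are not taken from the measurements in Fig. 2.

```python
import math

MU_B = 9.2740100783e-24     # Bohr magneton, A*m^2 (J/T)
E_CHARGE = 1.602176634e-19  # magnitude of the electron charge, C
EULER_C = 0.5772156649      # Euler's constant C in Eq. (2)


def mean_switching_rate(i_wr, i_c0, delta, p, m):
    """Return 1/Duration from Eq. (2) in 1/s (only meaningful for i_wr > i_c0)."""
    if i_wr <= i_c0:
        raise ValueError("Eq. (2) applies only above the threshold current")
    prefactor = 2.0 / (EULER_C + math.log(math.pi ** 2 * delta / 4.0))
    return prefactor * MU_B * p / (E_CHARGE * m * (1.0 + p ** 2)) * (i_wr - i_c0)


def switching_probability(t_pulse, i_wr, i_c0, delta, p, m):
    """Return Pr(t_pulse) from Eq. (1)."""
    return 1.0 - math.exp(-t_pulse * mean_switching_rate(i_wr, i_c0, delta, p, m))


def pulse_for_target_error(write_error, i_wr, i_c0, delta, p, m):
    """Invert Eq. (1): pulse duration for which 1 - Pr(t_pulse) = write_error."""
    return -math.log(write_error) / mean_switching_rate(i_wr, i_c0, delta, p, m)


# Hypothetical example values (assumptions, not data from the paper):
# I_WR = 2 x I_C0, Delta = 60, P = 0.5, m = M_s x Vol ~ 6.6e-18 A*m^2.
EXAMPLE = dict(i_wr=100e-6, i_c0=50e-6, delta=60.0, p=0.5, m=6.6e-18)

if __name__ == "__main__":
    for t_ns in (5, 10, 20):
        pr = switching_probability(t_ns * 1e-9, **EXAMPLE)
        print(f"t_pulse = {t_ns:2d} ns -> Pr = {pr:.4f}")
    t_req = pulse_for_target_error(1e-6, **EXAMPLE)
    print(f"pulse for 1e-6 write error: {t_req * 1e9:.1f} ns")
```

Because the error probability in Eq. (1) decays only exponentially with $t_{pulse}$, each additional decade of write reliability costs a fixed extra pulse time of about 2.3 × *Duration*, which is why a fixed pulse needs such wide margins.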

## 3. Soft errors due to limited thermal stability

Read operations of STT-MRAM also cause soft errors, as the read current can erroneously change the stored data. The thermal stability factor $\Delta$ is often used to quantify the reliable retention of magnetic data storage [17] and should be as large as possible. Using the Néel-Brown model [9] or Eq. (3) [17], we can investigate the impact of the read operation (current amplitude $I_R$, duration $\tau$ and storage density) on the required $\Delta$ while keeping an acceptable chip failure rate (see Fig. 3).

**Fig. 2.** Experimental measurement of STT stochastic switching behavior [15]. Higher $I_{WR}$ values yield faster switching and higher switching probability.

In Fig. 3, the energy barrier varies from 40 to 75 $k_BT$ (i.e. $\Delta$ from 40 to 75), and the data retention for the lowest energy barrier is 10 years at zero current. We compare different numbers of bits per word (8 bits/word and 32 bits/word), different ratios of read duration to retention time (10% and 1%) and different ratios of read current to critical current (1/5 and 1/15). The results show that a higher $\Delta$ is needed for a longer read duration (compare the blue solid and green point lines) and for a larger current amplitude (compare the green point and red triangle lines). Furthermore, if more bits per word are read in parallel, a higher $\Delta$ is necessary to avoid failures (compare the red triangle and cyan square lines).

$$F_{chip} = 1 - \exp\left[-N\frac{\tau}{\tau_0}\exp\left(-\Delta\left(1 - \frac{I_R}{I_{C0}}\right)\right)\right] \tag{3}$$

where *N* is the number of bits per word in the memory array, $F_{chip}$ is the erroneous switching rate due to the cell read current $I_R$, $\tau_0 = 1$ ns is the attempt period and $\tau$ is the accumulated read duration.
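To make the trade-off in Eq. (3) concrete, the sketch below evaluates $F_{chip}$ and inverts the equation to obtain the $\Delta$ required for a target failure rate. The function names, the target rate of $10^{-4}$ and the reading of "10% read duration" as 10% of a 10-year retention period are assumptions made for illustration only.

```python
import math

TAU_0 = 1e-9                        # attempt period from the text, 1 ns
TEN_YEARS = 10 * 365.25 * 24 * 3600  # retention period in seconds


def chip_failure_rate(delta, n_bits, tau, i_ratio, tau_0=TAU_0):
    """Return F_chip from Eq. (3); i_ratio is I_R / I_C0."""
    exponent = -n_bits * (tau / tau_0) * math.exp(-delta * (1.0 - i_ratio))
    return -math.expm1(exponent)  # equals 1 - exp(exponent), accurate for tiny rates


def required_delta(f_target, n_bits, tau, i_ratio, tau_0=TAU_0):
    """Invert Eq. (3): the Delta for which F_chip equals f_target."""
    attempts = n_bits * tau / tau_0
    return math.log(attempts / (-math.log1p(-f_target))) / (1.0 - i_ratio)


if __name__ == "__main__":
    f_target = 1e-4  # hypothetical acceptable failure rate (assumption)
    cases = [(32, 0.10, 1 / 5), (32, 0.01, 1 / 5), (32, 0.01, 1 / 15), (8, 0.01, 1 / 15)]
    for n_bits, duty, i_ratio in cases:
        delta = required_delta(f_target, n_bits, duty * TEN_YEARS, i_ratio)
        print(f"N={n_bits:2d}, read share={duty:.0%}, I_R/I_C0={i_ratio:.3f} "
              f"-> required Delta = {delta:.1f}")
```

The inversion shows the scaling directly: the required $\Delta$ grows only logarithmically with *N* and $\tau$, but it is divided by $(1 - I_R/I_{C0})$, so the read current ratio has the strongest leverage.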

A design based on a lower $I_R$ and a shorter $\tau$ allows reliable read operation with a relatively low $\Delta$. However, scaling down the MTJ size continues to reduce the energy barrier $E = \mu_0 M_s \times Vol \times H_k/2$, where $H_k$ is the anisotropy field, $\mu_0$ the permeability of free space, $M_s$ the saturation magnetization and Vol the volume of the free layer. Obtaining a high $\Delta$ therefore becomes one of the key challenges for small-node STT-MRAM (e.g. 22 nm). Data storage using perpendicular magnetic anisotropy (PMA) instead of the conventional in-plane shape anisotropy has been demonstrated to be one of the most promising solutions, as it can provide a higher anisotropy field $H_k$ than in-plane anisotropy [10,16]. This helps to keep a large energy barrier in high-density devices. Furthermore, the lower critical current and faster speed observed for PMA also benefit the integration of STT-MRAM into logic circuits [18].
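The scaling argument can be sketched numerically from $E = \mu_0 M_s \times Vol \times H_k/2$ and $\Delta = E/k_BT$. The Python example below assumes a circular free layer and uses hypothetical material values ($M_s$, $H_k$, thickness, temperature) that are not taken from the paper; only the trend with diameter and $H_k$ is meaningful.

```python
import math

MU_0 = 4.0e-7 * math.pi  # permeability of free space, T*m/A
K_B = 1.380649e-23       # Boltzmann constant, J/K


def thermal_stability(m_s, h_k, diameter, thickness, temperature=300.0):
    """Return Delta = E / (k_B T) with E = mu_0 * M_s * Vol * H_k / 2.

    Assumes a circular free layer; m_s and h_k are in A/m, lengths in m.
    """
    volume = math.pi * (diameter / 2.0) ** 2 * thickness
    energy = MU_0 * m_s * volume * h_k / 2.0
    return energy / (K_B * temperature)


if __name__ == "__main__":
    # Hypothetical values (assumptions): M_s = 1e6 A/m, H_k = 1e5 A/m, 2 nm thick.
    for diameter_nm in (65, 45, 22):
        delta = thermal_stability(1e6, 1e5, diameter_nm * 1e-9, 2e-9)
        print(f"diameter = {diameter_nm} nm -> Delta = {delta:.1f}")
```

At fixed thickness and $H_k$, $\Delta$ falls with the square of the diameter, so keeping $\Delta$ constant from 65 nm to 22 nm would require roughly a ninefold increase in $H_k$ under these assumptions.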

## 4. Hard errors due to oxide barrier breakdown

**Fig. 3.** A high thermal stability factor is required to reduce the erroneous sensing rate. From the reliability point of view, dynamic sensing (low $I_R$ and short $\tau$) is more suitable for STT-MRAM than static sensing.

One of the essential advantages of STT-MRAM is its high switching speed, which allows it to be used in embedded and logic applications [4–7]. However, as the switching current is inversely proportional to the switching duration, a high current density $J_c$ is normally required to achieve this (see Fig. 4 and Eq. (2)). Moreover, a significant margin must be added to improve the switching probability, as shown in Section 2. According to Eq. (4), there are two main ways to obtain a high $J_c$: increasing the bias voltage *V*, or reducing the resistance-area product (R.A) or $t_{ox}$.

$$V = \mathrm{R.A} \times J_c \tag{4}$$

As shown in Fig. 4, a 0.8 V bias voltage gives about 7 times faster switching than 0.6 V. Likewise, decreasing R.A from 10 to 7.5 $\Omega\,\mu\mathrm{m}^2$ while keeping the bias voltage at 0.6 V significantly improves the speed. However, both solutions may lead to oxide barrier breakdown and shorten the lifetime of the MTJ [19].
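For orientation, Eq. (4) can be evaluated for the three operating points quoted above. This is a simple arithmetic sketch; the function name and the choice of A/cm² as output unit are illustrative.

```python
def current_density(v_bias, ra_ohm_um2):
    """Return J_c in A/cm^2 from Eq. (4), V = R.A x J_c.

    R.A is given in ohm*um^2; 1 ohm*um^2 = 1e-8 ohm*cm^2.
    """
    return v_bias / (ra_ohm_um2 * 1e-8)


if __name__ == "__main__":
    for v_bias, ra in [(0.6, 10.0), (0.8, 10.0), (0.6, 7.5)]:
        j_c = current_density(v_bias, ra)
        print(f"V = {v_bias} V, R.A = {ra} ohm*um^2 -> J_c = {j_c:.2e} A/cm^2")
```

Raising the bias from 0.6 V to 0.8 V at R.A = 10 $\Omega\,\mu\mathrm{m}^2$ and lowering R.A to 7.5 $\Omega\,\mu\mathrm{m}^2$ at 0.6 V both raise $J_c$ by the same factor of 4/3 in this calculation, which illustrates that the two options are interchangeable as far as Eq. (4) is concerned.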

## 5. Hard errors due to barrier thickness variability

To achieve a low R.A value, an ultra-thin oxide barrier is preferred, which leads to significant thickness variation across a 300 mm wafer. As the resistance $R_P$ depends exponentially on $t_{ox}$ (Eq. (5)) [9] and the TMR ratio is reduced under bias voltage (Eq. (6)) [11], the resistance variation ratio (VR) may become larger than the TMR ratio. In this case, VR disturbs the sensing operation, which should be governed by the TMR effect, and hard errors occur. Comparing VR and TMR in Fig. 5, we find that the thickness variation should be lower than 5% to avoid these hard errors. It is important to note that the actual TMR ratio during data sensing decreases with bias voltage, as shown in the inset of Fig. 5 [11]. This suggests that a good sensing strategy is to use a low bias voltage.

$$R_{\rm P} = \frac{t_{\rm ox}}{F \times \overline{\varphi}^{1/2} \times Area} \times \exp(1.025 \times t_{\rm ox} \times \overline{\varphi}^{1/2}) \tag{5}$$

$$TMR = \frac{TMR(0)}{1 + \frac{V^2}{V_h^2}} \tag{6}$$

where $\overline{\varphi} = 0.4$ is the potential barrier height of MgO [11] and *F* is a factor calculated from the R.A value of the MTJ: if R.A is $10\ \Omega\,\mu\mathrm{m}^2$, Eq. (5) gives *F* = 332.2. *TMR*(0) = 120% is the TMR ratio at 0 V bias voltage, and $V_h$ is the bias voltage at which *TMR* = 0.5 × *TMR*(0).
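The sensitivity described by Eqs. (5) and (6) can be explored with a short Python sketch. Several points here are assumptions rather than statements from the paper: $t_{ox}$ is taken in ångström (with the 0.85 nm barrier of Fig. 1 this reproduces R.A ≈ 10 $\Omega\,\mu\mathrm{m}^2$ for *F* = 332.2), $V_h$ = 0.5 V is a hypothetical value, and the variation ratio is defined here simply as the relative change of $R_P$ for a given relative thickness change, which may differ from the VR definition behind Fig. 5.

```python
import math

PHI = 0.4         # potential barrier height used in the text
F_FACTOR = 332.2  # factor F quoted for R.A = 10 ohm*um^2
TMR_0 = 1.2       # TMR(0) = 120 %


def ra_product(t_ox_angstrom, phi=PHI, f=F_FACTOR):
    """Return R_P x Area from Eq. (5), assuming t_ox is given in angstrom."""
    root_phi = math.sqrt(phi)
    return t_ox_angstrom / (f * root_phi) * math.exp(1.025 * t_ox_angstrom * root_phi)


def tmr_under_bias(v, v_h, tmr_0=TMR_0):
    """Return the TMR ratio at bias voltage v from Eq. (6)."""
    return tmr_0 / (1.0 + (v / v_h) ** 2)


def resistance_variation(t_nominal, rel_variation):
    """Relative change of R_P when t_ox changes by rel_variation (assumed VR definition)."""
    return ra_product(t_nominal * (1.0 + rel_variation)) / ra_product(t_nominal) - 1.0


if __name__ == "__main__":
    t_nominal = 8.5  # 0.85 nm MgO barrier from Fig. 1, in angstrom
    print(f"R.A at nominal thickness: {ra_product(t_nominal):.2f} ohm*um^2")
    for rel in (0.01, 0.03, 0.05, 0.10):
        print(f"thickness +{rel:.0%} -> R_P changes by {resistance_variation(t_nominal, rel):+.1%}")
    v_h = 0.5  # hypothetical half-TMR voltage (assumption)
    for v in (0.0, 0.2, 0.5):
        print(f"V = {v} V -> TMR = {tmr_under_bias(v, v_h):.0%}")
```

Comparing the printed resistance changes with the bias-reduced TMR gives a feel for how quickly the sensing margin is consumed as the thickness spread grows.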



**Fig. 4.** To obtain a high $J_c$ for fast and reliable switching, either a high voltage or a low R.A is required, both of which shorten the lifetime of the oxide barrier.



**Fig. 5.** Variability of the oxide barrier thickness leads to hard errors, as the resistance depends exponentially on the thickness. Furthermore, a bias voltage applied for reading can greatly reduce the TMR.

## 6. Potential solutions

To improve the robustness of STT-MRAM against soft and hard errors, several design strategies and considerations [20–24] for reliability enhancement have been presented recently. For instance, to reduce the error rate under a suppressed read current, the '1'/'0' dual-array equalized reference concept can generate a precise reference for stable read operation [21].



**Fig. 6.** Self-enabled "error-free" switching circuit and strategy with adaptive driver pulse duration. (a) Scheme of the proposed switching circuit. (b) Varied switching durations driven by the "self-enable" operations; two examples show the clear energy saving obtained by exploiting the stochastic nature of the STT switching mechanism.



**Fig. 7.** Pre-Charge Sense Amplifier (PCSA) for STT-RAM, which allows lower  $I_R$  and shorter  $\tau$ .



**Fig. 8.** The reading error rates of PCSA increase rapidly as the R.A value is reduced. Here the TMR ratio of STT-MTJ is fixed to 150% and 100 runs have been performed to obtain the error rate. X is the minimum area of PCSA.

### 6.1. High reliability design for "soft error"

The self-enabled switching circuit overcomes the stochastic switching problem and also relaxes the bias voltage stress on the oxide barrier [22] (see Fig. 6). A sense amplifier (S.A) connected to the MTJs detects their states and outputs a logic value. The "self-enable" signal is activated only while the stored data differs from the "Input" data. Because STT switching is stochastic, the MTJ may switch its state within a short write pulse. The fixed long write pulse is thus replaced by a sequence of short operations, each including both switching and sensing (see Fig. 6b). This shortens the write pulse duration and reduces the number of switching operations, so the lifetime of the oxide barrier can be greatly improved.

The pre-charge sensing method greatly reduces the read current value and duration ($\sim$200 ps per operation) compared with conventional static data sensing, which provides high reliability for STT-MRAM with the same $\Delta$ [23]. The large transistor in the 1T + 1MTJ structure can make $I_{read}$ exceed the disturb margin. Using two word selection transistors per MTJ cell (one for the read operation, the other for the switching operation) can solve this sensing problem [23].
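The benefit of the self-enabled write strategy can be illustrated with a small Monte-Carlo sketch. It is a simplified behavioral model, not the circuit of [22]: each short pulse is treated as an independent switching attempt with the probability given by Eq. (1), sensing after each pulse is assumed to be error-free, and the mean switching duration (2 ns), short pulse length (1 ns) and sensing time (0.2 ns) are hypothetical values.

```python
import math
import random


def fixed_pulse_duration(mean_duration, write_error):
    """Single pulse long enough that 1 - Pr(t_pulse) = write_error under Eq. (1)."""
    return -mean_duration * math.log(write_error)


def self_enabled_write(mean_duration, t_short, t_sense, rng, max_pulses=1000):
    """Apply short pulses, sensing after each one, until the MTJ has switched.

    Returns the total time spent on pulses and sensing.
    """
    p_switch = 1.0 - math.exp(-t_short / mean_duration)  # Eq. (1) per short pulse
    elapsed = 0.0
    for _ in range(max_pulses):
        elapsed += t_short + t_sense
        if rng.random() < p_switch:
            return elapsed
    raise RuntimeError("MTJ did not switch within max_pulses")


def mean_self_enabled_time(mean_duration, t_short, t_sense, trials=20000, seed=0):
    """Average write time of the self-enabled scheme over many simulated writes."""
    rng = random.Random(seed)
    total = sum(self_enabled_write(mean_duration, t_short, t_sense, rng)
                for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    mean_duration, t_short, t_sense = 2e-9, 1e-9, 0.2e-9  # hypothetical values
    adaptive = mean_self_enabled_time(mean_duration, t_short, t_sense)
    fixed = fixed_pulse_duration(mean_duration, 1e-6)
    print(f"mean self-enabled write time: {adaptive * 1e9:.2f} ns")
    print(f"fixed pulse for 1e-6 error:   {fixed * 1e9:.2f} ns")
```

Under these assumptions the adaptive scheme stops as soon as the cell has switched, so its average write time stays close to the mean switching duration, whereas a fixed pulse must cover the exponential tail of the distribution.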

### 6.2. High reliability design for "hard error"

As mentioned above, hard errors are mainly due to low R.A values. Triple Modular Redundancy (abbreviated TMR in this subsection, not to be confused with the tunnel magnetoresistance ratio) or a direct increase in the size of the active transistors (MN0-1, MP0-1) in the Pre-Charge Sense Amplifier (PCSA) (see Fig. 7) has been found to overcome them [23]. The TMR technique eliminates an error occurring on one of the triplicated outputs by majority vote. The TMR logic block includes two additional SAs and a voting circuit; both solutions therefore degrade area efficiency. Fig. 8 compares the read error rate for three cases (increased transistor size without the TMR logic block, the TMR technique, and minimum-sized transistors) obtained by Monte-Carlo statistical simulation. Enlarging the transistor size alone is sufficient to improve reliability for conventional applications, while the TMR technique can be used for applications requiring extreme sensing robustness.
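The majority vote used by Triple Modular Redundancy can be sketched in a few lines of Python. The error-rate formula assumes that the three sense amplifiers fail independently with the same probability and that the voting circuit itself is error-free; these are simplifying assumptions, and the function names are illustrative.

```python
def majority_vote(a, b, c):
    """Return the bit reported by at least two of the three sense amplifiers."""
    return (a & b) | (a & c) | (b & c)


def voted_error_rate(p):
    """Probability that the vote is wrong when each amplifier errs
    independently with probability p (two or three wrong outputs)."""
    return 3.0 * p ** 2 * (1.0 - p) + p ** 3


if __name__ == "__main__":
    print(majority_vote(1, 1, 0))  # one erroneous output is masked
    for p in (0.1, 0.01, 0.001):
        print(f"single SA error rate {p} -> voted error rate {voted_error_rate(p):.3e}")
```

For small single-amplifier error rates the voted error rate scales as roughly $3p^2$, which is why the technique pays off mainly when the individual error rate is already low.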

## 7. Conclusion

In this paper, we first categorized the different types of failures of STT-MRAM into free layer dominated "soft errors" (stochastic effects and sensing errors) and oxide barrier dominated "hard errors" (device mismatch). Based on physical models of the switching probability, erroneous switching rate, resistance and TMR, we studied the impact of the thermal stability factor, current density, oxide barrier thickness and TMR ratio on the reliability of data storage. These analyses help to identify efficient solutions that address these issues and enhance the reliability of STT-MRAM.

## Acknowledgements

The authors wish to acknowledge financial support from the French NANOINNOV program through contract SPIN and the European FP7 program through contract MAGWIRE (257707).

## References

- [1] Chappert C, Fert A, Van Dau FN. The emergence of spin electronics in data storage. Nat Mater 2007;6:813–23.
- [2] International Technology Roadmap for Semiconductors (ITRS). Emerging research device chapter. http://www.itrs.net/; 2010.
- [3] Kang SH. Development of embedded STT-MRAM for mobile system-on-chips. IEEE Trans Magn 2011;47:131–6.
- [4] Tsuchida K et al. A 64Mb MRAM with clamped-reference and adequate-reference schemes. In: Proc in IEEE international solid-state circuits conference (ISSCC); 2010. p. 258–9.
- [5] Zhao WS et al. Spin transfer torque (STT)-MRAM based run time reconfiguration FPGA circuit. ACM Trans Embed Comput Syst 2009;9(2).
- [6] Prenat G et al. CMOS/magnetic hybrid architectures. In: Proc in IEEE international conference on electronics, circuits and systems (ICECS); 2007. p. 190–3.
- [7] Gang Y et al. A high reliability, low power magnetic full adder. IEEE Trans Magn 2011;47:4611–6.
- [8] Takemura R et al. High-scalable disruptive reading scheme for Gb-scale SPRAM and beyond. In: Proc in IEEE international memory workshop (IMW); 2010. p. 1–2.
- [9] Faber L et al. Dynamic compact model of spin-transfer torque based magnetic tunnel junction (MTJ). In: Proc in IEEE design & technology of integrated systems (DTIS); 2009. p. 130–5.
- [10] Ikeda S et al. A perpendicular-anisotropy CoFeB-MgO magnetic tunnel junction. Nat Mater 2010;9:721–4.
- [11] Yuasa S et al. Giant room-temperature magnetoresistance in single-crystal Fe/MgO/Fe magnetic tunnel junctions. Nat Mater 2004;3:868–71.
- [12] Devolder T et al. Spin-torque switching window, thermal stability, and material parameters of MgO tunnel junctions. Appl Phys Lett 2011;98:162502.
- [13] Devolder T et al. Single-shot time-resolved measurements of nanosecond-scale spin-transfer induced switching: stochastic versus deterministic aspects. Phys Rev Lett 2008;100:057206.
- [14] Devolder T et al. Subnanosecond spin-transfer switching: comparing the benefits of free-layer or pinned-layer biasing. Phys Rev B 2007;75:224430.
- [15] Marins de Castro M et al. Precessional spin-transfer switching in a magnetic tunnel junction with a synthetic anti-ferromagnetic perpendicular polarizer. J Appl Phys 2012;111:07C912.
- [16] Worledge DC et al. Spin torque switching of perpendicular Ta/CoFeB/MgO-based magnetic tunnel junctions. Appl Phys Lett 2011;98:022501.
- [17] Takemura R et al. A 32-Mb SPRAM with 2T1R memory cell, localized bi-directional write driver and '1'/'0' dual-array equalized reference scheme. IEEE J Solid-State Circuits 2010;45:869–75.
- [18] Zhang Y et al. A compact model of perpendicular magnetic anisotropy magnetic tunnel junction. IEEE Trans Electron Dev 2012;59:819–26.
- [19] Panagopoulos G et al. Modeling of dielectric breakdown-induced time dependent STT-MRAM performance degradation. In: Proc in device research conference (DRC); 2011. p. 125–6.
- [20] Zhao WS et al. High stability and low power sensing amplifier for MTJ/CMOS hybrid logic circuits. IEEE Trans Magn 2009;45:3784–7.
- [21] Kawahara T et al. Spin-transfer torque RAM technology: review and prospect. Microelectron Reliab 2012;52:613.
- [22] Lakys Y et al. Self-enabled 'error-free' switching circuit for spin transfer torque MRAM and logic. IEEE Trans Magn 2012. http://dx.doi.org/10.1109/TMAG.2012.2194790.
- [23] Zhao WS et al. Design considerations and strategies for high-reliable STT-MRAM. Microelectron Reliab 2011;51:1454–8.
- [24] Li J et al. Design paradigm for robust spin-transfer torque magnetic RAM (STT-MRAM) from circuit/architecture perspective. IEEE Trans VLSI 2010;18:1710.