# Beyond CMOS computing with spin and polarization 

Sasikanth Manipatruni*, Dmitri E. Nikonov and Ian A. Young


#### Abstract

Spintronic and multiferroic systems are leading candidates for achieving attojoule-class logic gates for computing, thereby enabling the continuation of Moore's law for transistor scaling. However, shifting the materials focus of computing towards oxides and topological materials requires a holistic approach addressing energy, stochasticity and complexity.


Computing efficiency has made exponential gains over the past five decades stemming directly from size scaling, technological breakthroughs in the control of transport in semiconductors ${ }^{1}$, lithography ${ }^{2}$ and the success of von Neumann computing architectures ${ }^{3}$. The cornerstones of transistor size scaling are Moore's law ${ }^{4}$ and Dennard's scaling ${ }^{5}$, which have led to a reduction in transistor costs ${ }^{4}$, shrinking of the circuit area ${ }^{5}$, lowering of the supply voltage, and growth of complexity and parallelism in the computer architecture ${ }^{3}$. Scaling of nanoelectronics based on complementary metal-oxide-semiconductor (CMOS) transistors is now approaching a characteristic size of $10 \mathrm{~nm}$ (ref. ${ }^{1}$ ). In the past 15 years, however, scaling has deviated from Dennard's trend, which states that the power density of circuits stays roughly the same as transistors get smaller, concurrent with voltage reductions. Moore's law, the observation that the number of transistors on integrated circuits approximately doubles every two years, has relied increasingly on material advances to enhance the mobility ${ }^{6,7}$, along with metal-gate electrodes and high-$k$ dielectrics to improve the electrostatic control of carriers in 3D transistors ${ }^{8}$.

To make significant improvements in the energy efficiency and speed of integrated circuits in the future, the continuation of Moore's law scaling will require the introduction of non-traditional materials and structures, as well as beyond-CMOS logic devices ${ }^{6,9-13}$ that are based on quantum nanoelectronic or nanomagnetic principles. The introduction of new state variables - physical quantities that store and transmit the logic state - for computing, interconnects and memory, such as electron dipole, spin, orbital state and light intensity/helicity, is one way to continue Moore's law scaling. This is a revolutionary materials approach to continuing Moore's law, where new materials and physical phenomena are utilized for enabling fundamentally better computing devices at the physical layer. In particular, computing with spintronics/multiferroics is emerging as a leading candidate for memory and logic ${ }^{9-13}$.

Central to the need for a device technology beyond CMOS is an effect known as the Boltzmann tyranny ${ }^{14}$ - this is a consequence of the thermal energy distribution of electrons/holes at room temperature, in any device that is switched by modulating charge conductivity by an energy barrier. It dictates that the ratio of on-current and off-current in a device is related to the voltage swing, and thus it prevents the supply voltage of high-performance CMOS devices from going below $\sim 0.5 \mathrm{~V}$ (ref. ${ }^{3}$ ). While new materials, such as III-V semiconductors and two-dimensional materials, have promising characteristics of improved carrier transport and reduced dynamic energy, they are still subject to Boltzmann tyranny and can at best be a continuation of the existing CMOS scaling trend. In contrast, new state variables for computing, interconnects and memory can provide a break from this paradigm, relying on order parameters such as polarization, magnetization and strain, which exhibit collective switching, strong thresholding behaviour and non-volatility. It is also possible to circumvent Boltzmann tyranny with tunnelling field-effect transistors (TFETs) ${ }^{9}$, where the tunnelling transport physics allows for under 60 mV per decade current modulation.
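The 60 mV per decade figure mentioned above is the room-temperature value of $(k_{\mathrm{B}} T / q) \ln 10$, the smallest gate-voltage change that can alter a thermally activated current by one decade. The following is a minimal sketch of that arithmetic, assuming an ideal barrier-modulated switch at 300 K in which the whole voltage swing is spent at the thermal limit; the function names are illustrative and do not come from the article.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
Q_E = 1.602176634e-19   # elementary charge, C


def thermionic_swing_mv_per_decade(temperature_k: float = 300.0) -> float:
    """Ideal subthreshold swing (k_B*T/q)*ln(10), in mV per decade of current."""
    return 1e3 * (K_B * temperature_k / Q_E) * math.log(10)


def max_on_off_ratio(voltage_swing_v: float, temperature_k: float = 300.0) -> float:
    """Upper bound on the on/off current ratio for a given voltage swing,
    assuming (idealization) every millivolt acts at the thermal limit."""
    decades = voltage_swing_v * 1e3 / thermionic_swing_mv_per_decade(temperature_k)
    return 10.0 ** decades


if __name__ == "__main__":
    print(f"swing limit at 300 K: {thermionic_swing_mv_per_decade():.1f} mV/decade")
    for swing_v in (0.5, 0.3, 0.1):
        ratio = max_on_off_ratio(swing_v)
        print(f"{swing_v:.1f} V swing -> on/off ratio <= {ratio:.2e}")
```

Under these assumptions a 0.1 V swing supports fewer than two decades of current modulation, which illustrates why barrier-modulated devices struggle at very low supply voltages.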

In this Perspective, we describe a path for computing with spintronic and multiferroic devices, and discuss the milestones that need to be surpassed for enabling this transition. We first define a beyond-CMOS collective switch in terms of the reversal of a material's order parameter $(\Theta$ to $-\Theta)$ - defining a metric for the energy required for switching $\left(E_{\mathrm{sw}}\right)$, which is related to the stored energy of the order parameter $(E(\Theta))$. Second, we consider the minimal energy and voltages that are required for transmitting a logic variable on an interconnect from the point of view of the thermodynamic limits given by photonic/electronic shot noises ${ }^{15,16}$. This new perspective enables a definition of the key milestones for spintronics/multiferroics computing, which can be viewed as experimental grand challenges. We identify experimental targets for magnetization switching efficiency, detection of the state of the magnet, and interconnects for spintronics. We also propose the holistic paradigm of energy scaling, error rate scaling and complexity scaling for new computational devices, non-traditional/neuromorphic architectures and new computing techniques, such as stochastic ${ }^{17-20}$ and Shannon-inspired computing ${ }^{18}$.

## A collective switch

We restate the concept of a beyond-CMOS switch as a collective switch that reverses a material's order parameter $(\Theta)$. Examples of order parameters from Landau's theory are magnetization ( $\mathbf{M}$ ), antiferromagnetic order ( $L$ ), polarization ( $\mathbf{P}$ ) and strain ( $\boldsymbol{\sigma}$ ). A collective switch is a device that reverses this order parameter in a volume of the material (Fig. 1a) in a manner that allows for nonlinear input-output transfer characteristics. The collective switch exhibits a nonlinear transition when the input exceeds a threshold. The switch must transduce the state variable to carry a logic signal $\eta$ and couple to an interconnect (Fig. 1b), which carries the signal to the next stage of the logic circuit. The switch must also respond to an input logic signal $\eta$ from the previous circuit stage to reversibly change the sign of the order parameter.

The thermal stability of the switch is given by the value of the retention energy barrier $(\Delta E)$ obtained from the dependence of
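[Fig. 1 panels: a, order parameter $\Theta$ and retention barrier $E(\Theta)$; b, interconnect; c, example of a collective switch (magnetoelectric spin-orbit device); d, table of order parameters, carriers and control variables.]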

|  | Ferromagnetism | Ferroelectricity | Ferroelasticity |
| :--- | :--- | :--- | :--- |
| Order parameter $\Theta$ | Magnetization | Polarization | Strain |
| Carriers | Electron spin, magnon | Electron | Phonon |
| Control $(\eta)$ | Spin current, voltage, photon angular momentum | Voltage | Strain, voltage |
| $\lambda=E_{\mathrm{sw}} / \Delta E(\Theta)$ | $>2$ (ME), $>1,000$ (STT) | $>2$ (FE) | $>2$ |

Fig. 1 | Definition of a collective switch. $\mathbf{a}$, Collective state switch using the material's order parameter. The two states are given by values of $\pm \Theta$. $\mathbf{b}$, Interconnect providing an input to and output from the switch carries a signal $\pm \eta$. The state of the device is detected and transduced to the output $\pm \eta_{\text {out }}=R( \pm \Theta)$. $\mathbf{c}$, Example of a collective switch, a magnetoelectric spin-orbit logic device ${ }^{13}$ where the order parameters are ferroelectric/antiferromagnetic (FE/AFM) of the magnetoelectric (ME), and the read-out is via spin-charge conversion. $\mathbf{d}$, Potential order parameters, carriers and control variables are shown. The figure of merit $\lambda=E_{\mathrm{sw}} / \Delta E(\Theta)$ allows identification of potential for an efficient logic device/switch. STT, spin-transfer torque.

Table 1 | Materials targets for computing with spin and polarization for beyond-CMOS devices

| Target | Device/material figure of merit | Challenge target ( $T>420 \mathrm{~K}$ ) | Example of state of the art |
| :---: | :---: | :---: | :---: |
| Magnetic/FE/MF switching ( $10 \times 10 \mathrm{~nm}^{2}$ ) | Switching energy | 1-10 aJ | 20 aJ (all optical); for example, ref. <br> 400 fJ per bit; for example, ref. |
|  | Switching voltage | $100-300 \mathrm{mV}$ | $100 \mathrm{kV} \mathrm{cm}^{-1}$ $\mathrm{LaBiFeO}_{3}$ (ref. ${ }^{43}$ ) <br> $250-400 \mathrm{mV}$ perpendicular spin-transfer torque (STT) ${ }^{44,45}$ |
|  | Switching speed | 10-1,000 ps | 120 ps (nominal) ${ }^{42}$, $<3$ ns (STT) ${ }^{45}$ |
|  | Write error rate | $10^{-1}$ (stochastic) ${ }^{46,47}$ to $10^{-12}$ (von Neumann) | $10^{-10}$ (STT) ${ }^{45}$, $10^{-5}$ (ME) ${ }^{56}$ |
|  | ME/FE | $P_{\mathrm{c}} \sim 0.5-5 \mu \mathrm{C} \mathrm{cm}^{-2}$ | Refs ${ }^{43,47}$ |
|  |  | Converse magnetoelectric coefficient $\Delta H / \Delta V \sim 10 C^{-1}$ | $\mathrm{BiFeO}_{3}$ (refs ${ }^{22-26}$ ), $\mathrm{BiFeO}_{3} / \mathrm{CoFe}_{2} \mathrm{O}_{4}$, Terfenol-D/PZT; for example, ref. |
|  | Spin-orbit coupling (SOC) switching | $\lambda_{\text {REE }}$ (for switching) $>10 \mathrm{~nm}$, $\rho_{\text {soc }}<10 \mu \Omega \mathrm{~cm}$ | Refs ${ }^{49,51,52,66}$ |
|  |  | $\Delta / I_{c}>10$ | For example, ref. ${ }^{53}$ |
| Spin detection ( $10 \times 10 \mathrm{~nm}^{2}$ ) | Spin to charge efficiency | $I_{\mathrm{d}} / I_{\mathrm{s}}>90-100 \%$ | For example, refs ${ }^{54,55}$ |
|  | Read-out voltage | $>100 \mathrm{mV}$ | For example, ref. ${ }^{54}$ |
|  | SOC detection | $\lambda_{\text {IREE }}$ (for read-out) $>10 \mathrm{~nm}$, $\rho_{\text {soc }}>10 \mathrm{~m} \Omega \mathrm{~cm}$ | Refs ${ }^{49,51,52,66}$ |
| Interconnect | Switching voltage, currents | $100 \mathrm{mV}, 1-10 \mu \mathrm{~A}$ |  |
|  | Dimensions - local interconnect | 30-nm width, $100 \mathrm{~nm}$ to $0.1 \mathrm{~mm}$ range | For example, ref. ${ }^{2}$ |
|  | Spin-optical/vice versa conversion | $<10$ aJ per bit, $1 \mathrm{Gbit} \mathrm{s}^{-1}$ | For example, refs ${ }^{41,67}$ |
|  | Dimensions for optical | 200-nm width, $>100$-$\mu \mathrm{m}$ range | Ref. ${ }^{57}$ |
| Nanomagnet/FE/MF | Stability ( $\Delta / k_{\mathrm{B}} T$ ) | 40 (logic)-80 (memory) | For example, ref. ${ }^{44}$ |
|  | Spin injection | $>80 \%$ | For example, Heusler alloys ${ }^{58}$ |

energy $E(\Theta)$ on the order parameter $\Theta$. The value of the energy barrier is related to the device's retention time and determines the non-volatile nature of the switch. The logic state is retained in the order parameter $(\Theta)$ and the output logic signal is generated via an efficient read-out (through transduction of the state to a communication/interconnect state variable) mechanism, where the read signal is $R(\Theta)=-R(-\Theta)$. For example, for magnetoelectric spin-orbit logic (Fig. 1c) based on magnetoelectric switching of a multiferroic ${ }^{13}$, $\Theta$ is the order parameter of the multiferroic (coupled polarization $\mathbf{P}$ and antiferromagnetic order $L$ ) and $\eta$ is the charge voltage on the interconnect. For spin torque logic with spin interconnects ${ }^{21}$, $\Theta$ is magnetization $\mathbf{M}$ and $\eta$ is the spin current $I_{s}$. For a magnon transistor ${ }^{12}$, $\Theta$ is magnetization and $\eta$ is the phase of magnons on the interconnect.
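The link between the energy barrier and the retention time can be illustrated with an Arrhenius-type estimate, $\tau \approx \tau_{0} \exp (\Delta E / k_{\mathrm{B}} T)$. This relation and the attempt time $\tau_{0}=1 \mathrm{~ns}$ are assumptions made here for illustration (the article does not state them), and the function names are hypothetical. The barrier values of 40 and 80 $k_{\mathrm{B}} T$ are the logic and memory stability targets listed in Table 1.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def retention_time_s(barrier_kbt: float, attempt_time_s: float = 1e-9) -> float:
    """Arrhenius-type retention time for a barrier given in units of k_B*T.

    attempt_time_s is an assumed order-of-magnitude attempt period,
    not a value taken from the article.
    """
    return attempt_time_s * math.exp(barrier_kbt)


def retention_time_years(barrier_kbt: float, attempt_time_s: float = 1e-9) -> float:
    """Same estimate, expressed in years."""
    return retention_time_s(barrier_kbt, attempt_time_s) / SECONDS_PER_YEAR


if __name__ == "__main__":
    for label, barrier in (("logic", 40.0), ("memory", 80.0)):
        print(f"{label}: barrier {barrier:.0f} k_B*T -> "
              f"{retention_time_years(barrier):.3g} years")
```

Because the dependence is exponential, the result is far more sensitive to the barrier height than to the assumed attempt time.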

We define a figure of merit for collective switches as:

$$
\lambda=E_{\mathrm{sw}} / \Delta E(\Theta)
$$

where $\Delta E(\Theta)$ is the energy barrier relative to the stable order parameter, and $E_{\mathrm{sw}}$ is the total energy dissipated in switching. Lower values of $\lambda$ enable computing switches to operate at lower energy for a given energy barrier. The energy barrier $\Delta E(\Theta)$ of a collective switch is set by the technology requirements. For example, a logic circuit may need to preserve the state of the switch long enough for computation. The factor $\lambda$ is a suitable figure of merit for the following reasons: (a) for a given type of logic operation, $\Delta E(\Theta)$ is set by the stability of the logic state needed - the derivation of the $\Delta E(\Theta)$ requirement follows the arguments of Landauer, and is discussed later; (b) $E_{\mathrm{sw}} / \Delta E(\Theta)$ provides the efficiency of switching, normalized for state retention; and (c) the ratio relates $\Delta E(\Theta)$, a parameter of the material's stability given by the order (magnetism/polarization/strain), to the switching energy, which includes the losses and dynamics of switching.

Lower bounds to $\lambda$ under technology constraints provide an insight into the choice of beyond-CMOS devices (Table 1). The factor $\lambda$ is $>2$ for capacitive, magnetoelectric and ferroelectric devices, and it can be as high as $10^{4}$ for spin torque devices. Let us contrast the switching of a device with spin torque and ferroelectric/magnetoelectric effects. The inefficiency of switching a magnet with spin currents and magnons (spin waves) can be an intrinsic limitation. When we compare the energy stability, a single Bohr magneton contributes $E_{\mathrm{M} \mu \mathrm{B}}=(1 / 2) \mu_{\mathrm{B}} B_{\mathrm{k}} \sim 0.1-1 \mathrm{meV}$ (at an equivalent magnetic anisotropy of $B_{\mathrm{k}} \sim 0.1-1 \mathrm{~T}$ ). In contrast, the electric charge of a single electron contributes a large electrostatic energy ( $E_{\mathrm{Ee}}=e V_{\mathrm{c}} \sim 100-1,000 \mathrm{meV}$ at $V_{\mathrm{c}} \sim 0.1-1 \mathrm{~V}$ ). This directly contributes to the disparity in the number of carriers needed for switching and in the figure of merit $\lambda$. Due to the small magnetic stability of a single Bohr magneton, the number of electron spins required to form an energy barrier of 1 eV is $\sim 2 \Delta E(\Theta) / \mu_{\mathrm{B}} B_{\mathrm{k}} \sim 1,000-10,000$, requiring injection of $1,000-10,000$ spins for magnetization reversal. In contrast, the number of electrons required to form an electrostatic energy barrier of 1 eV is $\Delta E(\Theta) / e V_{\mathrm{c}} \sim 1-10$, allowing for very efficient polarization/magnetization reversal. In addition, the Joule energy losses due to large spin currents further increase the inefficiency of switching with spin torque.
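The carrier-count comparison above amounts to dividing the retention barrier by the energy that each carrier contributes. The sketch below repeats that division, taking the per-carrier energy ranges quoted in the text (0.1-1 meV per spin, 100-1,000 meV per electron) as inputs rather than deriving them; the function name is illustrative.

```python
def carriers_for_barrier(barrier_ev: float, energy_per_carrier_mev: float) -> float:
    """Number of carriers whose combined energy equals the barrier."""
    return barrier_ev * 1e3 / energy_per_carrier_mev


if __name__ == "__main__":
    barrier_ev = 1.0  # barrier height used in the text
    # Per-carrier energy ranges as quoted in the text, in meV.
    cases = {
        "spin (Bohr magneton in anisotropy field)": (0.1, 1.0),
        "charge (electron at 0.1-1 V)": (100.0, 1000.0),
    }
    for label, (e_low_mev, e_high_mev) in cases.items():
        n_max = carriers_for_barrier(barrier_ev, e_low_mev)
        n_min = carriers_for_barrier(barrier_ev, e_high_mev)
        print(f"{label}: {n_min:,.0f} to {n_max:,.0f} carriers")
```

The three-orders-of-magnitude gap between the two rows is the same gap that separates the $\lambda$ values of spin torque and magnetoelectric/ferroelectric switching.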

This fundamentally indicates that a switch with polarization (electric dipole) as the primary order parameter provides an intrinsically better path to energy efficient switches. Magnetoelectric mechanisms ${ }^{22-24}$, especially in multiferroic materials, allow the use of ferroelectricity as the dominant order parameter providing a potential for extreme energy efficiency ${ }^{25-28}$.

## Interconnects from a thermodynamic perspective

The interconnect carries a signal parameter $\eta$, such as charge, spin number or helicity of photons, which can induce a change of state in the switch. Figure 1b shows the interconnect as a physical connection between two switches. Interconnects should carry the state of the prior logic stage via amplitude/phase/angular momentum of the carriers and trigger a switching event (reversal of $\Theta$ ) at the receiving logic stage switch. The read-out mechanism from the switch provides a sign-dependent read-out of the interconnect state - the read-out mechanism must have odd symmetry in order to carry the information of the state of the switch.


Fig. 2 | A unified computing framework comprising three axes for scaling. Top: energy scaling axis, representing the reduction in switching energy per device. Bottom-left: error rate scaling axis, representing the ability to function/compute with higher error rates. Bottom-right: complexity scaling axis, representing the ability to productively utilize an increased number of devices in scalable architectures.

We next consider interconnects from the perspective of detector thermodynamics and show the potential for extremely low-voltage ( $<100 \mathrm{mV}$ ) nanoelectronic interconnects, working with a potentially low-voltage beyond-CMOS device. The fundamental limit to the operating voltage of an electrical interconnect, assuming a switch is also able to operate at the given low voltage, is given by the Shannon-Nyquist relationship between the noise voltage $v_{\mathrm{n}}$ and the receiver capacitance $C$: $\sqrt{\delta v_{\mathrm{n}}^{2}}=\sqrt{k_{\mathrm{B}} T / C}$, where $k_{\mathrm{B}}$ is the Boltzmann constant and $T$ is temperature. It can be expressed as a variation in the number of electrons $n_{\mathrm{n}}$ as $\sqrt{\delta n_{\mathrm{n}}^{2}}=\sqrt{k_{\mathrm{B}} T C} / e$. At a capacitance of 10 aF, the noise voltage is 20 mV and the charge noise is $1.27 e$. For example, a $100-\mathrm{mV}$ charge interconnect driving a ferroelectric/multiferroic capacitor with a charge density $P_{\mathrm{c}}=10 \mu \mathrm{C} \mathrm{cm}^{-2}$ and a device area of $100 \mathrm{~nm}^{2}$ can operate above the electronic Shannon-Nyquist noise limits. In addition, it is critical to lower the electrical current of the interconnect along with the operating voltage swing to compensate for the rise in electrical resistivity at reduced dimensions (ref. ${ }^{29}$ ), where the dimension of the wire ( $W$ ) is comparable to the electrical mean free path of the carriers.
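The noise figures quoted above can be checked directly from the two expressions for $\sqrt{\delta v_{\mathrm{n}}^{2}}$ and $\sqrt{\delta n_{\mathrm{n}}^{2}}$. The sketch below does so at an assumed temperature of 300 K (the text does not state the temperature) and also compares the charge stored on the example ferroelectric capacitor with the charge noise. Treating that capacitor as an equivalent linear capacitance $Q / V$ is a simplification made here, and the function names are illustrative.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
Q_E = 1.602176634e-19   # elementary charge, C


def noise_voltage_v(capacitance_f: float, temperature_k: float = 300.0) -> float:
    """RMS thermal noise voltage sqrt(k_B*T/C) on a receiver capacitance."""
    return math.sqrt(K_B * temperature_k / capacitance_f)


def noise_electrons(capacitance_f: float, temperature_k: float = 300.0) -> float:
    """RMS charge noise sqrt(k_B*T*C)/e, in units of the electron charge."""
    return math.sqrt(K_B * temperature_k * capacitance_f) / Q_E


if __name__ == "__main__":
    c_rx = 10e-18  # 10 aF receiver capacitance, as in the text
    print(f"noise voltage: {noise_voltage_v(c_rx) * 1e3:.1f} mV")
    print(f"charge noise:  {noise_electrons(c_rx):.2f} e")

    # Example capacitor from the text: 10 uC/cm^2 over 100 nm^2, driven at 100 mV.
    p_c = 10e-6 / 1e-4          # 10 uC/cm^2 converted to C/m^2
    area_m2 = 100e-18           # 100 nm^2 converted to m^2
    charge_c = p_c * area_m2
    c_equiv = charge_c / 0.1    # assumed linear equivalent capacitance Q/V
    print(f"stored charge: {charge_c / Q_E:.0f} e")
    print(f"charge noise at equivalent C: {noise_electrons(c_equiv):.1f} e")
```

With these inputs the stored charge is roughly sixty electrons against a noise of a few electrons, consistent with the statement that the example operates above the noise limit.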

We next consider interconnects from intrinsic insertion losses (loss of signal strength per unit length) and size scalability, and note a fundamental shortcoming of the spin current/diffusion interconnects. For spin interconnects the carriers are spin currents, which are not conserved $\left(\nabla s \neq \mathbf{J}_{s}\right)$ due to spin scattering. Practically, this leads to insertion losses of the spin currents exceeding $4.3 \mathrm{~dB} \mu \mathrm{~m}^{-1}$ at $1-\mu \mathrm{m}$ channel widths, assuming spin diffusion lengths of $1 \mu \mathrm{~m}$. Practical interconnects used in integrated circuits have already been scaled to sub-100 nm, with the densest interconnects $\sim 30 \mathrm{~nm}$ in width to match to the size-scaled transistors ${ }^{1}$. This implies that all new interconnect technologies must be considered at scaled width sizes. It is a commonly held notion that pure spin interconnects could be energy efficient due to dissipationless propagation of the spin currents. However, for practical computational logic circuits, the need for regeneration - spin signal repeaters or regenerators, which comprise a switch - to compensate for spin-scattering insertion losses imposes a high penalty for spin interconnects ${ }^{30,31}$.
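The quoted loss of about $4.3 \mathrm{~dB} \mu \mathrm{m}^{-1}$ is what a simple exponential decay gives for a spin diffusion length of $1 \mu \mathrm{m}$. The sketch below assumes a one-dimensional decay $\exp (-L / \lambda_{\mathrm{s}})$ of the spin current and the $10 \log _{10}$ decibel convention, which is the convention that reproduces the number in the text; the function names are illustrative.

```python
import math


def remaining_spin_fraction(length_um: float, diffusion_length_um: float = 1.0) -> float:
    """Fraction of the injected spin current left after length_um,
    assuming simple exponential decay exp(-L / lambda_s)."""
    return math.exp(-length_um / diffusion_length_um)


def spin_insertion_loss_db(length_um: float, diffusion_length_um: float = 1.0) -> float:
    """Insertion loss in dB (10*log10 convention) for the same decay."""
    return -10.0 * math.log10(remaining_spin_fraction(length_um, diffusion_length_um))


if __name__ == "__main__":
    for length_um in (0.1, 1.0, 10.0, 100.0):
        loss_db = spin_insertion_loss_db(length_um)
        fraction = remaining_spin_fraction(length_um)
        print(f"L = {length_um:6.1f} um: loss = {loss_db:7.1f} dB, "
              f"remaining fraction = {fraction:.3g}")
```

At lengths of tens of micrometres the remaining fraction is negligible under this model, which is why regeneration stages become unavoidable for spin interconnects.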

In contrast to their spin counterparts, electronic interconnects provide long-range signal propagation (exceeding hundreds of

## Box 1 | Milestones and challenges for spintronic logic

We describe the grand scientific and technological challenges for implementing spintronic logic devices. We divide these challenges into three classes: magnet/spin switching, magnet/spin detection and interconnect and complexity challenges. We also provide a list of figures of merit (Table 1) that will accelerate the introduction of the spintronic integrated circuits.

## Problems of magnetic/multiferroic switching.

1. How to switch a magnetic/multiferroic (MF) state in a volume of $1,000 \mathrm{~nm}^{3}$ with a stability of $100 k_{\mathrm{B}} T$ and an energy of 1 aJ $\sim 6.25 \mathrm{eV} \sim 240 k_{\mathrm{B}} T$?
2. What are the timescales involved with magnetoelectric/ferroelectric (FE) ${ }^{59}$ /MF ${ }^{60}$ switching of a magnet/FE/MF at scaled sizes? How to overcome the Larmor precession timescale of a ferromagnet ${ }^{61}$ ?
3. How to switch a scaled magnet/polarization switch with low stochastic errors ${ }^{62}$ ? What are the fundamental mechanisms governing the switching errors and fatigue for scaled FE/ME switching ${ }^{63}$ ?
4. What is the right combination of materials/order parameters for practical magnetoelectric switching (for example, multiferroic FE/antiferromagnet (AFM) plus FM ${ }^{22-26}$, paraelectric/AFM plus FM ${ }^{27}$, piezoelectric plus magnetostriction ${ }^{64}$ )?

## Problems of magnetic state detection.

MgO-based tunnel junctions ${ }^{10}$ have enabled a practical solution to the detection of a magnetic state in solid-state devices. However, low tunnelling magnetoresistance, high impedance, which requires ultrathin MgO to meet practical constraints, and high voltage (due to tunnelling) limit the long-term potential for spintronic integration. Hence, fundamentally new read-out mechanisms not reliant on MgO tunnel barriers, such as giant magnetoresistance, non-tunnelling metallic/semimetallic read-out and spin to charge conversion methods, are a technology priority.

5. How to detect the state of a magnet/ferroelectric with high read-out voltage $>100 \mathrm{mV}$ ? For inverse spin-orbit effects, such as the spin galvanic effect/Edelstein effect ${ }^{65,66}$, how to achieve $\lambda_{\text {IREE }}>10 \mathrm{~nm}$ with high resistivity ${ }^{49,53}$ ?
6. What is the scaling dependence of spin-orbit detection of the state of a magnet? How to detect the state of a perpendicular magnet with spin-orbit effect?

## Problems of interconnects and complexity.

7. How to transfer the state of a magnet/FE over long distances on scaled wire sizes ( $<30$-nm-wide wires with pitch $<60 \mathrm{~nm}$ )? In particular, how to improve the spin diffusion interconnects in non-magnetic conductors and magnon interconnects in magnetic interconnects?
8. How to transduce a spintronic/multiferroic state to a photonic state (and vice versa) to enable very long distance interconnects ( $>100 \mu \mathrm{~m}$ ) ${ }^{67}$ ?
9. The back-end of CMOS comprises multiple layers of metal wires separated by a dielectric. Thus making logic devices between these layers requires starting with an amorphous layer and a template for growth of the functional materials. How to integrate the magnetic/FE/MF materials in the back-end of the CMOS chip ${ }^{50,68}$ ?
10. How to utilize stochastic switches ( $\mathrm{spin} / \mathrm{FE}$ ) operating near practical thermodynamic conditions in a computing architecture ${ }^{17,18,69}$ ?
11. How to utilize the extreme scaling (with size, logic efficiency and three-dimensional integration) feasible with spin/FE devices in a computer architecture in order to achieve 10 billion switches per chip ${ }^{18,19}$ ?
micrometres) ${ }^{32}$ owing to the charge conservation ( $\nabla Q=\mathrm{J}_{\mathrm{c}}$ ). The insertion losses at scaled sizes for electrical interconnects are close to zero, limited by dynamic resistor-capacitor leakage currents. Hence, from an insertion-loss and size-scalability perspective, electrical interconnects continue to be the most suitable for short-distance, highly scaled interconnects. For longer range interconnects, with $>100 \mu \mathrm{~m}$ length and $>200 \mathrm{~nm}$ width, for high bandwidth density ( $>100 \mathrm{Gbit} \mathrm{s}^{-1} \mu \mathrm{~m}^{-1}$ ), nanophotonic interconnects come into play ${ }^{32}$.

## Limit to computing energy per device at practical switching

The intrinsic limit to computing energy per device at practical switching speeds (few GHz) and retention times (few seconds), calculated below, is $\sim 100 k_{\mathrm{B}} T$, which is a factor of $\sim 50$ smaller than for aggressively scaled CMOS. This suggests that it is possible to create computational devices that would enable orders of magnitude improvements in computational scaling. The Landauer limit for energy dissipation is $k_{\mathrm{B}} T \ln (2)$ for irreversible logic operations. This limit holds asymptotically as the delay of the logic operation goes to infinity ${ }^{33}$. For finite switching-time operation and finite switching error rate requirements, the minimal retention barrier $\Delta E(\Theta)$ and minimal switching energy $E_{\mathrm{sw}, \min }$ as a function of the retention error probability $\epsilon$, a finite switching time $T_{\mathrm{s}}$ and the characteristic time of thermal fluctuations $T_{\text {therm }}$ are given as follows:

$$
\Delta E(\Theta)>k_{\mathrm{B}} T\left[\ln \left(\frac{1}{\epsilon}\right)+\ln \left(\frac{T_{\mathrm{s}}}{T_{\text {therm }}}\right)\right]
$$

$$
E_{\mathrm{sw}, \min }>\lambda k_{\mathrm{B}} T\left[\ln \left(\frac{1}{\epsilon}\right)+\ln \left(\frac{T_{\mathrm{s}}}{T_{\text {therm }}}\right)\right]
$$

As noted by Landauer, accounting for switching errors due to finite switching speeds and for retention error due to a finite barrier leads to this correction ${ }^{33,70}$. For $\epsilon=10^{-15}$, $T_{\mathrm{s}} / T_{\text {therm }}=10^{9}$ and $\lambda=2$, $E_{\mathrm{sw}, \min } \sim 110 k_{\mathrm{B}} T$. See Box 1, where we propose a technological milestone for a switch operating at $250 k_{\mathrm{B}} T$ or 1 aJ. Note that computing methods tolerant to stochasticity, due to retention errors or finite switching time, will allow for further reductions in the computing energy per bit, as discussed in the unified computing framework (Fig. 2).
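The two bounds above are straightforward to evaluate numerically. The sketch below reproduces the quoted $\sim 110 k_{\mathrm{B}} T$ for $\epsilon=10^{-15}$, $T_{\mathrm{s}} / T_{\text {therm }}=10^{9}$ and $\lambda=2$, and converts between $k_{\mathrm{B}} T$ and attojoules at an assumed temperature of 300 K. The function names and the second, error-tolerant parameter set are illustrative choices, not values from the article.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def min_barrier_kbt(error_prob: float, ts_over_ttherm: float) -> float:
    """Lower bound on the retention barrier, in units of k_B*T:
    ln(1/epsilon) + ln(T_s / T_therm)."""
    return -math.log(error_prob) + math.log(ts_over_ttherm)


def min_switching_energy_kbt(lam: float, error_prob: float, ts_over_ttherm: float) -> float:
    """Lower bound on the switching energy: lambda times the barrier bound."""
    return lam * min_barrier_kbt(error_prob, ts_over_ttherm)


def kbt_to_attojoule(energy_kbt: float, temperature_k: float = 300.0) -> float:
    """Convert an energy in units of k_B*T to attojoules."""
    return energy_kbt * K_B * temperature_k * 1e18


if __name__ == "__main__":
    # Parameters used in the text.
    e_text = min_switching_energy_kbt(lam=2.0, error_prob=1e-15, ts_over_ttherm=1e9)
    print(f"text parameters: {e_text:.1f} k_B*T = {kbt_to_attojoule(e_text):.2f} aJ")

    # Illustrative error-tolerant case (epsilon chosen here, not in the article).
    e_tol = min_switching_energy_kbt(lam=2.0, error_prob=1e-1, ts_over_ttherm=1e9)
    print(f"epsilon = 0.1:   {e_tol:.1f} k_B*T = {kbt_to_attojoule(e_tol):.2f} aJ")
```

Relaxing the error probability lowers the bound only logarithmically, so under this model most of the remaining gain has to come from reducing $\lambda$.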

## A unified computing framework

We finally describe a unified computing framework to represent computing evolution along three distinct dimensions: energy per switching device, device switching error rate and architectural complexity of the computational unit (excluding the memory) (Fig. 2). These three axes represent the beneficial scaling enabled by novel physics, materials and devices (energy/switch axis), application of information theory/error resilience techniques (device switching error rate), and architectural innovations allowing a larger number of devices to be utilized in a productive manner (complexity/computational unit).

Traditional scaling of transistors, enabled by size scaling, new structures and materials, has propelled energy per device and complexity scaling. On the energy axis, gains in switching efficiency due to a reduction in the device size and applied voltages led to the highly scaled CMOS transistor operating at $\sim 10^{4} k_{\mathrm{B}} T$. The future device options enabled by spintronics/ferroelectrics have the potential to scale the device switching energy to $100 k_{\mathrm{B}} T$, provided the milestones (Box 1) for logic technology can be surpassed.

The ability to productively utilize larger numbers of switches enabled by energy and size scaling is represented on the complexity axis. The number of transistors per CPU has increased, keeping in sync with the increasing transistor density (10-50 million transistors per CPU), but is ultimately limited by the scalability of traditional von Neumann architectures ${ }^{3}$. However, recent architectural innovations in neuromorphic, in-memory and artificial intelligence computing represent a new opportunity for higher complexity architectures that take advantage of the high transistor density allowed by modern CMOS processes.

Traditional von Neumann/Turing architectures ${ }^{3}$, neuromorphic architectures ${ }^{20,34}$, collective processors, such as networks of nano-oscillators ${ }^{35}$, and emerging artificial intelligence architectures ${ }^{35-38}$ are positioned on the axis of complexity scaling. Complexity theory provides great insights into the collective behaviour of macroscopic (mesoscale) objects, including the ability to provide computation via emergent behaviour ${ }^{39,40}$. Historically, the complexity of electronic computer architectures has stagnated near 10-100 million transistors per core due to design trade-offs between computing dynamic power and leakage power, operating voltage versus clock speed, and the optimum instructions per clock - and increasing transistor density has been applied to increase the size of the on-chip memory and additional functionality. Advances in neuromorphic/cognitive computing and emergent behaviour of collective systems can play an important role (Box 1, grand challenge 11).

In sharp contrast to energy per bit and complexity per CPU, the computational switching error rates have been kept extremely low $\left(<10^{-14}\right)$ via classical computer/circuit design techniques. Exceptions utilizing high error rates have been limited to data interconnects operating over long distances or large memory banks. Error rates in communication, computation and memory arise from intrinsic/extrinsic noise sources, static variations and the choice of digitization (quantization) representation. In present digital computation, the physical digital logic layers operate at nearly error-free regimes (logic error rate $<10^{-14}$ ) since modern computing is built under the assumption of nearly error-free dynamic operation. Process variations (lithographic imperfections, dopant fluctuations) are overcome by strong overdesign of the circuits.

In the field of communications, the great success of Shannon's information theory has enabled communications at length scales from $10^{6} \mathrm{~km}$ to 10 m, providing a tool set based on the model of a noisy channel. We posit that Shannon's approach ${ }^{18,69}$ can be extended to computing (logic and memory), starting with a well-described theory for the stochasticity of scaled devices. We propose that new architectures and methods be developed to allow computing fabrics to be erroneous. Advances in the computational theory of approximate and stochastic computing can play an important role (Box 1, grand challenge 10). The recent development of approximate computing processors with reduced and variable precision also lies on this axis, where the systematic quantization/digitization error may be increased due to the nature of the computational work (for example, inference or recognition tasks) ${ }^{37,38}$.

Scaling along all three axes will lead to a unified computing paradigm that needs to switch at $100 k_{\mathrm{B}} T$ per event, tolerate high switching error rates due to intrinsic/extrinsic stochasticity and thermodynamic constraints, and be able to utilize $>10$ billion switches operating in a collective/cooperative way.

In conclusion, a distinct opportunity and direction to continue Moore's law scaling via new materials, devices and state variables exists. Spintronics and multiferroics are the leading candidates owing to the potential for ultralow switching energy (1 aJ per switch) at ultralow switching voltages $(<100 \mathrm{mV})$. However, this requires great advances in experimental and theoretical understanding of the materials, devices and circuits. We provide a list of grand challenge milestones, which systematically address the key performance metrics. We also describe a unified computing framework, which maps scaling along energy, switching error rates and complexity.

Received: 1 September 2017; Accepted: 5 March 2018;
Published online: 6 April 2018

## References

1. Auth, C., A. et al. in 2017 IEEE Int. Electron Devices Meeting 29-1. (IEEE, 2017).
2. Xu, M. \& Arce, G. R. Computational Lithography Vol. 77. (Wiley, New York, NY, 2011).
3. Danowitz, A., Kelley, K., Mao, J., Stevenson, J. P. \& Horowitz, M. Coтmии. ACM 55, 55-63 (2012).
4. Moore, G. E. ISSCC Dig. Tech. Pap. 20-23 (2003).
5. Dennard, R. H. et al. IEEE J. Solid-State Circuits 9, 256-268 (1974).
6. Holt, W. M. in 2016 IEEE International Solid-State Circuits Conf. 8-13 (IEEE, 2016).
7. Ghani, T. et al. in 2003 IEEE Int. Electron Devices Meeting 11-6 (IEEE, 2003).
8. Ferain, I., Colinge, C. A. \& Colinge, J.-P. Nature 479, 310-316 (2011).
9. Nikonov, D. E. \& Young, I. A. IEEE J. Explor. Solid-State Computat. Devices Circuits 1, 3-11 (2015).
10. Chappert, C., Fert, A. \& Nguyen Van Dau, F. Nat. Mater. 6, 813-823 (2007).
11. Allwood, D. A. et al. Science 309, 1688-1692 (2005).
12. Chumak, A. V., Vasyuchka, V. I., Serga, A. A. \& Hillebrands, B. Nat. Phys. 11, 453-461 (2015).
13. Manipatruni, S., Nikonov, D. E. \& Young, I. A. Preprint at https://arxiv.org/abs/1512.05428 (2015).
14. Meindl, J. D., Chen, Q. \& Davis, J. A. Science 293, 2044-2049 (2001).
15. Nyquist, H. Phys. Rev. 32, 110-113 (1928).
16. Saleh, B. E. A. \& Teich, M. C. Fundamentals of Photonics (Wiley, New York, NY, 1991).
17. Camsari, K. Y., Faria, R., Sutton, B. M. \& Datta, S. Preprint at https://arxiv.org/abs/1610.00377 (2016).
18. von Neumann, J. Automata Studies 34, 43-98 (1956).
19. Merolla, P. A. et al. Science 345, 668-673 (2014).
20. Hopfield, J. J. Proc. Natl Acad. Sci. USA 79, 2554-2558 (1982).
21. Behin-Aein, B., Datta, D., Salahuddin, S. \& Datta, S. Nat. Nanotech. 5, 266-270 (2010).
22. Spaldin, N. A. \& Fiebig, M. Science 309, 391-392 (2005).
23. Khomskii, D. Physics 2, 20 (2009).
24. Birol, T. et al. Curr. Opin. Solid State Mater. Sci. 16, 227-242 (2012).
25. Heron, J. T. et al. Nature 516, 370-373 (2014).
26. Chu, Y.-H. et al. Nat. Mater. 7, 478-482 (2008).
27. He, X. et al. Nat. Mater. 9, 579-585 (2010).
28. Maruyama, T. et al. Nat. Nanotech. 4, 158-161 (2009).
29. Mayadas, A. F., Shatzkes, M. \& Janak, J. F. Appl. Phys. Lett. 14, 345-347 (1969).
30. Iraei, R. M., Manipatruni, S., Nikonov, D., Young, I. \& Naeemi, A. IEEE J. Explor. Solid-State Computat. Devices Circuits 3, 47-55 (2017).
31. Pan, C., Chang, S.-C. \& Naeemi, A. in 2016 IEEE Int. Interconnect Technology Conf./Advanced Metallization Conf. (IITC/AMC) 56-58 (IEEE, 2016).
32. Manipatruni, S., Lipson, M. \& Young, I. A. IEEE J. Sel. Topics Quantum Electron. 19, 8200109 (2013).
33. Landauer, R. IBM J. Res. Dev. 5, 183-191 (1961).
34. Mead, C. Proc. IEEE 78, 1629-1636 (1990).
35. Nikonov, D. E. et al. IEEE J. Explor. Solid-State Computat. Devices Circuits 1, 85-93 (2015).
36. Davies, M. et al. IEEE Micro 38, 82-99 (2018).
37. Jouppi, N. P. et al. Preprint at https://arxiv.org/abs/1704.04760 (2017).
38. Köster, U. et al. Preprint at https://arxiv.org/abs/1711.02213 (2017).
39. Strogatz, S. Sync: The Emerging Science of Spontaneous Order (Penguin, London, 2004).
40. Anderson, P. W. Science 177, 393-396 (1972).
41. Stupakiewicz, A., Szerenos, K., Afanasiev, D., Kirilyuk, A. \& Kimel, A. V. Nature 542, 71-74 (2017).
42. Rowlands, G. E. et al. Appl. Phys. Lett. 98, 102509 (2011).
43. Chu, Y. H. et al. Appl. Phys. Lett. 92, 102909 (2008).
44. Nowak, J. J. et al. IEEE Magn. Lett. 2, 3000204 (2011).
45. Jan, G. in 2016 IEEE Symp. on VLSI Technology 1-2 (IEEE, 2016).
46. Shiota, Y. et al. Appl. Phys. Lett. 111, 022408 (2017).
47. Mundy, J. A. et al. Nature 537, 523-527 (2016).
48. Wang, Y., Hu, J., Lin, Y. \& Nan, C.-W. NPG Asia Mater. 2, 61-68 (2010).
49. Shiomi, Y. et al. Phys. Rev. Lett. 113, 196601 (2014).
50. Bakaul, S. R. et al. Nat. Commun. 7, 10547 (2016).
51. Song, Q. et al. Sci. Adv. 3, e1602312 (2017).
52. Cheng, C. et al. Preprint at https://arxiv.org/abs/1510.03451 (2015).
53. Jamali, M. et al. Preprint at https://arxiv.org/abs/1703.03822 (2017).
54. Omori, Y. et al. Appl. Phys. Lett. 104, 242415 (2014).
55. Sagasta, E. et al. Phys. Rev. B 94, 060412 (2016).
56. Noguchi, H. et al. in 2016 IEEE Int. Electron Devices Meeting 27-5 (IEEE, 2016).
57. Chen, L., Preston, K., Manipatruni, S. \& Lipson, M. Opt. Express 17, 15248-15256 (2009).
58. Hamaya, K. et al. Phys. Rev. B 85, 100404 (2012).
59. Liu, S., Grinberg, I. \& Rappe, A. M. Nature 534, 360-363 (2016).
60. Stengel, M. \& Íñiguez, J. Phys. Rev. B 92, 235148 (2015).
61. Yang, Y. Sci. Adv. 3, e1603117 (2017).
62. Butler, W. H. et al. IEEE Trans. Magn. 48, 4684-4700 (2012).
63. Warren, W. L., Tuttle, B. A. \& Dimos, D. Appl. Phys. Lett. 67, 1426-1428 (1995).
64. D'Souza, N., Fashami, M. S., Bandyopadhyay, S. \& Atulasimha, J. Nano Lett. 16, 1069-1075 (2016).
65. Edelstein, V. M. Solid State Commun. 73, 233-235 (1990).
66. Rojas Sánchez, J. C. et al. Nat. Commun. 4, 2944 (2013).
67. Kirilyuk, A., Kimel, A. V. \& Rasing, T. Rev. Mod. Phys. 82, 2731-2784 (2010).
68. Brewer, R. T. et al. J. Appl. Phys. 97, 034103 (2005).
69. Patil, A. D., Manipatruni, S., Nikonov, D., Young, I. A. \& Shanbhag, N. R. Preprint at https://arxiv.org/abs/1702.06119 (2017).
70. Kish, L. B. \& Granqvist, C.-G. PLoS ONE 7, e46800 (2012).

## Acknowledgements

We sincerely acknowledge the discussions with R. Ramamoorthy, N. Shanbhag, D. Schlom, S. Salahuddin, F. Rana, B. Hillebrands, J.-P. Wang and A. Patil.

## Competing interests

The authors declare no competing interests.

## Additional information

Reprints and permissions information is available at www.nature.com/reprints.
Correspondence and requests for materials should be addressed to S.M.
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

