

# Optimizing Delay and Enhancing Energy Efficiency of Carry Bypass-Adder using Quantum Dots Logic

E.Ranjitha<sup>1</sup>, V.Sindhuja<sup>2</sup>, S.Sree vijayalakshmi<sup>3</sup> and S.Vanathi<sup>4</sup>

<sup>1</sup>E.Ranjitha , ECE/ SNS college of engineering/ Anna University, India

<sup>1</sup>ranjithaeswaran5@gmail.com

<sup>2</sup>V.Sindhuja, ECE/ SNS College of engineering /Anna University, India

<sup>2</sup>sindhujav1996@gmail.com

<sup>3</sup>S.Sree vijayalakashmi, ECE/ SNS College of engineering /Anna University, India

<sup>3</sup> sreeviji2204@gmail.com

<sup>4</sup>S.Vanathi / ECE/ SNS College of engineering/ Anna University, India

<sup>4</sup> akshara261996@gmail.com

# **ABSTRACT**

In this paper, we present a carry bypass adder (CBA) structure that offers higher speed and lower energy consumption than the conventional one; in addition, we use quantum-dot logic in the modified structure to reduce the delay. We present an efficient quantum-dot cellular automata (QCA) design for the Ladner-Fischer prefix adder. We then present an efficient QCA design of a hybrid adder that combines the Ladner-Fischer adder with a ripple carry adder, and we show that the hybrid adder has better performance (in terms of latency) in QCA than either a Ladner-Fischer or a ripple carry adder. We also show that the hybrid adder has a smaller area-delay product than existing adder designs in QCA. Furthermore, instead of the Brent-Kung adder, the modified structure uses a Ladner-Fischer adder, which has lower power consumption than the proposed designs and further improves the speed and energy parameters of the adder. In the modified hybrid variable latency CBA, we replace the parallel prefix adder with a 16-bit Ladner-Fischer adder that has three stages: a pre-processing stage that produces the generate and propagate signals, a carry generation network that produces the carry for each bit, and a post-processing stage that produces the final result. The modified structures are assessed by comparing their speed, power, and energy parameters with those of other adders using a 45-nm static CMOS technology over a wide range of supply voltages. The results, obtained using Xilinx simulations, reveal on average 55% and 48% improvements in delay and energy, respectively, compared with those of the proposed CBA. Finally, compared with the other structures, the Ladner-Fischer prefix adder has considerably smaller area and delay, and it shows a reduction in delay together with high speed. We mainly aim to cover the following parameters: area, power consumption, delay, and number of blocks.

**Key words:** Carry bypass adder, Ladner-Fischer adder, energy efficient, high performance

# 1. INTRODUCTION

Adders are a building block of arithmetic and logic units (ALUs); increasing their speed and reducing their power consumption therefore strongly affect the speed and power consumption of processors. There are many works on optimizing the speed and power of these units. Clearly, it is highly desirable to achieve higher speed at low power consumption, which is a challenge for the designers of general-purpose processors. One of the effective techniques for reducing the power consumption of digital circuits is to reduce the supply voltage, owing to the quadratic dependence of the switching energy on the voltage. Moreover, the subthreshold current, which is the main leakage component in OFF devices, has an exponential dependence on the supply voltage level through the drain-induced barrier lowering effect. Depending on the amount of supply voltage reduction, the operation of ON devices may reside in the superthreshold, near-threshold, or subthreshold region. Operating in the superthreshold region provides lower delay and higher switching and leakage powers compared with the near/subthreshold regions. In the subthreshold region, the logic gate delay and leakage power exhibit exponential dependences on the supply and threshold voltages. Moreover, these voltages are (potentially) subject to process and environmental variations in nanoscale technologies. The variations increase uncertainties in the aforementioned performance parameters. In addition, the small subthreshold current causes a large delay for circuits operating in the subthreshold region.

The CBA, which is an efficient adder in terms of power consumption and area usage, was introduced. The critical path delay of the CBA is much smaller than that of the ripple carry adder (RCA), whereas its area and power consumption are similar to those of the RCA. In addition, the power-delay product (PDP) of the CBA is smaller than those of the carry select adder (CSLA) and parallel prefix adder (PPA) structures. Owing to its small number of transistors, the CBA benefits from relatively short wiring lengths as well as a regular and simple layout. The comparatively lower speed of this adder structure, however, limits its use for high-speed applications.
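The quadratic dependence of switching energy on supply voltage mentioned above can be illustrated with a short calculation. This is a minimal sketch assuming the first-order model E = C * Vdd^2 for the energy drawn from the supply per charging transition; the function names and the capacitance and voltage values are hypothetical and are not taken from the paper.

```python
def switching_energy(capacitance_f: float, vdd: float) -> float:
    """First-order switching energy model (assumption): E = C * Vdd^2, in joules."""
    return capacitance_f * vdd ** 2


def energy_ratio(vdd_scaled: float, vdd_nominal: float) -> float:
    """Ratio of switching energy after scaling Vdd; the capacitance cancels out."""
    return (vdd_scaled / vdd_nominal) ** 2


# Hypothetical example: a 1 fF node switched at 1.0 V and at 0.5 V.
e_nominal = switching_energy(1e-15, 1.0)
e_scaled = switching_energy(1e-15, 0.5)
```

Under this model, halving the supply voltage cuts the switching energy to one quarter, which is why voltage scaling is attractive when a design has timing slack.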

# 2. EXISTING WORK

# A. Variable Latency Adders Relying on Adaptive Clock Stretching

The basic idea behind variable latency adders is that the critical paths of the adders are activated rarely. Hence, the supply voltage may be scaled down without decreasing the clock frequency.

It should be mentioned that since the input bits of the PPA block are used in the predictor block, this block becomes part of both SLP1 and SLP2. In the hybrid structure, the prefix network of the Brent-Kung adder is used for constructing the nucleus stage (Fig. 1). One advantage of this adder compared with other prefix adders is that, in this structure, the longest carry is calculated using forward paths sooner than the intermediate carries, which are computed by backward paths. In addition, the fan-out of this adder is less than that of other parallel adders, while the length of its wiring is smaller. Finally, it has a simple and regular layout.

The internal structure of stage p, including the modified PPA and skip logic, is shown in Fig. 1. Note that, for this figure, the size of the PPA is assumed to be eight. As shown in the figure, in the preprocessing level, the propagate signals (Pi) and generate signals (Gi) for the inputs are calculated. In the next level, using the Brent-Kung parallel prefix network, the longest carry (i.e., G8:1) of the prefix network, along with P8:1, which is the product of all the propagate signals of the inputs, is calculated before the other intermediate signals in this network. The signal P8:1 is used in the skip logic to determine whether the carry output of the previous stage (i.e., CO,p-1) should be skipped or not. This signal is also exploited as the predictor signal in the variable latency adder. It should be mentioned that all of these operations are performed in parallel with the other stages. In the case where P8:1 is one, CO,p-1 should skip this stage, predicting that some critical paths are activated. On the other hand, when P8:1 is zero, CO,p is equal to G8:1, and no critical path is activated. After the parallel prefix network, the intermediate carries, which are functions of CO,p-1 and the intermediate signals, are computed (Fig. 1). Finally, in the postprocessing level, the output sums of this stage are calculated. It should be noted that this implementation is based on ideas similar to the concatenation and incrementation concepts used in the CI-CBA discussed in Section IV.

It should also be noted that the end part of the SLP1 path, from CO,p-1 to the final summation results of the PPA block, and the beginning part of the SLP2 paths, from the inputs of this block to CO,p, belong to the PPA block (Fig. 1). In addition, similar to the proposed CI-CBA structure, the first point of SLP1 is the first input bit of the first stage, and the last point of SLP2 is the last bit of the sum output of the incrementation block of stage q. The steps for determining the sizes of the stages in the hybrid variable latency CBA structure are similar to those discussed in Section IV. Since the PPA structure is more efficient when its size is equal to an integer power of two, we can select a larger size for the nucleus stage accordingly. This means that the third step discussed in that section is modified. The larger size (number of bits), compared with that of the nucleus stage in the original CI-CBA structure, leads to a decrease in the number of stages as well as smaller delays for SLP1 and SLP2. Thus, the slack time increases further.

Fig. 1 Structure of the existing hybrid variable latency CBA
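The skip decision described above can be sketched in a few lines. This is an illustrative model only, assuming XOR-type propagate signals and LSB-first bit lists; the helper names `to_bits`, `block_signals`, and `skip_carry_out` are hypothetical and do not come from the paper.

```python
def to_bits(value: int, width: int) -> list:
    """Return the `width` low-order bits of `value`, least significant bit first."""
    return [(value >> i) & 1 for i in range(width)]


def block_signals(a_bits: list, b_bits: list) -> tuple:
    """Compute the group generate G and group propagate P of one stage."""
    group_g, group_p = 0, 1
    for a, b in zip(a_bits, b_bits):
        g, p = a & b, a ^ b              # per-bit generate and propagate
        group_g = g | (p & group_g)      # a carry is generated here or passed along
        group_p = p & group_p            # product of all propagate signals
    return group_g, group_p


def skip_carry_out(a_bits: list, b_bits: list, carry_in: int) -> int:
    """Skip logic: if P = 1 the incoming carry bypasses the stage, else the carry out is G."""
    group_g, group_p = block_signals(a_bits, b_bits)
    return carry_in if group_p else group_g
```

For an 8-bit stage, P is one only when the two operands are bitwise complements, which is 256 of the 65,536 possible operand pairs. Under the (simplifying) assumption of uniformly random operands, this matches the observation that the long path is activated rarely.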

# B. Existing Hybrid Variable Latency CSKA Structure

The basic idea behind using variable stage size (VSS) CSKA structures was based on almost balancing the delays of the paths such that the delay of the critical path is reduced compared with that of the fixed stage size (FSS) structure. This deprives us of the opportunity to use the slack time for supply voltage scaling. To provide the variable latency feature for the VSS CSKA structure, we replace some of the middle stages in the proposed structure with the PPA modified in this paper. It should be noted that since the Conv-CSKA structure has a lower speed than the proposed one, we do not consider the conventional structure in this section. The proposed hybrid variable latency CSKA structure is shown in Fig. 6, where an Mp-bit modified PPA is used for the pth stage (nucleus stage). Since the nucleus stage, which has the largest size (and delay) among the stages, is present in both SLP1 and SLP2, replacing it with the PPA reduces the delay of the longest critical path.

# 3. PROPOSED SYSTEM MODEL

In this section, we make a small change to the proposed hybrid variable latency CBA by replacing the parallel prefix network block with a Ladner-Fischer prefix adder block (Fig. 2). This logic uses majority logic adders. The result varies with the reactance value. In the modified structure, the Ladner-Fischer adder (LFA) block generates the carry. In the proposed block, four bits are given as input and a carry is generated; this carry goes to the next block, which takes another four bits as input and produces another carry. Because the second block must wait for the first block to finish its addition, the delay is long. We therefore adopt the LFA, which is itself a parallel prefix adder and produces the carry for eight bits. In this way, one entire block is eliminated and the delay is reduced. The Mp-bit stage is considered to be the nucleus stage. The LFA generates a carry for every four or eight bits; compared with the PPA, its response was good, the power consumption was nearly 1 mW, and the time was reduced to a few milliseconds.

# A. QCA Design of 8- and 16-bit Ladner-Fischer Adders

Parallel prefix circuits that take $n$ inputs $x_1, x_2, \ldots, x_n$ and produce the outputs $x_1,\; x_2 \circ x_1,\; \ldots,\; x_n \circ x_{n-1} \circ \cdots \circ x_1$, where $\circ$ is an associative binary operation, are used to realize powerful adders [22]-[25]. QCA designs of such adders are not available, to the best of our knowledge. In this section, we investigate the QCA realization of an adder based on the Ladner-Fischer prefix circuit [23] to add two 8-bit numbers $x_7 \ldots x_0$ and $y_7 \ldots y_0$. The Ladner-Fischer adder is chosen in view of the small number of stages required to obtain the carries.

We begin by presenting the prefix graph for a Ladner-Fischer adder. Fig. 9 depicts the prefix graph of a 16-bit Ladner-Fischer adder (assuming a nonzero initial carry, denoted by $c_0$). It is worth noting that the 8-bit prefix graph is a subset of Fig. 9 and corresponds to its right-hand portion. Fig. 9 assumes the availability of the $g_i$'s and $p_i$'s, where $g_i$ and $p_i$ are defined as $x_i y_i$ and $x_i + y_i$, respectively. The small shaded circle represents the associative operation "$\circ$", defined as $(g_i, p_i) \circ (g_j, p_j) = (g_i + p_i g_j,\; p_i p_j)$. Our first objective is to present a QCA design for an 8-bit Ladner-Fischer adder. Towards this end, we first present the equations of the eight carry outputs. With reference to Fig. 9, the carry $c_1$ is calculated as $c_1 = g_0 + p_0 c_0$. This can be written in terms of the "$\circ$" operation. In the carry expressions, the terms in square brackets [...] are calculated in stage one, whereas the terms in curly brackets {...} are calculated in stage two, as shown in Fig. 9. The remaining calculations (for example, the "$\circ$" part in $\circ\,(c4.0)\,(c8.0)$) are done in stage three. The calculation of the carries of an 8-bit Ladner-Fischer adder thus requires three stages (while four stages are required for a 16-bit Ladner-Fischer adder), excluding the stage that computes the $g_i$'s and $p_i$'s.

Fig. 2 Structure of modified hybrid variable latency CBA

Now we come to the QCA realization. The direct calculation of the carry $c_1 = g_0 + p_0 c_0$ requires two majority gates, namely one for the AND operation and another for the OR operation. We now present a proposition showing that $c_1$ requires just one majority gate.
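The single-gate claim for $c_1$ can be checked exhaustively. The sketch below assumes the well-known identity $c_1 = M(x_0, y_0, c_0)$, where $M$ is the three-input majority function; whether this is the exact form of the proposition in the cited QCA design is an assumption, and the function names are hypothetical.

```python
from itertools import product


def majority(a: int, b: int, c: int) -> int:
    """Three-input majority function, the basic QCA logic primitive."""
    return (a & b) | (b & c) | (a & c)


def carry_two_gates(g0: int, p0: int, c0: int) -> int:
    """Direct form c1 = g0 + p0*c0: AND is M(a, b, 0) and OR is M(a, b, 1)."""
    and_term = majority(p0, c0, 0)       # p0 AND c0
    return majority(g0, and_term, 1)     # g0 OR (p0 AND c0)


def carry_one_gate(x0: int, y0: int, c0: int) -> int:
    """Assumed single-gate form: c1 = M(x0, y0, c0)."""
    return majority(x0, y0, c0)


# Compare both forms over all eight input combinations, with g0 = x0*y0 and p0 = x0 + y0.
forms_agree = all(
    carry_two_gates(x0 & y0, x0 | y0, c0) == carry_one_gate(x0, y0, c0)
    for x0, y0, c0 in product((0, 1), repeat=3)
)
```

The two forms agree because $x_0 y_0 + (x_0 + y_0) c_0 = x_0 y_0 + x_0 c_0 + y_0 c_0$, which is exactly the majority of the three bits.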

The Ladner-Fischer adder (LFA) is a parallel prefix adder used to perform addition. It has a tree-like structure and is employed for high-performance addition. It contains black cells and gray cells. Each black cell contains two AND gates and one OR gate. Each gray cell, which computes only the generate output, contains one AND gate and one OR gate. Pi denotes the propagate signal, which requires just one AND gate; Gi denotes the generate signal, which requires one AND gate and one OR gate.

#### Pre-processing stage:

In the pre-processing stage, the generate and propagate signals are computed from each pair of input bits. The propagate signal (Pi) is the XOR of the input bits, and the generate signal (Gi) is the AND of the input bits.

#### Carry generation stage:

In this stage, a carry is generated for every bit; this is termed carry generate (Cg). The carry propagate and carry generate signals are produced for further operations, but the final cell present in each bit's operation gives the carry. The carry of the last bit helps to produce the sum of the next bit. The carry propagate and carry generate signals are denoted Pg and Gg, respectively.

#### Post-processing stage:

This is the final stage of the LFA: the carry of the first bit is XORed with the propagate signal of the next bit, and the result is given as the sum output.
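The three stages above can be put together in a small bit-level model. This is a sketch under stated assumptions, not the authors' implementation: it uses the minimum-depth divide-and-conquer prefix wiring commonly drawn for Ladner-Fischer adders (log2 of the width levels), it folds the input carry into bit 0, and the function name `ladner_fischer_add` is hypothetical. The exact cell placement of the design in this paper is not given, so the wiring here is illustrative.

```python
def ladner_fischer_add(x: int, y: int, width: int = 16, carry_in: int = 0) -> tuple:
    """Add two `width`-bit numbers; returns (sum modulo 2**width, carry out)."""
    # Pre-processing stage: per-bit propagate (XOR) and generate (AND).
    p = [((x >> i) ^ (y >> i)) & 1 for i in range(width)]
    g = [(x >> i) & (y >> i) & 1 for i in range(width)]

    # Group signals; the input carry is folded into the generate of bit 0.
    group_g = list(g)
    group_p = list(p)
    group_g[0] = g[0] | (p[0] & carry_in)

    # Carry generation stage: one level per doubling of the span.
    span = 1
    while span < width:
        for i in range(width):
            if i & span:
                # Last bit of the neighbouring lower block; it is not updated at this level.
                j = (i & ~(span - 1)) - 1
                group_g[i] = group_g[i] | (group_p[i] & group_g[j])
                group_p[i] = group_p[i] & group_p[j]
        span *= 2

    # Post-processing stage: sum bit i is propagate i XOR the carry into bit i.
    carries = [carry_in] + group_g[:-1]
    total = 0
    for i in range(width):
        total |= (p[i] ^ carries[i]) << i
    return total, group_g[width - 1]
```

With this wiring, an 8-bit adder needs three prefix levels (spans 1, 2, 4) and a 16-bit adder needs four, in line with the stage counts stated in the text. For example, `ladner_fischer_add(0xFFFF, 1)` wraps around to a zero sum with a carry out of one.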

# 4. RESULTS AND DISCUSSION



| Design                              | No. of slices | Delay (ns) |
|-------------------------------------|---------------|------------|
| Hybrid CBA, 16-bit Brent-Kung adder | 27            | 17.285     |
| 64-bit Ladner-Fischer adder         | 126           | 62.372     |
| 16-bit Ladner-Fischer adder         | 18            | 12.2       |

Fig. 3 Comparison table
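The relative improvement implied by the two 16-bit rows of the comparison table can be worked out directly. The sketch below only restates the table values; the helper names are hypothetical, and the slice-delay product is used here as an illustrative figure of merit rather than a metric reported by the authors.

```python
# Slice counts and delays copied from the comparison table (Fig. 3).
RESULTS = {
    "hybrid_cba_16bit_brent_kung": {"slices": 27, "delay_ns": 17.285},
    "ladner_fischer_16bit": {"slices": 18, "delay_ns": 12.2},
}


def percent_reduction(baseline: float, improved: float) -> float:
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline


def slice_delay_product(entry: dict) -> float:
    """Illustrative area-delay style metric: slices multiplied by delay in ns."""
    return entry["slices"] * entry["delay_ns"]


base = RESULTS["hybrid_cba_16bit_brent_kung"]
lfa = RESULTS["ladner_fischer_16bit"]
delay_gain = percent_reduction(base["delay_ns"], lfa["delay_ns"])
slice_gain = percent_reduction(base["slices"], lfa["slices"])
```

On these numbers, the 16-bit Ladner-Fischer adder shows roughly 29% lower delay and 33% fewer slices than the 16-bit hybrid Brent-Kung design. The 64-bit row is left out because it is not a like-for-like comparison.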

| Parameter   | Conv-CBA (FSS) | Conv-CBA (VSS) | CI-CBA (FSS) | CI-CBA (VSS) | Hybrid | Modified |
|-------------|----------------|----------------|--------------|--------------|--------|----------|
| Area        | 254.2          | 253.1          | 246.3        | 241.5        | 17.52  | 16.13    |
| Power (mW)  | 0.50           |                | 0.43         |              | 0.48   | 1        |
| Delay (ns)  | 30.5           |                | 29.3         |              | 27.3   | 20.1     |
| Transistors | 1456           | 1464           | 1370         | 1332         | 25.4   | 760      |

Fig. 4 Comparison table for different parameters

Fig. 5 Change of delay, power, energy, and number of transistors for the modified hybrid variable latency CBA



Fig. 6 Delay versus adder size

Fig. 7 Area-delay versus various adders

As shown in the above graphs, Fig. 5 compares the area, power, energy, and number of transistors of the different adders. From this comparison, we can see that the area, power, and number of transistors are much reduced compared with the existing design. The next figure shows the delay of the ripple carry adder, the carry lookahead adder, and the Ladner-Fischer adder; here, too, the delay of the proposed structure is much improved.

Area is the major concern in designing an adder for most systems: as the area increases, the delay increases as well. Hence, by reducing the size, the delay can be reduced.



# 5. CONCLUSION

Thus, as mentioned in the abstract, the time was reduced to as little as 1 ms, with the results simulated in ModelSim, and the power consumption was reduced to 1 mW. We reduced the area and simulated the result in Xilinx, and the number of blocks was also reduced. In our design, we were able to reduce the area, power consumption, and delay while achieving high speed. In the existing work, the speed was slightly lower, and we have overcome the drawbacks of the existing system by using the LFA. We know that reducing the power and increasing the speed are difficult to achieve at the same time; however, we tried to reduce the consequences of this trade-off. The size of the blocks leads to delay, and the delay was reduced by reducing the number of blocks in our system. The proposed system operates with high speed and efficient energy use, and it also reduces the number of stages.

# REFERENCES

- [1] M. Bahadori, M. Kamal, A. Afzali-Kusha, and M. Pedram, "High-speed and energy-efficient carry skip adder operating under a wide range of supply voltage levels," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 24, no. 2, Feb. 2016.
- [2] K. Du, P. Varman, and K. Mohanram, "High performance reliable variable latency carry select addition," in Proc. Design, Autom. Test Eur. Conf. Exhibit. (DATE), Mar. 2012, pp. 1257–1262.
- [3] B. Ramkumar and H. M. Kittur, "Low-power and area-efficient carry select adder," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 20, no. 2, pp. 371–375, Feb. 2012.
- [4] V. Pudi and K. Sridharan, "Efficient design of a hybrid adder in quantum-dot cellular automata," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 19, no. 9, Sep. 2011.
- [5] H. Cho and E. E. Swartzlander, "Adder and multiplier designs in quantum-dot cellular automata," IEEE Trans. Comput., vol. 58, no. 6, pp. 721–727, Jun. 2009.
- [6] S. Srivastava, S. Sarkar, and S. Bhanja, "Estimation of upper bound of power dissipation in QCA circuits," IEEE Trans. Nanotechnol., vol. 8, no. 1, pp. 116–127, Jan. 2009.
- [7] T. Dysart and P. M. Kogge, "Analyzing the inherent reliability of moderately sized magnetic and electrostatic QCA circuits via probabilistic transfer matrices," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 17, no. 4, pp. 507–516, Apr. 2009.
- [8] Y. Chen et al., "Variable-latency adder (VL-adder) designs for low power and NBTI tolerance," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 18, no. 11, pp. 1621–1624, Nov. 2010.
- [9] S. Ghosh, D. Mohapatra, G. Karakonstantis, and K. Roy, "Voltage scalable high-speed robust hybrid arithmetic units using adaptive clocking," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 18, no. 9, pp. 13 –1309, Sep. 2010.
- [10] Y. Liu, Y. Sun, Y. Zhu, and H. Yang, "Design methodology of variable latency adders with multistage function speculation," in Proc. IEEE 11th Int. Symp. Quality Electron. Design (ISQED), Mar. 2010, pp. 824–830.
- [11] Y.-S. Su, D.-C. Wang, S.-C. Chang, and M. Marek-Sadowska, "Performance optimization using variable-latency design style," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 19, no. 10, pp. 1874–1883, Oct. 2011.
- [12] V. G. Oklobdzija, B. R. Zeydel, H. Q. Dao, S. Mathew, and R. Krishnamurthy, "Comparison of high-performance VLSI adders in the energy-delay space," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 13, no. 6, pp. 754–758, Jun. 2005.
- [13] K. Walus and G. A. Jullien, "Design tools for an emerging SoC technology: Quantum-dot cellular automata," Proc. IEEE, vol. 94, no. 6, pp. 1225–1244, Jun. 2006.
- [14] H. Cho and E. E. Swartzlander, "Adder designs and analyses for quantum-dot cellular automata," IEEE Trans. Nanotechnol., vol. 6, no. 3, pp. 374–383, May 2007.
- [15] K. Kim, K. Wu, and R. Karri, "The robust QCA adder designs using composable QCA building blocks," IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 26, no. 1, pp. 176–183, Jan. 2007.