High Performance Wallace Tree Multiplier Using Improved Adder

ISSN: 2395-1680 (ONLINE) ICTACT JOURNAL ON MICROELECTRONICS, APRIL 2017, VOLUME: 03, ISSUE: 01
DOI: 10.21917/ijme.2017.065
HIGH PERFORMANCE WALLACE TREE MULTIPLIER USING IMPROVED ADDER
Meenali Janveja1 and Vandana Niranjan2
1Department of Electronics and Communication Engineering, G. L Bajaj Institute of Technology and Management, India
2Department of Electronics and Communication Engineering, Indira Gandhi Delhi Technical University for Women, India
Abstract
The multiplier is a crucial block in most digital arithmetic
applications. With advances in VLSI, achieving high speed and low
power consumption has become a major concern for designers. Because
the multiplier block consumes a large amount of power and largely
determines the speed of the circuit, optimizing it improves the
performance of the whole circuit. Multiplication is implemented in
hardware using shift-and-add operations, so an efficient adder
circuit leads to an improved multiplier. In this paper, a reduced
complexity Wallace tree multiplier circuit is proposed that uses an
efficient, improved adder. The circuits are designed in 90nm
technology and simulated in Cadence Virtuoso. The proposed Wallace
tree structure reduces power dissipation by approximately 70%,
power delay product by approximately 86% and area by approximately
60%. The proposed multiplier is suitable for applications such as
DSP structures, ALUs and other low power, high speed arithmetic
applications.
Keywords:
Wallace Tree, Full Adder, Pass Transistor Logic, Power Dissipation,
Delay
1. INTRODUCTION
With the increase in integration scale, ever more advanced and
compact signal processing systems need to be realized on a VLSI
chip. These processing applications consume a large amount of power
and require high computational capacity. Along with performance and
area, power dissipation has therefore become a major concern in the
design of integrated circuits. Two main factors have driven the
growth of low power systems. Firstly, increased integration has
raised processing capacity, so that large currents flow and heat up
the chip. Secondly, battery life in portable electronic devices is
limited, so prolonged operation of these devices requires low power
design.
It is well known that multiplication plays a fundamental role in
most signal processing algorithms. System performance is generally
determined by the performance of the multiplier, because the
multiplier is generally the slowest element in the system.
Furthermore, it generally consumes the most area. Hence, optimizing
the speed and area of the multiplier is a major design issue.
Multipliers are built from full adders and can therefore be
optimized by using modified full adders.
In this paper, the authors propose a compact Wallace tree
multiplier structure to meet the present-day needs of low power
and high speed applications. The proposed multiplier uses an
improved adder designed using pass transistor logic in 90nm
technology. The proposed adder offers lower delay and area than
the conventional techniques discussed in the literature.
The paper is organized as follows: Section 2 discusses the
conventional approaches. Section 3 presents the proposed Wallace
tree structure. The results and discussion are compiled in
Section 4. Section 5 concludes the paper.
2. WALLACE TREE MULTIPLIER
The important design considerations for any chip designer are
power consumption, delay and area. The speed of a circuit depends
on the delay of its multiplier, so a lot of research has gone into
speeding up the multiplier in order to reduce the delay of the
overall circuit. The Wallace tree is a high speed, area efficient
multiplier and is therefore of great importance in high speed
applications [1].
It is a simple and efficient hardware method that multiplies
integers using column compression. The Wallace tree is fast
because its total delay is proportional to the logarithm of the
operand word length, rather than growing linearly with it as in an
array multiplier. The operation of a Wallace tree multiplier
involves three steps: formation of the partial products, grouping
of these partial products, and addition using adders.
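The logarithmic growth described above can be illustrated with a short Python sketch. It assumes the textbook Wallace grouping, in which every three rows of partial products are reduced to two rows per stage and leftover rows pass through unchanged. The function name `wallace_stages` is ours and does not come from the paper.

```python
def wallace_stages(rows: int) -> int:
    """Count reduction stages until only two rows of bits remain."""
    stages = 0
    while rows > 2:
        # Each group of three rows becomes two rows (a sum row and a
        # carry row); zero, one or two leftover rows pass through.
        rows = 2 * (rows // 3) + rows % 3
        stages += 1
    return stages


# Word lengths chosen for illustration only.
for n in (4, 8, 16, 32):
    print(f"N = {n:2d}: {wallace_stages(n)} reduction stages")
```

For N = 4 the row count goes 4, 3, 2, which gives two stages. Over the word lengths shown, each doubling of N adds two stages, whereas the number of adder rows in an array multiplier grows in proportion to N.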
A lot of research has been done to improve the performance of the
Wallace tree [2-8]. In [2], the authors propose using parallel
prefix adders instead of conventional half and full adders in the
Wallace multiplier, which reduces delay, but area and power
dissipation are not examined. In [3], Booth encoding with a
compressor approach is used to reduce area and latency.
Further, in [4], XOR-XNOR based 3:2, 4:2 and 5:2 compressors are
used in place of half and full adders in the second stage of the
Wallace algorithm, leading to an increase in speed. Although [3]
and [4] improve the speed of the multiplier, the area is not
reduced considerably, and the use of 4:2 and 5:2 compressors
increases complexity and results in complex routing. Improvement
is also obtained by estimating power with a probabilistic gate
level power estimator in each stage [5], or by rearranging the
partial products so that switching activity is reduced.
This offers a significant power reduction, but area and speed
remain unaltered. Moreover, the improvement depends on the
transition activity of the inputs. In [6], a full adder using a 4:1
multiplexer is used in the reduction phase, leading to power
reduction. In [7], a full adder using a 2:1 multiplexer is used,
which also reduces power. These full adder implementations reduce
power, but their critical path delay is greater than that of [8].
Of all the literature studied, [8] offers the best performance in
terms of area, power and speed. Fig.1 shows the schematic of the
adder used in [8].
Fig.1. Conventional Full Adder [8]
3. PROPOSED WORK
As discussed in the previous section, the Wallace tree multiplier
offers the best speed among the multiplier circuits considered, and
various techniques to improve its structure have been proposed
[2-8]. In this paper, a modified Wallace tree structure is proposed
that offers better performance than the existing approaches. The
proposed structure is designed using the reduced complexity
algorithm [9-10] and a modified adder subcircuit that performs the
intermediate addition of bits. Fig.2 shows the proposed adder,
designed using pass transistor logic based 2:1 multiplexers. Fig.3
shows the 2:1 pass transistor based mux. The use of pass transistor
logic considerably decreases the number of transistors and hence
the area. It also has the advantage of minimal static leakage.
Because there are very few Vdd to ground connections during
switching, the short circuit power is also minimal. The modified
expressions for the sum and carry of the full adder circuit are
given in Eq.(1) and Eq.(2):

$$
\begin{aligned}
\mathrm{SUM} &= A \oplus B \oplus C \\
&= (A \oplus B) \oplus C \\
&= (A'B + AB')C' + (A'B' + AB)C \\
&= A'BC' + AB'C' + A'B'C + ABC \\
&= A(B'C' + BC) + A'(BC' + B'C) \\
&= (B \oplus C)'A + (B \oplus C)A'
\end{aligned}
\tag{1}
$$

$$
\begin{aligned}
\mathrm{CARRY} &= AB + BC + CA \\
&= C(B + A) + AB \\
&= C(B + A(B + B')) + AB \\
&= B(C + A(C + C')) + ACB' \\
&= BC + ABC' + ACB' \\
&= B(B \oplus C)' + A(B \oplus C)
\end{aligned}
\tag{2}
$$
The working of the circuit can be explained as follows and is
verified by the truth table in Table.1. If B = C (both 0 or both
1), then Sum = A and Carry = B. If B != C, then Sum = A' and
Carry = A.
In the proposed Wallace tree structure, this modified adder is
used along with the reduced complexity algorithm for the Wallace
tree multiplier [9-10]. Unlike the conventional Wallace multiplier,
in which full adders and half adders process groups of three and
two bits respectively, the reduced complexity algorithm uses only
full adders, except where a half adder is needed to keep the number
of stages equal to that of the conventional Wallace algorithm.
Eliminating half adders does not affect the performance of the
multiplier because they do not compress the partial product bits:
two input bits produce two output bits (Sum and Carry).
Fig.2. Proposed Adder
Fig.3. Pass Transistor Logic Based 2:1 Multiplexer
Table.1. Truth Table of Proposed Adder
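| A | B | C | Sum   | Carry |
|---|---|---|-------|-------|
| 0 | 0 | 0 | 0(A)  | 0(B)  |
| 0 | 0 | 1 | 1(~A) | 0(A)  |
| 0 | 1 | 0 | 1(~A) | 0(A)  |
| 0 | 1 | 1 | 0(A)  | 1(B)  |
| 1 | 0 | 0 | 1(A)  | 0(B)  |
| 1 | 0 | 1 | 0(~A) | 1(A)  |
| 1 | 1 | 0 | 0(~A) | 1(A)  |
| 1 | 1 | 1 | 1(A)  | 1(B)  |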
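The behaviour listed in Table.1 can be checked with a small behavioural model in Python. This is only a sketch under stated assumptions: the helper names `mux2` and `proposed_adder` are ours, and building the B XOR C select signal from one more 2:1 multiplexer plus an inverter is an assumption about how such an adder could be wired, not a transcription of Fig.2.

```python
from itertools import product


def mux2(sel: int, d0: int, d1: int) -> int:
    """Ideal 2:1 multiplexer: returns d0 when sel is 0, d1 when sel is 1."""
    return d1 if sel else d0


def proposed_adder(a: int, b: int, c: int) -> tuple:
    """Return (sum, carry) using only 2:1 muxes and inverters."""
    sel = mux2(b, c, 1 - c)        # B XOR C (assumed mux realization)
    total = mux2(sel, a, 1 - a)    # Sum   = A if B == C else A'
    carry = mux2(sel, b, a)        # Carry = B if B == C else A
    return total, carry


# Exhaustive comparison with ordinary binary addition of three bits.
for a, b, c in product((0, 1), repeat=3):
    total, carry = proposed_adder(a, b, c)
    assert 2 * carry + total == a + b + c
    print(a, b, c, total, carry)
```

The loop covers all eight input combinations, so a passing run confirms that the two selection rules reproduce a full adder.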
Fig.4 shows 4×4 bit multiplication using the reduced complexity
algorithm. It can be seen that only S3 and C2 are processed as two
bits, so that the number of stages does not exceed that of the
conventional approach. The intermediate additions are performed
using the proposed adder. The proposed Wallace tree multiplier
structure is shown in Fig.5.
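The role of the half adder can be seen in a simplified Python model. The sketch below is not the exact grouping of Fig.4: it is a column-compression variant that uses full adders only (no half adders at all), and the final two-row addition is modelled with integer arithmetic rather than an adder circuit. The function names are hypothetical.

```python
def full_adder(a, b, c):
    """Full adder following Table.1: the select signal is B XOR C."""
    sel = b ^ c
    total = (1 - a) if sel else a   # Sum = A when B == C, else A'
    carry = a if sel else b         # Carry = B when B == C, else A
    return total, carry


def multiply_full_adders_only(x, y, n=4):
    """Return (product, stages) for two n-bit unsigned operands."""
    # Step 1: form the partial products and bucket them by bit weight.
    cols = [[] for _ in range(2 * n - 1)]
    for i in range(n):
        for j in range(n):
            cols[i + j].append(((x >> i) & 1) & ((y >> j) & 1))

    # Steps 2 and 3: group bits in threes and compress them with full
    # adders until no column holds more than two bits.
    stages = 0
    while any(len(col) > 2 for col in cols):
        nxt = [[] for _ in range(len(cols) + 1)]  # room for a top carry
        for w, col in enumerate(cols):
            k = 0
            while len(col) - k >= 3:
                s, carry = full_adder(col[k], col[k + 1], col[k + 2])
                nxt[w].append(s)
                nxt[w + 1].append(carry)
                k += 3
            nxt[w].extend(col[k:])  # leftover bits pass through
        cols = nxt
        stages += 1

    # Final carry-propagate addition, modelled with integer arithmetic.
    result = sum(bit << w for w, col in enumerate(cols) for bit in col)
    return result, stages


# Exhaustive check over all 4-bit operand pairs.
for x in range(16):
    for y in range(16):
        assert multiply_full_adders_only(x, y)[0] == x * y
print(multiply_full_adders_only(0b1011, 0b0110))
```

In this simplified model a 4×4 multiplication needs three compression stages, one more than the two stages of the conventional Wallace reduction. This is consistent with the paper's choice to process S3 and C2 with a half adder so that the stage count does not increase.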
Fig.4. 4×4 bit Multiplication using Reduced Complexity Wallace
Algorithm
[Dot diagram of the partial products XiYj reduced to the product
bits P7 to P0. Annotations in the figure: "N = 4 = r0"; "H.A is
used here so r2 will contain only two"; "H.A so that stages will
not increase"; "r2 = 2".]
Fig.5. Proposed 4×4 bit Wallace Tree Multiplier
4. RESULTS
Simulations of all the circuits were performed in Cadence
Virtuoso using 90nm technology. The proposed adder is compared
with the conventional adder of [8] in Table.2.
Fig.6 and Fig.7 show the output waveforms of the proposed and
conventional adders, respectively. The output waveforms show that
both circuits function correctly. For a fair comparison with the
conventional circuit of [8], the same inputs were applied to both
full adders.
Fig.6. Output Waveform of Proposed Adder
Fig.7. Output Waveform of Conventional Adder
Table.2. Comparative Result of Adders
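| Adder Circuit      | Existing Adder      | Proposed Adder        |
|--------------------|---------------------|-----------------------|
| No. of Transistors | 56                  | 18                    |
| Delay (ps)         | 60.58               | 48.48                 |
| Logic Used         | CMOS XOR gate + Mux | Pass Transistor Logic |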
From Table.2, it is observed that the area (i.e. the number of
transistors) and the delay of the proposed adder are considerably
lower than those of the conventional adder.
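To put numbers on "considerably", the relative reductions can be computed directly from the Table.2 entries. The Python sketch below only uses the values reported in the table; the helper name `reduction_percent` is ours.

```python
def reduction_percent(existing: float, proposed: float) -> float:
    """Relative reduction of `proposed` with respect to `existing`, in %."""
    return (existing - proposed) / existing * 100.0


# Values copied from Table.2.
transistor_saving = reduction_percent(56, 18)
delay_saving = reduction_percent(60.58, 48.48)

print(f"Transistor count reduced by {transistor_saving:.1f}%")
print(f"Delay reduced by {delay_saving:.1f}%")
```

With the reported figures this works out to a reduction of roughly 68% in transistor count and roughly 20% in delay.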
Simulated results of the proposed Wallace tree structure are
shown in Fig.8. Table.3 summarizes the results obtained for the
Wallace tree structures. It can be seen that the number of
transistors is considerably reduced in the proposed Wallace tree
structure.
This reduction lowers the power dissipation of the Wallace
structure: the power dissipated by the switching activity of the
pass transistor logic is less than the power dissipated by the
extra transistors in the conventional Wallace structure. The power
delay product is also lower than that of the conventional
structure.
Fig.10, Fig.11 and Fig.12 show the layouts of the proposed adder
block, the pass transistor 2:1 multiplexer and the NOT gate
respectively. The layout of the proposed adder fits within a
48.5 × 30.9 µm boundary using the design rules for 90nm technology,
and finally the GDSII file was generated.
Fig.8. Output Waveform of Proposed RCWTM
The waveforms of Fig.8 can be explained using the algorithm of
Fig.4. Here, X = 1111 and Y = 1111 (15 in decimal). Therefore,
after multiplication we get P = 11100001 (15 × 15 = 225). Fig.9
shows the algorithmic procedure.
Fig.9. Verification of Result of Proposed Multiplier
Table.3. Comparative Study of Wallace Tree Structures
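| Multiplier Circuit     | Existing Wallace Tree [9] | Proposed Wallace Tree | Percentage Improvement |
|------------------------|---------------------------|-----------------------|------------------------|
| Number of Transistors  | 768                       | 312                   | 59.3%                  |
| Power Dissipation (mW) | 2.283                     | 0.694                 | 69.6%                  |
| Power Delay Product    | 473.72                    | 65.57                 | 86.1%                  |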
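The percentage column of Table.3 can be cross-checked from the raw entries, and the delay implied by each design can be recovered from its power delay product. The Python sketch below assumes that the power delay product is simply power multiplied by delay; the table does not state the unit of the power delay product, so the implied delays are left in table units per mW. All variable names are ours.

```python
# (existing, proposed) pairs copied from Table.3.
table3 = {
    "transistors": (768, 312),
    "power_mW": (2.283, 0.694),
    "pdp": (473.72, 65.57),
}

# Relative improvement of each metric, in percent.
improvement = {
    name: (old - new) / old * 100.0 for name, (old, new) in table3.items()
}
for name, value in improvement.items():
    print(f"{name}: {value:.2f}% improvement")

# Assumption: PDP = power x delay, so delay = PDP / power.
implied_delay_existing = table3["pdp"][0] / table3["power_mW"][0]
implied_delay_proposed = table3["pdp"][1] / table3["power_mW"][1]
print(f"implied delay, existing: {implied_delay_existing:.2f}")
print(f"implied delay, proposed: {implied_delay_proposed:.2f}")
```

The computed improvements agree with the table to within rounding in the last digit. Under the stated assumption, the implied delay of the proposed structure is less than half that of the existing one, which is why the power delay product improves by more than the power alone.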
Fig.10. Layout of proposed Adder
Fig.11. Layout of P.T.L based 2:1 Mux
Fig.12. Layout of NOT gate
5. CONCLUSION
In this paper, the Wallace tree multiplier has been investigated,
and a modified Wallace tree multiplier circuit has been proposed
and simulated using 90nm technology in Cadence Virtuoso. Initially,
we simulated the existing adder and Wallace multiplier proposed
in [8]. Then a new, improved adder was proposed that uses PTL
based 2:1 multiplexers. It uses fewer transistors than the existing
adder, so the area is reduced, and it also offers lower delay.
The proposed Wallace structure offers a 69.6% reduction in power
dissipation, an 86.1% reduction in power delay product and a 59.3%
reduction in the number of transistors used, i.e. area (Table.3).
With the increasing demand for greater speed with less battery
usage and minimum area, the proposed structure will prove
beneficial in applications that perform mathematical operations
using multipliers, such as ALUs and DSP structures.
REFERENCES
[1] C.S. Wallace, “A Suggestion for a Fast Multiplier”, IEEE
Transactions on Electronic Computers, Vol. 13, No. 1, pp.
14-17, 1964.
[2] S. Rajaram and K. Vanithamani, “Improvement of Wallace
Multipliers using Parallel Prefix Adders”, Proceedings of
IEEE International Conference on Signal Processing,
Communication, Computing and Networking Technologies,
pp. 781-784, 2011.
[3] M.J. Rao and S. Dubey, “A High Speed and Area Efficient
Booth Recoded Wallace Tree Multiplier for Fast Arithmetic
Circuits”, Proceedings of Asia Pacific Conference on
Postgraduate Research in Microelectronics and
Electronics, pp. 220-223, 2012.
[4] S. Karthick, S. Karthika and S. Valannathy, “Design and
Analysis of Low Power Compressors”, International
Journal of Advanced Research in Electrical, Electronics and
Instrumentation Engineering, Vol. 1, No. 6, pp. 487-493,
2012.
[5] Saeeid Tahmasbi Oskuii, Per Gunnar Kjeldsberg and Oscar
Gustafsson, “Power Optimized Partial Product Reduction
Interconnect Ordering in Parallel Multipliers”, Proceedings
of Nordic Circuits and Systems Conference, pp. 1-6, 2007.
[6] S. Murugeswari and S.K. Mohideen, “Design of Area
Efficient and Low Power Multipliers using Multiplexer
based Full Adder”, Proceedings of 2nd International
Conference on Current Trends in Engineering and
Technology, pp. 388-392, 2014.
[7] Yingtao Jiang, Abdulkarim Al-Sheraidah, Yuke Wang,
Edwin Sha and Jin-Gyun Chung, “A Novel Multiplexer-
based Low-Power Full Adder”, IEEE Transactions on
Circuits and Systems, Vol. 51, No. 7, pp. 345-348, 2004.
[8] Kokila Bharti Jaiswal, Nitish Kumar, Pavithra Seshadri and
G. Laxminarayan, “Low Power Wallace Tree Multiplier
using Modified Full Adder”, Proceedings of 3rd
International Conference on Signal Processing,
Communication and Networking, pp. 1-4, 2015.
[9] R.S. Waters and E.E. Swartzlander, “A Reduced Complexity
Wallace Multiplier Reduction”, IEEE Transactions on
Computers, Vol. 59, No. 8, pp. 1134-1137, 2010.
[10] Sandeep Kakde, Shahebaj Khan, Pravin Dakhole and
Shailendra Badwaik, “Design of Area and Power Aware
Reduced Complexity Wallace Tree Multiplier”,
Proceedings of International Conference on Pervasive
Computing, pp. 1-6, 2015.