# DESIGN OF LOW POWER 2-D MULTIPLIER USING 2-D BYPASSING TECHNIQUE

<sup>1</sup>Vinod Kumar D., <sup>2</sup>Krishnamacharya C., <sup>3</sup>Gangaraju B., <sup>4</sup>Uma Maheswara Rao K.V., <sup>5</sup>Avinash K.

<sup>1</sup> Jawaharlal Nehru Technological University-Kakinada (JNTUK), India

Email: <sup>1</sup> vinod.dadi93@gmail.com, <sup>2</sup> anil92.dec@gmail.com, <sup>3</sup> bgrajkumar411@gmail.com, <sup>4</sup> umesh\_kvu@yahoo.co.in, <sup>5</sup> avinash.kanthimahanthi@gmail.com

### **Abstract**

Based on the simplification of the addition operations in a low-power bypassing-based multiplier, a low-cost low-power bypassing-based multiplier is proposed. Compared with the row-bypassing multiplier, the column-bypassing multiplier and the 2-dimensional bypassing-based multiplier on 20 test examples, the experimental results show that the proposed low-cost low-power multiplier saves 15.1% of the hardware cost and reduces power dissipation by 29.6% on average for 4×4, 8×8 and 16×16 multipliers.

**Keywords:** Row bypassing multiplier, column bypassing multiplier, 2-dimensional bypassing multiplier, power dissipation.

### I. INTRODUCTION

As we get closer to the limits of scaling in complementary metal oxide semiconductor (CMOS) circuits, speed issues are becoming more and more important. In recent years, the impact of pervasive computing and the internet has accelerated this trend. The applications for these domains typically run on battery-powered embedded systems. The resulting constraints on speed require design for speed as well as design for performance at all layers of system design. Increasing speed is therefore a key design goal for portable computing and communication devices that employ increasingly sophisticated signal processing techniques. Flexibility is another critical requirement, one that mandates the use of programmable components such as FPGAs in such devices.
However, there is a fundamental trade-off between efficiency and flexibility, and as a result programmable designs incur significant performance and speed penalties compared with application-specific solutions. Consequently, various digital signal processing chips are now designed for high-speed performance. Signal processing applications typically exhibit high degrees of parallelism and are dominated by a few regular kernels of computation, such as multiplication, that are responsible for a large fraction of execution time and energy. In such systems, the multiplier is a fundamental arithmetic unit. Shrinking feature sizes are responsible for increasing delay-related problems as well.

### II. DESIGN CONSIDERATIONS

#### A. INTRODUCTION

This section presents the design considerations of the high-speed parallel multiplier and explains the development of the source code (VHDL code) module by module. The design of efficient logic circuits is a fundamental problem in the design of high-performance processors. The design of fast parallel multipliers is important, since multiplication is a commonly used and expensive operation. This is particularly critical for specialized chips that support multiplication-intensive operations, such as digital signal processing and graphics. It can also be useful for pipelined CPUs, where faster multipliers can result in shorter clock cycles and/or shorter pipelines. The multipliers considered are:

1. 4×4 Braun multiplier
2. 4×4 row bypassing multiplier
3. 4×4 column bypassing multiplier
4. 4×4 2-D bypassing multiplier

These modules are described with reference to their schematic diagrams and the VHDL source code developed for them. All top-level code for the high-speed parallel multiplier is written in VHDL in behavioral and structural styles. All sub-module code is written in data-flow style.
For high-speed parallel multiplier designs, precautions need to be taken at each abstraction level, from the system level down to the technology process level. The higher the abstraction level at which a decision to increase speed is taken, the greater its impact. In general, switching activity can be reduced at various levels of the design flow.

#### B. 4×4 BRAUN MULTIPLIER

The Braun multiplier removes the need for extra correction circuitry, and it uses fewer adders. Its limitation is that it cannot stop the switching activity even when a coefficient bit is zero, which ultimately results in unnecessary time delay. Other high-speed designs disable the operation in some rows, a technique that reduces the switching activity to a fairly good extent.

Fig. 1. Schematic diagram of Braun multiplier

#### C. 4×4 ROW BYPASSING MULTIPLIER

The row bypassing multiplier reduces the switching activity by bypassing every row for which the multiplier bit is zero. That is, if a bit of the multiplier is zero, the corresponding row of adders is disabled. For example, consider the multiplication $1011 \times 1010$. Here the multiplier has zeros in the first and third positions. During multiplication the first and third rows of adders are disabled, and the previous sum is taken as the present sum.

Fig. 2. Schematic diagram of row bypassing multiplier

Here a special circuit called an adding cell is used instead of plain full adders. It consists of three-state gates, a full adder and multiplexers. The inputs, i.e. the partial products to be summed, are given to the full adder through the three-state gates. The enable input of the three-state gates and multiplexers is the corresponding multiplier bit. If this bit is zero, the three-state gates go into the high-impedance state, so the inputs are not given to the full adder. The previous sum alone is taken as the present sum.
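The bypass behaviour of the adding cell can be sketched as a small behavioural model. The Python below is only an illustration under stated assumptions, not the VHDL developed for this work: the names `adding_cell` and `row_bypass_multiply` are hypothetical, rows are numbered from 1 starting at the least significant multiplier bit (which is how the $1011 \times 1010$ example appears to count them), and the multiplier is modelled at word level instead of as a gate-level Braun array.

```python
from typing import List, Tuple


def adding_cell(prev_sum: int, pp: int, carry_in: int, enable: int) -> Tuple[int, int]:
    """One adding cell: a full adder guarded by the multiplier bit `enable`.

    Returns (sum, carry). All arguments are single bits (0 or 1).
    """
    if enable == 0:
        # Three-state gates are in high impedance, so the full adder gets no
        # new inputs; the multiplexers forward the previous sum. Passing the
        # incoming carry through unchanged is an assumption of this sketch.
        return prev_sum, carry_in
    total = prev_sum + pp + carry_in  # ordinary full-adder behaviour
    return total & 1, total >> 1


def row_bypass_multiply(a: int, b: int, n: int = 4) -> Tuple[int, List[int]]:
    """Word-level model of an n x n unsigned row bypassing multiplier.

    `a` is the multiplicand and `b` the multiplier. Returns the product and
    the list of 1-based rows that would be bypassed.
    """
    if not (0 <= a < 2 ** n and 0 <= b < 2 ** n):
        raise ValueError("operands must fit in n bits")
    acc = 0
    bypassed = []
    for j in range(n):
        if (b >> j) & 1 == 0:
            bypassed.append(j + 1)  # row skipped: previous sum is kept as is
        else:
            acc += a << j           # row active: add the shifted multiplicand
    return acc, bypassed


product, rows = row_bypass_multiply(0b1011, 0b1010)
print(product, rows)  # prints: 110 [1, 3]
```

For the operands of the example, the model reports rows 1 and 3 as bypassed and still returns the correct product, which is the property the adding cell is meant to preserve. A real array would also need the carry handling of the final adder stage, which this sketch does not attempt to reproduce.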
If the multiplier bit is one, the three-state gates are enabled and the inputs are given to the full adder. The sum is then generated and taken as the present sum.

Fig. 3. Internal structure of adding cell

In this adding cell the three-state gates are enabled only when $X_j = 1$, and the adder then receives its inputs. If $X_j = 0$, the previous sum and carry are taken as the present sum and carry. Row bypassing is thus performed by this adding cell (AC). In this way the switching activity is reduced whenever the multiplier bit is zero, so the switching activity in the row bypassing multiplier is lower than in the Braun multiplier. The only disadvantage of the row bypassing multiplier is that it needs more circuitry than the Braun multiplier. This limitation can be overcome by the column bypassing multiplier.

#### D. 4×4 COLUMN BYPASSING MULTIPLIER

Consider the multiplication $1010 \times 1000$. Since the multiplicand contains two zeros, the corresponding columns, i.e. the first and third, are disabled. Now consider the multiplication $1111 \times 1000$. Since the multiplicand contains no zeros, all columns switch.

Fig. 4. Schematic diagram of 4×4 column bypassing multiplier

The limitation of this technique is that the number of columns switched depends on the number of ones in the multiplicand. For example, if the multiplicand is 16 bits long and equal to 1111111111111111, all the full adders in all the columns switch and consume more power. Less switching activity is achieved when the multiplicand contains more zeros than ones.

#### E. 4×4 2-D MULTIPLIER USING BYPASSING TECHNIQUE

For a low-power 2-dimensional bypassing-based multiplier [9], it is desired that the addition operations in the $(i+1)$-th column or the $j$-th row can be bypassed if the bit $a_i$ in the multiplicand is 0 or the bit $b_j$ in the multiplier is 0. To correct the carry propagation in the multiplication result, the carry bit must be considered in the bypassing condition as follows: if the bits $a_i$ and $b_j$ are 0 and the carry bit $c_{i,j-1}$ is 1, the addition operations in the $(i+1)$-th column or the $j$-th row cannot be bypassed. Hence, the bypassing circuit must be added to the necessary FA to form a correct adder cell (AC). However, the inserted bypassing circuit in the AC is so complicated that its ability to reduce power is diminished. Fig. 5 illustrates a 4×4 2-dimensional bypassing-based multiplier with a correct AC.

Fig. 5. 4×4 2-D bypassing multiplier based on the Braun multiplier

### III. TOOL USED

#### A. INTRODUCTION

This section gives Xilinx PLD designers an overview of the basic design process using Xilinx ISE 11.1i. It explains how to create, verify, and implement a design, and contains the following parts:

- Getting Started
- Create a New Project
- Create an HDL Source
- Design Simulation
- Implement Design and Verify Constraints

Getting Started

Software requirements:

- ISE 11.1i

Starting the ISE software: to start ISE, double-click the desktop icon, or start ISE from the Start menu by selecting Start → All Programs → Xilinx ISE 11.1i → Project Navigator.

**Comparison of time delays:** By comparing the time delays reported in the synthesis reports, we can identify the fastest multiplier. The table below gives the comparison results.

Table 1: Comparison of time delays.

| Multiplier (4 × 4) | Maximum combinational path delay (ns) |
|-----------------------------|---------------------------------------|
| Braun multiplier | 16.723 |
| Row bypassing multiplier | 15.4091 |
| Column bypassing multiplier | 14.8071 |
| 2-D bypassing multiplier | 13.608 |

The table shows that the 2-D bypassing multiplier has the smallest maximum combinational path delay (13.608 ns), about 18.6% below that of the Braun multiplier (16.723 ns).

### VI. CONCLUSION

The hardware implementations of the Braun, row bypassing, column bypassing and 2-D bypassing multipliers have been compared in terms of their time delays, as reported in Table 1. By combining row bypassing and column bypassing, we have produced a 2-D bypassing multiplier with less delay and power consumption. Based on the simplification of the MAC operations and the use of the low-power bypassing technique, we have designed a two-dimensional multiplier that performs fewer computations and has less switching activity, with low power consumption.

### **REFERENCES**

- [1] Oscal T.-C. Chen, Sandy Wang, and Yi-Wen Wu, 2003. Minimization of Switching Activities of Partial Products for Designing Low-Power Multipliers. IEEE Transactions on VLSI Systems, vol. 11, no. 3.
- [2] Hichem Belhadj, Behrooz Zahiri, and Albert Tai, 2003. Power-sensitive design techniques on FPGA devices. Proceedings of International conference on IC Taipa.
- [3] Hong S., Kim S., Papaefthymiou M.C., and Stark W.E., 1999. Low power parallel multiplier design for DSP applications through coefficient optimization. In Proc. of Twelfth Annual IEEE Int. ASIC/SOC Conf., pp. 286-290.
- [4] Abu-Khater I. S., Bellaouar A., and Elmasry M., 1996. Circuit techniques for CMOS low-power high-performance multipliers. IEEE J. Solid-State Circuits, vol. 31, pp. 1535-1546.
- [5] Ohban J., Moshnyaga V.G., and Inoue K., 2002. Multiplier energy reduction through bypassing of partial products. Asia-Pacific Conf. on Circuits and Systems, vol. 2, pp. 13-17.