The following topics are covered using Lattice Diamond Design Software version 2.0.1:
• Overview of the Booth Radix-4 Sequential Multiplier
• State Machine Structure and Application of Booth Algorithm
• Booth Radix-4 Word-Width Scalability
• Testing the Multiplier with a Test Bench
This Verilog module uses a simple 2-state finite state machine (FSM) to evaluate groupings of 3 bits held in a product register and choose one of five possible operations based on each grouping (add zero, add or subtract the multiplicand, or add or subtract twice the multiplicand). The state diagram for this FSM is shown in Figure 1 below. This 3-bit recoded shift-and-add process is known as the Booth algorithm; the version used in this module is the Booth Radix-4 multiplication algorithm.
Figure 1 – Booth Radix-4 FSM State Diagram
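The shift-and-add loop performed in the busy state can be sketched as a small behavioral model. This is a Python illustration, not the Verilog source: the names `RECODE` and `booth_sequential`, the split of the product register into `upper` and `lower` halves, and the one-step-per-iteration structure are assumptions made for the sketch. The recoding dictionary is the standard Radix-4 Booth mapping and may be laid out differently from the one in this design.

```python
# Standard radix-4 Booth recoding of the three lowest product-register bits.
# Values are the multiple of the multiplicand to add (assumed layout).
RECODE = {0b000: 0, 0b001: 1, 0b010: 1, 0b011: 2,
          0b100: -2, 0b101: -1, 0b110: -1, 0b111: 0}


def booth_sequential(mcand, mplier, width):
    """Multiply two signed `width`-bit integers one Booth step at a time."""
    if width < 4 or width % 2:
        raise ValueError("width must be an even number of bits, 4 or more")
    upper = 0                                     # accumulator half (signed)
    lower = (mplier & ((1 << width) - 1)) << 1    # multiplier plus appended 0
    for _ in range(width // 2):                   # one pass per add-and-shift step
        upper += RECODE[lower & 0b111] * mcand    # add 0, +/-M or +/-2M
        # Arithmetic shift right by two across the combined register.
        lower = (lower >> 2) | ((upper & 0b11) << (width - 1))
        upper >>= 2
    return (upper << width) | (lower >> 1)        # drop the appended bit


print(booth_sequential(-3, 3, 4))
```

The dictionary lookup plays the role of the multiplexer: each pass inspects three bits, picks one of the five operations, and then shifts the register right by two positions.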
The Booth Radix-4 algorithm halves the number of partial products while keeping circuit complexity to a minimum. This results in lower-power operation in an FPGA or CPLD and provides multiplication when no hard multipliers are available, as in the Lattice MachXO2 PLD used in this example. Booth recoding makes these advantages possible by skipping clock cycles that add nothing new in the way of product terms. The Radix-4 Booth recoding is simply a multiplexer that selects the correct shift-and-add operation based on the groupings of bits found in the product register. The product register holds the multiplier. The multiplicand or the two’s complement of the multiplicand is added according to the recoding value. The recoding is listed in Table 1 below, and an example multiplication is shown in Figure 2 below.
Figure 2 – Radix-4 Booth Algorithm Example
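The halving of partial products can be checked with a short Python sketch. It uses the standard Radix-4 Booth identity, in which each overlapping 3-bit group of the multiplier maps to a digit of value (low bit) + (middle bit) - 2 × (high bit), with a 0 appended below the least significant bit. The function names `booth_digits` and `multiply_from_digits` are hypothetical, and this is an arithmetic illustration rather than a description of the hardware.

```python
def booth_digits(mplier, width):
    """Radix-4 Booth digits (each in -2..2) of a signed value, LSB first."""
    if width % 2:
        raise ValueError("width must be even")
    bits = (mplier & ((1 << width) - 1)) << 1     # append a 0 below the LSB
    digits = []
    for i in range(0, width, 2):                  # groups overlap by one bit
        low = (bits >> i) & 1
        mid = (bits >> (i + 1)) & 1
        high = (bits >> (i + 2)) & 1
        digits.append(low + mid - 2 * high)
    return digits


def multiply_from_digits(mcand, mplier, width):
    """Sum one partial product per digit, weighted by powers of four."""
    digits = booth_digits(mplier, width)
    return sum(d * mcand * 4 ** k for k, d in enumerate(digits))


print(booth_digits(7, 4))             # [-1, 2], since 7 = -1 + 2*4
print(multiply_from_digits(5, 7, 4))  # 35
```

A 4-bit multiplier yields two digits, so only two partial products are summed instead of four.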
Software required for this design:
• Lattice Diamond Design Software version 2.0.1 with the third-party tools Synplify Pro for Lattice and Active-HDL Lattice Edition.
The Booth Radix-4 multiplier can be scaled from 4 bits upward in even word widths such as 6, 8, 10… The user is limited by the logic density and speed of the PLD. Larger word widths require larger circuits with longer propagation delays, and therefore a slower clock. A 6-bit multiplier was benchmarked at 157 MHz in a MachXO2, while an 18-bit multiplier ran at 140 MHz.
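As a rough guide to how word width and clock rate trade off, the Python sketch below estimates the time spent in the add-and-shift steps. It assumes one Radix-4 step per clock cycle, so a width of N bits takes N/2 steps, and it ignores any cycles the actual design spends loading operands or signalling completion. The function names are hypothetical; the clock rates are the benchmark figures quoted above.

```python
def booth_iterations(width):
    """Number of radix-4 add-and-shift steps for an even word width >= 4."""
    if width < 4 or width % 2:
        raise ValueError("word width must be even and at least 4")
    return width // 2


def compute_time_ns(width, clock_hz):
    """Time spent in the add-and-shift steps only, in nanoseconds."""
    return booth_iterations(width) / clock_hz * 1e9


for width, clock_hz in [(6, 157e6), (18, 140e6)]:
    print(f"{width}-bit: {booth_iterations(width)} steps, "
          f"{compute_time_ns(width, clock_hz):.1f} ns "
          f"at {clock_hz / 1e6:.0f} MHz")
```

Under these assumptions the 18-bit case needs three times as many steps as the 6-bit case, so the step count affects total time far more than the modest drop in clock rate does.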
The design has five input ports (“clk”, “n_reset”, “start”, “mcand”, and “mplier”) and two output ports (“done” and “product”). The multiplier requires a start pulse, which loads the FSM with the values on the “mcand” and “mplier” inputs and puts it in the “BUSY” state. Math steps can be sequenced by connecting the “start” and “done” signals between instantiations.
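The sequencing of two instances through “start” and “done” can be sketched with a cycle-level Python stand-in. Everything about the timing here is an assumption: the model stays busy for width/2 clocks, pulses `done` for one clock, and computes the product with a behavioral shortcut rather than the Booth steps. The class `BoothMultModel` and the function `chained_multiply` are hypothetical names, and the “n_reset” input is not modeled.

```python
class BoothMultModel:
    """Cycle-level stand-in for one multiplier instance (timing is assumed)."""

    def __init__(self, width):
        self.width = width
        self.busy = False          # False models the idle state
        self.done = 0
        self.product = 0
        self._count = 0
        self._result = 0

    def clock(self, start, mcand=0, mplier=0):
        """Advance one rising clock edge and return the `done` output."""
        self.done = 0
        if not self.busy:
            if start:                              # start pulse loads operands
                self._result = mcand * mplier      # behavioral shortcut
                self._count = self.width // 2
                self.busy = True
        else:
            self._count -= 1
            if self._count == 0:                   # last step finished
                self.product = self._result
                self.done = 1
                self.busy = False
        return self.done


def chained_multiply(a, b, c, width=8):
    """Compute (a * b) * c with two instances linked by done -> start."""
    first = BoothMultModel(width)
    second = BoothMultModel(2 * width)             # takes the wider product
    first.clock(start=1, mcand=a, mplier=b)
    start_second = 0
    while True:
        done_first = first.clock(start=0)
        if second.clock(start_second, mcand=first.product, mplier=c):
            return second.product
        start_second = done_first                  # done feeds the next start


print(chained_multiply(3, -5, 7))  # -105
```

The second instance is given twice the word width because the product of two N-bit operands occupies 2N bits.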
The RTL diagram for a 64-bit implementation can be found in Figure 3 below.
Figure 3 – RTL Diagram for Radix-4 Booth Multiplier
The included test bench was created with the “generate test bench template” command in the “HDL Diagram” window. Read the Verilog comments in the “booth_mult_tf.v” file for details. Aldec Active-HDL waveforms for 8-bit and 64-bit implementations can be found below in Figures 4 and 5.
Figure 4 – Active-HDL Test Bench Output for 8-bit Implementation
Figure 5 – Active-HDL Test Bench Output for 64-bit Implementation
Lattice Diamond Design Software version 2.0.1 was used to develop “booth_mult.v”, with supporting software from Synopsys (Synplify Pro for Lattice) and Aldec (Active-HDL Lattice Edition). Diamond can be used as a standalone development environment with alternative synthesis and simulation software.
Further design support, product tutorials, application notes, user guides, and other documentation can be found on the Lattice website.
The complete Lattice Diamond project can be downloaded here.