Fig. 1 Butterfly Representation [9]
Fig. 2 Bit Reversal [11]
FAST FOURIER TRANSFORM BY VERILOG
Hina Magsi
Department Of Electrical Engineering
Sukkur Institute of Business Administration, University
Sukkur, Pakistan
hina.mece17@iba-suk.edu.pk
Ali Hassan Sodhro
Department Of Electrical Engineering
Sukkur Institute of Business Administration, University
Sukkur, Pakistan and DISP LAB, University Lumiere
Lyon2, Lyon, France
ali.hassan@iba-suk.edu.pk, alihassan.sodrho@uni-lyon2.fr
Abstract: This paper presents a Verilog implementation of the fast Fourier transform (FFT) in Vivado. The butterfly diagram is used to design the FFT of the given input signals. The FFT is a useful and efficient tool in digital signal processing, especially in signal and image processing. It is also used in new technologies such as the Internet of Medical Things to extract the desired information from a signal. The Verilog implementation of the FFT includes complex-number addition and multiplication because of the twiddle factors. It is an efficient and fast method that requires fewer computations and produces its output in a short time.
Keywords: fast Fourier transform (FFT), discrete Fourier transform (DFT), twiddle factor.
I. INTRODUCTION
The main goals in digital signal processing are to reduce hardware requirements and to increase computation speed. The fast Fourier transform (FFT) is an algorithm that takes samples of a signal over time or space and resolves them into their frequency components. It computes the discrete Fourier transform (DFT) of the signal and so converts the signal into the frequency domain. Its advantage over direct evaluation of the DFT is that it reduces the complexity from O(N^2) to O(N log N), where N is the number of data points. The FFT has many applications, such as spectral analysis, matched filtering, image processing and disease feature extraction. Once the FFT was developed, real-time digital signal processing became practical in many fields such as radar and communications. There are many situations in which the FFT must process real-time data, such as medical diagnosis and earthquake monitoring [17]. Detailed information about the frequencies and amplitudes in the spectrum of a signal is difficult to obtain in the time domain [19]. For this reason the signal is converted into the frequency domain to get the complete information. The FFT plays an important role in the Internet of Medical Things (IoMT), because EEG and ECG signals are highly noisy and difficult to process.
That is why extracting information from these signals is the main issue. The FFT is used to extract detailed and meaningful information from a noisy signal for analysis. It converts the noisy data into the frequency domain, where the desired information is available in detail and the noise level can be reduced. The frequency-domain representation of signals reduces the computational complexity of various digital processing systems. For ultrasonic echo signals, the FFT increases computational speed and accuracy, which improves system performance. The FFT is the central part of Doppler blood flow spectrum analysis and is also used in Doppler imaging systems [1]. The DFT and the FFT give the same results; the only difference is the reduced number of calculations. The FFT divides the data into smaller sets (even and odd) [12]. There are two types of division, known as decimation in time (DIT) and decimation in frequency (DIF). DIT requires bit-reversed input and produces output in normal order, while DIF takes input in normal order and generates bit-reversed output [14]. This paper uses the decimation-in-time method. At each stage, the results of the previous stage are combined. Fig. 1 shows the butterfly diagram of an 8-point FFT [9]. The butterfly is the part of the FFT computation that combines smaller DFTs into larger ones. The name butterfly comes from the shape of the data flow, as shown in Fig. 1. In every stage, one input is multiplied by a twiddle factor and the product is added to or subtracted from the other input. The number of points determines the number of butterfly stages: an 8-point FFT has 3 stages, a 16-point FFT has 4 stages, and so on.
Fig. 3 RTL diagram
The FFT calculation involves bit reversal: the inputs must be bit-reversed to produce the outputs in normal order.
The input sequence is divided into even-indexed and odd-indexed parts, and these parts are again divided into even and odd parts. This process continues until only two points are left [11]. In this paper the input sequence is (x0,x1,x2,x3,x4,x5,x6,x7), and after bit reversal the sequence becomes (x0,x4,x2,x6,x1,x5,x3,x7). This is shown in Fig. 2 [11].
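The bit-reversed ordering described above can be reproduced with a few lines of software. The following Python sketch is only an illustration and is not part of the Verilog design; the function names bit_reverse and bit_reversed_order are hypothetical, and N is assumed to be a power of two.

```python
def bit_reverse(index: int, bits: int) -> int:
    """Reverse the lowest `bits` bits of `index`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)  # shift in the current LSB
        index >>= 1
    return result


def bit_reversed_order(n_points: int) -> list:
    """Return the input index order expected by a radix-2 DIT FFT."""
    bits = n_points.bit_length() - 1  # log2(N), assuming N is a power of two
    return [bit_reverse(i, bits) for i in range(n_points)]


order = bit_reversed_order(8)
print(", ".join(f"x{i}" for i in order))  # x0, x4, x2, x6, x1, x5, x3, x7
```

For N = 8 each index has three bits, so index 3 (011) maps to 6 (110) and index 6 (110) maps to 3 (011), while 0, 2, 5 and 7 map to themselves.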
The main contribution of this paper is a radix-2 FFT design written in Verilog, implemented and verified in Vivado. The block diagram implemented in Vivado is shown in Fig. 3.
The remainder of this paper is organized as follows. Section II reviews the literature, Section III explains the proposed algorithm on Vivado, Section IV presents the results, and Section V concludes the paper.
II. LITERATURE REVIEW
Many researchers have worked on the FFT algorithm and implemented it with different methods. Some of this work is discussed below.
Zhou et al. proposed an FFT processor for high-speed, real-time signal processing [1]. They presented an FFT design on a field programmable gate array (FPGA) using the radix-2 algorithm, with a pipeline structure for the butterfly unit and ping-pong operation for the memory unit. They compared the theoretical and practical values in their paper. The structure is simple and can be expanded by changing the twiddle factors and adjusting the timing of the ping-pong operations.
Sridhanya et al. presented the application of the FFT in MIMO-OFDM (multiple-input multiple-output orthogonal frequency division multiplexing) systems [2]. They used radix-Ns butterflies in every stage, where Ns is the number of data streams. Memory scheduling and a multipath delay commutator were used in the hardware implementation, which reduced storage and saved power. They replaced the twiddle factor with a complex multiplier and presented the advantages in terms of power consumption.
Song Yu et al. proposed an FFT algorithm based on the rotation factor and the data address of each node [3]. They used the Verilog HDL platform for the design and realization of the data addressing, and compiled the design with PLD software. They concluded that an FPGA-based FFT is easy to expand and is capable of processing real-time data.
Athira et al. introduced an FFT application in telemetry data processing [4]. They used Vedic mathematics for the multiplications. The Urdhva Tiryakbhyam sutra was used because it reduces the computational complexity of the multipliers. They proposed a 24-bit floating-point implementation of IEEE 754 multiplication based on Vedic mathematics and compared the results with the conventional method. A carry adder or ripple-carry adder improved the performance of the addition part. The design was written in Verilog, and the simulation stage was carried out using ModelSim.
Zakir et al. presented Q-format constant multipliers for FFT processors, whose hardware implementation improves the speed of the multipliers [5]. The common subexpression elimination (CSE) method and the canonical signed digit (CSD) representation were used to reduce the number of adders, which also reduces the hardware requirements. They presented designs for 8, 10 and 16 bits using Verilog and implemented the system on an Altera device.
Ibrahim et al. implemented and investigated the radix-2 FFT algorithm on an FPGA and on graphics processing units (GPUs) [6]. Verilog HDL was used for the FPGA and the Open Computing Language (OpenCL) for the GPU. They compared both results and concluded that FFTs of small sizes run faster on the FPGA, while the GPU is better suited to larger FFT sizes. Their FPGA implementation was also faster than the built-in Xilinx IP core.
Archna et al. presented the importance of the Hilbert transform in digital signal processing applications such as modulation, frequency analysis and audio production [7]. They proposed realizing this algorithm with a radix-2^2 single-path delay feedback (SDF) pipelined FFT processor, which can be used in envelope detection. The system was implemented in Verilog using Xilinx tools.
Ravi et al. addressed the bit-reversal part of the FFT [8]. Their system is simple and efficient for reordering parallel data, and it targets parallel pipelined FFT processors. The system was implemented in Verilog HDL and simulated in ModelSim.
Anup et al. compared the DFT and the FFT in the field of digital signal processing [9]. They stated that the FFT is an efficient method for computing the DFT with a large number of points. They implemented the FFT algorithm in Verilog using floating-point arithmetic and concluded that the FFT reduces the cost and complexity of the adders and multipliers.
Miao et al. described the FFT as a powerful tool in signal processing [10]. The FFT reduces the computational complexity from O(N^2) to O(N log N). It is an important tool for real-time data and is used in many applications such as IoMT. They used the dot-product engine (DPE) to compute the FFT and compared the results with real multiplication operations. The DPE cluster can be used for many applications such as IoMT.
Fig. 4 DIT Radix-2 FFT
Vinodh et al. proposed an implementation of the FFT pruning algorithm on an FPGA [13]. When zero-valued inputs outnumber the non-zero ones, a standard FFT performs excessive computations; the pruning algorithm reduces the computation time in this case. It was implemented in hardware, and the computation time was observed. Implementing the FFT on an FPGA reduces the computation time and makes it easy to calculate the DFT [20].
Atin et al. proposed an area-efficient architecture for a radix-2 FFT processor [15]. The architecture reuses the butterfly units of a single stage, which reduces the area. The design was simulated in VHDL, then implemented and verified on an FPGA. The performance of an FFT processor can be increased by reducing its size and improving the SNR of the system [18]. The proposed system gives better results than the conventional FFT processor.
Josue et al. explained 16- and 32-point FFT algorithms written in VHDL for a Virtex-6 device [16]. The proposed design occupies a small area, and the increase in input-output blocks can be reduced by using parallel-in serial-out and serial-in parallel-out shift registers.
III. PROPOSED FFT ALGORITHM
The mathematical representation of the fast Fourier transform [9] is given as:

$$Y(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi k n / N}$$

where x(n) represents the inputs and the exponential part, $e^{-j2\pi kn/N} = w_N^{kn}$, is known as the twiddle factor, which is a complex number. N is the number of data points. This paper implements the DIT radix-2 fast Fourier transform algorithm in Verilog using the Vivado software. The design implements the DIT radix-2 8-point FFT butterfly algorithm. The butterfly structure comprises three stages, and each stage feeds the next to produce the final output. The design takes the input sequence and the twiddle factors and produces the final desired output. The twiddle factors are calculated manually and initialized in Vivado. The design includes adders and multipliers for complex numbers. Fig. 4 illustrates the radix-2 FFT design. The RTL diagram of the radix-2 FFT algorithm is shown in Fig. 6. The adder and multiplier of the radix-2 FFT are shown in Fig. 5. The design takes the 8-bit inputs (x0,x1,x2,x3,x4,x5,x6,x7), initializes the twiddle factors as real (wr) and imaginary (wi) parts, and after three stages produces the final outputs (y0,y1,y2,y3,y4,y5,y6,y7) as 8-bit real and imaginary parts. The initial input values are x0=1, x1=2, x2=4, x3=8, x4=16, x5=32, x6=64 and x7=128. The twiddle factors are calculated below, and the manual calculations are shown in Table I.
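$$w_N^{k} = e^{-j 2\pi k / N}$$

where N = 8:

$$w_8^{1} = e^{-j 2\pi (1)/8} = e^{-j\pi/4} = \frac{1 - j}{\sqrt{2}}$$

$$w_8^{2} = e^{-j 2\pi (2)/8} = e^{-j\pi/2} = -j$$

$$w_8^{3} = e^{-j 2\pi (3)/8} = e^{-j 3\pi/4} = \frac{-(1 + j)}{\sqrt{2}}$$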
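The twiddle factors for N = 8 can be checked numerically. The short Python sketch below is an illustration only (the paper initializes these constants by hand in Vivado); the helper name twiddle is hypothetical.

```python
import cmath
import math


def twiddle(k: int, n_points: int) -> complex:
    """Return w_N^k = exp(-j*2*pi*k/N)."""
    return cmath.exp(-2j * math.pi * k / n_points)


N = 8
for k in range(N // 2):
    w = twiddle(k, N)
    # Real and imaginary parts correspond to wr and wi in the text
    print(f"k={k}: wr={w.real:.3f}, wi={w.imag:.3f}")
```

Rounded to three decimals, k = 1 gives wr = 0.707 and wi = -0.707, and k = 3 gives wr = -0.707 and wi = -0.707, which are the constants that appear in the manual calculations.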
TABLE I. Manual Calculations

| Row | Stage 1 output | Stage 2 output | Stage 3 output |
|---|---|---|---|
| 0 | 1 + (1)(16) = 17 | 17 + (1)(68) = 85 | 85 + (1)(170) = 255 |
| 1 | 1 - (1)(16) = -15 | -15 + (-j)(-60) = -15 + j60 | [-15 + j60] + (0.707 - j0.707)(-30 + j120) = 48 + j165 |
| 2 | 4 + (1)(64) = 68 | 17 - (1)(68) = -51 | -51 + (-j)(-102) = -51 + j102 |
| 3 | 4 - (1)(64) = -60 | -15 - (-j)(-60) = -15 - j60 | [-15 - j60] + (-0.707 - j0.707)(-30 - j120) = -78 + j45 |
| 4 | 2 + (1)(32) = 34 | 34 + (1)(136) = 170 | 85 - (1)(170) = -85 |
| 5 | 2 - (1)(32) = -30 | -30 + (-j)(-120) = -30 + j120 | [-15 + j60] - (0.707 - j0.707)(-30 + j120) = -78 - j45 |
| 6 | 8 + (1)(128) = 136 | 34 - (1)(136) = -102 | -51 - (-j)(-102) = -51 - j102 |
| 7 | 8 - (1)(128) = -120 | -30 - (-j)(-120) = -30 - j120 | [-15 - j60] - (-0.707 - j0.707)(-30 - j120) = 48 - j165 |
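The manual calculations can be cross-checked with a small software model. The following Python sketch is not the authors' Verilog; it is a floating-point reference model of an 8-point radix-2 DIT FFT, written under the assumption that the stages are wired as in the calculations above, and it records the output of every stage. The function name fft8_dit_stages is hypothetical.

```python
import cmath
import math


def fft8_dit_stages(x):
    """Radix-2 DIT FFT of 8 samples; returns the outputs of all three stages."""
    n = 8
    w = [cmath.exp(-2j * math.pi * k / n) for k in range(n // 2)]
    # Bit-reversed input order: x0, x4, x2, x6, x1, x5, x3, x7
    data = [complex(x[i]) for i in (0, 4, 2, 6, 1, 5, 3, 7)]
    stages = []
    half = 1  # distance between the two inputs of a butterfly
    while half < n:
        step = n // (2 * half)  # twiddle index stride for this stage
        out = list(data)
        for start in range(0, n, 2 * half):
            for k in range(half):
                a = data[start + k]
                b = data[start + k + half] * w[k * step]  # twiddle product
                out[start + k] = a + b          # upper butterfly output
                out[start + k + half] = a - b   # lower butterfly output
        data = out
        stages.append(out)
        half *= 2
    return stages


x = [1, 2, 4, 8, 16, 32, 64, 128]
for number, stage in enumerate(fft8_dit_stages(x), start=1):
    print(f"stage {number}:", [f"{v.real:.2f}{v.imag:+.2f}j" for v in stage])
```

With exact twiddle factors the first two stages match the table exactly. In the third stage the model gives non-integer values for the rows that use 0.707 (for example about 48.64 + j166.07 in row 1), whereas the table lists integers such as 48 + j165; the difference is consistent with limited-precision arithmetic in the manual and hardware calculations.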
Fig. 7 Vivado Result
Fig. 6 RTL diagram of DIT radix-2 FFT
Fig. 5 Adder & Multiplier
IV. RESULTS
Fig. 7 shows the waveform result, and the output of each stage is shown in Table II. The waveform includes the real and imaginary outputs.
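The reported third-stage outputs are integers (for example yr = 48, yi = 165 for sel = 1), while exact arithmetic gives about 48.64 + j166.07 for the same output. The paper does not state how the Verilog design rounds the twiddle products. The Python sketch below shows one hypothetical rule, truncating each partial product toward zero with the twiddle constants rounded to 0.707, that happens to reproduce the tabulated integers. It is an assumption for illustration, not a description of the authors' implementation, and the names trunc_mul and butterfly_trunc are hypothetical.

```python
def trunc_mul(coeff: float, value: int) -> int:
    """Multiply and truncate toward zero (assumed rounding rule)."""
    return int(coeff * value)


def butterfly_trunc(ar, ai, br, bi, wr, wi):
    """One DIT butterfly on integer (real, imag) pairs.

    Computes a + w*b and a - w*b, truncating each of the four
    partial products of the complex multiplication separately.
    """
    p_re = trunc_mul(wr, br) - trunc_mul(wi, bi)  # real part of w*b
    p_im = trunc_mul(wr, bi) + trunc_mul(wi, br)  # imaginary part of w*b
    return (ar + p_re, ai + p_im), (ar - p_re, ai - p_im)


# Stage 3 butterfly with a = -15 + j60, b = -30 + j120, w = 0.707 - j0.707
upper, lower = butterfly_trunc(-15, 60, -30, 120, 0.707, -0.707)
print(upper, lower)  # (48, 165) (-78, -45)
```

Under this assumption the butterfly with w = -0.707 - j0.707 applied to a = -15 - j60 and b = -30 - j120 likewise gives (-78, 45) and (48, -165), matching the reported values for sel = 3 and sel = 7.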
V. CONCLUSION
The FFT has an advantage over the direct DFT because it requires fewer computations. The FFT takes the inputs and produces the output through additions and multiplications, whereas the direct DFT needs O(N^2) additions and multiplications, which becomes costly for a large number of points. Implementing the butterfly algorithm in Vivado reduces the calculations and makes the system more efficient and fast. The FFT algorithm is used for extracting information from ECG and EEG signals. It is also a powerful tool for the new IoMT technology. Future work will reduce the complexity of the multiplications in the FFT to make the system more efficient.
ACKNOWLEDGMENT
This work is funded by HEC Pakistan under the START-
UP RESEARCH GRANT PROGRAM (SRGP) #21-
1465/SRGP/R&D/HEC/2016, and Sukkur IBA University,
Sukkur, Sindh, Pakistan.
REFERENCES
[1] Sheng Zhou, Xiaochun Wang, Jianjun Ji, and Yanqun Wang, "Design and Implementation of a 1024-point High-speed FFT Processor Based on the FPGA", 6th International Congress on Image and Signal Processing (CISP), 2013.
[2] M. Sridhanya and G. Annapurna, "Efficient design of FFT/IFFT processor using Verilog HDL", International Journal of Professional Engineering Studies (IJPRES), July 2015.
[3] Song Yu, Lin Xingye, and Zhai Shuang, "Design and realization of FFT implementation unit based on FPGA", Third International Conference on Multimedia Information Networking and Security, 2011.
[4] Athira Menon M S and Renjith R J, "Implementation of 24 bit high speed floating point vedic multiplier", International Conference on Networks & Advances in Computational Technologies (NetACT), 20-22 July 2017.
[5] Md. Zakir Hussain, Kazi Nikhat Parvin, and Zeba Fatima Mir Ilyas Ali, "Q-Point Constant Multipliers for FFT Processors", International Conference on Signal Processing, Communication, Power and Embedded System (SCOPES), 2016.
[6] Muhammad Ibrahim and Omar Khan, "Performance Analysis of Fast Fourier Transform on Field Programmable Gate Arrays and Graphic Card", 2016.
[7] Archna Rani, Ram Mohan Verma, and Saurabh Jaiswal, "FPGA implementation of Hilbert Transform via Radix-2 Pipelined FFT Processor", 4th ICCCNT, July 4-6, 2013, Tiruchengode, India.
TABLE III. Stagewise Output

| sel | First stage yr | First stage yi | Second stage yr | Second stage yi | Third stage yr | Third stage yi |
|---|---|---|---|---|---|---|
| 0 | 17 | 0 | 85 | 0 | 255 | 0 |
| 1 | -15 | 0 | -15 | 60 | 48 | 165 |
| 2 | 68 | 0 | -51 | 0 | -51 | 102 |
| 3 | -60 | 0 | -15 | -60 | -78 | 45 |
| 4 | 34 | 0 | 170 | 0 | -85 | 0 |
| 5 | -30 | 0 | -30 | 120 | -78 | -45 |
| 6 | 136 | 0 | -102 | 0 | -51 | -102 |
| 7 | -120 | 0 | -30 | -120 | 48 | -165 |
[8] Ravi L.S, Chithra C.P, and Farzana Parveen B.C, "Design and Implementation of Parallel Bit Reversal on FFT by using Verilog HDL", International Journal of Engineering Science and Computing, Vol. 7, Issue No. 8, August 2017.
[9] Anup Tiwari and Samir Pandey, "Implementation of Fast Fourier Transform in Verilog", International Journal of Engineering and Management Research, Vol. 6, Issue 6, November-December 2016.
[10] Miao Hu and John Paul Strachan, "Accelerating Discrete Fourier Transforms with Dot-product Engine", 2016.
[11] M. Rubio, P. Gómez, and K. Drouiche, "A new superfast bit reversal algorithm", International Journal of Adaptive Control and Signal Processing, pp. 703-707, 2002.
[12] Mitul Mehrotra, Geetika Pandey, and Mandeep Singh Narula, "Implementation of FFT algorithm", International Journal of Engineering Sciences & Research Technology, 10 May 2017.
[13] Ch. Vinodh Kumar and K.R.K Satry, "Design and Implementation of FFT Pruning Algorithm on FPGA", 7th International Conference on Cloud Computing, Data Science & Engineering, 2017.
[14] Debalina Ghosh, Depanwita Debnath, and Amlan Chakrabarti, "FPGA Based Implementation of FFT Processor Using Different Architectures", IJAITI, 2012.
[15] Atin Mukherjee, Amitabha Sinha, and Debesh Choudhury, "A Novel Architecture of Area Efficient FFT Algorithm for FPGA Implementation", 25 Feb 2015.
[16] Josue Saenz S., Juan J. Raygoza P., Edwin C. Becerra A., Susana Ortega Cisneros, and Jorge Rivera Dominguez, "FPGA Design and Implementation of Radix-2 Fast Fourier Transform Algorithm with 16 and 32 Points", ROPEC, 2015.
[17] J. G. Proakis et al., Digital Signal Processing, 4th ed. U.S.A.: Prentice-Hall, Inc., 2006.
[18] Aimei Tang, Li Yu, Fangjian Han, and Zhiqiang Zhang, "CORDIC-based FFT Real-time Processing Design and FPGA Implementation", 12th International Colloquium on Signal Processing & its Applications (CSPA2016), 4-6 March 2016, Melaka, Malaysia.
[19] R. Wu and T. Tsao, "The Optimization of Spectrum Analysis for Digital Signals", IEEE Transactions on Power Delivery, 18(2): 398-405, 2003.
[20] Néstor Fernando Hortúa Díaz, Yuli Alexandra Velásquez Pulido, and Dario Amaya Hurtado, "Calculation of The Fft Using The Radix 2 Algorithm In A Fpga Cyclone IV", Journal of Applied Sciences Research, 12(12): 46-53, 2016.