
Performance benchmarks

This page shows the speed of the different transforms performed by the GFFT library.
All the plots in the pages below use MFLOPS (millions of floating-point operations per second) as the performance measure. This is not the actual number of operations performed, but a conventional value computed from the time for one FFT as follows:

MFLOPS = 5 N log2(N) / (time for one FFT in microseconds)
(divided by 2 for real-data FFTs)

This formula is also commonly used for performance benchmarks of other FFT libraries (e.g. FFTW).
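The formula above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name `mflops` and its parameters are hypothetical and are not part of GFFT.

```python
import math


def mflops(n, time_us, real_data=False):
    """Conventional MFLOPS value for one FFT of length n.

    n         -- transform length N (hypothetical parameter name)
    time_us   -- time for one FFT in microseconds
    real_data -- if True, halve the value, as done for real-data FFTs
    """
    if n < 1 or time_us <= 0:
        raise ValueError("n must be >= 1 and time_us must be positive")
    # 5 N log2(N) is the conventional operation count, not the real one.
    value = 5.0 * n * math.log2(n) / time_us
    return value / 2.0 if real_data else value
```

For example, a complex FFT of length 1024 that takes 10 microseconds gives 5 * 1024 * 10 / 10 = 5120 MFLOPS, and the same timing for a real-data FFT gives 2560 MFLOPS.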

The time for one FFT is estimated from multiple runs of the same FFT, measuring the real (wall-clock) time spent on the computation. It can be slightly larger than the CPU time for the same computation because of other running (system) processes, but this is the only way to estimate and compare the performance of multithreaded code correctly.
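A minimal sketch of this kind of measurement is given here. It is not the benchmark code used for GFFT: the helper name `time_per_call_us`, the `runs` parameter, and the stand-in workload are assumptions made for illustration. It averages the wall-clock time over repeated calls and also reports the CPU time of the process for comparison.

```python
import time


def time_per_call_us(func, runs=100):
    """Return (wall, cpu) time per call of func, both in microseconds.

    func and runs are hypothetical names; func stands in for one transform.
    """
    if runs < 1:
        raise ValueError("runs must be >= 1")
    wall_start = time.perf_counter()   # real (wall-clock) time
    cpu_start = time.process_time()    # CPU time of the whole process
    for _ in range(runs):
        func()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    # Average over all runs and convert seconds to microseconds.
    return wall / runs * 1e6, cpu / runs * 1e6


# Stand-in workload; a real benchmark would call the transform here.
wall_us, cpu_us = time_per_call_us(lambda: sum(range(1000)), runs=200)
```

Because `time.process_time` adds up the CPU time of all threads in the process, its value can exceed the elapsed time when the code runs on several cores, which is why the wall-clock figure is the one to compare for multithreaded code.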

We measure and report the real time spent on the complete transform, since GFFT performs all transforms in a single step. Many other libraries report only the transform step, omitting the often expensive initialization or planning steps.

As you will see in the pages below, we intentionally avoid hardware-dependent compiler options to underline the hardware-independent high performance of the GFFT library, which relies on compiler capabilities.

GFFT 0.3

Intel Xeon W3550 3.07 GHz, 24 GB RAM, Linux Fedora 19 (gcc 4.8.2)

GFFT 0.2

Intel Core Duo T2300 1.66 GHz, 1 GB RAM, Linux Suse 11.1 (gcc 4.3.2, Intel C++ 11); Windows XP (MSVC 8)
Intel Core 2 Duo T6600 2.2 GHz, 4 GB RAM, Windows 7 (MSVC 10 Beta 2)
