Prime-factor FFT algorithm

The Prime-factor FFT algorithm (PFA), also called the Good-Thomas algorithm, is a Fast Fourier Transform (FFT) algorithm that re-expresses a the discrete Fourier transform (DFT) of a size n = n₁n₂ as a two-dimensional n₁ by n₂ DFT, but only for the case where n₁ and n₂ are relatively prime. These smaller transforms of size n₁ and n₂ can then be evaluated by applying PFA recursively or by using some other FFT algorithm.

The popular Cooley-Tukey algorithm also subdivides a DFT of size n into smaller transforms of size n₁ and n₂, but it has the disadvantage that it also requires additional multiplications by roots of unity called twiddle factors in addition to the smaller transforms. On the other hand, the PFA has the disadvantages that it only works for relatively prime factors and requires a more complicated re-indexing of the input data based on the Chinese Remainder Theorem (CRT).

(It is interesting to note that Good's 1958 paper was the only prior FFT work cited by Cooley and Tukey in 1965, as they were not then aware of the earlier research by Gauss and others.)

The PFA algorithm is also closely related to the Winograd FFT algorithm, where the latter performs the decomposed n₁ by n₂ via more sophisticated two-dimensional convolution techniques. Some older papers therefore also call Winograd's algorithm a PFA FFT. (Outside of the FFT literature, some people also confusingly refer to the mixed-radix Cooley-Tukey algorithm as a "prime-factor" FFT.)

Algorithm

Recall that the DFT is defined by the formula:

f_{j}={\frac {1}{n}}\sum _{k=0}^{n-1}x_{k}e^{-{\frac {2\pi i}{n}}jk}\qquad j=0,\dots ,n-1.

The PFA involves a re-indexing of the input and output arrays, which when substituted into the DFT formula transforms it into two nested DFTs (a two-dimensional DFT).

Re-indexing

Suppose that n = n₁n₂, where n₁ and n₂ are relatively prime. In this case, we can define a one-to-one re-indexing of the input k and output j by:

k=k_{1}n_{2}+k_{2}n_{1}\mod n,

j=j_{1}n_{2}^{-1}n_{2}+j_{2}n_{1}^{-1}n_{1}\mod n,

where n₁^-1 is the multiplicative inverse modulo n₂ and vice-versa for n₂^-1; the indices j_a and k_a run from 0,...,n_a-1 (for a = 1, 2). These inverses only exist for relatively prime n₁ and n₂, and that condition is also required for the mappings to be one-to-one.

This re-indexing of k is called the Ruritanian mapping, while the re-indexing of j is called the CRT mapping. The latter refers to the fact that j is the solution to the Chinese remainder problem j = j₁ mod n₁ and j = j₂ mod n₂.

(One could instead use the Ruritanian mapping for the output j and the CRT mapping for the input k, or various intermediate cases.)

A great deal of research has been devoted to schemes for evaluating this re-indexing efficiently, ideally in-place, while minimizing the number of costly modulo operations (Chan, 1991, and references).

DFT re-expression

The above re-indexing is then substituted into the formula for the DFT, and in particular into the product jk in the exponent. Because e^2πi = 1, this exponent is evaluated modulo n: any n₁n₂ = n cross term in the jk product can be dropped. (Similarly, f_j and x_k are implicitly periodic in n, so their subscripts are evaluated modulo n.) The remaining-terms give:

f_{j_{1}n_{2}^{-1}n_{2}+j_{2}n_{1}^{-1}n_{1}}={\frac {1}{n_{1}}}\sum _{k_{1}=0}^{n_{1}-1}\left({\frac {1}{n_{2}}}\sum _{k_{2}=0}^{n_{2}-1}x_{k_{1}n_{2}+k_{2}n_{1}}e^{-{\frac {2\pi i}{n_{2}}}j_{2}k_{2}}\right)e^{-{\frac {2\pi i}{n_{1}}}j_{1}k_{1}}.

The inner sum is simply a DFT of size n₂, the outer sum is a DFT of size n₁.

(Here, we have used the fact that the n₁^-1n₁ vanish when evaluated modulo n₂ in the inner sum's exponent, and vice-versa for the outer sum's exponent.)

References:

I. J. Good, "The interaction algorithm and practical Fourier analysis," J. R. Statist. Soc. B 20 (2), 361-372 (1958). Addendum, ibid. 22 (2), 373-375 (1960).
L. H. Thomas, "Using a computer to solve problems in physics," in Applications of Digital Computers (Ginn: Boston, 1963).
P. Duhamel and M. Vetterli, "Fast Fourier transforms: a tutorial review and a state of the art," Signal Processing 19, 259-299 (1990).
S. C. Chan and K. L. Ho, "On indexing the prime-factor fast Fourier transform algorithm," IEEE Trans. Circuits and Systems 38 (8), 951-953 (1991).