The Sparse Fourier Transform. Haitham Hassanieh
Чтение книги онлайн.

Читать онлайн книгу The Sparse Fourier Transform - Haitham Hassanieh страница 12

Название: The Sparse Fourier Transform

Автор: Haitham Hassanieh

Издательство: Ingram

Жанр: Программы

Серия: ACM Books

isbn: 9781947487062

isbn:

СКАЧАТЬ

      Iteration-based algorithms use a filter that has a significant mass outside its pass region [Gilbert et al. 2002, Gilbert et al. 2005a, Mansour 1992]. For example, Gilbert et al. [2002] and Gilbert et al. [2005a] set G to the rectangular filter which was shown in Figure 1.3(a), in which case image is the Dirichlet kernel4, whose tail decays in an inverse linear fashion. Since the tail decays slowly, the Fourier coefficients binned to a particular bucket “leak” into other buckets. On the other hand, Mansour [1992] estimates the convolution in time domain via random sampling, which also leads to a large estimation error. To reduce these errors and obtain the 2/2 guarantee, these algorithms have to perform multiple iterations, where each iteration estimates the largest Fourier coefficient (the one least impacted by leakage) and subtracts its contribution to the time signal. The iterative update of the time signal causes a large increase in runtime. The algorithms in Gilbert et al. [2002] and Mansour [1992] perform this update by going through O(k) iterations, each of which updates at least O(k) time samples, resulting in an O(k2) term in the runtime. The algorithm of Gilbert et al. [2005a] introduced a “bulk sampling” algorithm that amortizes this process but it requires solving instances of a non-uniform Fourier transform, which is expensive in practice.

      Interpolation-based algorithms are less common and limited to the design in Iwen [2010b]. This approach uses the aliasing filter presented in Chapter 1, which is a leakage-free filter that allows [Iwen 2010b] to avoid the need for iteration. Recall that in this case, the filter G has Gi = 1 iff i mod n/p = 0 and Gi = 0 otherwise. The Fourier transform of this filter is a “spike train” with period p and hence this filter does not leak; it is equal to 1 on 1/p fraction of coordinates and is zero elsewhere. Unfortunately, however, such a filter requires that p divides n and the algorithm in Iwen [2010b] needs many different values of p. Since in general one cannot assume that n is divisible by all numbers p, the algorithm treats the signal as a continuous function and interpolates it at the required points. Interpolation introduces additional complexity and increases the exponents in the runtime.

       3.1.3 Our Approach

      The key feature of our algorithm is the use of a different type of filter. In the simplest case, we use a filter obtained by convolving a Gaussian function with a box-car function.5 Because of this new filter, our algorithm does not need to either iterate or interpolate. Specifically, the frequency response of our filter image is nearly flat inside the pass region and has an exponential tail outside it. This means that leakage from frequencies in other buckets is negligible, and hence, our algorithm need not iterate. Also, filtering can be performed using the existing input samples xi, and hence our algorithm need not interpolate the signal at new points. Avoiding both iteration and interpolation is the key feature that makes this algorithm efficient.

      Further, once a large coefficient is isolated in a bucket, one needs to identify its frequency. In contrast to past work which typically uses binary search for this task, we adopt an idea from Porat and Strauss [2010] and tailor it to our problem. Specifically, we simply select the set of “large” bins which are likely to contain large coefficients, and directly estimate all frequencies in those bins. To balance the cost of the bin selection and estimation steps, we make the number of bins somewhat larger than the typical value of O(k). Specifically, we use image, which leads to the stated runtime.6

       3.2 Algorithm

      We refer to our algorithm as SFT 1.0 and it is shown in Algorithm 3.1. A key element of this algorithm is the inner loop, which finds and estimates each “large” coefficient with constant probability. In Section 3.2.1 we describe the inner loop, and in Section 3.2.2 we show how to use it to construct the full algorithm.

      Let B be a parameter that divides n, to be determined later. Let G be a (1/B, 1/(2B), δ, ω) flat window function described in Section 2.2.1 for some δ and image. We will have δ ≈ 1/nc, so one can think of it as negligibly small.

      There are two versions of the inner loop: location loops and estimation loops. Location loops, described as the procedure LOCATIONINNERLOOP in Algorithm 3.1, are given a parameter d, and output a set I ⊂ [n] of dkn/B coordinates that contains each large coefficient with “good” probability. Estimation loops, described as the procedure ESTIMATIONINNERLOOP in Algorithm 3.1, are given a set I ⊂ [n] and estimate x̂I such that each coordinate is estimated well with “good” probability.

      By Claim 2.4, we can compute in image time. Location loops thus take image time and estimation loops take image time. Figure 3.1 illustrates the inner loop.

      For estimation loops, we get the following guarantee.

      Lemma 3.1 Let S be the support of the largest k coefficients of , and −S contain the rest. Then for any ≤ 1,

image

      Proof The proof can be found in Appendix A.1. ■

      Furthermore, since image is a good estimate for |x̂i|—the division is mainly useful for fixing the phase. Therefore in location loops, we get the following guarantee:

      Algorithm 3.1 SFT 1.0: Non-iterative Sparse Fourier Transform for image

image

      Figure 3.1 Example inner loop of the algorithm on sparse input. This run has parameters n = 256, k = 4, G being the (0.11, 0.06, 2 × 10−9, 133) flat window function in Figure 2.1, and selecting the top 4 of B = 16 samples. In part (a), the algorithm begins with time domain access to Pσ, τ, bx given by (Pσ, τ, bx)i = xσ(i−τ)ωσbi, which permutes the spectrum of x by permuting the samples in the time domain. In part (b), the algorithm computes the time domain signal y = G СКАЧАТЬ