Chapter 11: Matrix & Numerical Algorithms - When Math Meets Speed
From MP3s to MRI Scans: The Algorithms That Process Reality
“The Fast Fourier Transform is the most important numerical algorithm of our lifetime.” - Gilbert Strang, MIT
“And yet, most programmers have no idea how it works, even though they use it every day.” - Every signal processing engineer ever
11.1 Introduction: The Algorithm That Changed Everything
Here’s a mind-blowing fact: Every time you listen to music on Spotify, watch a YouTube video, make a phone call, or take a digital photo, the Fast Fourier Transform is working behind the scenes. It’s compressing your audio files, cleaning up your voice on calls, filtering your photos, and analyzing medical images.
The FFT is so fundamental that when it was rediscovered by Cooley and Tukey in 1965 (Gauss had essentially found it around 1805, but the result was forgotten), it sparked a revolution. Suddenly, operations that took hours could be done in seconds. This single algorithm enabled:
- Digital audio: MP3, AAC, and every audio codec
- Image processing: JPEG compression, Instagram filters
- Medical imaging: MRI and CT scans
- Telecommunications: 4G/5G, WiFi, everything wireless
- Scientific computing: Climate modeling, quantum physics
- Finance: High-frequency trading, risk analysis
But the FFT is just the beginning. In this chapter, we’ll explore the algorithms that power our digital world: how to multiply matrices fast enough to train neural networks, how to multiply polynomials in O(n log n) time instead of O(n²), and how to make sure floating-point errors don’t destroy your calculations.
What makes these algorithms special?
Unlike the algorithms we’ve studied so far, matrix and numerical algorithms operate in the continuous world of real numbers. We can’t just count comparisons; we also have to worry about:
- Floating-point errors: tiny rounding errors that accumulate
- Numerical stability: algorithms that don’t explode with errors
- Cache efficiency: accessing memory in the right order matters HUGELY
- Parallelization: matrix operations are embarrassingly parallel
These aren’t just academic concerns. Ask any quant trader whose model lost millions due to numerical instability. Or any engineer whose signal processing pipeline had mysterious glitches from accumulated rounding errors.
Ready to dive into the beautiful world where algorithms meet calculus? Let’s start with the most influential algorithm of the 20th century: the Fast Fourier Transform.
11.2 The Fourier Transform: Seeing Through Time
11.2.1 The Big Idea
Imagine you’re listening to a song. You hear it as sound over time—drums, guitar, vocals flowing together. But a sound engineer hears something different: they hear frequencies. They know the bass is vibrating at 100 Hz, the vocals are around 1000 Hz, and there’s a guitar note at 440 Hz.
The Fourier Transform is the mathematical magic that converts between these two views:
- Time domain: amplitude over time (what you hear)
- Frequency domain: amplitude at each frequency (what the engineer sees)
```
Time Domain:                 Frequency Domain:
[audio waveform]      ⟺      [frequency spectrum]
     "when"                      "what notes"
```
Why is this useful?
Once you’re in the frequency domain, you can:
- Remove noise: filter out specific frequencies
- Compress: throw away inaudible frequencies (MP3!)
- Analyze: find the dominant frequencies
- Modify: auto-tune, equalizer effects, etc.
11.2.2 The Mathematical Foundation
For a signal with N samples x₀, x₁, …, x_{N-1}, the Discrete Fourier Transform (DFT) is:
```
X_k = Σ(n=0 to N-1) x_n · e^(-2πikn/N)
```

where:
- X_k = amplitude of frequency k
- x_n = sample n in the time domain
- e^(-2πikn/N) = complex rotation (Euler's formula)
- k ranges from 0 to N-1
In plain English: to find the amplitude at frequency k, we multiply each time sample by a complex exponential at that frequency and sum everything up.
The complex exponential e^(iθ) = cos(θ) + i*sin(θ) might look scary, but it’s just a rotation in the complex plane. Think of it as asking “how much of frequency k is present in the signal?”
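To make that rotation concrete, here is a tiny sketch of our own (not part of the chapter's later code) that evaluates the DFT sum directly for a pure tone completing exactly 3 cycles in 8 samples. Only bin k = 3 and its mirror bin k = N − 3 = 5 come out nonzero, each with magnitude N/2 = 4:

```python
import numpy as np

# 8 samples of a pure cosine that completes exactly 3 cycles
N = 8
n = np.arange(N)
x = np.cos(2 * np.pi * 3 * n / N)

# "How much of frequency k is present?" -- evaluate the DFT sum for each k
X = [np.sum(x * np.exp(-2j * np.pi * k * n / N)) for k in range(N)]
magnitudes = [round(abs(Xk), 6) for Xk in X]
print(magnitudes)  # [0.0, 0.0, 0.0, 4.0, 0.0, 4.0, 0.0, 0.0]
```

The symmetric peak at k = 5 appears because the signal is real-valued, a fact we will exploit again when we meet `rfft` later in the chapter.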
11.2.3 The Naive DFT: O(N²)
The straightforward implementation computes each X_k independently:
```python
import numpy as np

def dft_naive(x):
    """
    Naive Discrete Fourier Transform.
    Time complexity: O(N²)
    """
    N = len(x)
    X = np.zeros(N, dtype=complex)
    for k in range(N):
        for n in range(N):
            angle = -2 * np.pi * k * n / N
            X[k] += x[n] * np.exp(1j * angle)
    return X
```

The problem: For N = 1,000,000 samples (about 22 seconds of audio at 44.1 kHz), this requires 1 trillion complex multiplications. That’s slow!
Can we do better?
11.3 The Fast Fourier Transform: A Divide-and-Conquer Miracle
11.3.1 The Key Insight
The FFT (Cooley-Tukey algorithm, 1965) is based on a brilliant observation: the DFT has symmetry we can exploit.
Main idea: Split the DFT into odd and even indices:
```
X_k = Σ(even n) x_n · e^(-2πikn/N) + Σ(odd n) x_n · e^(-2πikn/N)
```

Let n = 2m for the even terms and n = 2m+1 for the odd terms:

```
X_k = Σ(m=0 to N/2-1) x_2m · e^(-2πik(2m)/N)
    + Σ(m=0 to N/2-1) x_(2m+1) · e^(-2πik(2m+1)/N)

    = Σ(m=0 to N/2-1) x_2m · e^(-2πikm/(N/2))
    + e^(-2πik/N) · Σ(m=0 to N/2-1) x_(2m+1) · e^(-2πikm/(N/2))

    = E_k + e^(-2πik/N) · O_k
```
Where:
- E_k = DFT of the even-indexed samples
- O_k = DFT of the odd-indexed samples
This is huge! We’ve reduced a size-N DFT to two size-N/2 DFTs!
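We can check the identity numerically with NumPy (a quick sketch of our own, using `np.fft` rather than the chapter's implementation): split a random signal into even and odd samples, take the two half-size DFTs, and recombine them with the twiddle factors.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 16
x = rng.standard_normal(N)

E = np.fft.fft(x[0::2])             # half-size DFT of even-indexed samples
O = np.fft.fft(x[1::2])             # half-size DFT of odd-indexed samples
k = np.arange(N // 2)
twiddle = np.exp(-2j * np.pi * k / N)

# First half:  X_k       = E_k + w^k · O_k
# Second half: X_{k+N/2} = E_k - w^k · O_k   (since w^{k+N/2} = -w^k)
X = np.concatenate([E + twiddle * O, E - twiddle * O])
assert np.allclose(X, np.fft.fft(x))
print("identity verified")
```

The second half of the spectrum comes for free from the same E and O, which is exactly the symmetry the recursive combine step below exploits.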
11.3.2 The Recursive Algorithm
Keep dividing until you reach size 1 (base case: DFT of one element is itself).
Time complexity analysis:
```
T(N) = 2T(N/2) + O(N)    (two recursive calls + combining)
     = O(N log N)        (by the Master Theorem)
```
The speedup for N = 1,000,000:
- Naive: 1,000,000² = 1,000,000,000,000 operations
- FFT: 1,000,000 × log₂(1,000,000) ≈ 20,000,000 operations
- Speedup: 50,000x faster!
This is why FFT changed everything. What took hours now takes milliseconds.
11.3.3 Implementation: Recursive FFT
```python
import numpy as np

def fft_recursive(x):
    """
    Recursive Fast Fourier Transform.
    Time:  O(N log N)
    Space: O(N log N) due to recursion
    Args:
        x: Input array (length must be a power of 2)
    Returns:
        FFT of x
    """
    x = np.asarray(x, dtype=complex)
    N = len(x)
    # Base case: the DFT of a single sample is the sample itself
    if N <= 1:
        return x
    # Divide
    even = fft_recursive(x[0::2])  # Even indices
    odd = fft_recursive(x[1::2])   # Odd indices
    # Twiddle factors
    T = np.exp(-2j * np.pi * np.arange(N) / N)
    # Combine (using the symmetry T_{k+N/2} = -T_k, so
    # X_{k+N/2} = E_k - T_k * O_k = E_k + T_{k+N/2} * O_k)
    return np.concatenate([
        even + T[:N//2] * odd,
        even + T[N//2:] * odd
    ])

def ifft_recursive(X):
    """
    Inverse FFT: convert frequency domain back to time domain.
    Trick: IFFT(X) = conj(FFT(conj(X))) / N
    """
    N = len(X)
    return np.conj(fft_recursive(np.conj(X))) / N
```

Let’s trace through a small example:
```python
def fft_trace_example():
    """Trace FFT execution on a small signal."""
    x = np.array([1, 2, 3, 4])
    print("Input signal: ", x)
    print("\nRecursive breakdown:")
    print("  Level 1: [1,2,3,4]")
    print("              ↓")
    print("  Level 2: [1,3] (even)   [2,4] (odd)")
    print("              ↓               ↓")
    print("  Level 3: [1] [3]        [2] [4]")
    print("\nCombining back up:")
    # Manual calculation
    even = fft_recursive([1, 3])
    odd = fft_recursive([2, 4])
    print(f"  Even DFT: {even}")
    print(f"  Odd DFT:  {odd}")
    result = fft_recursive(x)
    print(f"\nFinal FFT: {result}")
    # Verify with numpy
    numpy_result = np.fft.fft(x)
    print(f"NumPy FFT: {numpy_result}")
    print(f"Match: {np.allclose(result, numpy_result)}")
```

11.3.4 Iterative FFT (Cooley-Tukey)
The recursive version is elegant but uses O(N log N) space. The iterative version uses O(N) space and is faster in practice:
```python
def fft_iterative(x):
    """
    Iterative FFT using bit-reversal.
    Time:  O(N log N)
    Space: O(N)
    This is the version used in production!
    """
    x = np.asarray(x, dtype=complex).copy()
    N = len(x)
    # Check that N is a power of 2
    if N & (N - 1) != 0:
        raise ValueError("N must be a power of 2")
    # Bit-reversal permutation
    j = 0
    for i in range(1, N):
        bit = N >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j ^= bit
        if i < j:
            x[i], x[j] = x[j], x[i]
    # FFT computation: butterflies of increasing length
    length = 2
    while length <= N:
        angle = -2 * np.pi / length
        wlen = np.exp(1j * angle)
        for i in range(0, N, length):
            w = 1
            for j in range(length // 2):
                u = x[i + j]
                v = x[i + j + length // 2] * w
                x[i + j] = u + v
                x[i + j + length // 2] = u - v
                w *= wlen
        length *= 2
    return x

def ifft_iterative(X):
    """Inverse FFT (iterative)."""
    N = len(X)
    # Conjugate, FFT, conjugate, scale
    return np.conj(fft_iterative(np.conj(X))) / N
```

The bit-reversal trick: The iterative FFT processes samples in “bit-reversed” order. For N=8:
```
Original: 0   1   2   3   4   5   6   7
Binary:   000 001 010 011 100 101 110 111
Reversed: 000 100 010 110 001 101 011 111
Result:   0   4   2   6   1   5   3   7
```
This reordering allows us to process the FFT in-place, bottom-up, instead of top-down recursively.
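The permutation in the table can be generated in a couple of lines. This helper (our own illustration, not part of the chapter's FFT code, which computes the permutation incrementally instead) reverses the binary representation of each index:

```python
def bit_reversed_order(N):
    """Return the bit-reversal permutation of 0..N-1 (N must be a power of 2)."""
    bits = N.bit_length() - 1                 # number of bits per index
    # Format each index as a fixed-width binary string, reverse it, re-parse
    return [int(format(i, f'0{bits}b')[::-1], 2) for i in range(N)]

print(bit_reversed_order(8))  # [0, 4, 2, 6, 1, 5, 3, 7]
```

This matches the "Result" row above; `fft_iterative` produces the same ordering in place with O(1) extra work per index.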
11.3.5 Complete FFT Implementation
```python
class FFT:
    """
    Comprehensive FFT implementation with utilities.
    """

    @staticmethod
    def fft(x, method='iterative'):
        """
        Compute FFT of signal x.
        Args:
            x: Input signal (1D array)
            method: 'recursive' or 'iterative'
        Returns:
            Frequency domain representation
        """
        x = np.asarray(x, dtype=complex)
        # Pad to the next power of 2
        N = len(x)
        N_padded = 1 << (N - 1).bit_length()
        if N_padded != N:
            x = np.pad(x, (0, N_padded - N), mode='constant')
        if method == 'recursive':
            return fft_recursive(x)
        else:
            return fft_iterative(x)

    @staticmethod
    def ifft(X, method='iterative'):
        """Inverse FFT."""
        X = np.asarray(X, dtype=complex)
        if method == 'recursive':
            return ifft_recursive(X)
        else:
            return ifft_iterative(X)

    @staticmethod
    def rfft(x):
        """
        Real FFT: optimized FFT for real-valued signals.
        Takes advantage of the fact that the FFT of a real signal
        has Hermitian symmetry: X[k] = conj(X[N-k]).
        Returns only the positive frequencies (N/2 + 1 values).
        """
        X = FFT.fft(x)
        return X[:len(X)//2 + 1]

    @staticmethod
    def fftfreq(n, d=1.0):
        """
        Return frequency bins for FFT output.
        Args:
            n: Number of samples
            d: Sample spacing (1/sample_rate)
        Returns:
            Array of frequency values
        """
        val = 1.0 / (n * d)
        results = np.arange(0, n, dtype=int)
        N = (n - 1) // 2 + 1
        results[:N] = np.arange(0, N, dtype=int)
        results[N:] = np.arange(-(n // 2), 0, dtype=int)
        return results * val

    @staticmethod
    def fftshift(x):
        """
        Shift the zero-frequency component to the center.
        Useful for visualization.
        """
        n = len(x)
        p2 = (n + 1) // 2
        return np.concatenate((x[p2:], x[:p2]))

    @staticmethod
    def power_spectrum(x):
        """
        Compute the power spectrum (magnitude squared).
        Shows the energy at each frequency.
        """
        X = FFT.fft(x)
        return np.abs(X) ** 2

    @staticmethod
    def spectrogram(signal, window_size=256, hop_size=128):
        """
        Compute a spectrogram (FFT over time).
        Args:
            signal: Time-domain signal
            window_size: Size of FFT window (power of 2)
            hop_size: Distance between windows
        Returns:
            2D array: frequencies × time
        """
        n_windows = (len(signal) - window_size) // hop_size + 1
        specgram = np.zeros((window_size // 2 + 1, n_windows))
        # Hamming window to reduce spectral leakage
        window = np.hamming(window_size)
        for i in range(n_windows):
            start = i * hop_size
            segment = signal[start:start + window_size] * window
            spectrum = FFT.rfft(segment)
            specgram[:, i] = np.abs(spectrum)
        return specgram
```
```python
# Example usage and visualization
def example_fft_basics():
    """Demonstrate basic FFT usage."""
    import matplotlib.pyplot as plt

    # Create a signal: combination of two sine waves
    sample_rate = 1000  # Hz
    duration = 1.0      # seconds
    t = np.linspace(0, duration, int(sample_rate * duration), endpoint=False)
    # Signal: 50 Hz + 120 Hz
    signal = np.sin(2 * np.pi * 50 * t) + 0.5 * np.sin(2 * np.pi * 120 * t)
    # Add some noise
    signal += 0.1 * np.random.randn(len(t))

    # Compute FFT (FFT.fft pads to the next power of 2, so the
    # frequency bins must come from the padded length)
    spectrum = FFT.fft(signal)
    freqs = FFT.fftfreq(len(spectrum), 1/sample_rate)

    # Plot
    fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(12, 8))

    # Time domain
    ax1.plot(t[:200], signal[:200])
    ax1.set_xlabel('Time (s)')
    ax1.set_ylabel('Amplitude')
    ax1.set_title('Time Domain Signal')
    ax1.grid(True)

    # Frequency domain (positive frequencies only)
    positive_freqs = freqs[:len(freqs)//2]
    positive_spectrum = np.abs(spectrum[:len(spectrum)//2])
    ax2.plot(positive_freqs, positive_spectrum)
    ax2.set_xlabel('Frequency (Hz)')
    ax2.set_ylabel('Magnitude')
    ax2.set_title('Frequency Domain (FFT)')
    ax2.set_xlim([0, 200])
    ax2.grid(True)
    # Mark the two peaks
    ax2.axvline(x=50, color='r', linestyle='--', label='50 Hz')
    ax2.axvline(x=120, color='g', linestyle='--', label='120 Hz')
    ax2.legend()

    plt.tight_layout()
    plt.savefig('fft_example.png', dpi=150)
    plt.show()
    print("Signal contains frequencies at 50 Hz and 120 Hz")
    print("FFT clearly shows these peaks!")

def example_audio_filtering():
    """Demonstrate audio filtering with FFT."""
    # Create noisy audio
    sample_rate = 8000
    duration = 2.0
    t = np.linspace(0, duration, int(sample_rate * duration), endpoint=False)
    # Clean signal: musical note at 440 Hz (A4)
    clean = np.sin(2 * np.pi * 440 * t)
    # Add high-frequency noise
    noise = 0.3 * np.sin(2 * np.pi * 3000 * t)
    noisy = clean + noise

    # Filter using FFT (frequency bins must match the padded spectrum length)
    spectrum = FFT.fft(noisy)
    freqs = FFT.fftfreq(len(spectrum), 1/sample_rate)

    # Low-pass filter: remove frequencies above 1000 Hz
    cutoff = 1000
    spectrum[np.abs(freqs) > cutoff] = 0

    # Convert back to the time domain (truncate the padding)
    filtered = np.real(FFT.ifft(spectrum))[:len(noisy)]

    print(f"Original signal power: {np.sum(noisy**2):.2f}")
    print(f"Filtered signal power: {np.sum(filtered**2):.2f}")
    print(f"Noise removed: {(1 - np.sum(filtered**2)/np.sum(noisy**2))*100:.1f}%")
    return clean, noisy, filtered

def benchmark_fft():
    """Benchmark FFT vs naive DFT."""
    import time
    sizes = [64, 128, 256, 512, 1024, 2048]
    print("FFT Performance Benchmark")
    print("=" * 50)
    print(f"{'Size':>6} {'Naive (ms)':>12} {'FFT (ms)':>12} {'Speedup':>10}")
    print("-" * 50)
    for N in sizes:
        x = np.random.randn(N)
        if N <= 512:  # Naive is too slow for larger sizes
            start = time.time()
            _ = dft_naive(x)
            naive_time = (time.time() - start) * 1000
        else:
            naive_time = float('inf')
        start = time.time()
        _ = FFT.fft(x)
        fft_time = (time.time() - start) * 1000
        speedup = naive_time / fft_time if naive_time != float('inf') else float('inf')
        print(f"{N:6d} {naive_time:12.2f} {fft_time:12.3f} {speedup:10.1f}x")

if __name__ == "__main__":
    print("=== FFT Basics ===")
    example_fft_basics()
    print("\n=== Audio Filtering ===")
    example_audio_filtering()
    print("\n=== Performance Benchmark ===")
    benchmark_fft()
```

11.4 Polynomial Multiplication: FFT’s Secret Superpower
11.4.1 The Problem
You have two polynomials:
```
A(x) = a₀ + a₁x + a₂x² + ... + aₙxⁿ
B(x) = b₀ + b₁x + b₂x² + ... + bₘxᵐ
```
Goal: Compute C(x) = A(x) × B(x)
Naive approach: Multiply every term in A by every term in B.
```python
def multiply_polynomials_naive(A, B):
    """
    Naive polynomial multiplication.
    Time: O(n²)
    """
    n = len(A)
    m = len(B)
    result = [0] * (n + m - 1)
    for i in range(n):
        for j in range(m):
            result[i + j] += A[i] * B[j]
    return result
```

Example:
```
A(x) = 1 + 2x + 3x²   →  [1, 2, 3]
B(x) = 4 + 5x + 6x²   →  [4, 5, 6]

C(x) = A(x) × B(x)
     = 4 + 13x + 28x² + 27x³ + 18x⁴
     →  [4, 13, 28, 27, 18]
```
11.4.2 The FFT Trick
Here’s the key insight: polynomial multiplication in coefficient form is convolution, and convolution in time domain is multiplication in frequency domain!
```
A(x), B(x) in coefficient form
        ↓ FFT
A(ω), B(ω) in frequency domain  →  pointwise multiply
        ↓ IFFT
C(x) back in coefficient form
```
Why does this work?
Think of polynomials as signals. The coefficients are like time-domain samples. When we multiply polynomials, we’re convolving their coefficients. By the convolution theorem:
```
FFT(A ⊛ B) = FFT(A) · FFT(B)
```

where ⊛ is convolution of the coefficient arrays and · is pointwise multiplication.
Time complexity:
- Naive: O(n²)
- FFT method: O(n log n)
For large polynomials, this is HUGE!
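The convolution theorem is easy to verify numerically. This short sketch (ours, using NumPy's built-in FFT rather than the chapter's implementation below) multiplies the two example polynomials both ways:

```python
import numpy as np

A = np.array([1.0, 2.0, 3.0])   # 1 + 2x + 3x²
B = np.array([4.0, 5.0, 6.0])   # 4 + 5x + 6x²

# Direct convolution of the coefficient arrays
direct = np.convolve(A, B)

# Same product via the convolution theorem: pad, FFT, multiply, IFFT
n = len(A) + len(B) - 1
fa = np.fft.fft(A, n)           # second argument zero-pads to length n
fb = np.fft.fft(B, n)
via_fft = np.fft.ifft(fa * fb).real

assert np.allclose(direct, via_fft)
print(direct)  # [ 4. 13. 28. 27. 18.]
```

Both routes give the coefficients of 4 + 13x + 28x² + 27x³ + 18x⁴, matching the worked example above.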
11.4.3 Implementation
```python
def multiply_polynomials_fft(A, B):
    """
    Fast polynomial multiplication using FFT.
    Time: O(n log n)
    Args:
        A, B: Polynomial coefficients (lowest degree first)
    Returns:
        Coefficients of A(x) × B(x)
    """
    # The result has degree deg(A) + deg(B)
    result_size = len(A) + len(B) - 1
    # Pad to the next power of 2
    n = 1 << (result_size - 1).bit_length()
    A_padded = np.pad(A, (0, n - len(A)), mode='constant')
    B_padded = np.pad(B, (0, n - len(B)), mode='constant')
    # Transform to the frequency domain
    A_freq = FFT.fft(A_padded)
    B_freq = FFT.fft(B_padded)
    # Pointwise multiplication
    C_freq = A_freq * B_freq
    # Transform back
    C = FFT.ifft(C_freq)
    # Extract the result (real part, truncated)
    return np.real(C[:result_size])
```
```python
# Example and verification
def example_polynomial_multiplication():
    """Demonstrate fast polynomial multiplication."""
    # A(x) = 1 + 2x + 3x²
    A = np.array([1, 2, 3])
    # B(x) = 4 + 5x + 6x²
    B = np.array([4, 5, 6])

    # Naive method
    result_naive = multiply_polynomials_naive(A, B)
    # FFT method
    result_fft = multiply_polynomials_fft(A, B)

    print("A(x) = 1 + 2x + 3x²")
    print("B(x) = 4 + 5x + 6x²")
    print()
    print(f"Naive result: {result_naive}")
    print(f"FFT result:   {np.round(result_fft).astype(int)}")
    print()
    print("C(x) = 4 + 13x + 28x² + 27x³ + 18x⁴")

    # Verify they match
    assert np.allclose(result_naive, result_fft)
    print("✓ Results match!")

def benchmark_polynomial_multiplication():
    """Benchmark polynomial multiplication methods."""
    import time
    sizes = [10, 50, 100, 500, 1000, 5000]
    print("\nPolynomial Multiplication Benchmark")
    print("=" * 60)
    print(f"{'Degree':>8} {'Naive (ms)':>14} {'FFT (ms)':>12} {'Speedup':>10}")
    print("-" * 60)
    for n in sizes:
        A = np.random.randn(n)
        B = np.random.randn(n)
        if n <= 1000:
            start = time.time()
            _ = multiply_polynomials_naive(A, B)
            naive_time = (time.time() - start) * 1000
        else:
            naive_time = float('inf')
        start = time.time()
        _ = multiply_polynomials_fft(A, B)
        fft_time = (time.time() - start) * 1000
        speedup = naive_time / fft_time if naive_time != float('inf') else float('inf')
        print(f"{n:8d} {naive_time:14.2f} {fft_time:12.3f} {speedup:10.1f}x")

if __name__ == "__main__":
    example_polynomial_multiplication()
    benchmark_polynomial_multiplication()
```

11.4.4 Application: Big Integer Multiplication
You can use FFT to multiply huge integers in O(n log n) time!
```python
def multiply_large_integers(a, b, base=10):
    """
    Multiply large integers using FFT.
    Args:
        a, b: Integers as strings
        base: Number base (10 for decimal)
    Returns:
        Product as a string
    """
    # Convert to digit arrays (least significant digit first)
    a_digits = [int(d) for d in reversed(a)]
    b_digits = [int(d) for d in reversed(b)]
    # Multiply as polynomials
    result = multiply_polynomials_fft(a_digits, b_digits)
    # Handle carries
    result = np.round(result).astype(int)
    carry = 0
    for i in range(len(result)):
        result[i] += carry
        carry = result[i] // base
        result[i] %= base
    while carry:
        result = np.append(result, carry % base)
        carry //= base
    # Convert back to a string
    return ''.join(str(d) for d in reversed(result)).lstrip('0') or '0'

# Example
def example_big_integer_multiplication():
    """Multiply huge numbers using FFT."""
    a = "12345678901234567890"
    b = "98765432109876543210"
    result = multiply_large_integers(a, b)
    print(f"a = {a}")
    print(f"b = {b}")
    print(f"a × b = {result}")
    # Verify with Python's built-in arbitrary-precision arithmetic
    expected = str(int(a) * int(b))
    assert result == expected
    print("✓ Correct!")

if __name__ == "__main__":
    example_big_integer_multiplication()
```

11.5 Matrix Operations: The Foundation of Modern Computing
11.5.1 Why Matrices Matter
Matrices are EVERYWHERE in modern computing:
- Machine learning: neural networks are just matrix multiplications
- Computer graphics: every 3D transformation is a matrix
- Scientific computing: solving systems of equations
- Recommendation systems: collaborative filtering
- Image processing: convolution, filtering
- Quantum computing: quantum gates are matrices
The speed of matrix operations directly determines:
- How fast you can train neural networks
- How realistic video games look
- How quickly simulations run
- Whether real-time applications are possible
11.5.2 Matrix Multiplication: The Naive Way
Given: A (m×n) and B (n×p)
Compute: C = AB (m×p)
```python
def matmul_naive(A, B):
    """
    Naive matrix multiplication.
    Time: O(mnp)
    """
    m, n = A.shape
    n2, p = B.shape
    assert n == n2, "Incompatible dimensions"
    C = np.zeros((m, p))
    for i in range(m):
        for j in range(p):
            for k in range(n):
                C[i, j] += A[i, k] * B[k, j]
    return C
```

For square matrices (n×n), this is O(n³).
Can we do better?
Yes! But it’s complicated…
11.5.3 Strassen’s Algorithm: O(n^2.807)
Volker Strassen (1969) discovered that you can multiply 2×2 matrices with only 7 multiplications instead of 8:
Normal method (8 multiplications):
```
[a b]   [e f]   [ae+bg  af+bh]
[c d] × [g h] = [ce+dg  cf+dh]
```
Strassen’s method (7 multiplications):
```
M₁ = (a + d)(e + h)
M₂ = (c + d)e
M₃ = a(f - h)
M₄ = d(g - e)
M₅ = (a + b)h
M₆ = (c - a)(e + f)
M₇ = (b - d)(g + h)
```

Then:

```
C₁₁ = M₁ + M₄ - M₅ + M₇
C₁₂ = M₃ + M₅
C₂₁ = M₂ + M₄
C₂₂ = M₁ - M₂ + M₃ + M₆
```
By recursively applying this, you get O(n^log₂(7)) ≈ O(n^2.807).
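As a quick sanity check of the seven-product scheme, here is a throwaway snippet of ours (scalars only, independent of the recursive implementation) that plugs in a concrete 2×2 example:

```python
a, b, c, d = 1, 2, 3, 4   # left matrix  [[a, b], [c, d]]
e, f, g, h = 5, 6, 7, 8   # right matrix [[e, f], [g, h]]

# Strassen's seven products
M1 = (a + d) * (e + h)
M2 = (c + d) * e
M3 = a * (f - h)
M4 = d * (g - e)
M5 = (a + b) * h
M6 = (c - a) * (e + f)
M7 = (b - d) * (g + h)

# Recombine into the four entries of the product
C11 = M1 + M4 - M5 + M7
C12 = M3 + M5
C21 = M2 + M4
C22 = M1 - M2 + M3 + M6

# Compare against the textbook 2×2 product (8 multiplications)
assert (C11, C12, C21, C22) == (a*e + b*g, a*f + b*h, c*e + d*g, c*f + d*h)
print(C11, C12, C21, C22)  # 19 22 43 50
```

Seven multiplications at the cost of extra additions; the additions are only O(n²) per level, which is why the recursion still wins asymptotically.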
```python
def strassen_matmul(A, B):
    """
    Strassen's matrix multiplication algorithm.
    Time: O(n^2.807)
    Faster than naive for large matrices, but overhead
    makes it slower for small ones.
    """
    n = len(A)
    # Base case: use naive multiplication for small matrices
    if n <= 64:
        return matmul_naive(A, B)
    # Remember the original size, then pad to an even size if needed
    orig_n = n
    if n % 2 == 1:
        A = np.pad(A, ((0, 1), (0, 1)), mode='constant')
        B = np.pad(B, ((0, 1), (0, 1)), mode='constant')
        n += 1
    # Divide matrices into quadrants
    mid = n // 2
    A11, A12 = A[:mid, :mid], A[:mid, mid:]
    A21, A22 = A[mid:, :mid], A[mid:, mid:]
    B11, B12 = B[:mid, :mid], B[:mid, mid:]
    B21, B22 = B[mid:, :mid], B[mid:, mid:]
    # Compute the 7 products
    M1 = strassen_matmul(A11 + A22, B11 + B22)
    M2 = strassen_matmul(A21 + A22, B11)
    M3 = strassen_matmul(A11, B12 - B22)
    M4 = strassen_matmul(A22, B21 - B11)
    M5 = strassen_matmul(A11 + A12, B22)
    M6 = strassen_matmul(A21 - A11, B11 + B12)
    M7 = strassen_matmul(A12 - A22, B21 + B22)
    # Combine
    C11 = M1 + M4 - M5 + M7
    C12 = M3 + M5
    C21 = M2 + M4
    C22 = M1 - M2 + M3 + M6
    # Assemble the result and trim any padding
    C = np.vstack([
        np.hstack([C11, C12]),
        np.hstack([C21, C22])
    ])
    return C[:orig_n, :orig_n]
```

In practice, Strassen’s algorithm is rarely used because:
- ❌ Constant factors are large
- ❌ It is more numerically unstable
- ❌ It is less cache-friendly
- ❌ It is complicated to implement
Modern matrix libraries use blocked algorithms instead (see below).
11.5.4 Cache-Efficient Matrix Multiplication
The real bottleneck in matrix multiplication isn’t the number of operations—it’s memory access!
Modern CPUs are FAST, but memory is SLOW. The key is using the cache efficiently.
Blocked (Tiled) Matrix Multiplication:
```python
def matmul_blocked(A, B, block_size=64):
    """
    Blocked matrix multiplication for better cache usage.
    Time: still O(n³), but with better constants!
    The block size should fit in L1 cache.
    """
    m, n = A.shape
    n2, p = B.shape
    assert n == n2
    C = np.zeros((m, p))
    # Iterate in blocks
    for i0 in range(0, m, block_size):
        for j0 in range(0, p, block_size):
            for k0 in range(0, n, block_size):
                # Define block boundaries
                i_end = min(i0 + block_size, m)
                j_end = min(j0 + block_size, p)
                k_end = min(k0 + block_size, n)
                # Multiply blocks
                for i in range(i0, i_end):
                    for j in range(j0, j_end):
                        for k in range(k0, k_end):
                            C[i, j] += A[i, k] * B[k, j]
    return C
```

Why is this faster?
When blocks fit in cache, we get many more cache hits. For a 1000×1000 matrix, roughly:
- Naive: ~1 billion cache misses
- Blocked: ~15 million cache misses
- Speedup: 5-10x just from better cache usage!
11.5.5 Matrix Operations Library
```python
class MatrixOps:
    """
    Efficient matrix operations with multiple implementations.
    """

    @staticmethod
    def multiply(A, B, method='blocked'):
        """
        Matrix multiplication with a choice of algorithm.
        Args:
            A, B: Input matrices
            method: 'naive', 'blocked', 'strassen', or 'numpy'
        """
        if method == 'naive':
            return matmul_naive(A, B)
        elif method == 'blocked':
            return matmul_blocked(A, B)
        elif method == 'strassen':
            return strassen_matmul(A, B)
        else:  # numpy (uses BLAS)
            return np.dot(A, B)

    @staticmethod
    def transpose(A):
        """Transpose matrix (swap rows and columns)."""
        return A.T

    @staticmethod
    def power(A, n):
        """
        Compute A^n efficiently using binary exponentiation.
        Time: O(log n) matrix multiplications
        """
        if n == 0:
            return np.eye(len(A))
        if n == 1:
            return A.copy()
        if n % 2 == 0:
            half = MatrixOps.power(A, n // 2)
            return MatrixOps.multiply(half, half)
        else:
            return MatrixOps.multiply(A, MatrixOps.power(A, n - 1))

    @staticmethod
    def solve_triangular(A, b, lower=True):
        """
        Solve the triangular system Ax = b.
        Time: O(n²)
        Args:
            A: Triangular matrix
            b: Right-hand side
            lower: True if A is lower triangular
        """
        n = len(b)
        x = np.zeros(n)
        if lower:
            # Forward substitution
            for i in range(n):
                x[i] = b[i]
                for j in range(i):
                    x[i] -= A[i, j] * x[j]
                x[i] /= A[i, i]
        else:
            # Backward substitution
            for i in range(n - 1, -1, -1):
                x[i] = b[i]
                for j in range(i + 1, n):
                    x[i] -= A[i, j] * x[j]
                x[i] /= A[i, i]
        return x

    @staticmethod
    def lu_decomposition(A):
        """
        LU decomposition: A = LU
        L is lower triangular, U is upper triangular.
        Note: no pivoting, so the pivots U[k, k] must be nonzero;
        production code uses partial pivoting (PA = LU) for stability.
        Time: O(n³)
        """
        n = len(A)
        L = np.eye(n)
        U = A.astype(float)  # float copy so the in-place updates work
        for k in range(n - 1):
            for i in range(k + 1, n):
                L[i, k] = U[i, k] / U[k, k]
                U[i, k:] -= L[i, k] * U[k, k:]
        return L, U

    @staticmethod
    def solve_linear_system(A, b):
        """
        Solve Ax = b using LU decomposition.
        Time: O(n³)
        """
        L, U = MatrixOps.lu_decomposition(A)
        # Solve Ly = b
        y = MatrixOps.solve_triangular(L, b, lower=True)
        # Solve Ux = y
        x = MatrixOps.solve_triangular(U, y, lower=False)
        return x

    @staticmethod
    def determinant(A):
        """
        Compute the determinant using LU decomposition.
        Time: O(n³)
        """
        L, U = MatrixOps.lu_decomposition(A)
        # det(A) = det(L) * det(U) = 1 * product of U's diagonal
        return np.prod(np.diag(U))

    @staticmethod
    def inverse(A):
        """
        Compute the matrix inverse using LU decomposition.
        Time: O(n³)
        """
        n = len(A)
        A_inv = np.zeros((n, n))
        # Solve Ax = e_i for each column of the identity
        for i in range(n):
            e_i = np.zeros(n)
            e_i[i] = 1
            A_inv[:, i] = MatrixOps.solve_linear_system(A, e_i)
        return A_inv
```
```python
# Examples
def example_matrix_operations():
    """Demonstrate various matrix operations."""
    print("=== Matrix Operations ===\n")
    # Create test matrices
    A = np.array([[2, 1], [5, 7]])
    B = np.array([[3, 8], [4, 6]])
    print("Matrix A:")
    print(A)
    print("\nMatrix B:")
    print(B)

    # Multiplication
    C = MatrixOps.multiply(A, B)
    print("\nA × B:")
    print(C)

    # Power
    A_cubed = MatrixOps.power(A, 3)
    print("\nA³:")
    print(A_cubed)

    # Determinant
    det_A = MatrixOps.determinant(A)
    print(f"\ndet(A) = {det_A}")

    # Inverse
    A_inv = MatrixOps.inverse(A)
    print("\nA⁻¹:")
    print(A_inv)

    # Verify: A × A⁻¹ = I
    identity = MatrixOps.multiply(A, A_inv)
    print("\nA × A⁻¹ (should be identity):")
    print(np.round(identity, 10))

    # Solve linear system
    b = np.array([1, 2])
    x = MatrixOps.solve_linear_system(A, b)
    print(f"\nSolve Ax = b where b = {b}")
    print(f"Solution x = {x}")
    print(f"Verification Ax = {np.dot(A, x)}")

def benchmark_matrix_multiplication():
    """Benchmark different matrix multiplication methods."""
    import time
    sizes = [64, 128, 256, 512, 1024]
    print("\n=== Matrix Multiplication Benchmark ===\n")
    print(f"{'Size':>6} {'Naive':>10} {'Blocked':>10} {'NumPy':>10} {'Speedup':>10}")
    print("-" * 56)
    for n in sizes:
        A = np.random.randn(n, n)
        B = np.random.randn(n, n)
        # Naive (only for small matrices)
        if n <= 256:
            start = time.time()
            _ = matmul_naive(A, B)
            naive_time = time.time() - start
        else:
            naive_time = float('inf')
        # Blocked
        start = time.time()
        _ = matmul_blocked(A, B)
        blocked_time = time.time() - start
        # NumPy (highly optimized BLAS)
        start = time.time()
        _ = np.dot(A, B)
        numpy_time = time.time() - start
        speedup = blocked_time / numpy_time
        naive_str = f"{naive_time:.3f}s" if naive_time != float('inf') else "too slow"
        print(f"{n:6d} {naive_str:>10} {blocked_time:10.3f}s {numpy_time:10.3f}s {speedup:10.1f}x")

if __name__ == "__main__":
    example_matrix_operations()
    benchmark_matrix_multiplication()
```

11.6 Numerical Stability: When Math Meets Reality
11.6.1 The Floating-Point Problem
Computers can’t represent real numbers exactly. They use floating-point numbers with limited precision.
```python
def demonstrate_floating_point_errors():
    """Show how floating-point arithmetic can be surprising."""
    print("=== Floating-Point Surprises ===\n")

    # Addition is not associative!
    a = (0.1 + 0.2) + 0.3
    b = 0.1 + (0.2 + 0.3)
    print(f"(0.1 + 0.2) + 0.3 = {a}")
    print(f"0.1 + (0.2 + 0.3) = {b}")
    print(f"Equal? {a == b}\n")

    # Small numbers can disappear
    big = 1e16
    small = 1.0
    result = (big + small) - big
    print(f"(1e16 + 1) - 1e16 = {result}")
    print(f"Expected: 1.0, Got: {result}\n")

    # Rounding drift from repeated add/subtract
    x = 1.0
    for _ in range(1000000):
        x += 1e-10
        x -= 1e-10
    print("Start with 1.0, add/subtract 1e-10 a million times")
    print(f"Expected: 1.0, Got: {x}\n")
```

11.6.2 Condition Numbers
The condition number measures how sensitive a problem is to small changes in input.
For solving Ax = b:
```
κ(A) = ||A|| × ||A⁻¹||
```

If κ(A) is large, the problem is ill-conditioned: small changes in A or b cause large changes in x.
```python
def compute_condition_number(A):
    """
    Compute the condition number of matrix A.
    Large condition number = ill-conditioned = numerical problems!
    """
    A_inv = np.linalg.inv(A)
    norm_A = np.linalg.norm(A)
    norm_A_inv = np.linalg.norm(A_inv)
    return norm_A * norm_A_inv

def demonstrate_ill_conditioning():
    """Show problems with ill-conditioned matrices."""
    print("=== Ill-Conditioned Systems ===\n")

    # Well-conditioned matrix
    A_good = np.array([[4, 1], [1, 3]], dtype=float)
    b = np.array([1, 2], dtype=float)
    x = np.linalg.solve(A_good, b)
    cond = compute_condition_number(A_good)
    print("Well-conditioned system:")
    print(f"Condition number: {cond:.2f}")
    print(f"Solution: {x}\n")

    # Ill-conditioned matrix (Hilbert matrix)
    n = 5
    A_bad = np.array([[1/(i+j+1) for j in range(n)] for i in range(n)])
    b = np.ones(n)
    x = np.linalg.solve(A_bad, b)
    cond = compute_condition_number(A_bad)
    print("Ill-conditioned system (Hilbert matrix):")
    print(f"Condition number: {cond:.2e}")
    print(f"Solution: {x}")

    # Small perturbation
    b_perturbed = b + 1e-10 * np.random.randn(n)
    x_perturbed = np.linalg.solve(A_bad, b_perturbed)
    relative_change = np.linalg.norm(x - x_perturbed) / np.linalg.norm(x)
    print("\nAfter tiny perturbation (1e-10):")
    print(f"Relative change in solution: {relative_change:.2e}")
    print("Small input change → HUGE output change!")
```

11.6.3 Numerically Stable Algorithms
Unstable algorithm: Subtracting nearly equal numbers
```python
def quadratic_formula_unstable(a, b, c):
    """
    Unstable: loses precision when b² >> 4ac.
    """
    discriminant = np.sqrt(b**2 - 4*a*c)
    x1 = (-b + discriminant) / (2*a)  # Catastrophic cancellation!
    x2 = (-b - discriminant) / (2*a)
    return x1, x2

def quadratic_formula_stable(a, b, c):
    """
    Stable: avoids catastrophic cancellation.
    Computes the large-magnitude root first, then derives
    the small root from Vieta's formula x1·x2 = c/a.
    """
    discriminant = np.sqrt(b**2 - 4*a*c)
    if b > 0:
        x1 = (-b - discriminant) / (2*a)
        x2 = c / (a * x1)  # Vieta's formula
    else:
        x1 = (-b + discriminant) / (2*a)
        x2 = c / (a * x1)
    return x1, x2

def demonstrate_numerical_stability():
    """Show the importance of numerically stable algorithms."""
    print("=== Numerical Stability ===\n")
    # Coefficients where b² >> 4ac
    a, b, c = 1, 1e8, 1
    print(f"Solving x² + {b}x + {c} = 0\n")
    x1_unstable, x2_unstable = quadratic_formula_unstable(a, b, c)
    x1_stable, x2_stable = quadratic_formula_stable(a, b, c)
    print("Unstable algorithm:")
    print(f"  x1 = {x1_unstable}")
    print(f"  x2 = {x2_unstable}\n")
    print("Stable algorithm:")
    print(f"  x1 = {x1_stable}")
    print(f"  x2 = {x2_stable}\n")
    # Verify the small root (the one the unstable version mangles)
    true_small = -1e-8
    error_unstable = abs(x1_unstable - true_small)
    error_stable = abs(x2_stable - true_small)
    print(f"True small root: {true_small}")
    print(f"Unstable error: {error_unstable:.2e}")
    print(f"Stable error:   {error_stable:.2e}")
    if error_stable > 0:
        print(f"Improvement: {error_unstable / error_stable:.0f}x better!")
    else:
        print("Stable result is exact to machine precision!")
```

11.6.4 Best Practices for Numerical Computing
```python
class NumericalBestPractices:
    """Guidelines for writing numerically stable code."""

    @staticmethod
    def sum_stable(values):
        """
        Compensated (Kahan-Neumaier) summation: reduces rounding errors.
        A regular sum can accumulate large errors. This variant keeps
        track of the low-order bits lost at each step, even when the
        next term is larger in magnitude than the running total.
        """
        total = 0.0
        compensation = 0.0
        for value in values:
            t = total + value
            if abs(total) >= abs(value):
                compensation += (total - t) + value  # low bits of value lost
            else:
                compensation += (value - t) + total  # low bits of total lost
            total = t
        return total + compensation

    @staticmethod
    def log_sum_exp(x):
        """
        Compute log(sum(exp(x))) in a numerically stable way.
        The direct computation often overflows/underflows.
        Used in machine learning (softmax, logsumexp).
        """
        x = np.asarray(x)
        x_max = np.max(x)
        return x_max + np.log(np.sum(np.exp(x - x_max)))

    @staticmethod
    def safe_divide(a, b, epsilon=1e-10):
        """Avoid division by zero by clamping tiny denominators."""
        b = np.asarray(b, dtype=float)
        safe_b = np.where(np.abs(b) < epsilon,
                          np.where(b < 0, -epsilon, epsilon),
                          b)
        return a / safe_b

    @staticmethod
    def compute_mean_variance_stable(data):
        """
        Welford's online algorithm for mean and variance.
        More stable than the naive sum-of-squares approach.
        """
        n = 0
        mean = 0.0
        M2 = 0.0
        for x in data:
            n += 1
            delta = x - mean
            mean += delta / n
            delta2 = x - mean
            M2 += delta * delta2
        if n < 2:
            return mean, 0.0
        variance = M2 / (n - 1)
        return mean, variance
```
# Examples
def example_numerical_best_practices():
"""Demonstrate numerical best practices."""
print("=== Numerical Best Practices ===\n")
# Kahan summation
values = [1e16, 1.0, -1e16] # Should sum to 1.0
naive_sum = sum(values)
kahan_sum = NumericalBestPractices.sum_stable(values)
print("Summing [1e16, 1.0, -1e16]:")
print(f"Naive sum: {naive_sum}")
print(f"Kahan sum: {kahan_sum}")
print(f"Expected: 1.0\n")
# Log-sum-exp
    x = np.array([1000, 1001, 1002])  # np.exp overflows to inf here!
    # Note: np.exp signals overflow with a warning and returns inf
    # rather than raising, so a try/except would never trigger
    with np.errstate(over='ignore'):
        naive = np.log(np.sum(np.exp(x)))
    print(f"Naive log-sum-exp: {naive}")  # inf -- the result is destroyed
stable = NumericalBestPractices.log_sum_exp(x)
print(f"Stable log-sum-exp: {stable}\n")
# Stable mean/variance
data = np.random.randn(1000000) + 1e10 # Large offset
mean_stable, var_stable = NumericalBestPractices.compute_mean_variance_stable(data)
mean_naive = np.mean(data)
var_naive = np.var(data, ddof=1)
print("Computing mean/variance of large numbers:")
print(f"Stable: mean={mean_stable:.6f}, var={var_stable:.6f}")
print(f"NumPy: mean={mean_naive:.6f}, var={var_naive:.6f}")
if __name__ == "__main__":
demonstrate_floating_point_errors()
demonstrate_ill_conditioning()
demonstrate_numerical_stability()
    example_numerical_best_practices()

19.7 11.7 Applications in Signal Processing
19.7.1 11.7.1 Audio Processing
class AudioProcessor:
"""Audio signal processing using FFT."""
@staticmethod
def load_audio(filename, duration=None):
"""Load audio file (placeholder - would use librosa in practice)."""
# For demo, generate synthetic audio
sample_rate = 44100
if duration is None:
duration = 3.0
t = np.linspace(0, duration, int(sample_rate * duration))
# Generate a chord: A4 + C#5 + E5 (A major chord)
signal = (np.sin(2 * np.pi * 440 * t) + # A4
0.8 * np.sin(2 * np.pi * 554.37 * t) + # C#5
0.6 * np.sin(2 * np.pi * 659.25 * t)) # E5
return signal, sample_rate
@staticmethod
def apply_lowpass_filter(signal, sample_rate, cutoff_freq):
"""
Low-pass filter: remove high frequencies.
Used for: reducing noise, anti-aliasing
"""
# FFT
spectrum = FFT.fft(signal)
freqs = FFT.fftfreq(len(signal), 1/sample_rate)
# Zero out high frequencies
spectrum[np.abs(freqs) > cutoff_freq] = 0
# IFFT
filtered = np.real(FFT.ifft(spectrum))
return filtered
@staticmethod
def apply_highpass_filter(signal, sample_rate, cutoff_freq):
"""
High-pass filter: remove low frequencies.
Used for: removing rumble, isolating treble
"""
spectrum = FFT.fft(signal)
freqs = FFT.fftfreq(len(signal), 1/sample_rate)
spectrum[np.abs(freqs) < cutoff_freq] = 0
filtered = np.real(FFT.ifft(spectrum))
return filtered
@staticmethod
def apply_bandpass_filter(signal, sample_rate, low_freq, high_freq):
"""
Band-pass filter: keep only frequencies in a range.
Used for: isolating specific instruments, voice isolation
"""
spectrum = FFT.fft(signal)
freqs = FFT.fftfreq(len(signal), 1/sample_rate)
mask = (np.abs(freqs) < low_freq) | (np.abs(freqs) > high_freq)
spectrum[mask] = 0
filtered = np.real(FFT.ifft(spectrum))
return filtered
    @staticmethod
    def pitch_shift(signal, sample_rate, semitones):
        """
        Shift pitch by semitones (placeholder).
        A crude approach resamples by 2**(semitones / 12), which also
        changes duration; professional pitch shifting (e.g. phase
        vocoders) preserves timing and is much more complex!
        """
        shift_factor = 2 ** (semitones / 12)  # frequency scaling factor
        # Placeholder: a real implementation would resample or phase-vocode
        return signal
@staticmethod
def compute_spectrogram(signal, sample_rate, window_size=2048, hop_size=512):
"""
Compute spectrogram for visualization.
Shows how frequency content changes over time.
"""
n_windows = (len(signal) - window_size) // hop_size + 1
spec = np.zeros((window_size // 2 + 1, n_windows))
window = np.hamming(window_size)
for i in range(n_windows):
start = i * hop_size
segment = signal[start:start + window_size] * window
if len(segment) < window_size:
segment = np.pad(segment, (0, window_size - len(segment)))
spectrum = FFT.rfft(segment)
spec[:, i] = np.abs(spectrum)
times = np.arange(n_windows) * hop_size / sample_rate
freqs = np.fft.rfftfreq(window_size, 1/sample_rate)
return spec, times, freqs
@staticmethod
def remove_noise(signal, sample_rate, noise_profile_duration=0.5):
"""
Simple noise reduction using spectral subtraction.
1. Learn noise profile from first segment
2. Subtract noise spectrum from signal
"""
        noise_samples = int(noise_profile_duration * sample_rate)
        noise_segment = signal[:noise_samples]
        window_size = 2048
        hop_size = 512
        # Estimate the noise magnitude spectrum from window-sized frames so
        # its frequency resolution matches the analysis windows below
        if len(noise_segment) < window_size:
            noise_segment = np.pad(noise_segment, (0, window_size - len(noise_segment)))
        n_frames = len(noise_segment) // window_size
        noise_spectrum = np.mean(
            [np.abs(FFT.fft(noise_segment[i * window_size:(i + 1) * window_size]))
             for i in range(n_frames)], axis=0)
        # Process signal in windows
        n_windows = (len(signal) - window_size) // hop_size + 1
        output = np.zeros_like(signal)
        window = np.hamming(window_size)
        for i in range(n_windows):
            start = i * hop_size
            segment = signal[start:start + window_size] * window
            if len(segment) < window_size:
                continue
            # FFT
            spectrum = FFT.fft(segment)
            magnitude = np.abs(spectrum)
            phase = np.angle(spectrum)
            # Spectral subtraction: clamp at zero to avoid negative magnitudes
            magnitude_clean = np.maximum(magnitude - noise_spectrum, 0)
# Reconstruct
spectrum_clean = magnitude_clean * np.exp(1j * phase)
segment_clean = np.real(FFT.ifft(spectrum_clean))
# Overlap-add
output[start:start + window_size] += segment_clean
return output
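The `pitch_shift` method above is only a placeholder. As its docstring suggests, the crudest workable approach is resampling, which shifts pitch at the cost of also changing duration. A minimal sketch (the function name `pitch_shift_resample` is illustrative, not part of the toolkit's API):

```python
import numpy as np

def pitch_shift_resample(signal, semitones):
    """Crude pitch shift by resampling with linear interpolation.
    Raising pitch by k semitones shortens the signal by 2**(k/12);
    real pitch shifters (phase vocoders) preserve duration."""
    factor = 2 ** (semitones / 12)
    n_out = int(len(signal) / factor)
    # Read the original signal at stretched sample positions
    positions = np.arange(n_out) * factor
    return np.interp(positions, np.arange(len(signal)), signal)
```

Shifting up one octave (12 semitones) halves the length and doubles every frequency; a phase vocoder would instead keep the length fixed.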
# Examples
def example_audio_filters():
"""Demonstrate audio filtering."""
import matplotlib.pyplot as plt
print("=== Audio Filtering ===\n")
# Generate test signal
signal, sample_rate = AudioProcessor.load_audio(duration=2.0)
# Add noise
noise = 0.1 * np.random.randn(len(signal))
noisy_signal = signal + noise
# Apply filters
lowpass = AudioProcessor.apply_lowpass_filter(noisy_signal, sample_rate, 1000)
highpass = AudioProcessor.apply_highpass_filter(noisy_signal, sample_rate, 300)
bandpass = AudioProcessor.apply_bandpass_filter(noisy_signal, sample_rate, 400, 600)
# Visualize
fig, axes = plt.subplots(4, 2, figsize=(14, 10))
# Time domain
t = np.arange(len(signal)) / sample_rate
for idx, (sig, title) in enumerate([
(signal, 'Original'),
(noisy_signal, 'Noisy'),
(lowpass, 'Low-pass (< 1kHz)'),
(bandpass, 'Band-pass (400-600 Hz)')
]):
# Time domain
axes[idx, 0].plot(t[:1000], sig[:1000])
axes[idx, 0].set_title(f'{title} - Time Domain')
axes[idx, 0].set_xlabel('Time (s)')
axes[idx, 0].set_ylabel('Amplitude')
# Frequency domain
spectrum = np.abs(FFT.rfft(sig))
freqs = np.fft.rfftfreq(len(sig), 1/sample_rate)
axes[idx, 1].plot(freqs, spectrum)
axes[idx, 1].set_title(f'{title} - Frequency Domain')
axes[idx, 1].set_xlabel('Frequency (Hz)')
axes[idx, 1].set_ylabel('Magnitude')
axes[idx, 1].set_xlim([0, 2000])
plt.tight_layout()
plt.savefig('audio_filtering.png', dpi=150)
plt.close()
print("✓ Audio filtering visualization saved")
def example_spectrogram():
"""Create and visualize spectrogram."""
import matplotlib.pyplot as plt
print("\n=== Spectrogram ===\n")
# Generate chirp signal (frequency increases over time)
duration = 2.0
sample_rate = 8000
t = np.linspace(0, duration, int(sample_rate * duration))
# Chirp from 100 Hz to 1000 Hz
f0, f1 = 100, 1000
chirp = np.sin(2 * np.pi * (f0 + (f1 - f0) * t / duration) * t)
# Compute spectrogram
spec, times, freqs = AudioProcessor.compute_spectrogram(chirp, sample_rate)
# Plot
plt.figure(figsize=(12, 6))
    plt.pcolormesh(times, freqs, 20 * np.log10(spec + 1e-10), shading='auto')
    plt.colorbar(label='Magnitude (dB)')
plt.ylabel('Frequency (Hz)')
plt.xlabel('Time (s)')
plt.title('Spectrogram of Chirp Signal')
plt.ylim([0, 1500])
plt.savefig('spectrogram.png', dpi=150)
plt.close()
print("✓ Spectrogram visualization saved")
if __name__ == "__main__":
example_audio_filters()
    example_spectrogram()

19.7.2 11.7.2 Image Processing with FFT
class ImageProcessor:
"""Image processing using 2D FFT."""
@staticmethod
def fft2d(image):
"""2D FFT of image."""
return np.fft.fft2(image)
@staticmethod
def ifft2d(spectrum):
"""2D inverse FFT."""
return np.fft.ifft2(spectrum)
@staticmethod
def lowpass_filter_image(image, cutoff_ratio=0.1):
"""
Low-pass filter: blur image (remove high frequencies).
"""
# 2D FFT
spectrum = ImageProcessor.fft2d(image)
spectrum_shifted = np.fft.fftshift(spectrum)
# Create mask
rows, cols = image.shape
crow, ccol = rows // 2, cols // 2
mask = np.zeros((rows, cols))
r = int(cutoff_ratio * min(rows, cols))
center = [crow, ccol]
y, x = np.ogrid[:rows, :cols]
mask_area = (x - center[1])**2 + (y - center[0])**2 <= r**2
mask[mask_area] = 1
# Apply mask
spectrum_shifted *= mask
# IFFT
spectrum = np.fft.ifftshift(spectrum_shifted)
filtered = np.real(ImageProcessor.ifft2d(spectrum))
return filtered
@staticmethod
def highpass_filter_image(image, cutoff_ratio=0.1):
"""
High-pass filter: edge detection (keep high frequencies).
"""
spectrum = ImageProcessor.fft2d(image)
spectrum_shifted = np.fft.fftshift(spectrum)
rows, cols = image.shape
crow, ccol = rows // 2, cols // 2
mask = np.ones((rows, cols))
r = int(cutoff_ratio * min(rows, cols))
center = [crow, ccol]
y, x = np.ogrid[:rows, :cols]
mask_area = (x - center[1])**2 + (y - center[0])**2 <= r**2
mask[mask_area] = 0
spectrum_shifted *= mask
spectrum = np.fft.ifftshift(spectrum_shifted)
filtered = np.real(ImageProcessor.ifft2d(spectrum))
return filtered
@staticmethod
def compress_image(image, keep_ratio=0.1):
"""
Compress image by keeping only largest FFT coefficients.
This is similar to JPEG compression!
"""
spectrum = ImageProcessor.fft2d(image)
# Keep only largest coefficients
threshold = np.percentile(np.abs(spectrum), (1 - keep_ratio) * 100)
spectrum[np.abs(spectrum) < threshold] = 0
# Reconstruct
compressed = np.real(ImageProcessor.ifft2d(spectrum))
# Compression ratio
kept = np.count_nonzero(spectrum)
total = spectrum.size
actual_ratio = kept / total
return compressed, actual_ratio
def example_image_processing():
"""Demonstrate image processing with FFT."""
import matplotlib.pyplot as plt
print("\n=== Image Processing ===\n")
# Create test image
size = 256
image = np.zeros((size, size))
# Add some shapes
image[50:100, 50:100] = 1 # Square
image[150:200, 150:200] = 1 # Another square
# Add texture
x, y = np.meshgrid(np.arange(size), np.arange(size))
image += 0.3 * np.sin(2 * np.pi * x / 20) * np.sin(2 * np.pi * y / 20)
# Apply filters
lowpass = ImageProcessor.lowpass_filter_image(image, 0.1)
highpass = ImageProcessor.highpass_filter_image(image, 0.05)
compressed, ratio = ImageProcessor.compress_image(image, 0.1)
# Visualize
fig, axes = plt.subplots(2, 2, figsize=(12, 12))
axes[0, 0].imshow(image, cmap='gray')
axes[0, 0].set_title('Original Image')
axes[0, 0].axis('off')
axes[0, 1].imshow(lowpass, cmap='gray')
axes[0, 1].set_title('Low-pass Filter (Blurred)')
axes[0, 1].axis('off')
axes[1, 0].imshow(highpass, cmap='gray')
axes[1, 0].set_title('High-pass Filter (Edges)')
axes[1, 0].axis('off')
axes[1, 1].imshow(compressed, cmap='gray')
axes[1, 1].set_title(f'Compressed ({ratio*100:.1f}% coefficients)')
axes[1, 1].axis('off')
plt.tight_layout()
plt.savefig('image_processing.png', dpi=150)
plt.close()
print("✓ Image processing visualization saved")
print(f" Compression: kept {ratio*100:.1f}% of coefficients")
if __name__ == "__main__":
    example_image_processing()

19.8 11.8 Chapter Project: Complete FFT Analysis Toolkit
Let’s build a comprehensive signal processing library!
19.8.1 11.8.1 Project Structure
FFTAnalyzer/
├── fft_analyzer/
│ ├── __init__.py
│ ├── core/
│ │ ├── fft.py # FFT implementations
│ │ ├── matrix.py # Matrix operations
│ │ └── numerical.py # Numerical stability utilities
│ ├── signal/
│ │ ├── filters.py # Audio filters
│ │ ├── generators.py # Signal generation
│ │ └── analysis.py # Spectral analysis
│ ├── image/
│ │ ├── filters.py # Image filters
│ │ └── compression.py # Image compression
│ ├── applications/
│ │ ├── audio_processor.py # Audio app
│ │ ├── image_processor.py # Image app
│ │ └── data_analyzer.py # Time series
│ └── visualization/
│ ├── plots.py # Plotting utilities
│ └── animations.py # Animated visualizations
├── tests/
├── examples/
├── docs/
└── setup.py
19.8.2 11.8.2 Core FFT Module
# fft_analyzer/__init__.py
"""
FFT Analyzer: Comprehensive signal processing toolkit.
Features:
- Multiple FFT implementations (recursive, iterative)
- Audio processing (filtering, spectrograms)
- Image processing (2D FFT, compression)
- Matrix operations (fast multiplication, decompositions)
- Numerical stability utilities
"""
__version__ = "1.0.0"
from .core import FFT, MatrixOps, NumericalStability
from .signal import SignalGenerator, AudioFilter, SpectralAnalyzer
from .image import ImageFilter, ImageCompressor
from .applications import AudioProcessor, ImageProcessor, TimeSeriesAnalyzer
__all__ = [
'FFT',
'MatrixOps',
'NumericalStability',
'SignalGenerator',
'AudioFilter',
'SpectralAnalyzer',
'ImageFilter',
'ImageCompressor',
'AudioProcessor',
'ImageProcessor',
'TimeSeriesAnalyzer',
]

19.8.3 11.8.3 Signal Generator Module
# fft_analyzer/signal/generators.py
"""
Signal generation utilities for testing and demonstration.
"""
import numpy as np
class SignalGenerator:
"""Generate various test signals."""
@staticmethod
def sine_wave(frequency, duration, sample_rate=44100, amplitude=1.0, phase=0):
"""Generate pure sine wave."""
t = np.arange(int(duration * sample_rate)) / sample_rate
return amplitude * np.sin(2 * np.pi * frequency * t + phase), t
@staticmethod
def square_wave(frequency, duration, sample_rate=44100, amplitude=1.0):
"""Generate square wave."""
t = np.arange(int(duration * sample_rate)) / sample_rate
return amplitude * np.sign(np.sin(2 * np.pi * frequency * t)), t
@staticmethod
def sawtooth_wave(frequency, duration, sample_rate=44100, amplitude=1.0):
"""Generate sawtooth wave."""
t = np.arange(int(duration * sample_rate)) / sample_rate
return amplitude * 2 * (t * frequency - np.floor(t * frequency + 0.5)), t
@staticmethod
def chirp(f0, f1, duration, sample_rate=44100, method='linear'):
"""
Generate chirp signal (sweep from f0 to f1).
Args:
f0: Start frequency
f1: End frequency
duration: Duration in seconds
sample_rate: Samples per second
method: 'linear' or 'exponential'
"""
t = np.arange(int(duration * sample_rate)) / sample_rate
if method == 'linear':
# Linear chirp
c = (f1 - f0) / duration
phase = 2 * np.pi * (f0 * t + 0.5 * c * t**2)
else: # exponential
c = (f1 / f0) ** (1 / duration)
phase = 2 * np.pi * f0 * (c**t - 1) / np.log(c)
return np.sin(phase), t
@staticmethod
def white_noise(duration, sample_rate=44100, amplitude=1.0):
"""Generate white noise."""
n_samples = int(duration * sample_rate)
return amplitude * np.random.randn(n_samples)
@staticmethod
def pink_noise(duration, sample_rate=44100, amplitude=1.0):
"""
Generate pink noise (1/f noise).
Pink noise has equal energy per octave.
"""
n_samples = int(duration * sample_rate)
white = np.random.randn(n_samples)
# Pink noise via FFT filtering
spectrum = np.fft.rfft(white)
freqs = np.fft.rfftfreq(n_samples, 1/sample_rate)
freqs[0] = 1 # Avoid division by zero
# 1/f envelope
spectrum *= 1 / np.sqrt(freqs)
pink = np.fft.irfft(spectrum, n_samples)
return amplitude * pink / np.std(pink)
@staticmethod
def impulse(duration, sample_rate=44100, delay=0):
"""Generate impulse (delta function)."""
n_samples = int(duration * sample_rate)
signal = np.zeros(n_samples)
delay_samples = int(delay * sample_rate)
if 0 <= delay_samples < n_samples:
signal[delay_samples] = 1.0
return signal
@staticmethod
def fm_synthesis(carrier_freq, mod_freq, mod_index, duration, sample_rate=44100):
"""
FM synthesis (frequency modulation).
Used in classic synthesizers!
Args:
carrier_freq: Carrier frequency
mod_freq: Modulation frequency
mod_index: Modulation index (depth)
duration: Duration in seconds
sample_rate: Samples per second
"""
t = np.arange(int(duration * sample_rate)) / sample_rate
modulator = mod_index * np.sin(2 * np.pi * mod_freq * t)
carrier = np.sin(2 * np.pi * carrier_freq * t + modulator)
return carrier, t
@staticmethod
def am_synthesis(carrier_freq, mod_freq, mod_depth, duration, sample_rate=44100):
"""
AM synthesis (amplitude modulation).
Args:
carrier_freq: Carrier frequency
mod_freq: Modulation frequency
mod_depth: Modulation depth (0 to 1)
duration: Duration in seconds
sample_rate: Samples per second
"""
t = np.arange(int(duration * sample_rate)) / sample_rate
modulator = 1 + mod_depth * np.sin(2 * np.pi * mod_freq * t)
carrier = np.sin(2 * np.pi * carrier_freq * t)
return carrier * modulator, t
@staticmethod
def musical_note(note, duration, sample_rate=44100, envelope='adsr'):
"""
Generate musical note with envelope.
Args:
note: Note name (e.g., 'A4', 'C#5') or frequency
duration: Duration in seconds
sample_rate: Samples per second
envelope: Envelope type ('adsr', 'linear', 'exponential')
"""
# Note to frequency mapping
note_freqs = {
'C': 261.63, 'C#': 277.18, 'D': 293.66, 'D#': 311.13,
'E': 329.63, 'F': 349.23, 'F#': 369.99, 'G': 392.00,
'G#': 415.30, 'A': 440.00, 'A#': 466.16, 'B': 493.88
}
if isinstance(note, str):
# Parse note name
note_name = note[:-1]
octave = int(note[-1])
freq = note_freqs[note_name] * (2 ** (octave - 4))
else:
freq = note
# Generate tone
signal, t = SignalGenerator.sine_wave(freq, duration, sample_rate)
# Apply envelope
if envelope == 'adsr':
# Attack, Decay, Sustain, Release
attack = int(0.1 * sample_rate)
decay = int(0.1 * sample_rate)
release = int(0.2 * sample_rate)
sustain_level = 0.7
env = np.ones_like(signal)
# Attack
env[:attack] = np.linspace(0, 1, attack)
# Decay
env[attack:attack+decay] = np.linspace(1, sustain_level, decay)
# Sustain (already at sustain_level)
env[attack+decay:-release] = sustain_level
# Release
env[-release:] = np.linspace(sustain_level, 0, release)
signal *= env
return signal, t
# Examples
def example_signal_generation():
"""Demonstrate signal generation."""
import matplotlib.pyplot as plt
print("=== Signal Generation ===\n")
duration = 1.0
sample_rate = 8000
# Generate various signals
signals = {
'Sine Wave (440 Hz)': SignalGenerator.sine_wave(440, duration, sample_rate)[0],
'Square Wave': SignalGenerator.square_wave(440, duration, sample_rate)[0],
'Chirp (100-1000 Hz)': SignalGenerator.chirp(100, 1000, duration, sample_rate)[0],
'White Noise': SignalGenerator.white_noise(duration, sample_rate)[0],
'FM Synthesis': SignalGenerator.fm_synthesis(440, 5, 100, duration, sample_rate)[0],
'Musical Note (A4)': SignalGenerator.musical_note('A4', duration, sample_rate)[0]
}
# Plot
fig, axes = plt.subplots(3, 2, figsize=(14, 10))
axes = axes.flatten()
t = np.arange(int(duration * sample_rate)) / sample_rate
for idx, (name, signal) in enumerate(signals.items()):
axes[idx].plot(t[:1000], signal[:1000])
axes[idx].set_title(name)
axes[idx].set_xlabel('Time (s)')
axes[idx].set_ylabel('Amplitude')
axes[idx].grid(True)
plt.tight_layout()
plt.savefig('signal_generation.png', dpi=150)
plt.close()
print("✓ Signal generation examples saved")
if __name__ == "__main__":
    example_signal_generation()

19.8.4 11.8.4 Complete Audio Processor Application
# fft_analyzer/applications/audio_processor.py
"""
Complete audio processing application.
"""
import numpy as np
from ..core import FFT
from ..signal import SignalGenerator, AudioFilter
from ..visualization import AudioVisualizer
class AudioProcessor:
"""
Complete audio processing application.
Features:
- Load/save audio files
- Real-time filtering
- Spectral analysis
- Effects (reverb, echo, pitch shift)
- Visualization
"""
def __init__(self, sample_rate=44100):
self.sample_rate = sample_rate
self.audio = None
self.processed = None
def load(self, filename=None, duration=None):
"""Load audio file or generate test signal."""
if filename is None:
# Generate test signal
self.audio, _ = SignalGenerator.musical_note(
'A4', duration or 2.0, self.sample_rate
)
else:
# In real app, would use librosa or soundfile
pass
def apply_filter(self, filter_type, **params):
"""Apply filter to audio."""
if self.audio is None:
raise ValueError("No audio loaded")
if filter_type == 'lowpass':
cutoff = params.get('cutoff', 1000)
self.processed = AudioFilter.lowpass(
self.audio, self.sample_rate, cutoff
)
elif filter_type == 'highpass':
cutoff = params.get('cutoff', 300)
self.processed = AudioFilter.highpass(
self.audio, self.sample_rate, cutoff
)
elif filter_type == 'bandpass':
low = params.get('low', 400)
high = params.get('high', 600)
self.processed = AudioFilter.bandpass(
self.audio, self.sample_rate, low, high
)
else:
raise ValueError(f"Unknown filter type: {filter_type}")
def add_echo(self, delay=0.3, decay=0.5):
"""Add echo effect."""
if self.audio is None:
raise ValueError("No audio loaded")
delay_samples = int(delay * self.sample_rate)
output = self.audio.copy()
# Add delayed and attenuated copies
if len(output) > delay_samples:
output[delay_samples:] += decay * self.audio[:-delay_samples]
self.processed = output
def add_reverb(self, room_size=0.5, damping=0.5):
"""
Add reverb effect (simplified).
Real reverb uses convolution with impulse response.
"""
if self.audio is None:
raise ValueError("No audio loaded")
# Simple comb filter reverb
delays = [0.02973, 0.03715, 0.04197, 0.04543] # seconds
output = self.audio.copy()
for delay in delays:
delay_samples = int(delay * self.sample_rate * room_size)
if len(output) > delay_samples:
decayed = damping * self.audio[:-delay_samples]
output[delay_samples:] += decayed[:len(output)-delay_samples]
self.processed = output / (1 + len(delays) * damping)
def normalize(self):
"""Normalize audio to [-1, 1] range."""
if self.processed is not None:
self.processed = self.processed / np.max(np.abs(self.processed))
elif self.audio is not None:
self.audio = self.audio / np.max(np.abs(self.audio))
def analyze_spectrum(self):
"""
Perform spectral analysis.
Returns:
Dictionary with spectral features
"""
if self.audio is None:
raise ValueError("No audio loaded")
# FFT
spectrum = np.abs(FFT.rfft(self.audio))
freqs = np.fft.rfftfreq(len(self.audio), 1/self.sample_rate)
# Find dominant frequency
dominant_idx = np.argmax(spectrum)
dominant_freq = freqs[dominant_idx]
# Compute spectral centroid
centroid = np.sum(freqs * spectrum) / np.sum(spectrum)
# Compute bandwidth
bandwidth = np.sqrt(np.sum(((freqs - centroid)**2) * spectrum) / np.sum(spectrum))
return {
'dominant_frequency': dominant_freq,
'spectral_centroid': centroid,
'spectral_bandwidth': bandwidth,
'spectrum': spectrum,
'frequencies': freqs
}
def visualize(self, show_processed=True):
"""Visualize audio in time and frequency domains."""
import matplotlib.pyplot as plt
fig, axes = plt.subplots(2, 2, figsize=(14, 10))
t = np.arange(len(self.audio)) / self.sample_rate
# Original signal - time domain
axes[0, 0].plot(t[:2000], self.audio[:2000])
axes[0, 0].set_title('Original - Time Domain')
axes[0, 0].set_xlabel('Time (s)')
axes[0, 0].set_ylabel('Amplitude')
axes[0, 0].grid(True)
# Original signal - frequency domain
spectrum = np.abs(FFT.rfft(self.audio))
freqs = np.fft.rfftfreq(len(self.audio), 1/self.sample_rate)
axes[0, 1].plot(freqs, spectrum)
axes[0, 1].set_title('Original - Frequency Domain')
axes[0, 1].set_xlabel('Frequency (Hz)')
axes[0, 1].set_ylabel('Magnitude')
axes[0, 1].set_xlim([0, 5000])
axes[0, 1].grid(True)
if show_processed and self.processed is not None:
# Processed signal - time domain
axes[1, 0].plot(t[:2000], self.processed[:2000])
axes[1, 0].set_title('Processed - Time Domain')
axes[1, 0].set_xlabel('Time (s)')
axes[1, 0].set_ylabel('Amplitude')
axes[1, 0].grid(True)
# Processed signal - frequency domain
spectrum_proc = np.abs(FFT.rfft(self.processed))
axes[1, 1].plot(freqs, spectrum_proc)
axes[1, 1].set_title('Processed - Frequency Domain')
axes[1, 1].set_xlabel('Frequency (Hz)')
axes[1, 1].set_ylabel('Magnitude')
axes[1, 1].set_xlim([0, 5000])
axes[1, 1].grid(True)
plt.tight_layout()
return fig
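The comb-filter reverb in `add_reverb` is an approximation; as its docstring notes, realistic reverb convolves the signal with a room impulse response. A sketch of that idea using FFT-based convolution, with a synthetic decaying noise burst standing in for a measured impulse response (the function name and parameters here are illustrative):

```python
import numpy as np

def convolution_reverb(signal, sample_rate, decay_time=0.5, seed=0):
    """Reverb by convolving with a synthetic exponentially decaying
    noise burst that stands in for a measured impulse response."""
    rng = np.random.default_rng(seed)
    n_ir = int(decay_time * sample_rate)
    t = np.arange(n_ir) / sample_rate
    impulse_response = rng.standard_normal(n_ir) * np.exp(-5 * t / decay_time)
    # Fast convolution: zero-pad both signals to the full output
    # length, multiply spectra, and invert
    n_out = len(signal) + n_ir - 1
    out = np.fft.irfft(np.fft.rfft(signal, n_out) *
                       np.fft.rfft(impulse_response, n_out), n_out)
    return out / np.max(np.abs(out))  # normalize to [-1, 1]
```

For long signals this FFT route is O(n log n), versus O(n·m) for direct convolution with an m-sample impulse response.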
# Interactive Demo
def interactive_audio_demo():
"""Interactive audio processing demo."""
print("=== Audio Processor Demo ===\n")
# Create processor
processor = AudioProcessor(sample_rate=8000)
# Load test audio
print("Generating test audio (musical note)...")
processor.load(duration=2.0)
# Analyze original
print("\nOriginal Audio Analysis:")
analysis = processor.analyze_spectrum()
print(f" Dominant frequency: {analysis['dominant_frequency']:.1f} Hz")
print(f" Spectral centroid: {analysis['spectral_centroid']:.1f} Hz")
print(f" Spectral bandwidth: {analysis['spectral_bandwidth']:.1f} Hz")
# Apply effects
print("\nApplying effects...")
# 1. Low-pass filter
processor.apply_filter('lowpass', cutoff=1000)
print(" ✓ Low-pass filter applied")
# 2. Add echo
processor.add_echo(delay=0.3, decay=0.5)
print(" ✓ Echo added")
# 3. Normalize
processor.normalize()
print(" ✓ Normalized")
# Analyze processed
processor.audio = processor.processed
analysis_proc = processor.analyze_spectrum()
print("\nProcessed Audio Analysis:")
print(f" Dominant frequency: {analysis_proc['dominant_frequency']:.1f} Hz")
print(f" Spectral centroid: {analysis_proc['spectral_centroid']:.1f} Hz")
print(f" Spectral bandwidth: {analysis_proc['spectral_bandwidth']:.1f} Hz")
# Visualize
print("\nGenerating visualization...")
fig = processor.visualize()
fig.savefig('audio_processor_demo.png', dpi=150)
print(" ✓ Visualization saved to 'audio_processor_demo.png'")
if __name__ == "__main__":
    interactive_audio_demo()

19.8.5 11.8.5 Command-Line Interface
# fft_analyzer/cli.py
"""
Command-line interface for FFT Analyzer.
"""
import argparse
import numpy as np
from .applications import AudioProcessor, ImageProcessor
from .signal import SignalGenerator
def main():
parser = argparse.ArgumentParser(
description='FFT Analyzer: Signal and image processing toolkit'
)
subparsers = parser.add_subparsers(dest='command')
# Audio processing
audio_parser = subparsers.add_parser('audio', help='Process audio')
audio_parser.add_argument('--generate', choices=['sine', 'chirp', 'note'],
help='Generate test signal')
audio_parser.add_argument('--filter', choices=['lowpass', 'highpass', 'bandpass'],
help='Apply filter')
audio_parser.add_argument('--cutoff', type=float, help='Filter cutoff frequency')
audio_parser.add_argument('--visualize', action='store_true',
help='Generate visualization')
# Image processing
image_parser = subparsers.add_parser('image', help='Process image')
image_parser.add_argument('input', help='Input image')
image_parser.add_argument('--filter', choices=['lowpass', 'highpass'],
help='Apply filter')
image_parser.add_argument('--compress', type=float,
help='Compression ratio (0-1)')
image_parser.add_argument('--output', help='Output image')
# Benchmark
bench_parser = subparsers.add_parser('benchmark', help='Run benchmarks')
bench_parser.add_argument('--type', choices=['fft', 'matrix', 'all'],
default='all')
args = parser.parse_args()
if args.command == 'audio':
# Audio processing logic
pass
elif args.command == 'image':
# Image processing logic
pass
elif args.command == 'benchmark':
# Benchmark logic
pass
if __name__ == '__main__':
    main()

19.9 11.9 Summary and Key Takeaways
Core Algorithms:
1. FFT: O(n log n) frequency analysis that revolutionized signal processing
2. Polynomial multiplication: use FFT for O(n log n) instead of O(n²)
3. Matrix operations: cache-aware blocking makes a huge difference
4. Numerical stability: always account for floating-point errors

Key Insights:
- FFT is everywhere: audio, images, telecommunications, science
- Convolution theorem: time-domain convolution = frequency-domain multiplication
- Cache matters: asymptotic complexity isn't everything
- Numerical errors accumulate: use stable algorithms

Real-World Applications:
- Audio: MP3 compression, noise reduction, equalizers
- Images: JPEG compression, filters, edge detection
- Telecommunications: modulation, channel estimation
- Scientific computing: solving PDEs, quantum simulations

Best Practices:
- Use established libraries (NumPy, FFTW) for production
- Always validate numerical accuracy
- Profile before optimizing
- Consider numerical stability from the start
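The convolution theorem and FFT-based polynomial multiplication from the summary above can be verified in a few lines of NumPy — multiplying (1 + 2x + 3x²) by (4 + 5x + 6x²) is exactly a convolution of coefficient arrays:

```python
import numpy as np

# Coefficient arrays of 1 + 2x + 3x^2 and 4 + 5x + 6x^2
a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

direct = np.convolve(a, b)  # O(n^2) direct convolution

# FFT route: zero-pad to the full output length, multiply spectra, invert
n = len(a) + len(b) - 1
via_fft = np.fft.irfft(np.fft.rfft(a, n) * np.fft.rfft(b, n), n)

# Both give the product polynomial 4 + 13x + 28x^2 + 27x^3 + 18x^4
assert np.allclose(direct, via_fft)
```

The FFT route costs O(n log n), which is why it wins for large polynomials and long signals.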
19.10 11.10 Exercises
19.10.1 Conceptual Understanding
FFT Complexity: Prove that FFT has O(n log n) complexity using the Master Theorem.
Nyquist Frequency: Explain why you can’t measure frequencies above sample_rate/2.
Matrix Blocking: Calculate the optimal block size for your CPU’s L1 cache.
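As a hint for the FFT complexity exercise above: the radix-2 FFT splits a size-n transform into two size-n/2 transforms plus linear-time combining, giving the recurrence

```latex
T(n) = 2\,T\!\left(\frac{n}{2}\right) + O(n)
```

which is Case 2 of the Master Theorem ($a = b = 2$, $f(n) = \Theta(n) = \Theta(n^{\log_b a})$), so $T(n) = O(n \log n)$.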
19.10.2 Implementation Challenges
2D FFT: Implement 2D FFT for images using 1D FFT.
Real FFT: Implement optimized FFT for real-valued signals (half the work!).
Convolution: Implement fast convolution using FFT (used in neural networks).
19.10.3 Application Problems
Audio Codec: Build a simple audio codec using FFT + quantization.
Image Denoising: Implement image denoising using frequency domain filtering.
Polynomial GCD: Use FFT-based polynomial multiplication to compute GCD of polynomials.
19.10.4 Advanced Topics
Number-Theoretic Transform: Implement NTT for exact integer polynomial multiplication.
Chirp Z-Transform: Implement CZT for computing FFT at arbitrary frequencies.
Parallel FFT: Implement parallel FFT using threading or GPU.
19.11 11.11 Further Reading
Classic Papers:
- Cooley & Tukey (1965): "An Algorithm for the Machine Calculation of Complex Fourier Series"
- Strassen (1969): "Gaussian Elimination is Not Optimal"
- Wilkinson (1963): "Rounding Errors in Algebraic Processes"

Books:
- Oppenheim & Schafer: "Discrete-Time Signal Processing" (the Bible of DSP)
- Trefethen & Bau: "Numerical Linear Algebra"
- Golub & Van Loan: "Matrix Computations"
- Press et al.: "Numerical Recipes"

Modern Resources:
- SciPy Lecture Notes: excellent practical guide
- FFTW documentation: one of the fastest FFT libraries
- Intel MKL: highly optimized matrix operations
You’ve now mastered the algorithms that power the digital world! From the music you listen to, to the images you see, to the data your phone transmits—FFT and matrix algorithms are working behind the scenes.
Next up: Advanced data structures that can query ranges in O(log n) time and support persistent versions of themselves!