16 Chapter 8: Approximation Algorithms - When “Good Enough” is Perfect
16.1 The Art of Strategic Compromise
“The perfect is the enemy of the good.” - Voltaire “But in computer science, we can prove exactly how good ‘good’ is.” - Modern CS
16.2 Introduction: The 99% Solution
Imagine you’re planning a road trip to visit 50 tourist attractions across the country. Finding the absolute shortest route would take longer than the age of the universe (the problem is NP-hard!). But what if I told you that in just a few seconds, we could find a route that’s guaranteed to be at most 50% longer than the shortest possible route? Would you take it?
Of course you would! An extra few hours of driving is infinitely better than waiting billions of years for the perfect route.
This is the essence of approximation algorithms: trading perfection for practicality while maintaining mathematical guarantees about solution quality.
16.2.1 A Real-World Success Story
In the early 2010s, UPS rolled out route-optimization software for its delivery network (a variant of the Vehicle Routing Problem, which is NP-hard). The results were staggering:
- Before: Human dispatchers planning routes by intuition
- After: Approximation algorithm guaranteeing routes within 10% of optimal
- Impact: Saved 10 million gallons of fuel per year, $300+ million annually
- Computation time: Seconds instead of centuries
The routes weren’t perfect, but they were provably good and computationally achievable. That’s the power of approximation!
16.2.2 Why Approximation Algorithms Matter
When faced with an NP-hard problem, you have several options:
- Exponential exact algorithms - Perfect but impossibly slow
- Heuristics - Fast but no quality guarantee
- Approximation algorithms - Fast WITH quality guarantees ✨
Approximation algorithms give you the best of both worlds: speed and confidence.
16.2.3 The Approximation Guarantee
The key concept is the approximation ratio:
For a minimization problem: algorithm solution ≤ α × optimal solution

For a maximization problem: algorithm solution ≥ (1/α) × optimal solution

where α ≥ 1 is the approximation ratio. (For maximization problems the guarantee is often quoted as the reciprocal, e.g. “0.5-approximation”.)

Examples:
- A 2-approximation for TSP means: your route ≤ 2 × the shortest route
- A 0.5-approximation for MAX-CUT means: your cut ≥ 0.5 × the maximum cut
16.2.4 What You’ll Learn
This chapter will teach you to:
- Design approximation algorithms with provable guarantees
- Analyze approximation ratios rigorously
- Apply standard techniques (greedy, LP relaxation, randomization)
- Recognize when approximation is possible (and when it’s not)
- Implement practical approximation algorithms
By the end, you’ll have a powerful toolkit for tackling NP-hard problems in the real world!
16.3 Section 8.1: The Fundamentals of Approximation
16.3.1 Understanding Approximation Ratios
Let’s start with a simple example to build intuition.
16.3.1.1 The Lemonade Stand Location Problem
You want to place lemonade stands to serve houses along a street. Each stand can serve houses within 1 block. What’s the minimum number of stands needed?
Optimal Solution: brute force over all possible subsets of positions is exponential (and the general, two-dimensional facility-location version of this problem is NP-hard).
Greedy Approximation:
1. Start from the leftmost uncovered house
2. Place a stand 1 block to its right
3. Repeat until all houses are covered
```python
def lemonade_stands_greedy(houses):
    """
    Place lemonade stands to cover all houses.
    Each stand covers houses within distance 1.
    This greedy strategy is a 2-approximation.
    """
    houses = sorted(houses)  # Sort by position
    stands = []
    i = 0
    while i < len(houses):
        # Place a stand 1 unit to the right of the current house
        stand_position = houses[i] + 1
        stands.append(stand_position)
        # Skip all houses covered by this stand
        while i < len(houses) and houses[i] <= stand_position + 1:
            i += 1
    return stands

# Example
houses = [1, 2, 3, 6, 7, 10, 11]
stands = lemonade_stands_greedy(houses)
print(f"Houses: {houses}")
print(f"Stands at: {stands}")
print(f"Number of stands: {len(stands)}")
# Output: Stands at: [2, 7, 11], Number of stands: 3
```

Why is this a 2-approximation?

Proof intuition:
- Our greedy algorithm places stands based on the leftmost uncovered house
- The optimal solution must also cover these houses
- In the worst case, optimal places stands perfectly between our stands
- But even then, optimal needs at least half as many stands as we use
- Therefore: our solution ≤ 2 × optimal
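The bound is easy to check empirically. Below is a minimal sketch (not from the chapter) that re-implements the greedy and brute-forces an optimal placement; restricting the optimal search to candidate positions one block right of each house is a standard shift argument for interval covering:

```python
from itertools import combinations

def greedy_stands(houses):
    # The greedy from the text: stand one unit right of the leftmost uncovered house.
    houses = sorted(houses)
    stands, i = [], 0
    while i < len(houses):
        s = houses[i] + 1
        stands.append(s)
        while i < len(houses) and houses[i] <= s + 1:
            i += 1
    return stands

def optimal_stands(houses):
    # Brute force over subsets of candidate positions (house + 1 suffices,
    # since any stand can be shifted right until a house sits on its left edge).
    candidates = sorted({h + 1 for h in houses})
    for k in range(1, len(candidates) + 1):
        for combo in combinations(candidates, k):
            if all(any(abs(h - s) <= 1 for s in combo) for h in houses):
                return list(combo)
    return candidates

houses = [1, 2, 3, 6, 7, 10, 11]
greedy = greedy_stands(houses)
opt = optimal_stands(houses)
assert len(greedy) <= 2 * len(opt)
print(len(greedy), len(opt))
```

On this instance the greedy actually matches the optimum; the factor 2 is a worst-case guarantee, not a typical outcome.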
16.3.2 Types of Approximation Guarantees
16.3.2.1 Constant Factor Approximation
Definition: Algorithm always within constant factor of optimal.
Example: 2-approximation for Vertex Cover:
- Your solution ≤ 2 × optimal
- Works for ANY input
- The “2” doesn’t grow with input size
16.3.2.2 Logarithmic Approximation
Definition: Factor grows logarithmically with input size.
Example: O(log n)-approximation for Set Cover:
- Your solution ≤ (ln n) × optimal
- The factor grows, but slowly
- Still practical for large inputs
16.3.2.3 Polynomial Approximation Schemes (PTAS)
Definition: Get arbitrarily close to optimal, but time grows with accuracy.
Example: (1 + ε)-approximation for Knapsack:
- Choose any ε > 0
- Get a solution within a (1 + ε) factor
- Runtime like O(n^(1/ε)) - polynomial for any fixed ε
- Knapsack in fact admits an FPTAS, where the runtime is polynomial in both n and 1/ε
16.3.3 When Approximation is Impossible
Some problems resist approximation!
16.3.3.1 The Traveling Salesman Problem (General)
Without the triangle inequality:
- TSP cannot be approximated within ANY constant factor (unless P = NP)
- Even getting within 1000000 × optimal is NP-hard!
Why? If we could approximate general TSP, we could solve Hamiltonian Cycle (NP-complete).
16.3.3.2 Clique Problem
Cannot approximate within n^(1-ε) for any ε > 0 (unless P = NP)

For a graph with 1000 nodes this means:
- We can’t even guarantee finding a clique of size 10 when the optimum has size 100!
16.3.4 The Approximation Algorithm Design Process
1. Understand the problem structure
   - What makes it hard?
   - Are there special cases?
2. Design a simple algorithm
   - Often greedy or based on relaxation
   - Must run in polynomial time
3. Prove the approximation ratio
   - Compare to optimal (without computing it!)
   - Use bounds and problem structure
4. Optimize if possible
   - Can you improve the constant?
   - Can you make it faster?
16.4 Section 8.2: Vertex Cover - A Classic 2-Approximation
16.4.1 The Vertex Cover Problem
Problem: Find the smallest set of vertices that “covers” all edges (every edge has at least one endpoint in the set).
Applications:
- Security camera placement (cover all corridors)
- Network monitoring (monitor all connections)
- Facility location (serve all demands)
16.4.2 The Naive Approach
```python
def vertex_cover_naive(graph):
    """
    Try all possible vertex subsets - exponential time!
    Only feasible for tiny graphs.
    """
    n = len(graph.vertices)
    min_cover = set(graph.vertices)  # Worst case: all vertices
    # Try all 2^n subsets
    for mask in range(1 << n):
        subset = {v for i, v in enumerate(graph.vertices) if mask & (1 << i)}
        # Check if it's a valid cover
        if all(u in subset or v in subset for u, v in graph.edges):
            if len(subset) < len(min_cover):
                min_cover = subset
    return min_cover
```

Time: O(2^n × m) - Impossibly slow for n > 20!
16.4.3 The Greedy 2-Approximation
Here’s a beautifully simple algorithm:
```python
def vertex_cover_approx(graph):
    """
    2-approximation for Vertex Cover.
    Algorithm: Repeatedly pick an uncovered edge and add BOTH endpoints.
    Time: O(V + E)
    Approximation ratio: 2
    """
    cover = set()
    edges = list(graph.edges)
    while edges:
        # Pick any uncovered edge
        u, v = edges[0]
        # Add both endpoints to the cover
        cover.add(u)
        cover.add(v)
        # Remove all edges incident to u or v
        edges = [(a, b) for (a, b) in edges
                 if a != u and a != v and b != u and b != v]
    return cover

# Example
class Graph:
    def __init__(self):
        self.edges = [
            ('A', 'B'), ('B', 'C'), ('C', 'D'),
            ('D', 'E'), ('E', 'A'), ('B', 'D')
        ]
        self.vertices = ['A', 'B', 'C', 'D', 'E']

g = Graph()
cover = vertex_cover_approx(g)
print(f"Vertex cover: {cover}")
print(f"Size: {len(cover)}")
# Output (in some order): {'A', 'B', 'C', 'D'}, Size: 4
# An optimal cover is {'B', 'D', 'E'}, Size: 3
```

16.4.4 Why This is a 2-Approximation
The Brilliant Proof:
1. Let M = the set of edges selected by our algorithm
   - These edges are vertex-disjoint (no two share an endpoint)
   - Our cover has size 2|M|
2. Any vertex cover must cover every edge in M
   - Since the edges in M are disjoint, it needs a distinct vertex per edge
   - Therefore: OPT ≥ |M|
3. Our approximation ratio:
   - our size / OPT ≤ 2|M| / |M| = 2 ✓
Visual Proof:
Selected edges (M): A---B C---D E---F
Our cover: A,B C,D E,F (6 vertices)
Optimal must pick: A or B C or D E or F (≥3 vertices)
Ratio: 6/3 = 2
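The factor-2 bound is easy to verify numerically on the small graph used in this section. This sketch (helper names are my own) brute-forces an optimal cover and checks the guarantee:

```python
from itertools import combinations

def approx_cover(edges):
    # The text's algorithm: take an uncovered edge, add both endpoints.
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover |= {u, v}
    return cover

def optimal_cover(edges):
    # Brute force: smallest subset of vertices covering every edge.
    vertices = sorted({x for e in edges for x in e})
    for k in range(len(vertices) + 1):
        for combo in combinations(vertices, k):
            s = set(combo)
            if all(u in s or v in s for u, v in edges):
                return s
    return set(vertices)

edges = [('A', 'B'), ('B', 'C'), ('C', 'D'), ('D', 'E'), ('E', 'A'), ('B', 'D')]
approx = approx_cover(edges)
opt = optimal_cover(edges)
assert all(u in approx or v in approx for u, v in edges)  # valid cover
assert len(approx) <= 2 * len(opt)                        # within factor 2
print(len(approx), len(opt))
```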
16.4.5 An Improved Algorithm: Maximum Matching
```python
def vertex_cover_matching(graph):
    """
    The same 2-approximation, written as a single pass that greedily
    builds a maximal matching and takes both endpoints of every
    matched edge.
    """
    cover = set()
    covered_vertices = set()
    for u, v in graph.edges:
        # Take the edge only if neither endpoint is already matched
        if u not in covered_vertices and v not in covered_vertices:
            cover.add(u)
            cover.add(v)
            covered_vertices.add(u)
            covered_vertices.add(v)
    return cover
```

16.4.6 Can We Do Better Than 2?
The Unique Games Conjecture suggests we cannot approximate Vertex Cover better than 2 - ε for any ε > 0.
But we can do better for special cases:
```python
def vertex_cover_tree(tree, root):
    """
    Exact minimum Vertex Cover on a tree, given as an adjacency-list
    dict and an arbitrary root.
    Dynamic programming: for each node, compute the best cover size
    for its subtree both with and without the node - linear time!
    """
    def dp(node, parent):
        # Returns (size if node is included, size if node is excluded)
        include, exclude = 1, 0
        for child in tree[node]:
            if child != parent:
                inc_c, exc_c = dp(child, node)
                include += min(inc_c, exc_c)  # child is free to choose
                exclude += inc_c              # child must be in the cover
        return include, exclude

    return min(dp(root, None))
```

16.5 Section 8.3: The Traveling Salesman Problem
16.5.1 TSP with Triangle Inequality
When distances satisfy the triangle inequality (direct routes are shortest), we can approximate!
Triangle Inequality: dist(A,C) ≤ dist(A,B) + dist(B,C)
This is true for:
- Euclidean distances (straight-line)
- Road networks (usually)
- Manhattan distances
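Before running the metric-TSP algorithms below, it is worth checking that a given distance matrix really satisfies the metric axioms. The helper `is_metric` here is an illustrative sketch, not part of the chapter’s API:

```python
def is_metric(dist):
    # Check zero diagonal, symmetry, and the triangle inequality for all triples.
    n = len(dist)
    for i in range(n):
        if dist[i][i] != 0:
            return False
        for j in range(n):
            if dist[i][j] != dist[j][i]:
                return False
            for k in range(n):
                if dist[i][k] > dist[i][j] + dist[j][k] + 1e-9:
                    return False
    return True

euclid = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]  # three points on a line: metric
broken = [[0, 1, 5], [1, 0, 1], [5, 1, 0]]  # d(0,2) > d(0,1) + d(1,2)
assert is_metric(euclid)
assert not is_metric(broken)
```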
16.5.2 The 2-Approximation Algorithm
Key Insight: Use Minimum Spanning Tree (MST)!
```python
import heapq

def tsp_2_approximation(graph):
    """
    2-approximation for metric TSP using an MST.
    Algorithm:
      1. Find an MST
      2. Do a DFS traversal of the MST
      3. Output the tour in traversal (preorder) order
    Time: O(V² log V) for a complete graph
    Approximation ratio: 2
    """
    def find_mst(graph):
        """Find an MST using Prim's algorithm."""
        n = len(graph)
        mst = [[] for _ in range(n)]
        visited = [False] * n
        min_heap = [(0, 0, -1)]  # (weight, node, parent)
        while min_heap:
            weight, u, parent = heapq.heappop(min_heap)
            if visited[u]:
                continue
            visited[u] = True
            if parent != -1:
                mst[parent].append(u)
                mst[u].append(parent)
            for v in range(n):
                if not visited[v]:
                    heapq.heappush(min_heap, (graph[u][v], v, u))
        return mst

    def dfs_traversal(mst, start=0):
        """Preorder DFS traversal of the MST."""
        visited = [False] * len(mst)
        tour = []

        def dfs(u):
            visited[u] = True
            tour.append(u)
            for v in mst[u]:
                if not visited[v]:
                    dfs(v)

        dfs(start)
        tour.append(start)  # Return to start
        return tour

    # Step 1: Find MST
    mst = find_mst(graph)
    # Step 2: DFS traversal
    tour = dfs_traversal(mst)
    return tour

# Example with cities
def create_distance_matrix(cities):
    """Create a distance matrix from city coordinates."""
    n = len(cities)
    dist = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dx = cities[i][0] - cities[j][0]
            dy = cities[i][1] - cities[j][1]
            dist[i][j] = (dx * dx + dy * dy) ** 0.5
    return dist

cities = [(0, 0), (1, 0), (1, 1), (0, 1)]  # Square
distances = create_distance_matrix(cities)
tour = tsp_2_approximation(distances)
print(f"TSP tour: {tour}")
# Output: [0, 1, 2, 3, 0]
```

16.5.3 Why This is a 2-Approximation
The Proof:
1. MST weight ≤ OPT
   - The optimal TSP tour minus one edge is a spanning tree
   - The MST is the minimum spanning tree
   - So: weight(MST) ≤ weight(OPT)
2. DFS traversal = 2 × MST
   - The DFS walk crosses each MST edge twice (down and back up)
   - Total: 2 × weight(MST)
3. The triangle inequality saves us
   - Shortcutting past repeated vertices never increases distance
   - Final tour ≤ DFS traversal
4. Therefore: tour ≤ 2 × MST ≤ 2 × OPT ✓
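The chain of inequalities can be sanity-checked on a tiny instance by comparing the MST-based tour against a brute-force optimum. The helper names below are my own for this self-contained sketch:

```python
from itertools import permutations
import heapq

def mst_adjacency(dist):
    # Prim's algorithm; returns adjacency lists of the MST.
    n = len(dist)
    adj = [[] for _ in range(n)]
    visited = [False] * n
    heap = [(0, 0, -1)]
    while heap:
        w, u, p = heapq.heappop(heap)
        if visited[u]:
            continue
        visited[u] = True
        if p >= 0:
            adj[p].append(u)
            adj[u].append(p)
        for v in range(n):
            if not visited[v]:
                heapq.heappush(heap, (dist[u][v], v, u))
    return adj

def approx_tour(dist):
    # Preorder DFS of the MST, then return to the start.
    adj = mst_adjacency(dist)
    seen, tour = set(), []
    def dfs(u):
        seen.add(u)
        tour.append(u)
        for v in adj[u]:
            if v not in seen:
                dfs(v)
    dfs(0)
    tour.append(0)
    return tour

def tour_length(dist, tour):
    return sum(dist[a][b] for a, b in zip(tour, tour[1:]))

def optimal_length(dist):
    # Brute force over all tours starting and ending at city 0.
    n = len(dist)
    return min(tour_length(dist, [0] + list(p) + [0])
               for p in permutations(range(1, n)))

pts = [(0, 0), (3, 0), (3, 2), (1, 3), (0, 2)]
dist = [[((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5 for bx, by in pts]
        for ax, ay in pts]
approx = tour_length(dist, approx_tour(dist))
opt = optimal_length(dist)
assert opt - 1e-9 <= approx <= 2 * opt + 1e-9
```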
16.5.4 Christofides Algorithm: 1.5-Approximation
We can do better with a clever trick!
```python
def christofides_tsp(graph):
    """
    1.5-approximation for metric TSP.
    Algorithm:
      1. Find an MST
      2. Find the odd-degree vertices of the MST
      3. Find a minimum-weight perfect matching on the odd vertices
      4. Combine MST + matching into an Eulerian multigraph
      5. Find an Eulerian tour
      6. Shortcut it to a Hamiltonian tour
    Time: O(V³)
    Approximation ratio: 1.5
    Note: reuses find_mst from the previous listing, assumed to be
    available at module level.
    """
    def find_odd_degree_vertices(mst):
        """Vertices with odd degree in the MST."""
        return [v for v in range(len(mst)) if len(mst[v]) % 2 == 1]

    def min_weight_matching(graph, vertices):
        """
        Minimum-weight perfect matching on `vertices`.
        Simplified greedy version - in practice use the Blossom
        algorithm (greedy alone does not preserve the 1.5 guarantee).
        """
        matching = []
        vertices = vertices.copy()
        while len(vertices) > 1:
            min_weight = float('inf')
            min_pair = None
            for i in range(len(vertices)):
                for j in range(i + 1, len(vertices)):
                    u, v = vertices[i], vertices[j]
                    if graph[u][v] < min_weight:
                        min_weight = graph[u][v]
                        min_pair = (i, j)
            i, j = min_pair
            matching.append((vertices[i], vertices[j]))
            # Remove the matched vertices
            vertices = [vertices[k] for k in range(len(vertices))
                        if k != i and k != j]
        return matching

    def find_eulerian_tour(multigraph):
        """Hierholzer's algorithm on a graph with all even degrees."""
        tour = []
        stack = [0]
        graph_copy = [edges.copy() for edges in multigraph]
        while stack:
            v = stack[-1]
            if graph_copy[v]:
                u = graph_copy[v].pop()
                graph_copy[u].remove(v)
                stack.append(u)
            else:
                tour.append(stack.pop())
        return tour[::-1]

    # Step 1: Find MST
    mst = find_mst(graph)
    # Step 2: Odd-degree vertices
    odd_vertices = find_odd_degree_vertices(mst)
    # Step 3: Matching on the odd vertices
    matching = min_weight_matching(graph, odd_vertices)
    # Step 4: Combine MST and matching into a multigraph
    multigraph = [edges.copy() for edges in mst]
    for u, v in matching:
        multigraph[u].append(v)
        multigraph[v].append(u)
    # Step 5: Eulerian tour
    eulerian = find_eulerian_tour(multigraph)
    # Step 6: Shortcut repeated vertices (uses the triangle inequality)
    visited = set()
    tour = []
    for v in eulerian:
        if v not in visited:
            tour.append(v)
            visited.add(v)
    tour.append(tour[0])  # Return to start
    return tour
```

Why 1.5-Approximation?
1. MST weight ≤ OPT (as before)
2. Matching weight ≤ OPT/2 (the optimal tour restricted to the odd-degree vertices splits into two disjoint matchings, the cheaper of which costs ≤ OPT/2)
3. Eulerian tour = MST + matching ≤ 1.5 × OPT
4. Shortcuts only help, so the final tour ≤ 1.5 × OPT ✓
16.6 Section 8.4: Set Cover and Greedy Algorithms
16.6.1 The Set Cover Problem
Problem: Given a universe of elements and collection of sets, find minimum number of sets that cover all elements.
Real-World Applications:
- Sensor placement (cover all areas)
- Feature selection in ML (cover all data characteristics)
- Committee formation (cover all skills)
16.6.2 The Greedy Algorithm
```python
def set_cover_greedy(universe, sets):
    """
    Greedy approximation for Set Cover.
    Algorithm: repeatedly pick the set covering the most uncovered elements.
    Time: O(|universe| × |sets| × |largest set|)
    Approximation ratio: ln(|universe|) + 1
    """
    covered = set()
    cover = []
    sets = [set(s) for s in sets]  # Copy to avoid modifying the input
    while covered != universe:
        # Find the set covering the most uncovered elements
        best_set_idx = -1
        best_count = 0
        for i, s in enumerate(sets):
            uncovered_count = len(s - covered)
            if uncovered_count > best_count:
                best_count = uncovered_count
                best_set_idx = i
        if best_set_idx == -1:
            return None  # Cannot cover the universe
        # Add the best set to the cover
        cover.append(best_set_idx)
        covered.update(sets[best_set_idx])
    return cover

# Example: skill coverage for team formation
universe = set(range(10))  # Skills 0-9
sets = [
    {0, 1, 2},        # Person A's skills
    {1, 3, 4, 5},     # Person B's skills
    {4, 5, 6, 7},     # Person C's skills
    {0, 6, 8, 9},     # Person D's skills
    {2, 3, 7, 8, 9},  # Person E's skills
]

cover = set_cover_greedy(universe, sets)
print(f"Selected sets: {cover}")
print(f"Number of sets: {len(cover)}")
# Output: [4, 1, 3] (persons E, B, D)
```

16.6.3 Why ln(n) Approximation?
The Analysis (Intuitive):

1. Each iteration covers a significant fraction
   - If the optimal solution uses OPT sets, some set must cover ≥ |uncovered|/OPT of the remaining elements
   - Greedy picks a set covering at least that many
2. Uncovered elements decrease geometrically
   - After t iterations, uncovered ≤ n × (1 - 1/OPT)^t
   - This shrinks like e^(-t/OPT)
3. Total iterations needed
   - About OPT × ln(n) iterations empty the universe
   - Each iteration adds one set
   - Total: O(OPT × ln n) sets
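The ln(n) bound is quite loose on small instances, which is easy to confirm empirically. This sketch re-implements the greedy compactly and brute-forces the optimum (helper names are my own):

```python
from itertools import combinations
from math import log

def greedy_cover(universe, sets):
    # Repeatedly take the set covering the most uncovered elements.
    covered, chosen = set(), []
    while covered != universe:
        best = max(range(len(sets)), key=lambda i: len(sets[i] - covered))
        if not sets[best] - covered:
            return None  # universe not coverable
        chosen.append(best)
        covered |= sets[best]
    return chosen

def optimal_cover_size(universe, sets):
    # Brute force: smallest number of sets whose union covers the universe.
    for k in range(1, len(sets) + 1):
        for combo in combinations(range(len(sets)), k):
            if set().union(*(sets[i] for i in combo)) >= universe:
                return k
    return None

universe = set(range(10))
sets = [{0, 1, 2}, {1, 3, 4, 5}, {4, 5, 6, 7}, {0, 6, 8, 9}, {2, 3, 7, 8, 9}]
g = greedy_cover(universe, sets)
opt = optimal_cover_size(universe, sets)
assert len(g) <= (log(len(universe)) + 1) * opt
print(len(g), opt)
```

Here greedy happens to match the optimum of 3, comfortably inside the (ln 10 + 1) ≈ 3.3 factor guarantee.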
16.6.4 The Weighted Version
```python
def weighted_set_cover_greedy(universe, sets, weights):
    """
    Greedy approximation for Weighted Set Cover.
    Algorithm: pick the set with the best cost-per-new-element ratio.
    Approximation ratio: ln(|universe|) + 1
    """
    covered = set()
    cover = []
    total_cost = 0
    while covered != universe:
        best_ratio = float('inf')
        best_idx = -1
        for i, s in enumerate(sets):
            uncovered = s - covered
            if uncovered:
                ratio = weights[i] / len(uncovered)
                if ratio < best_ratio:
                    best_ratio = ratio
                    best_idx = i
        if best_idx == -1:
            return None, float('inf')
        cover.append(best_idx)
        covered.update(sets[best_idx])
        total_cost += weights[best_idx]
    return cover, total_cost

# Example: minimize the cost of skill coverage
weights = [100, 150, 200, 180, 160]  # Salaries
cover, cost = weighted_set_cover_greedy(universe, sets, weights)
print(f"Selected people: {cover}")
print(f"Total cost: ${cost}")
```

16.6.5 When Greedy is Optimal: Matroids
For some problems, greedy gives OPTIMAL solutions!
Matroid Properties:
1. Hereditary: every subset of an independent set is independent
2. Exchange: a smaller independent set can always be extended with an element of a larger one

Examples where greedy is optimal:
- Maximum-weight spanning tree (Kruskal’s algorithm)
- Maximum-weight independent set in a matroid
- Task scheduling with deadlines
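As a concrete instance of greedy optimality on a matroid (here, the graphic matroid), a maximum-weight spanning tree falls out of Kruskal’s algorithm with the sort order reversed. A minimal sketch with a simple union-find:

```python
def max_spanning_tree(n, edges):
    # Kruskal on the graphic matroid: sort by weight descending and
    # keep an edge whenever it joins two different components.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    chosen = []
    for w, u, v in sorted(edges, reverse=True):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            chosen.append((w, u, v))
    return chosen

edges = [(4, 0, 1), (3, 1, 2), (2, 0, 2), (5, 2, 3), (1, 1, 3)]
tree = max_spanning_tree(4, edges)
assert len(tree) == 3                          # spanning tree of 4 vertices
assert sum(w for w, _, _ in tree) == 12        # weights 5 + 4 + 3
```

Because the independent sets (forests) form a matroid, the greedy choice is provably optimal here, not merely approximate.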
16.7 Section 8.5: Randomized Approximation
16.7.1 The Power of Random Choices
Sometimes flipping coins gives great approximations!
16.7.2 MAX-CUT: A Simple Randomized Algorithm
Problem: Partition vertices to maximize edges between partitions.
```python
import random

def max_cut_random(graph):
    """
    Randomized 0.5-approximation for MAX-CUT.
    Algorithm: randomly assign each vertex to partition A or B.
    Expected approximation: 0.5
    Amazing fact: this trivial algorithm is hard to beat!
    """
    partition_A = set()
    partition_B = set()
    # Randomly partition the vertices
    for v in graph.vertices:
        if random.random() < 0.5:
            partition_A.add(v)
        else:
            partition_B.add(v)
    # Count the edges in the cut
    cut_size = sum(1 for u, v in graph.edges
                   if (u in partition_A) != (v in partition_A))
    return partition_A, partition_B, cut_size

def max_cut_derandomized(graph):
    """
    Derandomized version using conditional expectations.
    Guaranteed 0.5-approximation (not just in expectation).
    """
    partition_A = set()
    partition_B = set()
    for v in graph.vertices:
        # Expected cut contribution of v's edges for each choice
        cut_if_A = 0
        cut_if_B = 0
        for u, w in graph.edges:
            if v in (u, w):
                other = w if u == v else u
                if other in partition_B:
                    cut_if_A += 1
                elif other in partition_A:
                    cut_if_B += 1
                else:
                    # Other endpoint not yet assigned
                    cut_if_A += 0.5  # Expected value
                    cut_if_B += 0.5
        # Choose the side giving the larger expected cut
        if cut_if_A >= cut_if_B:
            partition_A.add(v)
        else:
            partition_B.add(v)
    # Count the actual cut size
    cut_size = sum(1 for u, v in graph.edges
                   if (u in partition_A) != (v in partition_A))
    return partition_A, partition_B, cut_size
```

Why 0.5-Approximation?
For each edge (u, v):
- Probability that u and v land in different partitions = 0.5
- Expected number of edges in the cut = 0.5 × |E|
- Maximum possible cut ≤ |E|
- Therefore: expected cut ≥ 0.5 × MAX-CUT ✓
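The expectation computed above can be verified exhaustively on a small graph by averaging the cut size over all 2^n assignments:

```python
from itertools import product

# Each edge is cut in exactly half of all assignments, so the average
# cut size over every partition must come out to exactly |E| / 2.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]  # 4-cycle plus a chord
n = 4
total = sum(sum(1 for u, v in edges if a[u] != a[v])
            for a in product([0, 1], repeat=n))
expected = total / 2 ** n
assert expected == len(edges) / 2
print(expected)
```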
16.7.3 MAX-SAT: Randomized Rounding
```python
import random

def max_sat_random(clauses, num_vars):
    """
    Randomized approximation for MAX-SAT.
    Clauses are lists of (variable index, is_positive) literals.
    For k-SAT (every clause of size k):
      Expected approximation: 1 - 1/2^k
    For 3-SAT: 7/8-approximation in expectation (87.5% of optimal!)
    """
    # Random assignment
    assignment = [random.choice([True, False]) for _ in range(num_vars)]
    # Count satisfied clauses
    satisfied = 0
    for clause in clauses:
        # A clause is satisfied if at least one literal is true
        for var, is_positive in clause:
            if assignment[var] == is_positive:
                satisfied += 1
                break
    return assignment, satisfied

def max_sat_lp_rounding(clauses, num_vars):
    """
    Better approximation using LP relaxation and randomized rounding.
      1. Solve the LP relaxation (fractional assignment)
      2. Set each variable True with its LP probability
    Approximation: 1 - 1/e ≈ 0.632 for general SAT
    For demonstration the LP step is faked with random fractional
    values - in practice, solve the actual LP.
    """
    # Pretend we solved the LP and obtained fractional values
    lp_solution = [random.random() for _ in range(num_vars)]
    # Round probabilistically
    assignment = [random.random() < prob for prob in lp_solution]
    satisfied = 0
    for clause in clauses:
        for var, is_positive in clause:
            if assignment[var] == is_positive:
                satisfied += 1
                break
    return assignment, satisfied
```

16.7.4 The Method of Conditional Expectations
We can derandomize many randomized algorithms!
The Idea:
1. Instead of random choices, make deterministic choices
2. At each step, pick the option maximizing the conditional expectation of the outcome
3. The final result is at least as good as the expected value of the randomized algorithm
```python
def derandomize_vertex_cover(graph):
    """
    Illustration of the derandomization pattern on Vertex Cover.
    Thought experiment: include each vertex with probability 0.5,
    then replace each coin flip with the choice that minimizes the
    expected number of edges left uncovered.
    (Note: the coin-flip process by itself does not guarantee a valid
    cover, so this demonstrates the technique rather than replacing
    the 2-approximation above.)
    """
    vertices = list(graph.vertices)
    cover = set()
    for idx, v in enumerate(vertices):
        undecided = set(vertices[idx + 1:])
        # Expected number of uncovered edges for each choice
        uncovered_if_include = 0
        uncovered_if_exclude = 0
        for u, w in graph.edges:
            if v in (u, w):
                other = w if v == u else u
                if other in cover:
                    continue  # edge already covered either way
                # Probability the other endpoint ends up excluded
                p_other = 0.5 if other in undecided else 1.0
                uncovered_if_exclude += p_other
                # uncovered_if_include += 0 (edge covered by v)
            else:
                if u in cover or w in cover:
                    continue  # edge already covered
                p_u = 0.5 if u in undecided else 1.0
                p_w = 0.5 if w in undecided else 1.0
                uncovered_if_include += p_u * p_w
                uncovered_if_exclude += p_u * p_w
        # Choose the option with fewer expected uncovered edges
        if uncovered_if_include <= uncovered_if_exclude:
            cover.add(v)
    return cover
```

16.8 Section 8.6: Linear Programming Relaxation
16.8.1 The Power of Relaxation
Many discrete optimization problems become easy when we relax integrality constraints!
16.8.2 Vertex Cover via LP Relaxation
```python
def vertex_cover_lp_relaxation(graph):
    """
    LP relaxation approach for Vertex Cover.
      1. Formulate as an Integer Linear Program (ILP)
      2. Relax to a Linear Program (LP)
      3. Solve the LP (polynomial time)
      4. Round the fractional solution
    Approximation ratio: 2

    ILP formulation:
      Minimize:   sum of x_v
      Subject to: x_u + x_v >= 1 for each edge (u, v)
                  x_v in {0, 1} for each vertex v
    LP relaxation: the same, but x_v in [0, 1].
    Rounding: include v whenever x_v >= 0.5. Every edge constraint
    forces max(x_u, x_v) >= 0.5, so the result is a cover, and its
    size is at most 2 x the LP optimum <= 2 x OPT.
    """
    # For demonstration we use the trivially feasible solution
    # x_v = 0.5 for all v; in practice, solve the LP with a solver
    # such as scipy.optimize.linprog.
    lp_solution = {v: 0.5 for v in graph.vertices}
    return {v for v in graph.vertices if lp_solution[v] >= 0.5}

def vertex_cover_primal_dual(graph):
    """
    Primal-dual approach for Vertex Cover.
    Raises a dual variable for each uncovered edge and adds a vertex
    once its dual constraint becomes tight.
    """
    cover = set()
    dual_values = {}  # Dual variable per edge
    for u, v in graph.edges:
        if u not in cover and v not in cover:
            # Raise the dual variable for this edge
            dual_values[(u, v)] = 1
            # Add vertices whose dual constraint is now tight
            u_dual_sum = sum(val for edge, val in dual_values.items()
                             if u in edge)
            v_dual_sum = sum(val for edge, val in dual_values.items()
                             if v in edge)
            if u_dual_sum >= 1:
                cover.add(u)
            if v_dual_sum >= 1:
                cover.add(v)
    return cover
```

16.8.3 Set Cover via LP Relaxation
```python
import numpy as np
from scipy.optimize import linprog

def set_cover_lp(universe, sets, weights=None):
    """
    LP relaxation for Weighted Set Cover.
    Rounding at threshold 1/f, where f is the maximum number of sets
    any element appears in, yields an f-approximation - often better
    than ln(n) in practice.
    """
    universe = list(universe)  # fix an element order
    n_sets = len(sets)
    n_elements = len(universe)
    if weights is None:
        weights = [1] * n_sets
    # Constraint matrix: A[i][j] = 1 if element i is in set j
    A = np.zeros((n_elements, n_sets))
    for j, s in enumerate(sets):
        for i, elem in enumerate(universe):
            if elem in s:
                A[i][j] = 1
    # Solve LP: minimize c^T x subject to Ax >= 1, 0 <= x <= 1
    result = linprog(
        c=weights,
        A_ub=-A,  # Convert >= to <=
        b_ub=-np.ones(n_elements),
        bounds=[(0, 1)] * n_sets,
        method='highs'
    )
    if not result.success:
        return None
    # Round: include set j whenever x_j >= 1/f
    max_frequency = max(sum(1 for s in sets if elem in s)
                        for elem in universe)
    threshold = 1 / max_frequency
    return [j for j, x in enumerate(result.x) if x >= threshold - 1e-9]
```

16.8.4 The Integrality Gap
The integrality gap measures how much we can lose by relaxing:

Integrality Gap = (optimal integer solution) / (optimal fractional solution), taken over the worst-case instance (for a minimization problem)

Examples:
- Vertex Cover: gap = 2
- Set Cover: gap = Θ(ln n)
- General TSP: the gap can be arbitrarily large!

The gap tells us the best ratio any rounding of that particular LP can hope to achieve.
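The Vertex Cover gap of 2 can be seen directly on complete graphs: setting x_v = 1/2 for every vertex is LP-feasible with value n/2, while any integral cover of K_n must take n − 1 vertices. A small self-contained demonstration:

```python
from itertools import combinations

def integer_vc_optimum(n):
    # Brute-force minimum vertex cover of the complete graph K_n.
    edges = list(combinations(range(n), 2))
    for k in range(n + 1):
        for subset in combinations(range(n), k):
            s = set(subset)
            if all(u in s or v in s for u, v in edges):
                return k

n = 6
frac_value = n / 2                  # x_v = 1/2 satisfies x_u + x_v >= 1 on every edge
int_value = integer_vc_optimum(n)   # leaving out 2 vertices leaves an edge uncovered
ratio = int_value / frac_value
assert int_value == n - 1
print(ratio)  # 2(n-1)/n, which approaches the gap of 2 as n grows
```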
16.9 Section 8.7: Approximation Schemes
16.9.1 PTAS: Polynomial Time Approximation Scheme
Get arbitrarily close to optimal, trading time for accuracy!
16.9.2 Knapsack: A Classic FPTAS
```python
def knapsack_fptas(weights, values, capacity, epsilon=0.1):
    """
    FPTAS for 0/1 Knapsack.
    Achieves a (1 + epsilon)-approximation in time O(n³/epsilon).
    Algorithm:
      1. Scale values down by K = epsilon * max_value / n
      2. Solve the scaled problem exactly with DP over value,
         where dp[v] = minimum weight achieving scaled value v
      3. The best reachable value is approximately optimal
    """
    n = len(weights)
    K = epsilon * max(values) / n
    scaled = [int(v / K) for v in values]
    total = sum(scaled)
    # dp[v] = minimum weight needed to reach scaled value v
    dp = [0] + [float('inf')] * total
    chosen = [set()] + [None] * total  # item set realizing dp[v]
    for i in range(n):
        # Traverse values in reverse so each item is used at most once
        for v in range(total, scaled[i] - 1, -1):
            candidate = dp[v - scaled[i]] + weights[i]
            if candidate < dp[v]:
                dp[v] = candidate
                chosen[v] = chosen[v - scaled[i]] | {i}
    # Best scaled value reachable within the capacity
    best_v = max(v for v in range(total + 1) if dp[v] <= capacity)
    selected = sorted(chosen[best_v])
    actual_value = sum(values[i] for i in selected)
    return selected, actual_value

# Example
weights = [10, 20, 30, 40]
values = [60, 100, 120, 240]
capacity = 50

for epsilon in [0.5, 0.1, 0.01]:
    items, value = knapsack_fptas(weights, values, capacity, epsilon)
    print(f"ε={epsilon}: Value={value}, Items={items}")
```

Why This Works:
- Scaling preserves the relative order of subset values (up to rounding)
- Rounding error per item ≤ K
- Total error ≤ n × K = ε × max_value
- Since OPT ≥ max_value, the relative error is at most ε, giving a ratio ≤ (1 + ε)
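The guarantee can be checked against brute force on the chapter’s example. The sketch below uses a compact re-implementation of the same scaled DP (my own helper names) so the check is self-contained:

```python
from itertools import combinations

def fptas_value(weights, values, capacity, eps):
    # Compact FPTAS: scale values by K, then DP where dp[v] is the
    # minimum weight achieving scaled value v.
    n = len(values)
    K = eps * max(values) / n
    sv = [int(v / K) for v in values]
    total = sum(sv)
    dp = [0] + [float('inf')] * total
    best = [0] * (total + 1)  # actual (unscaled) value behind dp[v]
    for i in range(n):
        for v in range(total, sv[i] - 1, -1):  # reverse: each item once
            cand = dp[v - sv[i]] + weights[i]
            if cand < dp[v]:
                dp[v] = cand
                best[v] = best[v - sv[i]] + values[i]
    return max(best[v] for v in range(total + 1) if dp[v] <= capacity)

def brute_force(weights, values, capacity):
    # Exact optimum by enumerating all subsets.
    top = 0
    for k in range(len(values) + 1):
        for combo in combinations(range(len(values)), k):
            if sum(weights[i] for i in combo) <= capacity:
                top = max(top, sum(values[i] for i in combo))
    return top

weights, values, capacity = [10, 20, 30, 40], [60, 100, 120, 240], 50
opt = brute_force(weights, values, capacity)
for eps in (0.5, 0.1):
    assert fptas_value(weights, values, capacity, eps) >= (1 - eps) * opt
print(opt)
```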
16.9.3 Euclidean TSP: A PTAS
```python
def euclidean_tsp_ptas(points, epsilon=0.1):
    """
    PTAS for Euclidean TSP using geometric decomposition.
    Heavily simplified sketch of Arora's algorithm.
    Time: O(n × (log n)^(O(1/epsilon)))
    Note: tsp_exact_small and combine_tours are placeholders for the
    base-case solver and the tour-stitching step, which is where the
    real algorithm's complexity lives.
    """
    def divide_and_conquer(points, depth, max_depth):
        """Recursively partition the plane and solve subproblems."""
        if len(points) <= 3 or depth >= max_depth:
            # Base case: solve the small instance exactly
            return tsp_exact_small(points)
        # Partition into quadrants around the median coordinates
        mid_x = sorted(p[0] for p in points)[len(points) // 2]
        mid_y = sorted(p[1] for p in points)[len(points) // 2]
        quadrants = [[], [], [], []]
        for p in points:
            if p[0] <= mid_x and p[1] <= mid_y:
                quadrants[0].append(p)
            elif p[0] > mid_x and p[1] <= mid_y:
                quadrants[1].append(p)
            elif p[0] <= mid_x and p[1] > mid_y:
                quadrants[2].append(p)
            else:
                quadrants[3].append(p)
        # Solve each quadrant
        tours = []
        for quad in quadrants:
            if quad:
                tours.append(divide_and_conquer(quad, depth + 1, max_depth))
        # Combine tours (simplified - the real algorithm is intricate)
        return combine_tours(tours)

    # Recursion depth controlled by epsilon
    max_depth = int(1 / epsilon)
    return divide_and_conquer(points, 0, max_depth)
```

16.9.4 When PTAS Exists
Problems admitting a PTAS often have:
1. Geometric structure (Euclidean space)
2. Bounded treewidth (trees, planar-like graphs)
3. A fixed parameter (k-center for fixed k)

Problems usually NOT admitting a PTAS:
1. Unstructured general graphs
2. Strongly NP-hard problems (no FPTAS unless P = NP)
3. Problems with large integrality gaps
16.10 Section 8.8: Hardness of Approximation
16.10.1 Some Problems Resist Approximation
Not all NP-hard problems can be approximated!
16.10.2 Inapproximability Results
```python
def why_general_tsp_is_hard():
    """Proof sketch: general TSP cannot be approximated."""
    explanation = """
    Theorem: Unless P = NP, no polynomial-time algorithm can
    approximate general TSP within ANY constant factor.

    Proof idea:
    1. Suppose we have an α-approximation for TSP
    2. Given a Hamiltonian Cycle instance G, create a TSP instance with:
       - distance 1 for edges in G
       - distance α×n + 1 for non-edges
    3. If G has a Hamiltonian cycle:
       - Optimal TSP tour = n
       - The algorithm returns a tour ≤ α×n
    4. If G has no Hamiltonian cycle:
       - Every tour uses a non-edge, so optimal TSP > α×n
       - The algorithm returns a tour > α×n
    5. Comparing the answer to α×n decides Hamiltonian Cycle!
    6. But Hamiltonian Cycle is NP-complete
    7. Therefore, no such approximation exists
    """
    return explanation

def gap_preserving_reductions():
    """How we prove hardness of approximation."""
    explanation = """
    Gap-Preserving Reduction:
    Transform problem A into problem B such that:
      - YES instance of A → OPT(B) ≥ c
      - NO instance of A  → OPT(B) < c/α
    This creates a "gap" that any α-approximation would have to
    distinguish.

    Example: proving MAX-3SAT is hard to approximate:
    1. Start with 3SAT (NP-complete)
    2. Create a MAX-3SAT instance (via the PCP machinery)
    3. Satisfiable → all clauses can be satisfied
    4. Unsatisfiable → no assignment satisfies more than a 7/8 + ε fraction
    5. The gap of 1 vs. 7/8 + ε means:
    6. No polynomial algorithm can approximate better than 7/8 + ε
    """
    return explanation
```

16.10.3 The PCP Theorem
The most important result in hardness of approximation:
```python
def pcp_theorem():
    """The PCP (Probabilistically Checkable Proofs) Theorem."""
    explanation = """
    PCP Theorem: NP = PCP(log n, 1)

    In English: every NP problem has proofs that can be verified by
      - reading only O(log n) random bits
      - examining only O(1) bits of the proof
      - accepting correct proofs with probability 1
      - rejecting incorrect proofs with probability ≥ 1/2

    Implications:
    1. MAX-3SAT cannot be approximated better than 7/8
    2. MAX-CLIQUE cannot be approximated within n^(1-ε)
    3. Set Cover cannot be approximated better than ln n

    The PCP theorem revolutionized our understanding of approximation!
    """
    return explanation
```

16.10.4 APX-Completeness
Some problems are “hardest to approximate”:
```python
class APXComplete:
    """
    Problems complete for APX: approximable within some constant
    factor, but with no PTAS unless P = NP.
    """
    PROBLEMS = [
        "MAX-3SAT",
        "Vertex Cover",
        "MAX-CUT",
        "Metric TSP",
        "Bin Packing",
    ]

    def implications(self):
        """What APX-completeness means."""
        return """
        If any APX-complete problem has a PTAS, then ALL problems
        in APX do!

        This is considered unlikely because:
        - It would imply a PTAS for problems studied for decades
        - No such progress despite enormous effort
        - It would collapse approximation complexity classes

        APX-complete = the "Goldilocks zone" of approximation:
        - Not too easy (no PTAS)
        - Not too hard (some constant-factor algorithm exists)
        - Just right (a constant factor, but nothing better)
        """
```

16.11 Chapter 8: Practical Implementation Guide
16.11.1 A Complete Approximation Algorithm Toolkit
class ApproximationToolkit:
    """
    Ready-to-use approximation algorithms for common problems.
    """
    def __init__(self):
        self.algorithms = {
            'vertex_cover': {
                'simple': self.vertex_cover_simple,
                'matching': self.vertex_cover_matching,
                'lp': self.vertex_cover_lp
            },
            'set_cover': {
                'greedy': self.set_cover_greedy,
                'lp': self.set_cover_lp
            },
            'tsp': {
                'mst': self.tsp_mst,
                'christofides': self.tsp_christofides
            },
            'max_cut': {
                'random': self.max_cut_random,
                'sdp': self.max_cut_sdp
            }
        }

    def solve(self, problem, instance, method='best'):
        """
        Solve the problem with the specified method, or pick the best one.
        """
        if method == 'best':
            # Choose based on instance characteristics
            method = self.choose_best_method(problem, instance)
        return self.algorithms[problem][method](instance)

    def choose_best_method(self, problem, instance):
        """
        Heuristic to choose the best algorithm for an instance.
        """
        if problem == 'vertex_cover':
            # Use LP for dense graphs, matching for sparse
            density = len(instance.edges) / (len(instance.vertices) ** 2)
            return 'lp' if density > 0.3 else 'matching'
        elif problem == 'tsp':
            # Use Christofides when distances satisfy the triangle inequality
            if self.is_metric(instance):
                return 'christofides'
            return 'mst'
        # Default: first registered method for this problem
        return list(self.algorithms[problem].keys())[0]

    def analyze_performance(self, problem, instance, method):
        """
        Analyze algorithm performance on an instance.
        """
        import time
        start = time.time()
        solution = self.solve(problem, instance, method)
        runtime = time.time() - start

        # Calculate the empirical approximation ratio (if the optimum is known)
        ratio = None
        if hasattr(instance, 'optimal'):
            if problem in ['vertex_cover', 'set_cover', 'tsp']:
                # Minimization: ratio >= 1
                ratio = len(solution) / instance.optimal
            else:
                # Maximization: ratio >= 1
                ratio = instance.optimal / len(solution)
        return {
            'solution': solution,
            'runtime': runtime,
            'approximation_ratio': ratio,
            'theoretical_guarantee': self.get_guarantee(problem, method)
        }

    def get_guarantee(self, problem, method):
        """
        Return the theoretical approximation guarantee.
        """
        guarantees = {
            'vertex_cover': {'simple': 2, 'matching': 2, 'lp': 2},
            'set_cover': {'greedy': 'ln(n)', 'lp': 'f (max element frequency)'},
            'tsp': {'mst': 2, 'christofides': 1.5},
            'max_cut': {'random': 0.5, 'sdp': 0.878}
        }
        return guarantees.get(problem, {}).get(method, 'Unknown')

16.12 Chapter 8 Exercises
16.12.1 Conceptual Understanding
8.1 Approximation Ratios For each algorithm, determine its approximation ratio:
- Always pick the largest available item for bin packing
- Color vertices greedily with minimum available color
- For MAX-SAT, set each variable to satisfy the majority of its clauses
- For facility location, open facility at each client location
8.2 Hardness of Approximation Prove that these problems are hard to approximate:
- General TSP (any constant factor)
- Graph coloring (within n^(1-ε))
- Maximum independent set (within n^(1-ε))
8.3 Algorithm Design Design approximation algorithms for:
- Minimum dominating set in graphs
- Maximum weight matching
- Minimum feedback vertex set
- k-median clustering
16.12.2 Implementation Problems
8.4 Implement Core Algorithms
def implement_core_approximations():
    """Implement these essential approximation algorithms."""

    def weighted_vertex_cover(graph, weights):
        """2-approximation for weighted vertex cover."""
        pass

    def max_3sat_random(formula):
        """7/8-approximation for MAX-3SAT."""
        pass

    def bin_packing_first_fit(items, bin_size):
        """First-fit algorithm for bin packing."""
        pass

    def k_center_greedy(points, k):
        """2-approximation for k-center clustering."""
        pass

8.5 Advanced Techniques
def advanced_approximations():
    """Implement advanced approximation techniques."""

    def primal_dual_set_cover(universe, sets):
        """Primal-dual approach for set cover."""
        pass

    def sdp_max_cut(graph):
        """SDP relaxation for MAX-CUT."""
        pass

    def local_search_k_median(points, k):
        """Local search (5+ε)-approximation."""
        pass

16.12.3 Analysis Problems
8.6 Prove Approximation Ratios Prove the approximation ratio for:
- First-fit decreasing for bin packing (11/9 OPT + 6/9)
- Greedy set cover (ln n + 1)
- Random assignment for MAX-CUT (0.5)
- MST-based TSP (2.0)
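Before proving the MAX-CUT bound, it helps to see it empirically: a uniformly random assignment cuts each edge with probability 1/2, so the expected cut is |E|/2 ≥ OPT/2. A minimal sketch (the triangle graph and trial count are illustrative choices, not from the exercises):

```python
import random

def random_cut_value(edges, vertices, trials=20000, seed=1):
    """Average cut value over uniformly random vertex assignments."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        # Each vertex independently joins one side with probability 1/2
        side = {v: rng.random() < 0.5 for v in vertices}
        # An edge is cut when its endpoints land on different sides
        total += sum(1 for u, v in edges if side[u] != side[v])
    return total / trials

# Triangle: 3 edges, max cut = 2, expected random cut = 3/2
triangle = [(0, 1), (1, 2), (0, 2)]
print(random_cut_value(triangle, [0, 1, 2]))  # close to 1.5
```

Running this shows the empirical average hovering near 1.5, i.e. 0.75 × OPT here, comfortably above the 0.5 worst-case guarantee.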
8.7 Compare Algorithms Experimentally compare:
- Different vertex cover algorithms on random graphs
- Greedy vs LP rounding for set cover
- Various TSP approximations on Euclidean instances
- Randomized vs deterministic MAX-CUT
8.8 Real-World Applications Apply approximation algorithms to:
- Amazon delivery route optimization
- Cell tower placement for coverage
- Course scheduling minimizing conflicts
- Data center task allocation
16.13 Chapter 8 Summary
16.13.1 Key Takeaways
- Approximation Guarantees Matter
- Not just heuristics—provable quality bounds
- Know exactly how far from optimal you might be
- Different guarantees: constant, logarithmic, PTAS
- Standard Techniques
- Greedy: Simple, often optimal for special structures
- LP Relaxation: Powerful, good bounds
- Randomization: Surprisingly effective
- Local Search: Practical, good empirical performance
- Problem-Specific Insights
- Vertex Cover: Any maximal matching gives 2-approx
- TSP: Metric property enables approximation
- Set Cover: Greedy is nearly optimal
- MAX-CUT: Random is hard to beat!
- Hardness Results
- Some problems resist approximation
- PCP theorem revolutionized the field
- Knowing limits prevents wasted effort
- Practical Considerations
- Approximation algorithms used everywhere
- Often perform better than worst-case guarantee
- Can combine with heuristics for better results
- Speed vs quality trade-off is controllable
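The vertex-cover takeaway above is short enough to state in full; a minimal sketch of the maximal-matching 2-approximation (the function name and example graph are mine, not from the chapter's toolkit):

```python
def vertex_cover_via_matching(edges):
    """2-approximation: take both endpoints of a greedily built maximal matching."""
    cover = set()
    for u, v in edges:
        # Add both endpoints of any edge not yet covered;
        # these edges form a matching, and OPT needs one endpoint per matched edge
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover

# Star graph: optimal cover is {0}; the algorithm returns 2 vertices
edges = [(0, 1), (0, 2), (0, 3), (0, 4)]
print(vertex_cover_via_matching(edges))  # {0, 1} — size 2 <= 2 * OPT
```

Every edge touches the cover (otherwise it would have been added), and the matched edges are disjoint, so any optimal cover has at least half as many vertices.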
16.13.2 Decision Framework
When facing an NP-hard optimization problem:
- Check for approximation algorithms
- Look for existing results
- Consider problem structure
- Choose your approach
- Need guarantee? → Approximation algorithm
- Need speed? → Simple greedy
- Need quality? → LP relaxation or PTAS
- Instance-specific? → Heuristics
- Implement and evaluate
- Start simple (greedy)
- Measure actual performance
- Refine based on results
- Know the limits
- Check hardness results
- Don’t seek impossible guarantees
- Focus effort where it matters
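"Start simple (greedy)" is often just a few lines. A minimal greedy set cover sketch — the (ln n + 1)-approximation from the summary — with an illustrative universe and subsets of my own choosing:

```python
def greedy_set_cover(universe, subsets):
    """Repeatedly pick the subset covering the most uncovered elements."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Best subset = one covering the most still-uncovered elements
        best = max(subsets, key=lambda s: len(uncovered & s))
        if not uncovered & best:
            raise ValueError("universe not coverable by the given subsets")
        chosen.append(best)
        uncovered -= best
    return chosen

universe = {1, 2, 3, 4, 5}
subsets = [{1, 2, 3}, {2, 4}, {3, 4}, {4, 5}]
print(greedy_set_cover(universe, subsets))  # [{1, 2, 3}, {4, 5}]
```

Measuring this against an exact solver on small instances (step 3 of the framework) typically shows greedy landing far inside its ln n worst-case bound.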
16.13.3 The Art of Approximation
Approximation algorithms represent a beautiful compromise between theory and practice:
- Theory: Rigorous guarantees, worst-case analysis
- Practice: Fast algorithms, good solutions
- Together: Practical algorithms with confidence
16.13.4 Looking Forward
The field of approximation algorithms continues to evolve:
- Improved bounds for classic problems
- New techniques (SDP, metric embeddings)
- Practical implementations beating guarantees
- Machine learning guiding algorithm choice
16.13.5 Next Chapter Preview
In Chapter 9, we explore Advanced Graph Algorithms, where we’ll use our approximation techniques alongside exact algorithms to solve complex network problems!
16.13.6 Final Thought
“In the real world, a bird in the hand is worth two in the bush. In computer science, we can prove it’s worth at least half a bird in the bush—and that’s often good enough!”
Approximation algorithms teach us that perfection is not always necessary or even desirable. By accepting solutions that are provably close to optimal, we can solve problems that would otherwise be impossible. This is not giving up—it’s strategic compromise with mathematical backing.
Master approximation algorithms, and you’ll never be stuck waiting for the perfect solution when a great one is available now!