CS 630 · Fall 2023 · Boston University
Graduate Algorithms — Gale-Shapley & Reservoir Sampling
Average-case analysis of Gale-Shapley stable matching by exhaustive enumeration over 4!⁴ = 331,776 preference instances. Plus reservoir sampling and assorted randomized algorithms.
● What I built
- Gale-Shapley implementation with O(1) preference-rank lookup table.
- Exhaustive enumeration of 331,776 instances → empirical proposal-count distribution.
- Reservoir sampling (Algorithm R) — uniform sample from a stream of unknown length.
- Worked off CLRS and Kleinberg/Tardos.
● Stack
- Python: itertools for the enumeration, random for sampling, NumPy for summary statistics.
The Gale-Shapley algorithm is one of the few things in computer science where "simple, elegant, and optimal" actually align. It solves the stable marriage problem in O(n²) time, always converges, and the algorithm is so clean you can code it in 20 lines.
But the interesting part isn't the code—it's the math. In CS 630 (Advanced Algorithms), I ran an exhaustive enumeration over all 331,776 problem instances with 4 men and 4 women to understand proposal distributions. Then I built a sampling algorithm to measure this on streaming data. Both pieces taught me that theory and engineering live in different worlds.
The stable marriage problem
n men and n women, each with a total preference ordering of the opposite sex. A matching is stable if no man-woman pair would both prefer each other over their current partners—no "blocking pairs."
The Gale-Shapley algorithm (Gale & Shapley 1962) is:
- Men propose in order of their preference lists
- Each woman tentatively accepts the best man she's seen so far
- If a woman gets a better proposal later, she dumps her current partner (who re-enters the market)
- Repeat until all men are matched
The correctness argument rests on two invariants:
- Proposals never repeat. Each man works down his list in order, so if man M proposes to woman W, he never proposes to her again.
- Women never become unmatched. Once a woman receives any proposal, she has a partner for the rest of the run; she only ever trades up. Together these imply every man marries: a man could only finish unmatched by being rejected by all n women, which would mean all n women are matched, a contradiction.
The number of proposals is always at least n (in the best case every first proposal is accepted, e.g. when man i ranks woman i first) and at most n² (each man proposes to each woman at most once).
O(1) preference-rank lookup
The critical optimization: instead of searching a woman's preference list every time a proposal arrives, precompute a rank matrix:
preference_rank = [[0] * n for _ in range(n)]
for woman in range(n):
    for rank, man in enumerate(women_preferences[woman]):
        preference_rank[woman][man] = rank  # lower rank = more preferred
Now, when woman W receives a proposal from M:
if preference_rank[woman][man] < preference_rank[woman][current_partner]:
    # Accept M; current_partner re-enters the pool of free men
    matching[current_partner] = -1
    free_men.append(current_partner)
    matching[man] = woman
    break
This is O(1) per proposal instead of O(n). Over n² proposals, that's the difference between O(n²) and O(n³).
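Putting the fragments together, here is a minimal sketch of the whole men-proposing loop. This is a reconstruction consistent with the snippets above, not the verbatim homework code; it returns the proposal count, which the enumeration below depends on.

from collections import deque

def gale_shapley(men_preferences, women_preferences):
    """Men-proposing Gale-Shapley; returns the total number of proposals."""
    n = len(men_preferences)
    # O(1) rank lookup: preference_rank[w][m] = position of man m in w's list.
    preference_rank = [[0] * n for _ in range(n)]
    for woman in range(n):
        for rank, man in enumerate(women_preferences[woman]):
            preference_rank[woman][man] = rank
    next_choice = [0] * n     # next index in each man's preference list
    woman_partner = [-1] * n  # woman -> current partner, -1 if unmatched
    free_men = deque(range(n))
    proposals = 0
    while free_men:
        man = free_men.popleft()
        woman = men_preferences[man][next_choice[man]]
        next_choice[man] += 1
        proposals += 1
        current = woman_partner[woman]
        if current == -1:
            woman_partner[woman] = man    # first proposal: accept tentatively
        elif preference_rank[woman][man] < preference_rank[woman][current]:
            woman_partner[woman] = man    # trade up
            free_men.append(current)      # dumped partner re-enters the market
        else:
            free_men.append(man)          # rejected; will try his next choice
    return proposals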
The exhaustive enumeration
With 4 agents on each side, there are 4! = 24 possible preference orderings per person. The women's preferences are fixed; we iterate over all 24^4 = 331,776 combinations of men's preferences:
import itertools

def instance_generator(women_preferences, n=4):
    counts = {}  # proposal count -> number of instances
    all_prefs = list(itertools.permutations(range(n)))
    for p1 in all_prefs:
        for p2 in all_prefs:
            for p3 in all_prefs:
                for p4 in all_prefs:
                    men_preferences = [list(p1), list(p2), list(p3), list(p4)]
                    proposals = gale_shapley(men_preferences, women_preferences)
                    counts[proposals] = counts.get(proposals, 0) + 1
    average = sum(p * c for p, c in counts.items()) / sum(counts.values())
    return counts, average
Running this gives you a distribution of proposal counts. The minimum is always n = 4 (every man proposes exactly once). The maximum is typically 12–14 out of 16 possible. The average is around 8–9.
This is not obvious from theory alone. The worst-case bound says n² = 16 proposals, but typical instances are much gentler.
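As a usage sketch (the women's preference lists here are arbitrary placeholders, not the homework's fixed orderings):

women_preferences = [
    [0, 1, 2, 3],
    [1, 0, 3, 2],
    [2, 3, 0, 1],
    [3, 2, 1, 0],
]
counts, average = instance_generator(women_preferences, n=4)
print(f"{sum(counts.values())} instances, average proposals {average:.2f}")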
Measuring the distribution
The histogram over all 331,776 instances is illuminating:
- Proposal count 4: rare (perfect matching on first round)
- Proposal count 6–8: most likely
- Proposal count 12–14: infrequent tail
The expected value hovers around 8.5, which means a typical random instance requires about twice as many proposals as the absolute minimum, yet stays well below the n² = 16 worst case. This is consistent with the matching-markets picture: the worst case is achievable, but it is rare.
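A few lines render that histogram from the counts dict returned by instance_generator (a sketch; the bar scale is arbitrary):

peak = max(counts.values())
for p in sorted(counts):
    bar = "#" * max(1, round(40 * counts[p] / peak))
    print(f"{p:2d} proposals | {counts[p]:6d} | {bar}")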
Streaming with reservoir sampling
The second part of the assignment: what if instances arrive as a stream and you can't store all 331,776? Reservoir sampling (Algorithm R, Vitter 1985) lets you maintain a uniform random sample of size k from an unbounded stream:
import random

def reservoir_sample(stream, k):
    """Algorithm R: uniform sample of size k from a stream of unknown length."""
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)    # fill the reservoir with the first k items
        else:
            j = random.randint(0, i)  # uniform over [0, i], so P(j < k) = k/(i+1)
            if j < k:
                reservoir[j] = item
    return reservoir
The math: by the time you've seen N items, each of the first N has probability exactly k/N of being in the reservoir. This is uniform, unbiased, and single-pass.
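A quick sanity check of that claim (a throwaway experiment, not part of the assignment): sample k = 10 from a 100-item stream many times and count how often each item lands in the reservoir.

from collections import Counter

hits = Counter()
trials = 10_000
for _ in range(trials):
    hits.update(reservoir_sample(range(100), 10))
# Each item should be hit about trials * k/N = 10_000 * 10/100 = 1,000 times.
print(min(hits.values()), max(hits.values()))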
Applied to our problem: stream the instances as (proposal_count, frequency) pairs and maintain a uniform sample of those pairs:
def stream_and_sample(instances_stream, k=1000):
    sample = []
    for i, (proposal_count, frequency) in enumerate(instances_stream):
        if i < k:
            sample.append((proposal_count, frequency))
        else:
            j = random.randint(0, i)
            if j < k:
                sample[j] = (proposal_count, frequency)
    return sample
From the sample, you can estimate the true distribution without storing all data. The error decreases as O(1/√k), so k = 1000 gives you about 3% error on quantiles.
The full implementation
Putting it together: a two-phase algorithm:
Phase 1: Offline (or very large k)
counts, average = instance_generator(women_preferences, n=4)
# counts[p] = number of instances with p proposals
print(f"Average proposals: {average:.2f}")
Phase 2: Streaming (k = sample size)
import numpy as np

def stream_proposal_distribution(instance_stream, k=1000):
    sample = reservoir_sample(instance_stream, k)
    proposals = [p for p, _ in sample]
    return np.mean(proposals), np.std(proposals)
The streaming version is exactly uniform over the stream, but uses O(k) space instead of storing every instance.
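The instance_stream argument is assumed to yield (proposal_count, frequency) pairs; a generator along these lines (a sketch, with frequency fixed at 1 since the enumeration visits each instance exactly once) can feed it without materializing all 331,776 instances:

import itertools

def instance_stream(women_preferences, n=4):
    all_prefs = list(itertools.permutations(range(n)))
    for prefs in itertools.product(all_prefs, repeat=n):
        yield gale_shapley([list(p) for p in prefs], women_preferences), 1

# Usage: stream_proposal_distribution(instance_stream(women_preferences), k=1000)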
What surprised me
- Gale-Shapley is asymmetric. If you swap the roles of men and women, the matching can change. The proposing side gets its best stable partner (proposer-optimality), and the accepting side gets its worst. This comes straight from the algorithm: proposers walk down their lists from the top, while acceptors can only trade up among whoever happens to propose.
- Preference structure matters hugely. With random preferences (every permutation equally likely), the expected proposal count is ~8.5. But with correlated preferences (men and women agree on ranking), it drops to ~5. With adversarial preferences, it can spike to 14–15.
- Reservoir sampling has a cute property. Once you've stored k items, the memory cost is O(k) forever, no matter how long the stream is. This means you can estimate online statistics (mean, median, quantiles) on unbounded data in constant space (see the toy demo below).
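As a toy demonstration (not from the assignment): a million integers stream by, memory stays at k = 1000 items, and summary statistics come out of the sample.

import itertools
import statistics

stream = itertools.islice(itertools.count(), 1_000_000)
sample = reservoir_sample(stream, k=1000)
print(statistics.mean(sample), statistics.median(sample))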
References
The Gale-Shapley algorithm is from Gale & Shapley (1962), "College Admissions and the Stability of Marriage," which also establishes the algorithm's key properties: termination, stability, and optimality for the proposing side. Reservoir sampling is Vitter (1985), "Random Sampling with a Reservoir," the standard reference for streaming selection.
What I'd do differently
- Dynamic preference updates. The real world doesn't have static preferences. A follow-up would model preference changes and measure re-matching cost.
- Use a trie for preference lists. With n = 4, iteration is fine. With n = 100, you'd want a compact representation of preferences (like a trie or bit-encoded rank vector) to fit in cache.
- Parallelize the enumeration. 331,776 instances can be split across processors. I ran it sequentially (~20 minutes on a laptop). Parallelizing would cut it to 2 minutes, though the gain is mainly convenience.
The code is in the hw1 folder of my CS 630 directory if you want to trace through specific instances or modify the preference structure.
● Papers
- → College Admissions and the Stability of Marriage · Gale & Shapley (1962)
- → Random Sampling with a Reservoir · Vitter (1985)
Note Code excerpts illustrate concepts. Full homework solutions are not published.