Gale-Shapley Stable Matching Algorithm: Theory, Variants, and Applications

Source: gale_shapley_algorithm.md


Gale-Shapley Stable Matching Algorithm: Theory, Variants, and Applications

Algorithm Description with Pseudocode

The Stable Matching Problem

The stable matching problem was first formulated by David Gale and Lloyd Shapley in their 1962 paper "College Admissions and the Stability of Marriage" (American Mathematical Monthly, Vol. 69, pp. 9-15). The problem asks: given two equally sized sets of agents (e.g., students and colleges, men and women, doctors and hospitals), each with strict preference orderings over agents on the other side, find a matching such that no unmatched pair would prefer each other to their current partners.

Definition (Blocking Pair). Given a matching M, a pair (s, c) where s is a student and c is a college forms a blocking pair if:

  1. s prefers c to their current match in M (or s is unmatched), AND
  2. c prefers s to at least one of its current matches in M (or c has unfilled capacity).

Definition (Stable Matching). A matching M is stable if it contains no blocking pairs.

Theorem (Gale-Shapley, 1962). For every instance of the stable matching problem with strict preferences, a stable matching always exists and can be found by the Deferred Acceptance algorithm.

Student-Proposing Deferred Acceptance (DA) Algorithm

This is the canonical version where students (the "proposing" side) make offers to colleges (the "receiving" side).

ALGORITHM: Student-Proposing Deferred Acceptance
INPUT:  Set of students S = {s_1, ..., s_n}
        Set of colleges C = {c_1, ..., c_m}
        Preference list P(s_i) for each student over colleges
        Preference list P(c_j) for each college over students
        Capacity q_j for each college c_j

Initialize: All students are FREE. All colleges have empty tentative lists.

WHILE there exists a free student s who has not yet proposed to all colleges:
    c <- first college on s's preference list to whom s has not yet proposed
    s proposes to c

    IF c has fewer than q_j tentative holds:
        c tentatively holds s        // "maybe"
    ELSE IF c prefers s to its least-preferred tentative hold s':
        c replaces s' with s         // "maybe" to s, "no" to s'
        s' becomes FREE
    ELSE:
        c rejects s                  // "no"

OUTPUT: All tentative holds become final assignments.

Key mechanism: Acceptances are deferred -- colleges hold proposals tentatively and can replace them with better offers in later rounds. This deferral is what guarantees stability.

College-Proposing Deferred Acceptance Algorithm

The roles are reversed: colleges make offers to students.

ALGORITHM: College-Proposing Deferred Acceptance
INPUT:  Same as above.

Initialize: All colleges have q_j unfilled positions. All students are FREE.

WHILE there exists a college c with unfilled positions that has not
      proposed to all acceptable students:
    s <- highest-ranked student on c's list to whom c has not yet proposed
    c proposes to s

    IF s is FREE:
        s tentatively holds c's offer    // "maybe"
    ELSE IF s prefers c to current tentative hold c':
        s switches from c' to c          // "maybe" to c, "no" to c'
        c' regains one unfilled position
    ELSE:
        s rejects c's offer              // "no"

OUTPUT: All tentative holds become final assignments.

Termination and Correctness

Termination: The algorithm must terminate because:

Stability: The output is always stable. Proof by contradiction: Suppose (s, c) is a blocking pair in the output matching M. Then s prefers c to their assigned college. This means s proposed to c before their assigned college. Since s is not matched to c, c must have rejected s (either immediately or by replacing s with a more preferred student). But colleges only replace students with more preferred ones, so c's final match is at least as preferred as s. This contradicts the assumption that c prefers s to a current match.


Asymmetry and Optimality Results

The Proposer-Optimality Theorem

Theorem (Gale-Shapley, 1962). The student-proposing DA produces a matching that is student-optimal among all stable matchings. That is, every student is matched to their most-preferred college achievable in ANY stable matching.

Definition (Valid partner). A college c is a valid partner for student s if there exists some stable matching in which s is matched to c.

Theorem (restated). In the student-proposing DA, each student is matched to their most-preferred valid partner.

Proof sketch: By induction on the rounds of the algorithm. Suppose at some round, a student s is rejected by a valid partner c for the first time. Then c rejected s in favor of some student s' whom c prefers. But since this is the first rejection of a valid partner, s' has not yet been rejected by any valid partner, and so s' has only proposed to colleges they prefer to c (or c itself). There exists a stable matching M where s is matched to c. In M, s' must be matched to some college c' that s' likes no better than c (since s' proposed to c before c'). Then (s', c) would be a blocking pair in M, contradicting its stability.

The Receiver-Pessimality Theorem

Theorem. The student-proposing DA produces the college-pessimal stable matching. Every college is matched to its least-preferred set of valid partners.

This creates a fundamental asymmetry:

The college-proposing DA reverses these roles: it produces the college-optimal, student-pessimal stable matching.

The Lattice of Stable Matchings

Theorem (Knuth, 1976; Conway, unpublished). The set of all stable matchings forms a distributive lattice under the partial order defined by student preferences.

Given two stable matchings M and M', define:

Both M v M' and M ^ M' are themselves stable matchings. The student-optimal matching is the supremum (top) of this lattice, and the college-optimal matching is the infimum (bottom).

Implication: There can be exponentially many stable matchings (up to 2^{n/2} for n students), but the lattice structure means the proposer-optimal and receiver-optimal matchings bound them all.

Opposition of Interests

Theorem. For any two stable matchings M and M', if every student weakly prefers M to M', then every college weakly prefers M' to M. The interests of the two sides are diametrically opposed within the lattice of stable matchings.


Variants and Extensions

The Hospitals/Residents Problem (Many-to-One Matching)

The Hospitals/Residents (HR) problem is a direct generalization of the stable marriage problem where hospitals (colleges) can accept multiple residents (students). Each hospital h has a capacity q_h.

Formulation: Given n residents and m hospitals with capacities, find an assignment where:

The student-proposing DA generalizes directly: hospitals hold up to q_h tentative offers at a time, rejecting only when full and receiving a better applicant. All results (existence, optimality, lattice structure) carry over.

Equivalence Theorem (Roth, 1984). The set of stable matchings in the HR problem has the same structure as if each hospital position were treated as a separate agent with identical preferences. This means many-to-one matching reduces theoretically to one-to-one matching.

Stable Matching with Ties and Incomplete Lists (SMTI)

In practice, agents may:

Three stability concepts arise when ties are allowed:

Stability Definition
Weak stability No blocking pair where both agents strictly prefer each other to current match
Strong stability No blocking pair where one agent strictly prefers and the other weakly prefers
Super stability No blocking pair where both agents weakly prefer each other

Complexity results:

Practical importance: SMTI arises in Scottish hospital matching (HRT -- Hospitals/Residents with Ties) and adoption matching (pairing children with families), where strict orderings over all candidates are unrealistic.

The College Admissions Problem with Lower Quotas

Some extensions require each college to fill a minimum number of seats, not just a maximum. This introduces feasibility constraints that can conflict with stability: a stable matching satisfying all lower quotas may not exist.

Matching with Contracts (Hatfield-Milgrom, 2005)

A unifying generalization where agents negotiate over contracts (specifying salary, position type, etc.) rather than simple matches. The DA algorithm generalizes under certain "substitutability" conditions on colleges' choice functions. Many classical results (existence, optimality, strategy-proofness) extend to this framework.

Matching with Couples (Roth-Peranson, 1999)

When residents apply as couples (both must be placed in the same city), stable matchings may not exist, and finding one is NP-complete. Roth and Peranson developed heuristic algorithms for the NRMP that handle couples in practice, though without theoretical guarantees.


Nobel Contributions (Roth)

The 2012 Nobel Prize

The Sveriges Riksbank Prize in Economic Sciences in Memory of Alfred Nobel 2012 was awarded jointly to Alvin E. Roth and Lloyd S. Shapley "for the theory of stable allocations and the practice of market design."

Shapley (with Gale) provided the foundational theory in 1962. Roth, starting in the 1980s, transformed this theory into practical market design, creating some of the most successful applications of economic theory in the real world.

Roth's Key Contributions

1. Connecting Theory to Practice: The NRMP Discovery (1984)

Roth made the remarkable discovery that the National Resident Matching Program (NRMP), which had been matching medical residents to hospitals since 1952, was already using an algorithm equivalent to the hospital-proposing DA -- even though the NRMP designers were unaware of Gale and Shapley's work. This provided powerful empirical evidence for the theory: the NRMP had independently converged on a stable matching mechanism because unstable alternatives had failed in practice.

Roth showed that earlier clearinghouse mechanisms in the UK (for medical graduates) that did NOT produce stable matchings had collapsed, while those that did produce stable matchings had survived. This established stability as a key practical requirement.

2. The NRMP Redesign (1995-1998)

By the 1990s, the NRMP faced new challenges:

Roth, working with Elliott Peranson, redesigned the NRMP to use:

3. Strategy-Proofness and the Impossibility Theorem

Theorem (Roth, 1982). The student-proposing DA is strategy-proof for students: no student can obtain a better outcome by misrepresenting their preferences, regardless of what other students or colleges do. Truth-telling is a dominant strategy for the proposing side.

Impossibility Theorem (Roth, 1982). No stable matching mechanism can be strategy-proof for both sides simultaneously (when there are at least 3 agents on each side). Colleges always have potential incentives to misrepresent preferences or capacities under any stable mechanism.

This result has profound implications: it justifies using the student-proposing DA in school choice (where protecting students from strategic manipulation is paramount), even though colleges may have incentives to game the system.

4. The Rural Hospital Theorem

Theorem (Roth, 1984; building on McVitie-Wilson, 1970). In every stable matching:

  1. The same set of residents (students) are unmatched.
  2. Each hospital (college) fills the same number of positions.
  3. Any hospital that has unfilled positions in one stable matching receives exactly the same set of residents in every stable matching.

Implication: The choice between student-optimal and hospital-optimal DA does NOT affect which hospitals have empty seats or which students go unmatched. Rural hospitals that struggle to attract residents cannot be helped by changing the matching algorithm -- the problem lies in preferences, not mechanism design.

5. School Choice: Boston and New York City

Roth (with Abdulkadiroglu and Sonmez, 2003) showed that the "Boston mechanism" used in many school districts was:

They proposed replacing it with student-proposing DA. Implementations:

6. Kidney Exchange

Roth, with Sonmez and Unver (2004-2007), applied matching theory to kidney exchange:


Complexity Analysis

Time Complexity

Theorem. The Gale-Shapley DA algorithm runs in O(n^2) time for n agents on each side.

Proof:

Lower bound: Any algorithm that finds a stable matching must read Omega(n^2) preference data in the worst case (Ng and Hirschberg, 1990), so Gale-Shapley is asymptotically optimal.

Space Complexity

Number of Stable Matchings

Proposals Until Termination


Application to College Admissions Simulation

Relevance to the College-Sim Project

The current simulation uses a score-based admission model with rounds (ED, EA, RD, etc.) rather than a pure Gale-Shapley matching. This is realistic because real college admissions are NOT a pure stable matching market:

  1. Information asymmetry: Colleges do not know students' full preference orderings, and students do not know colleges' evaluation criteria precisely.
  2. Non-transparent preferences: Colleges use holistic review with opaque criteria, not strict orderable preferences.
  3. Strategic application: Students strategically choose where to apply (portfolio construction), unlike DA where all preferences are submitted simultaneously.
  4. Binding early rounds: ED commitments create a sequential game, not a single-round matching.

How Gale-Shapley Theory Informs the Simulation

Despite the differences, Gale-Shapley theory provides several insights:

  1. Stability as a benchmark: The simulation can check whether its output matching is "approximately stable" -- whether there are many pairs where a student and college would both prefer each other to their current match. High instability suggests the mechanism is performing poorly.

  2. Yield modeling: Student-proposing DA predicts students get better outcomes. In reality, the college-proposing nature of admissions (colleges choose whom to admit, students choose among admits) creates an intermediate outcome. The simulation's yield decision phase mirrors students' "accepting" or "rejecting" offers.

  3. The role of ED/EA: Early Decision acts as a partial commitment device. In Gale-Shapley terms, it is as if a student irrevocably proposes to one college first, which has been shown to benefit students at the cost of reducing the set of stable matchings.

  4. Hook multipliers and affirmative action: These correspond to "modified preferences" -- colleges effectively rank students differently than pure academic merit would suggest. Gale-Shapley theory shows that modified preferences change the stable matching set but the algorithm still produces a stable outcome relative to the modified preferences.

  5. Capacity constraints: The many-to-one HR problem directly models colleges with limited seats. The simulation's enrollment targets correspond to hospital capacities in the HR formulation.

Potential Enhancement: Hybrid DA-Scoring Model

A possible enhancement for the simulation would be a hybrid approach:


Key References

Sources