# Research Summary: College Admissions Simulation

Source: RESEARCH_SUMMARY.md



Comprehensive Synthesis of 16 Research Documents

**Compiled:** 2026-03-01

**Source files:** 16 research documents in /research/

**Purpose:** Single canonical reference for simulation calibration, parameter tuning, and theoretical grounding


## Table of Contents

1. Executive Summary
2. MIT Admissions Parameters
   * 2.1 Race & Gender Effects
   * 2.2 Athletic Hooks
   * 2.3 Extracurricular Weighting
   * 2.4 Feeder School Effects
3. Agent Behavior Models
   * 3.1 Student Portfolio Construction
   * 3.2 Student Yield & Enrollment Decisions
   * 3.3 College Admissions Decision Model
   * 3.4 College Enrollment Management
4. Open-Source Data Catalog
   * 4.1 CommonApp & NACAC Data
   * 4.2 HYPSM Common Data Sets
   * 4.3 Top 20-50 Common Data Sets
   * 4.4 Feeder School Datasets
5. Optimal Matching Theory
   * 5.1 Gale-Shapley Algorithm
   * 5.2 US Admissions as a Matching Market
   * 5.3 Student Welfare Optimization
   * 5.4 K-12 School Choice Parallels
6. Simulation Calibration Recommendations
   * 6.1 Hook Multipliers
   * 6.2 Round Multipliers by College Tier
   * 6.3 Yield Rates by Tier
   * 6.4 Applications Per Student
   * 6.5 EC Scoring Weights
   * 6.6 Gender Multipliers
   * 6.7 Feeder School Multiplier
   * 6.8 Perception Noise Parameter
   * 6.9 Acceptance Rate Calibration Targets
   * 6.10 Enrollment Management Parameters
7. Data Sources & References

## 1. Executive Summary

### 1.1 Project Context

This synthesis covers 16 research documents produced to support a single-file, agent-based college admissions simulation. The simulation models the full US selective college admissions cycle — from student portfolio construction through Early Decision, Early Action, Regular Decision, and waitlist rounds — for approximately 20 high schools and 30 colleges. The simulation is implemented in vanilla ES6+ JavaScript with D3.js visualization, requiring no server.

The 16 source documents fall into four clusters:

| Cluster | Files | Theme |
| --- | --- | --- |
| MIT-specific parameters | mit_race_gender, mit_athletic_hooks, mit_extracurriculars, exeter_mit_pipeline | Granular parameter calibration from primary sources |
| Agent behavior | student_portfolio_behavior, student_yield_behavior, college_decision_model, college_enrollment_management | How students and colleges actually behave |
| Open-source data | data_commonapp_nacac, data_hypsm_cds, data_top20_50_cds, data_feeder_schools | Publicly verifiable ground truth |
| Matching theory | gale_shapley_algorithm, college_matching_market, student_welfare_matching, k12_school_choice | Theoretical grounding and design validation |

### 1.2 Top 10 Simulation Implications

1. Hook multipliers are multiplicative, not additive. SFFA v. Harvard trial testimony confirmed hooks compound. An athlete who is also a legacy gets both multipliers applied sequentially, not summed. The current simulation correctly uses multiplicative hooks.

2. ED provides a 1.5-2.0x admit boost, not a flat percentage bump. The mechanism is a lower admit threshold (threshold_ED = threshold_RD × 0.70), which at realistic score distributions produces a 1.5-2.0x effective boost. This is empirically grounded in Avery, Fairbanks & Zeckhauser (2003).

3. Athletic hooks are the strongest single multiplier at most schools. SFFA trial data puts recruited athletes at ~86% admit rate at Harvard vs. 3.4% overall — roughly 25x, which collapses to a ~4-5x multiplier once you control for athlete self-selection into strong academic profiles. MIT is an outlier: no binding slots, no likely letters, estimated multiplier 2.5-3.5x.

4. Post-SFFA race multipliers must be set to 1.0x. The Supreme Court's June 2023 ruling in SFFA v. Harvard/UNC eliminated explicit race-conscious admissions. MIT's Class of 2028 (first post-SFFA cohort) showed dramatic demographic shifts: Black enrollment dropped from 13% to 5%, Asian enrollment rose from 41% to 47%. The simulation should not model race as a direct admit multiplier in post-SFFA mode.

5. Feeder school effects are large and undermodeled in most simulations. Harvard Crimson (2024) found 1 in 11 Harvard undergrads comes from just 21 schools (0.078% of US high schools). The counselor-quality premium (Exeter 33:1 vs. national 372:1) alone implies a 1.3-1.5x application quality boost, compounding to 2.0-2.5x overall for elite boarding schools after institutional trust and peer effects.

6. Applications per student have grown about 45% since 2015-16. The mean is now 6.80 (2024-25 CommonApp data), up from 4.70 in 2015-16 and 4.63 in 2013-14. Elite private school students average 12+ applications. This growth has increased overlap and intensified yield uncertainty for colleges.

7. Yield rates vary enormously by tier and must be tier-specific. MIT's 86.6% yield is roughly 4-6x UCLA's ~15-20% and about 2.5x that of a Near-Ivy like Johns Hopkins (~35%). Using a single yield parameter produces deeply unrealistic enrollment outcomes.

8. The US admissions market is NOT a stable matching. Unlike NRMP (medical residency matching), US college admissions is decentralized, sequential, and produces no stable matching in the Gale-Shapley sense. Many "blocking pairs" exist: students who prefer a college that would have preferred them over some admitted student. The simulation correctly models this as a decentralized round-based process, not a centralized algorithm.

9. EC scoring should use a tiered bonus system, not a continuous scale. Harvard's SFFA data shows a cliff: EC rating 1 students admit at 50.6% vs. EC rating 2 at 18.1% vs. EC rating 3 at 3.8%. This is better modeled by a Tier 1 spike bonus (+0.08) than a smooth linear transform.

10. Coordination matters more than algorithm choice in matching welfare. Abdulkadiroglu, Agarwal & Pathak (2017 AER) find that 80% of potential welfare gains in student-school matching come from coordination alone — ensuring students and schools express preferences simultaneously to the same clearinghouse — rather than from the specific algorithm. This validates the simulation's round-synchronized approach even without implementing full DA.
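Implication 1 (multiplicative hooks) can be illustrated with a minimal sketch. The helper name `applyHooks`, the cap at 1.0, and the choice to scale the admit probability directly are assumptions for illustration, not the simulation's actual code; the multiplier values are the 3.5x athlete default and the 1.35x first-generation figure quoted elsewhere in this summary.

```javascript
// Hypothetical helper: compound hook multipliers on a base admit probability.
// Multipliers are applied sequentially (product), never summed.
function applyHooks(baseProbability, multipliers) {
  const combined = multipliers.reduce((product, m) => product * m, 1);
  // Cap at 1 so stacked hooks cannot produce an impossible probability.
  return Math.min(1, baseProbability * combined);
}

// A 4% base probability with athlete (3.5x) and first-generation (1.35x) hooks.
const athleteFirstGen = applyHooks(0.04, [3.5, 1.35]);
console.log(athleteFirstGen.toFixed(3)); // 0.189
```

The combined multiplier here is 3.5 × 1.35 = 4.725, which is what distinguishes compounding from an additive treatment of the same two hooks.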


## 2. MIT Admissions Parameters

### 2.1 Race & Gender Effects

#### 2.1.1 Historical Race Multipliers (Pre-SFFA, through Class of 2027)

The following multipliers derive from Peter Arcidiacono's expert testimony in SFFA v. Harvard (2018), applying a probit model to Harvard admissions data 2000-2017. MIT-specific figures are estimated by applying similar patterns with MIT's known demographic targets.

| Racial/Ethnic Group | Harvard (Arcidiacono) | MIT (Estimated) | Basis |
| --- | --- | --- | --- |
| African American | 3.5x | 3.0x | SFFA trial testimony, MIT's ~13% Black enrollment target |
| Hispanic/Latino | 2.3x | 2.0x | SFFA trial testimony |
| Native American | 4.0x | 3.5x | Consistent with Harvard data |
| White | 1.0x (baseline) | 1.0x | Baseline |
| Asian American | 0.75x | 0.80x | Arcidiacono; MIT slightly less biased in STEM context |
| International | 0.90x | 0.85x | US citizen preference; MIT cap ~11% international |

Note: These multipliers apply to the admissions score calculation, not to external socioeconomic data generation.

#### 2.1.2 Post-SFFA Race Multipliers (Class of 2028 onward)

Following the Supreme Court's June 29, 2023 decision in Students for Fair Admissions v. Harvard and Students for Fair Admissions v. UNC, race-conscious admissions is prohibited. All race multipliers become 1.0x.

| Racial/Ethnic Group | Post-SFFA Multiplier |
| --- | --- |
| All groups | 1.0x |

Indirect proxies that remain legal post-SFFA:

| Proxy Factor | Multiplier | Legal Basis |
| --- | --- | --- |
| First-generation college student | 1.3-1.4x | Socioeconomic diversity, race-neutral |
| Pell Grant eligibility | 1.2-1.3x | Socioeconomic diversity |
| Rural/underrepresented geography | 1.1-1.2x | Geographic diversity |
| Low-income zip code | 1.1-1.2x | Socioeconomic diversity |
| Underrepresented state/country | 1.1x | Geographic diversity |

#### 2.1.3 Observed Post-SFFA Demographic Shifts

MIT Class of 2028 (first post-SFFA cohort, enrolled fall 2024) showed dramatic shifts:

| Group | Pre-SFFA (Class of 2027) | Post-SFFA (Class of 2028) | Change |
| --- | --- | --- | --- |
| Black/African American | ~13% | ~5% | -8 pp |
| Hispanic/Latino | ~15% | ~11% | -4 pp |
| White | ~38% | ~37% | -1 pp |
| Asian American | ~41% | ~47% | +6 pp |
| Native American | ~2% | ~1% | -1 pp |

Harvard's Class of 2028 showed similar trends: Black enrollment dropped from ~15% to ~5-6%.

#### 2.1.4 Gender Multipliers

Gender multipliers reflect actual imbalances in applicant pools and institutional diversity goals. They should be applied per college category, not universally.

| College Category | Male Multiplier | Female Multiplier | Rationale |
| --- | --- | --- | --- |
| STEM-heavy (MIT, Caltech, CMU) | 1.0x (baseline) | 1.8-2.0x | Female admit rates historically ~2x male due to demand imbalance |
| Balanced research (HYPS, Ivies) | 1.0x | 1.0-1.1x | Near-parity in applicant pool |
| Liberal arts colleges | 1.2-1.3x | 1.0x | Male scarcity in LAC applicant pool |
| State flagships (UVA, Michigan) | 1.0-1.1x | 1.0x | Near-parity; varies by year |

MIT-specific data: MIT Class of 2029 admitted women at roughly 24% acceptance rate vs. approximately 12% for men (self-reported by MIT, reflecting its 24% female applicant pool against a target of ~49% female enrollment). This implies a multiplier of approximately 1.8-2.0x for women at MIT.

Implementation note: Gender multipliers interact with the academic score sigmoid. If the multiplier were applied to the raw score, a female applicant with academic_index = 0.70 at MIT would compete as if her score were 0.70 × 1.9 = 1.33, which saturates the sigmoid. The correct implementation applies the multiplier to the admit probability, not to the raw score.
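The difference between the two orderings in the implementation note can be seen in a short sketch. The logistic curve, its steepness (10), and its midpoint (0.85) are illustrative assumptions, as are the function names; only the 0.70 score and the 1.9x multiplier come from the note.

```javascript
// Assumed logistic admit curve; steepness and midpoint are illustrative only.
function sigmoid(score, steepness = 10, midpoint = 0.85) {
  return 1 / (1 + Math.exp(-steepness * (score - midpoint)));
}

// Incorrect ordering: scaling the raw score pushes it far past the midpoint.
function admitProbScoreScaled(score, multiplier) {
  return sigmoid(score * multiplier);
}

// Ordering recommended in the note: scale the probability, then cap at 1.
function admitProbScaled(score, multiplier) {
  return Math.min(1, sigmoid(score) * multiplier);
}

const scoreScaled = admitProbScoreScaled(0.70, 1.9);
const probScaled = admitProbScaled(0.70, 1.9);
console.log(scoreScaled.toFixed(3), probScaled.toFixed(3));
```

Under these assumed curve parameters, scaling the score yields a near-certain admit (about 0.99), while scaling the probability yields about 0.35, which is 1.9 times the unhooked probability as intended.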

#### 2.1.5 JavaScript Implementation Template

```javascript
// Post-SFFA (current): race multipliers eliminated
const RACE_MULTIPLIERS = {
  african_american: 1.0,
  hispanic: 1.0,
  white: 1.0,
  asian: 1.0,
  native_american: 1.0
};

// Socioeconomic proxy multipliers (legal post-SFFA)
const SOCIOECONOMIC_MULTIPLIERS = {
  first_gen: 1.35,
  pell_eligible: 1.25,
  rural: 1.15,
  low_income_zip: 1.15
};

// Gender multipliers by college category
const GENDER_MULTIPLIERS = {
  stem_heavy: { male: 1.0, female: 1.9 },
  balanced: { male: 1.0, female: 1.05 },
  lac: { male: 1.25, female: 1.0 }
};
```

***

### 2.2 Athletic Hooks

#### 2.2.1 Harvard SFFA Trial Data

The SFFA v. Harvard trial produced the most granular public dataset on athletic hook effects:

| Group                                     | Admit Rate | Multiplier vs. Overall |
| ------------------------------------------------------------------------------------------------ | ----------------------------------------------------------------- | ----------------------------------------------------------------------------- |
| All recruited athletes                    | ~86%       | ~25x vs. 3.4% overall  |
| Recruited athletes w/ top academic rating | ~83%       | ~24x                   |
| Non-athletes w/ top academic rating       | ~16%       | ~5x                    |
| Walk-on athletes (not recruited)          | ~5-6%      | ~1.5x                  |

Controlling for the fact that recruited athletes are pre-screened to meet minimum academic thresholds, the effective hook multiplier collapses to approximately 4-5x at Harvard for the marginal admitted student.

**Espenshade & Chung (2005) SAT-equivalent estimates:**

* Athlete hook = +200 SAT points equivalent

* Legacy hook = +160 SAT points

* Black/African American = +230 points (pre-SFFA era)

* Hispanic = +185 points (pre-SFFA era)

* First-generation = +130 points

#### 2.2.2 MIT Athletic Model: No-Slot Exception

MIT is structurally unique among highly selective research universities:

* Fields 33 varsity sports (Division III, NEWMAC conference)

* Approximately 20-25% of undergrads participate in varsity athletics

* **No binding roster slots:** coaches cannot guarantee admission

* **No "likely letters":** MIT does not send pre-decision signals to recruits

* Admission decision made independently by admissions office; coaches submit advocacy letters

* Estimated recruited-athlete acceptance rate: 25-50% (compared to 4.6% overall)

* Implied gross multiplier: ~5-11x (25-50% divided by 4.6%); effective multiplier controlling for academic self-selection: ~2.5-3.5x

This contrasts sharply with Harvard/Yale/Princeton, which operate formal recruitment "bands" and do issue likely letters.
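The gross multipliers quoted for Harvard and MIT are simple ratios of a group's admit rate to the overall admit rate. A minimal sketch of that arithmetic follows; the function name `grossMultiplier` is hypothetical, and the rates are the ones stated above. It does not attempt the self-selection adjustment that produces the lower effective multipliers.

```javascript
// Gross multiplier = group admit rate divided by overall admit rate.
function grossMultiplier(groupAdmitRate, overallAdmitRate) {
  if (overallAdmitRate <= 0) {
    throw new RangeError("overall admit rate must be positive");
  }
  return groupAdmitRate / overallAdmitRate;
}

const harvardAthletes = grossMultiplier(0.86, 0.034); // ~86% vs. 3.4% overall
const mitAthletesLow = grossMultiplier(0.25, 0.046);  // 25% vs. 4.6% overall
const mitAthletesHigh = grossMultiplier(0.50, 0.046); // 50% vs. 4.6% overall

console.log(
  harvardAthletes.toFixed(1),
  mitAthletesLow.toFixed(1),
  mitAthletesHigh.toFixed(1)
); // 25.3 5.4 10.9
```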

#### 2.2.3 Recommended Multipliers by School Tier

| Tier                 | Schools                                    | Recommended Athlete Multiplier | Rationale                                                |
| --------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------- |
| HYPS (excluding MIT) | Harvard, Yale, Princeton, Stanford         | 4.0-5.0x                       | Formal slot system, likely letters, SFFA data            |
| MIT                  | MIT                                        | 2.5-3.5x                       | No slots, coach advocacy only, DIII                      |
| Other Ivy            | Columbia, Penn, Brown, Dartmouth, Cornell  | 3.5-4.5x                       | Ivy League formal recruitment system                     |
| Near-Ivy DI          | Duke, Northwestern, Notre Dame, Georgetown | 3.5-4.5x                       | DI with significant athletics program                    |
| Near-Ivy DIII        | Caltech, WashU, CMU                        | 1.5-2.5x                       | DIII, minimal athletics influence                        |
| NESCAC LACs          | Williams, Amherst, Middlebury              | 3.0-4.0x                       | NESCAC athletics culture, formal recruitment             |
| Selective publics    | UVA, UCLA, Michigan                        | 2.0-3.0x                       | Revenue sport athletes only; most athletes not recruited |

**Current simulation default:** 3.5x global (single tier). This is a defensible middle ground, but it understates the HYPS multiplier (4.0-5.0x) and sits at the top of MIT's range (2.5-3.5x). The recommended upgrade is per-tier multipliers.
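One way the per-tier upgrade could look is a lookup table with the current global value as a fallback. This is a sketch under assumptions: the tier keys and the function name are hypothetical, and each value is simply the midpoint of the recommended range in the table above rather than a calibrated figure.

```javascript
// Midpoints of the recommended ranges per tier (midpoint choice is an assumption).
const ATHLETE_MULTIPLIER_BY_TIER = {
  hyps: 4.5,             // 4.0-5.0x
  mit: 3.0,              // 2.5-3.5x
  other_ivy: 4.0,        // 3.5-4.5x
  near_ivy_d1: 4.0,      // 3.5-4.5x
  near_ivy_d3: 2.0,      // 1.5-2.5x
  nescac_lac: 3.5,       // 3.0-4.0x
  selective_public: 2.5  // 2.0-3.0x
};

// Current single-tier default, kept as the fallback for unknown tiers.
const GLOBAL_ATHLETE_DEFAULT = 3.5;

function athleteMultiplier(tier) {
  const known = Object.prototype.hasOwnProperty.call(ATHLETE_MULTIPLIER_BY_TIER, tier);
  return known ? ATHLETE_MULTIPLIER_BY_TIER[tier] : GLOBAL_ATHLETE_DEFAULT;
}

console.log(athleteMultiplier("mit"), athleteMultiplier("unknown_tier")); // 3 3.5
```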

#### 2.2.4 Sport-Type Differentiation

Not all recruited athletes receive the same boost. "Head count" sports (football, basketball, volleyball) that receive full scholarship commitments differ from "equivalency" sports.

| Sport Category                                   | Notes                               | Relative Multiplier |
| ------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------ | -------------------------------------------------------------------------- |
| Revenue/head-count sports (football, basketball) | Most scrutinized; large rosters     | High (3.5-5x)       |
| Olympic/NESCAC sports (rowing, squash, lacrosse) | Highest socioeconomic concentration | High (3.5-4.5x)     |
| DIII non-revenue                                 | No scholarships; weaker binding     | Lower (2.0-3.0x)    |
| Walk-ons (not recruited)                         | Essentially non-hook                | 1.1-1.3x            |

***

### 2.3 Extracurricular Weighting

#### 2.3.1 Harvard SFFA EC Rating Data

Harvard uses a 1-6 scale for extracurricular ratings (1 = highest). Admit rates by EC score from the SFFA trial dataset (2014-2019):

| EC Rating                | Admit Rate | Notes                                          |
| ------------------------------------------------------------------------------- | ----------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------- |
| 1 (Outstanding)          | 50.6%      | National-level achievement, elite competitions |
| 2 (Excellent)            | 18.1%      | State/regional leadership, significant impact  |
| 3 (Good)                 | 3.8%       | Active participant, some leadership            |
| 4 (Adequate)             | 1.6%       | Typical participant                            |
| 5-6 (Below average/None) | < 1%       | Weak or no EC record                           |

The cliff between EC 1 and EC 2 (50.6% vs. 18.1%) implies a nonlinear, threshold-based model rather than a smooth continuous function.

#### 2.3.2 CollegeVine Four-Tier Framework

CollegeVine's public-facing tier system is the best-validated public framework for categorizing ECs:

| Tier | Score Range | Description                                                     | Examples                                                                            | Population % |
| ----------------------------------------------------------- | ------------------------------------------------------------------ | ---------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------- |
| 1    | 8.5-10.0    | National/international recognition; elite competitions          | USAMO, Intel STS finalist, USABO, national arts award, Olympic trial athlete        | 2-3%         |
| 2    | 6.5-8.4     | State/regional leadership; significant school-level achievement | State science fair winner, school newspaper editor-in-chief, student body president | 10-15%       |
| 3    | 4.0-6.4     | Active participant with some responsibility                     | Club member, JV sports, school play supporting cast, volunteer                      | 35-40%       |
| 4    | 1.5-3.9     | Minimal participation; listing only                             | One-time volunteer, unverifiable activities                                         | 40-50%       |

#### 2.3.3 Overall Weighting in Admissions Decision

Based on holistic review process documentation and SFFA trial testimony, the approximate weight breakdown:

| Factor                                    | Weight | Source                                 |
| ------------------------------------------------------------------------------------------------ | ------------------------------------------------------------- | --------------------------------------------------------------------------------------------- |
| Academic index (GPA + standardized tests) | 40-45% | Arcidiacono model; NACAC survey        |
| Extracurriculars                          | 25-30% | SFFA testimony; CollegeVine analysis   |
| Essays / personal statements              | 10-15% | NACAC 56.2% "considerable importance"  |
| Letters of recommendation                 | 8-12%  | NACAC 51-52% "considerable importance" |
| Demonstrated interest / alumni interview  | 5-10%  | College-specific; varies widely        |

#### 2.3.4 MIT-Specific EC Type Multipliers

MIT asks for only 4 activities (vs. CommonApp's 10-slot default), signaling depth over breadth. EC types receive different implicit weighting:

| EC Type                                     | MIT Multiplier | Rationale                             |
| -------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------- | -------------------------------------------------------------------------------------------- |
| STEM Research (published/presented)         | 1.15x          | Directly aligns with MIT mission      |
| Technical Competition (USAMO, USABO, USACO) | 1.10x          | Objective national benchmark          |
| Entrepreneurship / startup                  | 1.10x          | Innovation culture at MIT             |
| Community service (sustained, high impact)  | 1.00x          | Valued but not differentiating        |
| Arts (national-level)                       | 1.00x          | Valued; MIT has strong arts community |
| Generic volunteering                        | 0.85x          | Common; limited signal value          |
| Sports (DIII context)                       | 0.90x          | Less emphasized than Ivy coaching     |

#### 2.3.5 Spike Bonus Implementation

The "spike" concept captures the additional value of exceptional depth in a single activity vs. average performance across many activities:

```javascript
// Assumed weight of the EC component in the overall score (source range: 0.25-0.30)
const EC_WEIGHT = 0.25;

function calcECBonus(ecScore, ecTier) {
  // Base EC contribution (linear). ecScore is on the 0-10 tier scale, so it is
  // normalized to 0-1 here (an assumption) to stay comparable with the spike bonus.
  const base = (ecScore / 10) * EC_WEIGHT;

  // Tier-based spike bonus (nonlinear cliff)
  let spikeBonus = 0;
  if (ecTier === 1 || ecScore >= 8.5) {
    spikeBonus = 0.08;  // Substantial bonus for national-level achievement
  } else if (ecTier === 2 || ecScore >= 6.5) {
    spikeBonus = 0.03;  // Moderate bonus for regional/state leadership
  }

  return base + spikeBonus;
}
```

### 2.4 Feeder School Effects

#### 2.4.1 Empirical Evidence

The feeder school effect is one of the most robust and undermodeled phenomena in elite admissions:

Harvard Crimson 2024 investigation: 1 in 11 Harvard undergraduates comes from just 21 schools (0.078% of US high schools).

Chetty, Deming & Friedman (2023, NBER Working Paper 31492):

#### 2.4.2 Feeder Rate Benchmarks by School Type

| School Type | HYPSM Feeder Rate | All Elite (top 30) | Source |
| --- | --- | --- | --- |
| Elite boarding (Andover, Exeter, Groton) | 15-20% of graduates | 30-40% | Harvard Crimson; school profiles |
| Selective day/magnet schools (Stuyvesant, Boston Latin) | 8-12% | 20-30% | NSC estimates; journalism |
| Affluent suburban public (Lexington MA, Palo Alto) | 5-8% | 15-20% | School profiles; journalism |
| Average suburban public | 0.5-2% | 3-6% | NSC aggregate benchmarks |
| Rural/under-resourced public | 0.05-0.5% | 0.5-2% | NSC; Chetty analysis |

#### 2.4.3 Decomposition of the Feeder Premium

The feeder premium is not a single mechanism but a compound of several factors:

| Component | Multiplier Range | Mechanism |
| --- | --- | --- |
| Application quality (counselor writing, strategy) | 1.3-1.5x | Experienced college counselors (33:1 ratio at Exeter vs. 372:1 national average) |
| Institutional trust/brand recognition | 1.2-1.4x | Admissions readers familiar with school rigor; grade inflation concerns |
| Peer effects (information, application norms) | 1.1-1.2x | Students apply to a wider and better-calibrated list |
| Direct alumni/counselor relationships | 1.1-1.2x | Informal communication between admissions and feeder schools |
| Combined (multiplicative) | 2.0-2.5x | After removing hooks already modeled (legacy, athlete, etc.) |

Note: The feeder premium applies to unhooked students. Hooked students already receive substantial multipliers; the feeder premium represents the additional advantage from institutional affiliation.
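As a cross-check on the decomposition, the four component ranges can be multiplied end to end. This is only an arithmetic sketch: the structure and names (`FEEDER_COMPONENTS`, `productRange`) are hypothetical, and it treats the components as independent, which the source does not state.

```javascript
// Component ranges copied from the decomposition table.
const FEEDER_COMPONENTS = [
  { name: "application quality", low: 1.3, high: 1.5 },
  { name: "institutional trust", low: 1.2, high: 1.4 },
  { name: "peer effects", low: 1.1, high: 1.2 },
  { name: "alumni/counselor relationships", low: 1.1, high: 1.2 }
];

// Multiply the low ends together and the high ends together.
function productRange(components) {
  return components.reduce(
    (range, c) => ({ low: range.low * c.low, high: range.high * c.high }),
    { low: 1, high: 1 }
  );
}

const feederRange = productRange(FEEDER_COMPONENTS);
console.log(feederRange.low.toFixed(2), feederRange.high.toFixed(2)); // 1.89 3.02
```

The raw product spans roughly 1.9x to 3.0x, so the 2.0-2.5x combined figure reported for elite boarding schools sits inside that span rather than at either extreme.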

| High School Category | Feeder Multiplier | Validation Target |
| --- | --- | --- |
| Elite boarding (Andover, Exeter, Deerfield) | 2.0-2.5x | Student with 1470 SAT, 3.9 GPA → 15-20% at HYPSM |
| Selective magnet/exam school | 1.5-1.8x | Student with same profile → 10-14% at HYPSM |
| Strong suburban public (top quartile) | 1.3-1.5x | Student with same profile → 8-12% at HYPSM |
| Average suburban public | 1.0x (baseline) | Student with same profile → 5-8% at HYPSM |
| Rural/under-resourced | 0.9-1.0x | Profile matters more than school |

## 3. Agent Behavior Models

### 3.1 Student Portfolio Construction

CommonApp end-of-season reports provide the most reliable public data on applications per student:

| Year | Apps/Applicant | Notes |
| --- | --- | --- |
| 2013-14 | 4.63 | CommonApp first full digital season |
| 2015-16 | 4.70 | Baseline reference year |
| 2018-19 | 5.20 | Pre-COVID |
| 2020-21 | 5.86 | COVID test-optional surge |
| 2021-22 | 6.11 | Post-COVID continuation |
| 2022-23 | 6.45 | New high at time |
| 2023-24 | 6.65 | Continued growth |
| 2024-25 | 6.80 | Most recent season |

Cumulative growth since 2015-16: approximately 44.7%.
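The cumulative growth figure follows directly from the table's endpoints. A small sketch of the calculation (the function name `percentGrowth` is hypothetical) shows both the 2015-16 baseline and the earlier 2013-14 starting point:

```javascript
// Percent growth from a starting value to an ending value.
function percentGrowth(from, to) {
  return ((to - from) / from) * 100;
}

const since2015 = percentGrowth(4.70, 6.80); // 2015-16 baseline to 2024-25
const since2013 = percentGrowth(4.63, 6.80); // 2013-14 to 2024-25

console.log(since2015.toFixed(1), since2013.toFixed(1)); // 44.7 46.9
```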

#### 3.1.2 Distribution by Student Type

The mean of 6.80 masks substantial heterogeneity:

| Student Type | Mean Apps | Median | 90th Pct | Notes |
| --- | --- | --- | --- | --- |
| Elite private school (Exeter, Andover) | 12-14 | 10-12 | 18-20 | Full Common App slate |
| Well-resourced suburban public | 8-10 | 7-9 | 14-16 | Counselor-guided |
| Average suburban public | 5-7 | 5-6 | 10-12 | Near overall mean |
| First-generation | 3-5 | 3-4 | 7-8 | Information/resource barriers |
| Under-resourced public | 2-4 | 2-3 | 5-7 | Fewer known options |

#### 3.1.3 Portfolio Strategy: Reach / Match / Safety

Standard college counseling framework:

| Category | Admit Probability | Recommended # | Notes |
| --- | --- | --- | --- |
| Reach | < 20% given student profile | 3-5 | Dream schools; admit probability calculation important |
| Target (Match) | 20-60% | 3-5 | Core list; good fit probability |
| Likely (Safety) | > 70% | 2-3 | Insurance; student would attend |
| ED/EA choice | Varies | 1 | Typically top-choice reach school |

Recommended minimum: at least 2 safeties, ideally 3, to protect against outcome risk. First-generation students disproportionately under-apply to safeties.
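The reach/target/likely bands can be expressed as a simple classifier over estimated admit probabilities. This is a sketch under assumptions: the function names are hypothetical, and because the framework leaves the 60-70% band unassigned, the code reports it as "unclassified" rather than guessing.

```javascript
// Classify one college by the student's estimated admit probability (0-1).
function classifyCollege(admitProbability) {
  if (admitProbability < 0.20) return "reach";
  if (admitProbability <= 0.60) return "target";
  if (admitProbability > 0.70) return "likely";
  return "unclassified"; // 60-70% is not assigned by the framework
}

// Tally a list of admit probabilities by category.
function countByCategory(admitProbabilities) {
  const counts = { reach: 0, target: 0, likely: 0, unclassified: 0 };
  for (const p of admitProbabilities) {
    counts[classifyCollege(p)] += 1;
  }
  return counts;
}

const counts = countByCategory([0.05, 0.12, 0.35, 0.55, 0.65, 0.80, 0.92]);
console.log(counts); // { reach: 2, target: 2, likely: 2, unclassified: 1 }
```

A portfolio check could then flag lists whose `likely` count falls below the recommended number of safeties.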

#### 3.1.4 Early Decision Strategy

ED provides the strongest commitment signal and corresponding boost:

| Factor | Value | Source |
| --- | --- | --- |
| ED admit rate boost (unhooked) | 1.3-1.5x | Avery, Fairbanks & Zeckhauser 2003 |
| ED admit rate boost (all, including hooked) | 1.6-2.0x | Aggregate CDS comparison |
| % of class filled via early rounds | 40-60% | CommonApp; individual CDS reports |
| ED yield assumption (colleges' model) | ~98% | ED is binding; near-100% by definition |
| Optimal ED target | School where the ED boost is largest AND that is the student's true first choice | Theory + behavioral economics |

Behavioral distortions in ED strategy:

#### 3.1.5 Portfolio Pseudocode

```javascript
function buildPortfolio(student, colleges) {
  // Mean applications by school type
  const meanApps = {
    elite_private: 13,
    well_resourced: 9,
    average_public: 6,
    first_gen: 4
  };

  // samplePoisson, selectEDSchool, and enrollmentUtility are assumed to be
  // defined elsewhere in the simulation
  const targetApps = samplePoisson(meanApps[student.schoolType]);

  // Assign ED school (binding commitment)
  // Students select top-choice reach where ED boost is largest
  const edSchool = selectEDSchool(student, colleges);

  // Build ranked list by "enrollment utility"
  // enrollmentUtility = prestige*0.30 + netCost*0.30 + programFit*0.15 +
  //                     campusVisit*0.10 + geography*0.10 + peerInfluence*0.05
  const portfolio = colleges
    .map(c => ({ college: c, utility: enrollmentUtility(student, c) }))
    .sort((a, b) => b.utility - a.utility)
    .slice(0, targetApps);

  return { edSchool, portfolio };
}
```
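The pseudocode calls `samplePoisson` without defining it. One possible implementation, offered as an assumption rather than the simulation's actual helper, is Knuth's multiplication method, which is adequate for small means such as 4-13 applications. The optional `rng` parameter is a hypothetical addition that makes the sampler testable with a deterministic source.

```javascript
// Knuth's multiplication method for sampling a Poisson-distributed integer.
// rng must return uniform values in [0, 1); defaults to Math.random.
function samplePoisson(mean, rng = Math.random) {
  const limit = Math.exp(-mean);
  let count = 0;
  let product = 1;
  do {
    count += 1;
    product *= rng(); // multiply uniforms until the product drops below e^(-mean)
  } while (product > limit);
  return count - 1;
}

// Deterministic check: with rng fixed at 0.5 and mean 2, the product falls
// below e^-2 (about 0.135) on the third draw, so the sample is 2.
console.log(samplePoisson(2, () => 0.5)); // 2
```

Because a Poisson draw can be zero, a caller that needs every student to apply somewhere may want to clamp the result, for example with `Math.max(1, samplePoisson(mean))`.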

#### 3.1.6 Behavioral Biases in Portfolio Construction

| Bias               | Magnitude                                      | Effect                                                      |
| ------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------ |
| Overconfidence     | ~1.15x on self-assessed admit probability      | Students apply to too many reaches, too few safeties        |
| Herding            | Correlation within same high school            | Peer influence on school choice; amplifies prestige seeking |
| Rankings anchoring | 58% consult rankings; 3% know correct rank     | Prestige weight dominates fit factors                       |
| Loss aversion      | Stronger for waitlist outcomes than rejections | Waitlist "hope" is overvalued                               |
| Sunk cost          | More apps = more attachment                    | Once applied, students over-weight any school they got into |

***

### 3.2 Student Yield & Enrollment Decisions

#### 3.2.1 Yield Rates by College Tier

Yield rate = proportion of admitted students who enroll. Source: CDS Part C for each school.

| College         | Tier      | Yield Rate (Class of 2029) | Notes                                               |
| ---------------------------------------------------------------------- | ---------------------------------------------------------------- | --------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------- |
| MIT             | HYPSM     | 86.6%                      | Highest in simulation                               |
| Harvard         | HYPSM     | 83.6%                      | SCEA (non-binding EA)                               |
| Stanford        | HYPSM     | ~81%                       | SCEA                                                |
| Princeton       | HYPSM     | 78.3%                      | SCEA                                                |
| Yale            | HYPSM     | 67.7%                      | SCEA                                                |
| —               | —         | —                          | —                                                   |
| Cornell         | Ivy       | 68.4%                      | RD/ED split                                         |
| UPenn           | Ivy       | 67.9%                      | ED fills 53%                                        |
| Brown           | Ivy       | 67.3%                      | ED                                                  |
| Columbia        | Ivy       | 67.1%                      | ED                                                  |
| Dartmouth       | Ivy       | 63.7%                      | ED fills 48%                                        |
| —               | —         | —                          | —                                                   |
| Michigan        | Selective | ~40-45%                    | State preference; large pool                        |
| Duke            | Near-Ivy  | ~40-45%                    | ED 51%                                              |
| Notre Dame      | Near-Ivy  | ~55-60%                    | REA; strong loyalty                                 |
| Georgetown      | Near-Ivy  | ~45%                       | Non-binding EA; no ED                               |
| Northwestern    | Near-Ivy  | ~35-40%                    | ED 53%                                              |
| Johns Hopkins   | Near-Ivy  | ~35%                       | ED                                                  |
| Carnegie Mellon | Near-Ivy  | ~30-35%                    | ED                                                  |
| WashU           | Near-Ivy  | ~35%                       | ED ~60%                                             |
| Rice            | Near-Ivy  | ~35%                       | ED                                                  |
| Vanderbilt      | Near-Ivy  | ~35%                       | ED                                                  |
| —               | —         | —                          | —                                                   |
| Williams        | LAC       | ~35%                       | ED                                                  |
| Amherst         | LAC       | ~35%                       | ED                                                  |
| Middlebury      | LAC       | ~30%                       | ED fills 68%                                        |
| —               | —         | —                          | —                                                   |
| UVA             | Selective | ~30%                       | State preference (in-state ~85%, out-of-state ~15%) |
| UCLA            | Selective | ~15-20%                    | UC system; many students hold multiple UC offers    |
| Emory           | Selective | ~25-30%                    | ED                                                  |
| Tufts           | Selective | ~30%                       | ED                                                  |
| Boston College  | Selective | ~25-30%                    | EA                                                  |

#### 3.2.2 Yield Tier Summary

| Tier              | Yield Range | Key Drivers                                  |
| ------------------------------------------------------------------------ | ------------------------------------------------------------------ | --------------------------------------------------------------------------------------------------- |
| HYPSM             | 67-87%      | Brand dominance; financial aid; ED/SCEA      |
| Ivy+              | 63-69%      | Strong ED programs; financial aid packages   |
| Near-Ivy          | 35-55%      | Competition from HYPSM; cost sensitivity     |
| Selective private | 25-40%      | Many alternatives; cost sensitivity          |
| Selective public  | 15-45%      | In-state vs. out-of-state splits; UC overlap |
| LAC               | 28-38%      | Niche appeal; overlap with Ivies             |

#### 3.2.3 Yield Prediction Model

Enrollment utility model (based on college counseling research and behavioral economics):

    enrollmentScore = prestige_weight  * prestige_rank   +  // 0.30
                      financial_weight * net_cost_factor +  // 0.30
                      program_weight   * program_fit     +  // 0.15
                      visit_weight     * campus_visit    +  // 0.10
                      geography_weight * geo_proximity   +  // 0.10
                      peer_weight      * peer_influence     // 0.05

**Financial aid is the #1 enrollment factor among admitted students:**

* NACAC survey: 49% rate financial aid "very important" in enrollment decision

* Students without demonstrated need: prestige weight rises to ~0.45

* Students with high demonstrated need: financial weight rises to ~0.50

**57% of students enroll at their first-choice school** (NACAC), implying 43% end up at lower-ranked options due to financial aid, waitlists, or other constraints.
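A minimal sketch of how the weighted utility above might be computed. It assumes every factor has already been normalized to [0, 1] with higher meaning more attractive, and that the need-based shifts are implemented by pinning one weight and rescaling the others so the total stays at 1. The names `overrideWeight`, `enrollmentWeights`, `enrollmentScore`, and the `needLevel` values are illustrative assumptions, not taken from the source research.

```javascript
// Baseline weights from the enrollment utility model (sum to 1.00).
const BASE_WEIGHTS = {
  prestige: 0.30, financial: 0.30, program: 0.15,
  visit: 0.10, geography: 0.10, peer: 0.05,
};

// Assumption: pin one weight to a target value and rescale the others
// proportionally so that all weights still sum to 1.
function overrideWeight(weights, key, target) {
  const restTotal = Object.entries(weights)
    .filter(([k]) => k !== key)
    .reduce((sum, [, w]) => sum + w, 0);
  const scale = (1 - target) / restTotal;
  const out = {};
  for (const [k, w] of Object.entries(weights)) {
    out[k] = k === key ? target : w * scale;
  }
  return out;
}

// needLevel is a hypothetical field: 'none' | 'moderate' | 'high'.
function enrollmentWeights(needLevel) {
  if (needLevel === 'none') return overrideWeight(BASE_WEIGHTS, 'prestige', 0.45);
  if (needLevel === 'high') return overrideWeight(BASE_WEIGHTS, 'financial', 0.50);
  return { ...BASE_WEIGHTS };
}

// factors: { prestige, financial, program, visit, geography, peer },
// each assumed to be normalized to [0, 1].
function enrollmentScore(factors, needLevel) {
  const w = enrollmentWeights(needLevel);
  return Object.keys(w).reduce((sum, k) => sum + w[k] * factors[k], 0);
}
```

Rescaling the remaining weights is only one way to keep the score on a fixed scale; a calibrated model could instead fit separate weight vectors per need group.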

#### 3.2.4 Income-Based Yield Heterogeneity

| Income Bracket               | Key Enrollment Driver                        | MIT-specific Pattern                                   |
| ----------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------- |
| Bottom quintile (< $40K)     | Financial aid package (Pell + school grants) | MIT no-loan policy eliminates cost barrier; yield ~90% |
| Middle quintile ($40K-$100K) | Net cost comparison across admits            | Yield ~80-85% at MIT                                   |
| Top quintile (> $200K)       | Prestige ranking; peer/family expectations   | Yield ~85-90%; strong prestige seekers                 |

***

### 3.3 College Admissions Decision Model

#### 3.3.1 Harvard's Holistic Rating System

Harvard uses six 1-6 rating scales (1 = best):

| Rating Dimension | Description                                                |
| ----------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------- |
| Academic         | Intellectual achievement; course rigor; GPA; test scores   |
| Extracurricular  | Depth, leadership, impact; national vs. regional vs. local |
| Personal         | Character, integrity, empathy; essays; recommendations     |
| Athletic         | Sports achievement; recruited athlete status               |
| Recommendation   | Quality and specificity of counselor/teacher letters       |
| Alumni Interview | If applicable; assessments vary by interviewer quality     |

**Admit rates by Overall (summary) rating:**

| Overall Rating | Admit Rate |
| --------------------------------------------------------------------- | ----------------------------------------------------------------- |
| 1              | 100%       |
| 2+             | ~90%       |
| 2              | ~70%       |
| 2-             | ~35%       |
| 3+             | ~20%       |
| 3              | ~3%        |
| 4+             | <1%        |

#### 3.3.2 ALDC Categories and Admit Rates

ALDC = Athletes, Legacies, Dean's Interest List, Children of faculty/staff. SFFA trial data for Harvard (2014-2019):

| Category                      | Admit Rate | Overall Pool Rate |
| ------------------------------------------------------------------------------------ | ----------------------------------------------------------------- | ------------------------------------------------------------------------ |
| Recruited Athletes            | ~86%       | 3.4%              |
| Faculty/Staff Children        | ~47%       | 3.4%              |
| Dean's Interest List (donors) | ~42%       | 3.4%              |
| Legacies (parent attended)    | ~34%       | 3.4%              |
| Non-ALDC                      | ~3.1%      | 3.4%              |

**ALDC composition of admitted classes (Harvard, 2014-2019):**

* 43% of white admits are ALDC

* < 16% of Black, Asian, or Hispanic admits are ALDC

* Among white ALDC admits: approximately 3 in 4 would be rejected on non-ALDC basis

#### 3.3.3 Hook Implementation: Multiplicative Approach

Hooks must be applied multiplicatively, not additively. If the baseline admit probability is p:

    p_athlete  = min(1.0, p × athlete_multiplier)
    p_legacy   = min(1.0, p × legacy_multiplier)
    p_combined = min(1.0, p × athlete_multiplier × legacy_multiplier)

The alternative (additive) approach would overstate the benefit for students with very low baseline probabilities and understate it for moderate-probability students.
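The multiplicative rule can be sketched as a single helper that multiplies the baseline by every applicable hook and caps the result at 1.0. The function name `applyHooks` and the multiplier values below are placeholders chosen for the arithmetic, not calibrated parameters.

```javascript
// Multiply the baseline probability by every applicable hook multiplier,
// then cap at 1.0 so the result remains a valid probability.
function applyHooks(baseProbability, multipliers) {
  const combined = multipliers.reduce((product, m) => product * m, 1);
  return Math.min(1.0, baseProbability * combined);
}

const athlete = 4.0;  // hypothetical multiplier, for illustration only
const legacy = 2.0;   // hypothetical multiplier, for illustration only

console.log(applyHooks(0.05, [athlete]));          // 0.2
console.log(applyHooks(0.05, [athlete, legacy]));  // 0.4
console.log(applyHooks(0.30, [athlete, legacy]));  // 1 (capped)
```

Because the boost scales with the baseline, a hook cannot lift a near-zero applicant to a high probability on its own, which is the behavior the additive alternative gets wrong.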

#### 3.3.4 Round-Based Admit Threshold Multipliers

In a threshold-based model, the round "multiplier" scales the admission threshold relative to RD: a value below 1.00 means the bar is lower in that round.

| Round                             | Threshold Multiplier | Effective Boost | Notes                      |
| ---------------------------------------------------------------------------------------- | --------------------------------------------------------------------------- | ---------------------------------------------------------------------- | --------------------------------------------------------------------------------- |
| Early Decision (ED)               | 0.70                 | ~1.5-2.0x       | Binding; largest boost     |
| Early Action (EA)                 | 0.85                 | ~1.2-1.4x       | Non-binding; smaller boost |
| Single-Choice Early Action (SCEA) | 0.85                 | ~1.2-1.4x       | Non-binding but exclusive  |
| Early Decision II (EDII)          | 0.75                 | ~1.4-1.7x       | Between ED and EA          |
| Regular Decision (RD)             | 1.00                 | Baseline        | Full competition           |

The threshold multiplier is applied to the score threshold below which candidates are rejected: a lower threshold value = more students clear the bar = higher admit rate.
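As a rough sketch of the table above, the multiplier can be looked up per round and applied to a college's base threshold before comparing against the applicant's score. The names `ROUND_THRESHOLD_MULTIPLIER` and `clearsThreshold` are hypothetical, and the example assumes scores and thresholds share one scale where higher is stronger.

```javascript
// Threshold multipliers from the table above (RD is the baseline).
const ROUND_THRESHOLD_MULTIPLIER = {
  ED: 0.70, EDII: 0.75, EA: 0.85, SCEA: 0.85, RD: 1.00,
};

// Returns true when the applicant's score clears the round-adjusted bar.
function clearsThreshold(admitScore, baseThreshold, round) {
  const multiplier = ROUND_THRESHOLD_MULTIPLIER[round];
  if (multiplier === undefined) throw new Error(`Unknown round: ${round}`);
  return admitScore >= baseThreshold * multiplier;
}

// The same applicant (score 80 against a base threshold of 100)
// clears the bar in ED and EDII but not in EA or RD.
for (const round of ['ED', 'EDII', 'EA', 'RD']) {
  console.log(round, clearsThreshold(80, 100, round));
}
```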

#### 3.3.5 Waitlist Mechanics

```javascript
function assignWaitlist(student, college, roundMultiplier = 1.0) {
  const admitThreshold = college.baseThreshold * roundMultiplier;
  // Waitlist band sits 30% below the admit cutoff. It must be lower than
  // admitThreshold, otherwise the 'waitlist' branch can never be reached.
  const waitlistThreshold = admitThreshold * 0.70;

  if (student.admitScore >= admitThreshold) return 'admit';
  if (student.admitScore >= waitlistThreshold) return 'waitlist';
  return 'deny';
}
```

Waitlist activation rates (% of waitlisted students eventually admitted):


***

### 3.4 College Enrollment Management

#### 3.4.1 Core Enrollment Management Formula

    Admits_needed = Target_class_size / Expected_yield_rate

Colleges must over-admit to hit enrollment targets because yield is uncertain. If yield is lower than expected, the waitlist is activated. If yield is higher, the incoming class exceeds capacity (rare, but it has occurred at UVA and elsewhere).
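A small worked sketch of the formula, with round numbers chosen for easy arithmetic rather than taken from any college's data. The helper names `admitsNeeded` and `projectedClassSize` are assumptions for illustration.

```javascript
// Admits_needed = Target_class_size / Expected_yield_rate, rounded up
// because a college cannot send a fractional offer.
function admitsNeeded(targetClassSize, expectedYield) {
  if (expectedYield <= 0 || expectedYield > 1) {
    throw new RangeError('expectedYield must be in (0, 1]');
  }
  return Math.ceil(targetClassSize / expectedYield);
}

// Class size that results if realized yield differs from the estimate.
function projectedClassSize(admits, realizedYield) {
  return Math.round(admits * realizedYield);
}

const target = 1500;
const admits = admitsNeeded(target, 0.75);                    // 2000 offers
const shortfall = target - projectedClassSize(admits, 0.70);  // 100 seats short
console.log({ admits, shortfall });
```

In this example a five-point miss on yield leaves 100 seats unfilled, which is the gap the waitlist is meant to cover.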

#### 3.4.2 Early Round Fill Rates

Early rounds reduce yield uncertainty by locking in a portion of the class early (ED is binding; SCEA students are more likely to yield):

| Tier              | Fill % via Early Rounds        | ED/SCEA Yield   | Notes                                                |
| ----------------- | ------------------------------ | --------------- | ---------------------------------------------------- |
| HYPSM             | 15-25% (SCEA/REA, non-binding) | 70-85% via SCEA | Harvard/Princeton/Yale/Stanford use non-binding SCEA |
| Ivy+ (with ED)    | 40-53%                         | ~98% ED         | UPenn 53%, Northwestern 53%, Duke 51%                |
| Near-Ivy          | 35-50%                         | ~98% ED         | WashU ~60%, Middlebury 68%                           |
| Selective private | 25-40%                         | ~97% ED         | Varies widely                                        |
| Selective public  | 5-15%                          | EA, not ED      | No binding commitment; lower certainty               |

#### 3.4.3 Melt Rates

Melt = students who commit but do not enroll (withdraw after May 1 deposit).

| Tier                         | Melt Rate | Notes                                          |
| ---------------------------- | --------- | ---------------------------------------------- |
| Elite privates (HYPSM, Ivy+) | 1-3%      | Strong brand; high engagement                  |
| Near-Ivy / selective private | 3-7%      | Some competition from financial aid offers     |
| Selective publics            | 5-10%     | UC system multiple admit; financial comparisons |
| Less selective               | 10-40%    | High competition; cost sensitivity             |

#### 3.4.4 Waitlist Activation Logic

```javascript
function activateWaitlist(college, enrolledCount) {
  const shortfall = college.targetClassSize - enrolledCount;

  if (shortfall <= 0) {
    college.waitlistActivated = false;
    return [];
  }

  // Activate waitlist; admit top candidates until shortfall filled
  const waitlistAdmits = college.waitlist
    .sort((a, b) => b.admitScore - a.admitScore)
    .slice(0, Math.ceil(shortfall * 2.5))  // Over-offer by 150% for waitlist yield ~40%
    .map(s => ({ ...s, status: 'admitted_waitlist' }));

  return waitlistAdmits;
}
```

***

## 4. Open-Source Data Catalog

### 4.1 CommonApp & NACAC Data

#### 4.1.1 CommonApp Reports (Public)

| Report                                    | URL                         | Data Available                                        |
| ------------------------------------------------------------------------------------------------ | ---------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------ |
| 2024-25 End-of-Season                     | commonapp.org/research/data | Apps per applicant, applicant count, school breakdown |
| Historical reports (2014-2024)            | commonapp.org/research/data | 10-year trend series                                  |
| Annual State of College Admission (NACAC) | nacacnet.org/research       | Factor importance survey, application trends          |

**Key CommonApp statistics (2024-25):**

* Total applications: 10,193,579

* Unique applicants: 1,497,000+

* Applications per applicant: 6.80

* First-generation applicants: ~22-26% of domestic applicants

* International applicants: ~16% of total

#### 4.1.2 NACAC Factor Importance Survey

NACAC surveys admissions offices annually on factor importance. Most recent full dataset (Fall 2023, n=185 institutions):

| Factor                         | % Rating "Considerable Importance" |
| ------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------- |
| Grades in all courses          | 93.0%                              |
| Grades in college-prep courses | 91.9%                              |
| Strength of curriculum         | 86.5%                              |
| Character/personal qualities   | 65.8%                              |
| Essay/writing sample           | 56.2%                              |
| Counselor recommendation       | 51.9%                              |
| Teacher recommendation         | 51.3%                              |
| Extracurricular activities     | 50.8%                              |
| Demonstrated interest          | 43.3%                              |
| SAT/ACT scores                 | 30.3%                              |
| Class rank                     | 19.5%                              |
| Work experience                | 14.2%                              |
| State residency                | 11.3%                              |
| Interview                      | 9.4%                               |
| First-generation status        | 8.6%                               |
| Legacy                         | 2.7%                               |

**Simulation implication:** Grades dominate at 93% vs. SAT/ACT at 30.3%. The simulation's academic score should weight GPA more heavily than test scores (recommended 60/40 GPA/SAT split within the academic index).
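One way the recommended 60/40 split could be expressed is shown below. The normalizations (unweighted GPA on a 0-4.0 scale, SAT on 400-1600) and the function name `academicIndex` are assumptions made for this sketch; the simulation may use different scales.

```javascript
// Blend normalized GPA and SAT into a single academic index in [0, 1].
// gpaWeight defaults to the recommended 0.60; SAT receives the remainder.
function academicIndex(gpa, sat, gpaWeight = 0.60) {
  const clamp01 = x => Math.min(Math.max(x, 0), 1);
  const gpaNorm = clamp01(gpa / 4.0);            // assumed 0-4.0 unweighted scale
  const satNorm = clamp01((sat - 400) / 1200);   // SAT composite range 400-1600
  return gpaWeight * gpaNorm + (1 - gpaWeight) * satNorm;
}

// A 4.0 GPA with a 1000 SAT outranks a 2.0 GPA with a perfect SAT.
console.log(academicIndex(4.0, 1000) > academicIndex(2.0, 1600));  // true
```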

#### 4.1.3 NACAC Data Access

* NACAC State of College Admission (annual PDF): free download at nacacnet.org

* Raw survey microdata: not publicly available; aggregate tables in annual report

* Trend reports: available for download 2005-present

***

### 4.2 HYPSM Common Data Sets

Each institution publishes its Common Data Set (CDS) annually in a standardized format that U.S. News and other publishers draw on. All data below is for Class of 2029 (admitted 2024-25 cycle).

#### 4.2.1 Harvard University

| Metric                   | Value     |
| ------------------------------------------------------------------------------- | ---------------------------------------------------------------- |
| Total applicants         | 47,893    |
| Admitted                 | 2,003     |
| Acceptance rate          | 4.2%      |
| Enrolled                 | 1,676     |
| Yield rate               | 83.6%     |
| EA admit rate            | 7.6%      |
| RD admit rate            | 2.6%      |
| SAT Composite middle 50% | 1500-1580 |
| SAT Math middle 50%      | 760-800   |
| SAT EBRW middle 50%      | 740-780   |
| ACT Composite middle 50% | 34-36     |

#### 4.2.2 Yale University

| Metric                   | Value     |
| ------------------------------------------------------------------------------- | ---------------------------------------------------------------- |
| Total applicants         | 57,517    |
| Admitted                 | 2,227     |
| Acceptance rate          | 3.9%      |
| SCEA admit rate          | 10.0%     |
| RD admit rate            | 3.5%      |
| SAT Composite middle 50% | 1480-1560 |
| SAT Math middle 50%      | 750-800   |
| ACT Composite middle 50% | 33-36     |

#### 4.2.3 Princeton University

| Metric                   | Value     |
| ------------------------------------------------------------------------------- | ---------------------------------------------------------------- |
| Total applicants         | 40,468    |
| Admitted                 | 1,868     |
| Acceptance rate          | 4.6%      |
| SCEA admit rate          | ~11%      |
| Yield rate               | 78.3%     |
| SAT Composite middle 50% | 1480-1570 |
| SAT Math middle 50%      | 760-800   |
| ACT Composite middle 50% | 33-36     |

#### 4.2.4 Stanford University

| Metric                   | Value          |
| ------------------------------------------------------------------------------- | --------------------------------------------------------------------- |
| Total applicants         | ~56,000 (est.) |
| Acceptance rate          | ~3.7%          |
| Yield rate               | ~81%           |
| SCEA admit rate          | ~9-11%         |
| SAT Composite middle 50% | 1480-1570      |
| ACT Composite middle 50% | 34-36          |

#### 4.2.5 MIT

| Metric                   | Value                 |
| ------------------------------------------------------------------------------- | ---------------------------------------------------------------------------- |
| Total applicants         | ~28,000               |
| Accepted                 | ~1,300                |
| Acceptance rate          | ~4.6%                 |
| Yield rate               | 86.6% (Class of 2029) |
| EA admit rate            | ~7-8%                 |
| RD admit rate            | ~3-4%                 |
| SAT Math middle 50%      | 780-800               |
| SAT Composite middle 50% | 1520-1580             |
| ACT Composite middle 50% | 35-36                 |

***

### 4.3 Top 20-50 Common Data Sets

#### 4.3.1 Acceptance Rates (Class of 2029)

| College         | Tier      | Acceptance Rate | Yield (est.)        |
| ---------------------------------------------------------------------- | ---------------------------------------------------------------- | ---------------------------------------------------------------------- | -------------------------------------------------------------------------- |
| Caltech         | Ivy+      | 2.6%            | ~50%                |
| Columbia        | Ivy       | 3.9%            | 67.1%               |
| Brown           | Ivy       | 5.5%            | 67.3%               |
| Vanderbilt      | Near-Ivy  | 5.6%            | ~35%                |
| UPenn           | Ivy       | 5.9%            | 67.9%               |
| Duke            | Near-Ivy  | 5.9%            | ~40-45%             |
| Dartmouth       | Ivy       | 6.2%            | 63.7%               |
| UChicago        | Ivy+      | 6.5%            | ~60%                |
| Johns Hopkins   | Near-Ivy  | 7.5%            | ~35%                |
| Williams        | LAC       | 7.5%            | ~35%                |
| Northwestern    | Near-Ivy  | 7.8%            | ~35-40%             |
| UCLA            | Selective | 8.6%            | ~15-20%             |
| Cornell         | Ivy       | 8.7%            | 68.4%               |
| Amherst         | LAC       | 9.0%            | ~35%                |
| Rice            | Near-Ivy  | 9.5%            | ~35%                |
| Middlebury      | LAC       | 10.0%           | ~30%                |
| Carnegie Mellon | Near-Ivy  | 11.3%           | ~30-35%             |
| Emory           | Selective | 11.4%           | ~25-30%             |
| Tufts           | Selective | 11.4%           | ~30%                |
| WashU           | Near-Ivy  | 12.0%           | ~35%                |
| Georgetown      | Near-Ivy  | 12.3%           | ~45%                |
| Notre Dame      | Near-Ivy  | 12.4%           | ~55-60%             |
| Boston College  | Selective | 16.7%           | ~25-30%             |
| Michigan        | Selective | 18.0%           | ~40-45%             |
| UVA             | Selective | 20.0%           | ~30% (out-of-state) |

#### 4.3.2 SAT Middle 50% by College

| College         | SAT Composite M50%    | SAT Math M50% |
| ---------------------------------------------------------------------- | ---------------------------------------------------------------------------- | -------------------------------------------------------------------- |
| Caltech         | 1530-1580             | 790-800       |
| Columbia        | 1500-1570             | 770-800       |
| UChicago        | 1510-1580             | 770-800       |
| Northwestern    | 1480-1570             | 750-800       |
| Brown           | 1460-1560             | 740-800       |
| UPenn           | 1460-1560             | 740-800       |
| Duke            | 1480-1570             | 760-800       |
| Dartmouth       | 1440-1560             | 730-790       |
| Cornell         | 1420-1560             | 720-790       |
| Rice            | 1480-1560             | 750-800       |
| WashU           | 1480-1570             | 760-800       |
| Vanderbilt      | 1480-1570             | 750-800       |
| Johns Hopkins   | 1490-1570             | 760-800       |
| Notre Dame      | 1430-1540             | 730-790       |
| Carnegie Mellon | 1480-1570             | 760-800       |
| Georgetown      | 1400-1530             | 700-780       |
| Williams        | 1430-1570             | 730-800       |
| Amherst         | 1420-1560             | 720-790       |
| Middlebury      | 1360-1530             | 680-770       |
| Emory           | 1380-1530             | 700-770       |
| Tufts           | 1400-1530             | 710-780       |
| Boston College  | 1350-1510             | 700-770       |
| Michigan        | 1340-1530             | 700-790       |
| UVA             | 1340-1520             | 700-790       |
| UCLA            | Test-free (UC system) | N/A           |

***

### 4.4 Feeder School Datasets

#### 4.4.1 Available Public Data Sources

| Source                                   | Data Available                                                                            | Access                     |
| ----------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------ | --------------------------------------------------------------------------------- |
| NSC High School Benchmarks               | % of HS seniors enrolling in 2-yr, 4-yr colleges; no school-specific college destinations | Free public PDF            |
| Harvard Crimson investigation (2024)     | Named feeder schools, 15-year totals, approximate counts                                  | Free web                   |
| Chetty/Deming/Friedman (2023) NBER 31492 | Ivy-Plus enrollment by parental income decile; private HS effect                          | Free NBER preprint         |
| Arcidiacono et al. SFFA trial exhibits   | School-level admit rate variation (Harvard only, 2000-2017)                               | Court documents; public    |
| Individual school profiles / Naviance    | School-specific college send lists                                                        | Restricted (student login) |
| NSC StudentTracker                       | Institution-level college enrollment data                                                 | Restricted; subscription   |

#### 4.4.2 Key Feeder School Benchmarks (Harvard Crimson 2024)

| School               | Harvard Students (15 years) | Annual Rate (est.) | Category                 |
| --------------------------------------------------------------------------- | ---------------------------------------------------------------------------------- | ------------------------------------------------------------------------- | ------------------------------------------------------------------------------- |
| Boston Latin School  | 100+                        | 7-8/year           | Selective magnet/exam    |
| Phillips Andover     | 100+                        | 7-8/year           | Elite boarding           |
| Stuyvesant HS        | 100+                        | 7-8/year           | Selective exam school    |
| Phillips Exeter      | 100+                        | 7-8/year           | Elite boarding           |
| Noble & Greenough    | 70-100                      | 5-7/year           | Elite day school         |
| Trinity School (NYC) | 70-100                      | 5-7/year           | Elite day school         |
| Lexington HS (MA)    | 70-100                      | 5-7/year           | Affluent suburban public |

#### 4.4.3 Chetty et al. Key Finding

From NBER Working Paper 31492 (Chetty, Deming & Friedman 2023):

* Children from top-1% families are 2.0x as likely to attend an Ivy-Plus school as middle-class children with the same test scores

* Children from top-0.1% families are approximately 3x as likely

* This advantage is "almost entirely driven" by private high school attendance

* Post-SAT-score effect: roughly half of the remaining socioeconomic gradient is explained by non-academic characteristics (ECs, recommendations, essays) and half by institutional preferences

***

## 5. Optimal Matching Theory

### 5.1 Gale-Shapley Algorithm

#### 5.1.1 Problem Statement

The stable matching problem: given n students and n colleges, each with complete preference rankings over the other side, find an assignment that is **stable**, meaning there is no student-college pair (s, c) where s prefers c to their current match AND c prefers s to one of its currently assigned students. Such a pair is called a blocking pair.

#### 5.1.2 Student-Proposing Deferred Acceptance

    Algorithm: Student-Proposing DA
    Input: Student preferences P_S, College preferences P_C, College capacities q_c

    Initialize: All students unmatched, all colleges have empty provisional lists

    while exists an unmatched student s with proposals remaining:
        s proposes to the next college c on s's preference list

        if c has space (|match(c)| < q_c):
            c tentatively accepts s
        else:
            s' = c's least preferred current match
            if c prefers s over s':
                c rejects s' (s' becomes unmatched and can propose again)
                c tentatively accepts s
            else:
                c rejects s

    return final tentative acceptances as matches

**Termination:** O(n²) total proposals in worst case; Ω(n²) lower bound — asymptotically optimal.

**Average case:** Under random preferences, expected proposals ≈ n ln n (Pittel 1989).
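The pseudocode above might be rendered in JavaScript roughly as follows. This is a sketch for experimentation, not the simulation's engine: the function name `deferredAcceptance`, the plain-object input shapes, and the use of string IDs are all assumptions.

```javascript
// studentPrefs: { studentId: [collegeId, ...] }  most preferred first
// collegePrefs: { collegeId: [studentId, ...] }  most preferred first
// capacities:   { collegeId: numberOfSeats }
// IDs are assumed to be strings.
function deferredAcceptance(studentPrefs, collegePrefs, capacities) {
  // Precompute each college's ranking of each student (lower = preferred).
  const rank = {};
  for (const [c, prefs] of Object.entries(collegePrefs)) {
    rank[c] = new Map(prefs.map((s, i) => [s, i]));
  }
  const held = Object.fromEntries(Object.keys(collegePrefs).map(c => [c, []]));
  const nextChoice = Object.fromEntries(Object.keys(studentPrefs).map(s => [s, 0]));
  const free = Object.keys(studentPrefs);

  while (free.length > 0) {
    const s = free.pop();
    const prefs = studentPrefs[s];
    if (nextChoice[s] >= prefs.length) continue;  // list exhausted: stays unmatched
    const c = prefs[nextChoice[s]++];
    if (!rank[c] || !rank[c].has(s)) {            // c does not rank s: try next choice
      free.push(s);
      continue;
    }
    held[c].push(s);                              // tentative acceptance
    if (held[c].length > capacities[c]) {
      // Over capacity: reject the least-preferred student currently held.
      held[c].sort((a, b) => rank[c].get(a) - rank[c].get(b));
      free.push(held[c].pop());
    }
  }
  return held;  // { collegeId: [studentId, ...] }
}

const match = deferredAcceptance(
  { a: ['X', 'Y'], b: ['X', 'Y'], c: ['X', 'Y'] },
  { X: ['b', 'a', 'c'], Y: ['a', 'b', 'c'] },
  { X: 1, Y: 1 }
);
console.log(match);  // { X: [ 'b' ], Y: [ 'a' ] }  (student c is unmatched)
```

Each student proposes to any given college at most once, so the loop performs at most the total length of all preference lists in proposals, in line with the O(n²) bound.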

#### 5.1.3 Key Theorems

**Proposer-Optimality Theorem (Gale & Shapley 1962):**
Student-proposing DA produces the student-optimal stable matching: every student gets the best possible partner in any stable matching.

**Receiver-Pessimality:**
The student-optimal stable matching is simultaneously the college-pessimal stable matching: every college gets the worst possible stable match from their perspective.

**Strategy-Proofness (Roth 1982):**
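Under student-proposing DA, no student can benefit from misreporting preferences. Truth-telling is a dominant strategy for students. The guarantee covers only the proposing side: colleges can sometimes gain by misreporting preferences or capacities, and no stable mechanism is strategy-proof for both sides.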

**Rural Hospital Theorem:**
The same set of students is unmatched across all stable matchings. If a college is under-filled in one stable matching, it is under-filled in all stable matchings. You cannot "fill" a rural hospital by choosing a different stable matching algorithm.

#### 5.1.4 Nobel Prize and Real-World Applications

| System                   | Year          | Algorithm             | Notes                                                                           |
| ------------------------------------------------------------------------------- | -------------------------------------------------------------------- | ---------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------- |
| NRMP (medical residency) | 1998 redesign | Resident-proposing DA | Previously used hospital-proposing; switched after Roth 1984 identified problem |
| NYC high school match    | 2003          | Student-proposing DA  | 80,000 students/year; designed by Abdulkadiroglu, Pathak, Roth                  |
| Boston school assignment | 2005          | Student-proposing DA  | Replaced Boston mechanism                                                       |
| Nobel Prize              | 2012          | —                     | Alvin Roth and Lloyd Shapley                                                    |

***

### 5.2 US Admissions as a Matching Market

#### 5.2.1 Why US College Admissions Is NOT a Stable Matching

| Feature                    | Medical Residency (NRMP)        | US College Admissions                      |
| --------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------- |
| Coordination               | Centralized clearinghouse       | Decentralized; each school runs separately |
| Algorithm                  | Resident-proposing DA           | None; sequential rounds                    |
| Timing                     | Simultaneous national match day | ED → EA → EDII → RD over 5 months          |
| Binding commitments        | Yes; match is binding           | ED only; RD is non-binding                 |
| Student strategy-proofness | Yes (proposing side)            | No                                         |
| Stable matching            | Yes, by design                  | No; many blocking pairs exist              |

**Consequence:** Many "blocking pairs" exist in the outcome of the US admissions market: students who prefer College A and would have been admitted by College A if they had applied, but instead attend College B. Each such pair is an instance of instability in the matching-theory sense, and their prevalence is a measure of market inefficiency.
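A blocking-pair count can be sketched as below. The data shapes are assumptions made for this example (each student carries a ranked `prefs` list and an `enrolled` college ID; each college carries `capacity`, an `enrolled` list, and a `scores` map of its own evaluation of every student) and may not match the simulation's actual objects.

```javascript
// students: [{ id, prefs: [collegeId, ...] (best first), enrolled: collegeId | null }]
// colleges: { collegeId: { capacity, enrolled: [studentId, ...], scores: { studentId: number } } }
function countBlockingPairs(students, colleges) {
  let count = 0;
  for (const s of students) {
    // Colleges the student strictly prefers to the current outcome.
    const idx = s.prefs.indexOf(s.enrolled);
    const preferred = idx === -1 ? s.prefs : s.prefs.slice(0, idx);

    for (const cId of preferred) {
      const c = colleges[cId];
      const hasSeat = c.enrolled.length < c.capacity;
      // Lowest score among the college's current students (Infinity if none).
      const weakest = Math.min(...c.enrolled.map(id => c.scores[id]));
      // The pair blocks if the college has room or would swap s in.
      if (hasSeat || c.scores[s.id] > weakest) count++;
    }
  }
  return count;
}

const demoStudents = [
  { id: 'a', prefs: ['X', 'Y'], enrolled: 'Y' },
  { id: 'b', prefs: ['X', 'Y'], enrolled: 'X' },
];
const demoColleges = {
  X: { capacity: 1, enrolled: ['b'], scores: { a: 90, b: 80 } },
  Y: { capacity: 1, enrolled: ['a'], scores: { a: 50, b: 60 } },
};
console.log(countBlockingPairs(demoStudents, demoColleges));  // 1
```

Here student a prefers X and X rates a above its enrolled student b, so (a, X) is the single blocking pair.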

#### 5.2.2 Unraveling in the Admissions Market

Roth & Xing (1994) describe "unraveling" in matching markets: when timing is decentralized, participants rush to make early commitments to reduce uncertainty, eventually moving so early that the market unravels (medical fellowships scheduling interviews 2 years in advance; college applications moving to October-November of senior year).

**Evidence in college admissions:**

* Share of class filled via early rounds: ~33% in 2010 → ~40-60% in 2024

* ED application volume at selective schools grew ~30-50% from 2015 to 2025

* Yale SCEA applications grew from ~4,600 to ~7,900 between 2015 and 2023

The ED mechanism is the admissions market's response to unraveling: a formal institution that provides binding commitment as a substitute for the coordinating role that a centralized clearinghouse would serve.

#### 5.2.3 ED as a Signaling Mechanism

Avery & Levin (2010, AER) model ED as a credible signal of first-choice preference:

* Without ED, students cannot credibly signal first-choice status (cheap talk)

* ED makes the signal costly (binding commitment; no financial aid comparison)

* Colleges rationally lower their admission threshold for ED applicants who signal genuine enthusiasm

* Sophisticated students (typically high-SES) are better positioned to identify and commit to their true first choice early

**Avery, Fairbanks & Zeckhauser (2003) empirical finding:**

* ED provides approximately +100 SAT points equivalent in admissions advantage

* More recent estimates (post-2010): +150-200 SAT points equivalent at some schools

**International analogs:**

* Turkey: centralized post-exam DA (OSYM)

* Brazil: SISU (centralized DA over ENEM scores)

* Chile: Sistema de Acceso

* Germany: Hochschulstart (medicine, law, pharmacy)

* Taiwan: multi-stage centralized system

#### 5.2.4 Theory Recommendation for Simulation

The simulation should NOT implement Gale-Shapley as its core engine. Instead:

* Model the decentralized, round-based sequential process (as currently done)

* ED fills colleges early with near-certainty (binding)

* Remaining rounds are sequential and non-binding

* Stability can be analyzed as a secondary metric (count blocking pairs) but is not the objective

* The simulation's round-based model correctly captures the market's actual dynamics

***

### 5.3 Student Welfare Optimization

#### 5.3.1 What Student-Optimal Means in Practice

In the NYC high school match (80,000 students/year):

* Before DA (2002 and earlier): 31,000 students per year were unmatched, receiving administrative assignment to a school they did not list

* After DA (2003 and later): approximately 3,000 unmatched — a 90% reduction

* Students received, on average, their 3rd choice rather than their 5th choice

**Abdulkadiroglu, Agarwal & Pathak (2017, AER):**

* Simulated counterfactual: what if students were randomly assigned (no market)?

* Student-proposing DA achieves ~80% of the welfare gains achievable by any mechanism

* Coordination effect vs. algorithm effect: moving students to express preferences simultaneously captures ~80% of total possible welfare gains, regardless of which algorithm is used

#### 5.3.2 Pareto Efficiency vs. Stability: The Fundamental Tradeoff

| Property                  | Student-Proposing DA      | Top Trading Cycles (TTC)  |
| -------------------------------------------------------------------------------- | -------------------------------------------------------------------------------- | -------------------------------------------------------------------------------- |
| Stable matching           | Yes                       | No                        |
| Pareto efficiency         | No                        | Yes                       |
| Strategy-proof (students) | Yes                       | Yes                       |
| Used in practice          | Yes (school choice, NRMP) | Limited (kidney exchange) |

No mechanism can simultaneously achieve stability AND Pareto efficiency (in general). The choice between DA and TTC depends on context:

* DA prioritizes that no student-school blocking pair exists (fairness criterion)

* TTC prioritizes that no student can be made better off without making another worse off

For school choice with priority-based fairness (e.g., siblings, neighborhood), DA is preferred.

#### 5.3.3 Boston Mechanism Failure and Equity Implications

The Boston mechanism (Immediate Acceptance) was used in many US cities' school choice systems before economic redesigns. Its flaw: it is NOT strategy-proof. Unsophisticated families who list their true first choice are at risk of being left without options if rejected.

**Agarwal & Somaini (2018):**

* Estimated structural model of Boston mechanism participation

* Welfare cost of Boston mechanism fell disproportionately on less sophisticated (lower-income, lower-education) families

* Sophisticated families (with more information) could strategically "game" the first-round priority to their advantage

**Analog to ED in college admissions:**

* ED mimics the first round of the Boston mechanism: committing early captures a large priority boost

* Families that cannot commit early (due to financial aid uncertainty) are at a structural disadvantage

* First-generation students disproportionately avoid ED — rational given their constraints, but costly in admission outcomes

#### 5.3.4 Equity Implications for Simulation

The simulation can model welfare equity by tracking:

```javascript
// Equity metrics to compute post-simulation.
// Assumed helpers (not defined in this document): groupBy(items, keyFn) returns
// an array of groups, mean(numbers) returns the average, and
// countBlockingPairs(students, colleges) returns an integer.
const equityMetrics = {
  // Yield-adjusted match quality (students who enrolled nowhere are skipped)
  meanCollegeRankByIncome: groupBy(students, s => s.incomeQuintile)
    .map(g => mean(g.filter(s => s.enrolledCollege).map(s => s.enrolledCollege.rank))),

  // ED usage by first-gen status
  edUsageByFirstGen: groupBy(students, s => s.firstGen)
    .map(g => g.filter(s => s.appliedED).length / g.length),

  // Blocking pairs count (stability diagnostic)
  blockingPairs: countBlockingPairs(students, colleges),

  // Students matched below safety school (a larger rank number is a worse school)
  belowSafetyMatch: students.filter(s =>
    s.enrolledCollege && s.safetySchools.length > 0 &&
    s.enrolledCollege.rank > Math.min(...s.safetySchools.map(c => c.rank))
  ).length
};
```

***

### 5.4 K-12 School Choice Parallels

#### 5.4.1 Boston Mechanism: The Broken First-Come System

The Boston mechanism (also called Immediate Acceptance or First Preference First):

  1. All students simultaneously submit ranked preference lists
  2. Round 1: Each school admits students who ranked it first, up to capacity
  3. Students not admitted in Round 1 carry their remaining preferences to Round 2
  4. Round 2: Schools admit students who ranked them second (but not yet admitted), up to remaining capacity
  5. Continue until all students are placed or exhausted

**Why it fails:** Any student who ranks their true first choice first risks being left without a seat if rejected. The strategically correct move is to rank a "safer" school first if the probability of getting your true first choice is low. Unsophisticated families cannot make this calculation.
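
The failure mode described above can be reproduced with a small toy example. The following is a minimal sketch of Immediate Acceptance, not code from the simulation; the function name `bostonMechanism`, the data shapes, and the three-student example are all assumptions made for illustration.

```javascript
// Immediate Acceptance (Boston mechanism), illustrative sketch.
// prefs:    { studentId: [schoolId, ...] }  ranked lists, most preferred first
// priority: { schoolId: [studentId, ...] }  school priority order, best first
// capacity: { schoolId: numberOfSeats }
function bostonMechanism(prefs, priority, capacity) {
  const seatsLeft = { ...capacity };
  const assignment = {};
  const students = Object.keys(prefs);
  const maxLen = Math.max(...students.map(s => prefs[s].length));

  for (let round = 0; round < maxLen; round++) {
    // Group still-unassigned students by the school they rank in this round
    const applicants = {};
    for (const s of students) {
      if (s in assignment) continue;
      const school = prefs[s][round];
      if (school === undefined) continue;
      if (!applicants[school]) applicants[school] = [];
      applicants[school].push(s);
    }
    // Each school permanently admits its highest-priority applicants
    for (const [school, pool] of Object.entries(applicants)) {
      pool.sort((a, b) => priority[school].indexOf(a) - priority[school].indexOf(b));
      const admitted = pool.slice(0, seatsLeft[school]);
      for (const s of admitted) assignment[s] = school;
      seatsLeft[school] -= admitted.length;
    }
  }
  for (const s of students) if (!(s in assignment)) assignment[s] = null;
  return assignment;
}

// Toy example: everyone truly prefers X over Y; one seat each.
const priority = { X: ['a', 'b', 'c'], Y: ['b', 'c', 'a'] };
const capacity = { X: 1, Y: 1 };

// All three report truthfully: b falls back to Y, c is unassigned.
const truthful = bostonMechanism(
  { a: ['X', 'Y'], b: ['X', 'Y'], c: ['X', 'Y'] }, priority, capacity);

// c strategically ranks Y first: c takes Y's seat in Round 1, and the
// truthful student b is stranded despite having higher priority at Y.
const strategic = bostonMechanism(
  { a: ['X', 'Y'], b: ['X', 'Y'], c: ['Y', 'X'] }, priority, capacity);
```

In this toy case the only change between the two runs is student c's reported list, yet the truthful student b loses a seat where b holds higher priority. That is the manipulation problem in miniature.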

#### 5.4.2 Deferred Acceptance: The Fix

Under DA:

**Boston (2005) reform:** Boston Public Schools switched from the Boston mechanism to student-proposing DA after economists (Abdulkadiroglu, Pathak, Roth) identified the manipulation problem. The switch produced measurable welfare improvements.

#### 5.4.3 NYC High School Match Design

New York City high school admissions (2003 redesign by Abdulkadiroglu, Pathak, Roth):

#### 5.4.4 ED as Boston Mechanism Analog

The parallel between college admissions and the Boston mechanism failure:

| Feature | Boston Mechanism (school choice) | ED in College Admissions |
| --- | --- | --- |
| Commitment timing | Irrevocable in Round 1 | Binding commitment in ED |
| Strategic advantage | To sophisticated families who know to rank safe school | To high-SES families who can commit without financial aid comparison |
| Unsophisticated penalty | Ranked true first choice → stranded if rejected | Avoided ED due to uncertainty → lost 1.5-2.0x boost |
| Market outcome | Many students mismatched | First-gen students systematically undermatched |

**Simulation extension ideas:**

  1. Counterfactual mode: run a centralized DA matching (all students express preferences simultaneously; colleges match simultaneously) and compare outcomes to decentralized simulation
  2. Stability analysis: count blocking pairs in final simulation outcome
  3. Equity metrics: track mean match rank by income quintile and first-gen status
  4. Boston mechanism comparison: simulate a world where all ED/EA applicants go first, then RD in one simultaneous round
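
Extension ideas 1 and 2 could be prototyped along the following lines. This is a minimal sketch under stated assumptions, not the simulation's implementation: it assumes every student has a strict ranked list, every college ranks every applicant with a numeric rank (lower is better), and the function names `deferredAcceptance` and `countBlockingPairs` (here taking plain preference maps) are hypothetical.

```javascript
// Student-proposing deferred acceptance with college capacities.
// studentPrefs: { studentId: [collegeId, ...] }        most preferred first
// collegeRank:  { collegeId: { studentId: rank } }     lower rank = preferred
// capacity:     { collegeId: seats }
function deferredAcceptance(studentPrefs, collegeRank, capacity) {
  const next = {};   // index of the next college each student will propose to
  const held = {};   // college -> students tentatively held, best first
  for (const c of Object.keys(capacity)) held[c] = [];
  const free = Object.keys(studentPrefs);
  for (const s of free) next[s] = 0;

  while (free.length > 0) {
    const s = free.pop();
    const c = studentPrefs[s][next[s]++];
    if (c === undefined) continue;           // list exhausted: s stays unmatched
    held[c].push(s);
    held[c].sort((a, b) => collegeRank[c][a] - collegeRank[c][b]);
    if (held[c].length > capacity[c]) {
      free.push(held[c].pop());              // reject the least-preferred held student
    }
  }

  const match = {};
  for (const s of Object.keys(studentPrefs)) match[s] = null;
  for (const [c, list] of Object.entries(held)) {
    for (const s of list) match[s] = c;
  }
  return match;
}

// A pair (s, c) blocks a matching if s strictly prefers c to their current
// match and c either has an empty seat or prefers s to someone it enrolled.
function countBlockingPairs(match, studentPrefs, collegeRank, capacity) {
  const enrolled = {};
  for (const c of Object.keys(capacity)) enrolled[c] = [];
  for (const [s, c] of Object.entries(match)) {
    if (c !== null) enrolled[c].push(s);
  }

  let count = 0;
  for (const [s, prefs] of Object.entries(studentPrefs)) {
    const idx = prefs.indexOf(match[s]);
    const current = idx === -1 ? prefs.length : idx;   // unmatched: prefers every listed college
    for (const c of prefs.slice(0, current)) {
      const hasSeat = enrolled[c].length < capacity[c];
      const worstRank = Math.max(...enrolled[c].map(x => collegeRank[c][x]));
      if (hasSeat || collegeRank[c][s] < worstRank) count++;
    }
  }
  return count;
}
```

Running `countBlockingPairs` on the decentralized simulation's final enrollment and on the `deferredAcceptance` output for the same preferences would give a direct stability comparison; the DA output should always score zero.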

## 6. Simulation Calibration Recommendations

This section is the primary operational reference for the simulation's parameter configuration. Each subsection provides a recommended value, the range supported by the research, and the primary source.

### 6.1 Hook Multipliers

Hook multipliers represent the factor by which a student's admission probability is multiplied given a particular "hook" (special status that colleges value).

#### 6.1.1 Athlete Hook Multiplier

**Recommended implementation:** Per-tier athlete multipliers, not a single global value.

| College Tier | Schools | Recommended Multiplier | Range | Source |
| --- | --- | --- | --- | --- |
| HYPS (DI slot system) | Harvard, Yale, Princeton, Stanford | 4.5x | 4.0-5.0x | SFFA trial; Espenshade & Chung 2005 |
| MIT (DIII, no slots) | MIT | 3.0x | 2.5-3.5x | MIT athletics office; estimated from admit rates |
| Other Ivy (DI) | Columbia, Penn, Brown, Dartmouth, Cornell | 4.0x | 3.5-4.5x | Ivy League athletics; coach feedback |
| Near-Ivy DI | Duke, Northwestern, Notre Dame | 4.0x | 3.5-4.5x | Similar to Ivy DI |
| Near-Ivy DIII | Caltech, WashU, CMU | 2.0x | 1.5-2.5x | DIII; athletics less central |
| LAC (NESCAC) | Williams, Amherst, Middlebury | 3.5x | 3.0-4.0x | NESCAC athletics culture |
| Selective public | UVA, UCLA, Michigan | 2.5x | 2.0-3.0x | Revenue sports only; walk-ons minimal |

**Current simulation:** 3.5x global. **Recommended upgrade:** Per-tier as above.

**Validation:** With 4.5x athlete multiplier at Harvard-level schools, an athlete with academic_score = 0.50 (roughly 1400 SAT, 3.8 GPA) should see ~15-20% admit probability, consistent with SFFA data.

#### 6.1.2 Legacy Hook Multiplier

| College Tier | Recommended Multiplier | Range | Source |
| --- | --- | --- | --- |
| HYPSM | 2.5x | 2.0-3.5x | SFFA trial; Arcidiacono; Espenshade |
| Ivy+ | 2.5x | 2.0-3.0x | General Ivy policy |
| Near-Ivy | 2.0x | 1.5-2.5x | Lower legacy emphasis |
| Selective | 1.5x | 1.2-2.0x | Variable by school |
| LAC | 2.0x | 1.5-2.5x | Strong alumni community |
| Selective public | 1.0x | 1.0-1.1x | Legacy less emphasized; state mission |

**Harvard SFFA data:** Legacy admit rate of 34% vs. 3.1% for non-ALDC applicants implies a gross multiplier of ~11x. After controlling for academic self-selection and correlation with other hooks, the net multiplier is estimated at 2.5-3.5x.

#### 6.1.3 Donor / Development Hook Multiplier

| College Tier | Recommended Multiplier | Range | Source |
| --- | --- | --- | --- |
| HYPSM | 4.0x | 3.0-5.0x | SFFA trial; Dean's List 42% vs. 3.1% |
| Ivy+ | 3.5x | 2.5-4.5x | Similar process |
| Near-Ivy | 3.0x | 2.0-4.0x | Less transparent; estimated |
| Selective | 2.0x | 1.5-3.0x | Smaller endowments; less room |
| LAC | 2.5x | 2.0-3.5x | Large gift importance to smaller schools |
| Selective public | 1.0x | 1.0-1.5x | Legally constrained; foundation gifts |

**Note:** Donor hook should be rare in the population (< 0.5% of applicants at any school). Set base rate accordingly.

#### 6.1.4 First-Generation College Student Multiplier

| College Tier | Recommended Multiplier | Range | Source |
| --- | --- | --- | --- |
| HYPSM | 1.4x | 1.3-1.6x | QuestBridge partnerships; first-gen initiatives |
| Ivy+ | 1.35x | 1.2-1.5x | Similar programs |
| Near-Ivy | 1.25x | 1.1-1.4x | Varies by school commitment |
| Selective | 1.2x | 1.1-1.3x | NACAC data; first-gen flag |
| Selective public | 1.3x | 1.2-1.5x | State mission; in-state first-gen emphasis |
| LAC | 1.3x | 1.2-1.5x | LAC access missions |

**Note:** First-gen is a legal, race-neutral proxy. The NACAC factor importance survey rates it at 8.6% "considerable importance" across all schools, but Pell-eligible / QuestBridge schools may be much higher.

#### 6.1.5 Combined Hook Interaction

When a student has multiple hooks, apply multiplicatively with a diminishing-returns cap:

```javascript
function applyHooks(baseProb, student, college) {
  let multiplier = 1.0;

  // Each applicable hook compounds multiplicatively
  if (student.isRecruited && college.hasAthleticSlots) {
    multiplier *= college.athleteMultiplier;
  }
  if (student.isLegacy && college.tier !== 'selective_public') {
    multiplier *= college.legacyMultiplier;
  }
  if (student.isDevelopment) {
    multiplier *= college.donorMultiplier;
  }
  if (student.isFirstGen) {
    multiplier *= college.firstGenMultiplier;
  }

  // Diminishing returns: cap multiplier at 8x to prevent degenerate outcomes
  multiplier = Math.min(multiplier, 8.0);

  return Math.min(1.0, baseProb * multiplier);
}
```

***

### 6.2 Round Multipliers by College Tier

Round multipliers reflect the lower admissions threshold (and thus higher admit probability) in early rounds. The multiplier is applied to the admission score threshold, not to the probability directly.

#### 6.2.1 Threshold Multipliers

A threshold multiplier of 0.70 means the school admits students whose score exceeds 70% of the normal RD threshold — making early admission easier.

| Round                   | Threshold Multiplier | Probability Boost (est.) | Notes                                        |
| ------------------------------------------------------------------------------ | --------------------------------------------------------------------------- | ------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------- |
| ED (binding)            | 0.70                 | 1.5-2.0x                 | Strongest signal; near-100% yield commitment |
| EDII (binding)          | 0.75                 | 1.3-1.7x                 | Later; slightly weaker                       |
| EA / SCEA (non-binding) | 0.85                 | 1.2-1.4x                 | Signal of interest without commitment        |
| RD                      | 1.00                 | Baseline                 | Full competition                             |
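
As a rough illustration of how the threshold multipliers in the table could be applied, the sketch below scales a college's RD threshold by round. The names `ROUND_THRESHOLD_MULTIPLIER` and `isAdmittedInRound` and the round keys are assumptions for this example; only the multiplier values come from the table.

```javascript
// Threshold multipliers from the table above, keyed by hypothetical round codes.
const ROUND_THRESHOLD_MULTIPLIER = {
  ED: 0.70,   // binding early decision
  ED2: 0.75,  // second binding round
  EA: 0.85,   // EA / SCEA, non-binding
  RD: 1.00    // regular decision baseline
};

// A student is admitted when their score exceeds the round-adjusted threshold.
// Unknown rounds fall back to the RD baseline.
function isAdmittedInRound(admitScore, rdThreshold, round) {
  const multiplier = ROUND_THRESHOLD_MULTIPLIER[round] ?? 1.0;
  return admitScore > rdThreshold * multiplier;
}

// Example: a score of 0.60 against an RD threshold of 0.80.
// ED bar is 0.56, EA bar is 0.68, RD bar is 0.80.
const edResult = isAdmittedInRound(0.60, 0.80, 'ED');  // true
const eaResult = isAdmittedInRound(0.60, 0.80, 'EA');  // false
const rdResult = isAdmittedInRound(0.60, 0.80, 'RD');  // false
```

Scaling the threshold rather than the probability keeps the early-round advantage largest for applicants just below the RD bar, which matches the "lower admissions threshold" framing above.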

#### 6.2.2 Round Multipliers by Tier

Different tiers offer different round structures:

| Tier                | ED Threshold    | EA/SCEA Threshold | Notes                                                          |
| -------------------------------------------------------------------------- | ---------------------------------------------------------------------- | ------------------------------------------------------------------------ | --------------------------------------------------------------------------------------------------------------------- |
| HYPSM               | N/A (SCEA only) | 0.85              | Harvard/Yale/Princeton/Stanford/MIT all use SCEA or EA, not ED |
| Ivy+ (with ED)      | 0.70            | N/A               | Columbia, Penn, Brown, Dartmouth, Cornell use ED               |
| Near-Ivy (with ED)  | 0.70            | N/A               | Most Near-Ivies use ED                                         |
| Near-Ivy (with REA) | N/A             | 0.85              | Notre Dame uses REA (restrictive EA)                           |
| Near-Ivy (no early) | 1.00            | N/A               | Georgetown (historically limited EA)                           |
| Selective           | 0.70            | 0.85              | Many offer both ED and EA                                      |
| Selective public    | N/A             | 0.90              | EA only; non-binding; weaker boost                             |

**Important:** Harvard and Stanford use "Restrictive Early Action", and Yale and Princeton use SCEA. Both are non-binding, but students face limits on applying EA/ED elsewhere. These behave like EA in terms of commitment but have exclusivity restrictions. MIT's Early Action is non-binding and carries no such exclusivity restriction.

#### 6.2.3 Fill Rates by Round

| Tier                          | % Class Filled via Early Rounds |
| ------------------------------------------------------------------------------------ | -------------------------------------------------------------------------------------- |
| HYPSM (via SCEA/REA)          | 15-25%                          |
| Ivy+ (via ED)                 | 40-53%                          |
| Near-Ivy (via ED)             | 35-60%                          |
| Selective private (via ED+EA) | 25-45%                          |
| Selective public (via EA)     | 10-20%                          |

***

### 6.3 Yield Rates by Tier

#### 6.3.1 Recommended Yield Parameters

| College         | Recommended Yield | CDS Source | Notes                                  |
| ---------------------------------------------------------------------- | ------------------------------------------------------------------------ | ----------------------------------------------------------------- | --------------------------------------------------------------------------------------------- |
| MIT             | 86.6%             | CDS 2029   | Highest in simulation                  |
| Harvard         | 83.6%             | CDS 2029   | SCEA aids retention                    |
| Stanford        | 81.0%             | Estimated  | SCEA                                   |
| Princeton       | 78.3%             | CDS 2029   | SCEA                                   |
| Yale            | 67.7%             | CDS 2029   | SCEA but more competition              |
| Cornell         | 68.4%             | CDS 2029   | ED fills 38%                           |
| UPenn           | 67.9%             | CDS 2029   | ED 53% of class                        |
| Brown           | 67.3%             | CDS 2029   | ED                                     |
| Columbia        | 67.1%             | CDS 2029   | ED                                     |
| Dartmouth       | 63.7%             | CDS 2029   | ED 48%                                 |
| Notre Dame      | 57.0%             | Estimated  | REA; strong loyalty                    |
| Michigan        | 43.0%             | Estimated  | State preference; EA                   |
| Duke            | 42.0%             | Estimated  | ED 51%                                 |
| Georgetown      | 45.0%             | Estimated  | Limited early action                   |
| UChicago        | 62.0%             | Estimated  | ED; strong enrollment mgmt             |
| Northwestern    | 38.0%             | Estimated  | ED 53%; competition                    |
| Caltech         | 50.0%             | Estimated  | Very selective pool; high satisfaction |
| Rice            | 35.0%             | Estimated  | ED; smaller school                     |
| Vanderbilt      | 35.0%             | Estimated  | ED                                     |
| WashU           | 36.0%             | Estimated  | ED ~60%                                |
| Johns Hopkins   | 35.0%             | Estimated  | ED                                     |
| Carnegie Mellon | 33.0%             | Estimated  | ED; competition from CS programs       |
| Williams        | 35.0%             | Estimated  | ED; strong loyalty                     |
| Amherst         | 35.0%             | Estimated  | ED                                     |
| Middlebury      | 30.0%             | Estimated  | ED 68%; smaller pool                   |
| Emory           | 28.0%             | Estimated  | ED; competition                        |
| Tufts           | 30.0%             | Estimated  | ED                                     |
| Boston College  | 27.0%             | Estimated  | EA; strong alternatives                |
| UVA             | 30.0%             | Estimated  | State preference (out-of-state ~15%)   |
| UCLA            | 18.0%             | Estimated  | UC system overlap; multiple admits     |

#### 6.3.2 Yield Implementation

```javascript
function determineEnrollment(student, admits) {
  // Student selects school with highest enrollment utility
  const rankedAdmits = admits
    .map(college => ({
      college,
      utility: calculateEnrollmentUtility(student, college)
    }))
    .sort((a, b) => b.utility - a.utility);

  // Enroll at highest-utility admit; a student with no admits enrolls nowhere
  const enrolled = rankedAdmits[0]?.college ?? null;
  if (enrolled === null) {
    return null;
  }

  // Apply melt probability
  const meltRate = getMeltRate(enrolled);
  if (Math.random() < meltRate) {
    return null;  // Student melts; re-open seat
  }

  return enrolled;
}

function getMeltRate(college) {
  // Post-commit melt rate by tier; unknown tiers fall back to 5%
  const tier = college.tier;
  const meltRates = {
    hypsm: 0.02,
    ivy_plus: 0.03,
    near_ivy: 0.05,
    selective_private: 0.07,
    selective_public: 0.09,
    lac: 0.05
  };
  return meltRates[tier] || 0.05;
}
```

***

### 6.4 Applications Per Student

| School Type | Mean Apps | Std Dev | Min | Max |
| --- | --- | --- | --- | --- |
| Elite boarding (Andover, Exeter) | 13 | 3 | 7 | 22 |
| Well-resourced suburban public | 9 | 3 | 5 | 18 |
| Average suburban public | 6 | 2 | 3 | 14 |
| First-generation / under-resourced | 4 | 2 | 2 | 10 |
| Simulation overall mean (weighted) | 6.8 | | | |

**Distribution shape:** Right-skewed (Poisson or negative binomial with mean = school-type parameter). Cap at 20 to prevent unrealistic extremes.
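
One way to draw from the right-skewed count distribution described above is Knuth's multiplication method for Poisson variates. The sketch below is an assumption about how a `samplePoisson` helper might look, not the simulation's actual helper; `sampleAppCount` and the optional `rng` argument are also hypothetical. Note that a Poisson draw has variance equal to its mean, so its spread will not exactly match the Std Dev column; a negative binomial would be needed to tune the two independently.

```javascript
// Knuth's method: multiply uniform draws until the product falls below e^(-mean).
// Adequate for the small means used here (4 to 13).
function samplePoisson(mean, rng = Math.random) {
  const limit = Math.exp(-mean);
  let k = 0;
  let product = 1;
  do {
    k++;
    product *= rng();
  } while (product > limit);
  return k - 1;
}

// Clamp the draw to a plausible application-count range (cap of 20 from the text).
function sampleAppCount(mean, min = 2, max = 20, rng = Math.random) {
  return Math.min(max, Math.max(min, samplePoisson(mean, rng)));
}

// Example: portfolio sizes for an elite boarding student (mean 13).
const exampleSizes = Array.from({ length: 5 }, () => sampleAppCount(13));
```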

#### 6.4.2 ED Usage Rate by Student Type

| Student Type | ED Usage Rate | Notes |
| --- | --- | --- |
| Elite private school student | 60-75% | Sophisticated strategy; financial flexibility |
| Well-resourced suburban | 40-55% | Moderate strategic understanding |
| Average public | 25-35% | Limited counseling on ED strategy |
| First-generation | 10-20% | Financial aid comparison needed; ED avoidance rational |

#### 6.4.3 Implementation

```javascript
function generatePortfolioSize(student) {
  const meanApps = {
    elite_private: 13,
    well_resourced: 9,
    average_public: 6,
    first_gen: 4
  }[student.schoolType] || 6;

  // Poisson-like sampling (capped)
  return Math.min(20, Math.max(2,
    Math.round(samplePoisson(meanApps))
  ));
}

function willApplyED(student) {
  const edRate = {
    elite_private: 0.70,
    well_resourced: 0.48,
    average_public: 0.30,
    first_gen: 0.15
  }[student.schoolType] || 0.30;

  return Math.random() < edRate;
}
```

***

### 6.5 EC Scoring Weights

#### 6.5.1 Weight in Overall Admissions Score

| Factor                            | Recommended Weight    | Range     | Source                              |
| ---------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------- | ---------------------------------------------------------------- | ------------------------------------------------------------------------------------------ |
| Academic index (GPA + test)       | 0.42                  | 0.38-0.48 | NACAC grades 93%; Arcidiacono model |
| Extracurriculars                  | 0.27                  | 0.22-0.32 | SFFA EC rating analysis             |
| Essays / personal statement       | 0.12                  | 0.08-0.18 | NACAC 56%; holistic review          |
| Recommendations                   | 0.10                  | 0.07-0.13 | NACAC 51-52%; reader-dependent      |
| Demonstrated interest / interview | 0.06                  | 0.03-0.10 | College-specific; varies widely     |
| School context (feeder mult.)     | Applied as multiplier | —         | See 6.7                             |
| Hook multipliers                  | Applied as multiplier | —         | See 6.1                             |

**Within Academic Index:**

| Sub-factor                  | Recommended Weight |
| ---------------------------------------------------------------------------------- | ------------------------------------------------------------------------- |
| GPA                         | 0.60               |
| Standardized test (SAT/ACT) | 0.40               |

#### 6.5.2 EC Tier Bonus System

| Tier                            | Score Range | Bonus | Prevalence       |
| -------------------------------------------------------------------------------------- | ------------------------------------------------------------------ | ------------------------------------------------------------ | ----------------------------------------------------------------------- |
| Tier 1 (national/international) | 8.5-10.0    | +0.08 | 2-3% of students |
| Tier 2 (state/regional)         | 6.5-8.4     | +0.03 | 10-15%           |
| Tier 3 (active participant)     | 4.0-6.4     | +0.00 | 35-40%           |
| Tier 4 (minimal)                | 1.5-3.9     | -0.02 | 40-50%           |
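
The tier table can be expressed as a small lookup. This is an illustrative sketch: the names `EC_TIERS` and `ecTierFor` are hypothetical, and two boundary choices are assumptions because the table does not specify them (scores between listed ranges, such as 8.45, fall into the lower tier, and scores below 1.5 are treated as Tier 4).

```javascript
// Tier lower bounds and bonuses from the table above, ordered best first.
const EC_TIERS = [
  { tier: 1, minScore: 8.5, bonus: 0.08 },
  { tier: 2, minScore: 6.5, bonus: 0.03 },
  { tier: 3, minScore: 4.0, bonus: 0.00 },
  { tier: 4, minScore: -Infinity, bonus: -0.02 }  // assumption: catches scores below 1.5 too
];

// Returns the first tier whose lower bound the score meets.
function ecTierFor(ecScore) {
  return EC_TIERS.find(t => ecScore >= t.minScore);
}

// Example: a state-level achievement scored 7.2 lands in Tier 2.
const exampleTier = ecTierFor(7.2);  // { tier: 2, minScore: 6.5, bonus: 0.03 }
```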

```javascript
// Assumes sigmoid(), randn(), applyHooks(), and the constants ACADEMIC_WEIGHT,
// EC_WEIGHT, and PERCEPTION_NOISE are defined elsewhere in the simulation.
function calcAdmitScore(student, college) {
  // Academic index: GPA (60%) and test score (40%), each squashed to a 0-1 scale
  const gpaScore = sigmoid((student.gpa - 3.0) / 0.5) * 0.60;
  const satScore = sigmoid((student.sat - 1200) / 150) * 0.40;
  const academicIndex = gpaScore + satScore;  // 0-1

  // Weight the academic index per the table in 6.5.1
  const academicContribution = academicIndex * ACADEMIC_WEIGHT;  // ACADEMIC_WEIGHT = 0.42

  // EC contribution
  const ecBase = (student.ecScore / 10.0) * EC_WEIGHT;  // EC_WEIGHT = 0.27
  const ecBonus = student.ecTier === 1 ? 0.08 :
                  student.ecTier === 2 ? 0.03 :
                  student.ecTier === 4 ? -0.02 : 0;
  const ecContribution = ecBase + ecBonus;

  // Softer factors
  const softFactors = (student.essayScore / 10.0) * 0.12 +
                      (student.recScore / 10.0) * 0.10 +
                      (student.demonstratedInterest ? 0.06 : 0.03);

  // Combine
  let baseScore = academicContribution + ecContribution + softFactors;

  // Apply feeder multiplier to the full score (see 6.7.2)
  baseScore *= student.school.feederMultiplier;

  // Apply hook multipliers
  baseScore = applyHooks(baseScore, student, college);

  // Add perception noise
  const noise = randn() * PERCEPTION_NOISE;

  return Math.min(1.0, Math.max(0.0, baseScore + noise));
}
```

***

### 6.6 Gender Multipliers

| College Category | Male Multiplier | Female Multiplier | Schools |
| --- | --- | --- | --- |
| STEM-heavy | 1.0x | 1.85x | MIT, Caltech, CMU, Harvey Mudd |
| Balanced elite | 1.0x | 1.05x | Harvard, Yale, Princeton, Stanford |
| Business-heavy | 1.0x | 1.05x | UPenn (Wharton effect) |
| LAC | 1.25x | 1.0x | Williams, Amherst, Middlebury |
| State flagship | 1.0x | 1.0x | Michigan, UVA, UCLA |

Rationale:

#### 6.6.2 Application to Score

The gender multiplier should be applied to the admit probability (not the raw score) to avoid boundary effects:

```javascript
function applyGenderMultiplier(baseProb, student, college) {
  const multiplier = college.genderMultipliers[student.gender] || 1.0;
  return Math.min(1.0, baseProb * multiplier);
}
```

***

### 6.7 Feeder School Multiplier

#### 6.7.1 Recommended Parameters

| School Type                                         | Feeder Multiplier | Validation Target                                |
| ---------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------- |
| Elite boarding (Andover, Exeter, Deerfield, Groton) | 2.2x              | Median student (1470 SAT, 3.9 GPA): 15-20% HYPSM |
| Selective exam/magnet (Stuyvesant, Boston Latin)    | 1.6x              | Same student: 10-14% HYPSM                       |
| Strong suburban public (top quartile nationally)    | 1.4x              | Same student: 8-12% HYPSM                        |
| Average suburban public                             | 1.0x (baseline)   | Same student: 5-8% HYPSM                         |
| Rural/under-resourced                               | 0.95x             | Same student: 4-6% HYPSM                         |

Note: The feeder multiplier is separate from and in addition to hook multipliers (athlete, legacy, donor). The feeder premium represents the residual advantage from school quality, counseling, and institutional trust after controlling for hooks that are modeled explicitly.

#### 6.7.2 Feeder Multiplier Application

The feeder multiplier is best applied to the overall admit score (or alternatively only to the "soft factors" component):

```javascript
// Apply feeder multiplier to full admit score
// (Reflects holistic quality lift: counseling, essays, recommendations)
baseScore *= student.school.feederMultiplier;

// Or: apply only to soft factors
softFactors *= student.school.feederMultiplier;
```

**Recommended:** Apply to the full baseScore before noise but after academic index computation. This prevents the multiplier from inflating already-near-ceiling academic scores unrealistically.

#### 6.7.3 Validation Targets

After feeder multiplier calibration, the following outcomes should be approximately correct:

| School | Student Profile | Expected HYPSM Admit Rate |
| --- | --- | --- |
| Phillips Exeter | 1470 SAT, 3.9 GPA, Tier 2 EC | 15-20% |
| Strong suburban public | 1470 SAT, 3.9 GPA, Tier 2 EC | 7-10% |
| Average public | 1470 SAT, 3.9 GPA, Tier 2 EC | 5-7% |
| Under-resourced | 1470 SAT, 3.9 GPA, Tier 2 EC | 4-6% |
| Phillips Exeter | 1550 SAT, 4.0 GPA, Tier 1 EC | 35-50% |
| Average public | 1550 SAT, 4.0 GPA, Tier 1 EC | 15-25% |

***

### 6.8 Perception Noise Parameter

#### 6.8.1 What Perception Noise Models

The perceptionNoise parameter models the inherent randomness in holistic admissions review: two equally qualified candidates may receive different outcomes due to:

This is distinct from the student's actual qualifications (which are measured with high fidelity) and is better understood as measurement error in the admissions process.

The perception noise should be added to the admit score as normally-distributed random noise:

```javascript
admitScore = baseScore + randn() * PERCEPTION_NOISE_SIGMA;
```

| College Tier | Recommended Sigma | Range | Rationale |
| --- | --- | --- | --- |
| HYPSM | 0.08 | 0.06-0.12 | Highly holistic; large reader-to-reader variance |
| Ivy+ | 0.07 | 0.05-0.10 | Similar holistic process |
| Near-Ivy | 0.06 | 0.04-0.09 | Slightly more formula-driven |
| Selective private | 0.05 | 0.03-0.08 | More quantitative criteria |
| Selective public | 0.04 | 0.02-0.06 | Most formula-driven; GPA/test weighted heavily |
| LAC | 0.07 | 0.05-0.10 | Small staff; high holistic variance |

**Global default:** A single PERCEPTION_NOISE = 0.07 sigma applied universally is a reasonable simplification if per-tier noise is not implemented.
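
To make the effect of a given sigma concrete, the sketch below pairs a Box-Muller `randn` with a closed-form admit probability for a student whose noiseless score sits some distance from the threshold. All names here are hypothetical, the normal CDF uses the Abramowitz-Stegun polynomial approximation to erf (JavaScript has no built-in), and the calculation ignores the clamping of scores to the 0-1 range.

```javascript
// Standard normal draw via the Box-Muller transform.
function randn(rng = Math.random) {
  const u = 1 - rng();  // shift to (0, 1] so Math.log never sees 0
  const v = rng();
  return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

// Standard normal CDF using the Abramowitz-Stegun 7.1.26 erf approximation
// (absolute error around 1e-7).
function normalCdf(z) {
  const x = Math.abs(z) / Math.SQRT2;
  const t = 1 / (1 + 0.3275911 * x);
  const poly = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741 +
               t * (-1.453152027 + t * 1.061405429))));
  const erf = 1 - poly * Math.exp(-x * x);
  return z >= 0 ? 0.5 * (1 + erf) : 0.5 * (1 - erf);
}

// Probability that baseScore + N(0, sigma^2) clears the threshold.
function admitProbability(baseScore, threshold, sigma) {
  return 1 - normalCdf((threshold - baseScore) / sigma);
}

// Example with sigma = 0.08: a student one sigma below the bar
// (score 0.72 vs. threshold 0.80) still clears it about 16% of the time.
const oneSigmaBelow = admitProbability(0.72, 0.80, 0.08);
```

A student exactly at the threshold is a coin flip under this model, and the chance falls to about 2% at two sigma below, which gives a feel for how wide a band of applicants the noise term keeps "in play".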

#### 6.8.3 Effect of Noise on Outcomes

At HYPSM admit rates (~4%), the score threshold is very high. A sigma of 0.08 on a 0-1 scale means:

The noise ensures:

  1. Qualified students sometimes get rejected (realistic; Caltech rejects many valedictorians)
  2. Borderline students sometimes get admitted (realistic; stochastic admission is empirically documented)
  3. Repeated simulations produce different outcomes (reduces overfit to any single run)

#### 6.8.4 Perception Noise vs. Luck

A useful distinction:

The noise parameter should only capture reader-level variance, not student performance variance.


***

### 6.9 Acceptance Rate Calibration Targets

The simulation's acceptance rates should match real CDS data within a reasonable tolerance band. The following table provides validation targets:

| College | Target Accept Rate | Acceptable Range | Source |
| --- | --- | --- | --- |
| Harvard | 4.2% | 3.5-5.0% | CDS 2029 |
| Yale | 3.9% | 3.2-4.8% | CDS 2029 |
| Princeton | 4.6% | 3.8-5.5% | CDS 2029 |
| Stanford | 3.7% | 3.0-4.5% | Estimated |
| MIT | 4.6% | 3.8-5.5% | CDS 2029 |
| Caltech | 2.6% | 2.0-3.5% | CDS 2029 |
| Columbia | 3.9% | 3.2-4.8% | CDS 2029 |
| Brown | 5.5% | 4.5-6.5% | CDS 2029 |
| Vanderbilt | 5.6% | 4.5-7.0% | CDS 2029 |
| UPenn | 5.9% | 4.8-7.2% | CDS 2029 |
| Duke | 5.9% | 4.8-7.2% | CDS 2029 |
| Dartmouth | 6.2% | 5.0-7.5% | CDS 2029 |
| UChicago | 6.5% | 5.2-8.0% | CDS 2029 |
| JHU | 7.5% | 6.0-9.0% | CDS 2029 |
| Williams | 7.5% | 6.0-9.5% | CDS 2029 |
| Northwestern | 7.8% | 6.2-9.5% | CDS 2029 |
| UCLA | 8.6% | 7.0-10.5% | CDS 2029 |
| Cornell | 8.7% | 7.0-10.5% | CDS 2029 |
| Amherst | 9.0% | 7.2-11.0% | CDS 2029 |
| Rice | 9.5% | 7.5-11.5% | CDS 2029 |
| Middlebury | 10.0% | 8.0-12.5% | CDS 2029 |
| Carnegie Mellon | 11.3% | 9.0-13.5% | CDS 2029 |
| Emory | 11.4% | 9.0-14.0% | CDS 2029 |
| Tufts | 11.4% | 9.0-14.0% | CDS 2029 |
| WashU | 12.0% | 9.5-15.0% | CDS 2029 |
| Georgetown | 12.3% | 10.0-15.0% | CDS 2029 |
| Notre Dame | 12.4% | 10.0-15.0% | CDS 2029 |
| Boston College | 16.7% | 13.0-20.0% | CDS 2029 |
| Michigan | 18.0% | 14.0-22.0% | CDS 2029 |
| UVA | 20.0% | 16.0-24.0% | CDS 2029 |

**Calibration method:** Run the simulation 50 times with the same parameter set. Check that the mean acceptance rate across runs falls within the "acceptable range" for each school. If calibration fails, adjust the college's baseThreshold parameter (not the hook multipliers) to hit the target.
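
The calibration loop described above might be sketched as follows. Everything here is an assumption for illustration: `calibrateThreshold`, the `acceptableRange` field, the fixed step size, and especially `runOnce`, which stands in for a full simulation run that returns one college's acceptance rate.

```javascript
// Nudges college.baseThreshold until the mean acceptance rate over `runs`
// simulations falls inside college.acceptableRange ([low, high] as fractions).
// runOnce(college) is a hypothetical callback returning one run's acceptance rate.
function calibrateThreshold(college, runOnce, { runs = 50, maxIters = 20, step = 0.01 } = {}) {
  const [low, high] = college.acceptableRange;
  let meanRate = NaN;

  for (let iter = 0; iter < maxIters; iter++) {
    let total = 0;
    for (let i = 0; i < runs; i++) total += runOnce(college);
    meanRate = total / runs;

    if (meanRate >= low && meanRate <= high) {
      return { calibrated: true, meanRate, baseThreshold: college.baseThreshold };
    }
    // Too many admits: raise the bar. Too few: lower it.
    college.baseThreshold += meanRate > high ? step : -step;
  }
  return { calibrated: false, meanRate, baseThreshold: college.baseThreshold };
}

// Example with a deterministic stand-in for the simulation, where the
// acceptance rate is simply 1 - baseThreshold.
const demoCollege = { name: 'Harvard', baseThreshold: 0.90, acceptableRange: [0.035, 0.05] };
const demoResult = calibrateThreshold(demoCollege, c => 1 - c.baseThreshold);
```

A fixed step is the simplest choice; a bisection or proportional update would converge faster, but the hook multipliers stay untouched either way, as the text recommends.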


***

### 6.10 Enrollment Management Parameters

#### 6.10.1 Over-Admission Ratio

Colleges must over-admit to hit enrollment targets. The over-admission ratio = admits_target / enrollment_target:

| Tier | Over-Admission Ratio |
| --- | --- |
| HYPSM | 1/yield ≈ 1.15-1.45x |
| Ivy+ | 1/yield ≈ 1.45-1.58x |
| Near-Ivy | 1/yield ≈ 1.82-2.86x |
| Selective private | 1/yield ≈ 2.50-4.00x |
| Selective public | 1/yield ≈ 2.22-6.67x |
| LAC | 1/yield ≈ 2.63-3.57x |
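
The ratio is just the reciprocal of expected yield, so the admit target follows directly from the enrollment target. A minimal sketch, with hypothetical function names and a made-up class size of 1,100 used only for the example:

```javascript
// Over-admission ratio implied by an expected yield (fraction in (0, 1]).
function overAdmissionRatio(expectedYield) {
  if (!(expectedYield > 0 && expectedYield <= 1)) {
    throw new RangeError('expectedYield must be in (0, 1]');
  }
  return 1 / expectedYield;
}

// Number of admits needed to expect `enrollmentTarget` enrollees, rounded up.
function admitsNeeded(enrollmentTarget, expectedYield) {
  return Math.ceil(enrollmentTarget * overAdmissionRatio(expectedYield));
}

// Example: an 86.6% yield (MIT's recommended value in 6.3.1) and an
// illustrative class of 1,100 students.
const mitRatio = overAdmissionRatio(0.866);
const mitAdmits = admitsNeeded(1100, 0.866);
```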

#### 6.10.2 Waitlist Buffer Size

| Tier | Waitlist Size | Activation Threshold |
| --- | --- | --- |
| HYPSM | 1,000-3,000 students | Very rarely activated |
| Ivy+ | 500-2,000 students | Activated in low-yield years |
| Near-Ivy | 300-1,000 students | Activated annually |
| Selective | 200-800 students | Frequently activated |

#### 6.10.3 Waitlist Yield Rates

| Tier | Waitlist Offer-to-Enroll Rate |
| --- | --- |
| HYPSM | 15-25% (if offered) |
| Ivy+ | 25-35% |
| Near-Ivy | 35-50% |
| Selective | 40-60% |

Lower-ranked schools on the waitlist face higher melt risk (students holding more attractive offers). Higher-ranked schools have higher waitlist yield because being waitlisted signals high interest.

#### 6.10.4 Final Parameter Summary Table

For quick reference, all key calibration parameters consolidated:

| Parameter | Recommended Value | Notes |
| --- | --- | --- |
| Athlete multiplier (HYPS) | 4.5x | Per-tier recommended |
| Athlete multiplier (MIT) | 3.0x | DIII; no slots |
| Athlete multiplier (Ivy) | 4.0x | Formal slot system |
| Athlete multiplier (Near-Ivy DI) | 4.0x | |
| Athlete multiplier (LAC) | 3.5x | NESCAC |
| Legacy multiplier (HYPSM/Ivy) | 2.5x | |
| Legacy multiplier (Near-Ivy) | 2.0x | |
| Donor multiplier (HYPSM/Ivy) | 4.0x | Rare; < 0.5% of pool |
| First-gen multiplier (HYPSM) | 1.4x | Legal; race-neutral |
| First-gen multiplier (overall) | 1.3x | |
| ED threshold multiplier | 0.70 | Lowers admission bar |
| EA/SCEA threshold multiplier | 0.85 | Lowers admission bar |
| EDII threshold multiplier | 0.75 | Between ED and EA |
| HYPSM yield | 67-87% | Per-college recommended |
| Ivy+ yield | 63-69% | Per-college |
| Near-Ivy yield | 35-55% | Per-college |
| Selective yield | 15-45% | Per-college |
| Mean apps (elite boarding) | 13 | Per-school-type |
| Mean apps (avg public) | 6 | Per-school-type |
| Mean apps (first-gen) | 4 | Per-school-type |
| EC weight | 0.27 | In total score |
| Academic weight | 0.42 | GPA 60%, test 40% |
| EC Tier 1 bonus | +0.08 | Cliff-based |
| EC Tier 2 bonus | +0.03 | |
| Gender mult. (female at MIT) | 1.85x | Structural STEM supply gap |
| Gender mult. (male at LAC) | 1.25x | Male under-representation |
| Feeder mult. (elite boarding) | 2.2x | After removing explicit hooks |
| Feeder mult. (avg public) | 1.0x | Baseline |
| Perception noise sigma | 0.07 | Normal distribution |
| Melt rate (elite private) | 2% | Post-commit dropout |
| Melt rate (selective public) | 9% | UC overlap; cost comparison |
| Waitlist threshold buffer | 1.30x | 30% above admit threshold |

## 7. Data Sources & References

### 7.1 Primary Empirical Studies

| Citation | Key Finding | File |
| --- | --- | --- |
| Arcidiacono, P. et al. (2022). "Racial Classification and the Admissions at Elite Universities." NBER Working Paper 29225. | Race multipliers from probit model on Harvard 2000-2017 admissions data | mit_race_gender.md |
| Chetty, R., Deming, D., & Friedman, J. (2023). "Diversifying Society's Leaders? The Determinants and Causal Effects of Admission to Highly Selective Private Colleges." NBER Working Paper 31492. | Top-1% families 2x as likely at Ivy-Plus; private HS mediates effect | data_feeder_schools.md, exeter_mit_pipeline.md |
| Espenshade, T. & Chung, C. (2005). "The Opportunity Cost of Admission Preferences at Elite Universities." Social Science Quarterly. | Athlete hook = +200 SAT pts; legacy = +160 SAT pts | mit_athletic_hooks.md |
| Avery, C., Fairbanks, A., & Zeckhauser, R. (2003). "The Early Admissions Game." Harvard University Press. | ED provides ~+100 SAT pts equivalent admission advantage | college_matching_market.md |
| Avery, C. & Levin, J. (2010). "Early Admissions at Selective Colleges." American Economic Review. | ED as credible signaling mechanism; dominant for first-choice | college_matching_market.md |
| Abdulkadiroglu, A., Agarwal, N., & Pathak, P. (2017). "The Welfare Effects of Coordinated Assignment." American Economic Review. | 80% of welfare gains from coordination; algorithm choice secondary | student_welfare_matching.md |
| Agarwal, N. & Somaini, P. (2018). "Demand Analysis Using Strategic Reports." Econometrica. | Boston mechanism welfare cost fell disproportionately on less-sophisticated families | student_welfare_matching.md, k12_school_choice.md |
| Roth, A. & Xing, X. (1994). "Jumping the Gun: Imperfections and Institutions Related to the Timing of Market Transactions." American Economic Review. | Unraveling in matching markets | college_matching_market.md |
| Roth, A. (1982). "The Economics of Matching: Stability and Incentives." Mathematics of Operations Research. | Strategy-proofness only for proposing side | gale_shapley_algorithm.md |
| Pittel, B. (1989). "The Average Number of Stable Matchings." SIAM Journal on Discrete Mathematics. | Expected O(n ln n) proposals under random preferences | gale_shapley_algorithm.md |

| Document | Key Finding | File |
| --- | --- | --- |
| SFFA v. Harvard (2023), Supreme Court | Eliminated race-conscious admissions at Harvard and UNC | mit_race_gender.md |
| SFFA v. Harvard, Expert Testimony (2018) | Arcidiacono probit model; ALDC admit rates; race multiplier evidence | college_decision_model.md, mit_race_gender.md |
| SFFA v. Harvard, Trial Exhibits | Harvard 1-6 rating scale; EC data by admit rate | mit_extracurriculars.md, college_decision_model.md |

7.3 Institutional and Survey Data

| Source | Data Available | URL |
|---|---|---|
| CommonApp End-of-Season Reports (2024-25) | Applications per applicant, total applicants, demographics | commonapp.org/research/data |
| NACAC State of College Admission (2023) | Factor importance survey; counselor-to-student ratio data | nacacnet.org |
| Harvard Common Data Set (2024-25) | Acceptance rate, yield, SAT ranges, class size | https://oira.harvard.edu/common-data-set/ |
| Yale Common Data Set (2024-25) | Acceptance rate, yield, SAT ranges | https://oir.yale.edu/common-data-set |
| Princeton Common Data Set (2024-25) | Acceptance rate, yield, SAT ranges | https://ir.princeton.edu/common-data-set |
| MIT Common Data Set (2024-25) | Acceptance rate, yield, SAT ranges | https://ir.mit.edu/common-data-set |
| All 30 college CDS files | Acceptance, yield, SAT, class size | Individual institutional IR offices |

7.4 Journalism and Investigative Sources

| Source | Key Data | File |
|---|---|---|
| Harvard Crimson (2024). "More Than One in 10 Harvard Undergrads Come From Just 21 Schools." | Named feeder school list; 15-year send data | data_feeder_schools.md, exeter_mit_pipeline.md |
| Wall Street Journal (2023 series on Ivy admissions) | Hook multiplier reporting; donor preference transparency | college_decision_model.md |
| MIT News (2024). "MIT Class of 2028 Profile." | Post-SFFA demographic shifts: Black 13%→5%, Asian 41%→47% | mit_race_gender.md |

7.5 Foundational Academic References

| Citation | Key Contribution | File |
|---|---|---|
| Gale, D. & Shapley, L. (1962). "College Admissions and the Stability of Marriage." American Mathematical Monthly. | Original stable matching algorithm; student-optimal proof | gale_shapley_algorithm.md |
| Abdulkadiroglu, A., Pathak, P., & Roth, A. (2005). "The New York City High School Match." American Economic Review P&P. | NYC school choice implementation | k12_school_choice.md |
| Abdulkadiroglu, A. & Sonmez, T. (2003). "School Choice: A Mechanism Design Approach." American Economic Review. | Formal analysis of Boston mechanism failure | k12_school_choice.md |
| Roth, A. (2008). "Deferred Acceptance Algorithms: History, Theory, Practice, and Open Questions." International Journal of Game Theory. | Comprehensive DA review | gale_shapley_algorithm.md |

7.6 Data Sources NOT Publicly Available

For completeness, the following data sources were referenced in the research but are NOT fully publicly available; most require institutional access:

| Source | Data | Access |
|---|---|---|
| National Student Clearinghouse StudentTracker | Student-level college enrollment by high school | Institutional subscription (~$5,000/yr) |
| Naviance "Where they got in" | School-specific college acceptance data | Student login; school purchase |
| College admissions raw data | Individual applications, scores, decisions | Institutional; privacy-restricted |
| Common Application raw microdata | Individual-level app data | Member institution access |
| Arcidiacono SFFA dataset | Harvard admissions 2000-2017 | Court exhibits; partial public release |

Appendix A: Simulation Architecture Compatibility

A.1 Mapping Parameters to simulation_spec.md

| simulation_spec.md Section | Parameters from This Document |
|---|---|
| Section 2: Student Generation | feederMultiplier, appsPerStudent by school type, ecTier distribution |
| Section 3: College Admissions | hookMultipliers, roundThresholds, perceptionNoise, genderMultipliers |
| Section 4: Student Decisions | yieldRates, enrollmentUtility weights, meltRates |
| Section 5: Waitlist | waitlistThresholdBuffer, waitlistYieldRates |
| Section 6: Analytics | equityMetrics, blockingPairCount |

A.2 Parameter Versioning

| Parameter | Current (index.html) | Recommended | Priority |
|---|---|---|---|
| Athlete multiplier | 3.5x (global) | Per-tier (2.5-4.5x) | High |
| Legacy multiplier | 2.5x (global) | Per-tier (1.5-2.5x) | Medium |
| Donor multiplier | 4.0x (global) | Per-tier (2.0-4.0x) | Medium |
| First-gen multiplier | 1.4x | 1.3-1.4x (keep) | Low |
| ED threshold | 0.70 | 0.70 (keep) | |
| EA threshold | 0.85 | 0.85 (keep) | |
| EC weight | Unknown | 0.27 | High |
| Perception noise | Unknown | 0.07 sigma | High |
| Feeder multiplier | Not implemented | 1.0-2.2x | High |
| Gender multiplier | Not implemented | Per-tier | Medium |
| Yield rates | Unknown | Per-college | High |
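
The move from global to per-tier hook multipliers could be structured as a lookup with a global fallback, as in the sketch below. This is only an illustration: the tier names (`tier1` to `tier3`), the object layout, and the individual per-tier values are placeholders chosen inside the recommended ranges, not calibrated results and not the structure used in index.html.

```javascript
// Hypothetical layout: each hook keeps its current global value and gains
// optional per-tier overrides. Per-tier numbers are placeholders within the
// ranges recommended in the table above.
const hookMultipliers = {
  athlete:  { global: 3.5, byTier: { tier1: 4.5, tier2: 3.5, tier3: 2.5 } },
  legacy:   { global: 2.5, byTier: { tier1: 2.5, tier2: 2.0, tier3: 1.5 } },
  donor:    { global: 4.0, byTier: { tier1: 4.0, tier2: 3.0, tier3: 2.0 } },
  firstGen: { global: 1.4, byTier: {} }, // kept global, per the "keep" recommendation
};

// Returns the per-tier multiplier when one is defined, otherwise the global
// value; unknown hooks get 1.0 (no adjustment).
function hookMultiplier(hook, tier) {
  const entry = hookMultipliers[hook];
  if (!entry) return 1.0;
  const perTier = entry.byTier[tier];
  return perTier !== undefined ? perTier : entry.global;
}

console.log(hookMultiplier("athlete", "tier1"));  // per-tier override
console.log(hookMultiplier("firstGen", "tier2")); // falls back to global
```

Keeping the global value as a fallback means a partially filled per-tier table still reproduces the current behavior for any tier that has not been calibrated yet.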

A.3 Quick-Start Calibration Order

Calibrate the simulation iteratively, in the following order:

  1. Acceptance rates first: Set baseThreshold per college so simulated accept rates match CDS targets (±2pp tolerance)
  2. Yield rates second: Set per-college yield parameters; validate against CDS yield data
  3. Hook multipliers third: Enable hooks one at a time; validate ALDC share of admits
  4. Feeder multiplier fourth: Add school-type feeder multiplier; validate pipeline rates
  5. Perception noise last: Tune sigma so that the acceptance rate varies realistically across 50 simulation runs (standard deviation of roughly 0.5-1.5pp)
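
Steps 1 and 5 above can be sketched with a toy stand-in for the real simulation. Everything in this block is an assumption made for illustration: `simulateAcceptRate` is a hypothetical one-college model (uniform true scores, Gaussian perception noise, admit when the perceived score reaches `baseThreshold`), the 5% target rate is invented, and bisection is simply one reasonable search strategy rather than the method used in index.html.

```javascript
// Small seeded PRNG (mulberry32) so that runs are reproducible.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6D2B79F5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Standard normal draw via the Box-Muller transform.
function gaussian(rng) {
  return Math.sqrt(-2 * Math.log(1 - rng())) * Math.cos(2 * Math.PI * rng());
}

// Toy model: true score is uniform on [0, 1); the college perceives it with
// Gaussian noise and admits when the perceived score reaches the threshold.
function simulateAcceptRate(baseThreshold, sigma, seed, nApplicants = 5000) {
  const rng = mulberry32(seed);
  let admits = 0;
  for (let i = 0; i < nApplicants; i++) {
    const perceived = rng() + sigma * gaussian(rng);
    if (perceived >= baseThreshold) admits++;
  }
  return admits / nApplicants;
}

// Step 1: bisection on baseThreshold until the simulated rate is within
// the tolerance (2pp by default) of the target.
function calibrateThreshold(targetRate, sigma, seed, tolerance = 0.02) {
  let lo = -1; // everyone admitted at this threshold
  let hi = 2;  // no one admitted at this threshold
  for (let iter = 0; iter < 40; iter++) {
    const mid = (lo + hi) / 2;
    const rate = simulateAcceptRate(mid, sigma, seed);
    if (Math.abs(rate - targetRate) <= tolerance) return mid;
    if (rate > targetRate) lo = mid; // too many admits: raise the threshold
    else hi = mid;                   // too few admits: lower the threshold
  }
  return (lo + hi) / 2;
}

// Step 5: sample standard deviation of the acceptance rate across runs.
function acceptRateStdDev(threshold, sigma, runs = 50) {
  const rates = [];
  for (let seed = 1; seed <= runs; seed++) {
    rates.push(simulateAcceptRate(threshold, sigma, seed));
  }
  const mean = rates.reduce((sum, r) => sum + r, 0) / runs;
  const variance = rates.reduce((sum, r) => sum + (r - mean) ** 2, 0) / (runs - 1);
  return Math.sqrt(variance);
}

const sigma = 0.07;      // recommended perception noise
const targetRate = 0.05; // hypothetical CDS target
const threshold = calibrateThreshold(targetRate, sigma, 1);
const spread = acceptRateStdDev(threshold, sigma);
console.log({ threshold, rate: simulateAcceptRate(threshold, sigma, 1), spread });
```

With a fixed seed the toy acceptance rate is monotone in the threshold, which is what makes bisection safe here. In the full simulation, hooks and feeder effects interact with the threshold, so the same search would need to be rerun after each of steps 3 and 4.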

This document was generated by synthesizing 16 research files. Claims are cited in-line, with references to the source file that contains the original research.