Source: RESEARCH_SUMMARY.md
Compiled: 2026-03-01
Source files: 16 research documents in /research/
Purpose: Single canonical reference for simulation calibration, parameter tuning, and theoretical grounding
2.1 Race & Gender Effects
2.2 Athletic Hooks
2.3 Extracurricular Weighting
2.4 Feeder School Effects
3.1 Student Portfolio Construction
3.2 Student Yield & Enrollment Decisions
3.3 College Admissions Decision Model
3.4 College Enrollment Management
4.1 CommonApp & NACAC Data
4.2 HYPSM Common Data Sets
4.3 Top 20-50 Common Data Sets
4.4 Feeder School Datasets
5.1 Gale-Shapley Algorithm
5.2 US Admissions as a Matching Market
5.3 Student Welfare Optimization
5.4 K-12 School Choice Parallels
6.1 Hook Multipliers
6.2 Round Multipliers by College Tier
6.3 Yield Rates by Tier
6.4 Applications Per Student
6.5 EC Scoring Weights
6.6 Gender Multipliers
6.7 Feeder School Multiplier
6.8 Perception Noise Parameter
6.9 Acceptance Rate Calibration Targets
6.10 Enrollment Management Parameters
This synthesis covers 16 research documents produced to support a single-file, agent-based college admissions simulation. The simulation models the full US selective college admissions cycle — from student portfolio construction through Early Decision, Early Action, Regular Decision, and waitlist rounds — for approximately 20 high schools and 30 colleges. The simulation is implemented in vanilla ES6+ JavaScript with D3.js visualization, requiring no server.
The 16 source documents fall into four clusters:
| Cluster | Files | Theme |
|---|---|---|
| MIT-specific parameters | mit_race_gender, mit_athletic_hooks, mit_extracurriculars, exeter_mit_pipeline | Granular parameter calibration from primary sources |
| Agent behavior | student_portfolio_behavior, student_yield_behavior, college_decision_model, college_enrollment_management | How students and colleges actually behave |
| Open-source data | data_commonapp_nacac, data_hypsm_cds, data_top20_50_cds, data_feeder_schools | Publicly verifiable ground truth |
| Matching theory | gale_shapley_algorithm, college_matching_market, student_welfare_matching, k12_school_choice | Theoretical grounding and design validation |
1. Hook multipliers are multiplicative, not additive. SFFA v. Harvard trial testimony confirmed hooks compound. An athlete who is also a legacy gets both multipliers applied sequentially, not summed. The current simulation correctly uses multiplicative hooks.
2. ED provides a 1.5-2.0x admit boost, not a flat percentage bump. The mechanism is a lower admit threshold (threshold_ED = threshold_RD × 0.70), which at realistic score distributions produces a 1.5-2.0x effective boost. This is empirically grounded in Avery, Fairbanks & Zeckhauser (2003).
3. Athletic hooks are the strongest single multiplier at most schools. SFFA trial data puts recruited athletes at ~86% admit rate at Harvard vs. 3.4% overall — roughly 25x, which collapses to a ~4-5x multiplier once you control for athlete self-selection into strong academic profiles. MIT is an outlier: no binding slots, no likely letters, estimated multiplier 2.5-3.5x.
4. Post-SFFA race multipliers must be set to 1.0x. The Supreme Court's June 2023 ruling in SFFA v. Harvard/UNC eliminated explicit race-conscious admissions. MIT's Class of 2028 (first post-SFFA cohort) showed dramatic demographic shifts: Black enrollment dropped from 13% to 5%, Asian enrollment rose from 41% to 47%. The simulation should not model race as a direct admit multiplier in post-SFFA mode.
5. Feeder school effects are large and undermodeled in most simulations. Harvard Crimson (2024) found 1 in 11 Harvard undergrads comes from just 21 schools (0.078% of US high schools). The counselor-quality premium (Exeter 33:1 vs. national 372:1) alone implies a 1.3-1.5x application quality boost, compounding to 2.0-2.5x overall for elite boarding schools after institutional trust and peer effects.
6. Applications per student have grown roughly 45% since 2015-16. The mean is now 6.80 (2024-25 CommonApp data), up from 4.70 in 2015-16 and 4.63 in 2013-14. Elite private school students average 12+ applications. This growth has increased overlap and intensified yield uncertainty for colleges.
7. Yield rates vary enormously by tier and must be tier-specific. MIT's 86.6% yield is roughly 4-6x UCLA's ~15-20% and roughly 2.5x that of a Near-Ivy like Johns Hopkins (~35%). Using a single yield parameter produces deeply unrealistic enrollment outcomes.
8. The US admissions market is NOT a stable matching. Unlike NRMP (medical residency matching), US college admissions is decentralized, sequential, and produces no stable matching in the Gale-Shapley sense. Many "blocking pairs" exist: students who prefer a college that would have preferred them over some admitted student. The simulation correctly models this as a decentralized round-based process, not a centralized algorithm.
9. EC scoring should use a tiered bonus system, not a continuous scale. Harvard's SFFA data shows a cliff: EC rating 1 students admit at 50.6% vs. EC rating 2 at 18.1% vs. EC rating 3 at 3.8%. This is better modeled by a Tier 1 spike bonus (+0.08) than a smooth linear transform.
10. Coordination matters more than algorithm choice in matching welfare. Abdulkadiroglu, Agarwal & Pathak (2017 AER) find that 80% of potential welfare gains in student-school matching come from coordination alone — ensuring students and schools express preferences simultaneously to the same clearinghouse — rather than from the specific algorithm. This validates the simulation's round-synchronized approach even without implementing full DA.
### 2.1 Race & Gender Effects
The following multipliers derive from Peter Arcidiacono's expert testimony in SFFA v. Harvard (2018), applying a probit model to Harvard admissions data 2000-2017. MIT-specific figures are estimated by applying similar patterns with MIT's known demographic targets.
| Racial/Ethnic Group | Harvard (Arcidiacono) | MIT (Estimated) | Basis |
|---|---|---|---|
| African American | 3.5x | 3.0x | SFFA trial testimony, MIT's ~13% Black enrollment target |
| Hispanic/Latino | 2.3x | 2.0x | SFFA trial testimony |
| Native American | 4.0x | 3.5x | Consistent with Harvard data |
| White | 1.0x (baseline) | 1.0x | Baseline |
| Asian American | 0.75x | 0.80x | Arcidiacono; MIT slightly less biased in STEM context |
| International | 0.90x | 0.85x | US citizen preference; MIT cap ~11% international |
Note: These multipliers apply to the admissions score calculation, not to external socioeconomic data generation.
Following the Supreme Court's June 29, 2023 decision in Students for Fair Admissions v. Harvard and Students for Fair Admissions v. UNC, race-conscious admissions is prohibited. All race multipliers become 1.0x.
| Racial/Ethnic Group | Post-SFFA Multiplier |
|---|---|
| All groups | 1.0x |
Indirect proxies that remain legal post-SFFA:
| Proxy Factor | Multiplier | Legal Basis |
|---|---|---|
| First-generation college student | 1.3-1.4x | Socioeconomic diversity, race-neutral |
| Pell Grant eligibility | 1.2-1.3x | Socioeconomic diversity |
| Rural/underrepresented geography | 1.1-1.2x | Geographic diversity |
| Low-income zip code | 1.1-1.2x | Socioeconomic diversity |
| Underrepresented state/country | 1.1x | Geographic diversity |
MIT Class of 2028 (first post-SFFA cohort, enrolled fall 2024) showed dramatic shifts:
| Group | Pre-SFFA (Class of 2027) | Post-SFFA (Class of 2028) | Change |
|---|---|---|---|
| Black/African American | ~13% | ~5% | -8 pp |
| Hispanic/Latino | ~15% | ~11% | -4 pp |
| White | ~38% | ~37% | -1 pp |
| Asian American | ~41% | ~47% | +6 pp |
| Native American | ~2% | ~1% | -1 pp |
Harvard's Class of 2028 showed similar trends: Black enrollment dropped from ~15% to ~5-6%.
Gender multipliers reflect actual imbalances in applicant pools and institutional diversity goals. They should be applied per college category, not universally.
| College Category | Male Multiplier | Female Multiplier | Rationale |
|---|---|---|---|
| STEM-heavy (MIT, Caltech, CMU) | 1.0x (baseline) | 1.8-2.0x | Female admit rates historically ~2x male due to demand imbalance |
| Balanced research (HYPS, Ivies) | 1.0x | 1.0-1.1x | Near-parity in applicant pool |
| Liberal arts colleges | 1.2-1.3x | 1.0x | Male scarcity in LAC applicant pool |
| State flagships (UVA, Michigan) | 1.0-1.1x | 1.0x | Near-parity; varies by year |
MIT-specific data: MIT Class of 2029 admitted women at roughly 24% acceptance rate vs. approximately 12% for men (self-reported by MIT, reflecting its 24% female applicant pool against a target of ~49% female enrollment). This implies a multiplier of approximately 1.8-2.0x for women at MIT.
Implementation note: Gender multipliers must be applied to the admit probability, not to the raw score. Multiplying the raw score is wrong: a female applicant with academic_index = 0.70 at MIT would compete as if her score were 0.70 × 1.9 = 1.33, which pushes her far into the saturated region of the sigmoid. The correct implementation evaluates the sigmoid first, then multiplies the resulting probability and caps it at 1.0.
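A minimal sketch of the distinction in the implementation note, for illustration only. The logistic curve and its `midpoint` and `steepness` parameters are assumptions (the source research does not specify the sigmoid), and the function names are hypothetical.

```javascript
// Hypothetical logistic curve mapping a 0-1 academic index to a base admit probability.
// midpoint and steepness are illustrative assumptions, not calibrated values.
function baseAdmitProbability(academicIndex, midpoint = 0.9, steepness = 12) {
  return 1 / (1 + Math.exp(-steepness * (academicIndex - midpoint)));
}

// Recommended: evaluate the sigmoid first, then scale the probability and cap at 1.0.
function applyGenderMultiplier(academicIndex, multiplier) {
  const p = baseAdmitProbability(academicIndex);
  return Math.min(1.0, p * multiplier);
}

// For comparison only: inflating the raw score before the sigmoid.
function inflateRawScore(academicIndex, multiplier) {
  return baseAdmitProbability(academicIndex * multiplier);
}

const pBase = baseAdmitProbability(0.70);
const pCorrect = applyGenderMultiplier(0.70, 1.9);
const pWrong = inflateRawScore(0.70, 1.9);
console.log({ pBase, pCorrect, pWrong });
```

Under these assumed curve parameters, the probability-level multiplier roughly doubles a small base probability, while the score-level version drives the same applicant to near-certain admission.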
```javascript
// Post-SFFA (current): race multipliers eliminated
const RACE_MULTIPLIERS = {
  african_american: 1.0,
  hispanic: 1.0,
  white: 1.0,
  asian: 1.0,
  native_american: 1.0
};

// Socioeconomic proxy multipliers (legal post-SFFA)
const SOCIOECONOMIC_MULTIPLIERS = {
  first_gen: 1.35,
  pell_eligible: 1.25,
  rural: 1.15,
  low_income_zip: 1.15
};

// Gender multipliers by college category
const GENDER_MULTIPLIERS = {
  stem_heavy: { male: 1.0, female: 1.9 },
  balanced: { male: 1.0, female: 1.05 },
  lac: { male: 1.25, female: 1.0 }
};
```
***
### 2.2 Athletic Hooks
#### 2.2.1 Harvard SFFA Trial Data
The SFFA v. Harvard trial produced the most granular public dataset on athletic hook effects:
| Group | Admit Rate | Multiplier vs. Overall |
| ------------------------------------------------------------------------------------------------ | ----------------------------------------------------------------- | ----------------------------------------------------------------------------- |
| All recruited athletes | ~86% | ~25x vs. 3.4% overall |
| Recruited athletes w/ top academic rating | ~83% | ~24x |
| Non-athletes w/ top academic rating | ~16% | ~5x |
| Walk-on athletes (not recruited) | ~5-6% | ~1.5x |
Controlling for the fact that recruited athletes are pre-screened to meet minimum academic thresholds, the effective hook multiplier collapses to approximately 4-5x at Harvard for the marginal admitted student.
**Espenshade & Chung (2005) SAT-equivalent estimates:**
* Athlete hook = +200 SAT points equivalent
* Legacy hook = +160 SAT points
* Black/African American = +230 points (pre-SFFA era)
* Hispanic = +185 points (pre-SFFA era)
* First-generation = +130 points
#### 2.2.2 MIT Athletic Model: No-Slot Exception
MIT is structurally unique among highly selective research universities:
* Fields 33 varsity sports (Division III, NEWMAC conference)
* Approximately 20-25% of undergrads participate in varsity athletics
* **No binding roster slots:** coaches cannot guarantee admission
* **No "likely letters":** MIT does not send pre-decision signals to recruits
* Admission decision made independently by admissions office; coaches submit advocacy letters
* Estimated recruited-athlete acceptance rate: 25-50% (compared to 4.6% overall)
* Implied gross multiplier: ~6-11x; effective multiplier controlling for academic self-selection: ~2.5-3.5x
This contrasts sharply with Harvard/Yale/Princeton, which operate formal recruitment "bands" and do issue likely letters.
#### 2.2.3 Recommended Multipliers by School Tier
| Tier | Schools | Recommended Athlete Multiplier | Rationale |
| --------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------- |
| HYPS (excluding MIT) | Harvard, Yale, Princeton, Stanford | 4.0-5.0x | Formal slot system, likely letters, SFFA data |
| MIT | MIT | 2.5-3.5x | No slots, coach advocacy only, DIII |
| Other Ivy | Columbia, Penn, Brown, Dartmouth, Cornell | 3.5-4.5x | Ivy League formal recruitment system |
| Near-Ivy DI | Duke, Northwestern, Notre Dame, Georgetown | 3.5-4.5x | DI with significant athletics program |
| Near-Ivy DIII | Caltech, WashU, CMU | 1.5-2.5x | DIII, minimal athletics influence |
| NESCAC LACs | Williams, Amherst, Middlebury | 3.0-4.0x | NESCAC athletics culture, formal recruitment |
| Selective publics | UVA, UCLA, Michigan | 2.0-3.0x | Revenue sport athletes only; most athletes not recruited |
**Current simulation default:** 3.5x global (single tier). This is a defensible middle ground but under-represents HYPS and over-represents MIT. The recommended upgrade is per-tier multipliers.
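One way the per-tier upgrade could be expressed is a lookup table with the current global value as a fallback. This is a sketch, not the simulation's actual code: the tier keys are hypothetical identifiers, and each value is simply the midpoint of the recommended range in the table above.

```javascript
// Midpoints of the recommended ranges; keys are hypothetical tier identifiers.
const ATHLETE_MULTIPLIER_BY_TIER = {
  hyps: 4.5,             // 4.0-5.0x
  mit: 3.0,              // 2.5-3.5x
  other_ivy: 4.0,        // 3.5-4.5x
  near_ivy_d1: 4.0,      // 3.5-4.5x
  near_ivy_d3: 2.0,      // 1.5-2.5x
  nescac_lac: 3.5,       // 3.0-4.0x
  selective_public: 2.5  // 2.0-3.0x
};

// Current single-tier default, used when a tier is not listed.
const DEFAULT_ATHLETE_MULTIPLIER = 3.5;

function athleteMultiplier(tier) {
  const known = Object.prototype.hasOwnProperty.call(ATHLETE_MULTIPLIER_BY_TIER, tier);
  return known ? ATHLETE_MULTIPLIER_BY_TIER[tier] : DEFAULT_ATHLETE_MULTIPLIER;
}

console.log(athleteMultiplier('mit'), athleteMultiplier('unlisted_tier'));
```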
#### 2.2.4 Sport-Type Differentiation
Not all recruited athletes receive the same boost. "Head count" sports (football, basketball, volleyball) that receive full scholarship commitments differ from "equivalency" sports.
| Sport Category | Notes | Relative Multiplier |
| ------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------ | -------------------------------------------------------------------------- |
| Revenue/head-count sports (football, basketball) | Most scrutinized; large rosters | High (3.5-5x) |
| Olympic/NESCAC sports (rowing, squash, lacrosse) | Highest socioeconomic concentration | High (3.5-4.5x) |
| DIII non-revenue | No scholarships; weaker binding | Lower (2.0-3.0x) |
| Walk-ons (not recruited) | Essentially non-hook | 1.1-1.3x |
***
### 2.3 Extracurricular Weighting
#### 2.3.1 Harvard SFFA EC Rating Data
Harvard uses a 1-6 scale for extracurricular ratings (1 = highest). Admit rates by EC score from the SFFA trial dataset (2014-2019):
| EC Rating | Admit Rate | Notes |
| ------------------------------------------------------------------------------- | ----------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------- |
| 1 (Outstanding) | 50.6% | National-level achievement, elite competitions |
| 2 (Excellent) | 18.1% | State/regional leadership, significant impact |
| 3 (Good) | 3.8% | Active participant, some leadership |
| 4 (Adequate) | 1.6% | Typical participant |
| 5-6 (Below average/None) | < 1% | Weak or no EC record |
The cliff between EC 1 and EC 2 (50.6% vs. 18.1%) implies a nonlinear, threshold-based model rather than a smooth continuous function.
#### 2.3.2 CollegeVine Four-Tier Framework
CollegeVine's public-facing tier system is the best-validated public framework for categorizing ECs:
| Tier | Score Range | Description | Examples | Population % |
| ----------------------------------------------------------- | ------------------------------------------------------------------ | ---------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------- |
| 1 | 8.5-10.0 | National/international recognition; elite competitions | USAMO, Intel STS finalist, USABO, national arts award, Olympic trial athlete | 2-3% |
| 2 | 6.5-8.4 | State/regional leadership; significant school-level achievement | State science fair winner, school newspaper editor-in-chief, student body president | 10-15% |
| 3 | 4.0-6.4 | Active participant with some responsibility | Club member, JV sports, school play supporting cast, volunteer | 35-40% |
| 4 | 1.5-3.9 | Minimal participation; listing only | One-time volunteer, unverifiable activities | 40-50% |
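The score ranges in this table can be turned into a tier lookup along the following lines. The function and constant names are hypothetical. The table leaves small gaps between ranges (for example between 8.4 and 8.5); this sketch assigns such scores to the lower tier, which is an assumption rather than something the framework specifies.

```javascript
// Lower bounds taken from the score ranges in the table, ordered best tier first.
const EC_TIER_FLOORS = [
  { tier: 1, minScore: 8.5 },
  { tier: 2, minScore: 6.5 },
  { tier: 3, minScore: 4.0 },
  { tier: 4, minScore: 1.5 }
];

// Returns the tier (1-4), or null for scores below the lowest listed range.
function ecTierFromScore(ecScore) {
  const match = EC_TIER_FLOORS.find(({ minScore }) => ecScore >= minScore);
  return match ? match.tier : null;
}

console.log([9.2, 7.0, 4.0, 2.5, 1.0].map(ecTierFromScore));
```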
#### 2.3.3 Overall Weighting in Admissions Decision
Based on holistic review process documentation and SFFA trial testimony, the approximate weight breakdown:
| Factor | Weight | Source |
| ------------------------------------------------------------------------------------------------ | ------------------------------------------------------------- | --------------------------------------------------------------------------------------------- |
| Academic index (GPA + standardized tests) | 40-45% | Arcidiacono model; NACAC survey |
| Extracurriculars | 25-30% | SFFA testimony; CollegeVine analysis |
| Essays / personal statements | 10-15% | NACAC 56.2% "considerable importance" |
| Letters of recommendation | 8-12% | NACAC 51-52% "considerable importance" |
| Demonstrated interest / alumni interview | 5-10% | College-specific; varies widely |
#### 2.3.4 MIT-Specific EC Type Multipliers
MIT asks for only 4 activities (vs. CommonApp's 10-slot default), signaling depth over breadth. EC types receive different implicit weighting:
| EC Type | MIT Multiplier | Rationale |
| -------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------- | -------------------------------------------------------------------------------------------- |
| STEM Research (published/presented) | 1.15x | Directly aligns with MIT mission |
| Technical Competition (USAMO, USABO, USACO) | 1.10x | Objective national benchmark |
| Entrepreneurship / startup | 1.10x | Innovation culture at MIT |
| Community service (sustained, high impact) | 1.00x | Valued but not differentiating |
| Arts (national-level) | 1.00x | Valued; MIT has strong arts community |
| Generic volunteering | 0.85x | Common; limited signal value |
| Sports (DIII context) | 0.90x | Less emphasized than Ivy coaching |
#### 2.3.5 Spike Bonus Implementation
The "spike" concept captures the additional value of exceptional depth in a single activity vs. average performance across many activities:
```javascript
function calcECBonus(ecScore, ecTier) {
  // Base EC contribution (linear)
  const base = ecScore * EC_WEIGHT; // EC_WEIGHT ~ 0.25-0.30

  // Tier-based spike bonus (nonlinear cliff)
  let spikeBonus = 0;
  if (ecTier === 1 || ecScore >= 8.5) {
    spikeBonus = 0.08; // Substantial bonus for national-level achievement
  } else if (ecTier === 2 || ecScore >= 6.5) {
    spikeBonus = 0.03; // Moderate bonus for regional/state leadership
  }
  return base + spikeBonus;
}
```
***
### 2.4 Feeder School Effects
The feeder school effect is one of the most robust and undermodeled phenomena in elite admissions.
**Harvard Crimson 2024 investigation:**
* 21 schools sent 2,200+ students to Harvard over 15 years
* Top feeders: Boston Latin School, Phillips Andover, Stuyvesant HS, Phillips Exeter (100+ each)
* 5% of Harvard freshmen from just 7 schools: Boston Latin, Phillips Andover, Stuyvesant, Noble & Greenough, Phillips Exeter, Trinity (NYC), Lexington HS
* These 7 schools represent approximately 0.02% of all US high schools
**Chetty, Deming & Friedman (2023, NBER Working Paper 31492):**
* Children from top-1% families are 2x as likely to attend Ivy-Plus institutions with the same test scores
* This advantage is "almost entirely driven by elite private high school attendance"
* Private high school attendance mediates much of the socioeconomic gradient
| School Type | HYPSM Feeder Rate | All Elite (top 30) | Source |
|---|---|---|---|
| Elite boarding (Andover, Exeter, Groton) | 15-20% of graduates | 30-40% | Harvard Crimson; school profiles |
| Selective day/magnet schools (Stuyvesant, Boston Latin) | 8-12% | 20-30% | NSC estimates; journalism |
| Affluent suburban public (Lexington MA, Palo Alto) | 5-8% | 15-20% | School profiles; journalism |
| Average suburban public | 0.5-2% | 3-6% | NSC aggregate benchmarks |
| Rural/under-resourced public | 0.05-0.5% | 0.5-2% | NSC; Chetty analysis |
The feeder premium is not a single mechanism but a compound of several factors:
| Component | Multiplier Range | Mechanism |
|---|---|---|
| Application quality (counselor writing, strategy) | 1.3-1.5x | Experienced college counselors (33:1 ratio at Exeter vs. 372:1 national average) |
| Institutional trust/brand recognition | 1.2-1.4x | Admissions readers familiar with school rigor; grade inflation concerns |
| Peer effects (information, application norms) | 1.1-1.2x | Students apply to a wider and better-calibrated list |
| Direct alumni/counselor relationships | 1.1-1.2x | Informal communication between admissions and feeder schools |
| Combined (multiplicative) | 2.0-2.5x | After removing hooks already modeled (legacy, athlete, etc.) |
Note: The feeder premium applies to unhooked students. Hooked students already receive substantial multipliers; the feeder premium represents the additional advantage from institutional affiliation.
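As a sanity check on the combined figure, the component ranges can be multiplied directly. This is a sketch with hypothetical names. The raw product of the four ranges comes out to roughly 1.9-3.0x, so the 2.0-2.5x combined value sits inside that span, toward its lower half.

```javascript
// Component ranges from the table above, as [low, high]. Key names are hypothetical.
const FEEDER_COMPONENTS = {
  applicationQuality: [1.3, 1.5],
  institutionalTrust: [1.2, 1.4],
  peerEffects: [1.1, 1.2],
  counselorRelationships: [1.1, 1.2]
};

// Multiplies all lows together and all highs together.
function compoundRange(components) {
  return Object.values(components).reduce(
    ([lo, hi], [cLo, cHi]) => [lo * cLo, hi * cHi],
    [1, 1]
  );
}

const [feederLow, feederHigh] = compoundRange(FEEDER_COMPONENTS);
console.log(feederLow.toFixed(2), feederHigh.toFixed(2)); // 1.89 3.02
```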
| High School Category | Feeder Multiplier | Validation Target |
|---|---|---|
| Elite boarding (Andover, Exeter, Deerfield) | 2.0-2.5x | Student with 1470 SAT, 3.9 GPA → 15-20% at HYPSM |
| Selective magnet/exam school | 1.5-1.8x | Student with same profile → 10-14% at HYPSM |
| Strong suburban public (top quartile) | 1.3-1.5x | Student with same profile → 8-12% at HYPSM |
| Average suburban public | 1.0x (baseline) | Student with same profile → 5-8% at HYPSM |
| Rural/under-resourced | 0.9-1.0x | Profile matters more than school |
### 3.1 Student Portfolio Construction
CommonApp end-of-season reports provide the most reliable public data on applications per student:
| Year | Apps/Applicant | Notes |
|---|---|---|
| 2013-14 | 4.63 | CommonApp first full digital season |
| 2015-16 | 4.70 | Baseline reference year |
| 2018-19 | 5.20 | Pre-COVID |
| 2020-21 | 5.86 | COVID test-optional surge |
| 2021-22 | 6.11 | Post-COVID continuation |
| 2022-23 | 6.45 | New high at time |
| 2023-24 | 6.65 | Continued growth |
| 2024-25 | 6.80 | Most recent season |
Cumulative growth since 2015-16: approximately 44.7%.
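The growth figure follows directly from the table. A small sketch (names are hypothetical) that reproduces it, and shows how the result shifts if 2013-14 is used as the base year instead:

```javascript
// Applications per applicant, copied from the table above.
const APPS_PER_APPLICANT = {
  '2013-14': 4.63,
  '2015-16': 4.70,
  '2018-19': 5.20,
  '2020-21': 5.86,
  '2021-22': 6.11,
  '2022-23': 6.45,
  '2023-24': 6.65,
  '2024-25': 6.80
};

// Percentage growth between two seasons in the series.
function growthPercent(series, fromYear, toYear) {
  return (series[toYear] / series[fromYear] - 1) * 100;
}

console.log(growthPercent(APPS_PER_APPLICANT, '2015-16', '2024-25').toFixed(1)); // 44.7
console.log(growthPercent(APPS_PER_APPLICANT, '2013-14', '2024-25').toFixed(1)); // 46.9
```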
The mean of 6.80 masks substantial heterogeneity:
| Student Type | Mean Apps | Median | 90th Pct | Notes |
|---|---|---|---|---|
| Elite private school (Exeter, Andover) | 12-14 | 10-12 | 18-20 | Full Common App slate |
| Well-resourced suburban public | 8-10 | 7-9 | 14-16 | Counselor-guided |
| Average suburban public | 5-7 | 5-6 | 10-12 | Near overall mean |
| First-generation | 3-5 | 3-4 | 7-8 | Information/resource barriers |
| Under-resourced public | 2-4 | 2-3 | 5-7 | Fewer known options |
Standard college counseling framework:
| Category | Admit Probability | Recommended # | Notes |
|---|---|---|---|
| Reach | < 20% given student profile | 3-5 | Dream schools; admit probability calculation important |
| Target (Match) | 20-60% | 3-5 | Core list; good fit probability |
| Likely (Safety) | > 70% | 2-3 | Insurance; student would attend |
| ED/EA choice | Varies | 1 | Typically top-choice reach school |
Minimum recommended applications: 2-3 likely (safety) schools to protect against outcome risk. First-generation students disproportionately under-apply to safeties.
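The counseling categories can be sketched as a classifier over a student's estimated admit probability. The names are hypothetical. The table leaves the 60-70% band unassigned; this sketch folds it into "target", which is an assumption.

```javascript
// Classifies a school by the student's estimated admit probability (0-1).
// Assumption: the unassigned 60-70% band is treated as "target".
function classifySchool(admitProbability) {
  if (admitProbability < 0.20) return 'reach';
  if (admitProbability <= 0.70) return 'target';
  return 'likely';
}

// Counts how many schools on a list fall into each category.
function summarizeList(admitProbabilities) {
  const counts = { reach: 0, target: 0, likely: 0 };
  for (const p of admitProbabilities) counts[classifySchool(p)] += 1;
  return counts;
}

console.log(summarizeList([0.05, 0.10, 0.35, 0.55, 0.65, 0.80, 0.95]));
```

A list summary like this makes it easy to flag portfolios with too few likely schools, the pattern noted for first-generation students.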
ED provides the strongest commitment signal and corresponding boost:
| Factor | Value | Source |
|---|---|---|
| ED admit rate boost (unhooked) | 1.3-1.5x | Avery, Fairbanks & Zeckhauser 2003 |
| ED admit rate boost (all, including hooked) | 1.6-2.0x | Aggregate CDS comparison |
| % of class filled via early rounds | 40-60% | CommonApp; individual CDS reports |
| ED yield assumption (colleges' model) | ~98% | ED is binding; near-100% by definition |
| Optimal ED target | School where the ED boost is largest AND which is the student's true first choice | Theory + behavioral economics |
**Behavioral distortions in ED strategy:**
* Students from elite backgrounds are 2-3x more likely to apply ED to a highly selective school (Avery & Levin 2010)
* First-generation students often avoid ED due to inability to compare financial aid packages before binding commitment
* The ED mechanism parallels the Boston mechanism (immediate acceptance); it rewards sophisticated strategic actors
```javascript
function buildPortfolio(student, colleges) {
  const meanApps = {
    elite_private: 13,
    well_resourced: 9,
    average_public: 6,
    first_gen: 4
  };
  const targetApps = samplePoisson(meanApps[student.schoolType]);

  // Assign ED school (binding commitment)
  // Students select top-choice reach where ED boost is largest
  const edSchool = selectEDSchool(student, colleges);

  // Build ranked list by "enrollment utility"
  // enrollmentUtility = prestige*0.30 + netCost*0.30 + programFit*0.15 +
  //                     campusVisit*0.10 + geography*0.10 + peerInfluence*0.05
  const portfolio = colleges
    .map(c => ({ college: c, utility: enrollmentUtility(student, c) }))
    .sort((a, b) => b.utility - a.utility)
    .slice(0, targetApps);

  return { edSchool, portfolio };
}
```
#### 3.1.6 Behavioral Biases in Portfolio Construction
| Bias | Magnitude | Effect |
| ------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------ |
| Overconfidence | ~1.15x on self-assessed admit probability | Students apply to too many reaches, too few safeties |
| Herding | Correlation within same high school | Peer influence on school choice; amplifies prestige seeking |
| Rankings anchoring | 58% consult rankings; 3% know correct rank | Prestige weight dominates fit factors |
| Loss aversion | Stronger for waitlist outcomes than rejections | Waitlist "hope" is overvalued |
| Sunk cost | More apps = more attachment | Once applied, students over-weight any school they got into |
***
### 3.2 Student Yield & Enrollment Decisions
#### 3.2.1 Yield Rates by College Tier
Yield rate = proportion of admitted students who enroll. Source: CDS Part C for each school.
| College | Tier | Yield Rate (Class of 2029) | Notes |
| ---------------------------------------------------------------------- | ---------------------------------------------------------------- | --------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------- |
| MIT | HYPSM | 86.6% | Highest in simulation |
| Harvard | HYPSM | 83.6% | SCEA (non-binding EA) |
| Stanford | HYPSM | ~81% | SCEA |
| Princeton | HYPSM | 78.3% | SCEA |
| Yale | HYPSM | 67.7% | SCEA |
| — | — | — | — |
| Cornell | Ivy | 68.4% | RD/ED split |
| UPenn | Ivy | 67.9% | ED fills 53% |
| Brown | Ivy | 67.3% | ED |
| Columbia | Ivy | 67.1% | ED |
| Dartmouth | Ivy | 63.7% | ED fills 48% |
| — | — | — | — |
| Michigan | Selective | ~40-45% | State preference; large pool |
| Duke | Near-Ivy | ~40-45% | ED 51% |
| Notre Dame | Near-Ivy | ~55-60% | REA; strong loyalty |
| Georgetown | Near-Ivy | ~45% | Non-binding EA; no ED historically |
| Northwestern | Near-Ivy | ~35-40% | ED 53% |
| Johns Hopkins | Near-Ivy | ~35% | ED |
| Carnegie Mellon | Near-Ivy | ~30-35% | ED |
| WashU | Near-Ivy | ~35% | ED ~60% |
| Rice | Near-Ivy | ~35% | ED |
| Vanderbilt | Near-Ivy | ~35% | ED |
| — | — | — | — |
| Williams | LAC | ~35% | ED |
| Amherst | LAC | ~35% | ED |
| Middlebury | LAC | ~30% | ED fills 68% |
| — | — | — | — |
| UVA | Selective | ~30% | State preference (in-state ~85%, out-of-state ~15%) |
| UCLA | Selective | ~15-20% | UC system; many students hold multiple UC offers |
| Emory | Selective | ~25-30% | ED |
| Tufts | Selective | ~30% | ED |
| Boston College | Selective | ~25-30% | EA |
#### 3.2.2 Yield Tier Summary
| Tier | Yield Range | Key Drivers |
| ------------------------------------------------------------------------ | ------------------------------------------------------------------ | --------------------------------------------------------------------------------------------------- |
| HYPSM | 67-87% | Brand dominance; financial aid; ED/SCEA |
| Ivy+ | 63-69% | Strong ED programs; financial aid packages |
| Near-Ivy | 35-55% | Competition from HYPSM; cost sensitivity |
| Selective private | 25-40% | Many alternatives; cost sensitivity |
| Selective public | 15-45% | In-state vs. out-of-state splits; UC overlap |
| LAC | 28-38% | Niche appeal; overlap with Ivies |
#### 3.2.3 Yield Prediction Model
Enrollment utility model (based on college counseling research and behavioral economics):
    enrollmentScore = prestige_weight  * prestige_rank   +  // 0.30
                      financial_weight * net_cost_factor +  // 0.30
                      program_weight   * program_fit     +  // 0.15
                      visit_weight     * campus_visit    +  // 0.10
                      geography_weight * geo_proximity   +  // 0.10
                      peer_weight      * peer_influence     // 0.05
**Financial aid is the #1 enrollment factor among admitted students:**
* NACAC survey: 49% rate financial aid "very important" in enrollment decision
* Students without demonstrated need: prestige weight rises to ~0.45
* Students with high demonstrated need: financial weight rises to ~0.50
**57% of students enroll at their first-choice school** (NACAC), implying 43% end up at lower-ranked options due to financial aid, waitlists, or other constraints.
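The weighted model and the need-based weight shifts might be combined as in the sketch below. All names are hypothetical, and the `need` field on the student is an assumed attribute. The source gives only the shifted values (financial weight ~0.50, prestige weight ~0.45); rescaling the remaining weights so the total stays at 1.0 is an assumption made here.

```javascript
// Default weights from the enrollment utility model (sum to 1.0).
const BASE_WEIGHTS = {
  prestige: 0.30, financial: 0.30, program: 0.15,
  visit: 0.10, geography: 0.10, peer: 0.05
};

// Pins one weight to a new value and rescales the others so the total stays 1.0.
// The proportional rescaling rule is an assumption, not taken from the source.
function reweight(weights, key, newValue) {
  const scale = (1 - newValue) / (1 - weights[key]);
  const result = {};
  for (const [k, w] of Object.entries(weights)) {
    result[k] = k === key ? newValue : w * scale;
  }
  return result;
}

// student.need is a hypothetical field: 'high', 'none', or anything else for default.
function weightsFor(student) {
  if (student.need === 'high') return reweight(BASE_WEIGHTS, 'financial', 0.50);
  if (student.need === 'none') return reweight(BASE_WEIGHTS, 'prestige', 0.45);
  return BASE_WEIGHTS;
}

// factors: one value per weight key, each normalized to 0-1 (higher = more attractive).
function enrollmentScore(student, factors) {
  const w = weightsFor(student);
  return Object.keys(w).reduce((sum, k) => sum + w[k] * factors[k], 0);
}

const sampleFactors = { prestige: 0.9, financial: 0.4, program: 0.7, visit: 0.5, geography: 0.6, peer: 0.5 };
console.log(enrollmentScore({ need: 'high' }, sampleFactors), enrollmentScore({ need: 'none' }, sampleFactors));
```

With these sample factors, a high-prestige but expensive school scores lower for the high-need student than for the no-need student, which is the direction the heterogeneity findings describe.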
#### 3.2.4 Income-Based Yield Heterogeneity
| Income Bracket | Key Enrollment Driver | MIT-specific Pattern |
| ----------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------- |
| Bottom quintile (< $40K) | Financial aid package (Pell + school grants) | MIT no-loan policy eliminates cost barrier; yield ~90% |
| Middle quintile ($40K-$100K) | Net cost comparison across admits | Yield ~80-85% at MIT |
| Top quintile (> $200K) | Prestige ranking; peer/family expectations | Yield ~85-90%; strong prestige seekers |
***
### 3.3 College Admissions Decision Model
#### 3.3.1 Harvard's Holistic Rating System
Harvard uses six 1-6 rating scales (1 = best):
| Rating Dimension | Description |
| ----------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------- |
| Academic | Intellectual achievement; course rigor; GPA; test scores |
| Extracurricular | Depth, leadership, impact; national vs. regional vs. local |
| Personal | Character, integrity, empathy; essays; recommendations |
| Athletic | Sports achievement; recruited athlete status |
| Recommendation | Quality and specificity of counselor/teacher letters |
| Alumni Interview | If applicable; assessments vary by interviewer quality |
**Admit rates by Overall (summary) rating:**
| Overall Rating | Admit Rate |
| --------------------------------------------------------------------- | ----------------------------------------------------------------- |
| 1 | 100% |
| 2+ | ~90% |
| 2 | ~70% |
| 2- | ~35% |
| 3+ | ~20% |
| 3 | ~3% |
| 4+ | <1% |
#### 3.3.2 ALDC Categories and Admit Rates
ALDC = Athletes, Legacies, Dean's Interest List, Children of faculty/staff. SFFA trial data for Harvard (2014-2019):
| Category | Admit Rate | Overall Pool Rate |
| ------------------------------------------------------------------------------------ | ----------------------------------------------------------------- | ------------------------------------------------------------------------ |
| Recruited Athletes | ~86% | 3.4% |
| Faculty/Staff Children | ~47% | 3.4% |
| Dean's Interest List (donors) | ~42% | 3.4% |
| Legacies (parent attended) | ~34% | 3.4% |
| Non-ALDC | ~3.1% | 3.4% |
**ALDC composition of admitted classes (Harvard, 2014-2019):**
* 43% of white admits are ALDC
* < 16% of Black, Asian, or Hispanic admits are ALDC
* Among white ALDC admits: approximately 3 in 4 would be rejected on non-ALDC basis
#### 3.3.3 Hook Implementation: Multiplicative Approach
Hooks must be applied multiplicatively, not additively. If the baseline admit probability is p:
* `p_athlete = min(1.0, p × athlete_multiplier)`
* `p_legacy = min(1.0, p × legacy_multiplier)`
* `p_combined = min(1.0, p × athlete_multiplier × legacy_multiplier)`
The alternative (additive) approach would overstate the benefit for students with very low baseline probabilities and understate it for moderate-probability students.
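As a rough illustration of the point above, the sketch below compares the two approaches for a low-baseline and a moderate-baseline applicant. The multiplier (4.5) and the additive bonus (0.15) are illustrative assumptions chosen only to make the contrast visible; they are not calibrated values.

```javascript
// Hypothetical comparison of multiplicative vs. additive hook application.
// ATHLETE_MULTIPLIER and ADDITIVE_BONUS are illustrative assumptions.
const ATHLETE_MULTIPLIER = 4.5;
const ADDITIVE_BONUS = 0.15;

// Multiplicative form: p_hooked = min(1.0, p × multiplier)
function multiplicativeHook(p, multiplier) {
  return Math.min(1.0, p * multiplier);
}

// Additive alternative: p_hooked = min(1.0, p + bonus)
function additiveHook(p, bonus) {
  return Math.min(1.0, p + bonus);
}

for (const p of [0.005, 0.10]) {
  const mult = multiplicativeHook(p, ATHLETE_MULTIPLIER);
  const add = additiveHook(p, ADDITIVE_BONUS);
  console.log(`baseline=${p} multiplicative=${mult.toFixed(4)} additive=${add.toFixed(4)}`);
}
```

With these assumed values, the additive form gives the 0.5% applicant a much larger probability than the multiplicative form, and gives the 10% applicant a smaller one.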
#### 3.3.4 Round-Based Admit Threshold Multipliers
In a threshold-based model, the round "multiplier" scales the RD admission threshold: it expresses how much lower the bar is in early rounds than in RD.
| Round | Threshold Multiplier | Effective Boost | Notes |
| ---------------------------------------------------------------------------------------- | --------------------------------------------------------------------------- | ---------------------------------------------------------------------- | --------------------------------------------------------------------------------- |
| Early Decision (ED) | 0.70 | ~1.5-2.0x | Binding; largest boost |
| Early Action (EA) | 0.85 | ~1.2-1.4x | Non-binding; smaller boost |
| Single-Choice Early Action (SCEA) | 0.85 | ~1.2-1.4x | Non-binding but exclusive |
| Early Decision II (EDII) | 0.75 | ~1.4-1.7x | Between ED and EA |
| Regular Decision (RD) | 1.00 | Baseline | Full competition |
The threshold multiplier is applied to the score threshold below which candidates are rejected: a lower threshold value = more students clear the bar = higher admit rate.
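A minimal sketch of how the table could be applied, assuming (as in the simulation's other snippets) that a higher `admitScore` is better. The helper name `clearsThreshold` and the sample scores are hypothetical.

```javascript
// Threshold multipliers from the table above; RD is the baseline.
const ROUND_THRESHOLD_MULTIPLIER = { ED: 0.70, EDII: 0.75, EA: 0.85, SCEA: 0.85, RD: 1.00 };

// Hypothetical helper: true if the score clears the round-adjusted bar.
function clearsThreshold(admitScore, baseThreshold, round) {
  const multiplier = ROUND_THRESHOLD_MULTIPLIER[round];
  if (multiplier === undefined) throw new Error(`Unknown round: ${round}`);
  return admitScore >= baseThreshold * multiplier;
}

// Same applicant (score 0.60) against an RD threshold of 0.80, by round
for (const round of ['ED', 'EA', 'RD']) {
  console.log(round, clearsThreshold(0.60, 0.80, round));
}
```

The same applicant can clear the ED bar while falling short in EA and RD, which is the mechanism behind the "Effective Boost" column.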
#### 3.3.5 Waitlist Mechanics
```javascript
// Assumes a higher admitScore is better; roundMultiplier defaults to the RD baseline (1.00).
function assignWaitlist(student, college, roundMultiplier = 1.0) {
  const admitThreshold = college.baseThreshold * roundMultiplier;
  // Waitlist band: scores within 30% below the admit cutoff
  const waitlistThreshold = admitThreshold * 0.70;
  if (student.admitScore >= admitThreshold) return 'admit';
  if (student.admitScore >= waitlistThreshold) return 'waitlist';
  return 'deny';
}
```
**Waitlist activation rates (% of waitlisted students eventually admitted):**
* HYPSM: 0-10% (varies heavily by year; Princeton ranged 0.15% to 16.4%)
* Ivy+: 5-15%
* Near-Ivy: 10-20%
* Selective: 15-30%
***
### 3.4 College Enrollment Management
`Admits_needed = Target_class_size / Expected_yield_rate`
Colleges must over-admit to hit enrollment targets because yield is uncertain. If yield is lower than expected, the waitlist is activated. If yield is higher, the incoming class exceeds capacity (rare, but it has occurred at UVA and elsewhere).
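The over-admit arithmetic can be sketched as follows. The function names and the sample college (target class of 1,000, expected yield of 50%) are hypothetical; rounding up to whole admits is an assumption.

```javascript
// Admits_needed = Target_class_size / Expected_yield_rate, rounded up to whole admits.
function admitsNeeded(targetClassSize, expectedYieldRate) {
  if (expectedYieldRate <= 0 || expectedYieldRate > 1) {
    throw new RangeError('expectedYieldRate must be in (0, 1]');
  }
  return Math.ceil(targetClassSize / expectedYieldRate);
}

// Positive result = over-enrollment; negative = shortfall that the waitlist must cover.
function enrollmentGap(admits, realizedYieldRate, targetClassSize) {
  return Math.round(admits * realizedYieldRate) - targetClassSize;
}

const offers = admitsNeeded(1000, 0.50);   // hypothetical college
console.log('offers:', offers);
console.log('gap if yield is 45%:', enrollmentGap(offers, 0.45, 1000));
console.log('gap if yield is 55%:', enrollmentGap(offers, 0.55, 1000));
```

A five-point miss on yield in either direction moves enrollment by about 10% of the target in this example, which is why early rounds and the waitlist matter for enrollment management.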
Early rounds reduce yield uncertainty by locking in a portion of the class early (ED is binding; SCEA students are more likely to yield):
| Tier | Fill % via Early Rounds | ED/SCEA Yield | Notes |
|---|---|---|---|
| HYPSM | 15-25% (SCEA/REA, non-binding) | 70-85% via SCEA | Harvard/Princeton/Yale/Stanford use non-binding SCEA |
| Ivy+ (with ED) | 40-53% | ~98% ED | UPenn 53%, Northwestern 53%, Duke 51% |
| Near-Ivy | 35-50% | ~98% ED | WashU ~60%, Middlebury 68% |
| Selective private | 25-40% | ~97% ED | Varies widely |
| Selective public | 5-15% | EA, not ED | No binding commitment; lower certainty |
Melt = students who commit but do not enroll (withdraw after May 1 deposit).
| Tier | Melt Rate | Notes |
|---|---|---|
| Elite privates (HYPSM, Ivy+) | 1-3% | Strong brand; high engagement |
| Near-Ivy / selective private | 3-7% | Some competition from financial aid offers |
| Selective publics | 5-10% | UC system multiple admit; financial comparisons |
| Less selective | 10-40% | High competition; cost sensitivity |
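One way the melt table might be used is to discount deposits before comparing against the class target. This is a sketch only: the tier keys, the helper name, and the choice of the range midpoint as the point estimate are all assumptions.

```javascript
// Melt-rate ranges from the table above, as fractions of deposited students.
const MELT_RATE_RANGE = {
  elite_private: [0.01, 0.03],
  near_ivy_selective_private: [0.03, 0.07],
  selective_public: [0.05, 0.10],
  less_selective: [0.10, 0.40],
};

// Hypothetical helper: expected fall enrollment after melt, using the range midpoint.
function expectedEnrollmentAfterMelt(deposits, tier) {
  const range = MELT_RATE_RANGE[tier];
  if (!range) throw new Error(`Unknown tier: ${tier}`);
  const meltRate = (range[0] + range[1]) / 2;
  return Math.round(deposits * (1 - meltRate));
}

console.log(expectedEnrollmentAfterMelt(1000, 'elite_private'));
console.log(expectedEnrollmentAfterMelt(1000, 'selective_public'));
```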
```javascript
function activateWaitlist(college, enrolledCount) {
  const shortfall = college.targetClassSize - enrolledCount;

  if (shortfall <= 0) {
    college.waitlistActivated = false;
    return [];
  }

  // Activate waitlist; offer to top candidates until the shortfall is covered
  college.waitlistActivated = true;
  const waitlistAdmits = [...college.waitlist]          // copy so the stored waitlist is not reordered
    .sort((a, b) => b.admitScore - a.admitScore)
    .slice(0, Math.ceil(shortfall * 2.5))               // 2.5 offers per open seat, assuming waitlist yield of ~40%
    .map(s => ({ ...s, status: 'admitted_waitlist' }));

  return waitlistAdmits;
}
```
***
## 4. Open-Source Data Catalog
### 4.1 CommonApp & NACAC Data
#### 4.1.1 CommonApp Reports (Public)
| Report | URL | Data Available |
| ------------------------------------------------------------------------------------------------ | ---------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------ |
| 2024-25 End-of-Season | commonapp.org/research/data | Apps per applicant, applicant count, school breakdown |
| Historical reports (2014-2024) | commonapp.org/research/data | 10-year trend series |
| Annual State of College Admission (NACAC) | nacacnet.org/research | Factor importance survey, application trends |
**Key CommonApp statistics (2024-25):**
* Total applications: 10,193,579
* Unique applicants: 1,497,000+
* Applications per applicant: 6.80
* First-generation applicants: ~22-26% of domestic applicants
* International applicants: ~16% of total
#### 4.1.2 NACAC Factor Importance Survey
NACAC surveys admissions offices annually on factor importance. Most recent full dataset (Fall 2023, n=185 institutions):
| Factor | % Rating "Considerable Importance" |
| ------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------- |
| Grades in all courses | 93.0% |
| Grades in college-prep courses | 91.9% |
| Strength of curriculum | 86.5% |
| Character/personal qualities | 65.8% |
| Essay/writing sample | 56.2% |
| Counselor recommendation | 51.9% |
| Teacher recommendation | 51.3% |
| Extracurricular activities | 50.8% |
| Demonstrated interest | 43.3% |
| SAT/ACT scores | 30.3% |
| Class rank | 19.5% |
| Work experience | 14.2% |
| State residency | 11.3% |
| Interview | 9.4% |
| First-generation status | 8.6% |
| Legacy | 2.7% |
**Simulation implication:** Grades dominate at 93% vs. SAT/ACT at 30.3%. The simulation's academic score should weight GPA more heavily than test scores (recommended 60/40 GPA/SAT split within the academic index).
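A minimal sketch of the 60/40 split, assuming an unweighted 0-4.0 GPA scale and the 400-1600 SAT scale, both normalized linearly to [0, 1]. The normalization and the function name are assumptions; only the weights come from the recommendation above.

```javascript
// Weights follow the recommended 60/40 GPA/SAT split.
const GPA_WEIGHT = 0.60;
const SAT_WEIGHT = 0.40;

// Assumed normalization: GPA on 0-4.0, SAT on 400-1600, both clamped to [0, 1].
function academicIndex(gpa, sat) {
  const gpaNorm = Math.min(Math.max(gpa / 4.0, 0), 1);
  const satNorm = Math.min(Math.max((sat - 400) / 1200, 0), 1);
  return GPA_WEIGHT * gpaNorm + SAT_WEIGHT * satNorm;
}

console.log(academicIndex(3.9, 1520).toFixed(3));
console.log(academicIndex(3.5, 1580).toFixed(3));
```

Under this weighting, a GPA gap moves the index more than a comparable gap in normalized SAT score, in line with the NACAC factor ranking.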
#### 4.1.3 NACAC Data Access
* NACAC State of College Admission (annual PDF): free download at nacacnet.org
* Raw survey microdata: not publicly available; aggregate tables in annual report
* Trend reports: available for download 2005-present
***
### 4.2 HYPSM Common Data Sets
CDS data is published annually by each institution as required for participation in the U.S. News rankings. All data below is for Class of 2029 (admitted 2024-25 cycle).
#### 4.2.1 Harvard University
| Metric | Value |
| ------------------------------------------------------------------------------- | ---------------------------------------------------------------- |
| Total applicants | 47,893 |
| Admitted | 2,003 |
| Acceptance rate | 4.2% |
| Enrolled | 1,676 |
| Yield rate | 83.6% |
| EA admit rate | 7.6% |
| RD admit rate | 2.6% |
| SAT Composite middle 50% | 1500-1580 |
| SAT Math middle 50% | 760-800 |
| SAT EBRW middle 50% | 740-780 |
| ACT Composite middle 50% | 34-36 |
#### 4.2.2 Yale University
| Metric | Value |
| ------------------------------------------------------------------------------- | ---------------------------------------------------------------- |
| Total applicants | 57,517 |
| Admitted | 2,227 |
| Acceptance rate | 3.9% |
| SCEA admit rate | 10.0% |
| RD admit rate | 3.5% |
| SAT Composite middle 50% | 1480-1560 |
| SAT Math middle 50% | 750-800 |
| ACT Composite middle 50% | 33-36 |
#### 4.2.3 Princeton University
| Metric | Value |
| ------------------------------------------------------------------------------- | ---------------------------------------------------------------- |
| Total applicants | 40,468 |
| Admitted | 1,868 |
| Acceptance rate | 4.6% |
| SCEA admit rate | ~11% |
| Yield rate | 78.3% |
| SAT Composite middle 50% | 1480-1570 |
| SAT Math middle 50% | 760-800 |
| ACT Composite middle 50% | 33-36 |
#### 4.2.4 Stanford University
| Metric | Value |
| ------------------------------------------------------------------------------- | --------------------------------------------------------------------- |
| Total applicants | ~56,000 (est.) |
| Acceptance rate | ~3.7% |
| Yield rate | ~81% |
| SCEA admit rate | ~9-11% |
| SAT Composite middle 50% | 1480-1570 |
| ACT Composite middle 50% | 34-36 |
#### 4.2.5 MIT
| Metric | Value |
| ------------------------------------------------------------------------------- | ---------------------------------------------------------------------------- |
| Total applicants | ~28,000 |
| Accepted | ~1,300 |
| Acceptance rate | ~4.6% |
| Yield rate | 86.6% (Class of 2029) |
| EA admit rate | ~7-8% |
| RD admit rate | ~3-4% |
| SAT Math middle 50% | 780-800 |
| SAT Composite middle 50% | 1520-1580 |
| ACT Composite middle 50% | 35-36 |
***
### 4.3 Top 20-50 Common Data Sets
#### 4.3.1 Acceptance Rates (Class of 2029)
| College | Tier | Acceptance Rate | Yield (est.) |
| ---------------------------------------------------------------------- | ---------------------------------------------------------------- | ---------------------------------------------------------------------- | -------------------------------------------------------------------------- |
| Caltech | Ivy+ | 2.6% | ~50% |
| Columbia | Ivy | 3.9% | 67.1% |
| Brown | Ivy | 5.5% | 67.3% |
| Vanderbilt | Near-Ivy | 5.6% | ~35% |
| UPenn | Ivy | 5.9% | 67.9% |
| Duke | Near-Ivy | 5.9% | ~40-45% |
| Dartmouth | Ivy | 6.2% | 63.7% |
| UChicago | Ivy+ | 6.5% | ~60% |
| Johns Hopkins | Near-Ivy | 7.5% | ~35% |
| Williams | LAC | 7.5% | ~35% |
| Northwestern | Near-Ivy | 7.8% | ~35-40% |
| UCLA | Selective | 8.6% | ~15-20% |
| Cornell | Ivy | 8.7% | 68.4% |
| Amherst | LAC | 9.0% | ~35% |
| Rice | Near-Ivy | 9.5% | ~35% |
| Middlebury | LAC | 10.0% | ~30% |
| Carnegie Mellon | Near-Ivy | 11.3% | ~30-35% |
| Emory | Selective | 11.4% | ~25-30% |
| Tufts | Selective | 11.4% | ~30% |
| WashU | Near-Ivy | 12.0% | ~35% |
| Georgetown | Near-Ivy | 12.3% | ~45% |
| Notre Dame | Near-Ivy | 12.4% | ~55-60% |
| Boston College | Selective | 16.7% | ~25-30% |
| Michigan | Selective | 18.0% | ~40-45% |
| UVA | Selective | 20.0% | ~30% (out-of-state) |
#### 4.3.2 SAT Middle 50% by College
| College | SAT Composite M50% | SAT Math M50% |
| ---------------------------------------------------------------------- | ---------------------------------------------------------------------------- | -------------------------------------------------------------------- |
| Caltech | 1530-1580 | 790-800 |
| Columbia | 1500-1570 | 770-800 |
| UChicago | 1510-1580 | 770-800 |
| Northwestern | 1480-1570 | 750-800 |
| Brown | 1460-1560 | 740-800 |
| UPenn | 1460-1560 | 740-800 |
| Duke | 1480-1570 | 760-800 |
| Dartmouth | 1440-1560 | 730-790 |
| Cornell | 1420-1560 | 720-790 |
| Rice | 1480-1560 | 750-800 |
| WashU | 1480-1570 | 760-800 |
| Vanderbilt | 1480-1570 | 750-800 |
| Johns Hopkins | 1490-1570 | 760-800 |
| Notre Dame | 1430-1540 | 730-790 |
| Carnegie Mellon | 1480-1570 | 760-800 |
| Georgetown | 1400-1530 | 700-780 |
| Williams | 1430-1570 | 730-800 |
| Amherst | 1420-1560 | 720-790 |
| Middlebury | 1360-1530 | 680-770 |
| Emory | 1380-1530 | 700-770 |
| Tufts | 1400-1530 | 710-780 |
| Boston College | 1350-1510 | 700-770 |
| Michigan | 1340-1530 | 700-790 |
| UVA | 1340-1520 | 700-790 |
| UCLA | Test-free (UC system) | N/A |
***
### 4.4 Feeder School Datasets
#### 4.4.1 Available Public Data Sources
| Source | Data Available | Access |
| ----------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------ | --------------------------------------------------------------------------------- |
| NSC High School Benchmarks | % of HS seniors enrolling in 2-yr, 4-yr colleges; no school-specific college destinations | Free public PDF |
| Harvard Crimson investigation (2024) | Named feeder schools, 15-year totals, approximate counts | Free web |
| Chetty/Deming/Friedman (2023) NBER 31492 | Ivy-Plus enrollment by parental income decile; private HS effect | Free NBER preprint |
| Arcidiacono et al. SFFA trial exhibits | School-level admit rate variation (Harvard only, 2000-2017) | Court documents; public |
| Individual school profiles / naviance | School-specific college send lists | Restricted (student login) |
| NSC StudentTracker | Institution-level college enrollment data | Restricted; subscription |
#### 4.4.2 Key Feeder School Benchmarks (Harvard Crimson 2024)
| School | Harvard Students (15 years) | Annual Rate (est.) | Category |
| --------------------------------------------------------------------------- | ---------------------------------------------------------------------------------- | ------------------------------------------------------------------------- | ------------------------------------------------------------------------------- |
| Boston Latin School | 100+ | 7-8/year | Selective magnet/exam |
| Phillips Andover | 100+ | 7-8/year | Elite boarding |
| Stuyvesant HS | 100+ | 7-8/year | Selective exam school |
| Phillips Exeter | 100+ | 7-8/year | Elite boarding |
| Noble & Greenough | 70-100 | 5-7/year | Elite day school |
| Trinity School (NYC) | 70-100 | 5-7/year | Elite day school |
| Lexington HS (MA) | 70-100 | 5-7/year | Affluent suburban public |
#### 4.4.3 Chetty et al. Key Finding
From NBER Working Paper 31492 (Chetty, Deming & Friedman 2023):
* Children from top-1% families are 2.0x as likely to attend an Ivy-Plus school as middle-class children with the same test scores
* Children from top-0.1% families are approximately 3x as likely
* This advantage is "almost entirely driven" by private high school attendance
* Post-SAT-score effect: roughly half of the remaining socioeconomic gradient is explained by non-academic characteristics (ECs, recommendations, essays) and half by institutional preferences
***
## 5. Optimal Matching Theory
### 5.1 Gale-Shapley Algorithm
#### 5.1.1 Problem Statement
The stable matching problem: given n students and n colleges, each with complete preference rankings over the other side, find an assignment that is **stable**, meaning there is no student-college pair (s, c) where s prefers c to their current match AND c prefers s to one of its currently assigned students. Such a pair is called a blocking pair.
#### 5.1.2 Student-Proposing Deferred Acceptance
**Algorithm: Student-Proposing DA**
* **Input:** student preferences P_S, college preferences P_C, college capacities q_c
* **Initialize:** all students unmatched; all colleges have empty provisional lists
* **While** there exists an unmatched student s with proposals remaining:
  * s proposes to the next college c on s's preference list
  * **if** c has space (`|match(c)| < q_c`): c tentatively accepts s
  * **else:** let s' be c's least preferred current match
    * **if** c prefers s over s': c rejects s' (s' becomes unmatched and can propose again) and tentatively accepts s
    * **else:** c rejects s
* **Return** the final tentative acceptances as the matches
**Termination:** O(n²) total proposals in worst case; Ω(n²) lower bound — asymptotically optimal.
**Average case:** Under random preferences, expected proposals ≈ n ln n (Pittel 1989).
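The algorithm above can be sketched in JavaScript as follows. This is a minimal many-to-one version under assumed data shapes (plain objects keyed by string IDs); the function name and the three-student example are hypothetical.

```javascript
// Student-proposing deferred acceptance (many-to-one).
// studentPrefs: { studentId: [collegeId, ...] }  most preferred first
// collegePrefs: { collegeId: [studentId, ...] }  most preferred first
// capacities:   { collegeId: number }
function deferredAcceptance(studentPrefs, collegePrefs, capacities) {
  // Precompute each college's rank of each student (lower = more preferred)
  const rank = {};
  for (const [c, prefs] of Object.entries(collegePrefs)) {
    rank[c] = new Map(prefs.map((s, i) => [s, i]));
  }
  const held = {};                        // collegeId -> tentatively accepted students
  for (const c of Object.keys(collegePrefs)) held[c] = [];
  const nextChoice = {};                  // studentId -> index of next college to try
  const free = Object.keys(studentPrefs); // students who still need to propose
  for (const s of free) nextChoice[s] = 0;

  while (free.length > 0) {
    const s = free.pop();
    const prefs = studentPrefs[s];
    if (nextChoice[s] >= prefs.length) continue;   // list exhausted: stays unmatched
    const c = prefs[nextChoice[s]++];
    if (!rank[c] || !rank[c].has(s)) {             // c does not rank s: immediate rejection
      free.push(s);
      continue;
    }
    held[c].push(s);
    if (held[c].length > (capacities[c] ?? 0)) {
      // Over capacity: reject the least preferred tentatively held student
      held[c].sort((a, b) => rank[c].get(a) - rank[c].get(b));
      free.push(held[c].pop());
    }
  }
  return held;
}

const exampleMatch = deferredAcceptance(
  { s1: ['A', 'B'], s2: ['A', 'B'], s3: ['A', 'B'] },
  { A: ['s3', 's1', 's2'], B: ['s1', 's2', 's3'] },
  { A: 1, B: 1 }
);
console.log(exampleMatch);
```

In the example all three students prefer A, so two of them are displaced and propose again; one student ends up unmatched because total capacity is two.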
#### 5.1.3 Key Theorems
**Proposer-Optimality Theorem (Gale & Shapley 1962):**
Student-proposing DA produces the student-optimal stable matching: every student gets the best possible partner in any stable matching.
**Receiver-Pessimality:**
The student-optimal stable matching is simultaneously the college-pessimal stable matching: every college gets the worst possible stable match from their perspective.
**Strategy-Proofness (Roth 1982):**
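Under student-proposing DA, no student can benefit from misreporting preferences: truth-telling is a dominant strategy for students. Colleges, as the receiving side, are not covered by this guarantee and can sometimes gain by misreporting preferences or capacities; no stable mechanism is strategy-proof for both sides.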
**Rural Hospital Theorem:**
The same set of students is unmatched across all stable matchings. If a college is under-filled in one stable matching, it is under-filled in all stable matchings. You cannot "fill" a rural hospital by choosing a different stable matching algorithm.
#### 5.1.4 Nobel Prize and Real-World Applications
| System | Year | Algorithm | Notes |
| ------------------------------------------------------------------------------- | -------------------------------------------------------------------- | ---------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------- |
| NRMP (medical residency) | 1998 redesign | Resident-proposing DA | Previously used hospital-proposing; switched after Roth 1984 identified problem |
| NYC high school match | 2003 | Student-proposing DA | 80,000 students/year; designed by Abdulkadiroglu, Pathak, Roth |
| Boston school assignment | 2005 | Student-proposing DA | Replaced Boston mechanism |
| Nobel Prize | 2012 | — | Alvin Roth and Lloyd Shapley |
***
### 5.2 US Admissions as a Matching Market
#### 5.2.1 Why US College Admissions Is NOT a Stable Matching
| Feature | Medical Residency (NRMP) | US College Admissions |
| --------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------- |
| Coordination | Centralized clearinghouse | Decentralized; each school runs separately |
| Algorithm | Resident-proposing DA | None; sequential rounds |
| Timing | Simultaneous national match day | ED → EA → EDII → RD over 5 months |
| Binding commitments | Yes; match is binding | ED only; RD is non-binding |
| Student strategy-proofness | Yes (proposing side) | No |
| Stable matching | Yes, by design | No; many blocking pairs exist |
**Consequence:** Many "blocking pairs" exist in the outcome of the US admissions market: students who prefer College A and would have been admitted by College A had they applied, but who instead attend College B. Each such pair is a violation of stability, which is how matching theory characterizes this kind of market inefficiency.
#### 5.2.2 Unraveling in the Admissions Market
Roth & Xing (1994) describe "unraveling" in matching markets: when timing is decentralized, participants rush to make early commitments to reduce uncertainty, eventually moving so early that the market unravels (medical fellowships scheduling interviews 2 years in advance; college applications moving to October-November of senior year).
**Evidence in college admissions:**
* Share of class filled via early rounds: ~33% in 2010 → ~40-60% in 2024
* ED application volume at selective schools grew ~30-50% from 2015 to 2025
* Yale SCEA applications grew from ~4,600 to ~7,900 between 2015 and 2023
The ED mechanism is the admissions market's response to unraveling: a formal institution that provides binding commitment as a substitute for the coordinating role that a centralized clearinghouse would serve.
#### 5.2.3 ED as a Signaling Mechanism
Avery & Levin (2010, AER) model ED as a credible signal of first-choice preference:
* Without ED, students cannot credibly signal first-choice status (cheap talk)
* ED makes the signal costly (binding commitment; no financial aid comparison)
* Colleges rationally lower their admission threshold for ED applicants who signal genuine enthusiasm
* Sophisticated students (typically high-SES) are better positioned to identify and commit to their true first choice early
**Avery, Fairbanks & Zeckhauser (2003) empirical finding:**
* ED provides approximately +100 SAT points equivalent in admissions advantage
* More recent estimates (post-2010): +150-200 SAT points equivalent at some schools
**International analogs:**
* Turkey: centralized post-exam DA (OSYM)
* Brazil: SISU (centralized DA over ENEM scores)
* Chile: Sistema de Acceso
* Germany: Hochschulstart (medicine, law, pharmacy)
* Taiwan: multi-stage centralized system
#### 5.2.4 Theory Recommendation for Simulation
The simulation should NOT implement Gale-Shapley as its core engine. Instead:
* Model the decentralized, round-based sequential process (as currently done)
* ED fills colleges early with near-certainty (binding)
* Remaining rounds are sequential and non-binding
* Stability can be analyzed as a secondary metric (count blocking pairs) but is not the objective
* The simulation's round-based model correctly captures the market's actual dynamics
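For the secondary stability metric, a blocking-pair counter might look like the sketch below. The object shapes are assumptions made for illustration: each student carries an ordered `prefs` list of college IDs and an `enrolledCollege` reference (or `null`), and each college carries `capacity`, an `enrolled` array, and a `score` function standing in for its preference over students.

```javascript
// Counts (student, college) pairs where the student prefers the college to their
// outcome AND the college has a free seat or prefers the student to someone enrolled.
function countBlockingPairs(students, colleges) {
  const byId = new Map(colleges.map(c => [c.id, c]));
  let count = 0;
  for (const s of students) {
    // Position of the current enrollment in the student's list; unenrolled or
    // unlisted enrollments are treated as worse than every listed college.
    let currentIdx = s.enrolledCollege ? s.prefs.indexOf(s.enrolledCollege.id) : -1;
    if (currentIdx === -1) currentIdx = s.prefs.length;
    for (const collegeId of s.prefs.slice(0, currentIdx)) {
      const c = byId.get(collegeId);
      if (!c) continue;
      const hasSpace = c.enrolled.length < c.capacity;
      const weakestScore = Math.min(...c.enrolled.map(e => c.score(e)));
      if (hasSpace || c.score(s) > weakestScore) count++;
    }
  }
  return count;
}

// Hypothetical two-student example: the stronger student ended up at their second choice.
const bpStrong = { id: 's1', prefs: ['A', 'B'], strength: 0.9 };
const bpWeak = { id: 's2', prefs: ['A', 'B'], strength: 0.5 };
const bpA = { id: 'A', capacity: 1, score: s => s.strength, enrolled: [bpWeak] };
const bpB = { id: 'B', capacity: 1, score: s => s.strength, enrolled: [bpStrong] };
bpStrong.enrolledCollege = bpB;
bpWeak.enrolledCollege = bpA;
console.log(countBlockingPairs([bpStrong, bpWeak], [bpA, bpB]));
```

The count is quadratic in the worst case (students times list length), which should be acceptable at the simulation's scale of roughly 30 colleges.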
***
### 5.3 Student Welfare Optimization
#### 5.3.1 What Student-Optimal Means in Practice
In the NYC high school match (80,000 students/year):
* Before DA (2002 and earlier): 31,000 students per year were unmatched, receiving administrative assignment to a school they did not list
* After DA (2003 and later): approximately 3,000 unmatched — a 90% reduction
* Students received, on average, their 3rd choice rather than their 5th choice
**Abdulkadiroglu, Agarwal & Pathak (2017, AER):**
* Simulated counterfactual: what if students were randomly assigned (no market)?
* Student-proposing DA achieves ~80% of the welfare gains achievable by any mechanism
* Coordination effect vs. algorithm effect: moving students to express preferences simultaneously captures ~80% of total possible welfare gains, regardless of which algorithm is used
#### 5.3.2 Pareto Efficiency vs. Stability: The Fundamental Tradeoff
| Property | Student-Proposing DA | Top Trading Cycles (TTC) |
| -------------------------------------------------------------------------------- | -------------------------------------------------------------------------------- | -------------------------------------------------------------------------------- |
| Stable matching | Yes | No |
| Pareto efficiency | No | Yes |
| Strategy-proof (students) | Yes | Yes |
| Used in practice | Yes (school choice, NRMP) | Limited (kidney exchange) |
No mechanism can simultaneously achieve stability AND Pareto efficiency (in general). The choice between DA and TTC depends on context:
* DA guarantees that no student-school blocking pair exists (a fairness criterion)
* TTC guarantees that no student can be made better off without making another worse off
For school choice with priority-based fairness (e.g., siblings, neighborhood), DA is preferred.
#### 5.3.3 Boston Mechanism Failure and Equity Implications
The Boston mechanism (Immediate Acceptance) was used in many US cities' school choice systems before economic redesigns. Its flaw: it is NOT strategy-proof. Unsophisticated families who list their true first choice are at risk of being left without options if rejected.
**Agarwal & Somaini (2018):**
* Estimated structural model of Boston mechanism participation
* Welfare cost of Boston mechanism fell disproportionately on less sophisticated (lower-income, lower-education) families
* Sophisticated families (with more information) could strategically "game" the first-round priority to their advantage
**Analog to ED in college admissions:**
* ED mimics the first round of the Boston mechanism: committing early captures a large priority boost
* Families that cannot commit early (due to financial aid uncertainty) are at a structural disadvantage
* First-generation students disproportionately avoid ED — rational given their constraints, but costly in admission outcomes
#### 5.3.4 Equity Implications for Simulation
The simulation can model welfare equity by tracking:
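```javascript
// Equity metrics to compute post-simulation.
// countBlockingPairs is assumed to be defined elsewhere in the simulation.
const mean = xs => xs.reduce((a, b) => a + b, 0) / xs.length;
// Returns an array of groups (arrays), in first-seen key order
const groupBy = (items, keyFn) => {
  const groups = new Map();
  for (const item of items) {
    const key = keyFn(item);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(item);
  }
  return [...groups.values()];
};
const enrolled = students.filter(s => s.enrolledCollege); // skip unenrolled students
const equityMetrics = {
  // Yield-adjusted match quality
  meanCollegeRankByIncome: groupBy(enrolled, s => s.incomeQuintile)
    .map(g => mean(g.map(s => s.enrolledCollege.rank))),
  // ED usage by first-gen status
  edUsageByFirstGen: groupBy(students, s => s.firstGen)
    .map(g => g.filter(s => s.appliedED).length / g.length),
  // Blocking pairs count (stability diagnostic)
  blockingPairs: countBlockingPairs(students, colleges),
  // Students matched below safety school (rank numerically above their best safety)
  belowSafetyMatch: enrolled.filter(s =>
    s.safetySchools.length > 0 &&
    s.enrolledCollege.rank > Math.min(...s.safetySchools.map(c => c.rank))
  ).length
};
```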
***
### 5.4 K-12 School Choice Parallels
The Boston mechanism is also called Immediate Acceptance or First Preference First.
**Why it fails:** Any student who ranks their true first choice first risks being left without a seat if rejected. The strategically correct move is to rank a "safer" school first if the probability of getting your true first choice is low. Unsophisticated families cannot make this calculation.
**Under DA:**
* No assignment is final until the algorithm terminates
* A student who proposes to their first choice and is "tentatively accepted" can still be displaced later by another student (if the college prefers the new student)
* This eliminates the strategic incentive to misrank: truth-telling is a weakly dominant strategy
**Boston (2005) reform:** Boston Public Schools switched from the Boston mechanism to student-proposing DA after economists (Abdulkadiroglu, Pathak, Roth) identified the manipulation problem. The switch produced measurable welfare improvements.
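To make the manipulation problem concrete, here is a minimal sketch of Immediate Acceptance under its standard textbook definition: in round k, each unassigned student applies to their k-th choice, and schools permanently accept by priority up to their remaining seats. The data shapes, function name, and three-student example are hypothetical.

```javascript
// Immediate Acceptance (Boston mechanism).
// prefs:      { studentId: [schoolId, ...] }   most preferred first
// priority:   { schoolId: [studentId, ...] }   highest priority first
// capacities: { schoolId: number }
function immediateAcceptance(prefs, priority, capacities) {
  const assigned = {};                    // studentId -> schoolId (final once set)
  const seatsLeft = { ...capacities };
  const maxLen = Math.max(0, ...Object.values(prefs).map(p => p.length));
  for (let round = 0; round < maxLen; round++) {
    // Group still-unassigned students by the school they rank in this round
    const applicants = {};
    for (const [s, list] of Object.entries(prefs)) {
      if (assigned[s] !== undefined || round >= list.length) continue;
      if (!applicants[list[round]]) applicants[list[round]] = [];
      applicants[list[round]].push(s);
    }
    for (const [school, group] of Object.entries(applicants)) {
      const order = priority[school] ?? [];
      // Admit by priority up to remaining seats; acceptances are permanent
      group
        .filter(s => order.includes(s))
        .sort((a, b) => order.indexOf(a) - order.indexOf(b))
        .slice(0, seatsLeft[school] ?? 0)
        .forEach(s => { assigned[s] = school; seatsLeft[school]--; });
    }
  }
  return assigned;
}

const iaPriority = { X: ['s2', 's1', 's3'], Y: ['s1', 's3', 's2'], Z: ['s1', 's2', 's3'] };
const iaSeats = { X: 1, Y: 1, Z: 1 };
// s1 ranks truthfully (X first) vs. strategically (Y first)
const iaTruthful = immediateAcceptance(
  { s1: ['X', 'Y', 'Z'], s2: ['X', 'Y', 'Z'], s3: ['Y', 'X', 'Z'] }, iaPriority, iaSeats);
const iaStrategic = immediateAcceptance(
  { s1: ['Y', 'X', 'Z'], s2: ['X', 'Y', 'Z'], s3: ['Y', 'X', 'Z'] }, iaPriority, iaSeats);
console.log(iaTruthful.s1, iaStrategic.s1);
```

In the example, s1 holds top priority at Y but loses that seat by truthfully listing X first, because Y fills permanently in round one. Listing Y first secures it.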
**New York City high school admissions (2003 redesign by Abdulkadiroglu, Pathak, Roth):**
* Scale: ~80,000 8th graders, ~700 high school programs
* Previous system (Intermediate Preference Form): students submitted preferences and schools admitted by priority; 30,000 forms were processed manually
* Under the old system: 31,000 students/year received an administrative assignment to a school they did not list
* After DA implementation: ~3,000 unmatched (90% reduction)
* Students received, on average, their 3rd choice vs. their 5th choice under the old system
The parallel between college admissions and the Boston mechanism failure:
| Feature | Boston Mechanism (school choice) | ED in College Admissions |
|---|---|---|
| Commitment timing | Irrevocable in Round 1 | Binding commitment in ED |
| Strategic advantage | To sophisticated families who know to rank safe school | To high-SES families who can commit without financial aid comparison |
| Unsophisticated penalty | Ranked true first choice → stranded if rejected | Avoided ED due to uncertainty → lost 1.5-2.0x boost |
| Market outcome | Many students mismatched | First-gen students systematically undermatched |
Simulation extension ideas:
This section is the primary operational reference for the simulation's parameter configuration. Each subsection provides a recommended value, the range supported by the research, and the primary source.
### 6.1 Hook Multipliers
Hook multipliers represent the factor by which a student's admission probability is multiplied given a particular "hook" (special status that colleges value).
**Recommended implementation (athlete hook):** Per-tier athlete multipliers, not a single global value.
| College Tier | Schools | Recommended Multiplier | Range | Source |
|---|---|---|---|---|
| HYPS (DI slot system) | Harvard, Yale, Princeton, Stanford | 4.5x | 4.0-5.0x | SFFA trial; Espenshade & Chung 2005 |
| MIT (DIII, no slots) | MIT | 3.0x | 2.5-3.5x | MIT athletics office; estimated from admit rates |
| Other Ivy (DI) | Columbia, Penn, Brown, Dartmouth, Cornell | 4.0x | 3.5-4.5x | Ivy League athletics; coach feedback |
| Near-Ivy DI | Duke, Northwestern, Notre Dame | 4.0x | 3.5-4.5x | Similar to Ivy DI |
| Near-Ivy DIII | Caltech, WashU, CMU | 2.0x | 1.5-2.5x | DIII; athletics less central |
| LAC (NESCAC) | Williams, Amherst, Middlebury | 3.5x | 3.0-4.0x | NESCAC athletics culture |
| Selective public | UVA, UCLA, Michigan | 2.5x | 2.0-3.0x | Revenue sports only; walk-ons minimal |
**Current simulation:** 3.5x global. **Recommended upgrade:** per-tier values as above.
**Validation:** With a 4.5x athlete multiplier at Harvard-level schools, an athlete with academic_score = 0.50 (roughly 1400 SAT, 3.8 GPA) should see ~15-20% admit probability, consistent with SFFA data.
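The per-tier upgrade could be expressed as a lookup table with the current global value as a fallback. The tier keys and function names below are hypothetical; the multipliers are the recommended values from the table.

```javascript
// Recommended per-tier athlete multipliers (tier keys are hypothetical identifiers).
const ATHLETE_MULTIPLIER_BY_TIER = {
  hyps: 4.5,
  mit: 3.0,
  other_ivy: 4.0,
  near_ivy_d1: 4.0,
  near_ivy_d3: 2.0,
  lac_nescac: 3.5,
  selective_public: 2.5,
};
const GLOBAL_ATHLETE_FALLBACK = 3.5; // the current single global value

function athleteMultiplier(tierKey) {
  return ATHLETE_MULTIPLIER_BY_TIER[tierKey] ?? GLOBAL_ATHLETE_FALLBACK;
}

function hookedProbability(baseProb, tierKey) {
  return Math.min(1.0, baseProb * athleteMultiplier(tierKey));
}

// Validation arithmetic: a 4.5x multiplier lands in the 15-20% band only when
// the un-hooked baseline is roughly 3.3-4.4%.
for (const base of [0.033, 0.04, 0.044]) {
  console.log(base, hookedProbability(base, 'hyps').toFixed(3));
}
```

This makes the validation target checkable: if the simulation's un-hooked probability for an academic_score of 0.50 falls well outside roughly 3-4.5%, the 15-20% target will not be met with a 4.5x multiplier.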
**Legacy multipliers:**
| College Tier | Recommended Multiplier | Range | Source |
|---|---|---|---|
| HYPSM | 2.5x | 2.0-3.5x | SFFA trial; Arcidiacono; Espenshade |
| Ivy+ | 2.5x | 2.0-3.0x | General Ivy policy |
| Near-Ivy | 2.0x | 1.5-2.5x | Lower legacy emphasis |
| Selective | 1.5x | 1.2-2.0x | Variable by school |
| LAC | 2.0x | 1.5-2.5x | Strong alumni community |
| Selective public | 1.0x | 1.0-1.1x | Legacy less emphasized; state mission |
**Harvard SFFA data:** The legacy admit rate of 34% vs. 3.1% for non-ALDC applicants implies a gross multiplier of ~11x. After controlling for academic self-selection and correlation with other hooks, the net multiplier is estimated at 2.5-3.5x.
**Donor (Dean's Interest List) multipliers:**
| College Tier | Recommended Multiplier | Range | Source |
|---|---|---|---|
| HYPSM | 4.0x | 3.0-5.0x | SFFA trial; Dean's List 42% vs. 3.1% |
| Ivy+ | 3.5x | 2.5-4.5x | Similar process |
| Near-Ivy | 3.0x | 2.0-4.0x | Less transparent; estimated |
| Selective | 2.0x | 1.5-3.0x | Smaller endowments; less room |
| LAC | 2.5x | 2.0-3.5x | Large gift importance to smaller schools |
| Selective public | 1.0x | 1.0-1.5x | Legally constrained; foundation gifts |
**Note:** The donor hook should be rare in the population (< 0.5% of applicants at any school). Set the base rate accordingly.
**First-generation multipliers:**
| College Tier | Recommended Multiplier | Range | Source |
|---|---|---|---|
| HYPSM | 1.4x | 1.3-1.6x | QuestBridge partnerships; first-gen initiatives |
| Ivy+ | 1.35x | 1.2-1.5x | Similar programs |
| Near-Ivy | 1.25x | 1.1-1.4x | Varies by school commitment |
| Selective | 1.2x | 1.1-1.3x | NACAC data; first-gen flag |
| Selective public | 1.3x | 1.2-1.5x | State mission; in-state first-gen emphasis |
| LAC | 1.3x | 1.2-1.5x | LAC access missions |
**Note:** First-gen is a legal, race-neutral proxy. In the NACAC factor importance survey only 8.6% of institutions rate it as being of "considerable importance", but schools with strong Pell-eligible or QuestBridge commitments may weight it much more heavily.
When a student has multiple hooks, apply multiplicatively with a diminishing-returns cap:
```javascript
function applyHooks(baseProb, student, college) {
  let multiplier = 1.0;

  // Each applicable hook compounds multiplicatively
  if (student.isRecruited && college.hasAthleticSlots) {
    multiplier *= college.athleteMultiplier;
  }
  if (student.isLegacy && college.tier !== 'selective_public') {
    multiplier *= college.legacyMultiplier;
  }
  if (student.isDevelopment) {
    multiplier *= college.donorMultiplier;
  }
  if (student.isFirstGen) {
    multiplier *= college.firstGenMultiplier;
  }

  // Diminishing returns: cap multiplier at 8x to prevent degenerate outcomes
  multiplier = Math.min(multiplier, 8.0);

  return Math.min(1.0, baseProb * multiplier);
}
```
***
### 6.2 Round Multipliers by College Tier
Round multipliers reflect the lower admissions threshold (and thus higher admit probability) in early rounds. The multiplier is applied to the admission score threshold, not to the probability directly.
#### 6.2.1 Threshold Multipliers
A threshold multiplier of 0.70 means the school admits students whose score exceeds 70% of the normal RD threshold — making early admission easier.
| Round | Threshold Multiplier | Probability Boost (est.) | Notes |
| ------------------------------------------------------------------------------ | --------------------------------------------------------------------------- | ------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------- |
| ED (binding) | 0.70 | 1.5-2.0x | Strongest signal; near-100% yield commitment |
| EDII (binding) | 0.75 | 1.3-1.7x | Later; slightly weaker |
| EA / SCEA (non-binding) | 0.85 | 1.2-1.4x | Signal of interest without commitment |
| RD | 1.00 | Baseline | Full competition |
#### 6.2.2 Round Multipliers by Tier
Different tiers offer different round structures:
| Tier | ED Threshold | EA/SCEA Threshold | Notes |
| -------------------------------------------------------------------------- | ---------------------------------------------------------------------- | ------------------------------------------------------------------------ | --------------------------------------------------------------------------------------------------------------------- |
| HYPSM | N/A (SCEA only) | 0.85 | Harvard/Yale/Princeton/Stanford/MIT all use SCEA or EA, not ED |
| Ivy+ (with ED) | 0.70 | N/A | Columbia, Penn, Brown, Dartmouth, Cornell use ED |
| Near-Ivy (with ED) | 0.70 | N/A | Most Near-Ivies use ED |
| Near-Ivy (with REA) | N/A | 0.85 | Notre Dame uses REA (restrictive EA) |
| Near-Ivy (no early) | 1.00 | N/A | Georgetown (historically limited EA) |
| Selective | 0.70 | 0.85 | Many offer both ED and EA |
| Selective public | N/A | 0.90 | EA only; non-binding; weaker boost |
**Important:** Harvard and Stanford use "Restrictive Early Action" and Yale and Princeton use "Single-Choice Early Action" (both non-binding, but with restrictions on applying EA/ED elsewhere). MIT uses non-binding Early Action without those restrictions. All of these behave like EA in terms of commitment; the REA and SCEA variants add exclusivity restrictions.
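
One way to encode the per-tier table is a lookup in which `null` marks a round the tier does not offer. The sketch below is an assumption about how this could be structured; the tier keys and the function name `roundThresholdMultiplier` are hypothetical.

```javascript
// Per-tier early-round thresholds from the table above.
// null = round not offered by that tier. Keys are illustrative assumptions.
const TIER_ROUND_THRESHOLDS = {
  hypsm:             { ED: null, EA: 0.85 },
  ivy_plus:          { ED: 0.70, EA: null },
  near_ivy_ed:       { ED: 0.70, EA: null },
  near_ivy_rea:      { ED: null, EA: 0.85 },
  near_ivy_no_early: { ED: 1.00, EA: null },
  selective:         { ED: 0.70, EA: 0.85 },
  selective_public:  { ED: null, EA: 0.90 }
};

// Returns the threshold multiplier for a tier and round,
// or null when the tier is unknown or does not offer that round.
function roundThresholdMultiplier(tier, round) {
  if (round === 'RD') return 1.0;
  const entry = TIER_ROUND_THRESHOLDS[tier];
  const value = entry ? entry[round] : undefined;
  return value ?? null;
}
```

Returning `null` rather than 1.0 for unavailable rounds lets the caller distinguish "no early round" from "early round with no boost".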
#### 6.2.3 Fill Rates by Round
| Tier | % Class Filled via Early Rounds |
| ------------------------------------------------------------------------------------ | -------------------------------------------------------------------------------------- |
| HYPSM (via SCEA/REA) | 15-25% |
| Ivy+ (via ED) | 40-53% |
| Near-Ivy (via ED) | 35-60% |
| Selective private (via ED+EA) | 25-45% |
| Selective public (via EA) | 10-20% |
***
### 6.3 Yield Rates by Tier
#### 6.3.1 Recommended Yield Parameters
| College | Recommended Yield | CDS Source | Notes |
| ---------------------------------------------------------------------- | ------------------------------------------------------------------------ | ----------------------------------------------------------------- | --------------------------------------------------------------------------------------------- |
| MIT | 86.6% | CDS 2029 | Highest in simulation |
| Harvard | 83.6% | CDS 2029 | SCEA aids retention |
| Stanford | 81.0% | Estimated | SCEA |
| Princeton | 78.3% | CDS 2029 | SCEA |
| Yale | 67.7% | CDS 2029 | SCEA but more competition |
| Cornell | 68.4% | CDS 2029 | ED fills 38% |
| UPenn | 67.9% | CDS 2029 | ED 53% of class |
| Brown | 67.3% | CDS 2029 | ED |
| Columbia | 67.1% | CDS 2029 | ED |
| Dartmouth | 63.7% | CDS 2029 | ED 48% |
| Notre Dame | 57.0% | Estimated | REA; strong loyalty |
| Michigan | 43.0% | Estimated | State preference; EA |
| Duke | 42.0% | Estimated | ED 51% |
| Georgetown | 45.0% | Estimated | Limited early action |
| UChicago | 62.0% | Estimated | ED; strong enrollment mgmt |
| Northwestern | 38.0% | Estimated | ED 53%; competition |
| Caltech | 50.0% | Estimated | Very selective pool; high satisfaction |
| Rice | 35.0% | Estimated | ED; smaller school |
| Vanderbilt | 35.0% | Estimated | ED |
| WashU | 36.0% | Estimated | ED ~60% |
| Johns Hopkins | 35.0% | Estimated | ED |
| Carnegie Mellon | 33.0% | Estimated | ED; competition from CS programs |
| Williams | 35.0% | Estimated | ED; strong loyalty |
| Amherst | 35.0% | Estimated | ED |
| Middlebury | 30.0% | Estimated | ED 68%; smaller pool |
| Emory | 28.0% | Estimated | ED; competition |
| Tufts | 30.0% | Estimated | ED |
| Boston College | 27.0% | Estimated | EA; strong alternatives |
| UVA | 30.0% | Estimated | State preference (out-of-state ~15%) |
| UCLA | 18.0% | Estimated | UC system overlap; multiple admits |
#### 6.3.2 Yield Implementation
```javascript
function determineEnrollment(student, admits) {
  // Rank admits by enrollment utility, highest first
  // (calculateEnrollmentUtility is assumed to be defined elsewhere)
  const rankedAdmits = admits
    .map(college => ({
      college,
      utility: calculateEnrollmentUtility(student, college)
    }))
    .sort((a, b) => b.utility - a.utility);

  // Enroll at highest-utility admit; null when the student has no admits
  const enrolled = rankedAdmits[0]?.college ?? null;
  if (enrolled === null) {
    return null;
  }

  // Apply melt probability
  const meltRate = getMeltRate(enrolled);
  if (Math.random() < meltRate) {
    return null; // Student melts; re-open seat
  }
  return enrolled;
}

function getMeltRate(college) {
  const tier = college.tier;
  const meltRates = {
    hypsm: 0.02,
    ivy_plus: 0.03,
    near_ivy: 0.05,
    selective_private: 0.07,
    selective_public: 0.09,
    lac: 0.05
  };
  return meltRates[tier] || 0.05; // default for unknown tiers
}
```
***
### 6.4 Applications Per Student
| School Type | Mean Apps | Std Dev | Min | Max |
|---|---|---|---|---|
| Elite boarding (Andover, Exeter) | 13 | 3 | 7 | 22 |
| Well-resourced suburban public | 9 | 3 | 5 | 18 |
| Average suburban public | 6 | 2 | 3 | 14 |
| First-generation / under-resourced | 4 | 2 | 2 | 10 |
| Simulation overall mean (weighted) | 6.8 | — | — | — |
Distribution shape: Right-skewed (Poisson or negative binomial with mean = school-type parameter). Cap at 20 to prevent unrealistic extremes.
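
A Poisson draw with the cap described above could be sketched as follows. This uses Knuth's multiplication method, which is adequate for the small means in the table; the function names and the injectable `rng` parameter are assumptions for illustration, and the floor of 2 is taken from the smallest minimum in the table.

```javascript
// Knuth's multiplication method for Poisson sampling.
// `rng` is injectable so tests can be deterministic (assumption).
function samplePoisson(mean, rng = Math.random) {
  const limit = Math.exp(-mean);
  let count = 0;
  let product = rng();
  // Count how many uniform draws can be multiplied before
  // the running product drops to exp(-mean) or below.
  while (product > limit) {
    count += 1;
    product *= rng();
  }
  return count;
}

// Clamp the draw to [2, 20]: floor from the table, cap from the text above.
function sampleApplicationCount(mean, rng = Math.random) {
  return Math.min(20, Math.max(2, samplePoisson(mean, rng)));
}
```

A negative binomial would give a heavier right tail than Poisson; this sketch covers only the Poisson option.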
| Student Type | ED Usage Rate | Notes |
|---|---|---|
| Elite private school student | 60-75% | Sophisticated strategy; financial flexibility |
| Well-resourced suburban | 40-55% | Moderate strategic understanding |
| Average public | 25-35% | Limited counseling on ED strategy |
| First-generation | 10-20% | Financial aid comparison needed; ED avoidance rational |
```javascript
function generatePortfolioSize(student) {
  const meanApps = {
    elite_private: 13,
    well_resourced: 9,
    average_public: 6,
    first_gen: 4
  }[student.schoolType] || 6;

  // Poisson-like sampling (capped); samplePoisson is assumed to be defined elsewhere
  return Math.min(20, Math.max(2, Math.round(samplePoisson(meanApps))));
}

function willApplyED(student) {
  const edRate = {
    elite_private: 0.70,
    well_resourced: 0.48,
    average_public: 0.30,
    first_gen: 0.15
  }[student.schoolType] || 0.30;

  return Math.random() < edRate;
}
```
***
### 6.5 EC Scoring Weights
#### 6.5.1 Weight in Overall Admissions Score
| Factor | Recommended Weight | Range | Source |
| ---------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------- | ---------------------------------------------------------------- | ------------------------------------------------------------------------------------------ |
| Academic index (GPA + test) | 0.42 | 0.38-0.48 | NACAC grades 93%; Arcidiacono model |
| Extracurriculars | 0.27 | 0.22-0.32 | SFFA EC rating analysis |
| Essays / personal statement | 0.12 | 0.08-0.18 | NACAC 56%; holistic review |
| Recommendations | 0.10 | 0.07-0.13 | NACAC 51-52%; reader-dependent |
| Demonstrated interest / interview | 0.06 | 0.03-0.10 | College-specific; varies widely |
| School context (feeder mult.) | Applied as multiplier | — | See 6.7 |
| Hook multipliers | Applied as multiplier | — | See 6.1 |
**Within Academic Index:**
| Sub-factor | Recommended Weight |
| ---------------------------------------------------------------------------------- | ------------------------------------------------------------------------- |
| GPA | 0.60 |
| Standardized test (SAT/ACT) | 0.40 |
#### 6.5.2 EC Tier Bonus System
| Tier | Score Range | Bonus | Prevalence |
| -------------------------------------------------------------------------------------- | ------------------------------------------------------------------ | ------------------------------------------------------------ | ----------------------------------------------------------------------- |
| Tier 1 (national/international) | 8.5-10.0 | +0.08 | 2-3% of students |
| Tier 2 (state/regional) | 6.5-8.4 | +0.03 | 10-15% |
| Tier 3 (active participant) | 4.0-6.4 | +0.00 | 35-40% |
| Tier 4 (minimal) | 1.5-3.9 | -0.02 | 40-50% |
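
The tier bands above can be turned into a small lookup. This is a sketch under stated assumptions: the function names are hypothetical, and scores below 1.5 (which the table does not cover) are treated as Tier 4.

```javascript
// Map a 0-10 EC score to a tier using the score bands in the table above.
// Scores below 1.5 are not covered by the table; Tier 4 is assumed.
function ecTierFromScore(ecScore) {
  if (ecScore >= 8.5) return 1;
  if (ecScore >= 6.5) return 2;
  if (ecScore >= 4.0) return 3;
  return 4;
}

// Bonus per tier, from the table above.
const EC_TIER_BONUS = { 1: 0.08, 2: 0.03, 3: 0.0, 4: -0.02 };

function ecBonusFromScore(ecScore) {
  return EC_TIER_BONUS[ecTierFromScore(ecScore)];
}
```

Because the bonus is a step function of the score, a student at 8.4 and a student at 8.5 differ by 0.05 in bonus despite nearly identical base EC contributions.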
```javascript
function calcAdmitScore(student, college) {
  // Academic index: GPA and test scores combined on a 0-1 scale
  const gpaScore = sigmoid((student.gpa - 3.0) / 0.5) * 0.60;
  const satScore = sigmoid((student.sat - 1200) / 150) * 0.40;
  // Weight the index per 6.5.1 so it is on the same footing as the other factors
  const academicIndex = (gpaScore + satScore) * ACADEMIC_WEIGHT; // ACADEMIC_WEIGHT = 0.42

  // EC contribution
  const ecBase = (student.ecScore / 10.0) * EC_WEIGHT; // EC_WEIGHT = 0.27
  const ecBonus = student.ecTier === 1 ? 0.08 :
                  student.ecTier === 2 ? 0.03 :
                  student.ecTier === 4 ? -0.02 : 0;
  const ecContribution = ecBase + ecBonus;

  // Softer factors
  const softFactors = (student.essayScore / 10.0) * 0.12 +
                      (student.recScore / 10.0) * 0.10 +
                      (student.demonstratedInterest ? 0.06 : 0.03);

  // Combine
  let baseScore = academicIndex + ecContribution + softFactors;

  // Apply feeder multiplier to the full score (see 6.7.2)
  baseScore *= student.school.feederMultiplier;

  // Apply hook multipliers
  baseScore = applyHooks(baseScore, student, college);

  // Add perception noise, then clamp to [0, 1]
  const noise = randn() * PERCEPTION_NOISE;
  return Math.min(1.0, Math.max(0.0, baseScore + noise));
}
```
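
The scoring function above calls `sigmoid` and `randn` without defining them. A minimal sketch of both helpers follows, assuming a logistic sigmoid and a Box-Muller normal sampler; the source does not say which implementations the simulation uses, and the injectable `rng` parameter is an addition for testability.

```javascript
// Logistic function: maps any real input to the open interval (0, 1).
function sigmoid(x) {
  return 1 / (1 + Math.exp(-x));
}

// Standard normal draw via the Box-Muller transform.
// `rng` is injectable so tests can be deterministic (assumption).
function randn(rng = Math.random) {
  let u = 0;
  while (u === 0) u = rng(); // avoid Math.log(0)
  const v = rng();
  return Math.sqrt(-2.0 * Math.log(u)) * Math.cos(2.0 * Math.PI * v);
}
```

With these definitions, a 3.0 GPA or a 1200 SAT sits at the sigmoid midpoint (0.5), so the scale constants 0.5 and 150 control how quickly scores saturate above that point.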
***
### 6.6 Gender Multipliers
| College Category | Male Multiplier | Female Multiplier | Schools |
|---|---|---|---|
| STEM-heavy | 1.0x | 1.85x | MIT, Caltech, CMU, Harvey Mudd |
| Balanced elite | 1.0x | 1.05x | Harvard, Yale, Princeton, Stanford |
| Business-heavy | 1.0x | 1.05x | UPenn (Wharton effect) |
| LAC | 1.25x | 1.0x | Williams, Amherst, Middlebury |
| State flagship | 1.0x | 1.0x | Michigan, UVA, UCLA |
Rationale:

- STEM schools have a 1.8-2.0x female multiplier because female applicants are a minority (~24% at MIT vs. target 49% enrollment), creating a strong selection incentive
- The LAC multiplier for males reflects the national trend of male under-representation in higher education; some LACs report 58-65% female student bodies

The gender multiplier should be applied to the admit probability (not the raw score) to avoid boundary effects:
```javascript
function applyGenderMultiplier(baseProb, student, college) {
  const multiplier = college.genderMultipliers[student.gender] || 1.0;
  return Math.min(1.0, baseProb * multiplier);
}
```
***
### 6.7 Feeder School Multiplier
#### 6.7.1 Recommended Parameters
| School Type | Feeder Multiplier | Validation Target |
| ---------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------- |
| Elite boarding (Andover, Exeter, Deerfield, Groton) | 2.2x | Median student (1470 SAT, 3.9 GPA): 15-20% HYPSM |
| Selective exam/magnet (Stuyvesant, Boston Latin) | 1.6x | Same student: 10-14% HYPSM |
| Strong suburban public (top quartile nationally) | 1.4x | Same student: 8-12% HYPSM |
| Average suburban public | 1.0x (baseline) | Same student: 5-8% HYPSM |
| Rural/under-resourced | 0.95x | Same student: 4-6% HYPSM |
Note: The feeder multiplier is separate from and in addition to hook multipliers (athlete, legacy, donor). The feeder premium represents the residual advantage from school quality, counseling, and institutional trust after controlling for hooks that are modeled explicitly.
#### 6.7.2 Feeder Multiplier Application
The feeder multiplier is best applied to the overall admit score (or alternatively only to the "soft factors" component):
```javascript
// Option A: apply feeder multiplier to full admit score
// (Reflects holistic quality lift: counseling, essays, recommendations)
baseScore *= student.school.feederMultiplier;

// Option B (alternative; use instead of Option A, never both):
// apply only to soft factors
// softFactors *= student.school.feederMultiplier;
```
Recommended: Apply to the full baseScore before noise but after academic index computation. This prevents the multiplier from inflating already-near-ceiling academic scores unrealistically.
After feeder multiplier calibration, the following outcomes should be approximately correct:
| School | Student Profile | Expected HYPSM Admit Rate |
|---|---|---|
| Phillips Exeter | 1470 SAT, 3.9 GPA, Tier 2 EC | 15-20% |
| Strong suburban public | 1470 SAT, 3.9 GPA, Tier 2 EC | 7-10% |
| Average public | 1470 SAT, 3.9 GPA, Tier 2 EC | 5-7% |
| Under-resourced | 1470 SAT, 3.9 GPA, Tier 2 EC | 4-6% |
| Phillips Exeter | 1550 SAT, 4.0 GPA, Tier 1 EC | 35-50% |
| Average public | 1550 SAT, 4.0 GPA, Tier 1 EC | 15-25% |
***
### 6.8 Perception Noise Parameter
The perceptionNoise parameter models the inherent randomness in holistic admissions review: two equally qualified candidates may receive different outcomes due to:

- Variability in essay quality interpretation across readers
- Recommendation letter quality variance
- Interview performance variance
- Day-to-day variation in admissions committee mood/focus
- Random overlap with the college's diversity goals in a particular cycle

This is distinct from the student's actual qualifications (which are measured with high fidelity) and is better understood as measurement error in the admissions process.
The perception noise should be added to the admit score as normally distributed random noise:
```javascript
admitScore = baseScore + randn() * PERCEPTION_NOISE_SIGMA;
```
| College Tier | Recommended Sigma | Range | Rationale |
|---|---|---|---|
| HYPSM | 0.08 | 0.06-0.12 | Highly holistic; large reader-to-reader variance |
| Ivy+ | 0.07 | 0.05-0.10 | Similar holistic process |
| Near-Ivy | 0.06 | 0.04-0.09 | Slightly more formula-driven |
| Selective private | 0.05 | 0.03-0.08 | More quantitative criteria |
| Selective public | 0.04 | 0.02-0.06 | Most formula-driven; GPA/test weighted heavily |
| LAC | 0.07 | 0.05-0.10 | Small staff; high holistic variance |
Global default: A single PERCEPTION_NOISE = 0.07 sigma applied universally is a reasonable simplification if per-tier noise is not implemented.
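
A per-tier sigma lookup with the global default as a fallback might look like the sketch below. The constant and function names are hypothetical; the tier keys are assumed to match those used in `getMeltRate`, and `standardNormal` stands for any function that returns an N(0, 1) draw.

```javascript
// Recommended per-tier sigmas from the table above.
// Tier keys are assumed to match those used elsewhere in the simulation.
const PERCEPTION_NOISE_BY_TIER = {
  hypsm: 0.08,
  ivy_plus: 0.07,
  near_ivy: 0.06,
  selective_private: 0.05,
  selective_public: 0.04,
  lac: 0.07
};
const PERCEPTION_NOISE_DEFAULT = 0.07; // global fallback

function perceptionSigma(tier) {
  return PERCEPTION_NOISE_BY_TIER[tier] ?? PERCEPTION_NOISE_DEFAULT;
}

// Add tier-scaled noise to a base score and clamp to [0, 1].
// `standardNormal` is any function returning a N(0, 1) draw.
function noisyAdmitScore(baseScore, tier, standardNormal) {
  const noisy = baseScore + standardNormal() * perceptionSigma(tier);
  return Math.min(1.0, Math.max(0.0, noisy));
}
```

Passing the sampler in as an argument keeps the lookup testable with a fixed draw, for example `noisyAdmitScore(0.5, 'hypsm', () => 1)` to see the effect of a +1 sigma reading.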
At HYPSM admit rates (~4%), the score threshold is very high. A sigma of 0.08 on a 0-1 scale means:

- ~16% of students score ≥1 sigma below their "true" score
- ~16% score ≥1 sigma above
- This is equivalent to a ±12 percentile rank swing in competitive applicant scoring

The noise ensures:

A useful distinction:

- Perception noise = variance in how the college evaluates a student's fixed characteristics
- Application quality noise = variance in how well the student actually performs (essays, interview); this should be modeled as part of the student's essay/rec scores, not as noise
- Round-to-round news = variance in what happens between rounds (a competitor defers; a spot opens); this is captured by round mechanics, not noise

The noise parameter should only capture reader-level variance, not student performance variance.
***
### 6.9 Acceptance Rate Calibration Targets
The simulation's acceptance rates should match real CDS data within a reasonable tolerance band. The following table provides validation targets:
| College | Target Accept Rate | Acceptable Range | Source |
|---|---|---|---|
| Harvard | 4.2% | 3.5-5.0% | CDS 2029 |
| Yale | 3.9% | 3.2-4.8% | CDS 2029 |
| Princeton | 4.6% | 3.8-5.5% | CDS 2029 |
| Stanford | 3.7% | 3.0-4.5% | Estimated |
| MIT | 4.6% | 3.8-5.5% | CDS 2029 |
| Caltech | 2.6% | 2.0-3.5% | CDS 2029 |
| Columbia | 3.9% | 3.2-4.8% | CDS 2029 |
| Brown | 5.5% | 4.5-6.5% | CDS 2029 |
| Vanderbilt | 5.6% | 4.5-7.0% | CDS 2029 |
| UPenn | 5.9% | 4.8-7.2% | CDS 2029 |
| Duke | 5.9% | 4.8-7.2% | CDS 2029 |
| Dartmouth | 6.2% | 5.0-7.5% | CDS 2029 |
| UChicago | 6.5% | 5.2-8.0% | CDS 2029 |
| JHU | 7.5% | 6.0-9.0% | CDS 2029 |
| Williams | 7.5% | 6.0-9.5% | CDS 2029 |
| Northwestern | 7.8% | 6.2-9.5% | CDS 2029 |
| UCLA | 8.6% | 7.0-10.5% | CDS 2029 |
| Cornell | 8.7% | 7.0-10.5% | CDS 2029 |
| Amherst | 9.0% | 7.2-11.0% | CDS 2029 |
| Rice | 9.5% | 7.5-11.5% | CDS 2029 |
| Middlebury | 10.0% | 8.0-12.5% | CDS 2029 |
| Carnegie Mellon | 11.3% | 9.0-13.5% | CDS 2029 |
| Emory | 11.4% | 9.0-14.0% | CDS 2029 |
| Tufts | 11.4% | 9.0-14.0% | CDS 2029 |
| WashU | 12.0% | 9.5-15.0% | CDS 2029 |
| Georgetown | 12.3% | 10.0-15.0% | CDS 2029 |
| Notre Dame | 12.4% | 10.0-15.0% | CDS 2029 |
| Boston College | 16.7% | 13.0-20.0% | CDS 2029 |
| Michigan | 18.0% | 14.0-22.0% | CDS 2029 |
| UVA | 20.0% | 16.0-24.0% | CDS 2029 |
Calibration method: Run the simulation 50 times with the same parameter set. Check that the mean acceptance rate across runs falls within the "acceptable range" for each school. If calibration fails, adjust the college's baseThreshold parameter (not the hook multipliers) to hit the target.
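
The calibration method above could be automated roughly as follows. This is a sketch, not the simulation's actual procedure: `runOnce(baseThreshold)` is a hypothetical callback that runs one full simulation and returns a college's acceptance rate as a fraction, and the bisection search assumes the acceptance rate falls monotonically as the threshold rises.

```javascript
// Mean acceptance rate over repeated runs with the same parameter set.
// `runOnce` is a hypothetical callback: (baseThreshold) => rate in [0, 1].
function meanAcceptRate(runOnce, baseThreshold, runs = 50) {
  let total = 0;
  for (let i = 0; i < runs; i++) total += runOnce(baseThreshold);
  return total / runs;
}

// Bisection on baseThreshold. Stops once the mean rate lands inside
// [minRate, maxRate] or the iteration budget is spent.
function calibrateThreshold(runOnce, { minRate, maxRate }, options = {}) {
  let { low = 0.0, high = 1.0, runs = 50, maxIterations = 30 } = options;
  let threshold = (low + high) / 2;
  let rate = NaN;
  for (let i = 0; i < maxIterations; i++) {
    threshold = (low + high) / 2;
    rate = meanAcceptRate(runOnce, threshold, runs);
    if (rate >= minRate && rate <= maxRate) {
      return { threshold, rate, converged: true };
    }
    if (rate > maxRate) low = threshold; // too many admits: raise the bar
    else high = threshold;               // too few admits: lower the bar
  }
  return { threshold, rate, converged: false };
}
```

For Harvard's row, the call would pass `{ minRate: 0.035, maxRate: 0.050 }`. Hook multipliers stay fixed throughout, so only `baseThreshold` moves.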
***
### 6.10 Enrollment Management Parameters
Colleges must over-admit to hit enrollment targets. The over-admission ratio is admits_target / enrollment_target, which equals 1/yield:
| Tier | Over-Admission Ratio |
|---|---|
| HYPSM | 1/yield ≈ 1.15-1.45x |
| Ivy+ | 1/yield ≈ 1.45-1.58x |
| Near-Ivy | 1/yield ≈ 1.82-2.86x |
| Selective private | 1/yield ≈ 2.50-4.00x |
| Selective public | 1/yield ≈ 2.22-6.67x |
| LAC | 1/yield ≈ 2.63-3.57x |
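
The arithmetic behind the table is a reciprocal of yield. A minimal sketch, with hypothetical function names:

```javascript
// Over-admission ratio is the reciprocal of the expected yield.
function overAdmissionRatio(yieldRate) {
  if (!(yieldRate > 0 && yieldRate <= 1)) {
    throw new RangeError('yieldRate must be in (0, 1]');
  }
  return 1 / yieldRate;
}

// Number of admits to issue for a target class size, rounded up.
function admitsNeeded(enrollmentTarget, yieldRate) {
  return Math.ceil(enrollmentTarget * overAdmissionRatio(yieldRate));
}
```

For instance, with MIT's recommended yield of 86.6% and a hypothetical target class of 1,100, `admitsNeeded(1100, 0.866)` gives 1,271 admits, a ratio of about 1.15x.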
| Tier | Waitlist Size (students) | Activation Threshold |
|---|---|---|
| HYPSM | 1,000-3,000 students | Very rarely activated |
| Ivy+ | 500-2,000 students | Activated in low-yield years |
| Near-Ivy | 300-1,000 students | Activated annually |
| Selective | 200-800 students | Frequently activated |
| Tier | Waitlist Offer-to-Enroll Rate |
|---|---|
| HYPSM | 15-25% (if offered) |
| Ivy+ | 25-35% |
| Near-Ivy | 35-50% |
| Selective | 40-60% |
Lower-ranked schools on the waitlist face higher melt risk (students holding more attractive offers). Higher-ranked schools have higher waitlist yield because being waitlisted signals high interest.
For quick reference, all key calibration parameters consolidated:
| Parameter | Recommended Value | Notes |
|---|---|---|
| Athlete multiplier (HYPS) | 4.5x | Per-tier recommended |
| Athlete multiplier (MIT) | 3.0x | DIII; no slots |
| Athlete multiplier (Ivy) | 4.0x | Formal slot system |
| Athlete multiplier (Near-Ivy DI) | 4.0x | — |
| Athlete multiplier (LAC) | 3.5x | NESCAC |
| Legacy multiplier (HYPSM/Ivy) | 2.5x | — |
| Legacy multiplier (Near-Ivy) | 2.0x | — |
| Donor multiplier (HYPSM/Ivy) | 4.0x | Rare; < 0.5% of pool |
| First-gen multiplier (HYPSM) | 1.4x | Legal; race-neutral |
| First-gen multiplier (overall) | 1.3x | — |
| ED threshold multiplier | 0.70 | Lowers admission bar |
| EA/SCEA threshold multiplier | 0.85 | Lowers admission bar |
| EDII threshold multiplier | 0.75 | Between ED and EA |
| HYPSM yield | 67-87% | Per-college recommended |
| Ivy+ yield | 63-69% | Per-college |
| Near-Ivy yield | 35-55% | Per-college |
| Selective yield | 15-45% | Per-college |
| Mean apps (elite boarding) | 13 | Per-school-type |
| Mean apps (avg public) | 6 | Per-school-type |
| Mean apps (first-gen) | 4 | Per-school-type |
| EC weight | 0.27 | In total score |
| Academic weight | 0.42 | GPA 60%, test 40% |
| EC Tier 1 bonus | +0.08 | Cliff-based |
| EC Tier 2 bonus | +0.03 | — |
| Gender mult. (female at MIT) | 1.85x | Structural STEM supply gap |
| Gender mult. (male at LAC) | 1.25x | Male under-representation |
| Feeder mult. (elite boarding) | 2.2x | After removing explicit hooks |
| Feeder mult. (avg public) | 1.0x | Baseline |
| Perception noise sigma | 0.07 | Normal distribution |
| Melt rate (elite private) | 2% | Post-commit dropout |
| Melt rate (selective public) | 9% | UC overlap; cost comparison |
| Waitlist threshold buffer | 1.30x | 30% above admit threshold |
| Citation | Key Finding | File |
|---|---|---|
| Arcidiacono, P. et al. (2022). "Racial Classification and the Admissions at Elite Universities." NBER Working Paper 29225. | Race multipliers from probit model on Harvard 2000-2017 admissions data | mit_race_gender.md |
| Chetty, R., Deming, D., & Friedman, J. (2023). "Diversifying Society's Leaders? The Determinants and Causal Effects of Admission to Highly Selective Private Colleges." NBER Working Paper 31492. | Top-1% families 2x as likely at Ivy-Plus; private HS mediates effect | data_feeder_schools.md, exeter_mit_pipeline.md |
| Espenshade, T. & Chung, C. (2005). "The Opportunity Cost of Admission Preferences at Elite Universities." Social Science Quarterly. | Athlete hook = +200 SAT pts; legacy = +160 SAT pts | mit_athletic_hooks.md |
| Avery, C., Fairbanks, A., & Zeckhauser, R. (2003). "The Early Admissions Game." Harvard University Press. | ED provides ~+100 SAT pts equivalent admission advantage | college_matching_market.md |
| Avery, C. & Levin, J. (2010). "Early Admissions at Selective Colleges." American Economic Review. | ED as credible signaling mechanism; dominant for first-choice | college_matching_market.md |
| Abdulkadiroglu, A., Agarwal, N., & Pathak, P. (2017). "The Welfare Effects of Coordinated Assignment." American Economic Review. | 80% of welfare gains from coordination; algorithm choice secondary | student_welfare_matching.md |
| Agarwal, N. & Somaini, P. (2018). "Demand Analysis Using Strategic Reports." Econometrica. | Boston mechanism welfare cost fell disproportionately on less-sophisticated families | student_welfare_matching.md, k12_school_choice.md |
| Roth, A. & Xing, X. (1994). "Jumping the Gun: Imperfections and Institutions Related to the Timing of Market Transactions." American Economic Review. | Unraveling in matching markets | college_matching_market.md |
| Roth, A. (1982). "The Economics of Matching: Stability and Incentives." Mathematics of Operations Research. | Strategy-proofness only for proposing side | gale_shapley_algorithm.md |
| Pittel, B. (1989). "The Average Number of Stable Matchings." SIAM Journal on Discrete Mathematics. | Expected O(n ln n) proposals under random preferences | gale_shapley_algorithm.md |
| Document | Key Finding | File |
|---|---|---|
| SFFA v. Harvard (2023), Supreme Court | Eliminated race-conscious admissions at Harvard and UNC | mit_race_gender.md |
| SFFA v. Harvard, Expert Testimony (2018) | Arcidiacono probit model; ALDC admit rates; race multiplier evidence | college_decision_model.md, mit_race_gender.md |
| SFFA v. Harvard, Trial Exhibits | Harvard 1-6 rating scale; EC data by admit rate | mit_extracurriculars.md, college_decision_model.md |
| Source | Data Available | URL |
|---|---|---|
| CommonApp End-of-Season Reports (2024-25) | Applications per applicant, total applicants, demographics | commonapp.org/research/data |
| NACAC State of College Admission (2023) | Factor importance survey; counselor-to-student ratio data | nacacnet.org |
| Harvard Common Data Set (2024-25) | Acceptance rate, yield, SAT ranges, class size | https://oira.harvard.edu/common-data-set/ |
| Yale Common Data Set (2024-25) | Acceptance rate, yield, SAT ranges | https://oir.yale.edu/common-data-set |
| Princeton Common Data Set (2024-25) | Acceptance rate, yield, SAT ranges | https://ir.princeton.edu/common-data-set |
| MIT Common Data Set (2024-25) | Acceptance rate, yield, SAT ranges | https://ir.mit.edu/common-data-set |
| All 30 college CDS files | Acceptance, yield, SAT, class size | Individual institutional IR offices |
| Source | Key Data | File |
|---|---|---|
| Harvard Crimson (2024). "More Than One in 10 Harvard Undergrads Come From Just 21 Schools." | Named feeder school list; 15-year send data | data_feeder_schools.md, exeter_mit_pipeline.md |
| Wall Street Journal (2023 series on Ivy admissions) | Hook multiplier reporting; donor preference transparency | college_decision_model.md |
| MIT News (2024). "MIT Class of 2028 Profile." | Post-SFFA demographic shifts: Black 13%→5%, Asian 41%→47% | mit_race_gender.md |
| Citation | Key Contribution | File |
|---|---|---|
| Gale, D. & Shapley, L. (1962). "College Admissions and the Stability of Marriage." American Mathematical Monthly. | Original stable matching algorithm; student-optimal proof | gale_shapley_algorithm.md |
| Abdulkadiroglu, A., Pathak, P., & Roth, A. (2005). "The New York City High School Match." American Economic Review P&P. | NYC school choice implementation | k12_school_choice.md |
| Abdulkadiroglu, A. & Sonmez, T. (2003). "School Choice: A Mechanism Design Approach." American Economic Review. | Formal analysis of Boston mechanism failure | k12_school_choice.md |
| Roth, A. (2008). "Deferred Acceptance Algorithms: History, Theory, Practice, and Open Questions." International Journal of Game Theory. | Comprehensive DA review | gale_shapley_algorithm.md |
For completeness, the following data sources were referenced in research but are NOT publicly available without institutional access:
| Source | Data | Access |
|---|---|---|
| National Student Clearinghouse StudentTracker | Student-level college enrollment by high school | Institutional subscription (~$5,000/yr) |
| Naviance "Where they got in" | School-specific college acceptance data | Student login; school purchase |
| College admissions raw data | Individual applications, scores, decisions | Institutional; privacy-restricted |
| Common Application raw microdata | Individual-level app data | Member institution access |
| Arcidiacono SFFA dataset | Harvard admissions 2000-2017 | Court exhibits; partial public release |
| spec.md Section | Parameters from This Document |
|---|---|
| Section 2: Student Generation | feederMultiplier, appsPerStudent by school type, ecTier distribution |
| Section 3: College Admissions | hookMultipliers, roundThresholds, perceptionNoise, genderMultipliers |
| Section 4: Student Decisions | yieldRates, enrollmentUtility weights, meltRates |
| Section 5: Waitlist | waitlistThresholdBuffer, waitlistYieldRates |
| Section 6: Analytics | equityMetrics, blockingPairCount |
| Parameter | Current (index.html) | Recommended | Priority |
|---|---|---|---|
| Athlete multiplier | 3.5x (global) | Per-tier (2.5-4.5x) | High |
| Legacy multiplier | 2.5x (global) | Per-tier (1.5-2.5x) | Medium |
| Donor multiplier | 4.0x (global) | Per-tier (2.0-4.0x) | Medium |
| First-gen multiplier | 1.4x | 1.3-1.4x (keep) | Low |
| ED threshold | 0.70 | 0.70 (keep) | — |
| EA threshold | 0.85 | 0.85 (keep) | — |
| EC weight | Unknown | 0.27 | High |
| Perception noise | Unknown | 0.07 sigma | High |
| Feeder multiplier | Not implemented | 1.0-2.2x | High |
| Gender multiplier | Not implemented | Per-tier | Medium |
| Yield rates | Unknown | Per-college | High |
For iterative calibration of the simulation, recommended order:

- baseThreshold per college so simulated accept rates match CDS targets (±2pp tolerance)

Document generated from synthesis of 16 research files. Primary sources for all claims are cited in-line with section references to the source file where the original research is located.