Source: kaggle_survey_applications.md
| Cycle | Apps/Applicant | Applicants | Total Apps | Notes |
|---|---|---|---|---|
| 2013-14 | 4.63 | ~820K | ~3.8M | Pre-pandemic baseline |
| 2014-15 | ~4.8 | ~850K | ~4.1M | Steady growth |
| 2015-16 | ~4.9 | ~880K | ~4.3M | Steady growth |
| 2016-17 | ~5.0 | ~900K | ~4.5M | Crossed 5.0 threshold |
| 2017-18 | ~5.1 | ~920K | ~4.7M | Continued upward |
| 2018-19 | ~5.2 | ~950K | ~4.9M | Pre-COVID |
| 2019-20 | ~5.4 | ~960K | ~5.2M | COVID begins (spring) |
| 2020-21 | ~5.6 | ~1.0M | ~5.6M | Test-optional boom |
| 2021-22 | 6.22 | ~1.1M | ~6.8M | Sharp post-COVID jump |
| 2022-23 | 6.41 | ~1.2M | ~7.7M | Continued acceleration |
| 2023-24 | 6.65 | 1.42M | ~9.4M | 9.4M+ applications |
| 2024-25 | 6.80 | ~1.50M | 10.19M | Surpassed 10M for first time |
Source: CommonApp End-of-Season Reports (https://www.commonapp.org/about/reports-and-insights/)
Current average: 6.80 apps/student (simulation uses 6.8 -- well calibrated)
Growth rate: +47% in apps/student from 2013-14 to 2024-25
Acceleration after 2019-20 driven by: test-optional policies, more CommonApp members, pandemic anxiety
~1,097 member institutions in 2024-25 (up 3% YoY)
Total applicants grew 5% YoY; total applications grew 7-8% YoY
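
The headline growth figure can be checked directly against the table. The sketch below recomputes the cumulative change in applications per applicant from the 2013-14 and 2024-25 rows. As an added illustration that is not stated in the source, it also converts that change to a compound annual rate over the 11 cycle-to-cycle steps.

```python
# Apps-per-applicant values taken from the 2013-14 and 2024-25 table rows.
start_apps_per_applicant = 4.63
end_apps_per_applicant = 6.80
steps = 11  # cycle-to-cycle steps between 2013-14 and 2024-25

ratio = end_apps_per_applicant / start_apps_per_applicant
cumulative_growth = ratio - 1

# Compound annual growth rate: a derived, illustrative figure (not from the source).
annualized_growth = ratio ** (1 / steps) - 1

print(f"cumulative: {cumulative_growth:+.0%}")
print(f"annualized: {annualized_growth:+.1%}")
```

The cumulative figure reproduces the +47% quoted above; the annualized rate smooths over the post-2019-20 acceleration and should be read only as an average.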
Most Selective (<25% admit rate): Application growth of 5% (slowest)
Highly Selective (25-49% admit rate): Application growth of 11%
Selective (50-74% admit rate): Growth outpaced most selective
Less Selective (75%+ admit rate): Fastest growth
Implication: Students are "applying wider" -- adding more match/safety schools while maintaining reach applications.
Latino applicants: +15% growth
Black/African American applicants: +12% growth
First-generation students: strong growth across all rounds
Texas overtook NY and CA as #1 state by applicant count (+43% since 2023-24)
International applicants: -1% (first decline since 2019-20)
~50% of applicants submitted test scores (despite only ~5% of members requiring them)
| School | ED Rate | EA Rate | Overall Rate | Early Advantage (early rate / overall rate) |
|---|---|---|---|---|
| Harvard | -- | ~9% (REA) | 3.6% | 2.5x |
| Yale | -- | 10.8% (SCEA) | 4.5% | 2.4x |
| Princeton | -- | ~10% (SCEA) | ~4% | ~2.5x |
| Stanford | -- | ~8% (REA) | ~3.9% | ~2.1x |
| MIT | -- | 5.2% (EA) | 4.5% | 1.2x |
| Columbia | 13.2% | -- | 3.9% | 3.4x |
| UPenn | 14.2% | -- | 5.4% | 2.6x |
| Brown | 14.4% | -- | 5.4% | 2.7x |
| Dartmouth | 19.1% | -- | 5.4% | 3.5x |
| Cornell | ~18% | -- | ~8% | ~2.3x |
| Duke | 19.7% | -- | 6.7% | 2.9x |
| Northwestern | 23% | -- | 7.7% | 3.0x |
| UChicago | ~20% | ~6% (EA) | ~5% | ~4x ED |
| Rice | 16.8% | -- | 7.9% | 2.1x |
| Vanderbilt | ~20% | -- | ~6% | ~3.3x |
| Johns Hopkins | 11% | -- | ~7% | 1.6x |
| Notre Dame | -- | 12.9% (REA) | 11.2% | 1.2x |
| Georgetown | -- | ~15% (EA) | 12.9% | 1.2x |
| Emory | 23.2% | -- | 10.2% | 2.3x |
| WashU | 25.2% | -- | 12% | 2.1x |
| UVA | 27.9% | ~16% | 16.8% | 1.7x ED |
| USC | -- | 7.1% | 9.8% | 0.7x (EA lower) |
| Williams | 23.3% | -- | 8.3% | 2.8x |
| Amherst | 29.3% | -- | 9% | 3.3x |
| Middlebury | 30.5% | -- | 10.7% | 2.9x |
Source: CollegeVine (https://blog.collegevine.com/ed-and-ea-acceptance-rates), Spark Admissions, College Kickstart
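Binding ED programs: acceptance rate averages ~2.6x the overall rate
Non-binding EA/REA/SCEA: ~1.2-2.5x in the table above, with USC an outlier at 0.7x; varies widely
MIT's EA rate (5.2%) is nearly equal to its overall rate (4.5%), so non-restrictive EA at a highly selective school is not a big advantage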
CollegeVine summary: "Students applying ED see a 1.6x (or 60%) increase in their chances of admission to very selective schools"
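
The advantage column in the table is the early-round acceptance rate divided by the overall acceptance rate. A minimal sketch of that arithmetic follows, using four rows copied from the table; the function and variable names are illustrative.

```python
# (early-round rate %, overall rate %) copied from rows of the table above.
rates = {
    "Columbia (ED)": (13.2, 3.9),
    "Dartmouth (ED)": (19.1, 5.4),
    "MIT (EA)": (5.2, 4.5),
    "USC (EA)": (7.1, 9.8),
}


def early_multiplier(early_rate: float, overall_rate: float) -> float:
    """Ratio of the early-round acceptance rate to the overall acceptance rate."""
    return early_rate / overall_rate


multipliers = {
    school: round(early_multiplier(early, overall), 1)
    for school, (early, overall) in rates.items()
}

for school, multiplier in multipliers.items():
    print(f"{school}: {multiplier}x")
```

Because the overall rate pools early and regular applicants, this ratio understates the gap against the regular-decision rate alone for schools where the early rate exceeds the overall rate.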
| School | % of Class Filled by ED | Source |
|---|---|---|
| Duke | 49-51% | Duke Chronicle, Ivy Coach |
| UPenn | ~50% | Common knowledge, multiple sources |
| Brown | ~45% | Multiple sources |
| Cornell | ~49% | Multiple sources |
| Dartmouth | ~45% | Multiple sources |
| Northwestern | ~50% | Estimates |
| WashU | ~55% | Estimates |
| Emory | ~40% | Estimates |
| Vanderbilt | ~50% | Estimates |
Simulation uses 40-60% -- well aligned with real data.
Consensus: 7-10 applications total with the following split:
- 2-3 Reach schools (<15% personal admission probability)
- 3-4 Match/Target schools (15-70% personal admission probability)
- 2-3 Safety schools (>70% personal admission probability)
Rule of thumb: ~30% reach, ~40% match, ~30% safety
For students aiming at HYPSM/Ivy+: higher reach proportion (4-5 reaches, 2-3 matches, 2 safeties)
Reach: Admission rate < 15% for your profile
Match/Target: Admission rate 15-70% for your profile
Safety: Admission rate > 70% for your profile
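
The thresholds above translate into a simple classifier. The sketch below assumes a per-student admission probability is already available for each school; the function name and the sample school list are hypothetical, and the boundary values (exactly 15% and 70%) are assigned to "match" as one reasonable reading of the ranges.

```python
def classify_school(admit_probability: float) -> str:
    """Label a school using the thresholds above (probability in [0, 1])."""
    if not 0.0 <= admit_probability <= 1.0:
        raise ValueError("admit_probability must be between 0 and 1")
    if admit_probability < 0.15:
        return "reach"
    if admit_probability <= 0.70:
        return "match"
    return "safety"


# Hypothetical per-school probabilities for one student (illustrative only).
school_list = {
    "School A": 0.06,
    "School B": 0.12,
    "School C": 0.35,
    "School D": 0.55,
    "School E": 0.80,
    "School F": 0.92,
}

labels = {name: classify_school(p) for name, p in school_list.items()}
print(labels)
```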
High-achieving students at feeder schools: avg 8-12 apps, heavily weighted toward reaches
Average student: 6-8 apps, more balanced
Post-pandemic: more students are "over-applying" (12-20 apps), creating application inflation
Source: Appily (https://www.appily.com/guidance/articles/finding-your-college/what-are-safety-reach-and-match-schools), CollegeVine, Princeton Review
URL: https://www.kaggle.com/datasets/samsonqian/college-admissions
License: CC0 Public Domain
Size: 222 KB
Last Updated: November 2018
Fields: Admission/class demographics by university
Relevance: Demographic breakdowns of admitted students by institution
Limitation: Pre-pandemic, likely lacks application volume data
URL: https://www.kaggle.com/datasets/yashgpt/us-college-data
Size: 777 rows x 18 columns
Key Fields:
- Apps - Number of applications received
- Accept - Number of applications accepted
- Enroll - Number of new students enrolled (yield proxy)
- Private - Private/public indicator (Yes/No)
- Top10perc / Top25perc - % of new students from the top 10% / top 25% of their HS class
- F.Undergrad / P.Undergrad - Number of full-time / part-time undergraduates
- Outstate - Out-of-state tuition
- Grad.Rate - Graduation rate
Relevance: HIGH -- Apps/Accept/Enroll gives acceptance rates and yield. 777 institutions is a good sample.
Limitation: No year breakdown, no ED/EA split, no student-level data
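
Acceptance rate and yield follow from the three count fields: acceptance rate is Accept / Apps and yield is Enroll / Accept. The sketch below shows the arithmetic on a made-up institution; the class name and numbers are illustrative and are not a row from the dataset.

```python
from dataclasses import dataclass


@dataclass
class Institution:
    name: str
    apps: int    # Apps: applications received
    accept: int  # Accept: applications accepted
    enroll: int  # Enroll: new students enrolled

    @property
    def acceptance_rate(self) -> float:
        """Share of applications that were accepted."""
        return self.accept / self.apps

    @property
    def yield_rate(self) -> float:
        """Share of accepted applicants who enrolled."""
        return self.enroll / self.accept


# Made-up numbers for illustration; not a real row from the dataset.
example = Institution("Example College", apps=2000, accept=1500, enroll=600)
print(
    f"{example.name}: acceptance {example.acceptance_rate:.1%}, "
    f"yield {example.yield_rate:.1%}"
)
```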
URL: https://www.kaggle.com/datasets/hark99/post-secondary-education-data-ipeds
Size: ~19.3 MB (much larger)
Source: U.S. Department of Education NCES
Version: 3 (last modified Jan 2020)
Key Variables: Federal financial aid, PELL grants, income classification
Relevance: MEDIUM -- institutional-level data, but more focused on financial aid than admissions strategy
Limitation: Financial aid focused, not application behavior
URL: https://www.kaggle.com/datasets/pandanup/college-admission-data-set
Last Updated: March 2021
Relevance: LOW-MEDIUM -- general admission data
Limitation: Likely synthetic or small sample
URL: https://www.kaggle.com/datasets/amanace/student-admission-dataset
Size: 250 rows, 4 columns (GPA, SAT, ECs, Admission Status)
Relevance: LOW -- synthetic data, educational purposes only
Limitation: Only 250 records, no school names, no round information
URL: https://www.kaggle.com/datasets/darkhorse3141/collegeadmissiondataset
License: CC0 Public Domain
Size: 104 KB
Relevance: UNKNOWN -- could not extract schema from Kaggle page
URL: https://www.kaggle.com/datasets/farhansadeek/university-admission-dataset
Relevance: LOW-MEDIUM -- generic admissions data
Description: Subreddit where students post detailed admissions outcome profiles
Fields posted by users: Demographics, GPA, SAT/ACT, ECs, essays, school list, outcomes (accepted/rejected/waitlisted per school), decision round (ED/EA/RD)
Used by: GradGPT "Admits Like Me" tool (https://www.gradgpt.com/tools/admits-like-me)
Public dataset: Not available as structured CSV/download
Access: Cornell ConvoKit Reddit corpus includes r/ApplyingToCollege data
Academic use: Researcher Eric Chapdelaine analyzed r/ApplyingToCollege using PRAW (Python Reddit API Wrapper), collecting Class of 2020 and 2021 posts
Description: Platform tracking ED/EA results for 750+ institutions
Data: Accepts CSV exports from Naviance, Maia Learning, Cialfo, SCOIR
Access: Fee-based consulting service (not public dataset)
Relevance: HIGH -- closest to "ground truth" for school-level ED/EA outcomes
Description: Neural network predicting college admissions from GPA+SAT corpus
Relevance: MEDIUM -- demonstrates viability of training on self-reported outcome data
Description: Scraper for CollegeData.com admissions statistics
Relevance: MEDIUM -- provides institutional-level stats
Coverage: All Title IV institutions, data back to 1997
Key fields: Admission rates, SAT/ACT ranges, enrollment, demographics, outcomes
Access: Free download, API available
Relevance: HIGH -- government source, comprehensive
URL: https://www.nacacnet.org/state-of-college-admission-report/
Key findings:
- ~1/3 of colleges offer Early Action
- Average EA acceptance rate: 71% vs. 65% RD (across all responding institutions)
- Redesigned as interactive online tool in 2023
Access: Some data member-only
Reports available:
- End-of-season reports (annual, most comprehensive)
- Deadline updates (Nov 1, Dec 1, Jan 1, Mar 1 snapshots)
- Fiscal year annual reports
Relevance: HIGHEST -- primary source for application volume trends
| Parameter | Simulation Value | Real Data | Status |
|---|---|---|---|
| Avg apps/student | 6.8 | 6.80 (CommonApp 2024-25) | EXACT MATCH |
| ED fills % of class | 40-60% | 40-55% for Ivy+, up to 60% for some | WELL CALIBRATED |
| ED acceptance multiplier | (varies by college) | 1.6-3.5x vs. overall rate (UChicago ~4x) | CHECK: may need per-school tuning |
| Hook: athlete multiplier | 3.5x | Not directly comparable (recruited vs. non-recruited) | REASONABLE |
| Hook: donor multiplier | 4.0x | No public data; anecdotal support | PLAUSIBLE |
| Hook: legacy multiplier | 2.5x | Some schools phasing out legacy; historically ~2-3x | REASONABLE |
| Hook: first-gen multiplier | 1.4x | Growing institutional priority; modest boost | REASONABLE |
Application count distribution: Currently flat 6.8 average. Consider modeling:
- High-achievers: 8-12 apps (heavily weighted to reaches)
- Average students: 5-8 apps (balanced reach/match/safety)
- Low-income/first-gen: potentially fewer apps (4-6) due to fee waiver limits
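
One way to replace the flat average is to sample each student's application count from a segment-specific range. The sketch below uses the three ranges listed above; the segment shares (20% / 60% / 20%) and the uniform draw within each range are placeholder assumptions, not values from the source, and would need tuning against the 6.8 target.

```python
import random

# Ranges come from the list above. The population shares are placeholder
# assumptions chosen only to make the sketch runnable.
SEGMENTS = {
    "high_achiever": {"share": 0.20, "apps_range": (8, 12)},
    "average": {"share": 0.60, "apps_range": (5, 8)},
    "low_income_first_gen": {"share": 0.20, "apps_range": (4, 6)},
}


def sample_student(rng: random.Random) -> tuple[str, int]:
    """Draw a segment by share, then a uniform app count within its range."""
    names = list(SEGMENTS)
    weights = [SEGMENTS[name]["share"] for name in names]
    segment = rng.choices(names, weights=weights, k=1)[0]
    low, high = SEGMENTS[segment]["apps_range"]
    return segment, rng.randint(low, high)


rng = random.Random(42)  # fixed seed so runs are reproducible
students = [sample_student(rng) for _ in range(10_000)]
mean_apps = sum(count for _, count in students) / len(students)
print(f"mean apps/student: {mean_apps:.2f}")
```

With these placeholder shares the expected mean is 0.2 x 10 + 0.6 x 6.5 + 0.2 x 5 = 6.9, close to the CommonApp average of 6.80; adjusting the shares or ranges would bring the two into line.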
Application round distribution: Model based on school-level data:
- ~12-15% of all applicants use ED somewhere
- ~30-40% use EA at one or more schools
- ~60-70% submit at least one RD application