Source: kaggle_scorecard_deepdive.md
The US Department of Education College Scorecard is the most comprehensive federal dataset on higher education outcomes. It combines data from the Integrated Postsecondary Education Data System (IPEDS), federal financial aid records, and IRS tax data to create a unified view of institutional performance.
- Official site: https://collegescorecard.ed.gov/data/
- Kaggle mirror: https://www.kaggle.com/datasets/kaggle/college-scorecard (last updated Nov 2017, ~563 MB, 27K+ downloads, CC0 license)
- Data catalog: https://catalog.data.gov/dataset/college-scorecard-c25e9
- GitHub releases: https://github.com/RTICWDT/college-scorecard/releases
- Latest release: v3.5.3 (November 19, 2025) -- IPEDS data update with minority-serving institution indicators and Federal Student Aid metrics
- Previous major release: v3.5.1 (April 23, 2025) -- added SCORECARD_SECTOR metric
- IPEDS update cycle: Provisional Spring 2025 data released January 6, 2026 (includes Fall 2024 enrollment)
- Admissions/test score data: Most recent available is typically 1-2 years behind the current cycle (2022-23 or 2023-24 academic year)
- Earnings data: Lagged 6-10 years after entry (e.g., 2025 file has earnings for ~2015-2019 cohorts)
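
The lag arithmetic in the earnings example can be sketched as below. This is a back-of-the-envelope reading that treats the entry year as the file year minus the lag; the helper name `entry_cohort_range` is hypothetical, and the Scorecard's actual cohort definitions may be more involved.

```python
def entry_cohort_range(file_year: int, min_lag: int = 6, max_lag: int = 10) -> tuple[int, int]:
    """Approximate (earliest, latest) entry years whose earnings appear in a file year."""
    if min_lag > max_lag:
        raise ValueError("min_lag must not exceed max_lag")
    # The 10-year measure reaches furthest back; the 6-year measure is most recent.
    return file_year - max_lag, file_year - min_lag


print(entry_cohort_range(2025))  # (2015, 2019), matching the example above
```
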
The College Scorecard uses two naming conventions:
- CSV variable names (e.g., ADM_RATE), used in downloadable data files
- API field paths (e.g., latest.admissions.admission_rate.overall), used in REST API queries

| CSV Variable | API Field Path | Description |
|---|---|---|
| UNITID | id | Unique institution identifier (IPEDS) |
| OPEID | ope8_id | 8-digit OPE ID |
| INSTNM | school.name | Institution name |
| STABBR | school.state | State abbreviation |
| CONTROL | school.ownership | 1=public, 2=private nonprofit, 3=private for-profit |
| PREDDEG | school.degrees_awarded.predominant | Predominant degree type (3=bachelor's) |
| HIGHDEG | school.degrees_awarded.highest | Highest degree awarded |

| CSV Variable | API Field Path | Description |
|---|---|---|
| ADM_RATE | latest.admissions.admission_rate.overall | Overall admission rate (admitted / applied) |
| ADM_RATE_ALL | latest.admissions.admission_rate.by_ope_id | Admission rate across all campuses |
| SAT_AVG | latest.admissions.sat_scores.average.overall | Average SAT equivalent score (combined) |
| SATVR25 | latest.admissions.sat_scores.25th_percentile.critical_reading | SAT reading 25th percentile |
| SATVR75 | latest.admissions.sat_scores.75th_percentile.critical_reading | SAT reading 75th percentile |
| SATVRMID | latest.admissions.sat_scores.midpoint.critical_reading | SAT reading midpoint |
| SATMT25 | latest.admissions.sat_scores.25th_percentile.math | SAT math 25th percentile |
| SATMT75 | latest.admissions.sat_scores.75th_percentile.math | SAT math 75th percentile |
| SATMTMID | latest.admissions.sat_scores.midpoint.math | SAT math midpoint |
| SATWR25 | latest.admissions.sat_scores.25th_percentile.writing | SAT writing 25th percentile |
| SATWR75 | latest.admissions.sat_scores.75th_percentile.writing | SAT writing 75th percentile |
| SATWRMID | latest.admissions.sat_scores.midpoint.writing | SAT writing midpoint |
| ACTCM25 | latest.admissions.act_scores.25th_percentile.cumulative | ACT composite 25th percentile |
| ACTCM75 | latest.admissions.act_scores.75th_percentile.cumulative | ACT composite 75th percentile |
| ACTCMMID | latest.admissions.act_scores.midpoint.cumulative | ACT composite midpoint |
| ACTEN25 | latest.admissions.act_scores.25th_percentile.english | ACT English 25th percentile |
| ACTMT25 | latest.admissions.act_scores.25th_percentile.math | ACT math 25th percentile |

| CSV Variable | API Field Path | Description |
|---|---|---|
| UGDS | latest.student.size | Total undergraduate degree-seeking enrollment |
| UGDS_WHITE | latest.student.demographics.race_ethnicity.white | Share of enrollment that is white |
| UGDS_BLACK | latest.student.demographics.race_ethnicity.black | Share that is Black |
| UGDS_HISP | latest.student.demographics.race_ethnicity.hispanic | Share that is Hispanic |
| UGDS_ASIAN | latest.student.demographics.race_ethnicity.asian | Share that is Asian |
| UGDS_AIAN | latest.student.demographics.race_ethnicity.aian | Share that is American Indian / Alaska Native |
| UGDS_NHPI | latest.student.demographics.race_ethnicity.nhpi | Share that is Native Hawaiian / Pacific Islander |
| UGDS_2MOR | latest.student.demographics.race_ethnicity.two_or_more | Share that is two or more races |
| UGDS_NRA | latest.student.demographics.race_ethnicity.non_resident_alien | Share that is non-resident alien |
| UGDS_UNKN | latest.student.demographics.race_ethnicity.unknown | Share that is unknown race/ethnicity |
| UG | latest.student.enrollment.all | Total undergraduate enrollment (all students) |

| CSV Variable | API Field Path | Description |
|---|---|---|
| C150_4 | latest.completion.completion_rate_4yr_150nt | 6-year graduation rate (150% of normal time, 4-year institutions) |
| C150_4_WHITE | latest.completion.completion_rate_4yr_150nt_white | 6-year graduation rate for white students |
| C150_4_BLACK | latest.completion.completion_rate_4yr_150nt_black | 6-year graduation rate for Black students |
| C150_4_HISP | latest.completion.completion_rate_4yr_150nt_hisp | 6-year graduation rate for Hispanic students |
| C150_4_ASIAN | latest.completion.completion_rate_4yr_150nt_asian | 6-year graduation rate for Asian students |
| RET_FT4 | latest.student.retention_rate.four_year.full_time | First-year retention rate (full-time, 4-year institutions) |
| RET_PT4 | latest.student.retention_rate.four_year.part_time | First-year retention rate (part-time) |

| CSV Variable | API Field Path | Description |
|---|---|---|
| NPT4_PUB | latest.cost.avg_net_price.public | Average net price (public institutions, Title IV recipients) |
| NPT4_PRIV | latest.cost.avg_net_price.private | Average net price (private institutions, Title IV recipients) |
| NPT41_PUB | latest.cost.net_price.public.by_income_level.0-30000 | Net price for family income $0-$30K (public) |
| NPT45_PUB | latest.cost.net_price.public.by_income_level.110001-plus | Net price for family income $110K+ (public) |
| COSTT4_A | latest.cost.attendance.academic_year | Average cost of attendance (academic year) |
| COSTT4_P | latest.cost.attendance.program_year | Average cost of attendance (program year) |
| TUITIONFEE_IN | latest.cost.tuition.in_state | In-state tuition and fees |
| TUITIONFEE_OUT | latest.cost.tuition.out_of_state | Out-of-state tuition and fees |
| PCTPELL | latest.aid.pell_grant_rate | Share of undergraduates receiving Pell grants |
| PCTFLOAN | latest.aid.federal_loan_rate | Share receiving federal student loans |
| GRAD_DEBT_MDN | latest.aid.median_debt.completers.overall | Median debt at graduation (completers) |
| GRAD_DEBT_MDN_SUPP | latest.aid.median_debt_suppressed.completers.overall | Median debt (suppressed for privacy) |

| CSV Variable | API Field Path | Description |
|---|---|---|
| MD_EARN_WNE_P6 | latest.earnings.6_yrs_after_entry.median | Median earnings 6 years after entry |
| MD_EARN_WNE_P10 | latest.earnings.10_yrs_after_entry.median | Median earnings 10 years after entry |
| MN_EARN_WNE_P6 | latest.earnings.6_yrs_after_entry.mean_earnings | Mean earnings 6 years after entry |
| MN_EARN_WNE_P10 | latest.earnings.10_yrs_after_entry.mean_earnings | Mean earnings 10 years after entry |
| COUNT_WNE_P6 | latest.earnings.6_yrs_after_entry.working_not_enrolled.earnings_count | Count of students in 6-year earnings cohort |
| RPY_3YR_RT_SUPP | latest.repayment.3_yr_repayment.overall | 3-year loan repayment rate |

| CSV Variable | API Field Path | Description |
|---|---|---|
| PFTFAC | school.ft_faculty_rate | Share of faculty that is full-time |
| AVGFACSAL | school.faculty_salary | Average faculty salary |
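
The two naming conventions can be bridged with a small lookup. The sketch below copies a subset of the pairs from the tables above; `CSV_TO_API`, `to_api_fields`, and `rename_to_csv` are hypothetical helper names, not part of any Scorecard tooling.

```python
# Subset of the CSV-variable -> API-field-path pairs listed in the tables above.
CSV_TO_API = {
    "UNITID": "id",
    "INSTNM": "school.name",
    "ADM_RATE": "latest.admissions.admission_rate.overall",
    "SAT_AVG": "latest.admissions.sat_scores.average.overall",
    "UGDS": "latest.student.size",
    "C150_4": "latest.completion.completion_rate_4yr_150nt",
    "PCTPELL": "latest.aid.pell_grant_rate",
    "MD_EARN_WNE_P10": "latest.earnings.10_yrs_after_entry.median",
}


def to_api_fields(csv_vars):
    """Translate CSV variable names into a comma-separated `fields` value."""
    missing = [var for var in csv_vars if var not in CSV_TO_API]
    if missing:
        raise KeyError(f"No API path recorded for: {', '.join(missing)}")
    return ",".join(CSV_TO_API[var] for var in csv_vars)


def rename_to_csv(record):
    """Rename dotted API keys in one result record to CSV variable names."""
    api_to_csv = {path: var for var, path in CSV_TO_API.items()}
    # Keys without a known CSV name are passed through unchanged.
    return {api_to_csv.get(key, key): value for key, value in record.items()}
```

Keeping the mapping in one place means CSV-based and API-based analyses can share column names.
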
Base URL: https://api.data.gov/ed/collegescorecard/v1/schools
Authentication: requires a free API key from https://api.ed.gov/signup/, passed as the query parameter api_key=YOUR_KEY.
GET https://api.data.gov/ed/collegescorecard/v1/schools?
api_key={KEY}
&school.name=Harvard University
&fields=id,school.name,latest.admissions.admission_rate.overall,latest.admissions.sat_scores.average.overall
&per_page=100
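
A minimal sketch of assembling the same request in Python with only the standard library. The function name `build_query_url` is hypothetical; the point is that the space in "Harvard University" must be percent-encoded, while the commas in `fields` are left literal.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://api.data.gov/ed/collegescorecard/v1/schools"


def build_query_url(api_key, filters, fields, per_page=100):
    """Assemble a request URL from filter params and a list of field paths."""
    params = {
        "api_key": api_key,
        **filters,
        "fields": ",".join(fields),
        "per_page": per_page,
    }
    # quote_via=quote encodes spaces as %20; safe="," keeps the field list readable.
    return f"{BASE_URL}?{urlencode(params, safe=',', quote_via=quote)}"


url = build_query_url(
    "YOUR_KEY",
    {"school.name": "Harvard University"},
    ["id", "school.name"],
)
print(url)
```
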

| Parameter | Description |
|---|---|
| fields | Comma-separated list of fields to return |
| per_page | Results per page (max 100) |
| page | Page number for pagination |
| sort | Sort by field (e.g., latest.admissions.admission_rate.overall:asc) |
| keys_nested=true | Return JSON objects instead of dotted strings |
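
Since per_page is capped at 100, larger result sets need a paging loop. The sketch below assumes (not confirmed by this document) that the response body carries a `metadata.total` count and a `results` list, and that pages are zero-indexed. `fetch_page` is a caller-supplied function; `fake_fetch` is an offline stand-in used only to show the control flow.

```python
import math


def iter_all_results(fetch_page, per_page=100):
    """Yield every result by walking pages until metadata.total is exhausted."""
    if not 1 <= per_page <= 100:
        raise ValueError("per_page must be between 1 and 100")
    page = 0
    while True:
        body = fetch_page(page, per_page)
        yield from body["results"]
        total_pages = math.ceil(body["metadata"]["total"] / per_page)
        page += 1
        if page >= total_pages:
            break


def fake_fetch(page, per_page):
    """Offline stand-in for an HTTP call: serves slices of 250 dummy rows."""
    rows = [{"id": n} for n in range(250)]
    start = page * per_page
    return {"metadata": {"total": len(rows)}, "results": rows[start:start + per_page]}


print(len(list(iter_all_results(fake_fetch))))  # 250
```
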

- Exact match: school.state=CA
- Range: latest.admissions.admission_rate.overall__range=0..0.1 (0-10%)
- Not null: latest.admissions.admission_rate.overall__not=null
- Year-specific: Replace latest with a year (e.g., 2021.admissions.admission_rate.overall)
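
The filter syntax above is plain string construction, so it can be wrapped in a few helpers. The function names here (`exact`, `in_range`, `not_null`, `for_year`) are hypothetical; each returns a fragment that can be merged into a query-parameter dict.

```python
def exact(field, value):
    """Exact-match filter, e.g. school.state=CA."""
    return {field: str(value)}


def in_range(field, low, high):
    """Range filter, e.g. 0..0.1 for admission rates of 0-10%."""
    return {f"{field}__range": f"{low}..{high}"}


def not_null(field):
    """Keep only records where the field has a value."""
    return {f"{field}__not": "null"}


def for_year(field, year):
    """Swap the leading `latest` segment of a field path for a data year."""
    prefix = "latest."
    if not field.startswith(prefix):
        raise ValueError(f"{field!r} does not start with 'latest.'")
    return f"{year}.{field[len(prefix):]}"


ADM = "latest.admissions.admission_rate.overall"
params = {**exact("school.state", "CA"), **in_range(ADM, 0, 0.1), **not_null(ADM)}
print(params)
print(for_year(ADM, 2021))  # 2021.admissions.admission_rate.overall
```
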

Rate limit: 1,000 requests per IP address per hour. Contact scorecarddata@rti.org to request an increase.
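
One simple way to stay under the hourly cap is to space requests evenly: 3600 s / 1000 requests = 3.6 s between calls. The `Throttle` class below is a hypothetical sketch, not part of the API. Even spacing is stricter than the limit requires (it forbids bursts), but it is easy to reason about.

```python
import time


class Throttle:
    """Space out calls so at most `max_calls` happen per `period` seconds."""

    def __init__(self, max_calls=1000, period=3600.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = period / max_calls
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next call may start

    def wait(self):
        """Block until the next call is allowed, then reserve the following slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval


print(Throttle().min_interval)  # 3.6
```

Call `throttle.wait()` immediately before each request; the `clock` and `sleep` arguments exist so the behaviour can be checked without real waiting.
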
```bash
# The space in the school name is percent-encoded so curl accepts the URL.
curl "https://api.data.gov/ed/collegescorecard/v1/schools?\
api_key=YOUR_KEY&\
fields=id,school.name,\
latest.admissions.admission_rate.overall,\
latest.admissions.sat_scores.average.overall,\
latest.admissions.sat_scores.25th_percentile.critical_reading,\
latest.admissions.sat_scores.75th_percentile.critical_reading,\
latest.admissions.sat_scores.25th_percentile.math,\
latest.admissions.sat_scores.75th_percentile.math,\
latest.student.size,\
latest.student.demographics.race_ethnicity.white,\
latest.student.demographics.race_ethnicity.black,\
latest.student.demographics.race_ethnicity.hispanic,\
latest.student.demographics.race_ethnicity.asian,\
latest.completion.completion_rate_4yr_150nt,\
latest.student.retention_rate.four_year.full_time,\
latest.cost.avg_net_price.private,\
latest.aid.pell_grant_rate,\
latest.aid.median_debt.completers.overall,\
latest.earnings.10_yrs_after_entry.median&\
school.name=Harvard%20University&\
per_page=1"
```
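
Without keys_nested=true, each result from a query like this is a flat object keyed by dotted field paths. As a sketch of consuming one, the hypothetical helper below derives a rough composite SAT middle-50% range by adding the reading and math section percentiles. Summing percentiles only approximates the true composite distribution, and the sample values are placeholders, not real data.

```python
READ = "latest.admissions.sat_scores.{p}_percentile.critical_reading"
MATH = "latest.admissions.sat_scores.{p}_percentile.math"


def composite_sat_range(record):
    """Return (low, high) from summed section percentiles, or None if any is missing."""
    # Order: reading 25th, math 25th, reading 75th, math 75th.
    parts = [record.get(tpl.format(p=p)) for p in ("25th", "75th") for tpl in (READ, MATH)]
    if any(value is None for value in parts):
        return None
    return parts[0] + parts[1], parts[2] + parts[3]


# Placeholder record with made-up numbers, purely to show the shape.
sample = {
    READ.format(p="25th"): 700,
    MATH.format(p="25th"): 710,
    READ.format(p="75th"): 780,
    MATH.format(p="75th"): 800,
}
print(composite_sat_range(sample))  # (1410, 1580)
```
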

- College Scorecard (IPEDS-reported, typically 1-2 years lag)
- Class of 2029 admissions data (2024-2025 cycle, from institutional press releases)
- U.S. News 2022 data (SAT middle 50% ranges via reachhighscholars.org)

| College | ADM_RATE (Scorecard) | ADM_RATE (Class 2029) | SAT Middle 50% | SAT Avg (est.) |
|---|---|---|---|---|
| Harvard | 5% | 4.18% | 1460-1580 | 1520 |
| Yale | 5% | ~5% | 1460-1580 | 1520 |
| Princeton | 6% | 4.42% | 1450-1570 | 1510 |
| Stanford | 5% | ~4% (TBA) | 1420-1570 | 1500 |
| MIT | 7% | 4.56% | 1510-1580 | 1545 |
| College | ADM_RATE (Scorecard) | ADM_RATE (Class 2029) | SAT Middle 50% | SAT Avg (est.) |
|---|---|---|---|---|
| Columbia | 6% | 4.94% | 1470-1570 | 1520 |
| UPenn | 9% | ~6% | 1450-1570 | 1510 |
| Brown | 8% | 5.65% | 1440-1570 | 1505 |
| Dartmouth | 9% | 6.02% | 1440-1560 | 1500 |
| Cornell | 11% | 8.38% | 1400-1540 | 1470 |
| Duke | 8% | 5.20% | 1510-1560 | 1535 |
| Northwestern | 9% | 7.00% | 1430-1550 | 1490 |
| UChicago | 7% | ~5% (est.) | 1500-1570 | 1535 |
| Caltech | 7% | 3.78% | 1530-1580 | 1555 |
| College | ADM_RATE (Scorecard) | ADM_RATE (Class 2029) | SAT Middle 50% | SAT Avg (est.) |
|---|---|---|---|---|
| Johns Hopkins | 9% | 5.14% | 1480-1570 | 1525 |
| Vanderbilt | 12% | 4.6% | 1460-1560 | 1510 |
| Rice | 11% | 8.01% | 1460-1570 | 1515 |
| Notre Dame | 19% | 9% | 1420-1560 | 1490 |
| Georgetown | 17% | 12% | 1380-1550 | 1465 |
| Carnegie Mellon | 17% | 11.07% | 1460-1560 | 1510 |
| WashU | 8% | ~12% | 1460-1560 | 1510 |
| College | ADM_RATE (Scorecard) | ADM_RATE (Class 2029) | SAT Middle 50% | SAT Avg (est.) |
|---|---|---|---|---|
| Emory | 19% | 10.30% | 1380-1530 | 1455 |
| Tufts | 16% | 10.81% | 1380-1530 | 1455 |
| Boston College | 26% | 13.85% | 1330-1500 | 1415 |
| UVA | 23% | 15.4% | 1320-1510 | 1415 |
| UCLA | 14% | 9.42% | 1290-1520 | 1405 |
| Michigan | 17% | ~16% | 1340-1560 | 1450 |
| College | ADM_RATE (Scorecard) | ADM_RATE (Class 2029) | SAT Middle 50% | SAT Avg (est.) |
|---|---|---|---|---|
| Williams | 15% | 8.5% | 1410-1560 | 1485 |
| Amherst | 12% | 7.72% | 1410-1550 | 1480 |
| Middlebury | 22% | 12.77% | 1340-1520 | 1430 |
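
The SAT Avg (est.) column appears to be the midpoint of the middle 50% range for nearly every row; that reading is an assumption, not something the sources state. A sketch of that calculation, plus the percentage-point gap between the two admission-rate columns, using hypothetical helper names:

```python
def sat_midpoint(middle_50: str) -> float:
    """Midpoint of a 'low-high' middle-50% range such as '1460-1580'."""
    low, high = (int(part) for part in middle_50.split("-"))
    return (low + high) / 2


def rate_gap(scorecard_pct: float, class_2029_pct: float) -> float:
    """Percentage-point drop from the Scorecard rate to the Class of 2029 rate."""
    return round(scorecard_pct - class_2029_pct, 2)


# Harvard row from the table above: 5% vs 4.18%, SAT 1460-1580.
print(sat_midpoint("1460-1580"))  # 1520.0
print(rate_gap(5, 4.18))          # 0.82
```
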

| Simulation Parameter | Scorecard Field | Notes |
|---|---|---|
| Acceptance rate | ADM_RATE | Use as baseline; adjust with Class of 2029 actuals |
| SAT score thresholds | SAT_AVG, SATVR25/75, SATMT25/75 | Middle 50% ranges for scoring calibration |
| ACT thresholds | ACTCM25/75, ACTCMMID | Alternative test score data |
| Student body size | UGDS | For modeling class size and yield |
| Demographics mix | UGDS_WHITE/BLACK/HISP/ASIAN/2MOR/NRA | For diversity modeling and hook calibration |
| Graduation rate | C150_4 | 6-year completion rate by race for outcome modeling |
| Retention rate | RET_FT4 | Proxy for student satisfaction / institutional quality |
| Pell grant rate | PCTPELL | Socioeconomic diversity indicator |
| Net price | NPT4_PUB/PRIV | Financial aid modeling |
| Median debt | GRAD_DEBT_MDN | Post-graduation outcome modeling |
| Earnings | MD_EARN_WNE_P10 | Long-term ROI by institution |
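
One possible shape for carrying these parameters into a simulation is sketched below. `SchoolParams` and `params_from_record` are hypothetical names, and only a few fields are shown. The "use as baseline; adjust with actuals" rule is modeled as a simple override.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SchoolParams:
    name: str
    adm_rate: float                      # ADM_RATE, as a fraction (Scorecard baseline)
    sat_avg: Optional[float] = None      # SAT_AVG
    ugds: Optional[int] = None           # UGDS
    actual_rate: Optional[float] = None  # newer figure from outside the Scorecard

    @property
    def acceptance_rate(self) -> float:
        """Prefer the newer actual rate when one is supplied."""
        return self.actual_rate if self.actual_rate is not None else self.adm_rate


def params_from_record(record, actual_rate=None):
    """Build SchoolParams from a dict keyed by CSV variable names."""
    sat = record.get("SAT_AVG")
    size = record.get("UGDS")
    return SchoolParams(
        name=record["INSTNM"],
        adm_rate=float(record["ADM_RATE"]),
        sat_avg=float(sat) if sat is not None else None,
        ugds=int(size) if size is not None else None,
        actual_rate=actual_rate,
    )


# Harvard's two admission rates from the comparison table: 5% and 4.18%.
harvard = params_from_record({"INSTNM": "Harvard", "ADM_RATE": "0.05"}, actual_rate=0.0418)
print(harvard.acceptance_rate)  # 0.0418
```
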

- ED/EA multipliers: Not in federal data; must come from CDS or institutional reports
- Hook categories: Legacy, athlete, donor -- not tracked federally
- Application volume by round: Not in Scorecard; use CommonApp data
- Yield rates: Partially available via IPEDS but not a named Scorecard field
- High school feeder data: Not in Scorecard at all

- Name: US Dept of Education: College Scorecard
- URL: https://www.kaggle.com/datasets/kaggle/college-scorecard
- Files: Raw CSVs, combined Scorecard.csv, database.sqlite
- Size: ~563 MB compressed
- License: CC0 (Public Domain)
- Last updated on Kaggle: November 2017 (significantly outdated)
- Downloads: 27,216

- College Scorecard Raw Data (by tunguz): https://www.kaggle.com/datasets/tunguz/college-scorecard-raw-data -- more recent raw data files
- CollegeScorecard US College Graduation (by thedevastator): https://www.kaggle.com/datasets/thedevastator/collegescorecard-us-college-graduation-and-oppor
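
When reading the CSV files, numeric columns need cleaning before use. The sketch below assumes missing or withheld values appear as the strings NULL and PrivacySuppressed, as in the official downloads; that is worth verifying against whichever Kaggle copy is used. The helper names and the two sample rows are illustrative only.

```python
import csv
import io

# Markers treated as "no value"; an assumption to check against the actual files.
SENTINELS = {"", "NULL", "PrivacySuppressed"}


def to_number(raw):
    """Convert a raw CSV cell to float, or None for missing/suppressed markers."""
    text = raw.strip()
    return None if text in SENTINELS else float(text)


def read_scorecard(handle, text_cols=("INSTNM",), numeric_cols=("ADM_RATE", "SAT_AVG")):
    """Yield one cleaned dict per CSV row, keeping only the requested columns."""
    for row in csv.DictReader(handle):
        record = {col: row[col] for col in text_cols}
        record.update({col: to_number(row[col]) for col in numeric_cols})
        yield record


# Made-up rows standing in for a real file such as Scorecard.csv.
sample_csv = io.StringIO(
    "INSTNM,ADM_RATE,SAT_AVG\n"
    "Example College,0.5,NULL\n"
    "Sample Institute,PrivacySuppressed,1200\n"
)
rows = list(read_scorecard(sample_csv))
print(rows)
```

With a real file, pass an open handle instead, e.g. `open("Scorecard.csv", newline="")`.
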
For the most current data, use the API directly rather than Kaggle downloads. The Kaggle mirror is 8+ years stale. The API provides latest.* fields that always return the most current available data, and year-specific queries (e.g., 2023.admissions.*) for historical trends.
The API doesn't support querying by a list of names in one call. Strategy:

- Query by UNITID (the id field in the API) for each school
- Or filter by latest.admissions.admission_rate.overall__range=0..0.25 to get all selective schools, then filter client-side

| College | UNITID |
|---|---|
| Harvard | 166027 |
| Yale | 130794 |
| Princeton | 186131 |
| Stanford | 243744 |
| MIT | 166683 |
| Columbia | 190150 |
| UPenn | 215062 |
| Brown | 217156 |
| Dartmouth | 182670 |
| Cornell | 190415 |
(Additional UNITIDs can be looked up via school.name queries)
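
Both strategies can be sketched with the UNITIDs from the table above. The helper names are hypothetical, and filtering on `id` as a query parameter is an assumption based on the UNITID-to-id mapping in the field tables.

```python
# UNITIDs copied from the table above.
UNITIDS = {
    "Harvard": 166027, "Yale": 130794, "Princeton": 186131, "Stanford": 243744,
    "MIT": 166683, "Columbia": 190150, "UPenn": 215062, "Brown": 217156,
    "Dartmouth": 182670, "Cornell": 190415,
}

ADM = "latest.admissions.admission_rate.overall"


def per_school_params(api_key, fields, unitids=UNITIDS):
    """Strategy 1: one query-parameter dict per school, filtered by id."""
    field_list = ",".join(fields)
    return [{"api_key": api_key, "id": uid, "fields": field_list} for uid in unitids.values()]


def selective_params(api_key, fields, max_rate=0.25):
    """Strategy 2: one broad query for all schools at or below max_rate."""
    return {"api_key": api_key, f"{ADM}__range": f"0..{max_rate}", "fields": ",".join(fields)}


def keep_target_schools(results, unitids=UNITIDS):
    """Client-side filter: keep only results whose id is in the target set."""
    wanted = set(unitids.values())
    return [record for record in results if record.get("id") in wanted]


print(len(per_school_params("YOUR_KEY", ["id", "school.name", ADM])))  # 10
```

Strategy 1 costs one request per school; strategy 2 costs fewer requests but returns schools that must be discarded client-side.
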