College Simulation — Agent Rules Reference

Source: agent_rules.md

Generated from index.html (current state, 2026-03-02). Documents the behavioral logic of every agent type in the simulation.

Table of Contents

1. Student Agent Rules
1.1 Generation
1.2 Archetypes
1.3 Hooks & Demographics
1.4 Academic Index (AI)
1.5 College List Building
1.6 Round Assignment
1.7 EDII Backup Pre-Assignment
1.8 Final Decision Logic
2. University (Admissions Office) Agent Rules
2.1 Initialization
2.2 Admission Scoring Pipeline
2.3 Round Processing
2.4 EDII Conversion
2.5 Waitlist Resolution
3. Shared Constants Reference
4. Simulation Execution Order

1. Student Agent Rules

1.1 Generation

Each high school generates studentsPerSchool (default: 20) students per simulation run. For each student:

Step 1 — Archetype selection

archetype = weightedChoice(ARCHETYPE_WEIGHTS[school.category])

Weights differ by school category (elite boarding vs. public magnet vs. NYC private, etc.).
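A minimal sketch of how a helper like weightedChoice might work. The implementation below is an assumption for illustration (the actual helper and the ARCHETYPE_WEIGHTS values live in index.html and are not reproduced here), and the example weights are made up.

```javascript
// Hypothetical implementation: picks a key with probability proportional to its weight.
// `rand` is injectable so the draw can be made deterministic in tests.
function weightedChoice(weights, rand = Math.random) {
  const entries = Object.entries(weights);
  const total = entries.reduce((sum, [, w]) => sum + w, 0);
  let r = rand() * total;
  for (const [key, w] of entries) {
    r -= w;
    if (r < 0) return key;
  }
  // Guard against floating-point leftovers: fall back to the last key.
  return entries[entries.length - 1][0];
}

// Made-up weights for one school category, for illustration only.
const exampleWeights = { stem_spike: 3, well_rounded: 1 };
const archetype = weightedChoice(exampleWeights);
```

Weights do not need to sum to 1 in this sketch because the draw is scaled by their total.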

Step 2 — Gender

STEM-leaning schools (boarding, public magnet, NYC private): 55% male / 45% female. All other schools: 50/50.

Step 3 — Correlated GPA / SAT

Uses a Cholesky-decomposed bivariate normal with ρ = 0.65 (research: GPA-SAT correlation in elite pools).

gpa_z = z0
sat_z = 0.65·z0 + √(1−0.65²)·z1
gpa   = clamp(school.gpa.mean + school.gpa.std × gpa_z, school.gpa.lo, school.gpa.hi)
sat   = clamp(school.sat.mean + school.sat.std × sat_z, school.sat.lo, school.sat.hi)
       rounded to nearest 10
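The draw above can be sketched in JavaScript as follows. The helper names (clamp, randn, drawGpaSat), the Box-Muller sampler, and the example school parameters are assumptions for illustration, not code taken from index.html.

```javascript
// Hypothetical helpers; names are illustrative.
const clamp = (x, lo, hi) => Math.min(hi, Math.max(lo, x));

// Standard normal draw via Box-Muller; `rand` is injectable for testing.
function randn(rand = Math.random) {
  const u = 1 - rand(); // shift to (0, 1] so log(u) is finite
  const v = rand();
  return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

const RHO = 0.65; // GPA-SAT correlation from Step 3

function drawGpaSat(school, rand = Math.random) {
  const z0 = randn(rand);
  const z1 = randn(rand);
  const gpaZ = z0;
  // Second row of the Cholesky factor of [[1, rho], [rho, 1]].
  const satZ = RHO * z0 + Math.sqrt(1 - RHO * RHO) * z1;
  const gpa = clamp(school.gpa.mean + school.gpa.std * gpaZ, school.gpa.lo, school.gpa.hi);
  const satRaw = clamp(school.sat.mean + school.sat.std * satZ, school.sat.lo, school.sat.hi);
  return { gpa, sat: Math.round(satRaw / 10) * 10 }; // SAT rounded to nearest 10
}

// Made-up school parameters, for illustration only.
const exampleSchool = {
  gpa: { mean: 3.7, std: 0.2, lo: 3.0, hi: 4.0 },
  sat: { mean: 1450, std: 90, lo: 1100, hi: 1600 },
};
const sample = drawGpaSat(exampleSchool);
```

Because sat_z mixes z0 with an independent z1, the two z-scores have correlation 0.65 before clamping; clamping at the school bounds will pull the realized correlation slightly lower.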

Step 4 — Archetype stat adjustments (see §1.2 for full detail)

Step 5 — Hook probabilities (see §1.3)

Step 6 — Income-stratified SAT offset

SAT_INCOME_OFFSETS = [-50, -18, 0, +9, +32] for brackets 1–5 (< $20K → $80K+). The 82-point spread from bracket 1 to bracket 5 is ~40% of the national 206-point gap, scaled down to avoid double-counting school-type stratification.

Step 7 — Weighted GPA

gpa_w = clamp(gpa_uw + 0.3 + ap_count × 0.03, 3.5, 5.0)
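Steps 6 and 7 are simple enough to check by hand. The sketch below uses hypothetical function names (applyIncomeOffset, weightedGpa); whether the simulation re-clamps the SAT after the offset is not stated above, so this sketch does not.

```javascript
const clamp = (x, lo, hi) => Math.min(hi, Math.max(lo, x));

// Offsets for income brackets 1..5, as listed in Step 6.
const SAT_INCOME_OFFSETS = [-50, -18, 0, 9, 32];

// Hypothetical helper: bracket is 1-based, so index with bracket - 1.
function applyIncomeOffset(sat, incomeBracket) {
  return sat + SAT_INCOME_OFFSETS[incomeBracket - 1];
}

// Step 7 formula: unweighted GPA plus a flat 0.3 and 0.03 per AP course.
function weightedGpa(gpaUw, apCount) {
  return clamp(gpaUw + 0.3 + apCount * 0.03, 3.5, 5.0);
}

// A 3.8 unweighted GPA with 8 APs gives 3.8 + 0.3 + 0.24 = 4.34.
const exampleWeighted = weightedGpa(3.8, 8);
// Share of the 206-point national gap covered by the offsets: 82 / 206.
const gapShare = (SAT_INCOME_OFFSETS[4] - SAT_INCOME_OFFSETS[0]) / 206;
```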

Tracked students (e.g., Petr Kirsanov) are injected after the main loop with all fields hard-coded from real documents. They participate in the sim identically to generated students.

1.2 Archetypes

Eight archetypes, each with distinct stat profiles and application counts:

Archetype	App Count	EC Quality	Essay Base	Key Adjustments
`athlete`	4–8	5–9	school mean	SAT −50 to −120, GPA −0.10 to −0.25; `recruited_athlete = true`
`legacy_dev`	5–9	school mean	school mean	Assigned a tier-1/2/3 legacy school; 35% chance `donor = true`
`first_gen`	5–9	4–7.5	+0.5 to +1.0	`first_gen = true`; 40% chance `urm = true`
`stem_spike`	10–16	max(7, ec+1)	+0.5	SAT +30 to +60; AP count +3 to +4
`humanities_spike`	8–13	max(7, ec+1.5)	+1.0 to +1.5	Broadened liberal arts search
`arts_spike`	6–10	8–10	7–9	SAT −10 to −40; EC and essay from high truncNorm
`average_strong`	8–13	3–7	school mean	No boosts; safety-conscious
`well_rounded`	8–13	school mean	school mean	Strongly prefers ED

Archetype application count is the primary driver of target list size; the actual count K is drawn from a lognormal centered on the archetype mean with σ = 0.4 (clamped 3–20).

EC/Essay base stats (before archetype adjustment):

ecQuality ~ truncNorm(6.5, 1.5, 3, 10)
essayBase ~ truncNorm(6.0, 1.8, 3, 10)

1.3 Hooks & Demographics

Hook assignment (non-athlete, non-legacy_dev archetypes)

Per school category, background legacy/donor rates:

School Category	Legacy Prob	Donor Prob
elite_nyc_private	22%	15%
elite_boarding	18%	12%
boarding_day	15%	9%
elite_day_school	14%	9%
international_school	5%	2%
public_charter_elite	4%	2%
elite_public_magnet	3%	1%

When a legacy hook is assigned, 25% chance donor is also set. Legacy school drawn randomly from all colleges in tiers 1–3.

Per-school hook profile (from hookProfile JSON field): Each high school can specify first_gen and urm probabilities; applied after category-level hooks.

Additional hook probabilities (all archetypes):
- Consulting client: school.consulting_client_prob (0.03–0.20 by school ivy-placement %) → if client: essay +0.5–1.0, EC +0.3–0.6
- Underrepresented state: 8% baseline
- URM (non-first_gen path): 10% baseline

Income bracket assignment

Base by school category: boarding 5, nyc_private 5, boarding_day 4, elite_day 4, international 3, charter 3, magnet 2. First-gen students: bracket − 2. ±1 uniform noise. Clamped 1–5.

Pell eligibility: true if first_gen === true or income_bracket ≤ 2.
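The bracket and Pell rules can be sketched as below. Two things are assumptions here: the mapping of the short category names above onto the full category keys from the hook table, and the reading of "±1 uniform noise" as an integer drawn from {−1, 0, +1}. The function names are hypothetical.

```javascript
const clamp = (x, lo, hi) => Math.min(hi, Math.max(lo, x));

// Assumed mapping of the base brackets onto the category keys used in the hook table.
const INCOME_BASE = {
  elite_boarding: 5,
  elite_nyc_private: 5,
  boarding_day: 4,
  elite_day_school: 4,
  international_school: 3,
  public_charter_elite: 3,
  elite_public_magnet: 2,
};

// `noise` is passed in (assumed to be -1, 0, or +1) so the rule itself stays deterministic.
function incomeBracket(category, firstGen, noise) {
  const base = INCOME_BASE[category] - (firstGen ? 2 : 0);
  return clamp(base + noise, 1, 5);
}

// Pell rule as stated: first-gen students or brackets 1-2.
function pellEligible(firstGen, bracket) {
  return firstGen === true || bracket <= 2;
}

// Assumed noise draw: uniform over {-1, 0, +1}.
const drawNoise = (rand = Math.random) => Math.floor(rand() * 3) - 1;
const bracket = incomeBracket('boarding_day', false, drawNoise());
```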

1.4 Academic Index (AI)

All scoring is anchored to the Academic Index on a 0–240 scale, mirroring the Ivy League AI formula.

CGS (Converted Grade Score) from unweighted GPA — piecewise linear Ivy table:

GPA	CGS	GPA	CGS
4.0	80	3.5	71
3.9	79	3.4	68
3.8	78	3.3	66
3.7	77	3.2	64
3.6	73	3.1	62
3.0	60	2.5	50

Above 4.0 → 80 (cap). Below 2.5 → linear extrapolation at 20 pts/GPA.

SAT component: (SAT / 20) × 2 → max 160 at 1600.

Academic Index: AI = CGS + (SAT/20)×2, capped at 240.
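A minimal sketch of the AI computation, interpolating linearly between the CGS table points. The function name cgsFromGpa and the table layout are assumptions; computeAcademicIndex is the name used later in this document, but this body is a reconstruction from the rules above, not the code in index.html. No floor is applied to the extrapolated CGS because none is stated.

```javascript
// CGS anchor points from the table above, sorted by GPA.
const CGS_TABLE = [
  [2.5, 50], [3.0, 60], [3.1, 62], [3.2, 64], [3.3, 66], [3.4, 68],
  [3.5, 71], [3.6, 73], [3.7, 77], [3.8, 78], [3.9, 79], [4.0, 80],
];

function cgsFromGpa(gpa) {
  if (gpa >= 4.0) return 80;                    // cap above 4.0
  if (gpa < 2.5) return 50 - 20 * (2.5 - gpa);  // 20 pts per GPA point below 2.5
  for (let i = 1; i < CGS_TABLE.length; i++) {
    const [x0, y0] = CGS_TABLE[i - 1];
    const [x1, y1] = CGS_TABLE[i];
    if (gpa <= x1) return y0 + ((y1 - y0) * (gpa - x0)) / (x1 - x0);
  }
  return 80;
}

function computeAcademicIndex(gpa, sat) {
  const satComponent = (sat / 20) * 2; // max 160 at 1600
  return Math.min(240, cgsFromGpa(gpa) + satComponent);
}

// A 3.5 GPA with a 1400 SAT: CGS 71 + SAT component 140 = 211.
const exampleAI = computeAcademicIndex(3.5, 1400);
```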

1.5 College List Building

For each student, a utility model ranks all 30 colleges; the top-K are selected for the list.

utility(college) = prestige + fitBonus + legacyBonus + 5 × logPEst

Where:
- prestige: (6 − tier) × 8 + rand() × 12 − 6 (tier 1 → ~40, tier 5 → ~8)
- fitBonus: FIT_SCORES[archetype][college] (0–5 scale, see §3)
- legacyBonus: +15 if hooks.legacy === college (strongly favors the legacy school; the legacy guarantee below covers the remaining cases)
- logPEst: log of estimated P(admit) using academic factors only (hooks excluded):

rawEst  = clamp(20 + aiDelta × 0.75, 0, 40) + (ec_quality/10)×20 + (essay_base/10)×10 + 2
acLogit = (rawEst − 46) / 20
logPEst = −log(1 + exp(−(acLogit − college.admitThreshold)))

Target K: K ~ lognormal(log(APP_MEANS[archetype]), σ=0.4), clamped 3–20.

APP_MEANS by archetype: athlete 6, legacy_dev 7, first_gen 6, stem_spike 13, humanities_spike 11, arts_spike 8, average_strong 10, well_rounded 10.
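A sketch of the list-size draw, assuming a Box-Muller normal sampler and rounding K to the nearest integer before clamping (the rounding rule is not stated above). Function names are hypothetical. Because the lognormal is parameterized with log(APP_MEANS), the APP_MEANS value is the median of K before clamping.

```javascript
const clamp = (x, lo, hi) => Math.min(hi, Math.max(lo, x));

const APP_MEANS = {
  athlete: 6, legacy_dev: 7, first_gen: 6, stem_spike: 13,
  humanities_spike: 11, arts_spike: 8, average_strong: 10, well_rounded: 10,
};
const SIGMA = 0.4;

// Standard normal draw via Box-Muller.
function randn(rand = Math.random) {
  const u = 1 - rand();
  const v = rand();
  return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

// Deterministic core: maps a standard-normal z to a list size.
function listSizeFromZ(archetype, z) {
  const k = Math.exp(Math.log(APP_MEANS[archetype]) + SIGMA * z);
  return clamp(Math.round(k), 3, 20); // rounding is an assumption
}

function drawListSize(archetype, rand = Math.random) {
  return listSizeFromZ(archetype, randn(rand));
}

const k = drawListSize('stem_spike');
```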

Category labels assigned from logPEst:

logPEst	Category
< −2.5	dream
−2.5 to −1.0	reach
−1.0 to −0.3	target
≥ −0.3	safety
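The logPEst formula and the category cutoffs can be sketched together. Function names are hypothetical, and the handling of exact boundary values (−2.5 and −1.0 falling into the higher category) is an assumption, since the table does not say which side is inclusive.

```javascript
const clamp = (x, lo, hi) => Math.min(hi, Math.max(lo, x));

// Log of the estimated admit probability from academic factors only.
function logPEst(aiDelta, ecQuality, essayBase, admitThreshold) {
  const rawEst =
    clamp(20 + aiDelta * 0.75, 0, 40) + (ecQuality / 10) * 20 + (essayBase / 10) * 10 + 2;
  const acLogit = (rawEst - 46) / 20;
  // -log(1 + exp(-x)) is log(sigmoid(x)).
  return -Math.log(1 + Math.exp(-(acLogit - admitThreshold)));
}

function listCategory(lp) {
  if (lp < -2.5) return 'dream';
  if (lp < -1.0) return 'reach';
  if (lp < -0.3) return 'target';
  return 'safety';
}

// A student at the school median AI with ec 6.5 and essay 6.0:
// rawEst = 20 + 13 + 6 + 2 = 41, acLogit = -0.25.
const easy = listCategory(logPEst(0, 6.5, 6.0, 0)); // threshold 0
const hard = listCategory(logPEst(0, 6.5, 6.0, 3)); // threshold 3
```

With a threshold of 0 the same student lands in the target band, and with a threshold of 3 in the dream band, which shows how admitThreshold alone shifts the labels.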

Yield-risk flag: yieldRiskSchool = true if college is in {tufts, emory, uchicago, washu, carnegie_mellon} AND student's SAT > college median + 150.

Legacy guarantee: If hooks.legacy school was not in top-K utilities, it is appended to the list with demonstratedInterest = 0.7 to 1.0.

1.6 Round Assignment

Colleges are sorted by tier (ascending) to determine order of round assignment.

ED / REA (binding ED or restrictive REA; only one allowed):
- Student applies ED to top-choice if the college offers ED and (archetype is legacy_dev, athlete, or well_rounded, or rand() < 0.50).
- Student applies REA/SCEA to top-choice if the college offers restrictive EA and rand() < 0.45.

EA (non-restrictive, no limit): - For each remaining college offering non-restrictive EA: apply EA with probability 65%.

Fallback: If no early round was assigned, the top-choice college is force-assigned ED (if it offers ED) or REA/EA (if it offers EA).

RD: All remaining list items default to Regular Decision.

1.7 EDII Backup Pre-Assignment

After round assignment, if a student has an ED/REA school, the simulation pre-selects an EDII backup:

1. Filter list items to: has edii=true, currently round='RD', not the ED school.
2. Prefer candidates exactly one tier below the ED school → then same tier → then any.
3. Store as student.ediiBackup = college_key.

The EDII conversion from RD → EDII is executed in §2.4 after the ED round resolves.
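The backup selection can be sketched as a filter followed by a preference cascade. Several details are assumptions: the data shapes (a list of { college, round } items and a colleges lookup with tier and edii fields), reading "one tier below" as tier number + 1, and taking the first candidate within the preferred group. The college keys in the example are placeholders.

```javascript
// Hypothetical helper; returns a college key or null if no backup qualifies.
function pickEdiiBackup(list, edKey, colleges) {
  const edTier = colleges[edKey].tier;
  const candidates = list.filter(
    (item) =>
      colleges[item.college].edii === true && item.round === 'RD' && item.college !== edKey
  );
  const oneBelow = candidates.filter((c) => colleges[c.college].tier === edTier + 1);
  const sameTier = candidates.filter((c) => colleges[c.college].tier === edTier);
  const pool = oneBelow.length ? oneBelow : sameTier.length ? sameTier : candidates;
  return pool.length ? pool[0].college : null; // tie-break within a group is assumed
}

// Placeholder data for illustration.
const colleges = {
  ed_school: { tier: 1, edii: false },
  x_t3: { tier: 3, edii: true },
  y_t2: { tier: 2, edii: true },
  z_t2: { tier: 2, edii: true },
};
const list = [
  { college: 'ed_school', round: 'ED' },
  { college: 'x_t3', round: 'RD' },
  { college: 'y_t2', round: 'RD' },
  { college: 'z_t2', round: 'EA' }, // excluded: not currently RD
];
const ediiBackup = pickEdiiBackup(list, 'ed_school', colleges);
```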

1.8 Final Decision Logic

After all admission rounds, non-committed students choose among their acceptances using a 5-factor weighted score:

Factor	Weight	Details
Prestige tier	0.40	`((6 − tier) / 5) × 30` → tier 1 = 30 pts, tier 5 = 6 pts
Archetype fit	0.27	`(FIT_SCORES[archetype][college] / 5) × 20`
Legacy preference	0.13	+15 pts if `hooks.legacy === college`, else 0
Personal noise	0.10	`rand() × 10`
Financial aid (Chetty)	0.10	`clamp((relYield − 1.0) × 8, −6, +12)` where relYield from `CHETTY_YIELD_BY_INCOME[college][income_bracket]`

Student commits to the college with the highest total score.
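The per-factor point formulas in the table can be checked with a short sketch. It computes each factor's points only; how the Weight column combines with those points is not spelled out above, so no total is computed here. Function names are hypothetical, and the two relYield values are the examples quoted in §3.

```javascript
const clamp = (x, lo, hi) => Math.min(hi, Math.max(lo, x));

// Prestige: tier 1 -> 30 pts, tier 5 -> 6 pts.
const prestigePoints = (tier) => ((6 - tier) / 5) * 30;

// Archetype fit: FIT_SCORES value on the 5-point scale mapped to 0-20 pts.
const fitPoints = (fitScore) => (fitScore / 5) * 20;

// Legacy preference: flat 15 pts when the college is the student's legacy school.
const legacyPoints = (legacyKey, collegeKey) => (legacyKey === collegeKey ? 15 : 0);

// Financial aid (Chetty): relative yield above or below 1.0, scaled and clamped.
const aidPoints = (relYield) => clamp((relYield - 1.0) * 8, -6, 12);

const harvardBracket5 = aidPoints(1.953); // (1.953 - 1) * 8 = 7.624
const uclaBracket5 = aidPoints(0.48);     // (0.48 - 1) * 8 = -4.16
```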

HYPSM override: If student holds multiple HYPSM acceptances:
- If one was their ED/REA choice → commit to that one.
- Otherwise → highest FIT_SCORE among HYPSM + random noise picks the winner.

Students with zero acceptances receive status = 'rejected_all'.

2. University (Admissions Office) Agent Rules

2.1 Initialization

Each college is initialized with the following derived fields:

Seat allocation by round:

Round	Formula
ED	`floor(classSize × ED_FILL[college])` (default 40%)
EDII	`floor(classSize × 0.08)` if college offers EDII
EA	`floor(classSize × 0.15)` (non-restrictive) or `× 0.20` (restrictive)
RD	`classSize − edSeats − ediiSeats` (or `− eaSeats`)

ED_FILL rates (calibrated from CDS data): Middlebury 68%, Duke 50%, Williams 46%, others default 40%.

School median AI: schoolAI = computeAcademicIndex(college.gpa, (college.sat[0] + college.sat[1]) / 2)

Admit threshold (calibrates the logistic model to real-world RD rates):

eliteBoosts = [_, 3.5, 2.8, 2.2, 2.2, 1.5]   // index = tier (1–5); index 0 unused
poolRateRD  = min(0.60, (college.rateRD / 100) × eliteBoosts[tier])
admitThreshold = log((1 − poolRateRD) / poolRateRD)

A higher threshold means harder admissions. For example, Harvard (tier 1, ~3% RD) has a high threshold; UVA (tier 4, ~22% RD) has a lower one.
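Plugging the two example schools into the formula makes the gap concrete. This sketch uses the approximate RD rates quoted above (3% and 22%), so the results are illustrative; the function name is hypothetical.

```javascript
// Index = tier (1-5); index 0 is unused.
const ELITE_BOOSTS = [null, 3.5, 2.8, 2.2, 2.2, 1.5];

function admitThreshold(rateRDPercent, tier) {
  const poolRateRD = Math.min(0.6, (rateRDPercent / 100) * ELITE_BOOSTS[tier]);
  return Math.log((1 - poolRateRD) / poolRateRD); // logit of (1 - poolRateRD)
}

// Harvard-like: 3% RD, tier 1 -> pool rate 0.105 -> threshold about 2.14.
const harvardLike = admitThreshold(3, 1);
// UVA-like: 22% RD, tier 4 -> pool rate 0.484 -> threshold about 0.06.
const uvaLike = admitThreshold(22, 4);
```

The 0.60 cap on poolRateRD means the threshold never drops below log(0.4 / 0.6) ≈ −0.41.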

Yield protection loaded from YIELD_PROTECTION constant (see §3).

2.2 Admission Scoring Pipeline

computeAdmissionScore(college, student, round, essayQ) returns P(admit) ∈ [0,1].

All factors are additive in logit (log-odds) space, then passed through a sigmoid.

Step 1 — Academic components

studentAI    = computeAcademicIndex(student.gpa_uw, student.sat)
aiDelta      = studentAI − college.schoolAI
academicScore= clamp(20 + aiDelta × 0.75, 0, 40)
ecScore      = (student.ec_quality / 10) × 20
ecBonus      = 8 if ec_quality ≥ 9.0, else 3 if ec_quality ≥ 7.5, else 0
essayScore   = (essayQ / 10) × 10
fitScore     = FIT_SCORES[archetype][college] ∈ {0,1,2,3,4,5}

componentsRaw = academicScore + ecScore + ecBonus + essayScore + fitScore
academicLogit = (componentsRaw − 46) / 20

Interpretation: A raw score of 46 → logit 0 → 50% base probability (before threshold subtraction). ±20 raw ≈ ±1 logit unit.
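Step 1 as a function, for checking values by hand. The function name and argument shape are assumptions; the arithmetic follows the lines above.

```javascript
const clamp = (x, lo, hi) => Math.min(hi, Math.max(lo, x));

function academicLogit({ aiDelta, ecQuality, essayQ, fit }) {
  const academicScore = clamp(20 + aiDelta * 0.75, 0, 40);
  const ecScore = (ecQuality / 10) * 20;
  const ecBonus = ecQuality >= 9.0 ? 8 : ecQuality >= 7.5 ? 3 : 0;
  const essayScore = (essayQ / 10) * 10;
  const componentsRaw = academicScore + ecScore + ecBonus + essayScore + fit;
  return (componentsRaw - 46) / 20;
}

// Median-AI student, ec 8, essay 6, fit 3:
// 20 + 16 + 3 + 6 + 3 = 48 raw -> logit 0.1.
const example = academicLogit({ aiDelta: 0, ecQuality: 8, essayQ: 6, fit: 3 });

// Ceiling: 40 + 20 + 8 + 10 + 5 = 83 raw -> logit 1.85.
const ceiling = academicLogit({ aiDelta: 100, ecQuality: 10, essayQ: 10, fit: 5 });
```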

Step 2 — Feeder school bonus

feederLogit = log(student.feeder_bonus)

Range: 1.0× (public) to 2.5× (Collegiate School) → 0 to +0.92 logit.

Step 3 — Hooks (ALDC)

All hooks are additive in logit space (prevents exponential blow-up from stacking):

Hook	HYPSM (T1)	Ivy+ (T2)	Near-Ivy (T3)	Selective (T4)	LACs (T5)
donor	log(7.5)	log(6.0)	log(4.5)	log(3.0)	log(2.5)
recruited_athlete	log(4.5)	log(4.0)	log(3.5)	log(2.5)	log(2.0)
legacy	log(5.7)	log(4.0)	log(3.0)	log(2.0)	log(2.0)
first_gen	log(1.4)	log(1.4)	log(1.4)	log(1.4)	log(1.4)
pell_eligible	log(1.25)	← same →	← same →	← same →	← same →
urm	log(1.2)	← same →	← same →	← same →	← same →
underrepresented_state	log(1.3)	← same →	← same →	← same →	← same →

Calibrated from SFFA v. Harvard trial data (Arcidiacono expert testimony). Reference ALDC rates at Harvard: athlete 86%, donor 42%, legacy 33%, unhooked 5.6%.
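Summing the table entries in logit space is equivalent to multiplying the odds ratios. The sketch below covers only the hooks table (not the gender or income adjustments) and assumes hooks have already been resolved to booleans for the college being scored; the function name and data layout are hypothetical.

```javascript
// Odds multipliers by tier (index 0 = tier 1 ... index 4 = tier 5), from the table above.
const HOOK_ODDS = {
  donor: [7.5, 6.0, 4.5, 3.0, 2.5],
  recruited_athlete: [4.5, 4.0, 3.5, 2.5, 2.0],
  legacy: [5.7, 4.0, 3.0, 2.0, 2.0],
  first_gen: [1.4, 1.4, 1.4, 1.4, 1.4],
  pell_eligible: [1.25, 1.25, 1.25, 1.25, 1.25],
  urm: [1.2, 1.2, 1.2, 1.2, 1.2],
  underrepresented_state: [1.3, 1.3, 1.3, 1.3, 1.3],
};

function hookLogit(hooks, tier) {
  let logit = 0;
  for (const [name, odds] of Object.entries(HOOK_ODDS)) {
    if (hooks[name]) logit += Math.log(odds[tier - 1]);
  }
  return logit;
}

// Legacy + donor at a tier-1 school: log(5.7) + log(7.5) = log(42.75), about 3.76.
const stacked = hookLogit({ legacy: true, donor: true }, 1);
```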

Gender multiplier (log of the multiplier is added to hookLogit):

School Type	Male	Female
stem_heavy	1.0×	1.9×
balanced	1.0×	1.05×
lac	1.25×	1.0×

Caltech uses stem_heavy (female 1.9×), Williams uses lac (male 1.25×).

Income residual (Chetty 2023, unhooked students only):
- Income bracket 5 (high): +log(1.15)
- Income bracket 1 (low): +log(0.92)
- Brackets 2–4: no adjustment

Step 4 — Yield protection penalty

Applies to 9 schools: Tufts, Emory, WashU, Carnegie Mellon, Middlebury, Boston College, Georgetown, Michigan, UVA.

Condition: college.yieldProtection = true AND aiDelta > 25 AND student has no major hooks (donor/athlete/legacy).

pen = yieldProtectionStrength × min(1, (aiDelta − 25) / 40)
yieldPenalty = −pen × 2   (in logit units)

Strength values: Tufts 0.35, WashU 0.30, Emory 0.25, Middlebury 0.20, Carnegie Mellon 0.20, Boston College 0.15, Georgetown 0.15, Michigan 0.10, UVA 0.10.
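The penalty ramps linearly from 0 at aiDelta = 25 to its full value at aiDelta = 65. A minimal sketch, with a hypothetical function name and a strength of 0 standing in for schools without yield protection:

```javascript
// Returns the yield-protection term in logit units (0 or negative).
function yieldPenalty(strength, aiDelta, hasMajorHook) {
  if (!strength || aiDelta <= 25 || hasMajorHook) return 0;
  const pen = strength * Math.min(1, (aiDelta - 25) / 40);
  return -pen * 2;
}

// Tufts (0.35) with aiDelta 45: 0.35 * 0.5 * 2 = 0.35 logit penalty.
const tuftsMid = yieldPenalty(0.35, 45, false);
// Tufts at full strength (aiDelta >= 65): 0.35 * 1 * 2 = 0.70 logit penalty.
const tuftsMax = yieldPenalty(0.35, 100, false);
```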

Step 5 — Round multiplier

Round	roundLogit
ED / REA	`log(clamp(rateE / rate, 1.2, 4.5))`
EDII	`log(clamp(rateE / rate × 0.65, 1.1, 3.0))`
EA	`log(clamp(rateE / rate × 0.80, 1.0, 2.5))`
RD	0

rateE = college's ED/EA rate, rate = overall acceptance rate.
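The round table translates into a small switch. The function name and the example rates are hypothetical; treating REA the same as ED follows the first table row.

```javascript
const clamp = (x, lo, hi) => Math.min(hi, Math.max(lo, x));

// rateE = early (ED/EA) admit rate, rate = overall admit rate, same units.
function roundLogit(round, rateE, rate) {
  const ratio = rateE / rate;
  switch (round) {
    case 'ED':
    case 'REA':
      return Math.log(clamp(ratio, 1.2, 4.5));
    case 'EDII':
      return Math.log(clamp(ratio * 0.65, 1.1, 3.0));
    case 'EA':
      return Math.log(clamp(ratio * 0.8, 1.0, 2.5));
    default:
      return 0; // RD
  }
}

// Made-up rates: early 20%, overall 5% -> ratio 4 -> ED boost log(4), about 1.39.
const edBoost = roundLogit('ED', 20, 5);
```

The lower clamps guarantee that every early round gets a non-negative boost, even at a school whose early rate is no higher than its overall rate.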

Step 6 — Holistic noise

noise = (rand() + rand() − 1) × 1.2   // triangular on [−1.2, +1.2], SD ≈ 0.49 (rough stand-in for a normal)

Approximates the subjective component of holistic review (~±1 logit, effectively ±25% probability swing).

Step 7 — Logistic combination

logit = academicLogit + feederLogit + hookLogit + yieldPenalty + roundLogit + noise
        − college.admitThreshold
P(admit) = sigmoid(logit) = 1 / (1 + exp(−logit))
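A sketch of the final combination, with a hypothetical function name and argument shape. It also illustrates a useful property of the threshold: because admitThreshold = log((1 − poolRateRD) / poolRateRD), an applicant whose other terms sum to zero is admitted with probability exactly poolRateRD.

```javascript
const sigmoid = (x) => 1 / (1 + Math.exp(-x));

// Missing terms default to 0 so partial inputs can be explored.
function admitProbability(terms, admitThreshold) {
  const {
    academicLogit = 0, feederLogit = 0, hookLogit = 0,
    yieldPenalty = 0, roundLogit = 0, noise = 0,
  } = terms;
  const logit =
    academicLogit + feederLogit + hookLogit + yieldPenalty + roundLogit + noise - admitThreshold;
  return sigmoid(logit);
}

// Threshold for a pool rate of 0.105 (the tier-1, ~3% RD example in §2.1).
const threshold = Math.log((1 - 0.105) / 0.105);
const baseline = admitProbability({}, threshold);                           // about 0.105
const withLegacy = admitProbability({ hookLogit: Math.log(5.7) }, threshold); // about 0.40
```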

2.3 Round Processing

Rounds execute in order: ED → EA/REA → EDII → RD.

For each college in each round:

1. Collect applicants: ED pool, EA+REA pool, EDII pool, or RD+deferred pool.
2. Filter: Remove students already committed from binding rounds.
3. Score: Call computeAdmissionScore() for every applicant → stored as app.admission_score.
4. Sort: Descending by admission_score.
5. Stochastic Bernoulli admission: For each applicant in order:
   - If seatsForRound > admitCount AND rand() < P → ACCEPT
   - Else if ED/EA AND P > 0.15 AND deferCount < 30% of apps → DEFER (goes to RD pool)
   - Else if RD AND P > 0.05 AND waitlistCount < 15% of apps → WAITLIST
   - Else → REJECT
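The sort-then-draw loop can be sketched as below. The function name, the decision labels as strings, and the returned shape are assumptions; only the branch conditions come from the steps above. The random source is injectable so the example is reproducible.

```javascript
// apps: array of { id, admission_score } where admission_score is P(admit).
function processRoundForCollege(apps, round, seatsForRound, rand = Math.random) {
  const sorted = [...apps].sort((a, b) => b.admission_score - a.admission_score);
  const isEarly = round === 'ED' || round === 'EA';
  let admitCount = 0;
  let deferCount = 0;
  let waitlistCount = 0;
  return sorted.map((app) => {
    const p = app.admission_score;
    let decision;
    if (seatsForRound > admitCount && rand() < p) {
      decision = 'ACCEPT';
      admitCount++;
    } else if (isEarly && p > 0.15 && deferCount < 0.3 * apps.length) {
      decision = 'DEFER'; // would move to the RD pool
      deferCount++;
    } else if (round === 'RD' && p > 0.05 && waitlistCount < 0.15 * apps.length) {
      decision = 'WAITLIST';
      waitlistCount++;
    } else {
      decision = 'REJECT';
    }
    return { ...app, decision };
  });
}

// Made-up scores; a fixed "random" value of 0.5 keeps the outcome deterministic.
const pool = [0.2, 0.9, 0.1, 0.6, 0.4].map((s, i) => ({ id: i, admission_score: s }));
const edResults = processRoundForCollege(pool, 'ED', 1, () => 0.5);
```

Because each applicant is admitted with probability P rather than strictly by rank, a lower-scored applicant can take a seat that a higher-scored one missed; sorting only determines who gets the first chance at the remaining seats.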

Seat caps:
- ED: edSeats
- EA: eaSeats
- EDII: ediiSeats
- RD: seatsTotal − enrolled.length (remaining after all prior rounds)

Binding rounds (ED, EDII): Accepted students are immediately committed (status = 'committed', committed_to = college_key). No further applications considered.

2.4 EDII Conversion

Runs after the ED round, before the EDII round processes.

For each non-committed student with ediiBackup set:
1. Find the existing RD application for the backup school.
2. Convert it: app.round = 'EDII', move from RD pool to EDII pool.
3. Skip EDII with 40% probability if student already holds an EA acceptance.

Fallback (students without ediiBackup): Probabilistically convert a random eligible RD school to EDII with 30% chance. Only one EDII school per student.

2.5 Waitlist Resolution

Runs after student final decisions. Up to 5 cascade iterations.

Within each iteration, colleges are processed in prestige order (tier 1 first), so higher-ranked schools poach from lower-ranked schools' committed students.

For each college:
1. Skip if enrolled ≥ seatsTotal.
2. Calculate deficit = seatsTotal − enrolled.length; skip if deficit / seatsTotal ≤ 5%.
3. Pull from waitlist using per-school pullRate (from WAITLIST_DATA):

scaleFactor = studentsPerSchool / 20
toPull = min(deficit, round(pullRate × typicalPulled × scaleFactor))

4. For each waitlisted student (up to toPull): if the student is still uncommitted, admit, trigger studentFinalDecisions() for re-evaluation, then continue the cascade.

Waitlist pull rates range from 1% (Carnegie Mellon, Michigan) to 10% (Brown, Duke).
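The pull-count arithmetic, using the pullRate and typicalPulled values listed in WAITLIST_DATA (§3). The function name is hypothetical, and pullRate is expressed as a fraction (10% → 0.10), which is an assumption about how the constant is stored.

```javascript
// Returns how many waitlisted students a college tries to pull in one iteration.
function waitlistPullCount(seatsTotal, enrolledCount, pullRate, typicalPulled, studentsPerSchool) {
  const deficit = seatsTotal - enrolledCount;
  if (deficit <= 0 || deficit / seatsTotal <= 0.05) return 0; // full, or within 5% of full
  const scaleFactor = studentsPerSchool / 20;
  return Math.min(deficit, Math.round(pullRate * typicalPulled * scaleFactor));
}

// Brown-like values (pullRate 10%, typicalPulled 80) at the default 20 students per school:
// round(0.10 * 80 * 1) = 8, capped by a deficit of 10 -> 8.
const brownLike = waitlistPullCount(100, 90, 0.1, 80, 20);
// Harvard-like values (3%, 10): round(0.3) = 0, so nothing is pulled at the default scale.
const harvardLike = waitlistPullCount(100, 90, 0.03, 10, 20);
```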

3. Shared Constants Reference

FIT_SCORES (archetype × college, scale 1–5)

5 = perfect fit, 3 = good match, 1 = weak fit. Selected examples:

College	stem_spike	humanities_spike	athlete	arts_spike	first_gen
MIT	5	1	1	1	2
Caltech	5	1	1	1	2
Stanford	4	3	3	3	3
UChicago	3	5	1	3	2
Georgetown	2	5	2	2	3
Williams	3	5	3	4	3
UCLA	4	3	4	4	4
Notre Dame	2	4	4	2	3

College Tiers

Tier	Colleges
1 — HYPSM	Harvard, Yale, Princeton, Stanford, MIT
2 — Ivy+	Columbia, UPenn, Brown, Dartmouth, Cornell, Duke, Northwestern, UChicago, Caltech
3 — Near-Ivy	Johns Hopkins, Vanderbilt, Rice, Notre Dame, Georgetown, Carnegie Mellon, WashU
4 — Selective	Emory, Tufts, Boston College, UVA, UCLA, Michigan
5 — Top LACs	Williams, Amherst, Middlebury

CHETTY_YIELD_BY_INCOME

Per-school, per-income-bracket relative yield (1.0 = national average). Used in student final decision scoring. Source: Opportunity Insights / Kaggle elite-college-admissions dataset.

Brackets: 1 = < $20K, 2 = $20–40K, 3 = $40–60K, 4 = $60–80K, 5 = $80K+.

Examples: Harvard bracket-5 yield: 1.953× (high-income students enroll at 1.95× national rate). UCLA bracket-5: 0.48× (high-income students significantly less likely to attend than average).

YIELD_PROTECTION (Tufts Syndrome)

College	Strength
Tufts	0.35
WashU	0.30
Emory	0.25
Middlebury	0.20
Carnegie Mellon	0.20
Boston College	0.15
Georgetown	0.15
Michigan	0.10
UVA	0.10

Penalty activates only when aiDelta > 25 (student significantly overqualified) and student lacks major hooks.

WAITLIST_DATA (selected)

College	pullRate	typicalPulled
Harvard	3%	10
Brown	10%	80
Duke	10%	100
Princeton	8%	50
Michigan	1%	60
Carnegie Mellon	1%	50

4. Simulation Execution Order

1. initColleges()          → Compute schoolAI, admitThreshold, seat allocation, yield protection config
2. generateStudents()      → Correlated GPA/SAT, archetypes, hooks, income, feeder bonus; inject tracked students
3. buildCollegeLists()     → Utility model selects top-K colleges per student
4. assignRounds()          → ED/REA/EA/EDII/RD assignment + EDII backup pre-selection
5. createApplications()    → Essay quality roll (essay_base ± rand); applications pushed to college queues
6. processRound('ED')      → Stochastic Bernoulli; binding admits committed immediately
7. handleEDIIApplicants()  → Convert ediiBackup from RD → EDII
8. processRound('EA')      → EA + REA queues; deferred go to RD
9. processRound('EDII')    → Binding admits
10. processRound('RD')     → Main pool + deferred; waitlist assignments
11. studentFinalDecisions()→ Multi-factor choice among acceptances
12. resolveWaitlist()      → Up to 5 yield-triggered cascade iterations

Each step fires SIM.events entries (type: 'decision' or 'commit') consumed by the D3 arc visualization and the Tracked Student tab timeline.
