Source: agent_rules.md
Generated from index.html (current state, 2026-03-02). Documents the behavioral logic of every agent type in the simulation.
Each high school generates studentsPerSchool (default: 20) students per simulation run. For each student:
Step 1 — Archetype selection
archetype = weightedChoice(ARCHETYPE_WEIGHTS[school.category])
Weights differ by school category (elite boarding vs. public magnet vs. NYC private, etc.).
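A minimal sketch of how a `weightedChoice` helper could work is shown here. The weights are hypothetical placeholders (the real `ARCHETYPE_WEIGHTS` values are not listed in this document), and the optional `rng` parameter is an addition made only so the example is deterministic.

```javascript
// Hypothetical weights for illustration only; not the simulation's real values.
const EXAMPLE_WEIGHTS = { stem_spike: 3, well_rounded: 5, athlete: 2 };

// Pick a key with probability proportional to its weight.
// `rng` returns a uniform number in [0, 1); defaults to Math.random.
function weightedChoice(weights, rng = Math.random) {
  const entries = Object.entries(weights);
  const total = entries.reduce((sum, [, w]) => sum + w, 0);
  let r = rng() * total;
  for (const [key, w] of entries) {
    if (r < w) return key;
    r -= w;
  }
  return entries[entries.length - 1][0]; // guard against floating-point drift
}
```

Calling `weightedChoice(EXAMPLE_WEIGHTS)` with the default generator returns `well_rounded` about half the time, since its weight is 5 out of a total of 10.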
Step 2 — Gender
STEM-leaning schools (boarding, public magnet, NYC private): 55% male / 45% female. All other schools: 50/50.
Step 3 — Correlated GPA / SAT
Uses a Cholesky-decomposed bivariate normal with ρ = 0.65 (research: GPA-SAT correlation in elite pools).
gpa_z = z0
sat_z = 0.65·z0 + √(1−0.65²)·z1
gpa = clamp(school.gpa.mean + school.gpa.std × gpa_z, school.gpa.lo, school.gpa.hi)
sat = clamp(school.sat.mean + school.sat.std × sat_z, school.sat.lo, school.sat.hi) // rounded to nearest 10
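The correlated draw can be sketched as follows. This is an illustration under stated assumptions, not the code from index.html: the Box-Muller normal generator, the injectable `rng`, and the `exampleSchool` values are all made up for the example, while the field names follow the formulas above.

```javascript
const RHO = 0.65;
const clamp = (x, lo, hi) => Math.min(hi, Math.max(lo, x));

// Standard normal draw via the Box-Muller transform (an assumed generator).
function randNormal(rng = Math.random) {
  const u1 = 1 - rng(); // in (0, 1], avoids log(0)
  const u2 = rng();
  return Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
}

// Turn two independent standard normals into a correlated (gpa_z, sat_z) pair.
function correlatedZ(z0, z1, rho = RHO) {
  return { gpa_z: z0, sat_z: rho * z0 + Math.sqrt(1 - rho * rho) * z1 };
}

// Scale to the school's distribution, clamp, and round SAT to the nearest 10.
function drawGpaSat(school, rng = Math.random) {
  const { gpa_z, sat_z } = correlatedZ(randNormal(rng), randNormal(rng));
  const gpa = clamp(school.gpa.mean + school.gpa.std * gpa_z, school.gpa.lo, school.gpa.hi);
  const satRaw = clamp(school.sat.mean + school.sat.std * sat_z, school.sat.lo, school.sat.hi);
  return { gpa, sat: Math.round(satRaw / 10) * 10 };
}

// Made-up school parameters, for illustration only.
const exampleSchool = {
  gpa: { mean: 3.7, std: 0.2, lo: 2.5, hi: 4.0 },
  sat: { mean: 1400, std: 100, lo: 1000, hi: 1600 },
};
```

The second coefficient, √(1 − ρ²), keeps `sat_z` at unit variance, so the pair has correlation ρ without rescaling either marginal.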
Step 4 — Archetype stat adjustments (see §1.2 for full detail)
Step 5 — Hook probabilities (see §1.3)
Step 6 — Income-stratified SAT offset
SAT_INCOME_OFFSETS = [-50, -18, 0, +9, +32] for brackets 1–5 (< $20K → $80K+).
Applied at ~40% of the national 206-point gap to avoid double-counting school-type stratification.
Step 7 — Weighted GPA
gpa_w = clamp(gpa_uw + 0.3 + ap_count × 0.03, 3.5, 5.0)
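As a quick worked instance of the weighted-GPA formula, the sketch below applies it directly; the function name `weightedGpa` is illustrative.

```javascript
const clamp = (x, lo, hi) => Math.min(hi, Math.max(lo, x));

// Weighted GPA: flat +0.3 plus 0.03 per AP course, clamped to [3.5, 5.0].
function weightedGpa(gpaUw, apCount) {
  return clamp(gpaUw + 0.3 + apCount * 0.03, 3.5, 5.0);
}
```

A student with an unweighted 3.8 and 8 APs lands at roughly 4.34; the lower clamp means any student with a low unweighted GPA and few APs is still reported at 3.5.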
Tracked students (e.g., Petr Kirsanov) are injected after the main loop with all fields hard-coded from real documents. They participate in the sim identically to generated students.
Eight archetypes, each with distinct stat profiles and application counts:
| Archetype | App Count | EC Quality | Essay Base | Key Adjustments |
|---|---|---|---|---|
| athlete | 4–8 | 5–9 | school mean | SAT −50 to −120, GPA −0.10 to −0.25; recruited_athlete = true |
| legacy_dev | 5–9 | school mean | school mean | Assigned a tier-1/2/3 legacy school; 35% chance donor = true |
| first_gen | 5–9 | 4–7.5 | +0.5 to +1.0 | first_gen = true; 40% chance urm = true |
| stem_spike | 10–16 | max(7, ec+1) | +0.5 | SAT +30 to +60; AP count +3 to +4 |
| humanities_spike | 8–13 | max(7, ec+1.5) | +1.0 to +1.5 | Broadened liberal arts search |
| arts_spike | 6–10 | 8–10 | 7–9 | SAT −10 to −40; EC and essay from high truncNorm |
| average_strong | 8–13 | 3–7 | school mean | No boosts; safety-conscious |
| well_rounded | 8–13 | school mean | school mean | Strongly prefers ED |
Archetype application count is the primary driver of target list size; the actual count K is drawn from a lognormal centered on the archetype mean with σ = 0.4 (clamped 3–20).
EC/Essay base stats (before archetype adjustment):
ecQuality ~ truncNorm(6.5, 1.5, 3, 10)
essayBase ~ truncNorm(6.0, 1.8, 3, 10)
Hook assignment (non-athlete, non-legacy_dev archetypes)
Per school category, background legacy/donor rates:
| School Category | Legacy Prob | Donor Prob |
|---|---|---|
| elite_nyc_private | 22% | 15% |
| elite_boarding | 18% | 12% |
| boarding_day | 15% | 9% |
| elite_day_school | 14% | 9% |
| international_school | 5% | 2% |
| public_charter_elite | 4% | 2% |
| elite_public_magnet | 3% | 1% |
When a legacy hook is assigned, 25% chance donor is also set. Legacy school drawn randomly from all colleges in tiers 1–3.
Per-school hook profile (from hookProfile JSON field):
Each high school can specify first_gen and urm probabilities; applied after category-level hooks.
Additional hook probabilities (all archetypes):
- Consulting client: school.consulting_client_prob (0.03–0.20 by school ivy-placement %)
→ if client: essay +0.5–1.0, EC +0.3–0.6
- Underrepresented state: 8% baseline
- URM (non-first_gen path): 10% baseline
Income bracket assignment
Base by school category: boarding 5, nyc_private 5, boarding_day 4, elite_day 4, international 3, charter 3, magnet 2. First-gen students get bracket − 2; ±1 uniform noise is then added and the result is clamped to 1–5.
Pell eligibility: true if first_gen === true or income_bracket ≤ 2.
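The bracket and Pell rules can be sketched as below. Two assumptions are made for the example: the noise is treated as an integer in {-1, 0, +1} and passed in explicitly for determinism, and the category keys use the shortened names from the text, which may differ from the real category keys.

```javascript
const clamp = (x, lo, hi) => Math.min(hi, Math.max(lo, x));

// Base brackets by school category, using the shortened names from the text.
const BASE_BRACKET = {
  boarding: 5, nyc_private: 5, boarding_day: 4, elite_day: 4,
  international: 3, charter: 3, magnet: 2,
};

// `noise` is assumed to be -1, 0, or +1.
function incomeBracket(category, firstGen, noise) {
  const base = BASE_BRACKET[category] - (firstGen ? 2 : 0);
  return clamp(base + noise, 1, 5);
}

// Pell eligibility follows directly from first-gen status or a low bracket.
function pellEligible(firstGen, bracket) {
  return firstGen === true || bracket <= 2;
}
```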
All scoring is anchored to the Academic Index on a 0–240 scale, mirroring the Ivy League AI formula.
CGS (Converted Grade Score) from unweighted GPA — piecewise linear Ivy table:
| GPA | CGS | GPA | CGS |
|---|---|---|---|
| 4.0 | 80 | 3.5 | 71 |
| 3.9 | 79 | 3.4 | 68 |
| 3.8 | 78 | 3.3 | 66 |
| 3.7 | 77 | 3.2 | 64 |
| 3.6 | 73 | 3.1 | 62 |
| 3.0 | 60 | 2.5 | 50 |
Above 4.0 → 80 (cap). Below 2.5 → linear extrapolation at 20 pts/GPA.
SAT component: (SAT / 20) × 2 → max 160 at 1600.
Academic Index: AI = CGS + (SAT/20)×2, capped at 240.
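A possible implementation of the index, assuming straight-line interpolation between the table's anchor points, looks like this. The helper name `cgsFromGpa` is illustrative; `computeAcademicIndex` is the name used elsewhere in this document.

```javascript
// (GPA, CGS) anchor points from the table above, ascending by GPA.
const CGS_TABLE = [
  [2.5, 50], [3.0, 60], [3.1, 62], [3.2, 64], [3.3, 66], [3.4, 68],
  [3.5, 71], [3.6, 73], [3.7, 77], [3.8, 78], [3.9, 79], [4.0, 80],
];

function cgsFromGpa(gpa) {
  if (gpa >= 4.0) return 80;                    // cap
  if (gpa < 2.5) return 50 - (2.5 - gpa) * 20;  // extrapolate at 20 pts per GPA point
  for (let i = 1; i < CGS_TABLE.length; i++) {
    const [g0, c0] = CGS_TABLE[i - 1];
    const [g1, c1] = CGS_TABLE[i];
    if (gpa <= g1) return c0 + ((gpa - g0) / (g1 - g0)) * (c1 - c0);
  }
  return 80; // unreachable for finite input; kept as a safe default
}

function computeAcademicIndex(gpaUw, sat) {
  return Math.min(240, cgsFromGpa(gpaUw) + (sat / 20) * 2);
}
```

A 4.0 / 1600 student reaches the 240 ceiling (80 + 160), while a 3.0 / 1200 student scores 60 + 120 = 180.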
For each student, a utility model ranks all 30 colleges; the top-K are selected for the list.
utility(college) = prestige + fitBonus + legacyBonus + 5 × logPEst
Where:
- prestige: (6 − tier) × 8 + rand() × 12 − 6 (tier 1 → ~40, tier 5 → ~8, each ±6)
- fitBonus: FIT_SCORES[archetype][college] (0–5 scale, see §3)
- legacyBonus: +15 if hooks.legacy === college (guarantees legacy school appears)
- logPEst: log of estimated P(admit) using academic factors only (hooks excluded):
rawEst = clamp(20 + aiDelta × 0.75, 0, 40)
+ (ec_quality/10)×20 + (essay_base/10)×10 + 2
acLogit = (rawEst − 46) / 20
logPEst = −log(1 + exp(−(acLogit − college.admitThreshold)))
Target K: K ~ lognormal(log(APP_MEANS[archetype]), σ=0.4), clamped 3–20.
APP_MEANS by archetype: athlete 6, legacy_dev 7, first_gen 6, stem_spike 13, humanities_spike 11, arts_spike 8, average_strong 10, well_rounded 10.
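The list-size draw can be sketched as below. The standard normal `z` is passed in so the example is deterministic, and rounding to an integer is an assumption, since the text specifies only the clamp.

```javascript
const APP_MEANS = {
  athlete: 6, legacy_dev: 7, first_gen: 6, stem_spike: 13,
  humanities_spike: 11, arts_spike: 8, average_strong: 10, well_rounded: 10,
};

// K = exp(log(mean) + sigma * z), rounded (assumed) and clamped to [3, 20].
function drawTargetK(archetype, z, sigma = 0.4) {
  const k = Math.exp(Math.log(APP_MEANS[archetype]) + sigma * z);
  return Math.min(20, Math.max(3, Math.round(k)));
}
```

Because the lognormal is parameterized with μ = log(mean), `APP_MEANS` is the median of the unclamped draw; the arithmetic mean sits slightly higher, by a factor of exp(σ²/2).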
Category labels assigned from logPEst:
| logPEst | Category |
|---|---|
| < −2.5 | dream |
| −2.5 to −1.0 | reach |
| −1.0 to −0.3 | target |
| ≥ −0.3 | safety |
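Putting the estimate and the label table together gives a sketch like the following. The function names `logPEst` and `listCategory` are illustrative, and boundary values are assigned to the upper band, following the "−2.5 to −1.0" and "≥ −0.3" wording of the table.

```javascript
const clamp = (x, lo, hi) => Math.min(hi, Math.max(lo, x));

// Academic-only log-probability estimate (hooks excluded).
function logPEst(aiDelta, ecQuality, essayBase, admitThreshold) {
  const rawEst = clamp(20 + aiDelta * 0.75, 0, 40)
    + (ecQuality / 10) * 20 + (essayBase / 10) * 10 + 2;
  const acLogit = (rawEst - 46) / 20;
  // -log(1 + exp(-x)) is log(sigmoid(x)), so the result is always <= 0.
  return -Math.log1p(Math.exp(-(acLogit - admitThreshold)));
}

function listCategory(lp) {
  if (lp < -2.5) return 'dream';
  if (lp < -1.0) return 'reach';
  if (lp < -0.3) return 'target';
  return 'safety';
}
```

For a student at the school median (aiDelta = 0) with EC and essay scores of 6, rawEst is 40 and acLogit is −0.3; against a threshold of 0 this gives a logPEst of about −0.85, which is labeled `target`.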
Yield-risk flag: yieldRiskSchool = true if college is in {tufts, emory, uchicago, washu, carnegie_mellon} AND student's SAT > college median + 150.
Legacy guarantee: If hooks.legacy school was not in top-K utilities, it is appended to the list with demonstratedInterest = 0.7 to 1.0.
Colleges are sorted by tier (ascending) to determine order of round assignment.
ED / REA (binding ED or restrictive EA; only one allowed):
- Student applies ED to top-choice if:
  - College offers ED and (archetype is legacy_dev, athlete, or well_rounded, or rand() < 0.50)
- Student applies REA/SCEA to top-choice if:
  - College offers restrictive EA and rand() < 0.45
EA (non-restrictive, no limit):
- For each remaining college offering non-restrictive EA: apply EA with probability 65%.
Fallback: If no early round was assigned, the top-choice college is force-assigned ED (if it offers ED) or REA/EA (if it offers EA).
RD: All remaining list items default to Regular Decision.
After round assignment, if a student has an ED/REA school, the simulation pre-selects an EDII backup:
- Candidates are colleges on the list with edii=true, currently round='RD', and not the ED school.
- The selected backup is stored as student.ediiBackup = college_key.
The EDII conversion from RD → EDII is executed in §2.4 after the ED round resolves.
After all admission rounds, non-committed students choose among their acceptances using a 5-factor weighted score:
| Factor | Weight | Details |
|---|---|---|
| Prestige tier | 0.40 | ((6 − tier) / 5) × 30 → tier 1 = 30 pts, tier 5 = 6 pts |
| Archetype fit | 0.27 | (FIT_SCORES[archetype][college] / 5) × 20 |
| Legacy preference | 0.13 | +15 pts if hooks.legacy === college, else 0 |
| Personal noise | 0.10 | rand() × 10 |
| Financial aid (Chetty) | 0.10 | clamp((relYield − 1.0) × 8, −6, +12) where relYield from CHETTY_YIELD_BY_INCOME[college][income_bracket] |
Student commits to the college with the highest total score.
HYPSM override: If student holds multiple HYPSM acceptances:
- If one was their ED/REA choice → commit to that one.
- Otherwise → highest FIT_SCORE among HYPSM + random noise picks the winner.
Students with zero acceptances receive status = 'rejected_all'.
Each college is initialized with the following derived fields:
Seat allocation by round:
| Round | Formula |
|---|---|
| ED | floor(classSize × ED_FILL[college]) (default 40%) |
| EDII | floor(classSize × 0.08) if college offers EDII |
| EA | floor(classSize × 0.15) (non-restrictive) or × 0.20 (restrictive) |
| RD | classSize − edSeats − ediiSeats (or − eaSeats) |
ED_FILL rates (calibrated from CDS data): Middlebury 68%, Duke 50%, Williams 46%, others default 40%.
School median AI:
schoolAI = computeAcademicIndex(college.gpa, (college.sat[0] + college.sat[1]) / 2)
Admit threshold (calibrates the logistic model to real-world RD rates):
eliteBoosts = [_, 3.5, 2.8, 2.2, 2.2, 1.5] // index = tier (1–5); index 0 unused
poolRateRD = min(0.60, (college.rateRD / 100) × eliteBoosts[tier])
admitThreshold = log((1 − poolRateRD) / poolRateRD)
A higher threshold means harder admissions. For example, Harvard (tier 1, ~3% RD) has a high threshold; UVA (tier 4, ~22% RD) has a lower one.
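The two examples can be checked numerically with a short sketch. The rates used are the approximate figures quoted above (3% and 22%), not values read from the simulation's data.

```javascript
// Index 0 is unused so that tiers 1-5 index directly.
const ELITE_BOOSTS = [null, 3.5, 2.8, 2.2, 2.2, 1.5];

// rateRD is a percentage (e.g. 3 for 3%).
function admitThreshold(rateRD, tier) {
  const poolRateRD = Math.min(0.60, (rateRD / 100) * ELITE_BOOSTS[tier]);
  return Math.log((1 - poolRateRD) / poolRateRD);
}

const harvard = admitThreshold(3, 1);  // pool rate 0.105
const uva = admitThreshold(22, 4);     // pool rate 0.484
```

Here `harvard` comes out near 2.14 and `uva` near 0.06, so an applicant needs roughly two more logit units of strength at Harvard to reach the same admit probability.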
Yield protection loaded from YIELD_PROTECTION constant (see §3).
computeAdmissionScore(college, student, round, essayQ) returns P(admit) ∈ [0,1].
All factors are additive in logit (log-odds) space, then passed through a sigmoid.
Step 1 — Academic components
studentAI = computeAcademicIndex(student.gpa_uw, student.sat)
aiDelta = studentAI − college.schoolAI
academicScore= clamp(20 + aiDelta × 0.75, 0, 40)
ecScore = (student.ec_quality / 10) × 20
ecBonus = 8 if ec_quality ≥ 9.0, else 3 if ec_quality ≥ 7.5, else 0
essayScore = (essayQ / 10) × 10
fitScore = FIT_SCORES[archetype][college] ∈ {0,1,2,3,4,5}
componentsRaw = academicScore + ecScore + ecBonus + essayScore + fitScore
academicLogit = (componentsRaw − 46) / 20
Interpretation: A raw score of 46 → logit 0 → 50% base probability (before threshold subtraction). ±20 raw ≈ ±1 logit unit.
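Step 1 translates almost line for line into code. The sketch below takes the already-computed `aiDelta` and `fitScore` as inputs rather than looking them up, which is a simplification for the example.

```javascript
const clamp = (x, lo, hi) => Math.min(hi, Math.max(lo, x));

function academicLogit({ aiDelta, ecQuality, essayQ, fitScore }) {
  const academicScore = clamp(20 + aiDelta * 0.75, 0, 40);
  const ecScore = (ecQuality / 10) * 20;
  const ecBonus = ecQuality >= 9.0 ? 8 : ecQuality >= 7.5 ? 3 : 0;
  const essayScore = (essayQ / 10) * 10;
  const componentsRaw = academicScore + ecScore + ecBonus + essayScore + fitScore;
  return (componentsRaw - 46) / 20; // 46 raw points map to logit 0
}
```

A median-AI applicant with EC 8, essay 7, and fit 3 scores 20 + 16 + 3 + 7 + 3 = 49 raw points, or a logit of 0.15.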
Step 2 — Feeder school bonus
feederLogit = log(student.feeder_bonus)
Range: 1.0× (public) to 2.5× (Collegiate School) → 0 to +0.92 logit.
Step 3 — Hooks (ALDC)
All hooks are additive in logit space (prevents exponential blow-up from stacking):
| Hook | HYPSM (T1) | Ivy+ (T2) | Near-Ivy (T3) | Selective (T4) | LACs (T5) |
|---|---|---|---|---|---|
| donor | log(7.5) | log(6.0) | log(4.5) | log(3.0) | log(2.5) |
| recruited_athlete | log(4.5) | log(4.0) | log(3.5) | log(2.5) | log(2.0) |
| legacy | log(5.7) | log(4.0) | log(3.0) | log(2.0) | log(2.0) |
| first_gen | log(1.4) | log(1.4) | log(1.4) | log(1.4) | log(1.4) |
| pell_eligible | log(1.25) | ← same → | ← same → | ← same → | ← same → |
| urm | log(1.2) | ← same → | ← same → | ← same → | ← same → |
| underrepresented_state | log(1.3) | ← same → | ← same → | ← same → | ← same → |
Calibrated from SFFA v. Harvard trial data (Arcidiacono expert testimony). Reference ALDC rates at Harvard: athlete 86%, donor 42%, legacy 33%, unhooked 5.6%.
Gender multiplier (its log is added to hookLogit):
| School Type | Male | Female |
|---|---|---|
| stem_heavy | 1.0× | 1.9× |
| balanced | 1.0× | 1.05× |
| lac | 1.25× | 1.0× |
Caltech uses stem_heavy (female 1.9×), Williams uses lac (male 1.25×).
Income residual (Chetty 2023, unhooked students only):
- Income bracket 5 (high): +log(1.15)
- Income bracket 1 (low): +log(0.92)
- Brackets 2–4: no adjustment
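The ALDC hook table can be sketched as a lookup that sums log-multipliers. This example covers only the hook table; the gender and income terms are left out, and `hooks` is simplified to an object of booleans, whereas the simulation stores the legacy hook as a college key.

```javascript
// Odds multipliers per tier (index 0 unused), transcribed from the hook table.
const TIERED_HOOKS = {
  donor:             [null, 7.5, 6.0, 4.5, 3.0, 2.5],
  recruited_athlete: [null, 4.5, 4.0, 3.5, 2.5, 2.0],
  legacy:            [null, 5.7, 4.0, 3.0, 2.0, 2.0],
};
const FLAT_HOOKS = {
  first_gen: 1.4, pell_eligible: 1.25, urm: 1.2, underrepresented_state: 1.3,
};

// Sum log(multiplier) over every hook the student holds.
function hookLogit(hooks, tier) {
  let logit = 0;
  for (const [name, byTier] of Object.entries(TIERED_HOOKS)) {
    if (hooks[name]) logit += Math.log(byTier[tier]);
  }
  for (const [name, mult] of Object.entries(FLAT_HOOKS)) {
    if (hooks[name]) logit += Math.log(mult);
  }
  return logit;
}
```

Summing logs is equivalent to multiplying odds, so a legacy donor at a tier-1 school gets log(5.7 × 7.5) = log(42.75) added to the logit; the sigmoid then keeps the final probability below 1 however many hooks stack.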
Step 4 — Yield protection penalty
Applies to 9 schools: Tufts, Emory, WashU, Carnegie Mellon, Middlebury, Boston College, Georgetown, Michigan, UVA.
Condition: college.yieldProtection = true AND aiDelta > 25 AND student has no major hooks (donor/athlete/legacy).
pen = yieldProtectionStrength × min(1, (aiDelta − 25) / 40)
yieldPenalty = −pen × 2 (in logit units)
Strength values: Tufts 0.35, WashU 0.30, Emory 0.25, Middlebury 0.20, Carnegie Mellon 0.20, Boston College 0.15, Georgetown 0.15, Michigan 0.10, UVA 0.10.
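A minimal sketch of the penalty follows. The college keys are assumptions modeled on the lowercase keys used elsewhere in this document, and `hasMajorHook` stands in for the donor/athlete/legacy check.

```javascript
const YIELD_PROTECTION = {
  tufts: 0.35, washu: 0.30, emory: 0.25, middlebury: 0.20, carnegie_mellon: 0.20,
  boston_college: 0.15, georgetown: 0.15, michigan: 0.10, uva: 0.10,
};

// Returns the penalty in logit units (zero or negative).
function yieldPenalty(collegeKey, aiDelta, hasMajorHook) {
  const strength = YIELD_PROTECTION[collegeKey];
  if (strength === undefined || aiDelta <= 25 || hasMajorHook) return 0;
  const pen = strength * Math.min(1, (aiDelta - 25) / 40);
  return -pen * 2;
}
```

The penalty ramps linearly from 0 at aiDelta = 25 to its maximum at aiDelta = 65. At Tufts an unhooked applicant with aiDelta = 45 is halfway up the ramp and loses 0.35 logit units; the maximum is 0.70.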
Step 5 — Round multiplier
| Round | roundLogit |
|---|---|
| ED / REA | log(clamp(rateE / rate, 1.2, 4.5)) |
| EDII | log(clamp(rateE / rate × 0.65, 1.1, 3.0)) |
| EA | log(clamp(rateE / rate × 0.80, 1.0, 2.5)) |
| RD | 0 |
rateE = college's ED/EA rate, rate = overall acceptance rate.
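The round table maps onto a small function. The rates in the usage note are hypothetical and chosen only to make the arithmetic easy to follow.

```javascript
const clamp = (x, lo, hi) => Math.min(hi, Math.max(lo, x));

// rateE: early-round admit rate; rate: overall admit rate (same units).
function roundLogit(round, rateE, rate) {
  const ratio = rateE / rate;
  switch (round) {
    case 'ED':
    case 'REA':  return Math.log(clamp(ratio, 1.2, 4.5));
    case 'EDII': return Math.log(clamp(ratio * 0.65, 1.1, 3.0));
    case 'EA':   return Math.log(clamp(ratio * 0.80, 1.0, 2.5));
    default:     return 0; // RD
  }
}
```

With a hypothetical 15% early rate and 5% overall rate, the ratio is 3, giving log(3) for ED, log(1.95) for EDII, and log(2.4) for EA. The lower clamps mean an early application never hurts, even at a school whose early rate is below its overall rate.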
Step 6 — Holistic noise
noise = (rand() + rand() − 1) × 1.2 // triangular on [−1.2, 1.2], SD ≈ 0.49
Approximates the subjective component of holistic review (~±1 logit, effectively ±25% probability swing).
Step 7 — Logistic combination
logit = academicLogit + feederLogit + hookLogit + yieldPenalty + roundLogit + noise
− college.admitThreshold
P(admit) = sigmoid(logit) = 1 / (1 + exp(−logit))
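Steps 6 and 7 can be sketched together as below. The `terms` object is a hypothetical container for the per-step contributions; in the simulation these are computed inside `computeAdmissionScore`.

```javascript
const sigmoid = (x) => 1 / (1 + Math.exp(-x));

// Sum of two uniforms, centred and scaled: a triangular draw on [-1.2, 1.2].
function holisticNoise(rng = Math.random) {
  return (rng() + rng() - 1) * 1.2;
}

// Add every logit-space contribution, subtract the threshold, apply the sigmoid.
function admitProbability(terms, admitThreshold, noise = 0) {
  const logit = terms.academicLogit + terms.feederLogit + terms.hookLogit
    + terms.yieldPenalty + terms.roundLogit + noise - admitThreshold;
  return sigmoid(logit);
}
```

With every term at zero and a threshold of zero the result is exactly 0.5. Raising the threshold or adding a yield penalty moves the probability down, and hooks or an early round move it up.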
Rounds execute in order: ED → EA/REA → EDII → RD.
For each college in each round:
1. computeAdmissionScore() for every applicant → stored as app.admission_score.
2. Sort applicants by admission_score.
3. For each applicant, with P = admission_score:
   - seatsForRound > admitCount AND rand() < P → ACCEPT
   - P > 0.15 AND deferCount < 30% of apps → DEFER (goes to RD pool)
   - P > 0.05 AND waitlistCount < 15% of apps → WAITLIST
Seat caps:
- ED: edSeats
- EA: eaSeats
- EDII: ediiSeats
- RD: seatsTotal − enrolled.length (remaining after all prior rounds)
Binding rounds (ED, EDII): Accepted students are immediately committed (status = 'committed', committed_to = college_key). No further applications considered.
Runs after the ED round, before the EDII round processes.
For each non-committed student with ediiBackup set:
1. Find the existing RD application for the backup school.
2. Convert it: app.round = 'EDII', move from RD pool to EDII pool.
3. Skip EDII with 40% probability if student already holds an EA acceptance.
Fallback (students without ediiBackup): Probabilistically convert a random eligible RD school to EDII with 30% chance. Only one EDII school per student.
Runs after student final decisions. Up to 5 cascade iterations.
Within each iteration, colleges are processed in prestige order (tier 1 first), so higher-ranked schools poach from lower-ranked schools' committed students.
For each college:
1. Skip if enrolled ≥ seatsTotal.
2. Calculate deficit = seatsTotal − enrolled.length; skip if deficit / seatsTotal ≤ 5%.
3. Pull from waitlist using per-school pullRate (from WAITLIST_DATA):
scaleFactor = studentsPerSchool / 20
toPull = min(deficit, round(pullRate × typicalPulled × scaleFactor))
4. For each waitlisted student (up to toPull):
- If student is still uncommitted: admit, trigger studentFinalDecisions() for re-evaluation, then continue cascade.
Waitlist pull rates range from 1% (Carnegie Mellon, Michigan) to 10% (Brown, Duke).
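The pull-count calculation in steps 1 to 3 can be sketched as follows. The three rows of `WAITLIST_DATA` are transcribed from the reference table in this document with `pullRate` expressed as a fraction; the lowercase keys and the function name are assumptions.

```javascript
// Sample rows; pullRate is a fraction (0.10 = 10%).
const WAITLIST_DATA = {
  harvard: { pullRate: 0.03, typicalPulled: 10 },
  duke:    { pullRate: 0.10, typicalPulled: 100 },
  brown:   { pullRate: 0.10, typicalPulled: 80 },
};

function waitlistPullCount(collegeKey, seatsTotal, enrolledCount, studentsPerSchool = 20) {
  const deficit = seatsTotal - enrolledCount;
  if (deficit <= 0 || deficit / seatsTotal <= 0.05) return 0; // full, or gap too small
  const { pullRate, typicalPulled } = WAITLIST_DATA[collegeKey];
  const scaleFactor = studentsPerSchool / 20;
  return Math.min(deficit, Math.round(pullRate * typicalPulled * scaleFactor));
}
```

At the default 20 students per school, Duke with 20 open seats out of 100 pulls min(20, round(0.10 × 100 × 1)) = 10 students, while Harvard's product of 0.3 rounds to zero pulls per iteration.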
5 = perfect fit, 3 = good match, 1 = weak fit. Selected examples:
| College | stem_spike | humanities_spike | athlete | arts_spike | first_gen |
|---|---|---|---|---|---|
| MIT | 5 | 1 | 1 | 1 | 2 |
| Caltech | 5 | 1 | 1 | 1 | 2 |
| Stanford | 4 | 3 | 3 | 3 | 3 |
| UChicago | 3 | 5 | 1 | 3 | 2 |
| Georgetown | 2 | 5 | 2 | 2 | 3 |
| Williams | 3 | 5 | 3 | 4 | 3 |
| UCLA | 4 | 3 | 4 | 4 | 4 |
| Notre Dame | 2 | 4 | 4 | 2 | 3 |
| Tier | Colleges |
|---|---|
| 1 — HYPSM | Harvard, Yale, Princeton, Stanford, MIT |
| 2 — Ivy+ | Columbia, UPenn, Brown, Dartmouth, Cornell, Duke, Northwestern, UChicago, Caltech |
| 3 — Near-Ivy | Johns Hopkins, Vanderbilt, Rice, Notre Dame, Georgetown, Carnegie Mellon, WashU |
| 4 — Selective | Emory, Tufts, Boston College, UVA, UCLA, Michigan |
| 5 — Top LACs | Williams, Amherst, Middlebury |
Per-school, per-income-bracket relative yield (1.0 = national average). Used in student final decision scoring. Source: Opportunity Insights / Kaggle elite-college-admissions dataset.
Brackets: 1 = < $20K, 2 = $20–40K, 3 = $40–60K, 4 = $60–80K, 5 = $80K+.
Examples: Harvard bracket-5 yield: 1.953× (high-income students enroll at 1.95× national rate). UCLA bracket-5: 0.48× (high-income students significantly less likely to attend than average).
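To show how these relative yields feed the final-decision score, the sketch below applies the financial-aid formula from the decision table to the two examples; the function name is illustrative.

```javascript
const clamp = (x, lo, hi) => Math.min(hi, Math.max(lo, x));

// Points contributed by the Chetty financial-aid factor: clamp((relYield - 1) * 8, -6, 12).
function financialAidPoints(relYield) {
  return clamp((relYield - 1.0) * 8, -6, 12);
}
```

Harvard's bracket-5 value of 1.953 yields about +7.6 points, and UCLA's 0.48 yields about −4.2, so for a high-income student this factor alone separates the two schools by nearly 12 points.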
| College | Strength |
|---|---|
| Tufts | 0.35 |
| WashU | 0.30 |
| Emory | 0.25 |
| Middlebury | 0.20 |
| Carnegie Mellon | 0.20 |
| Boston College | 0.15 |
| Georgetown | 0.15 |
| Michigan | 0.10 |
| UVA | 0.10 |
Penalty activates only when aiDelta > 25 (student significantly overqualified) and student lacks major hooks.
| College | pullRate | typicalPulled |
|---|---|---|
| Harvard | 3% | 10 |
| Brown | 10% | 80 |
| Duke | 10% | 100 |
| Princeton | 8% | 50 |
| Michigan | 1% | 60 |
| Carnegie Mellon | 1% | 50 |
1. initColleges() → Compute schoolAI, admitThreshold, seat allocation, yield protection config
2. generateStudents() → Correlated GPA/SAT, archetypes, hooks, income, feeder bonus; inject tracked students
3. buildCollegeLists() → Utility model selects top-K colleges per student
4. assignRounds() → ED/REA/EA/EDII/RD assignment + EDII backup pre-selection
5. createApplications() → Essay quality roll (essay_base ± rand); applications pushed to college queues
6. processRound('ED') → Stochastic Bernoulli; binding admits committed immediately
7. handleEDIIApplicants() → Convert ediiBackup from RD → EDII
8. processRound('EA') → EA + REA queues; deferred go to RD
9. processRound('EDII') → Binding admits
10. processRound('RD') → Main pool + deferred; waitlist assignments
11. studentFinalDecisions() → Multi-factor choice among acceptances
12. resolveWaitlist() → Up to 5 yield-triggered cascade iterations
Each step fires SIM.events entries (type: 'decision' or 'commit') consumed by the D3 arc visualization and the Tracked Student tab timeline.