College Simulation — Agent Rules Reference

Source: agent_rules.md


Generated from index.html (current state, 2026-03-02). Documents the behavioral logic of every agent type in the simulation.


Table of Contents

  1. Student Agent Rules
     1.1 Generation
     1.2 Archetypes
     1.3 Hooks & Demographics
     1.4 Academic Index (AI)
     1.5 College List Building
     1.6 Round Assignment
     1.7 EDII Backup Pre-Assignment
     1.8 Final Decision Logic
  2. University (Admissions Office) Agent Rules
     2.1 Initialization
     2.2 Admission Scoring Pipeline
     2.3 Round Processing
     2.4 EDII Conversion
     2.5 Waitlist Resolution
  3. Shared Constants Reference
  4. Simulation Execution Order

1. Student Agent Rules

1.1 Generation

Each high school generates studentsPerSchool (default: 20) students per simulation run. For each student:

Step 1 — Archetype selection

archetype = weightedChoice(ARCHETYPE_WEIGHTS[school.category])

Weights differ by school category (elite boarding vs. public magnet vs. NYC private, etc.).

Step 2 — Gender

STEM-leaning schools (boarding, public magnet, NYC private): 55% male / 45% female. All other schools: 50/50.

Step 3 — Correlated GPA / SAT

Uses a Cholesky-decomposed bivariate normal with ρ = 0.65 (based on research on GPA-SAT correlation in elite pools), where z0 and z1 are independent standard normal draws:

gpa_z = z0
sat_z = 0.65·z0 + √(1−0.65²)·z1
gpa   = clamp(school.gpa.mean + school.gpa.std × gpa_z, school.gpa.lo, school.gpa.hi)
sat   = clamp(school.sat.mean + school.sat.std × sat_z, school.sat.lo, school.sat.hi)
       rounded to nearest 10
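As a rough illustration of Step 3, the draw can be sketched in JavaScript as follows. The helper names standardNormal and sampleGpaSat, the Box-Muller sampler, and the injectable rng parameter are assumptions made for this sketch; only the school.gpa / school.sat field layout follows the pseudocode above.

```javascript
// Sketch only: helper names and the Box-Muller sampler are assumptions.
const RHO = 0.65;

const clamp = (x, lo, hi) => Math.min(hi, Math.max(lo, x));

// Box-Muller transform: one standard normal draw from two uniform draws.
function standardNormal(rng = Math.random) {
  const u1 = 1 - rng();            // shift to (0, 1] so Math.log stays finite
  const u2 = rng();
  return Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
}

// school: { gpa: { mean, std, lo, hi }, sat: { mean, std, lo, hi } }
function sampleGpaSat(school, rng = Math.random) {
  const z0 = standardNormal(rng);
  const z1 = standardNormal(rng);
  const gpaZ = z0;
  const satZ = RHO * z0 + Math.sqrt(1 - RHO * RHO) * z1;  // Cholesky factor row 2
  const gpa = clamp(school.gpa.mean + school.gpa.std * gpaZ, school.gpa.lo, school.gpa.hi);
  const satRaw = clamp(school.sat.mean + school.sat.std * satZ, school.sat.lo, school.sat.hi);
  return { gpa, sat: Math.round(satRaw / 10) * 10 };      // SAT rounded to nearest 10
}
```

Passing a seeded or fixed rng makes a run reproducible, which is convenient when checking that the clamps and rounding behave as described.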

Step 4 — Archetype stat adjustments (see §1.2 for full detail)

Step 5 — Hook probabilities (see §1.3)

Step 6 — Income-stratified SAT offset

SAT_INCOME_OFFSETS = [-50, -18, 0, +9, +32] for brackets 1–5 (< $20K → $80K+). The offsets span 82 points, about 40% of the national 206-point gap; the reduction avoids double-counting school-type stratification.

Step 7 — Weighted GPA

gpa_w = clamp(gpa_uw + 0.3 + ap_count × 0.03, 3.5, 5.0)

Tracked students (e.g., Petr Kirsanov) are injected after the main loop with all fields hard-coded from real documents. They participate in the sim identically to generated students.


1.2 Archetypes

Eight archetypes, each with distinct stat profiles and application counts:

| Archetype | App Count | EC Quality | Essay Base | Key Adjustments |
|---|---|---|---|---|
| athlete | 4–8 | 5–9 | school mean | SAT −50 to −120, GPA −0.10 to −0.25; recruited_athlete = true |
| legacy_dev | 5–9 | school mean | school mean | Assigned a tier-1/2/3 legacy school; 35% chance donor = true |
| first_gen | 5–9 | 4–7.5 | +0.5 to +1.0 | first_gen = true; 40% chance urm = true |
| stem_spike | 10–16 | max(7, ec+1) | +0.5 | SAT +30 to +60; AP count +3 to +4 |
| humanities_spike | 8–13 | max(7, ec+1.5) | +1.0 to +1.5 | Broadened liberal arts search |
| arts_spike | 6–10 | 8–10 | 7–9 | SAT −10 to −40; EC and essay from high truncNorm |
| average_strong | 8–13 | 3–7 | school mean | No boosts; safety-conscious |
| well_rounded | 8–13 | school mean | school mean | Strongly prefers ED |

Archetype application count is the primary driver of target list size; the actual count K is drawn from a lognormal centered on the archetype mean with σ = 0.4 (clamped 3–20).

EC/Essay base stats (before archetype adjustment):

ecQuality ~ truncNorm(6.5, 1.5, 3, 10)
essayBase ~ truncNorm(6.0, 1.8, 3, 10)


1.3 Hooks & Demographics

Hook assignment (non-athlete, non-legacy_dev archetypes)

Per school category, background legacy/donor rates:

| School Category | Legacy Prob | Donor Prob |
|---|---|---|
| elite_nyc_private | 22% | 15% |
| elite_boarding | 18% | 12% |
| boarding_day | 15% | 9% |
| elite_day_school | 14% | 9% |
| international_school | 5% | 2% |
| public_charter_elite | 4% | 2% |
| elite_public_magnet | 3% | 1% |

When a legacy hook is assigned, 25% chance donor is also set. Legacy school drawn randomly from all colleges in tiers 1–3.

Per-school hook profile (from hookProfile JSON field): Each high school can specify first_gen and urm probabilities; applied after category-level hooks.

Additional hook probabilities (all archetypes):

- Consulting client: school.consulting_client_prob (0.03–0.20 by school ivy-placement %) → if client: essay +0.5–1.0, EC +0.3–0.6
- Underrepresented state: 8% baseline
- URM (non-first_gen path): 10% baseline

Income bracket assignment

Base bracket by school category: boarding 5, nyc_private 5, boarding_day 4, elite_day 4, international 3, charter 3, magnet 2. First-gen students get the base bracket − 2. Uniform noise of ±1 is then added, and the result is clamped to 1–5.

Pell eligibility: true if first_gen === true or income_bracket ≤ 2.
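The bracket and Pell rules above can be sketched as below. The function names and the injectable rng are hypothetical, the category keys reuse the shorthand from the text, and the reading of "±1 uniform noise" as an integer drawn uniformly from {−1, 0, +1} is an assumption.

```javascript
// Base brackets quoted from the text; keys reuse its shorthand.
const BASE_BRACKET = {
  boarding: 5, nyc_private: 5, boarding_day: 4, elite_day: 4,
  international: 3, charter: 3, magnet: 2,
};

const clamp = (x, lo, hi) => Math.min(hi, Math.max(lo, x));

function assignIncomeBracket(categoryKey, firstGen, rng = Math.random) {
  let bracket = BASE_BRACKET[categoryKey];
  if (firstGen) bracket -= 2;                 // first-gen shift
  // Assumption: noise is an integer in {-1, 0, +1}, each equally likely.
  const noise = Math.floor(rng() * 3) - 1;
  return clamp(bracket + noise, 1, 5);
}

function isPellEligible(firstGen, incomeBracket) {
  return firstGen === true || incomeBracket <= 2;
}
```

Under this reading a first-gen student from a magnet school always lands in bracket 1, because 2 − 2 plus any noise value clamps to 1.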


1.4 Academic Index (AI)

All scoring is anchored to the Academic Index on a 0–240 scale, mirroring the Ivy League AI formula.

CGS (Converted Grade Score) from unweighted GPA — piecewise linear Ivy table:

| GPA | CGS |
|---|---|
| 4.0 | 80 |
| 3.9 | 79 |
| 3.8 | 78 |
| 3.7 | 77 |
| 3.6 | 73 |
| 3.5 | 71 |
| 3.4 | 68 |
| 3.3 | 66 |
| 3.2 | 64 |
| 3.1 | 62 |
| 3.0 | 60 |
| 2.5 | 50 |

Above 4.0 → 80 (cap). Below 2.5 → linear extrapolation at 20 pts/GPA.

SAT component: (SAT / 20) × 2 → max 160 at 1600.

Academic Index: AI = CGS + (SAT/20)×2, capped at 240.
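A minimal sketch of the Academic Index calculation, assuming linear interpolation between adjacent table rows. computeAcademicIndex is the name used elsewhere in this document; gpaToCgs and CGS_TABLE are hypothetical names for this example.

```javascript
// GPA -> CGS anchor points from the table above, sorted by descending GPA.
const CGS_TABLE = [
  [4.0, 80], [3.9, 79], [3.8, 78], [3.7, 77], [3.6, 73], [3.5, 71],
  [3.4, 68], [3.3, 66], [3.2, 64], [3.1, 62], [3.0, 60], [2.5, 50],
];

function gpaToCgs(gpa) {
  if (gpa >= 4.0) return 80;                      // cap above 4.0
  if (gpa < 2.5) return 50 - (2.5 - gpa) * 20;    // extrapolate at 20 pts per GPA point
  for (let i = 0; i < CGS_TABLE.length - 1; i++) {
    const [gHi, cHi] = CGS_TABLE[i];
    const [gLo, cLo] = CGS_TABLE[i + 1];
    if (gpa >= gLo) {
      // Linear interpolation inside the segment [gLo, gHi].
      return cLo + ((gpa - gLo) / (gHi - gLo)) * (cHi - cLo);
    }
  }
  return 50; // not reached for gpa >= 2.5
}

function computeAcademicIndex(gpa, sat) {
  return Math.min(240, gpaToCgs(gpa) + (sat / 20) * 2);
}

console.log(computeAcademicIndex(4.0, 1600)); // 240
console.log(computeAcademicIndex(3.0, 1400)); // 200
```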


1.5 College List Building

For each student, a utility model ranks all 30 colleges; the top-K are selected for the list.

utility(college) = prestige + fitBonus + legacyBonus + 5 × logPEst

Where:

- prestige: (6 − tier) × 8 + rand() × 12 − 6 (tier 1 → 34–46, centered on 40; tier 5 → 2–14, centered on 8)
- fitBonus: FIT_SCORES[archetype][college] (0–5 scale, see §3)
- legacyBonus: +15 if hooks.legacy === college (makes the legacy school very likely to appear; see the legacy guarantee below)
- logPEst: log of estimated P(admit) using academic factors only (hooks excluded):

rawEst  = clamp(20 + aiDelta × 0.75, 0, 40) + (ec_quality/10)×20 + (essay_base/10)×10 + 2
acLogit = (rawEst − 46) / 20
logPEst = −log(1 + exp(−(acLogit − college.admitThreshold)))
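The utility terms can be put together roughly as follows. The function signatures and the flat argument object are assumptions for illustration; the arithmetic follows the formulas above.

```javascript
const clamp = (x, lo, hi) => Math.min(hi, Math.max(lo, x));

// Log of the estimated admit probability from academics only (hooks excluded).
function logPEst(aiDelta, ecQuality, essayBase, admitThreshold) {
  const rawEst = clamp(20 + aiDelta * 0.75, 0, 40)
    + (ecQuality / 10) * 20
    + (essayBase / 10) * 10
    + 2;
  const acLogit = (rawEst - 46) / 20;
  // -log(1 + exp(-x)) is log(sigmoid(x)), so the result is always <= 0.
  return -Math.log(1 + Math.exp(-(acLogit - admitThreshold)));
}

// Hypothetical argument shape: { tier, fit, isLegacySchool, logP }.
function utility({ tier, fit, isLegacySchool, logP }, rng = Math.random) {
  const prestige = (6 - tier) * 8 + rng() * 12 - 6;
  const legacyBonus = isLegacySchool ? 15 : 0;
  return prestige + fit + legacyBonus + 5 * logP;
}
```

Because logPEst is never positive, the 5 × logPEst term acts purely as a penalty that grows as the estimated admit chance shrinks.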

Target K: K ~ lognormal(log(APP_MEANS[archetype]), σ=0.4), clamped 3–20.

APP_MEANS by archetype: athlete 6, legacy_dev 7, first_gen 6, stem_spike 13, humanities_spike 11, arts_spike 8, average_strong 10, well_rounded 10.
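A sketch of the K draw, under two assumptions the text does not settle: the normal variate comes from a Box-Muller transform, and K is rounded to the nearest integer before clamping. targetListSize and standardNormal are hypothetical names.

```javascript
const APP_MEANS = {
  athlete: 6, legacy_dev: 7, first_gen: 6, stem_spike: 13,
  humanities_spike: 11, arts_spike: 8, average_strong: 10, well_rounded: 10,
};

const clamp = (x, lo, hi) => Math.min(hi, Math.max(lo, x));

// Box-Muller transform (assumed sampler).
function standardNormal(rng = Math.random) {
  const u1 = 1 - rng();
  const u2 = rng();
  return Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
}

function targetListSize(archetype, rng = Math.random) {
  const z = standardNormal(rng);
  const k = Math.exp(Math.log(APP_MEANS[archetype]) + 0.4 * z);
  // Assumption: round to an integer, then clamp to 3-20.
  return clamp(Math.round(k), 3, 20);
}
```

Note that log(APP_MEANS[archetype]) is the median of the lognormal rather than its mean, so with σ = 0.4 the average K before clamping sits about 8% above the listed value (a factor of exp(0.08)).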

Category labels assigned from logPEst:

| logPEst | Category |
|---|---|
| < −2.5 | dream |
| −2.5 to −1.0 | reach |
| −1.0 to −0.3 | target |
| ≥ −0.3 | safety |
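Since logPEst is a natural log of an estimated probability, the cutoffs correspond to estimated admit chances of about 8% (exp(−2.5)), 37% (exp(−1.0)), and 74% (exp(−0.3)). A small sketch of the labeling, assuming half-open intervals with the lower bound inclusive (the table does not say which side owns each boundary); listCategory is a hypothetical name.

```javascript
function listCategory(logPEst) {
  if (logPEst < -2.5) return 'dream';
  if (logPEst < -1.0) return 'reach';    // [-2.5, -1.0)
  if (logPEst < -0.3) return 'target';   // [-1.0, -0.3)
  return 'safety';                       // >= -0.3
}

console.log(listCategory(Math.log(0.05))); // dream  (5% estimated chance)
console.log(listCategory(Math.log(0.50))); // target (50% estimated chance)
```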

Yield-risk flag: yieldRiskSchool = true if college is in {tufts, emory, uchicago, washu, carnegie_mellon} AND student's SAT > college median + 150.

Legacy guarantee: If hooks.legacy school was not in top-K utilities, it is appended to the list with demonstratedInterest = 0.7 to 1.0.


1.6 Round Assignment

Colleges are sorted by tier (ascending) to determine order of round assignment.

ED / REA (binding ED or restrictive REA/SCEA; only one allowed):

- Student applies ED to top-choice if the college offers ED and (archetype is legacy_dev, athlete, or well_rounded, or rand() < 0.50).
- Student applies REA/SCEA to top-choice if the college offers restrictive EA and rand() < 0.45.

EA (non-restrictive, no limit): For each remaining college offering non-restrictive EA, apply EA with probability 65%.

Fallback: If no early round was assigned, the top-choice college is force-assigned ED (if it offers ED) or REA/EA (if it offers EA).

RD: All remaining list items default to Regular Decision.


1.7 EDII Backup Pre-Assignment

After round assignment, if a student has an ED/REA school, the simulation pre-selects an EDII backup:

  1. Filter list items to: has edii=true, currently round='RD', not the ED school.
  2. Prefer candidates exactly one tier below the ED school → then same tier → then any.
  3. Store as student.ediiBackup = college_key.

The EDII conversion from RD → EDII is executed in §2.4 after the ED round resolves.
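The backup selection can be sketched as a filter followed by a preference cascade. The list-item shape, the function name, and two readings are assumptions: "one tier below" is taken to mean a numerically larger tier (edTier + 1), and ties within a preference level go to the first match in list order.

```javascript
// Hypothetical list-item shape: { college, round, edii, tier }.
function pickEdiiBackup(list, edCollege, edTier) {
  const candidates = list.filter(
    (item) => item.edii === true && item.round === 'RD' && item.college !== edCollege
  );
  // Preference order: one tier below the ED school, then same tier, then any.
  const chosen =
    candidates.find((c) => c.tier === edTier + 1) ||
    candidates.find((c) => c.tier === edTier) ||
    candidates[0];
  return chosen ? chosen.college : null;   // null when no eligible school exists
}
```

The returned key would then be stored as student.ediiBackup, or left unset when the function returns null.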


1.8 Final Decision Logic

After all admission rounds, non-committed students choose among their acceptances using a 5-factor weighted score:

| Factor | Weight | Details |
|---|---|---|
| Prestige tier | 0.40 | ((6 − tier) / 5) × 30 → tier 1 = 30 pts, tier 5 = 6 pts |
| Archetype fit | 0.27 | (FIT_SCORES[archetype][college] / 5) × 20 |
| Legacy preference | 0.13 | +15 pts if hooks.legacy === college, else 0 |
| Personal noise | 0.10 | rand() × 10 |
| Financial aid (Chetty) | 0.10 | clamp((relYield − 1.0) × 8, −6, +12) where relYield from CHETTY_YIELD_BY_INCOME[college][income_bracket] |

Student commits to the college with the highest total score.

HYPSM override: If student holds multiple HYPSM acceptances:

- If one was their ED/REA choice → commit to that one.
- Otherwise → highest FIT_SCORE among HYPSM + random noise picks the winner.

Students with zero acceptances receive status = 'rejected_all'.


2. University (Admissions Office) Agent Rules

2.1 Initialization

Each college is initialized with the following derived fields:

Seat allocation by round:

| Round | Formula |
|---|---|
| ED | floor(classSize × ED_FILL[college]) (default 40%) |
| EDII | floor(classSize × 0.08) if college offers EDII |
| EA | floor(classSize × 0.15) (non-restrictive) or × 0.20 (restrictive) |
| RD | classSize − edSeats − ediiSeats (or − eaSeats) |

ED_FILL rates (calibrated from CDS data): Middlebury 68%, Duke 50%, Williams 46%, others default 40%.

School median AI: schoolAI = computeAcademicIndex(college.gpa, (college.sat[0] + college.sat[1]) / 2)

Admit threshold (calibrates the logistic model to real-world RD rates):

eliteBoosts = [_, 3.5, 2.8, 2.2, 2.2, 1.5]   // tier 1–5
poolRateRD  = min(0.60, (college.rateRD / 100) × eliteBoosts[tier])
admitThreshold = log((1 − poolRateRD) / poolRateRD)

A higher threshold means harder admissions. For example, Harvard (tier 1, ~3% RD) has a high threshold; UVA (tier 4, ~22% RD) has a lower one.
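To make the two examples concrete, here is a small sketch that evaluates the threshold formula for the rates quoted above. The function name and the use of a null placeholder at index 0 are choices made for this example only.

```javascript
// Index = tier (1-5); index 0 is unused.
const ELITE_BOOSTS = [null, 3.5, 2.8, 2.2, 2.2, 1.5];

function admitThreshold(rateRDPercent, tier) {
  const poolRateRD = Math.min(0.60, (rateRDPercent / 100) * ELITE_BOOSTS[tier]);
  return Math.log((1 - poolRateRD) / poolRateRD);
}

console.log(admitThreshold(3, 1).toFixed(2));   // "2.14" for a tier-1 school at ~3% RD
console.log(admitThreshold(22, 4).toFixed(2));  // "0.06" for a tier-4 school at ~22% RD
```

Because poolRateRD is capped at 0.60, the threshold can never fall below log(0.4 / 0.6) ≈ −0.41.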

Yield protection loaded from YIELD_PROTECTION constant (see §3).


2.2 Admission Scoring Pipeline

computeAdmissionScore(college, student, round, essayQ) returns P(admit) ∈ [0,1].

All factors are additive in logit (log-odds) space, then passed through a sigmoid.

Step 1 — Academic components

studentAI    = computeAcademicIndex(student.gpa_uw, student.sat)
aiDelta      = studentAI − college.schoolAI
academicScore= clamp(20 + aiDelta × 0.75, 0, 40)
ecScore      = (student.ec_quality / 10) × 20
ecBonus      = 8 if ec_quality ≥ 9.0, else 3 if ec_quality ≥ 7.5, else 0
essayScore   = (essayQ / 10) × 10
fitScore     = FIT_SCORES[archetype][college] ∈ {0,1,2,3,4,5}

componentsRaw = academicScore + ecScore + ecBonus + essayScore + fitScore
academicLogit = (componentsRaw − 46) / 20

Interpretation: A raw score of 46 → logit 0 → 50% base probability (before threshold subtraction). ±20 raw ≈ ±1 logit unit.
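Step 1 translates directly into a small function. The sketch below takes the fit score as a plain number instead of looking it up in FIT_SCORES, and the function name is hypothetical.

```javascript
const clamp = (x, lo, hi) => Math.min(hi, Math.max(lo, x));

function academicLogit(aiDelta, ecQuality, essayQ, fitScore) {
  const academicScore = clamp(20 + aiDelta * 0.75, 0, 40);
  const ecScore = (ecQuality / 10) * 20;
  const ecBonus = ecQuality >= 9.0 ? 8 : ecQuality >= 7.5 ? 3 : 0;
  const essayScore = (essayQ / 10) * 10;
  const componentsRaw = academicScore + ecScore + ecBonus + essayScore + fitScore;
  return (componentsRaw - 46) / 20;
}

// A student at the school median AI with EC 8, essay 7, fit 3:
// 20 + 16 + 3 + 7 + 3 = 49 raw, so the logit is about 0.15.
console.log(academicLogit(0, 8, 7, 3));
```

The raw score tops out at 40 + 20 + 8 + 10 + 5 = 83, so the academic logit cannot exceed 1.85 before the other terms are added.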

Step 2 — Feeder school bonus

feederLogit = log(student.feeder_bonus)

Range: 1.0× (public) to 2.5× (Collegiate School) → 0 to +0.92 logit.

Step 3 — Hooks (ALDC)

All hooks are additive in logit space (prevents exponential blow-up from stacking):

| Hook | HYPSM (T1) | Ivy+ (T2) | Near-Ivy (T3) | Selective (T4) | LACs (T5) |
|---|---|---|---|---|---|
| donor | log(7.5) | log(6.0) | log(4.5) | log(3.0) | log(2.5) |
| recruited_athlete | log(4.5) | log(4.0) | log(3.5) | log(2.5) | log(2.0) |
| legacy | log(5.7) | log(4.0) | log(3.0) | log(2.0) | log(2.0) |
| first_gen | log(1.4) | log(1.4) | log(1.4) | log(1.4) | log(1.4) |
| pell_eligible | log(1.25) | log(1.25) | log(1.25) | log(1.25) | log(1.25) |
| urm | log(1.2) | log(1.2) | log(1.2) | log(1.2) | log(1.2) |
| underrepresented_state | log(1.3) | log(1.3) | log(1.3) | log(1.3) | log(1.3) |

Calibrated from SFFA v. Harvard trial data (Arcidiacono expert testimony). Reference ALDC rates at Harvard: athlete 86%, donor 42%, legacy 33%, unhooked 5.6%.

Gender multiplier (added to hookLogit):

| School Type | Male | Female |
|---|---|---|
| stem_heavy | 1.0× | 1.9× |
| balanced | 1.0× | 1.05× |
| lac | 1.25× | 1.0× |

Caltech uses stem_heavy (female 1.9×), Williams uses lac (male 1.25×).

Income residual (Chetty 2023, unhooked students only):

- Income bracket 5 (high): +log(1.15)
- Income bracket 1 (low): +log(0.92)
- Brackets 2–4: no adjustment

Step 4 — Yield protection penalty

Applies to 9 schools: Tufts, Emory, WashU, Carnegie Mellon, Middlebury, Boston College, Georgetown, Michigan, UVA.

Condition: college.yieldProtection = true AND aiDelta > 25 AND student has no major hooks (donor/athlete/legacy).

pen = yieldProtectionStrength × min(1, (aiDelta − 25) / 40)
yieldPenalty = −pen × 2   (in logit units)

Strength values: Tufts 0.35, WashU 0.30, Emory 0.25, Middlebury 0.20, Carnegie Mellon 0.20, Boston College 0.15, Georgetown 0.15, Michigan 0.10, UVA 0.10.
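A sketch of the penalty, assuming the hook flags are read as truthy values on a hooks object. The field names yieldProtection and yieldProtectionStrength follow the text; the function name and argument shapes are hypothetical.

```javascript
function yieldPenalty(college, aiDelta, hooks) {
  const hasMajorHook = Boolean(hooks.donor || hooks.recruited_athlete || hooks.legacy);
  if (!college.yieldProtection || aiDelta <= 25 || hasMajorHook) return 0;
  // Ramps linearly from 0 at aiDelta = 25 to full strength at aiDelta = 65.
  const pen = college.yieldProtectionStrength * Math.min(1, (aiDelta - 25) / 40);
  return -pen * 2;   // logit units
}

const tufts = { yieldProtection: true, yieldProtectionStrength: 0.35 };
console.log(yieldPenalty(tufts, 45, {}));   // about -0.35 (halfway up the ramp)
console.log(yieldPenalty(tufts, 80, {}));   // about -0.70 (maximum for strength 0.35)
```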

Step 5 — Round multiplier

| Round | roundLogit |
|---|---|
| ED / REA | log(clamp(rateE / rate, 1.2, 4.5)) |
| EDII | log(clamp(rateE / rate × 0.65, 1.1, 3.0)) |
| EA | log(clamp(rateE / rate × 0.80, 1.0, 2.5)) |
| RD | 0 |

rateE = college's ED/EA rate, rate = overall acceptance rate.
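The round boost can be sketched as a lookup on the round name. The function name is hypothetical, and the 15% early / 5% overall rates in the usage lines are made-up illustration values, not data from the simulation.

```javascript
const clamp = (x, lo, hi) => Math.min(hi, Math.max(lo, x));

function roundLogit(round, rateE, rate) {
  const ratio = rateE / rate;
  switch (round) {
    case 'ED':
    case 'REA':
      return Math.log(clamp(ratio, 1.2, 4.5));
    case 'EDII':
      return Math.log(clamp(ratio * 0.65, 1.1, 3.0));
    case 'EA':
      return Math.log(clamp(ratio * 0.80, 1.0, 2.5));
    default:
      return 0;   // RD gets no boost
  }
}

// Hypothetical college: 15% early rate, 5% overall rate (ratio 3).
console.log(roundLogit('ED', 15, 5));    // log(3)
console.log(roundLogit('EDII', 15, 5));  // log(1.95)
console.log(roundLogit('EA', 15, 5));    // log(2.4)
```

The lower clamps mean an early applicant always gets at least log(1.2) in ED/REA and log(1.1) in EDII, even at a college whose early rate is no better than its overall rate.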

Step 6 — Holistic noise

noise = (rand() + rand() − 1) × 1.2   // triangular on [−1.2, +1.2], mean 0, SD ≈ 0.49

Approximates the subjective component of holistic review (~±1 logit, effectively ±25% probability swing).

Step 7 — Logistic combination

logit = academicLogit + feederLogit + hookLogit + yieldPenalty + roundLogit + noise
        − college.admitThreshold
P(admit) = sigmoid(logit) = 1 / (1 + exp(−logit))
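Steps 6 and 7 can be sketched as below. The parts object and the function names are hypothetical; the example values (feeder bonus 1.5×, first_gen hook, threshold 2.14) are chosen for illustration only.

```javascript
// Sum of two uniforms, shifted and scaled: bounded to [-1.2, +1.2].
function holisticNoise(rng = Math.random) {
  return (rng() + rng() - 1) * 1.2;
}

function sigmoid(x) {
  return 1 / (1 + Math.exp(-x));
}

// parts: { academicLogit, feederLogit, hookLogit, yieldPenalty, roundLogit, noise }
function admitProbability(parts, admitThreshold) {
  const logit = parts.academicLogit + parts.feederLogit + parts.hookLogit
    + parts.yieldPenalty + parts.roundLogit + parts.noise
    - admitThreshold;
  return sigmoid(logit);
}

// Illustrative RD applicant with the noise term set to zero.
const p = admitProbability({
  academicLogit: 0.15,
  feederLogit: Math.log(1.5),
  hookLogit: Math.log(1.4),
  yieldPenalty: 0,
  roundLogit: 0,
  noise: 0,
}, 2.14);
console.log(p.toFixed(2));   // "0.22"
```

Adding the terms in logit space is equivalent to multiplying the applicant's odds by each factor, which is why the hook table lists its entries as log(multiplier).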

2.3 Round Processing

Rounds execute in order: ED → EA/REA → EDII → RD.

For each college in each round:

  1. Collect applicants: ED pool, EA+REA pool, EDII pool, or RD+deferred pool.
  2. Filter: Remove students already committed from binding rounds.
  3. Score: Call computeAdmissionScore() for every applicant → stored as app.admission_score.
  4. Sort: Descending by admission_score.
  5. Stochastic Bernoulli admission: For each applicant in order (P = admission_score):
     - If seatsForRound > admitCount AND rand() < P → ACCEPT
     - Else if ED/EA AND P > 0.15 AND deferCount < 30% of apps → DEFER (goes to RD pool)
     - Else if RD AND P > 0.05 AND waitlistCount < 15% of apps → WAITLIST
     - Else → REJECT

Seat caps:

- ED: edSeats
- EA: eaSeats
- EDII: ediiSeats
- RD: seatsTotal − enrolled.length (remaining after all prior rounds)

Binding rounds (ED, EDII): Accepted students are immediately committed (status = 'committed', committed_to = college_key). No further applications considered.
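The sort and per-applicant decision in steps 4 and 5 can be sketched for a single college and round as follows. This reads the four branches as accept, defer, waitlist, and reject. The function name and the { id, p } application shape are hypothetical, and treating REA like EA for deferrals is an assumption based on the shared EA+REA pool.

```javascript
// apps: [{ id, p }] where p is the stored admission_score.
function processRoundSketch(apps, round, seatsForRound, rng = Math.random) {
  const sorted = [...apps].sort((a, b) => b.p - a.p);   // step 4: descending score
  const total = sorted.length;
  const out = { ACCEPT: [], DEFER: [], WAITLIST: [], REJECT: [] };
  // Assumption: REA applicants can be deferred the same way as ED/EA applicants.
  const canDefer = round === 'ED' || round === 'EA' || round === 'REA';

  for (const app of sorted) {
    let decision = 'REJECT';
    if (out.ACCEPT.length < seatsForRound && rng() < app.p) {
      decision = 'ACCEPT';                               // Bernoulli draw against P
    } else if (canDefer && app.p > 0.15 && out.DEFER.length < 0.30 * total) {
      decision = 'DEFER';                                // later joins the RD pool
    } else if (round === 'RD' && app.p > 0.05 && out.WAITLIST.length < 0.15 * total) {
      decision = 'WAITLIST';
    }
    out[decision].push(app.id);
  }
  return out;
}
```

Because admission is a random draw rather than a strict cutoff, a high-scoring applicant can miss while a lower-scoring one gets in, as long as seats remain.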


2.4 EDII Conversion

Runs after the ED round, before the EDII round processes.

For each non-committed student with ediiBackup set:

  1. Find the existing RD application for the backup school.
  2. Convert it: app.round = 'EDII', move from RD pool to EDII pool.
  3. Skip EDII with 40% probability if student already holds an EA acceptance.

Fallback (students without ediiBackup): Probabilistically convert a random eligible RD school to EDII with 30% chance. Only one EDII school per student.


2.5 Waitlist Resolution

Runs after student final decisions. Up to 5 cascade iterations.

Within each iteration, colleges are processed in prestige order (tier 1 first), so higher-ranked schools poach from lower-ranked schools' committed students.

For each college:

  1. Skip if enrolled ≥ seatsTotal.
  2. Calculate deficit = seatsTotal − enrolled.length; skip if deficit / seatsTotal ≤ 5%.
  3. Pull from waitlist using per-school pullRate (from WAITLIST_DATA):
       scaleFactor = studentsPerSchool / 20
       toPull = min(deficit, round(pullRate × typicalPulled × scaleFactor))
  4. For each waitlisted student (up to toPull): if student is still uncommitted, admit, trigger studentFinalDecisions() for re-evaluation, then continue cascade.

Waitlist pull rates range from 1% (Carnegie Mellon, Michigan) to 10% (Brown, Duke).
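The pull-count arithmetic in steps 1 to 3 can be sketched as below. The function name and the flat argument object are hypothetical, enrolled is passed as a count, and the seat numbers in the usage line are made up; the pullRate and typicalPulled values are the Brown figures from WAITLIST_DATA.

```javascript
function waitlistPullCount({ seatsTotal, enrolled, pullRate, typicalPulled }, studentsPerSchool) {
  if (enrolled >= seatsTotal) return 0;            // step 1: class is full
  const deficit = seatsTotal - enrolled;
  if (deficit / seatsTotal <= 0.05) return 0;      // step 2: shortfall too small to act on
  const scaleFactor = studentsPerSchool / 20;
  return Math.min(deficit, Math.round(pullRate * typicalPulled * scaleFactor));
}

// Brown-like settings with a hypothetical 15-seat shortfall out of 100:
console.log(waitlistPullCount(
  { seatsTotal: 100, enrolled: 85, pullRate: 0.10, typicalPulled: 80 }, 20
)); // 8
```

Under this reading, a school with a small product of pullRate and typicalPulled (for example 0.03 × 10 = 0.3) rounds down to zero pulls at the default studentsPerSchool of 20.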


3. Shared Constants Reference

FIT_SCORES (archetype × college, scale 1–5)

5 = perfect fit, 3 = good match, 1 = weak fit. Selected examples:

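| College | stem_spike | humanities_spike | athlete | arts_spike | first_gen |
|---|---|---|---|---|---|
| MIT | 5 | 1 | 1 | 1 | 2 |
| Caltech | 5 | 1 | 1 | 1 | 2 |
| Stanford | 4 | 3 | 3 | 3 | 3 |
| UChicago | 3 | 5 | 1 | 3 | 2 |
| Georgetown | 2 | 5 | 2 | 2 | 3 |
| Williams | 3 | 5 | 3 | 4 | 3 |
| UCLA | 4 | 3 | 4 | 4 | 4 |
| Notre Dame | 2 | 4 | 4 | 2 | 3 |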

College Tiers

| Tier | Group | Colleges |
|---|---|---|
| 1 | HYPSM | Harvard, Yale, Princeton, Stanford, MIT |
| 2 | Ivy+ | Columbia, UPenn, Brown, Dartmouth, Cornell, Duke, Northwestern, UChicago, Caltech |
| 3 | Near-Ivy | Johns Hopkins, Vanderbilt, Rice, Notre Dame, Georgetown, Carnegie Mellon, WashU |
| 4 | Selective | Emory, Tufts, Boston College, UVA, UCLA, Michigan |
| 5 | Top LACs | Williams, Amherst, Middlebury |

CHETTY_YIELD_BY_INCOME

Per-school, per-income-bracket relative yield (1.0 = national average). Used in student final decision scoring. Source: Opportunity Insights / Kaggle elite-college-admissions dataset.

Brackets: 1 = < $20K, 2 = $20–40K, 3 = $40–60K, 4 = $60–80K, 5 = $80K+.

Examples: Harvard bracket-5 yield: 1.953× (high-income students enroll at 1.95× national rate). UCLA bracket-5: 0.48× (high-income students significantly less likely to attend than average).
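Plugging these two example yields into the financial-aid term from §1.8 gives a sense of scale. The function name is hypothetical; the formula and the two relYield values are the ones quoted in this document.

```javascript
const clamp = (x, lo, hi) => Math.min(hi, Math.max(lo, x));

// Financial-aid points from relative yield, per the §1.8 formula.
function financialAidPoints(relYield) {
  return clamp((relYield - 1.0) * 8, -6, 12);
}

console.log(financialAidPoints(1.953).toFixed(2)); // "7.62"  (Harvard, bracket 5)
console.log(financialAidPoints(0.48).toFixed(2));  // "-4.16" (UCLA, bracket 5)
```

The clamps only engage at the extremes: relative yields above 2.5 are capped at +12 points, and yields below 0.25 are floored at −6.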

YIELD_PROTECTION (Tufts Syndrome)

| College | Strength |
|---|---|
| Tufts | 0.35 |
| WashU | 0.30 |
| Emory | 0.25 |
| Middlebury | 0.20 |
| Carnegie Mellon | 0.20 |
| Boston College | 0.15 |
| Georgetown | 0.15 |
| Michigan | 0.10 |
| UVA | 0.10 |

Penalty activates only when aiDelta > 25 (student significantly overqualified) and student lacks major hooks.

WAITLIST_DATA (selected)

| College | pullRate | typicalPulled |
|---|---|---|
| Harvard | 3% | 10 |
| Brown | 10% | 80 |
| Duke | 10% | 100 |
| Princeton | 8% | 50 |
| Michigan | 1% | 60 |
| Carnegie Mellon | 1% | 50 |

4. Simulation Execution Order

1. initColleges()          → Compute schoolAI, admitThreshold, seat allocation, yield protection config
2. generateStudents()      → Correlated GPA/SAT, archetypes, hooks, income, feeder bonus; inject tracked students
3. buildCollegeLists()     → Utility model selects top-K colleges per student
4. assignRounds()          → ED/REA/EA/EDII/RD assignment + EDII backup pre-selection
5. createApplications()    → Essay quality roll (essay_base ± rand); applications pushed to college queues
6. processRound('ED')      → Stochastic Bernoulli; binding admits committed immediately
7. handleEDIIApplicants()  → Convert ediiBackup from RD → EDII
8. processRound('EA')      → EA + REA queues; deferred go to RD
9. processRound('EDII')    → Binding admits
10. processRound('RD')     → Main pool + deferred; waitlist assignments
11. studentFinalDecisions()→ Multi-factor choice among acceptances
12. resolveWaitlist()      → Up to 5 yield-triggered cascade iterations

Each step fires SIM.events entries (type: 'decision' or 'commit') consumed by the D3 arc visualization and the Tracked Student tab timeline.