Transparency

The calibration agent, explained.

How FairLens's calibration agent learns from you — not the other way round. We show you everything: the prompts, the metrics, the citations, and what we deliberately don't claim.

What it does

The calibration agent is a pre-launch tool. Before a single real applicant is scored, you teach the system how YOU evaluate applications, by grading a small cohort of synthetic ones. Then the system shows you which of your criteria are actually driving your decisions, and which need to change.

Four steps, in order:

  1. Generate a synthetic cohort — ~20 fake-but-plausible applicants spanning clear accepts, clear rejects, borderlines, and lopsided edge cases.
  2. Blind grading — you grade each one with a per-criterion score and a one-sentence rationale. You never see the AI's score until you've submitted yours.
  3. Analyze — the system scores all synthetics with your draft criteria, then compares its scores against yours.
  4. Propose — you get a diff: refined criterion descriptions, suggested weight changes, suggested cutoff, and flagged missing or redundant criteria. Every change cites which of your rationales drove it. You accept, reject, or edit — per criterion or in bulk.

What we measure

Four agreement metrics anchor the proposal. We surface the numbers, not a marketing claim.

  • Decision accuracy: how often the AI's accept/reject matched yours on the synthetic cohort.
  • Per-criterion correlation (Pearson): for each criterion, how closely your scores tracked the AI's. We show this per criterion, not as a single blended number.
  • Confusion matrix: counts of agreement and disagreement, broken down into true accepts, true rejects, false accepts, and false rejects. You can see whether the AI tends to over-accept or over-reject before you make changes.
  • Overall confidence: a coarse high/medium/low band based on sample size and the noise in the agreement signal. With fewer than 12 grades, the system refuses to propose weight changes and offers only description refinements.
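
To make the first three metrics concrete, here is a minimal sketch of how they could be computed from paired manager and AI results. The function names are illustrative assumptions, not FairLens's actual code, and the sketch assumes the manager's decision is the reference: a "false accept" is an applicant the AI accepted and the manager rejected.

```python
from math import sqrt


def decision_accuracy(manager, ai):
    """Fraction of applicants where the two accept/reject decisions match."""
    return sum(m == a for m, a in zip(manager, ai)) / len(manager)


def confusion_counts(manager, ai):
    """Count the four outcomes, using the manager's decision as the reference."""
    counts = {"true_accept": 0, "true_reject": 0,
              "false_accept": 0, "false_reject": 0}
    for m, a in zip(manager, ai):
        if a and m:
            counts["true_accept"] += 1
        elif not a and not m:
            counts["true_reject"] += 1
        elif a and not m:
            counts["false_accept"] += 1   # AI accepted, manager rejected
        else:
            counts["false_reject"] += 1   # AI rejected, manager accepted
    return counts


def pearson(xs, ys):
    """Pearson correlation for one criterion; None if either side is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return None
    return cov / (spread_x * spread_y)


# Hypothetical five-applicant cohort (True = accept).
manager = [True, True, False, False, True]
ai = [True, False, False, True, True]
accuracy = decision_accuracy(manager, ai)
matrix = confusion_counts(manager, ai)
```

In this made-up cohort the two graders agree on three of five applicants, and the matrix shows one false accept and one false reject, so neither over-accepting nor over-rejecting dominates.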

How the AI's suggestions are derived

Every suggested change has a traceable origin. Specifically:

  • Weight changes come from a logistic regression of your accept/reject decisions against the AI's per-criterion scores. Not magic — a named statistical method. The proposal includes a confidence interval.
  • Cutoff changes come from finding the score threshold that maximizes you-vs-AI decision agreement on the cohort you graded.
  • Criterion-description rewrites are LLM-generated, but every rewrite cites which of your rationales drove it. You see the citation. You can reject the rewrite.
  • Missing-criteria candidates are themes the LLM extracts from your rationales — again, with citations back to your text. If you can't see the evidence, the system doesn't propose the change.
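
The cutoff rule above can be sketched as a simple threshold search. This is an illustration under stated assumptions, not the production implementation: `best_cutoff` is a hypothetical name, an applicant is accepted when the AI's total score is at or above the cutoff, and ties go to the lowest candidate cutoff.

```python
def best_cutoff(ai_scores, manager_accepts):
    """Return (cutoff, agreement) for the cutoff that best matches the manager.

    An applicant is treated as accepted when score >= cutoff.
    """
    # Try every observed score, plus one value above the maximum
    # so that "reject everyone" is also a candidate.
    candidates = sorted(set(ai_scores))
    candidates.append(max(ai_scores) + 1)

    best = None
    for cutoff in candidates:
        matches = sum((score >= cutoff) == accepted
                      for score, accepted in zip(ai_scores, manager_accepts))
        agreement = matches / len(ai_scores)
        # Strict '>' keeps the lowest cutoff when two candidates tie.
        if best is None or agreement > best[1]:
            best = (cutoff, agreement)
    return best


# Hypothetical graded cohort: AI total scores and the manager's decisions.
scores = [55, 62, 70, 71, 80, 90]
accepts = [False, False, True, False, True, True]
cutoff, agreement = best_cutoff(scores, accepts)
```

On this made-up cohort the search settles on a cutoff of 70, which matches the manager on five of six applicants. With only a handful of grades, several cutoffs can tie, which is one reason the result depends on the cohort you graded.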

What we don't claim

We don't say the proposed rubric is optimal or best. We say it is manager-aligned — it predicts YOUR decisions on the cohort YOU graded, with the measured agreement we surface. If a different manager grades the same cohort, the proposal can shift. The cohort and the grader are inseparable from the result.

What we never do

  • Synthetic applicants never appear in your dashboards, cohort stats, exports, or applicant directory. They live in a sibling table and stay there.
  • We don't send essay text, criterion descriptions, or form data in telemetry events. Telemetry carries only IDs (program, session, application, user).
  • We don't auto-apply changes. Every proposed change requires a manager click.
  • We don't replace human review. The reviewer-grading flow and the Studio's drift check (Mode B) are unchanged.
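
As an illustration of the IDs-only telemetry rule, the sketch below builds an event from exactly four ID fields and rejects anything else. The field names and the `build_event` helper are assumptions made for this example, not the actual FairLens schema.

```python
from dataclasses import dataclass, asdict

# Hypothetical field names for the four IDs telemetry may carry.
ALLOWED_FIELDS = {"program_id", "session_id", "application_id", "user_id"}


@dataclass(frozen=True)
class TelemetryEvent:
    program_id: str
    session_id: str
    application_id: str
    user_id: str


def build_event(raw: dict) -> dict:
    """Return an IDs-only event, refusing any field outside the allow-list."""
    extra = set(raw) - ALLOWED_FIELDS
    if extra:
        raise ValueError(f"non-ID fields not allowed in telemetry: {sorted(extra)}")
    return asdict(TelemetryEvent(**raw))


ids = {"program_id": "p1", "session_id": "s1",
       "application_id": "a1", "user_id": "u1"}
event = build_event(ids)
```

An allow-list fails closed: a new field such as essay text raises an error instead of silently reaching the telemetry pipeline.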

Data residency & compliance

FairLens runs on Azure South Africa North. Calibration data — including synthetic cohorts and your rationales — sits in the same Postgres database as the rest of your program data, in the same region. We name NDPR (Nigeria), POPIA (South Africa), Kenya DPA, and Ghana DPA on the security and privacy pages.

One nuance: LLM calls for cohort generation and analysis go to Azure OpenAI in Sweden Central (EU), so the text in those prompts is processed outside South Africa. The Azure OpenAI service does not use prompts or completions to train models. We document this on the privacy page.
