Use case · Hiring and HR

The CV never says "gender". The data says it anyway.

Recruitment shortlisting and scoring run on data dense with proxies: career gaps that encode parental leave, postcodes that encode ethnicity, schools and names that encode class and origin. Deleting the protected column does not delete the signal.

How bias hides in hiring data

A scoring model, or a recruiter working from a ranked list, reads the proxies and reconstructs what the form never asked. The result is the familiar one: systematically lower scores for groups the historical data treated worse, laundered through apparently neutral features.

The regulatory direction is clear. The EU AI Act lists recruitment and worker management among its high-risk uses (Annex III), which brings Article 10's duties to examine datasets for possible bias and to mitigate it. New York City's Local Law 144 already requires bias audits of automated employment decision tools.

What Rosa does

Point Diagnose at your candidate dataset and name one protected attribute. Rosa measures how recoverable that attribute is from the rest of the data and scores each column's contribution: career-gap fields, location fields, education fields, wherever the signal hides. You do not need to label proxies in advance. The adversarial method finds them, because any leak is something the discriminator can exploit.
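To make "recoverable" concrete, here is a minimal sketch of the idea, not Rosa's implementation or API. It assumes a much simpler stand-in for the adversarial method: each column is scored on its own by how well it separates the two groups of a binary protected attribute, using a rank-based AUC. The data is synthetic, and the column names (`career_gap_months`, `typing_speed_wpm`) and the `leakage` helper are hypothetical.

```python
import random


def auc(scores, labels):
    """Chance that a random group-1 row outranks a random group-0 row (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def leakage(column, protected):
    """Rescale AUC so 0.0 means no signal and 1.0 means the column fully reveals the attribute."""
    return abs(auc(column, protected) - 0.5) * 2


rng = random.Random(0)
n_rows = 400

# Synthetic binary protected attribute (illustrative only, never shown to a scoring model).
protected = [rng.randint(0, 1) for _ in range(n_rows)]

candidates = {
    # A proxy by construction: group 1 has longer career gaps on average.
    "career_gap_months": [rng.gauss(6 + 8 * g, 4) for g in protected],
    # Generated independently of the protected attribute.
    "typing_speed_wpm": [rng.gauss(60, 10) for _ in protected],
}

report = {name: leakage(col, protected) for name, col in candidates.items()}
for name, score in sorted(report.items(), key=lambda kv: -kv[1]):
    print(f"{name:20s} leakage={score:.2f}")
```

A per-column score like this only catches proxies that leak on their own. A trained discriminator can also exploit combinations of columns that look harmless individually, which is the case the adversarial approach described above is meant to cover.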

Remove then transforms the data so the signal is gone while every column keeps its distribution, and the run leaves the manifest and report your auditors and Local Law 144 assessors want to see.

1.81 to 0.13: standardised age bias in a model's absenteeism estimates for the highest-absence employees, before and after Rosa (0.69 to 0.20 for everyone else).
Source: Demonstrating Rosa white paper, UCI Absenteeism at Work dataset.

The published evidence for the HR domain comes from the Demonstrating Rosa white paper. A simple model predicting employee absenteeism (a common screening signal) gave over-45s markedly higher absence estimates than under-35s, even when the older employees were actually absent less. After Rosa debiased the data with respect to age, that gap effectively disappeared, making the dataset safe for automated screening without age discrimination. Read the white paper (PDF).
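For readers who want a feel for what a standardised gap between two age groups looks like, the sketch below computes one common version: the difference in group means divided by the pooled standard deviation. This is an assumption about the general shape of such a metric, not the white paper's definition, which may differ. The numbers are made up for illustration and are not taken from the white paper or the UCI dataset.

```python
from math import sqrt
from statistics import mean, stdev


def standardised_gap(group_a, group_b):
    """Difference in group means, in units of the pooled standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = (
        (na - 1) * stdev(group_a) ** 2 + (nb - 1) * stdev(group_b) ** 2
    ) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Made-up predicted absence hours per employee (illustrative only).
over_45_before = [9.0, 10.0, 11.0, 12.0, 10.5]
under_35_before = [6.0, 7.0, 8.0, 7.5, 6.5]
over_45_after = [8.0, 9.0, 7.0, 8.5, 7.5]
under_35_after = [7.5, 8.5, 7.0, 8.0, 8.5]

gap_before = standardised_gap(over_45_before, under_35_before)
gap_after = standardised_gap(over_45_after, under_35_after)
print(f"before: {gap_before:.2f}  after: {gap_after:.2f}")
```

A value near zero means the model's estimates for the two groups are about the same relative to their spread. A value above one means the group means sit more than a full standard deviation apart.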

Measure your hiring data before your auditor does.

Diagnose reports the bias and the proxies carrying it. Free to try.