Use case · Healthcare

When the risk model under-serves the patient.

Clinical risk scores and resource-allocation models trained on historically unequal care reproduce that inequality: under-estimating risk for the groups the system saw less of, or treated differently. In medicine the cost of a biased score is measured in missed diagnoses.

The published heart-disease result

0.44 vs 0.44

Risk estimates for men and women after Rosa, healthy patients (0.54 vs 0.54 for patients with heart disease): the model's gender gap effectively eliminated.

Source: Demonstrating Rosa white paper, Cleveland heart-disease dataset · methodology

In the published Demonstrating Rosa white paper, a logistic regression model predicting heart disease on the public Cleveland dataset assigned women lower risk estimates than men, whether or not they actually had heart disease, mirroring the documented under-diagnosis of heart disease in women. After Rosa debiased the data, the same model assigned men and women equal estimates: standardised gender bias fell from 0.30 to 0.04 for healthy patients and from 0.20 to 0.05 for patients with disease.
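To make the comparison above concrete, here is a minimal sketch of how a gender gap in risk estimates could be measured separately for healthy patients and patients with disease. It is not Rosa's code and does not reproduce the white paper's exact metric: the definition used here (difference in group means divided by the standard deviation of the combined scores), the function names `standardised_gap` and `gap_by_outcome`, and the toy records are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def standardised_gap(scores_a, scores_b):
    """Difference in mean risk estimate between two groups, expressed in
    units of the combined scores' standard deviation.
    ASSUMED definition; the white paper may standardise differently."""
    spread = pstdev(list(scores_a) + list(scores_b))
    if spread == 0:
        return 0.0  # every score is identical, so there is no gap
    return (mean(scores_a) - mean(scores_b)) / spread


def gap_by_outcome(records):
    """records: iterable of (risk_estimate, gender, has_disease) tuples.
    Returns {has_disease: standardised gap (men minus women)}, so the gap
    is measured among patients with the same true outcome."""
    records = list(records)
    gaps = {}
    for outcome in (False, True):
        men = [r for r, g, d in records if g == "M" and d == outcome]
        women = [r for r, g, d in records if g == "F" and d == outcome]
        if men and women:  # skip strata where one group is missing
            gaps[outcome] = standardised_gap(men, women)
    return gaps


# HYPOTHETICAL model outputs, invented for this example.
# These are not values from the Cleveland dataset or the white paper.
toy_records = [
    (0.50, "M", False), (0.46, "M", False),
    (0.40, "F", False), (0.36, "F", False),
    (0.62, "M", True), (0.58, "M", True),
    (0.52, "F", True), (0.48, "F", True),
]

gaps = gap_by_outcome(toy_records)
for has_disease, gap in gaps.items():
    label = "with disease" if has_disease else "healthy"
    print(f"{label}: standardised gap = {gap:.2f}")
```

Splitting by true outcome is the point of the sketch: a gap that persists among patients with the same diagnosis cannot be explained by a difference in disease rates between the groups.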

The clinically useful signal stays; the demographic signal goes. This result is a demonstration on a public dataset and is presented as such. Read the white paper (PDF).

Who this is for

Health systems, medical AI builders, and insurers. Under the EU AI Act, AI in essential services and medical contexts sits in high-risk territory where Article 10's dataset duties apply, and the patient-safety and fundamental-rights framing makes dataset bias a clinical governance issue, not just a compliance one.

Rosa examines the dataset, removes the recoverable demographic signal, preserves the clinical statistics, and emits the evidence, so the model your clinicians rely on is defensible at the data layer.

Make the risk model defensible at the data layer.

Diagnose first: measure what your clinical data encodes. Free to try.