The product, explained

Diagnose. Remove. Prove.

Three things Rosa does to a dataset, and one thing it never does: change the distribution of your columns.

01 / DIAGNOSE

Measure how recoverable the protected attribute is

You supply a CSV and a short JSON configuration naming one protected attribute (the bias_columns entry), any columns to ignore, and which columns are categorical. That is the whole setup.
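
As a rough illustration, a configuration of this shape could be produced from Python as follows. Only bias_columns is a key name taken from this page; ignore_columns and categorical_columns are hypothetical names standing in for whatever the API guide in the portal specifies, and the column names are invented for the example.

```python
import json


def build_config(protected, ignore=(), categorical=()):
    """Build an illustrative Rosa-style configuration dict.

    Only "bias_columns" is a key named on this page; the other two key
    names are assumptions made for this sketch.
    """
    if not protected:
        raise ValueError("one protected attribute is required")
    return {
        "bias_columns": [protected],               # one protected attribute per run
        "ignore_columns": list(ignore),            # hypothetical key name
        "categorical_columns": list(categorical),  # hypothetical key name
    }


config = build_config("race", ignore=["id"], categorical=["race", "sex"])
config_json = json.dumps(config, indent=2)
print(config_json)
```

Check the real key names against the API guide before submitting a job.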

Rosa runs the Fair Adversarial Network (FAN), the proven adversarial method at Rosa's core: a discriminator repeatedly tries to recover the protected attribute from the rest of the data. How well it succeeds is a direct measure of how much bias the data encodes, whether that signal is carried directly or through proxies. A proxy is a feature that stands in for the protected attribute, such as a postcode for race or an employment gap for gender.

No proxy labelling required. You name only the protected attribute. Rosa finds the proxies itself, and scores each column's contribution to the encoded bias.
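
To make the idea of "recoverability" concrete, here is a deliberately simple stand-in, not the FAN itself. It guesses the protected attribute from one column at a time using a lookup table and compares the result with always guessing the majority group. The function names, the toy rows, and the "lift" score are all assumptions made for this sketch; Rosa's discriminator and per-column scoring are not described on this page.

```python
from collections import Counter, defaultdict


def majority_accuracy(labels):
    """Accuracy of always guessing the most common label."""
    return Counter(labels).most_common(1)[0][1] / len(labels)


def column_recoverability(rows, protected, column):
    """Accuracy of guessing `protected` from `column` alone.

    For each distinct value of `column`, guess the protected label seen
    most often with it. The score is computed on the same rows it is
    fitted on, so it is optimistic; a real check would hold data out.
    """
    by_value = defaultdict(list)
    for row in rows:
        by_value[row[column]].append(row[protected])
    correct = sum(Counter(lbls).most_common(1)[0][1] for lbls in by_value.values())
    return correct / len(rows)


# Toy data: postcode tracks the group closely, shift does not.
rows = [
    {"group": "a", "postcode": "N1", "shift": "day"},
    {"group": "a", "postcode": "N1", "shift": "night"},
    {"group": "a", "postcode": "N1", "shift": "day"},
    {"group": "a", "postcode": "N1", "shift": "night"},
    {"group": "b", "postcode": "S2", "shift": "day"},
    {"group": "b", "postcode": "S2", "shift": "night"},
    {"group": "b", "postcode": "S2", "shift": "day"},
    {"group": "b", "postcode": "N1", "shift": "night"},
]

baseline = majority_accuracy([r["group"] for r in rows])
lift = {
    col: column_recoverability(rows, "group", col) - baseline
    for col in ("postcode", "shift")
}
print(baseline, lift)
```

A column with a large lift over the baseline behaves like a proxy in this toy setting; a column with no lift carries no recoverable signal on its own.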

Output: a measured bias score and a PDF Dataset Intake Report, plus the Run Manifest.

02 / REMOVE

Transform the data, preserve its statistics

Rosa transforms the dataset so a downstream model cannot distinguish individuals by the protected attribute. It uses rank-mapping: values move to fair rank positions, but each column keeps its own distribution. In training output the distribution is preserved bit-for-bit, to float64 precision; in inference output, to roughly 1e-6.
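
The reason a rank-based reassignment preserves a column's distribution can be shown in a few lines. This is a minimal sketch under stated assumptions, not Rosa's algorithm: rank_map and fair_scores are hypothetical names, and how Rosa decides the "fair rank positions" is not described here. The sketch only shows that handing out a column's own values in a new order leaves its set of values untouched.

```python
def rank_map(values, scores):
    """Reassign `values` to rows in the order given by `scores`.

    The row with the lowest score receives the smallest original value,
    the next lowest the next smallest, and so on. The output is a
    permutation of the input, so the column's distribution is unchanged.
    """
    if len(values) != len(scores):
        raise ValueError("values and scores must have the same length")
    # Row indices ordered from lowest score to highest.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    sorted_values = sorted(values)
    out = [None] * len(values)
    for rank, row in enumerate(order):
        out[row] = sorted_values[rank]
    return out


incomes = [52.0, 31.5, 78.25, 44.0]
fair_scores = [0.9, 0.2, 0.4, 0.7]  # placeholder for the output of some fairness step
mapped = rank_map(incomes, fair_scores)
print(mapped)
print(sorted(mapped) == sorted(incomes))
```

Because every output value is one of the original values, the column's mean, quantiles, and histogram stay the same; only which row holds which value changes.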

That preservation is the key technical promise. Your data stays usable, your aggregate statistics stay true, and your model stays calibrated, because Rosa does not generate synthetic data: it reassigns your real values fairly. The one exception is standard pre-processing: empty cells are imputed before training (the column mean for numeric columns, the most frequent value for categorical columns), the same step most data pipelines already apply.
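
The imputation rule described above is the common mean/mode fill. A minimal sketch, assuming missing cells are represented as None and using a hypothetical helper name, might look like this:

```python
from collections import Counter
from statistics import mean


def impute_column(values, categorical):
    """Fill None cells: column mean if numeric, most frequent value if categorical."""
    present = [v for v in values if v is not None]
    if not present:
        return list(values)  # nothing to learn a fill value from
    if categorical:
        # On ties, Counter keeps the value that appeared first.
        fill = Counter(present).most_common(1)[0][0]
    else:
        fill = mean(present)
    return [fill if v is None else v for v in values]


ages = impute_column([30.0, None, 50.0], categorical=False)
cities = impute_column(["Leeds", None, "Leeds", "York"], categorical=True)
print(ages, cities)
```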

Output: a debiased CSV, a trained FAN model for use on future data, and a PDF report.

Before and after, on Pre-conditioned COMPAS with a downstream recidivism model:

  • Before: model race-disparity 0.54
  • After, with Rosa: model race-disparity 0.09, same column distribution, preserved
  • −83% disparity on average

See the methodology

03 / PROVE

Evidence as a byproduct of the work

Every run emits an immutable Run Manifest: job id, mode, input hash, schema hash, config hash, container digest, timestamps, row counts, and the bias summary, alongside the PDF report.

The input hash is the SHA-256 of the raw input file's bytes. Anyone holding the original file can recompute it and verify exactly what was processed. The manifest is written once per job, including failed ones, and retained indefinitely.
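
Recomputing the input hash needs nothing beyond a standard SHA-256 implementation. The sketch below uses Python's hashlib and reads the file in binary mode so that the raw bytes are hashed. The "sha256:" prefix follows the illustrative manifest on this page; the lowercase hex encoding and the helper names are assumptions.

```python
import hashlib


def input_hash(path, chunk_size=1 << 20):
    """Return 'sha256:<hex digest>' for the raw bytes of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:  # binary mode: hash bytes, not decoded text
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return "sha256:" + digest.hexdigest()


def matches_manifest(path, manifest_value):
    """True if the file's recomputed hash equals the manifest's input_hash."""
    return input_hash(path) == manifest_value
```

Any change to the file, including a re-save that alters line endings or encoding, produces a different hash, so the check should be run on the file exactly as it was uploaded.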

Audit evidence is generated by doing the work, not assembled as a separate documentation exercise afterwards.

Run Manifest (verified)

  job_id: 550e8400-e29b-41d4-a716-446655440000
  mode: remove_bias_training
  job_status: complete
  timestamp_submitted: 2026-06-10T09:14:02Z
  timestamp_completed: 2026-06-10T09:31:47Z
  row_count: 2,000
  bias_columns: ["race"]
  input_hash: sha256:9f1c…e7a2
  schema_hash: sha256:4b08…21cd
  config_hash: sha256:d3aa…90f4
  container_digest: sha256:71be…0c55
  bias (pre): 0.21
  residual_bias: 0.001
  artifacts: remove_bias_report.pdf, compas_preconditioned_fair.csv

Immutable. Written once per job, kept indefinitely. Illustrative values; field set mirrors the live manifest schema.

Rosa becomes a stage in your pipeline

A model trained on Rosa-debiased data should receive Rosa-debiased data at inference. So Rosa is not a one-time clean-up: it becomes a permanent, auditable stage in your data pipeline. Train once, then run inference on your operational data as it arrives.

That is a feature, not a catch. It means fairness is continuously applied and continuously evidenced: every batch that passes through the pipeline leaves a manifest behind it.

Three ways in: portal, REST, MCP

Customer Portal

One-click in the browser, including Test 1, the Apple Card demo. Upload a CSV, run Diagnose or Remove, download the outputs and the manifest.

portal.rosadebias.com

REST API

An asynchronous job API at api.rosadebias.com/v1: submit a job, poll its status, fetch artifacts, report, and manifest.
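
The submit-then-poll pattern can be sketched without assuming any endpoint paths, which are documented in the API guide rather than on this page. In the sketch below, get_status is a caller-supplied function that would wrap the real HTTP call. Only the "complete" status appears on this page; "queued", "running", "failed", and "cancelled" are assumed names.

```python
import time


def wait_for_job(get_status, job_id, interval=5.0, timeout=3600.0, sleep=time.sleep):
    """Poll get_status(job_id) until the job reaches a terminal status.

    `get_status` is any callable returning a status string. `sleep` is
    injectable so the loop can be exercised without real waiting.
    """
    terminal = {"complete", "failed", "cancelled"}  # only "complete" is shown on this page
    waited = 0.0
    while True:
        status = get_status(job_id)
        if status in terminal:
            return status
        if waited >= timeout:
            raise TimeoutError(f"job {job_id} still {status!r} after {waited:.0f}s")
        sleep(interval)
        waited += interval


# Exercise the loop with a fake status source instead of a network call.
fake_statuses = iter(["queued", "running", "complete"])
result = wait_for_job(
    lambda _job_id: next(fake_statuses),
    "demo-job",
    interval=0.0,
    sleep=lambda seconds: None,
)
print(result)
```

Once the status is terminal, the artifacts, report, and manifest can be fetched; the manifest is written for failed jobs too.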

API guide in the portal

MCP server

Eight tools over the Model Context Protocol (MCP), the open standard for connecting AI agents to tools: diagnose, remove bias, job status, report, artifacts, manifest, list jobs, cancel.

MCP guide in the portal

Honest scope

  • Rosa removes bias it can statistically detect, on one protected attribute per run (univariate, in this phase).
  • It preserves your data's distributions; it does not synthesise records. The only values Rosa fills in are missing cells, imputed the standard way before training (column mean for numeric, most frequent value for categorical).
  • It will decline to "debias" a signal it cannot measure, and it says so rather than producing a hollow result.
  • Residual bias is dataset-specific. Your manifest reports the figure for your data; we do not quote a universal number.

See it run before you believe it.

Test 1 runs in the browser in one click. No credit card.