Ethnographic audits - Regular qualitative audits with community representatives to surface harms that metrics miss.
Deliberative buffers - Protected time in workflows for teams to review and contest automated outputs.
Funding for discretion - Ring‑fenced budgets to support human review, exploratory research, and investigative journalism.
Independent oversight - Cross‑sector panels that review algorithmic deployments and publish their findings.
Implementation pilots (practical, three‑month examples)
Healthcare pilot: Introduce mandatory clinician review, with override authority, for 10% of triage referrals flagged as low‑priority; measure the change in adverse events.
Welfare pilot: Replace immediate sanctions with provisional holds and an automated prompt for a claimant narrative; measure eviction and appeal rates.
Employment pilot: Require a human shortlist review for any application filtered out by keyword thresholds; measure the diversity of interviewees.
Outcome metrics to track
Rate of human overrides and reasons logged.
Changes in adverse outcomes (missed diagnoses, sanctions, evictions).
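As an illustration of the override metric above, the following is a minimal sketch of how an override rate and its logged reasons might be tallied from a decision log. The record schema (`DecisionRecord`, `overridden`, `override_reason`) and the function name `override_summary` are hypothetical and not drawn from any particular system; a real deployment would define its own fields.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class DecisionRecord:
    """One automated decision and the reviewer's response (hypothetical schema)."""
    case_id: str
    overridden: bool
    override_reason: Optional[str] = None


def override_summary(records):
    """Return (override rate, counts of logged reasons, overrides with no reason)."""
    total = len(records)
    overrides = [r for r in records if r.overridden]
    # Guard against an empty log so the rate is always defined.
    rate = len(overrides) / total if total else 0.0
    reasons = Counter(r.override_reason for r in overrides if r.override_reason)
    # Overrides without a reason are counted separately, since the metric
    # asks for reasons to be logged as well as the rate itself.
    missing = sum(1 for r in overrides if not r.override_reason)
    return rate, reasons, missing


# Illustrative data only; not real audit results.
sample_log = [
    DecisionRecord("a1", False),
    DecisionRecord("a2", True, "context missing from record"),
    DecisionRecord("a3", True, "context missing from record"),
    DecisionRecord("a4", True),
]
rate, reasons, missing = override_summary(sample_log)
print(f"override rate: {rate:.0%}, overrides without a reason: {missing}")
```

Counting unexplained overrides separately is a design choice in this sketch: it makes gaps in reason logging visible rather than folding them into the headline rate.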
Shared mechanisms to address
Context flattening - Data inputs strip narrative detail; require narrative fields and contextual flags.
Default deference - Staff accept system recommendations without interrogation; mandate named human sign‑off for high‑impact decisions.
Feedback loops - Systems use enforcement data as validation; audit model inputs and separate enforcement metrics from risk signals.
Incentive distortion - Performance metrics privilege measurable outputs; broaden KPIs to include qualitative civic outcomes.
Capacity gaps - Understaffing makes automation necessary; invest in staffing and protected time for deliberation.
Cross‑sector policy recommendations
Mandatory human oversight - All high‑stakes automated decisions must include a documented human reviewer with authority to override.
Narrative requirement - Systems must capture a short contextual statement before finalising automated actions; narratives must be surfaced to reviewers.
Provenance and transparency - Dashboards should display data provenance, model purpose, and limitations alongside recommendations.
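The first two recommendations could be enforced in software as a check that runs before any automated action is finalised. The sketch below is one possible shape for such a check, under stated assumptions: the `AutomatedAction` fields, the `"accept"`/`"override"` decision values, and the function `blocking_issues` are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AutomatedAction:
    """A pending automated action awaiting finalisation (hypothetical schema)."""
    case_id: str
    recommendation: str
    high_stakes: bool
    narrative: str = ""                      # short contextual statement
    reviewer: Optional[str] = None           # named human reviewer
    reviewer_decision: Optional[str] = None  # assumed values: "accept" or "override"


def blocking_issues(action):
    """List the reasons an action may not yet be finalised (empty list = clear)."""
    issues = []
    # Narrative requirement: applies to every action in this sketch.
    if not action.narrative.strip():
        issues.append("missing contextual narrative")
    # Mandatory human oversight: applies to high-stakes actions only.
    if action.high_stakes:
        if not action.reviewer:
            issues.append("no named human reviewer")
        elif action.reviewer_decision not in ("accept", "override"):
            issues.append("reviewer decision not recorded")
    return issues


pending = AutomatedAction("c-17", "sanction", high_stakes=True)
print(blocking_issues(pending))

pending.narrative = "Claimant reports a hospital stay during the reporting window."
pending.reviewer = "J. Rivera"
pending.reviewer_decision = "override"
print(blocking_issues(pending))
```

Returning a list of issues, rather than a single pass/fail flag, keeps the reason for each blocked action visible to the reviewer and to later audits.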