NLD-002

Rotterdam Municipal Welfare Fraud Prediction Algorithm

Netherlands · Europe & Central Asia · High income · Suspended / Halted · Confirmed

Municipality of Rotterdam (Gemeente Rotterdam)

At a Glance

What it does Classification — Compliance and integrity
Who runs it Municipality of Rotterdam (Gemeente Rotterdam)
Programme Rotterdam Municipal Social Assistance (Bijstandsuitkering)
Confidence Confirmed
Deployment Status Suspended / Halted
Key Risks Model-related risks
Key Outcomes Algorithm operated from 2017 to mid-2021, scoring approximately 30,000 welfare recipients annually and referring 1,000-1,500 for investigation each year.
Source Quality 6 sources — News article / media, Legal document / regulation

The Municipality of Rotterdam deployed a machine learning algorithm in 2017 to predict welfare fraud among its approximately 30,000 social assistance recipients. The system was developed by the international consulting firm Accenture, which promoted the technology in a white paper as producing 'unbiased citizen outcomes' and an 'ethical solution' to fraud detection. The algorithm was a Gradient Boosting Machine consisting of 500 stacked decision trees, each with up to nine decision points, trained on a dataset of 12,707 past fraud investigations conducted between the system's introduction and its suspension. The model processed 315 input variables to generate a risk score between zero and one for each welfare recipient, and the 1,000 to 1,500 highest-scoring individuals were typically selected for fraud investigation each year.
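The score-then-rank pipeline described above can be sketched with an off-the-shelf gradient-boosted classifier. This is an illustrative reconstruction on synthetic data, not Rotterdam's actual code: the feature matrix, labels, and tree depth are stand-ins, and only the 500-tree ensemble size and the top-k selection step are taken from the reported architecture.

```python
# Illustrative sketch (not the Rotterdam implementation): a gradient-boosted
# classifier producing risk scores in [0, 1], with the top-k flagged.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
n, d = 5000, 20                    # stand-ins for ~30,000 recipients, 315 variables
X = rng.normal(size=(n, d))
y = (X[:, 0] + rng.normal(size=n) > 1.2).astype(int)   # synthetic fraud labels

# 500 shallow trees, mirroring the reported ensemble size
model = GradientBoostingClassifier(n_estimators=500, max_depth=3)
model.fit(X, y)

scores = model.predict_proba(X)[:, 1]     # risk score between zero and one
top_k = np.argsort(scores)[::-1][:150]    # highest-scoring individuals flagged
print(len(top_k), float(scores[top_k].min()))
```

In the deployed system the ranking step drove real consequences: only records above the investigation cutoff were ever reviewed by fraud controllers.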

The 315 input variables encompassed a wide range of personal and behavioural characteristics. Demographic variables included age, gender, marital status, and parenthood status. Linguistic variables comprised approximately 20 measures of Dutch language proficiency, spoken language, and compliance with language requirements. Financial variables covered days with financial problems and income stability. Residential variables included neighbourhood, housing type, roommate status, and tenure duration. Critically, the system also incorporated subjective caseworker assessments including observations about a recipient's physical appearance, their perceived ability to 'convince and influence others', how outgoing they were, and the length of their most recent romantic relationship. The mixture of objective demographic data with subjective behavioural assessments created a system in which caseworker biases were encoded directly into the algorithmic scoring process.

The algorithm's training data suffered from a significant structural flaw: the dataset contained approximately 50 percent fraud cases, whereas the actual fraud rate in the welfare population was approximately 21 percent. This overrepresentation of positive fraud cases in the training data meant the model was calibrated against a distorted picture of reality, learning patterns associated with being investigated rather than patterns genuinely predictive of fraud. The investigations that generated the training data were themselves shaped by existing enforcement biases and caseworker discretion, meaning the algorithm learned to replicate and amplify pre-existing patterns of selective scrutiny rather than identifying fraud on a neutral basis.
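The scale of that calibration distortion can be illustrated with the standard prior-correction formula for models trained on resampled class balances. There is no indication Rotterdam applied such a correction; the sketch below simply shows how far a score learned from a 50/50 training set sits from a probability consistent with the 21 percent population base rate.

```python
# Prior correction: rescale a predicted probability from the training-set
# class balance (50% fraud) to the population base rate (21% fraud).
def correct_prior(p, train_rate=0.5, true_rate=0.21):
    """Adjust predicted odds for the mismatch between training and true priors."""
    odds = p / (1 - p)
    adj = odds * (true_rate / (1 - true_rate)) / (train_rate / (1 - train_rate))
    return adj / (1 + adj)

# A 'coin-flip' score under the 50/50 training prior maps back to the
# 21% base rate once the oversampling is undone.
print(correct_prior(0.5))   # approximately 0.21
print(correct_prior(0.8))   # well below 0.8
```

The point is not that such a correction would have fixed the model, but that uncorrected scores from a rebalanced training set systematically overstate fraud risk for the whole caseload.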

A landmark investigation published in March 2023 by Lighthouse Reports and WIRED, titled 'Suspicion Machines', obtained unprecedented access to Rotterdam's algorithm source code, machine learning model file, and training data — the first time a European government had provided complete transparency into a welfare fraud detection algorithm. Rotterdam was the only city among dozens contacted across Europe willing to share the code behind its system. The investigation subjected the algorithm to rigorous fairness testing using two standard measures: statistical parity (whether demographic groups reached the high-risk threshold proportionally) and controlled statistical parity (isolating the effect of specific variables by creating dataset copies with modified characteristics).
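On hypothetical data, the two fairness tests can be sketched as follows. The toy scoring rule and binary attribute below are invented for illustration; only the two metrics themselves follow the investigation's description (group flag-rate comparison, and re-scoring a modified copy of the dataset to isolate one variable).

```python
# Sketch of the two fairness measures used in the 'Suspicion Machines' audit,
# applied to a toy scoring rule on synthetic data.
import numpy as np

def statistical_parity_ratio(flagged, group):
    """Ratio of high-risk flag rates between group == 1 and group == 0."""
    return flagged[group == 1].mean() / flagged[group == 0].mean()

def controlled_parity_effect(score_fn, X, col):
    """Mean score shift when one binary variable is set to 1 vs 0 for every
    record, isolating that variable's effect via modified dataset copies."""
    X_on, X_off = X.copy(), X.copy()
    X_on[:, col], X_off[:, col] = 1.0, 0.0
    return score_fn(X_on).mean() - score_fn(X_off).mean()

rng = np.random.default_rng(1)
X = rng.integers(0, 2, size=(1000, 3)).astype(float)
score_fn = lambda X: 0.25 + 0.2 * X[:, 0] + 0.15 * X[:, 1]  # invented rule
flagged = (score_fn(X) > 0.39).astype(int)                  # high-risk threshold

print(statistical_parity_ratio(flagged, X[:, 0]))  # group 1 flagged more often
print(controlled_parity_effect(score_fn, X, 0))    # isolated effect of column 0
```

The second test matters because raw flag-rate gaps can reflect correlated variables; re-scoring an otherwise identical copy of the data attributes the gap to the single characteristic that was changed.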

The fairness analysis revealed systematic discrimination across multiple protected characteristics. On gender, women were 1.25 times more likely than men to be flagged for investigation, an effect that intensified when combined with parenthood; single mothers were classified as especially high-risk. On language and ethnicity, recipients who were not fluent in Dutch were almost twice as likely to be flagged as fluent speakers with otherwise identical profiles. The Netherlands Institute of Human Rights concluded that this constituted indirect discrimination on the basis of origin, since language proficiency correlates strongly with ethnic background and migration status. On age, the algorithm was strongly biased against younger recipients: age was the single most important variable in the model, nearly three times more impactful than the second-ranked variable, and recipients' scores decreased systematically as they aged. On parenthood, parents were 1.7 times more likely than non-parents to exceed the high-risk investigation threshold, with single mothers facing compounded penalties at the intersection of gender, parenthood, and financial vulnerability.

Despite the extensive data collection and processing, independent analysis found that the algorithm's predictive performance was poor. ROC curve analysis demonstrated the system was only 50 percent more accurate than random selection of welfare recipients for investigation. An AI ethics expert who reviewed the system characterised its performance as 'essentially random guessing', suggesting that the elaborate surveillance infrastructure imposed substantial privacy costs on welfare recipients while delivering minimal improvement in fraud detection accuracy over simply selecting cases at random.
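As a point of reference for the ROC claim, the sketch below shows how AUC separates random selection, which sits at the 0.5 baseline, from a weakly informative model. The data and score distributions here are synthetic and hypothetical; the Rotterdam figures themselves are not reproduced.

```python
# ROC/AUC comparison on synthetic data: random selection vs a weak model.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(2)
y = rng.integers(0, 2, size=10_000)           # synthetic fraud labels
random_scores = rng.random(10_000)            # selecting cases at random
weak_scores = y * 0.1 + rng.random(10_000)    # weakly informative scores

print(roc_auc_score(y, random_scores))   # close to 0.5 (random baseline)
print(roc_auc_score(y, weak_scores))     # modestly above 0.5
```

An AUC only modestly above 0.5 means the ranking barely separates fraudulent from non-fraudulent cases, which is the substance of the 'essentially random guessing' characterisation.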

The algorithm was suspended in mid-2021 following a critical review by the Rekenkamer Rotterdam (Rotterdam Court of Audit), which found insufficient coordination between the algorithm's developers and the staff using it, potentially resulting in poor ethical decision-making. The audit also found that it was not possible for citizens to determine whether they had been flagged by the algorithm, and that some of the data used risked producing biased outputs. Rotterdam attempted to address the identified bias but ultimately concluded it was unable to eliminate the discrimination from the system. The city had taken over development from Accenture in 2018, but the fundamental architectural choices — including the selection of discriminatory input variables and the biased training data — persisted through subsequent iterations.

A data security incident also occurred during the investigation: Rotterdam accidentally revealed pseudonymised training data embedded in the HTML source code of histogram visualisations shared with Lighthouse Reports, which the city confirmed 'should not have happened'. The incident underscored the governance weaknesses surrounding the system's deployment and the challenges of maintaining data security in complex algorithmic systems.

The Rotterdam case became a centrepiece of broader European scrutiny of welfare surveillance algorithms. A European Parliament question was tabled in 2023 asking whether Rotterdam's fraud prediction algorithms constituted a violation of fundamental rights and the rule of law by the Dutch government. The case drew comparisons with the Dutch childcare benefits scandal (toeslagenaffaire), in which algorithmic profiling by the Dutch Tax Authority led to the wrongful accusation of thousands of predominantly minority families and contributed to the fall of the Rutte III government in January 2021. The Racism and Technology Center characterised Rotterdam's algorithm as 'racist technology in action', noting the systemic pattern of Dutch government agencies deploying discriminatory automated systems against vulnerable populations.

Classifications follow the DCI AI Hub Taxonomy.

Social Protection Functions

Implementation/delivery chain
Accountability mechanisms (primary): Case management
SP Pillar (Primary) The social protection branch: social assistance, social insurance, or labour market programmes. Social assistance
Programme Name Rotterdam Municipal Social Assistance (Bijstandsuitkering)
Programme Type The type of social protection programme, classified under social assistance, social insurance, or labour market programmes. Poverty-targeted Cash Transfers (conditional or unconditional)
System Level Where in the social protection system the AI is applied: policy level, programme design, or implementation/delivery chain. Implementation/delivery chain
Programme Description Rotterdam's municipal social assistance programme providing means-tested welfare payments to residents who have insufficient income and no other means of support. Administered by the municipality under the Dutch Participation Act (Participatiewet), covering approximately 30,000 recipients annually.
Implementation Type How the AI output is produced: Classical ML, Deep learning, Foundation model, or Hybrid. Affects validation, compute requirements, and governance profile. Classical ML
Lifecycle Stage Current stage in the AI lifecycle, from problem identification through to monitoring, maintenance and decommissioning. Monitoring, Maintenance and Decommissioning
Model Provenance Origin of the AI model: developed in-house, adapted from open-source, commercial/proprietary, or accessed via third-party API. Commercial/proprietary
Compute Environment Where the AI system runs: on-premise, government cloud, commercial cloud, or edge/device. Not documented
Sovereignty Quadrant Classification of data and compute sovereignty: I (Sovereign), II (Federated/Hybrid), III (Cloud with safeguards), or IV (Shared Innovation Zone). Not assessed
Data Residency Where the data used by the AI system is stored: domestic, regional, or international. Not documented
Cross-Border Transfer Whether data crosses national borders, and if so, whether documented safeguards are in place. Not documented
Decision Criticality The rights impact of the decision the AI supports. High criticality requires HITL oversight; moderate requires HOTL; low may operate HOOTL. High
Human Oversight Type Level of human involvement: Human-in-the-Loop (active review), Human-on-the-Loop (monitoring), or Human-out-of-the-Loop (periodic audit). HOTL
Development Process Whether the AI system was developed fully in-house, through a mix of in-house and third-party, or fully by an external provider. Fully third-party developed
Highest Risk Category The most significant structural risk source identified: data, model, operational, governance, or market/sovereignty risks. Model-related risks
Risk Assessment Status Whether a formal risk assessment, informal assessment, or independent audit has been conducted for this system. Independent audit completed
Documented Risk Events Lighthouse Reports/WIRED 2023 investigation found systematic discrimination: women 1.25x more likely to be flagged; non-Dutch speakers almost 2x more likely; parents 1.7x more likely; age was the single most important variable with 3x impact. Netherlands Institute of Human Rights concluded language variable constituted indirect discrimination on basis of origin. ROC analysis showed system only 50% better than random selection. Training data overrepresented fraud cases (50% vs 21% actual rate). Rekenkamer Rotterdam audit found insufficient coordination and bias risks. Pseudonymised training data accidentally exposed in HTML source code. Algorithm suspended mid-2021 after Rotterdam was unable to eliminate discrimination.
  • Human oversight protocol
  • Independent evaluation
Category: Administrative data from other sectors
Sensitivity: Personal
Cross-System Linkage: Links data across multiple systems
Availability: Currently available and used
Key Constraints: Employment records, income data, language proficiency assessment results, neighbourhood data, and housing information; Dutch language test results used as scoring variable with strong proxy effect for ethnic origin

Category: Beneficiary registries and MIS
Sensitivity: Special category
Cross-System Linkage: Links data across multiple systems
Availability: Currently available and used
Key Constraints: Municipal welfare recipient records covering approximately 30,000 individuals; includes demographic data, benefit history, household composition, and caseworker assessment records; training dataset of 12,707 past fraud investigations with 50% fraud overrepresentation versus 21% actual rate

Category: Unstructured and text-based content
Sensitivity: Sensitive
Cross-System Linkage: Single source (no linkage)
Availability: Currently available and used
Key Constraints: Subjective caseworker assessments encoded as variables: physical appearance, perceived ability to 'convince and influence others', personality assessments, relationship history; encoded existing caseworker biases into algorithmic scoring

Constantaras, E., Geiger, G., Braun, J.A. and Mehrotra, D. (2023) 'This Algorithm Could Ruin Your Life', WIRED, 6 March. Available at: https://www.wired.com/story/welfare-algorithms-discrimination/ (Accessed: 26 March 2026).


European Parliament (2023) 'Parliamentary Question: Rotterdam fraud prediction algorithms automating injustice: Dutch Government violating fundamental rights and the rule of law', E-000780/2023, European Parliament. Available at: https://www.europarl.europa.eu/doceo/document/E-9-2023-000780_EN.html (Accessed: 26 March 2026).


Lighthouse Reports (2023) 'Suspicion Machines', Lighthouse Reports, March 2023. Available at: https://www.lighthousereports.com/investigation/suspicion-machines/ (Accessed: 26 March 2026).


Lighthouse Reports (2023) 'Suspicion Machines Methodology', Lighthouse Reports, March 2023. Available at: https://www.lighthousereports.com/suspicion-machines-methodology/ (Accessed: 26 March 2026).


Lighthouse Reports (2023) 'Inside a Fraud Prediction Algorithm', Lighthouse Reports, March 2023. Available at: https://www.lighthousereports.com/investigation/unlocking-a-welfare-fraud-prediction-algorithm/ (Accessed: 26 March 2026).


Racism and Technology Center (2023) 'Racist Technology in Action: Rotterdam's welfare fraud prediction algorithm was biased', Racism and Technology Center, 17 March. Available at: https://racismandtechnology.center/2023/03/17/racist-technology-in-action-rotterdams-welfare-fraud-prediction-algorithm-was-biased/ (Accessed: 26 March 2026).

Deployment Status How far the system has progressed into real-world operational use, from concept/exploration through to scaled and institutionalised. Suspended / Halted
Year Initiated The year the AI system was first initiated or development began. 2017
Scale / Coverage The scale and geographic or population coverage of the deployment. Municipal; approximately 30,000 welfare recipients in Rotterdam scored annually; top 1,000-1,500 highest-scoring individuals selected for fraud investigation each year (2017-2021)
Funding Source The source(s) of funding for the AI system development and deployment. Municipality of Rotterdam operational budget; Accenture consulting contract (value not publicly disclosed)
Technical Partners External technology vendors, academic partners, or development partners involved. Accenture (algorithm development 2017-2018); Municipality of Rotterdam in-house development team (took over from 2018)
Outcomes / Results Algorithm operated from 2017 to mid-2021, scoring approximately 30,000 welfare recipients annually and referring 1,000-1,500 for investigation each year. Suspended after Rekenkamer Rotterdam audit identified bias risks that the city could not eliminate. Independent fairness analysis by Lighthouse Reports demonstrated discrimination across gender, ethnicity/language, age, and parenthood. System performance characterised as 'essentially random guessing' by AI ethics experts. Rotterdam was the only European city to provide full algorithmic transparency (source code, model file, training data).
Challenges Algorithm encoded caseworker biases through subjective assessment variables (appearance, personality, relationship history). Training data structurally biased with 50% fraud cases vs 21% actual rate. Gradient Boosting Machine architecture with 500 decision trees and 315 variables created an opaque system that could not be audited for bias without external investigation. City took over development from Accenture in 2018 but could not eliminate embedded discrimination. No mechanism for citizens to know they were algorithmically flagged. Pseudonymised data accidentally disclosed. European Parliament question raised about fundamental rights violations.

How to Cite

DCI AI Hub (2026). 'Rotterdam Municipal Welfare Fraud Prediction Algorithm', AI Hub AI Tracker, case NLD-002. Digital Convergence Initiative. Available at: https://socialprotectionai.org/use-case/NLD-002 [Accessed: 1 April 2026].

Change History

Created 30 Mar 2026, 08:41
by v2-import (import)