SWE-001

Försäkringskassan ML Risk Scoring for Fraud Investigations (Sweden — Suspended)

Country: Sweden · Region: Europe & Central Asia · Income group: High income · Deployment status: Suspended / Halted · Confidence: Confirmed

Försäkringskassan (Swedish Social Insurance Agency)

At a Glance

What it does Classification — Compliance and integrity
Who runs it Försäkringskassan (Swedish Social Insurance Agency)
Programme Tillfällig föräldrapenning (Temporary Parental Benefit / VAB)
Confidence Confirmed
Deployment Status Suspended / Halted
Key Risks Model-related risks
Key Outcomes System flagged thousands of applicants annually for fraud investigations.
Source Quality 5 sources — Report (multilateral / development partner), Working paper / technical note, News article / media

The Försäkringskassan ML Risk Scoring system is a machine-learning-based risk profiling tool deployed by Sweden's Social Insurance Agency (Försäkringskassan) since 2013 to assign risk scores to social security benefit applicants and automatically flag high-scoring individuals for fraud investigations. The system was used primarily in the context of tillfällig föräldrapenning (temporary parental benefit), known colloquially as VAB (vård av barn), which provides income replacement at approximately 80 percent of estimated earnings to parents who stay home from work to care for a sick child (Lighthouse Reports, 2024; Amnesty International, 2024). Applicants assigned risk scores above a certain threshold by the ML model were automatically subjected to investigation by Försäkringskassan's 'control' department, which operates under an assumption of criminal intent, as distinct from standard caseworker reviews which carry no such presumption (Amnesty International, 2024).

The system was exposed as discriminatory in a landmark joint investigation published on 27 November 2024 by Lighthouse Reports and Swedish newspaper Svenska Dagbladet. The investigation analysed a dataset of 6,129 people selected for investigation in 2017, comprising 5,082 individuals flagged by the ML model and 1,047 randomly selected cases, drawn from approximately 977,730 total welfare applications that year (Lighthouse Reports, Methodology, 2024). The investigation team tested the algorithmic system against six standard statistical fairness metrics, including demographic parity, predictive parity, and false positive error rates, and found statistically significant discriminatory patterns across all metrics (Lighthouse Reports, Methodology, 2024).
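The group-fairness metrics named above have simple operational definitions: demographic parity compares selection rates between groups, false positive error rate compares how often error-free applicants are flagged, and predictive parity compares the model's precision per group. A minimal sketch of these three ratios, computed on an invented toy dataset (the column names and numbers are illustrative assumptions, not Lighthouse Reports' actual data or code):

```python
# Toy illustration of demographic parity, false positive rate, and
# predictive parity ratios between two groups ("w" vs "m").
import pandas as pd

df = pd.DataFrame({
    "group":   ["w", "w", "w", "m", "w", "m", "m", "m"],
    "flagged": [1, 1, 0, 1, 1, 0, 1, 0],   # selected for investigation
    "mistake": [0, 1, 0, 0, 0, 0, 1, 1],   # application actually contained an error
})

def selection_rate(g):
    # Demographic parity: share of the group selected for investigation.
    return g["flagged"].mean()

def false_positive_rate(g):
    # Share of error-free applicants in the group who were still flagged.
    return g.loc[g["mistake"] == 0, "flagged"].mean()

def precision(g):
    # Predictive parity: share of flagged applicants with a real error.
    return g.loc[g["flagged"] == 1, "mistake"].mean()

def ratio(metric, a="w", b="m"):
    # Per-group metric for a, relative to b; a ratio of 1 means parity.
    groups = {k: g for k, g in df.groupby("group")}
    return metric(groups[a]) / metric(groups[b])

print(ratio(selection_rate))        # here group "w" is flagged 1.5x as often
print(ratio(false_positive_rate))
print(ratio(precision))
```

A ratio of 1 on each metric would indicate parity; the investigation reported statistically significant deviations from 1 on all six metrics it tested.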

The fairness analysis revealed that women were 1.5 times more likely than men to be selected for investigation by the model, despite random sampling showing that women do not make more mistakes on their benefit applications than men (Lighthouse Reports, 2024; Lighthouse Reports, Methodology, 2024). Individuals with foreign backgrounds — defined as born abroad or with both parents born abroad — were 2.5 times more likely to be flagged than those with Swedish backgrounds. People without university degrees were 3.31 times more likely to be selected, and those with below-median incomes were 2.97 times more likely to be flagged (Lighthouse Reports, Methodology, 2024). False positive error rate analysis showed even starker disparities: women without mistakes on their applications were 1.7 times more likely to be wrongly flagged; individuals with foreign backgrounds without mistakes were 2.4 times more likely; those without degrees were 3 times more likely; and below-median earners were 3 times more likely to be wrongly flagged (Lighthouse Reports, Methodology, 2024). All statistical findings were validated using bootstrapping with 10,000 resamples to generate p-values under the null hypothesis of zero difference (Lighthouse Reports, Methodology, 2024).
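The bootstrap validation mentioned above can be sketched as follows. This is one common variant of such a test (pool the two groups under the null of zero difference, resample with replacement, and compare the observed gap against the resampled distribution); the investigators' exact procedure is published in their GitHub repository, and the function and variable names here are my own:

```python
# Sketch of a bootstrap p-value for a difference in flagging rates,
# under the null hypothesis that the two groups are exchangeable.
import numpy as np

def bootstrap_p_value(flagged_a, flagged_b, n_resamples=10_000, seed=0):
    """Two-sided p-value for H0: equal flagging rates in groups a and b."""
    rng = np.random.default_rng(seed)
    a = np.asarray(flagged_a, dtype=float)
    b = np.asarray(flagged_b, dtype=float)
    observed = a.mean() - b.mean()
    pooled = np.concatenate([a, b])          # under H0, group labels carry no signal
    diffs = np.empty(n_resamples)
    for i in range(n_resamples):
        resample = rng.choice(pooled, size=pooled.size, replace=True)
        diffs[i] = resample[: a.size].mean() - resample[a.size :].mean()
    # Share of resampled gaps at least as extreme as the observed gap.
    return float(np.mean(np.abs(diffs) >= abs(observed)))
```

Two groups with identical rates yield a p-value near 1, while a large gap relative to the resampling noise yields a p-value near 0.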

The investigation also found that the system failed predictive parity: the model was 1.08 times more precise for men than for women, 1.20 times more precise for individuals with foreign backgrounds, 1.19 times more precise for non-degree holders, and 1.09 times more precise for below-median earners (Lighthouse Reports, Methodology, 2024). Notably, the system would have passed Försäkringskassan's own internal two-step fairness procedure, which the investigators characterised as having substantial gaps in threshold sensitivity and lacking intersectional analysis (Lighthouse Reports, Methodology, 2024). The pseudonymised nature of the data released to investigators prevented merging across demographic characteristics, thereby precluding intersectional bias analysis (Lighthouse Reports, Methodology, 2024).

Prior to the 2024 investigation, concerns about the system had been raised internally on multiple occasions. In 2016, Sweden's Integrity Committee warned of 'citizen profiling' risks associated with the system (Lighthouse Reports, 2024). A 2018 report by ISF (Inspektionen för socialförsäkringen), Sweden's independent supervisory authority for social insurance, concluded that the algorithm 'in its current design does not meet equal treatment,' though Försäkringskassan disputed this analysis as resting on 'dubious grounds' (Computer Weekly, 2025; Amnesty International, 2024). In 2020, a data protection officer who previously worked for Försäkringskassan warned that the entire operation violated the European data protection regulation because the authority lacked a legal basis for profiling people (Computer Weekly, 2025; Amnesty International, 2024).

Fraud controllers acting on high risk scores from the system had extensive investigative powers, including the ability to access applicants' social media accounts, obtain data from institutions such as schools and banks, and interview neighbours (Amnesty International, 2024). The investigation also questioned the proportionality of the system's fraud narrative: in 2022, only 166 of 5,520 suspected fraud cases resulted in convictions, a rate of just 3 percent (Lighthouse Reports, Methodology, 2024). The agency's estimated annual fraud loss of approximately EUR 113 million proved highly sensitive to arbitrary threshold choices: the estimated fraud rate among erroneous applications dropped from 24 percent to 6 percent when the threshold for classifying a mistake as fraud was raised from two days to four days (Lighthouse Reports, Methodology, 2024).
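The threshold sensitivity described above is easy to reproduce in outline: if an erroneous claim is labelled "fraud" whenever the over-claimed amount meets some day-count cutoff, the estimated fraud rate moves sharply with that cutoff. A toy calculation, with invented day counts (not Försäkringskassan data):

```python
# Illustrative only: how a day-count cutoff changes the estimated
# "fraud" share among erroneous applications.
overclaimed_days = [1, 1, 1, 2, 2, 3, 4, 5, 1, 2, 1, 6]  # one entry per erroneous claim

def fraud_rate(days, cutoff):
    """Share of erroneous claims labelled fraud at a given day cutoff."""
    return sum(d >= cutoff for d in days) / len(days)

print(fraud_rate(overclaimed_days, cutoff=2))  # looser cutoff, higher rate
print(fraud_rate(overclaimed_days, cutoff=4))  # stricter cutoff, lower rate
```

Because most mistakes are small, a slightly stricter cutoff excludes a large share of claims, which is why the agency's aggregate fraud-loss estimate was so threshold-dependent.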

Following the November 2024 investigation, Amnesty International published an analysis demanding the immediate discontinuation of the system, characterising it as violating rights to social security, equality, non-discrimination, and privacy. Amnesty's Senior Investigative Researcher David Nolan stated that 'the Swedish Social Insurance Agency's intrusive algorithms discriminate against people based on their gender, foreign background, income level, and level of education' and described the system as 'akin to a witch hunt against anyone who is flagged for social benefits fraud investigations' (Amnesty International, 2024). Amnesty drew explicit parallels to the Netherlands, where its 2021 'Xenophobic Machines' report exposed racial profiling in Dutch tax authority algorithms that falsely flagged childcare benefit claims as fraudulent, affecting tens of thousands of parents from ethnic minorities and low-income families (Amnesty International, 2024). Amnesty also referenced its November 2024 'Coded Injustice' report on AI-driven surveillance in Danish welfare systems and an October 2024 complaint against France's CNAF risk-scoring system (Amnesty International, 2024).

The Swedish Data Protection Authority (Integritetsskyddsmyndigheten, IMY) subsequently opened an inspection of Försäkringskassan. IMY lawyer Måns Lysén confirmed that 'while the inspection was ongoing, the Swedish Social Insurance Agency took the AI system out of use' (Computer Weekly, 2025). Försäkringskassan stated it discontinued use of the risk assessment profile 'in order to assess whether it complies with the new European AI regulation' and confirmed it had 'no plans to put it back into use since we now receive absence data from employers among other data, which is expected to provide a relatively good accuracy' (Computer Weekly, 2025). IMY closed its inspection after Försäkringskassan confirmed the system was no longer in use. The system's technical architecture remains opaque: Försäkringskassan refused to disclose the model's code, input variables, or training data to investigators, and the Lighthouse Reports methodology team noted that 'the model itself is a complete black box. We therefore do not understand how or why certain types of bias have manifested' (Lighthouse Reports, Methodology, 2024).

The case represents a significant cautionary example in the use of ML-based risk scoring in social protection systems, comparable to the Netherlands SyRI system and Denmark's welfare surveillance algorithms. The eight academic experts who reviewed the Lighthouse Reports methodology included scholars from institutions such as the Max Planck Institute for Intelligent Systems (Dr. Moritz Hardt), Carnegie Mellon University (Dr. Alexandra Chouldechova), NYU (Dr. Meredith Broussard), and Umeå University (Dr. Virginia Dignum) (Lighthouse Reports, Methodology, 2024). The investigation's complete dataset and analysis code were published on GitHub at github.com/Lighthouse-Reports/suspicion_machines_sweden (Lighthouse Reports, Methodology, 2024).

Classifications follow the DCI AI Hub Taxonomy.

Social Protection Functions

Assessment of needs/conditions + enrolment (primary); Implementation/delivery chain; Accountability mechanisms
SP Pillar (Primary) The social protection branch: social assistance, social insurance, or labour market programmes. Social insurance
Programme Name Tillfällig föräldrapenning (Temporary Parental Benefit / VAB)
Programme Type The type of social protection programme, classified under social assistance, social insurance, or labour market programmes. Maternity and paternity benefits (contributory)
System Level Where in the social protection system the AI is applied: policy level, programme design, or implementation/delivery chain. Implementation/delivery chain
Programme Description Sweden's temporary parental benefit (tillfällig föräldrapenning), colloquially known as VAB (vård av barn), provides income replacement at approximately 80% of estimated earnings to parents who stay home from work to care for a sick child. Administered by Försäkringskassan, the Swedish Social Insurance Agency.
Implementation Type How the AI output is produced: Classical ML, Deep learning, Foundation model, or Hybrid. Affects validation, compute requirements, and governance profile. Classical ML
Lifecycle Stage Current stage in the AI lifecycle, from problem identification through to monitoring, maintenance and decommissioning. Monitoring, Maintenance and Decommissioning
Model Provenance Origin of the AI model: developed in-house, adapted from open-source, commercial/proprietary, or accessed via third-party API. Not documented
Compute Environment Where the AI system runs: on-premise, government cloud, commercial cloud, or edge/device. Not documented
Sovereignty Quadrant Classification of data and compute sovereignty: I (Sovereign), II (Federated/Hybrid), III (Cloud with safeguards), or IV (Shared Innovation Zone). Not assessed
Data Residency Where the data used by the AI system is stored: domestic, regional, or international. Not documented
Cross-Border Transfer Whether data crosses national borders, and if so, whether documented safeguards are in place. Not documented
Decision Criticality The rights impact of the decision the AI supports. High criticality requires HITL oversight; moderate requires HOTL; low may operate HOOTL. High
Human Oversight Type Level of human involvement: Human-in-the-Loop (active review), Human-on-the-Loop (monitoring), or Human-out-of-the-Loop (periodic audit). HOTL
Development Process Whether the AI system was developed fully in-house, through a mix of in-house and third-party, or fully by an external provider. Not documented
Highest Risk Category The most significant structural risk source identified: data, model, operational, governance, or market/sovereignty risks. Model-related risks
Risk Assessment Status Whether a formal risk assessment, informal assessment, or independent audit has been conducted for this system. Formal assessment
Documented Risk Events November 2024: Joint investigation by Lighthouse Reports and Svenska Dagbladet revealed discriminatory profiling across gender, ethnicity, income, and education. 2018: ISF supervisory authority found the algorithm did not meet equal treatment standards. 2020: Former data protection officer warned system violated GDPR. 2025: System suspended following IMY (Swedish Data Protection Authority) inspection. Of 5,520 suspected fraud cases in 2022, only 166 (3%) resulted in convictions.
Category | Sensitivity | Cross-System Linkage | Availability | Key Constraints
Administrative data from other sectors | Special category | Links data across multiple systems | Currently available and used | Fraud controllers could access data from schools, banks, and other institutions during investigations triggered by the ML model; unclear whether these data sources fed into the model itself or were used only post-flagging
Beneficiary registries and MIS | Special category | Links data across multiple systems | Currently available and used | Försäkringskassan benefit application and claims data for temporary parental benefit (VAB); includes applicant demographics, benefit history, and application details; exact feature set undisclosed by the agency

Amnesty International (2024) 'Sweden: Authorities must discontinue discriminatory AI systems used by welfare agency', Amnesty International, 27 November. Available at: https://www.amnesty.org/en/latest/news/2024/11/sweden-authorities-must-discontinue-discriminatory-ai-systems-used-by-welfare-agency/ (Accessed: 25 March 2026).

Source type: Report (multilateral / development partner)

Lighthouse Reports (2024) 'How we investigated Sweden's Suspicion Machine', Lighthouse Reports, 27 November. Available at: https://www.lighthousereports.com/methodology/sweden-ai-methodology/ (Accessed: 25 March 2026).

Source type: Working paper / technical note

Granberg, S., Geiger, G., Tiberg, A., Braun, J.-C., Malmsten, H., Molén, T., Abdigadir, A., Ljungmark, I., Laurin, F., Constantaras, E. and Howden, D. (2024) 'Sweden's Suspicion Machine', Lighthouse Reports, 27 November. Available at: https://www.lighthousereports.com/investigation/swedens-suspicion-machine/ (Accessed: 25 March 2026).

Source type: News article / media

Ikumi, S. (2024) 'Swedish authorities urged to discontinue AI welfare system', Computer Weekly, 28 November. Available at: https://www.computerweekly.com/news/366616576/Swedish-authorities-urged-to-discontinue-AI-welfare-system (Accessed: 25 March 2026).

Source type: News article / media

Ikumi, S. (2025) 'Swedish welfare authorities suspend "discriminatory" AI model', Computer Weekly, 19 June. Available at: https://www.computerweekly.com/news/366634703/Swedish-welfare-authorities-suspend-discriminatory-AI-model (Accessed: 25 March 2026).

Source type: News article / media
Deployment Status How far the system has progressed into real-world operational use, from concept/exploration through to scaled and institutionalised. Suspended / Halted
Year Initiated The year the AI system was first initiated or development began. 2013
Scale / Coverage The scale and geographic or population coverage of the deployment. Approximately 977,730 temporary parental benefit applications processed in 2017; 5,082 flagged by ML model for investigation that year; system operated nationally across all Försäkringskassan offices from 2013 until suspension in 2025
Outcomes / Results System flagged thousands of applicants annually for fraud investigations. In 2017, 5,082 of approximately 977,730 applications were algorithmically selected for investigation. However, the system's effectiveness was questioned: in 2022, only 3% of suspected fraud cases resulted in convictions (166 of 5,520). The agency's EUR 113 million annual fraud loss estimate was found to be highly sensitive to arbitrary threshold choices, with estimated fraud rates dropping from 24% to 6% when the classification threshold was adjusted from two days to four days.
Challenges Complete opacity of the ML model — Försäkringskassan refused to disclose code, input variables, or training data. System demonstrated statistically significant bias against women (1.5x overselection), individuals with foreign backgrounds (2.5x), people without university degrees (3.31x), and below-median earners (2.97x). Internal fairness procedure had substantial gaps and lacked intersectional analysis. Longstanding internal warnings from 2016 (Integrity Committee), 2018 (ISF report), and 2020 (data protection officer) were not acted upon. Legal basis for profiling questioned under GDPR and EU AI Act.

How to Cite

DCI AI Hub (2026). 'Försäkringskassan ML Risk Scoring for Fraud Investigations (Sweden — Suspended)', AI Hub AI Tracker, case SWE-001. Digital Convergence Initiative. Available at: https://socialprotectionai.org/use-case/SWE-001 [Accessed: 1 April 2026].

Change History

Updated 31 Mar 2026, 06:35
by system (system)
Created 30 Mar 2026, 08:41
by v2-import (import)