The Instituto Nacional de la Seguridad Social (INSS), Spain's national social security institute, runs two XGBoost (gradient-boosted decision tree) machine learning models every day to assess sick leave (incapacidad temporal) cases. The system generates a numerical score between 0 and 1 for each worker currently on sick leave, estimating the likelihood that the worker is ready to return to work. These scores are used to prioritise which cases medical inspectors review first, effectively creating a 'digital waiting list' that controls the order of medical inspection appointments.
The scoring system operates on a four-tier scale: 0.00–0.30 indicates slow recovery expected (maintain leave), 0.31–0.60 indicates favourable progress (standard review scheduling), 0.61–0.80 indicates notable improvement (priority appointment), and 0.81–1.00 indicates imminent clearance (possible end of leave). The model draws on a range of input variables including gender, age, place of residence (which carries three times the statistical weight of specific medical diagnosis), medical diagnoses, duration of current leave, patient medical history, prior leave history, case type, medical reports from public health services, reports from mutual insurance companies (mutuas), and inspector assessments recorded in the Atrium internal application.
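The banding and prioritisation described above can be illustrated with a minimal sketch. It is not the INSS implementation: the function names, the case identifiers, and the treatment of scores that fall between the published bounds (for example 0.305, assigned here to the higher band) are assumptions made for illustration only.

```python
from bisect import bisect_left

# Inclusive upper bounds of the four bands, as listed in the text.
UPPER_BOUNDS = [0.30, 0.60, 0.80, 1.00]
LABELS = [
    "slow recovery expected (maintain leave)",
    "favourable progress (standard review scheduling)",
    "notable improvement (priority appointment)",
    "imminent clearance (possible end of leave)",
]


def tier_for_score(score: float) -> str:
    """Map a score in [0, 1] to its band label.

    Assumption: a score lying between two published bounds
    (e.g. 0.305) is placed in the higher band.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be within [0, 1], got {score}")
    return LABELS[bisect_left(UPPER_BOUNDS, score)]


def review_queue(scores_by_case: dict[str, float]) -> list[str]:
    """Order hypothetical case IDs so the highest scores are reviewed first."""
    return sorted(scores_by_case, key=scores_by_case.get, reverse=True)


if __name__ == "__main__":
    # Hypothetical cases, invented purely for the example.
    cases = {"case-A": 0.25, "case-B": 0.85, "case-C": 0.61}
    for case_id in review_queue(cases):
        print(case_id, cases[case_id], tier_for_score(cases[case_id]))
```

Sorting by descending score is the simplest reading of a 'digital waiting list'; the real system may combine the score with other scheduling criteria that are not publicly documented.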
The system was built by SAS (a US-based analytics software company) and implemented by ViewNext (a Spanish subsidiary of IBM), at a cost of at least EUR 1 million according to procurement tender documents. It was deployed in 2018 and has been part of inspector workflows since then. The current model version has run unchanged since November 2020.
Critically, the system operated in secret for approximately five years (2018–2023), with no public disclosure of its existence or functioning. It was exposed in April 2023 through an investigation by Lighthouse Reports and El Confidencial, part of the cross-border 'Suspicion Machines' investigative series that also examined algorithmic welfare systems in the Netherlands (SyRI), Serbia, and other countries. Following exposure, the Spanish Ministry of Inclusion denied journalists' transparency requests, arguing that disclosure would 'compromise essential public interests' and affect the system's 'efficacy'.
Internal performance evaluations have revealed significant quality concerns. The system has a documented internal validation error rate of 15.4%, meaning it fails in roughly one out of every six to seven cases. In the first half of 2025, only 35.48% of algorithmically selected workers received medical discharge, compared to 41.48% for cases selected manually by inspectors, so in that period human selection outperformed the algorithm by six percentage points. Senior INSS officials have conceded the algorithms are 'not accurate', and expert Ana Valdivia of the Oxford Internet Institute described the false positive performance as 'poor' and 'unbalanced'. Medical inspectors working with the system daily have stated they 'are not able to explain what it is'. The system has been described as 'rendered effectively useless' due to chronic underfunding and inspector staff shortages across the INSS inspection corps.
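The arithmetic behind these performance figures can be checked with a short sketch. The three input values are taken from the text; the variable names and the derived comparisons (cases per error, percentage-point gap, relative gap) are illustrative additions rather than figures reported by the INSS.

```python
# Figures reported in the text.
error_rate = 0.154          # internal validation error rate
algo_discharge = 0.3548     # discharge rate, algorithm-selected cases (H1 2025)
manual_discharge = 0.4148   # discharge rate, inspector-selected cases (H1 2025)

# A 15.4% error rate corresponds to one failure in about every 6.5 cases.
cases_per_error = 1 / error_rate

# Absolute gap between manual and algorithmic selection, in percentage points.
gap_points = (manual_discharge - algo_discharge) * 100

# Relative gap: how much more often manually selected cases ended in discharge.
relative_gap = (manual_discharge - algo_discharge) / algo_discharge

print(f"one error in every {cases_per_error:.1f} cases")   # 6.5
print(f"gap: {gap_points:.2f} percentage points")          # 6.00
print(f"relative gap: {relative_gap:.1%}")                 # 16.9%
```

In relative terms, inspector-selected cases ended in discharge about 17% more often than algorithm-selected ones over that half-year.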
The incapacidad temporal programme handled by this system represents a major fiscal commitment: national spending in 2024 was EUR 16.5 billion (approximately 1.8% of GDP), up 60% since 2017. An average of 1.6 million workers are on sick leave on any given day across Spain. The system would be classified as high-risk under the EU AI Act (Annex III: systems determining access to public benefits), which requires conformity assessments, transparency obligations, and human oversight provisions. Compliance with these forthcoming requirements has not been verified.