Machine-Learning Early Warning System for Income Support Recipients (Australia)

University of Exeter Business School; RMIT University; University of Melbourne (ARC grant administrator). Department of Social Services (DSS) / Services Australia (Centrelink) as custodians of the DOMINO administrative data. IZA Institute of Labor Economics published the working paper but was not a research partner.

At a Glance

What it does: Prediction (including forecasting); vulnerability, needs and risk assessment, including predictive analytics

Who runs it: University of Exeter Business School; RMIT University; University of Melbourne (ARC grant administrator). Department of Social Services (DSS) / Services Australia (Centrelink) as custodians of the DOMINO administrative data. IZA Institute of Labor Economics published the working paper but was not a research partner.

Programme: Centrelink Income Support System (research prototype for ML-based recertification targeting)

Confidence: Confirmed

Deployment Status: Design & Development Phase

Key Risks: Model-related risks

Key Outcomes: ML ensemble achieves out-of-sample R-squared exceeding 76%, representing at least a 22% improvement (approximately 14-percentage-point increase in R-squared) compared to the best OLS heuristic model.

Source Quality: 5 sources (Working paper / technical note, Academic journal article, Dataset / database, News article / media)

The Machine-Learning Predictive Recertification Targeting system is a research prototype developed by Dario Sansone of the University of Exeter Business School and Anna Zhu of RMIT University, using Australian government administrative data to predict the intensity and duration of income support receipt among welfare enrollees in the Centrelink social security system. The system was designed to forecast the proportion of time each individual would remain on income support over a subsequent four-year horizon, with the explicit aim of identifying individuals at highest risk of long-term welfare dependency so that early intervention programmes and recertification review processes could be targeted more effectively (Sansone and Zhu, 2021, IZA DP 14377, p. 1).

The research uses the DOMINO (Data Over Multiple Individual Occurrences) longitudinal administrative dataset, which is maintained by the Australian Department of Social Services and captures individuals' interactions with the welfare system without identifiable information such as names and addresses (DSS Aristotle Metadata Registry). DOMINO contains daily-frequency records of income support receipt status from 2000 onwards, covering over 32 million persons who had any contact with the Centrelink system during that period (Sansone and Zhu, 2021, p. 5). The data are high quality because the government relies on this exact information to determine eligibility for payments: an individual's payment amount is a direct function of their income, wealth, savings, household structure, and other socio-economic factors, and these data are reconciled with Australian Tax Office records to ensure accuracy (Sansone and Zhu, 2021, p. 3). Recipients' eligibility for payments is assessed regularly, and recipients are required to report changes such as to relationship status, earnings, or living conditions within 14 days of the change (Sansone and Zhu, 2021, p. 10). The dataset includes information on demographics (sex, age, country of birth, and Indigenous status), household structure, government benefit receipt history by type, personal relationships, employment and underemployment, work instability, location and residential mobility, housing, education, income, and wealth (Sansone and Zhu, 2021, p. 13). In total, approximately 1,800 possible predictive features were constructed from these administrative records (Sansone and Zhu, 2021, p. 15).

The research was funded through Australian Research Council Linkage Project LP170100472 (Sansone and Zhu, 2021, acknowledgements footnote, p. 3). The analytical sample covers the period 2014 to 2018, using 2014 as the base year for predictive features and measuring welfare receipt intensity from 2015 to 2018. A 1% random sample of 50,615 individuals aged 15 to 66 was drawn from the full population for computational reasons (Sansone and Zhu, 2021, pp. 10-11).
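As a rough illustration of the outcome being predicted, the sketch below computes the share of days in the 2015 to 2018 window on which a person received income support, given a list of receipt spells. The spell format, the function name and the example dates are assumptions made for illustration; the actual DOMINO record layout and the authors' exact construction may differ.

```python
from datetime import date

# Outcome window taken from the text: receipt intensity is measured over 2015-2018.
WINDOW_START = date(2015, 1, 1)
WINDOW_END = date(2018, 12, 31)


def share_of_time_on_support(spells, start=WINDOW_START, end=WINDOW_END):
    """Return the fraction of days in [start, end] covered by any spell.

    `spells` is a list of (first_day, last_day) tuples, both inclusive
    (a hypothetical format). Overlapping spells are merged so that no
    day is counted twice.
    """
    window_days = (end - start).days + 1
    # Clip each spell to the window and drop spells entirely outside it.
    clipped = sorted(
        (max(s, start), min(e, end))
        for s, e in spells
        if s <= end and e >= start
    )
    covered = 0
    current_start = current_end = None
    for s, e in clipped:
        if current_end is None or s > current_end:
            # Close the previous merged interval before starting a new one.
            if current_end is not None:
                covered += (current_end - current_start).days + 1
            current_start, current_end = s, e
        else:
            current_end = max(current_end, e)
    if current_end is not None:
        covered += (current_end - current_start).days + 1
    return covered / window_days


# Hypothetical person: on support from mid-2014 to end-2015, then all of 2017.
example_spells = [
    (date(2014, 6, 1), date(2015, 12, 31)),
    (date(2017, 1, 1), date(2017, 12, 31)),
]
example_share = share_of_time_on_support(example_spells)
```

Days before 2015 are ignored because 2014 serves only as the base year for predictive features.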

The technical approach uses an ensemble of off-the-shelf classical machine-learning algorithms: LASSO (a regularised regression method), Support Vector Regression, and Boosting (gradient-boosted trees allowing up to 6-way interactions between input variables). The data were split into an 80% training sample and a 20% hold-out test sample for out-of-sample performance evaluation (Sansone and Zhu, 2021, pp. 15-16). The ensemble method, which combines predictions from all three algorithms using weighted linear regression, achieved the best performance overall (Sansone and Zhu, 2021, p. 18).
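The combination step can be sketched as an ordinary least-squares regression of the observed outcome on each base learner's predictions. The version below is a minimal pure-Python illustration under stated assumptions: the prediction values are invented, the inclusion of an intercept is an assumption, and the function names are hypothetical. It does not reproduce the authors' implementation or their base models.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: base predictions are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_stacking_weights(base_preds, y):
    """Least-squares fit of y on an intercept plus each learner's predictions.

    `base_preds` holds one list of predictions per base learner.
    Returns [intercept, w_1, ..., w_k].
    """
    rows = [[1.0] + list(p) for p in zip(*base_preds)]
    k = len(rows[0])
    # Normal equations: (X'X) w = X'y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def stack_predict(weights, base_preds):
    """Combine base-learner predictions using fitted stacking weights."""
    return [
        weights[0] + sum(w * p for w, p in zip(weights[1:], row))
        for row in zip(*base_preds)
    ]


# Invented predictions of the share of time on support from three learners.
lasso = [0.1, 0.4, 0.7, 0.9, 0.2]
svr = [0.2, 0.3, 0.8, 0.7, 0.1]
boosting = [0.0, 0.5, 0.6, 1.0, 0.3]
observed = [0.05, 0.45, 0.70, 0.95, 0.25]

weights = fit_stacking_weights([lasso, svr, boosting], observed)
combined = stack_predict(weights, [lasso, svr, boosting])
```

In practice the weights would be fitted on predictions for observations the base learners did not train on, so that the combination does not simply reward the most overfitted model.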

In terms of performance, the machine-learning ensemble achieved an out-of-sample R-squared exceeding 76%, representing at least a 22% improvement (approximately 14-percentage-point increase in R-squared) compared to the best-performing OLS heuristic model and standard early warning systems currently in use (Sansone and Zhu, 2021, p. 18; University of Exeter, 2021). The authors conducted back-of-the-envelope calculations showing that individuals identified by the ML model as long-term recipients accrued an additional welfare cost of approximately AUD 0.99 billion compared with comparably sized groups identified under the existing actuarial profiling approach used in the government's Try, Test and Learn programme, representing roughly 10% of total annual unemployment benefit expenditure (Sansone and Zhu, 2021, p. 18). The ML algorithms also identified new powerful predictors not commonly associated with long-term welfare receipt, including annual income variability, residential relocation frequency, and failure to meet mutual obligation criteria (Austaxpolicy, 2021).
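The two ways of stating the gain (percentage points versus relative improvement) are easy to confuse, so the arithmetic is worked through below. The sketch uses the rounded figures quoted above (an R-squared of 0.76 and a gap of 0.14); the baseline of roughly 0.62 is implied by those two numbers rather than quoted directly, and the function names are illustrative.

```python
def r_squared(y_true, y_pred):
    """R-squared on held-out data: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def relative_improvement(r2_new, r2_baseline):
    """Gain of the new model expressed as a fraction of the baseline R-squared."""
    return (r2_new - r2_baseline) / r2_baseline


# Rounded figures from the text; the baseline is implied, not quoted.
r2_ml = 0.76
point_gain = 0.14
r2_ols = r2_ml - point_gain

gain = relative_improvement(r2_ml, r2_ols)
print(f"Baseline R-squared: {r2_ols:.2f}")
print(f"Relative improvement: {gain:.1%}")
```

A 14-point gap on a baseline near 0.62 works out to a relative improvement of a little over 22%, which is consistent with the "at least 22%" figure.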

The relevance to recertification and exit decisions lies in the system's ability to predict which individuals are most likely to remain on income support for extended periods, thereby enabling targeted recertification scheduling and resource allocation for exit-focused interventions. The paper explicitly notes that Australia's income support payments are strictly means-tested with regular eligibility assessment, and that recipients who fail to comply with mutual obligation requirements such as activity tests and job search can face sanctions including loss of payments (Sansone and Zhu, 2021, pp. 7-9). The ML predictions could inform which recipients receive more intensive casework review and which can be managed with lighter-touch recertification processes.
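One way such predictions might feed into review scheduling is sketched below: rank individuals by predicted share of time on income support and flag the top fraction for more intensive review. The 10% default cut-off, the tier labels and the function name are hypothetical choices for illustration, not something the paper specifies, and the output is meant as an advisory flag for caseworkers rather than a decision.

```python
def assign_review_tiers(predicted_share, intensive_fraction=0.10):
    """Flag the highest-predicted cases for more intensive caseworker review.

    `predicted_share` maps a person identifier to the predicted share of
    time on income support. `intensive_fraction` is a hypothetical policy
    parameter: the fraction of cases routed to intensive review.
    """
    ranked = sorted(predicted_share, key=predicted_share.get, reverse=True)
    # Flag at least one case whenever there is anyone to rank.
    n_intensive = max(1, round(len(ranked) * intensive_fraction)) if ranked else 0
    return {
        pid: ("intensive" if i < n_intensive else "light-touch")
        for i, pid in enumerate(ranked)
    }


# Invented predictions for five de-identified records.
predictions = {"a": 0.9, "b": 0.1, "c": 0.5, "d": 0.7, "e": 0.3}
tiers = assign_review_tiers(predictions, intensive_fraction=0.4)
```

Ranking by a fixed fraction keeps caseworker workload predictable; a fixed score threshold would instead let the number of flagged cases vary with economic conditions.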

The human oversight model envisaged by the researchers is explicitly advisory and complementary to caseworker expertise. The authors state that the algorithms should not replace human expertise but rather act as its complement, allowing caseworkers to focus their attention and time providing personalised service and targeting appropriate support to individuals that the algorithm identifies as most at risk (University of Exeter, 2021; IZA Newsroom, 2021). The authors also advocate for a system to monitor and audit automated decision-making, referencing the Australian Robodebt scandal as a cautionary example of the potential harms from automated welfare systems (Austaxpolicy, 2021). The predictive models can reduce conscious and unconscious biases common in human decision-making by avoiding arbitrary selection of predictors or subgroups, and have the potential to prevent cream-skimming practices where employment service providers target individuals with easier-to-achieve outcomes (Sansone and Zhu, 2021, p. 6).

The authors acknowledge limitations of the predictive approach: prediction is only a first step, and policymakers additionally require evidence on the effectiveness of specific interventions, which can only be obtained through causal methodologies such as randomised controlled trials rather than predictive modelling alone (IZA Newsroom, 2021; Sansone and Zhu, 2021, p. 7). Furthermore, the ML algorithms would need to be retrained using data from economic downturns to ensure continued accuracy during recessionary periods (Sansone and Zhu, 2021, p. 25). The authors also note persistent scepticism about the accuracy of algorithmic systems and their potential to reinforce bias (IZA Newsroom, 2021). As of the most recent verification, this system remains a research prototype and has not been operationally deployed within Services Australia or any other Australian government agency.

Classifications follow the DCI AI Hub Taxonomy.

AI Capabilities

Use Cases

Social Protection Functions

Implementation/delivery chain

Social assistance

Programme Name: Centrelink Income Support System (research prototype for ML-based recertification targeting)

Other

Implementation/delivery chain

Programme Description: Australia's Centrelink income support system administered by the Department of Social Services (DSS) / Services Australia, covering six main categories of means-tested payments: student payments, unemployment payments, parenting payments, disability payment, carer payment, and age pension. The ML model was developed as a research prototype to predict long-term income support receipt intensity and inform targeted recertification scheduling.

Classical ML

Model Selection and Training

Developed in-house

Not documented

Informal assessment

Risk Dimensions

Data-related risks

Governance and institutional oversight risks

Model-related risks

Operational and system integration risks

Impact Dimensions

Autonomy, human dignity and due process

Equality, non-discrimination, fairness and inclusion

Privacy and data security

Data minimisation controls
Human oversight protocol

Category	Sensitivity	Cross-System Linkage	Availability	Key Constraints
Administrative data from other sectors	Special category	Links data across multiple systems	Currently available and used	Linked records from Australian Tax Office (income reconciliation), employment, and education systems; used for eligibility verification and feature construction; approximately 1,800 predictive features derived
Beneficiary registries and MIS	Special category	Links data across multiple systems	Currently available and used	DOMINO longitudinal dataset maintained by DSS; daily-frequency income support receipt records from 2000 onwards for ~32 million persons; de-identified; access requires formal approval through DSS data integration portal
Social registries	Personal	Links data across multiple systems	Currently available and used	Demographic and household composition data from Centrelink registration: sex, age, country of birth, Indigenous status, household structure, marital status, migration status, residential location

Sansone, D. and Zhu, A. (2021) 'Using Machine Learning to Create an Early Warning System for Welfare Recipients', IZA Discussion Paper No. 14377. Bonn: Institute of Labor Economics. [Working paper / technical note]

Sansone, D. and Zhu, A. (2021) 'Machine Learning in the Welfare System', Austaxpolicy: The Tax and Transfer Policy Blog, 24 June. Available at: https://www.austaxpolicy.com/machine-learning-in-the-welfare-system/ (Accessed: 23 March 2026). [Working paper / technical note]

Sansone, D. and Zhu, A. (2023) 'Using Machine Learning to Create an Early Warning System for Welfare Recipients', Oxford Bulletin of Economics and Statistics, 85(5), pp. 959-992. doi:10.1111/obes.12550. [Academic journal article]

Department of Social Services (2017) DOMINO (Data Over Multiple Individual Occurrences) - Dataset Standard Release, External Analytical Version. Canberra: Australian Government Department of Social Services. [Dataset / database]

IZA Institute of Labor Economics (2021) 'Machine Learning in the Welfare System', IZA Newsroom, 23 June. Available at: https://newsroom.iza.org/en/archive/research/machine-learning-in-the-welfare-system/ (Accessed: 23 March 2026). [News article / media]

Design & Development Phase

2018

1% random sample of ~5 million working-age Centrelink registrants (50,615 individuals) from a total population of ~32 million persons in DOMINO; research dataset only, not operational coverage

Australian Research Council Linkage Project LP170100472 (AUD 320,000; 2 July 2018 to 31 December 2024)

No commercial vendor identified. Models developed by academic researchers (Dario Sansone, University of Exeter; Anna Zhu, RMIT University) using off-the-shelf ML algorithms (LASSO, SVR, Boosting). No operational deployment platform documented.

Outcomes / Results: ML ensemble achieves out-of-sample R-squared exceeding 76%, representing at least a 22% improvement (approximately 14-percentage-point increase in R-squared) compared to the best OLS heuristic model. Individuals identified by ML accrue approximately AUD 0.99 billion more in welfare costs than those identified under existing actuarial profiling (Try, Test and Learn programme), representing roughly 10% of total annual unemployment benefit expenditure. Novel predictors identified include annual income variability, residential relocation frequency, and failure to meet mutual obligation criteria. Approach is low-cost as it uses administrative data already available to caseworkers.

Challenges: Prediction is only a first step; evidence on intervention effectiveness requires causal methods such as RCTs. ML algorithms would need retraining with economic downturn data to maintain accuracy during recessions. Persistent scepticism about the accuracy of algorithmic systems and their potential to reinforce bias. No operational deployment documented despite research completion.

How to Cite

DCI AI Hub (2026). 'Machine-Learning Early Warning System for Income Support Recipients (Australia — Research Prototype)', AI Hub AI Tracker, case AUS-001. Digital Convergence Initiative. Available at: https://socialprotectionai.org/use-case/AUS-001 [Accessed: 17 May 2026].

Change History

Created 30 Mar 2026, 08:38

by v2-import (import)
