DCI AI Hub — AI Tracker socialprotectionai.org/use-case/AUS-001

AUS-001 Exported 4 July 2026

Machine-Learning Early Warning System for Income Support Recipients (Australia — Research Prototype)

Country Australia

Deployment Status Design & Development Phase

Confidence Confirmed

Implementing Agency University of Exeter Business School; RMIT University; University of Melbourne (ARC grant administrator). The Department of Social Services (DSS) and Services Australia (Centrelink) are custodians of the DOMINO administrative data. The IZA Institute of Labor Economics published the working paper but was not a research partner.

Overview

The Machine-Learning Predictive Recertification Targeting system is a research prototype developed by Dario Sansone (University of Exeter Business School) and Anna Zhu (RMIT University). It uses Australian government administrative data to predict the intensity and duration of income support receipt among people enrolled in the Centrelink social security system. The model forecasts the proportion of time each individual will remain on income support over the subsequent four years. Its explicit aim is to identify individuals at highest risk of long-term welfare dependency so that early intervention programmes and recertification reviews can be targeted more effectively (Sansone and Zhu, 2021, IZA DP 14377, p. 1).

The research uses the DOMINO (Data Over Multiple Individual Occurrences) longitudinal administrative dataset, which is maintained by the Australian Department of Social Services and captures individuals' interactions with the welfare system without identifiable information such as names and addresses (DSS Aristotle Metadata Registry). DOMINO contains daily-frequency records of income support receipt status from 2000 onwards, covering over 32 million persons who had any contact with the Centrelink system during that period (Sansone and Zhu, 2021, p. 5). The data are high quality because the government relies on this exact information to determine eligibility for payments: an individual's payment amount is a direct function of their income, wealth, savings, household structure, and other socio-economic factors, and these data are reconciled with Australian Tax Office records to ensure accuracy (Sansone and Zhu, 2021, p. 3). Recipients' eligibility for payments is assessed regularly, and recipients are required to report changes such as to relationship status, earnings, or living conditions within 14 days of the change (Sansone and Zhu, 2021, p. 10). The dataset includes information on demographics (sex, age, country of birth, and Indigenous status), household structure, government benefit receipt history by type, personal relationships, employment and underemployment, work instability, location and residential mobility, housing, education, income, and wealth (Sansone and Zhu, 2021, p. 13). In total, approximately 1,800 possible predictive features were constructed from these administrative records (Sansone and Zhu, 2021, p. 15).

The research was funded through Australian Research Council Linkage Project LP170100472 (Sansone and Zhu, 2021, acknowledgements footnote, p. 3). The analytical sample covers 2014 to 2018: 2014 is the base year for predictive features, and welfare receipt intensity is measured from 2015 to 2018. For computational reasons, a 1% random sample of 50,615 individuals aged 15 to 66 was drawn from the full population (Sansone and Zhu, 2021, pp. 10-11).
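The outcome described above, the share of the 2015 to 2018 window that a person spends on income support, can be illustrated with a minimal sketch. The spell layout (inclusive start and end dates per person) and the function name receipt_intensity are assumptions made for illustration; the source does not document how DOMINO's daily records are stored or how the authors computed the measure.

```python
from datetime import date, timedelta

# Outcome window taken from the text: receipt intensity is measured 2015-2018.
WINDOW_START = date(2015, 1, 1)
WINDOW_END = date(2018, 12, 31)


def receipt_intensity(spells, start=WINDOW_START, end=WINDOW_END):
    """Return the share of days in [start, end] covered by any benefit spell.

    `spells` is a list of (first_day, last_day) tuples, both inclusive.
    This layout is a hypothetical simplification of daily receipt records.
    """
    window_days = (end - start).days + 1
    # Clip each spell to the window and drop spells entirely outside it.
    clipped = sorted(
        (max(s, start), min(e, end)) for s, e in spells if s <= end and e >= start
    )
    covered = 0
    current_start, current_end = None, None
    for s, e in clipped:
        if current_end is None or s > current_end + timedelta(days=1):
            # A gap separates this spell from the previous run: close the run.
            if current_end is not None:
                covered += (current_end - current_start).days + 1
            current_start, current_end = s, e
        else:
            # Overlapping or adjacent spell: extend the current run.
            current_end = max(current_end, e)
    if current_end is not None:
        covered += (current_end - current_start).days + 1
    return covered / window_days


# Two overlapping spells that together cover all of 2015 within the window.
example = [
    (date(2014, 7, 1), date(2015, 6, 30)),
    (date(2015, 6, 1), date(2015, 12, 31)),
]
print(receipt_intensity(example))
```

Overlapping spells are merged before counting so that a day on two payment types is not counted twice; whether the authors treat concurrent payments this way is not stated in the source.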

The technical approach uses an ensemble of off-the-shelf classical machine-learning algorithms: LASSO (a regularised regression method), Support Vector Regression, and Boosting (gradient-boosted trees allowing up to 6-way interactions between input variables). The data were split into an 80% training sample and a 20% hold-out test sample for out-of-sample performance evaluation (Sansone and Zhu, 2021, pp. 15-16). The ensemble method, which combines predictions from all three algorithms using weighted linear regression, achieved the best performance overall (Sansone and Zhu, 2021, p. 18).
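The combination step described above can be sketched as follows: an 80/20 split of row indices, then an ordinary least squares regression of the outcome on the three base models' predictions to obtain ensemble weights. This is a sketch under stated assumptions, not the authors' implementation. The base predictions are taken as given (fitting LASSO, SVR and boosting would in practice be done with a machine-learning library), the intercept term is an assumption, and all function names are hypothetical.

```python
import random


def split_indices(n, test_share=0.2, seed=0):
    """Shuffle row indices and split them into training and hold-out sets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_share)
    return idx[n_test:], idx[:n_test]


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_stacking_weights(base_preds, y):
    """Regress y on an intercept plus each base model's predictions (OLS).

    base_preds: one list of predictions per base model, each as long as y.
    Returns [intercept, w_1, ..., w_k].
    """
    rows = [[1.0] + [p[i] for p in base_preds] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def stack_predict(weights, base_row):
    """Ensemble prediction for one person from the base models' predictions."""
    return weights[0] + sum(w * p for w, p in zip(weights[1:], base_row))


# Synthetic illustration only: three made-up prediction vectors and a target
# built from known weights, so the fit can be checked against them.
rng = random.Random(1)
lasso_pred = [rng.random() for _ in range(50)]
svr_pred = [rng.random() for _ in range(50)]
boost_pred = [rng.random() for _ in range(50)]
target = [0.1 + 0.5 * a + 0.3 * b + 0.2 * c
          for a, b, c in zip(lasso_pred, svr_pred, boost_pred)]

weights = fit_stacking_weights([lasso_pred, svr_pred, boost_pred], target)
train_idx, test_idx = split_indices(len(target))
print([round(w, 3) for w in weights], len(train_idx), len(test_idx))
```

In a real evaluation the weights would be estimated on training rows only and the ensemble scored on the hold-out rows; the synthetic data here merely confirms that the regression recovers known weights.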

In terms of performance, the machine-learning ensemble achieved an out-of-sample R-squared exceeding 76%, at least a 22% relative improvement (an increase of approximately 14 percentage points in R-squared) over the best-performing OLS heuristic model and standard early warning systems currently in use (Sansone and Zhu, 2021, p. 18; University of Exeter, 2021). The authors' back-of-the-envelope calculations show that individuals identified by the ML model as long-term recipients accrued approximately AUD 0.99 billion more in welfare costs than comparably sized groups identified under the existing actuarial profiling approach used in the government's Try, Test and Learn programme, a difference equal to roughly 10% of total annual unemployment benefit expenditure (Sansone and Zhu, 2021, p. 18). The ML algorithms also identified powerful new predictors not commonly associated with long-term welfare receipt, including annual income variability, residential relocation frequency, and failure to meet mutual obligation criteria (Austaxpolicy, 2021).
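The relationship between the two improvement figures can be checked with a short calculation. The sketch below defines out-of-sample R-squared in the usual way and computes the relative gain. The baseline value of 0.62 is not quoted in this record; it is an assumption implied by subtracting the reported 14 percentage points from 76%.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def relative_improvement(r2_new, r2_baseline):
    """Gain in R-squared expressed as a fraction of the baseline R-squared."""
    return (r2_new - r2_baseline) / r2_baseline


ml_r2 = 0.76                 # reported lower bound for the ML ensemble
baseline_r2 = ml_r2 - 0.14   # assumption: implied by the 14-point gap
print(round(relative_improvement(ml_r2, baseline_r2), 3))
```

Under these assumed inputs the relative gain is about 0.226, which is consistent with the reported "at least 22%".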

The relevance to recertification and exit decisions lies in the system's ability to predict which individuals are most likely to remain on income support for extended periods, thereby enabling targeted recertification scheduling and resource allocation for exit-focused interventions. The paper explicitly notes that Australia's income support payments are strictly means-tested with regular eligibility assessment, and that recipients who fail to comply with mutual obligation requirements such as activity tests and job search can face sanctions including loss of payments (Sansone and Zhu, 2021, pp. 7-9). The ML predictions could inform which recipients receive more intensive casework review and which can be managed with lighter-touch recertification processes.
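How predictions might feed review scheduling can be sketched as a simple ranking rule. This is purely illustrative: the source does not specify any triage rule, the 10% share assigned to intensive review is a hypothetical parameter, and the output is meant as advice to caseworkers rather than an automated decision.

```python
def assign_review_tier(predicted_intensity, intensive_share=0.1):
    """Rank people by predicted receipt intensity and label review tiers.

    predicted_intensity: dict mapping a person id to a prediction in [0, 1].
    intensive_share: hypothetical share of people flagged for intensive review.
    Returns a dict mapping each id to 'intensive' or 'light-touch'.
    """
    ranked = sorted(predicted_intensity, key=predicted_intensity.get, reverse=True)
    n_intensive = max(1, round(len(ranked) * intensive_share)) if ranked else 0
    return {
        pid: ("intensive" if rank < n_intensive else "light-touch")
        for rank, pid in enumerate(ranked)
    }


# Made-up predictions for five people; ids and values are illustrative only.
predictions = {"a": 0.90, "b": 0.10, "c": 0.50, "d": 0.20, "e": 0.95}
tiers = assign_review_tier(predictions, intensive_share=0.4)
print(tiers)
```

A fixed share keeps the intensive caseload predictable; a threshold on the predicted value would be an alternative design, at the cost of a caseload that varies with economic conditions.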

The human oversight model envisaged by the researchers is explicitly advisory and complementary to caseworker expertise. The authors state that the algorithms should complement rather than replace human expertise, freeing caseworkers to spend their time on personalised service and on targeting appropriate support to the individuals the algorithm identifies as most at risk (University of Exeter, 2021; IZA Newsroom, 2021). They also advocate a system to monitor and audit automated decision-making, citing the Australian Robodebt scandal as a cautionary example of the harm automated welfare systems can cause (Austaxpolicy, 2021). They further argue that predictive models can reduce the conscious and unconscious biases common in human decision-making by avoiding arbitrary selection of predictors or subgroups, and could prevent cream-skimming, in which employment service providers target individuals with easier-to-achieve outcomes (Sansone and Zhu, 2021, p. 6).

The authors acknowledge limitations of the predictive approach. Prediction is only a first step: policymakers also need evidence on the effectiveness of specific interventions, which can be obtained only through causal methods such as randomised controlled trials, not through predictive modelling alone (IZA Newsroom, 2021; Sansone and Zhu, 2021, p. 7). The ML algorithms would also need to be retrained on data from economic downturns to remain accurate during recessions (Sansone and Zhu, 2021, p. 25). The authors further note persistent scepticism about algorithmic systems, driven by concerns over accuracy and the reinforcement of bias (IZA Newsroom, 2021). As of the most recent verification, this system remains a research prototype and has not been operationally deployed within Services Australia or any other Australian government agency.

Classification

AI Capabilities

Prediction (including forecasting) (primary); Clustering (similarity and grouping)

Use Cases

Vulnerability, needs and risk assessment, including predictive analytics (primary); Decision support for eligibility and benefits

Social Protection Functions

Implementation/delivery chain: Assessment of needs/conditions (primary); Implementation/delivery chain: Case management; Implementation/delivery chain: Profiling, job matching and support services

SP Pillar (Primary)

Social assistance

Programme Details

Programme Name	Centrelink Income Support System (research prototype for ML-based recertification targeting)
Programme Type	Other
System Level	Implementation/delivery chain

Australia's Centrelink income support system administered by the Department of Social Services (DSS) / Services Australia, covering six main categories of means-tested payments: student payments, unemployment payments, parenting payments, disability payment, carer payment, and age pension. The ML model was developed as a research prototype to predict long-term income support receipt intensity and inform targeted recertification scheduling.

Implementation Details

Implementation Type	Classical ML
Lifecycle Stage	Model Selection and Training
Model Provenance	Developed in-house
Compute Environment	Not documented
Sovereignty Quadrant	Not assessed
Data Residency	Not documented
Cross-Border Transfer	Not documented

Risk & Oversight

Decision Criticality	Moderate
Human Oversight	HITL
Development Process	Fully in-house
Highest Risk Category	Model-related risks
Risk Assessment Status	Informal assessment

Risk Dimensions

Data-related risks

Data or concept drift; Representation bias

Governance and institutional oversight risks

Purpose limitation failure; Weak documentation or auditability

Model-related risks

Opacity or limited explainability; Shortcut learning and proxy reliance; Subgroup bias

Operational and system integration risks

Automation complacency; Inadequate real-world validation

Impact Dimensions

Autonomy, human dignity and due process

Opaque or unexplained decision; Psychological stress, stigma or dignity harm

Equality, non-discrimination, fairness and inclusion

Discriminatory outcome; Reinforcement of structural inequity; Systematic exclusion from benefits or services

Privacy and data security

Disproportionate surveillance or profiling

Safeguards

Data minimisation controls; Human oversight protocol

Deployment & Outcomes

Deployment Status	Design & Development Phase
Year Initiated	2018
Scale / Coverage	1% random sample of ~5 million working-age Centrelink registrants (50,615 individuals) from a total population of ~32 million persons in DOMINO; research dataset only, not operational coverage
Funding Source	Australian Research Council Linkage Project LP170100472 (AUD 320,000; 2 July 2018 to 31 December 2024)
Technical Partners	No commercial vendor identified. Models developed by academic researchers (Dario Sansone, University of Exeter; Anna Zhu, RMIT University) using off-the-shelf ML algorithms (LASSO, SVR, Boosting). No operational deployment platform documented.

Outcomes / Results

ML ensemble achieves out-of-sample R-squared exceeding 76%, representing at least a 22% improvement (approximately 14-percentage-point increase in R-squared) compared to the best OLS heuristic model. Individuals identified by ML accrue approximately AUD 0.99 billion more in welfare costs than those identified under existing actuarial profiling (Try, Test and Learn programme), representing roughly 10% of total annual unemployment benefit expenditure. Novel predictors identified include annual income variability, residential relocation frequency, and failure to meet mutual obligation criteria. Approach is low-cost as it uses administrative data already available to caseworkers.

Challenges

Prediction is only a first step; evidence on intervention effectiveness requires causal methods such as RCTs. ML algorithms would need retraining with economic downturn data to maintain accuracy during recessions. Persistent scepticism about algorithmic systems, driven by concerns over accuracy and bias reinforcement. No operational deployment documented despite research completion.

Sources

SRC-001-AUS-001 Sansone, D. and Zhu, A. (2021) 'Using Machine Learning to Create an Early Warning System for Welfare Recipients', IZA Discussion Paper No. 14377. Bonn: Institute of Labor Economics.
https://docs.iza.org/dp14377.pdf
SRC-005-AUS-001 Sansone, D. and Zhu, A. (2021) 'Machine Learning in the Welfare System', Austaxpolicy: The Tax and Transfer Policy Blog, 24 June. Available at: https://www.austaxpolicy.com/machine-learning-in-the-welfare-system/ (Accessed: 23 March 2026).
https://www.austaxpolicy.com/machine-learning-in-the-welfare-system/
SRC-002-AUS-001 Sansone, D. and Zhu, A. (2023) 'Using Machine Learning to Create an Early Warning System for Welfare Recipients', Oxford Bulletin of Economics and Statistics, 85(5), pp. 959-992. doi:10.1111/obes.12550.
https://onlinelibrary.wiley.com/doi/10.1111/obes.12550
SRC-003-AUS-001 Department of Social Services (2017) DOMINO (Data Over Multiple Individual Occurrences) - Dataset Standard Release, External Analytical Version. Canberra: Australian Government Department of Social Services.
https://dss.aristotlecloud.io/item/1942/dataset/domino-dataset-standard-release-external-version-f
SRC-004-AUS-001 IZA Institute of Labor Economics (2021) 'Machine Learning in the Welfare System', IZA Newsroom, 23 June. Available at: https://newsroom.iza.org/en/archive/research/machine-learning-in-the-welfare-system/ (Accessed: 23 March 2026).
https://newsroom.iza.org/en/archive/research/machine-learning-in-the-welfare-system/

How to Cite

DCI AI Hub (2026). 'Machine-Learning Early Warning System for Income Support Recipients (Australia — Research Prototype)', AI Hub AI Tracker, case AUS-001. Digital Convergence Initiative. Available at: https://socialprotectionai.org/use-case/AUS-001
