GOV.UK Chat

Country	United Kingdom
Deployment Status	Pilot / Controlled Trial Phase
Confidence	Confirmed
Implementing Agency	Government Digital Service (GDS), Department for Science, Innovation and Technology (DSIT)

Overview

GOV.UK Chat is a Retrieval Augmented Generation (RAG) chatbot developed by the Government Digital Service (GDS) within the Department for Science, Innovation and Technology (DSIT) in the United Kingdom. The system is designed to provide citizens and businesses with quick, personalised answers to questions about government services, regulations, and guidance by drawing on the approximately 700,000 pages of content published on the GOV.UK website. Rather than requiring users to navigate through multiple pages of search results, GOV.UK Chat allows them to pose natural language questions and receive synthesised, conversational responses grounded in official government content.

The system works through a multi-stage RAG pipeline. When a user submits a question, the system first retrieves relevant content chunks from a vector database containing approximately 100,000 GOV.UK pages processed into roughly 700,000 chunks totalling 36.9 gigabytes of data. The retrieved content is then passed to a large language model which generates a natural language response based solely on the retrieved government content. Before the response reaches the user, it passes through safety guardrails that filter for quality, appropriateness, and adherence to tolerance thresholds. Each answer is presented alongside 'check this answer' links to the original GOV.UK source pages, enabling users to independently verify the information provided.
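The retrieve, generate, filter and cite stages described above can be sketched as follows. This is a minimal illustration, not GDS's implementation: the bag-of-words similarity stands in for a real embedding model and vector database, `generate` and `passes_guardrails` are hypothetical stubs for the LLM call and the output guardrails, and the example URLs and chunk texts are invented for the sketch.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    url: str   # source page the chunk was taken from
    text: str  # chunk content


# Hypothetical example data; not real GOV.UK pages or content.
EXAMPLE_CHUNKS = [
    Chunk("https://www.gov.uk/example-vat",
          "Example guidance text about VAT registration for businesses."),
    Chunk("https://www.gov.uk/example-company",
          "Example guidance text about setting up a limited company."),
]


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: list[Chunk], k: int = 2) -> list[Chunk]:
    """Return up to k chunks with non-zero similarity, best first."""
    q = embed(question)
    scored = [(cosine(q, embed(c.text)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for score, c in scored[:k] if score > 0]


def generate(question: str, context: list[Chunk]) -> str:
    """Stub for the LLM call: answers only from the retrieved text."""
    return "Based on GOV.UK guidance: " + " ".join(c.text for c in context)


def passes_guardrails(answer: str) -> bool:
    """Stub for the output guardrails: here it only rejects empty answers."""
    return bool(answer.strip())


def answer_question(question: str, chunks: list[Chunk]) -> dict:
    """Run retrieve -> generate -> filter and attach source links."""
    context = retrieve(question, chunks)
    if not context:
        return {"answer": None, "sources": []}
    draft = generate(question, context)
    if not passes_guardrails(draft):
        return {"answer": None, "sources": []}
    return {"answer": draft, "sources": [c.url for c in context]}
```

Calling `answer_question("How do I register for VAT?", EXAMPLE_CHUNKS)` returns the drafted answer together with the URL of the matching chunk, which is the role the 'check this answer' links play in the real system.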

The project began in July 2023 as an internal experiment within GDS. The initial technical architecture used OpenAI's GPT-3.5-turbo-16k model accessed via API, with a Qdrant vector store for document retrieval and the LangChain framework for integration, all hosted on Google Cloud. The system was tested through five phased experiments, beginning with internal testing and progressing to a scaled pilot with 1,000 invited users accessed through 'magic links' on selected GOV.UK business pages. During this initial phase, nearly 70 percent of users found the chatbot's responses useful and approximately 65 percent were satisfied with their experience, while the system achieved an accuracy threshold of 80 percent.

The technology stack evolved significantly as the project matured. By the time of the Algorithmic Transparency Record filing, the system had migrated to Anthropic's Claude Sonnet 4 model (model ID eu.anthropic.claude-sonnet-4-20250514-v1:0), hosted on AWS Bedrock in the Ireland EU region with cross-regional inference capability. The embedding model was updated to Amazon Titan Embed Text v2. The application is built with Ruby on Rails and deployed on Kubernetes within AWS, with an AWS RDS PostgreSQL database (encrypted at rest) for data storage and Amazon OpenSearch for search indexing. Google BigQuery is used for analytics.

In November 2024, GDS launched a private beta of GOV.UK Chat, providing access to business users through links on selected business-related GOV.UK pages, with a waiting list managing capacity. The focus on business content was deliberate, as it represents a domain where users frequently need to navigate complex cross-departmental policies and regulations. Testing during this phase included up to 2,000 users over four-week periods. In July 2025, GOV.UK Chat was selected as one of the Prime Minister's AI Exemplars and designated as one of five 'kickstarters' in the UK blueprint for modern digital government.

Safety and security have been central to the system's development. GDS conducted a Data Protection Impact Assessment (completed September 2025) and a Secure by Design framework review. The system implements multiple layers of protection: incoming user queries are screened via regex for phone numbers, email addresses, and card numbers, with queries containing personal data rejected outright; GOV.UK pages likely to contain personal data are filtered out before vectorisation; and response guardrails perform LLM-based filtering to catch outputs outside tolerance levels. Extensive red teaming exercises were conducted in collaboration with cyber security experts from CDDO, Number 10 Data Science, and the i.AI team, as well as the AI Security Institute. During testing, more than 500 jailbreak attempts were successfully blocked by existing safeguards, though the team acknowledges that it is not possible to guarantee no jailbreaking attempts will ever be successful.
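The regex screening of incoming queries might look roughly like the sketch below. The source does not publish the patterns GOV.UK Chat uses, so the three expressions here are illustrative assumptions (a simple email pattern, a 13 to 19 digit card-like run, and a UK-style phone number), and the function names are hypothetical.

```python
import re

# Illustrative patterns only; the actual GOV.UK Chat patterns are not given in the source.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    # 13 to 19 digits, optionally separated by single spaces or hyphens
    "card_number": re.compile(r"\b(?:\d[ -]?){13,19}\b"),
    # UK-style number: +44 or a leading 0, followed by 9 or 10 more digits
    "phone_number": re.compile(r"(?:\+44\s?|\b0)\d(?:[\s-]?\d){8,9}\b"),
}


def find_personal_data(query: str) -> list[str]:
    """Return the names of all patterns that match somewhere in the query."""
    return [name for name, pattern in PII_PATTERNS.items() if pattern.search(query)]


def screen_query(query: str) -> bool:
    """Return True if the query may proceed, False if it should be rejected."""
    return not find_personal_data(query)
```

Under this sketch, a query such as "Call me on 07700 900123" would be rejected before retrieval, while "How do I register for VAT?" would pass. Simple patterns like these trade recall for speed and will miss unusual formats.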

The system's accuracy improved substantially over its development lifecycle. Answer accuracy rose from approximately 80 percent in the initial experiment to 90 percent in later evaluations, attributed both to advances in the underlying LLM technology and to GDS's own data science work on retrieval, chunking strategies, alternative embedding models, re-ranking, and improved prompt engineering. Evaluation uses a hybrid approach combining automated metrics (precision, recall, LLM-as-Judge) with manual assessment by subject matter experts from across government, including HMRC specialists who scored accuracy against content designer-written reference answers.
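As an illustration of the automated metrics mentioned above, precision and recall can be computed for a single query by comparing what was retrieved against a reference set. The source does not say exactly what the metrics are applied to; this sketch assumes they score retrieved chunk identifiers against chunks an expert marked as relevant, and the identifiers used are hypothetical.

```python
def precision_recall(retrieved: list[str], relevant: set[str]) -> tuple[float, float]:
    """Precision and recall of retrieved item IDs against a relevant reference set."""
    retrieved_set = set(retrieved)
    hits = len(retrieved_set & relevant)
    # Precision: share of retrieved items that are relevant.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    # Recall: share of relevant items that were retrieved.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical chunk IDs: 4 retrieved, 3 relevant, 2 in common.
precision, recall = precision_recall(["a", "b", "c", "d"], {"a", "b", "e"})
```

Here two of the four retrieved chunks are relevant (precision 0.5) and two of the three relevant chunks were found (recall of two thirds). Metrics like these are cheap to run at scale, which is why they are typically paired with slower expert review rather than replacing it.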

From late 2025 into 2026, GDS began planning a wider rollout, starting with deployment in the GOV.UK app (available on iOS and Android) before extending across the GOV.UK website. The team has also begun exploring experimental agentic AI capabilities, envisioning an evolution from a system that only provides answers to one that can facilitate simple government transactions and hand off to departmental customer support when needed. Anthropic provided general advice and engineering support under a Joint Innovation Vehicle procurement arrangement, but was not granted access to data for development purposes.

Classification

AI Capabilities

LLMs for content creation, transformation and modality conversion (primary)
Perception and extraction from unstructured inputs

Use Cases

User communication and interaction (primary)

Social Protection Functions

Implementation/delivery chain: Outreach/communications/sensitisation (primary)

SP Pillar (Primary)

Social assistance

Programme Details

Programme Name	GOV.UK Chat
Programme Type	Other
System Level	Implementation/delivery chain

GOV.UK Chat is a RAG-based AI chatbot built by the Government Digital Service (GDS) to help citizens and businesses find information across the GOV.UK website through natural language queries. The system retrieves relevant content from approximately 700,000 GOV.UK content chunks and generates conversational responses grounded exclusively in official government guidance. It evolved from an internal experiment in July 2023 through private beta in November 2024 to planned wider rollout across the GOV.UK website and app in 2026.

Implementation Details

Implementation Type	Foundation model
Lifecycle Stage	Integration and Deployment
Model Provenance	API-accessed third-party
Compute Environment	Commercial cloud
Compute Provider	AWS (Amazon Web Services); previously Google Cloud during initial experiment
Sovereignty Quadrant	III — Compute-Intensive Cloud with safeguards
Data Residency	Regional
Cross-Border Transfer	With documented safeguards

Agentic AI

Is Agentic	Partial
Pipeline	Current system is a standard RAG pipeline (retrieve-generate-filter) without autonomous action. However, GDS has begun exploring experimental agentic AI capabilities to evolve from providing answers to facilitating simple government transactions and departmental handoffs.
Autonomy	Supervised
Override Points	Response guardrails filter all LLM outputs before delivery to users; GOV.UK AI Team monitors via admin system with audit logging; users encouraged to verify via source links

Risk & Oversight

Decision Criticality	Low
Human Oversight	HOTL
Development Process	Mix of in-house and third-party
Highest Risk Category	Model-related risks
Risk Assessment Status	Formal assessment

Risk Dimensions

Market, sovereignty and industry structure risks

Jurisdictional hosting risk
Upstream model or API dependency
Vendor lock-in

Model-related risks

Hallucination or misinformation
Model security vulnerability
Opacity or limited explainability

Impact Dimensions

Autonomy, human dignity and due process

Opaque or unexplained decision

Systemic and societal

Deepened digital divide

Safeguards

DPIA/AIA conducted
Data minimisation controls
Human oversight protocol

Deployment & Outcomes

Deployment Status	Pilot / Controlled Trial Phase
Year Initiated	2023
Scale / Coverage	Private beta with up to 2,000 users per 4-week testing period; 1,000 users in initial scaled pilot; up to 15,000 planned for next phase; targeting rollout across GOV.UK website and app serving millions of users
Funding Source	Government (GDS/DSIT budget); Anthropic engineering support procured via Joint Innovation Vehicle
Technical Partners	Anthropic (LLM provider via AWS Bedrock, general advice and engineering support under Joint Innovation Vehicle procurement); AWS (cloud infrastructure, Bedrock hosting, RDS PostgreSQL, OpenSearch); previously OpenAI (GPT-3.5-turbo-16k and later GPT-4o/GPT-4o mini during earlier phases); Google Cloud (initial hosting and BigQuery analytics)

Outcomes / Results

Initial experiment: nearly 70% of users found responses useful, approximately 65% satisfied with experience, 80% accuracy threshold achieved. Subsequent improvements raised accuracy from 76% to 90%. During beta testing, more than 500 jailbreak attempts were successfully blocked. Just under 80% of research participants understood that GOV.UK Chat can contain inaccurate information after onboarding. Users showed preference for GOV.UK Chat over traditional search and navigation methods, particularly for complex cross-departmental queries.

Challenges

Hallucination risk remains inherent to LLM-based systems and cannot be fully eliminated despite RAG grounding. Initial chunking approach using whole pages caused token limit errors with long pages. Quality assurance at scale is labour-intensive; manual evaluation techniques used in early phases are not practical for full deployment. Users may over-trust GOV.UK Chat responses due to the credibility of the GOV.UK brand. Accuracy below 100% on a government website raises concerns given the duty of care associated with official guidance.
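The token limit problem with whole-page chunks can be illustrated with a simple splitter. The source does not describe the chunking strategy GDS moved to, so the sketch below is only one common approach: fixed-size windows with a small overlap. Word counts stand in for model tokens, and `max_words` and `overlap` are assumed parameters, not values from the project.

```python
def chunk_words(text: str, max_words: int = 200, overlap: int = 20) -> list[str]:
    """Split text into overlapping chunks of at most max_words words each."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be non-negative and smaller than max_words")
    words = text.split()
    step = max_words - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

With the defaults, a 500-word page yields three chunks, each within the 200-word budget, so no single retrieved unit can exceed the model's input limit the way an entire long page can.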

Sources

SRC-001-GBR-002 Central Digital and Data Office (2025) Artificial Intelligence Playbook for the UK Government. London: CDDO. Available at: https://www.gov.uk/government/publications/ai-playbook-for-the-uk-government/artificial-intelligence-playbook-for-the-uk-government-html (Accessed: 24 March 2026).
SRC-002-GBR-002 Department for Science, Innovation and Technology (2025) 'DSIT: GOV.UK Chat', Algorithmic Transparency Recording Standard. Available at: https://www.gov.uk/algorithmic-transparency-records/dsit-gov-dot-uk-chat (Accessed: 24 March 2026).
SRC-003-GBR-002 Dub, S. and Davey, J. (2024) 'We're running a private beta of GOV.UK Chat', Inside GOV.UK Blog, 5 November. Available at: https://insidegovuk.blog.gov.uk/2024/11/05/were-running-a-private-beta-of-gov-uk-chat/ (Accessed: 24 March 2026).
SRC-004-GBR-002 GDS (2024) 'The findings of our first generative AI experiment: GOV.UK Chat', Inside GOV.UK Blog, 18 January. Available at: https://insidegovuk.blog.gov.uk/2024/01/18/the-findings-of-our-first-generative-ai-experiment-gov-uk-chat/ (Accessed: 24 March 2026).
SRC-005-GBR-002 GDS (2025) 'GOV.UK has entered the Chat: our vision for GOV.UK Chat', Inside GOV.UK Blog, 16 December. Available at: https://insidegovuk.blog.gov.uk/2025/12/16/gov-uk-has-entered-the-chat-our-vision-for-gov-uk-chat/ (Accessed: 24 March 2026).

How to Cite

DCI AI Hub (2026). 'GOV.UK Chat', AI Hub AI Tracker, case GBR-002. Digital Convergence Initiative. Available at: https://socialprotectionai.org/use-case/GBR-002
