GOV.UK Chat
Overview
GOV.UK Chat is a Retrieval Augmented Generation (RAG) chatbot developed by the Government Digital Service (GDS) within the Department for Science, Innovation and Technology (DSIT) in the United Kingdom. The system is designed to provide citizens and businesses with quick, personalised answers to questions about government services, regulations, and guidance by drawing on the approximately 700,000 pages of content published on the GOV.UK website. Rather than requiring users to navigate through multiple pages of search results, GOV.UK Chat allows them to pose natural language questions and receive synthesised, conversational responses grounded in official government content.
The system works through a multi-stage RAG pipeline. When a user submits a question, the system first retrieves relevant content chunks from a vector database containing approximately 100,000 GOV.UK pages processed into roughly 700,000 chunks totalling 36.9 gigabytes of data. The retrieved content is then passed to a large language model which generates a natural language response based solely on the retrieved government content. Before the response reaches the user, it passes through safety guardrails that filter for quality, appropriateness, and adherence to tolerance thresholds. Each answer is presented alongside 'check this answer' links to the original GOV.UK source pages, enabling users to independently verify the information provided.
The project began in July 2023 as an internal experiment within GDS. The initial technical architecture used OpenAI's GPT-3.5-turbo-16k model accessed via API, with a Qdrant vector store for document retrieval and the LangChain framework for integration, all hosted on Google Cloud. The system was tested through five phased experiments, beginning with internal testing and progressing to a scaled pilot with 1,000 invited users accessed through 'magic links' on selected GOV.UK business pages. During this initial phase, nearly 70 percent of users found the chatbot's responses useful and approximately 65 percent were satisfied with their experience, while the system achieved an accuracy threshold of 80 percent.
The technology stack evolved significantly as the project matured. By the time of the Algorithmic Transparency Record filing, the system had migrated to Anthropic's Claude Sonnet model (specifically Claude Sonnet-4, model ID eu.anthropic.claude-sonnet-4-20250514-v1:0) hosted on AWS Bedrock in the Ireland EU region with cross-regional inference capability. The embedding model was updated to Amazon Titan Embed Text v2. The application infrastructure runs on Ruby on Rails deployed on Kubernetes within AWS, with an AWS RDS PostgreSQL database (encrypted at rest) for data storage and Amazon OpenSearch for search indexing. Google BigQuery is used for analytics.
In November 2024, GDS launched a private beta of GOV.UK Chat, providing access to business users through links on selected business-related GOV.UK pages, with a waiting list managing capacity. The focus on business content was deliberate, as it represents a domain where users frequently need to navigate complex cross-departmental policies and regulations. Testing during this phase included up to 2,000 users over four-week periods. In July 2025, GOV.UK Chat was selected as one of the Prime Minister's AI Exemplars and designated as one of five 'kickstarters' in the UK blueprint for modern digital government.
Safety and security have been central to the system's development. GDS conducted a Data Protection Impact Assessment (completed September 2025) and a Secure by Design framework review. The system implements multiple layers of protection: incoming user queries are screened via regex for phone numbers, email addresses, and card numbers, with queries containing personal data rejected outright; GOV.UK pages likely to contain personal data are filtered out before vectorisation; and response guardrails perform LLM-based filtering to catch outputs outside tolerance levels. Extensive red teaming exercises were conducted in collaboration with cyber security experts from CDDO, Number 10 Data Science, and the i.AI team, as well as the AI Security Institute. During testing, more than 500 jailbreak attempts were successfully blocked by existing safeguards, though the team acknowledges that it is not possible to guarantee no jailbreaking attempts will ever be successful.
The system's accuracy improved substantially over its development lifecycle. Answer accuracy rose from approximately 80 percent in the initial experiment to 90 percent in later evaluations, attributed both to advances in the underlying LLM technology and to GDS's own data science work on retrieval, chunking strategies, alternative embedding models, re-ranking, and improved prompt engineering. Evaluation uses a hybrid approach combining automated metrics (precision, recall, LLM-as-Judge) with manual assessment by subject matter experts from across government, including HMRC specialists who scored accuracy against content designer-written reference answers.
By late 2025 and into 2026, GDS began planning wider rollout, starting with deployment in the GOV.UK app (available on iOS and Android) before extending across the GOV.UK website. The team has also begun exploring experimental agentic AI capabilities, envisioning an evolution from a system that merely provides answers to one that can facilitate simple government transactions and hand off to departmental customer support when needed. Anthropic provided general advice and engineering support under a Joint Innovation Vehicle procurement arrangement, though no data access was granted for development purposes.
Classification
AI Capabilities
Use Cases
Social Protection Functions
| SP Pillar (Primary) | Social assistance |
Programme Details
| Programme Name | GOV.UK Chat |
| Programme Type | Other |
| System Level | Implementation/delivery chain |
GOV.UK Chat is a RAG-based AI chatbot built by the Government Digital Service (GDS) to help citizens and businesses find information across the GOV.UK website through natural language queries. The system retrieves relevant content from approximately 700,000 GOV.UK content chunks and generates conversational responses grounded exclusively in official government guidance. It evolved from an internal experiment in July 2023 through private beta in November 2024 to planned wider rollout across the GOV.UK website and app in 2026.
Implementation Details
| Implementation Type | Foundation model |
| Lifecycle Stage | Integration and Deployment |
| Model Provenance | API-accessed third-party |
| Compute Environment | Commercial cloud |
| Compute Provider | AWS (Amazon Web Services); previously Google Cloud during initial experiment |
| Sovereignty Quadrant | III — Compute-Intensive Cloud with safeguards |
| Data Residency | Regional |
| Cross-Border Transfer | With documented safeguards |
Agentic AI
| Is Agentic | Partial |
| Pipeline | Current system is a standard RAG pipeline (retrieve-generate-filter) without autonomous action. However, GDS has begun exploring experimental agentic AI capabilities to evolve from providing answers to facilitating simple government transactions and departmental handoffs. |
| Autonomy | Supervised |
| Override Points | Response guardrails filter all LLM outputs before delivery to users; GOV.UK AI Team monitors via admin system with audit logging; users encouraged to verify via source links |
Risk & Oversight
| Decision Criticality | Low |
| Human Oversight | HOTL |
| Development Process | Mix of in-house and third-party |
| Highest Risk Category | Model-related risks |
| Risk Assessment Status | Formal assessment |
Risk Dimensions
Market, sovereignty and industry structure risks
Model-related risks
Impact Dimensions
Autonomy, human dignity and due process
Systemic and societal
Safeguards
Deployment & Outcomes
| Deployment Status | Pilot / Controlled Trial Phase |
| Year Initiated | 2023 |
| Scale / Coverage | Private beta with up to 2,000 users per 4-week testing period; 1,000 users in initial scaled pilot; up to 15,000 planned for next phase; targeting rollout across GOV.UK website and app serving millions of users |
| Funding Source | Government (GDS/DSIT budget); Anthropic engineering support procured via Joint Innovation Vehicle |
| Technical Partners | Anthropic (LLM provider via AWS Bedrock, general advice and engineering support under Joint Innovation Vehicle procurement); AWS (cloud infrastructure, Bedrock hosting, RDS PostgreSQL, OpenSearch); previously OpenAI (GPT-3.5-turbo-16k and later GPT-4o/GPT-4o mini during earlier phases); Google Cloud (initial hosting and BigQuery analytics) |
Outcomes / Results
Initial experiment: nearly 70% of users found responses useful, approximately 65% satisfied with experience, 80% accuracy threshold achieved. Subsequent improvements raised accuracy from 76% to 90%. During beta testing, more than 500 jailbreak attempts were successfully blocked. Just under 80% of research participants understood that GOV.UK Chat can contain inaccurate information after onboarding. Users showed preference for GOV.UK Chat over traditional search and navigation methods, particularly for complex cross-departmental queries.
Challenges
Hallucination risk remains inherent to LLM-based systems and cannot be fully eliminated despite RAG grounding. Initial chunking approach using whole pages caused token limit errors with long pages. Quality assurance at scale is labour-intensive; manual evaluation techniques used in early phases are not practical for full deployment. Users may over-trust GOV.UK Chat responses due to the credibility of the GOV.UK brand. Accuracy below 100% on a government website raises concerns given the duty of care associated with official guidance.
Sources
- SRC-001-GBR-002 Central Digital and Data Office (2025) Artificial Intelligence Playbook for the UK Government. London: CDDO. Available at: https://www.gov.uk/government/publications/ai-playbook-for-the-uk-government/artificial-intelligence-playbook-for-the-uk-government-html (Accessed: 24 March 2026).
https://www.gov.uk/government/publications/ai-playbook-for-the-uk-government/artificial-intelligence-playbook-for-the-uk-government-html - SRC-002-GBR-002 Department for Science, Innovation and Technology (2025) 'DSIT: GOV.UK Chat', Algorithmic Transparency Recording Standard. Available at: https://www.gov.uk/algorithmic-transparency-records/dsit-gov-dot-uk-chat (Accessed: 24 March 2026).
https://www.gov.uk/algorithmic-transparency-records/dsit-gov-dot-uk-chat - SRC-003-GBR-002 Dub, S. and Davey, J. (2024) 'We're running a private beta of GOV.UK Chat', Inside GOV.UK Blog, 5 November. Available at: https://insidegovuk.blog.gov.uk/2024/11/05/were-running-a-private-beta-of-gov-uk-chat/ (Accessed: 24 March 2026).
https://insidegovuk.blog.gov.uk/2024/11/05/were-running-a-private-beta-of-gov-uk-chat/ - SRC-004-GBR-002 GDS (2024) 'The findings of our first generative AI experiment: GOV.UK Chat', Inside GOV.UK Blog, 18 January. Available at: https://insidegovuk.blog.gov.uk/2024/01/18/the-findings-of-our-first-generative-ai-experiment-gov-uk-chat/ (Accessed: 24 March 2026).
https://insidegovuk.blog.gov.uk/2024/01/18/the-findings-of-our-first-generative-ai-experiment-gov-uk-chat/ - SRC-005-GBR-002 GDS (2025) 'GOV.UK has entered the Chat: our vision for GOV.UK Chat', Inside GOV.UK Blog, 16 December. Available at: https://insidegovuk.blog.gov.uk/2025/12/16/gov-uk-has-entered-the-chat-our-vision-for-gov-uk-chat/ (Accessed: 24 March 2026).
https://insidegovuk.blog.gov.uk/2025/12/16/gov-uk-has-entered-the-chat-our-vision-for-gov-uk-chat/
How to Cite
DCI AI Hub (2026). 'GOV.UK Chat', AI Hub AI Tracker, case GBR-002. Digital Convergence Initiative. Available at: https://socialprotectionai.org/use-case/GBR-002