Service Assessment Question Bank
Sample questions based on the GOV.UK Service Standard, with AI-specific prompts added across the Alpha, Beta and Live phases plus a dedicated Tech pre-call set. Questions tagged AI highlight prompts that probe AI-specific risks (bias, hallucination, drift, fallback, oversight).
At a glance
Any question that mentions AI capabilities, hallucination, drift, agentic behaviour, prompts or LLMs is tagged AI and shown with a purple left-border. Questions grouped under a "Standard X" or "Sub-phase" heading are also surfaced as AI-specific because they were added to the standard interview script for AI services.
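The tagging rule above can be sketched in a few lines. This is a minimal illustration, not the navigator's actual implementation: the keyword list is an assumption drawn from the terms named in the rule, and `is_ai_tagged` and its `group_heading` parameter are hypothetical names.

```python
import re

# Hypothetical keyword list, taken from the terms named in the rule above.
AI_KEYWORDS = ("ai", "hallucination", "drift", "agentic", "prompt", "llm")

# Match whole words only, allowing a plural or possessive ending
# ("LLMs", "prompts", "AI's"), so "unavailable" does not match "ai".
_AI_PATTERN = re.compile(
    r"\b(" + "|".join(AI_KEYWORDS) + r")(s|'s|’s)?\b", re.IGNORECASE
)


def is_ai_tagged(question: str, group_heading: str = "") -> bool:
    """Return True if the question mentions an AI keyword or sits under
    a "Standard X" or "Sub-phase" group heading."""
    heading = group_heading.strip().lower()
    if heading.startswith("standard ") or heading.startswith("sub-phase"):
        return True
    return _AI_PATTERN.search(question) is not None
```

Whole-word matching is a deliberate choice here: a plain substring test would tag any question containing "unavailable" or "maintain" because both contain "ai".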
How to use this navigator
- Search all — free-text search and combined filters across all 272 prompts.
- Alpha / Beta / Live / Tech pre-call — the original document order, grouped by assessor role.
- Coverage Matrix — cross-tab of roles against phases showing where assurance attention sits.
Search all questions
Combine filters to drill into the question bank. All filters are AND-combined.
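As a rough sketch of what AND-combined filtering means in practice, the snippet below keeps a question only if it passes every filter that has been set. The `Question` fields, the `search` function and the role labels on the sample entries are assumptions made for illustration; the navigator's real data model is not shown in this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    text: str
    phase: str  # "Alpha", "Beta", "Live" or "Tech pre-call"
    role: str   # assessor role or topic grouping
    ai: bool    # True if the question is tagged AI


def search(questions, text=None, phase=None, role=None, ai_only=False):
    """Return the questions that satisfy every filter that is set (logical AND)."""
    results = []
    for q in questions:
        if text and text.lower() not in q.text.lower():
            continue
        if phase and q.phase != phase:
            continue
        if role and q.role != role:
            continue
        if ai_only and not q.ai:
            continue
        results.append(q)
    return results


# A few prompts from the bank; the role labels are illustrative.
bank = [
    Question("Who are your users?", "Alpha", "User Research", False),
    Question(
        "If the AI provider or model becomes unavailable, "
        "what is the fallback experience for the user?",
        "Beta", "Technology", True,
    ),
    Question(
        "If the AI component fails, what is the fallback mechanism for the user?",
        "Tech pre-call", "Tech pre-call", True,
    ),
]

all_fallback = search(bank, text="fallback")                 # two matches
beta_fallback = search(bank, text="fallback", phase="Beta")  # one match
```

Each extra filter can only narrow the result set, never widen it, which is the practical consequence of combining filters with AND rather than OR.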
Alpha assessment questions
64 question prompts across 5 role groupings. AI-specific questions are highlighted in purple.
- Who are your users?
- What are your users’ lives like?
- Why do they need the thing you’re planning to build or buy?
- Why has your department asked you to solve the problem you’re solving?
- How did you decide that the problem was worth solving?
- How did you identify which risky assumption(s) to test?
- How did you test those assumptions?
- What do you know about the offline parts of your users’ journey?
- How will you know your service joins up with those offline elements?
- Who are you inviting to your user research sessions?
- What are your plans for user research through the private beta stage? Is the budget sufficient?
- If using AI for user research, what checks have they done to make sure it's representative of humans and not biased?
- If they are using AI to synthesise research data, how have they verified that the "themes" it identifies accurately reflect the raw user feedback without losing nuance?
- What checks have they done to make sure the AI’s analysis is representative of humans and hasn’t introduced “hallucinations” or false insights?
- Are they using AI-generated “synthetic users” or personas? If so, how have they validated those findings against real-world testing with actual citizens?
- How have they audited their research tools for bias, particularly concerning marginalised groups or protected characteristics?
- If using AI for transcription or sentiment analysis, how does the tool handle diverse accents, dialects, or non-standard speech patterns?
- How have they tested for understandability for users, technological literacy and needs around impaired vision, hearing and others?
- Have they obtained explicit consent from research participants for their data (video, audio or notes) to be processed by an AI tool?
- Is the AI tool processing PII from research sessions? If so, how is that data being anonymised or deleted after analysis?
- What constraints are you working within? What are the barriers to removing or reducing the impact of those constraints?
- How do you know a service is the right way to solve the problem you’re looking at?
- How do you know you’ve got the scope of the service right?
- Talk us through a few of the prototype ideas you tested - and explain why you rejected them.
- How would the journey from GOV.UK work?
- What are you doing to make sure users don’t have to provide the same information multiple times?
- How are you planning to make sure the service doesn't exclude any existing or potential users?
- What is your approach to access needs and assisted digital needs?
- What support will you offer to users who have problems with the digital part of the service?
- Are they using existing government AI patterns (e.g. for chatbots or data processing) rather than building a custom solution from scratch?
- How are they ensuring data follow open standards to allow for future interoperability?
- What key performance indicators (KPIs) have you identified so far, beyond the mandatory four?
- What existing data sources did you use to decide on your KPIs?
- How will you know your service is meeting your users’ needs?
- How will you know your service is successfully meeting its objectives, and what are the quantifiable measures that will demonstrate this?
- How will you check the service is doing what your department needs it to?
- What data are you collecting and how are you using it to improve your service?
- What management and governance do you have? (for example senior information risk owner (SIRO) approval, not collecting personal data, retained a raw backup data view)
- Have you considered how you will report KPIs on data.gov.uk?
- What technical choices have you made and why? (for example language, framework, deployment, integration, third parties)
- Are there any potential conflicts with standards or guidance (like dependence on JavaScript)? How will you resolve these in beta?
- What other distinct options have you tried/tested?
- How do you plan to make the service open - open source, open standards, open data, common platforms, ownership of intellectual property?
- How will you maximise reuse of existing code, data and platforms? Are you engaging with services like GOV.UK Notify, GOV.UK Account, GOV.UK Pay, GOV.UK PaaS?
- How will you ensure the service is safe for users - data privacy, security threats, fraud vectors?
- What is the minimum amount of personal information the service needs to collect?
- Does your service need an authentication solution?
- What plans do you have if your service is unavailable for any length of time?
- Have you spoken to your department’s data protection officer about the decisions you’ve made?
- What are your technology plans for private beta?
- What are the perceived threats to their AI service (e.g., prompt injection, data poisoning) and how is the prototype mitigating them?
- What distinct AI options or models have they prototyped, and why did they settle on this specific approach (e.g., LLM vs. rules-based)?
- How have they validated that AI is the "right tool for the job" rather than a simpler, more deterministic solution?
- If using a novel or complex model (i.e. not tested by the department previously), how are they ensuring its outputs will be explainable to users and the department?
- How do they plan to make the AI implementation open?
- Who’s in your team?
- How are you engaging outside of your team, for example, inside or outside your department, or with the ops and policy professions?
- Has the team disagreed on anything and if so how was that resolved?
- How are you using agile methods?
- How does governance work? Are there any issues you’ve had to escalate?
- Did you discover anything that will make it difficult to design the service your users need? If so, what are your plans to address the issues you found?
- What will the team look like moving into private beta?
- What are you doing to make sure other teams know what you’re working on?
- Who in the team is responsible for making the final technical decisions regarding the AI?
Beta assessment questions
53 question prompts across 5 role groupings. AI-specific questions are highlighted in purple.
- Who are your users?
- What are your users’ lives like?
- Why do they need the thing you’re planning to build or buy?
- How did you define the MVP?
- What do you know about the offline parts of your users’ journey?
- How will you know your service joins up with those offline elements?
- Who are you inviting to your user research sessions?
- What are your plans for user research through the public beta stage? Is the budget sufficient?
- Have they tested the AI’s outputs with the team to ensure they don’t lead to a “pre-selected” solution that ignores outlier user needs?
- How are they iterating on their use of AI in research? Have they changed tools or prompts based on the quality of insights produced in Alpha?
- What constraints are you working within? What are the barriers to removing or reducing the impact of those constraints?
- How do you know a service is the right way to solve the problem you’re looking at?
- How do you know you’ve got the scope of the service right?
- Talk us through a few of the design iterations you tested - and explain why you rejected them.
- How will users find the service initially?
- What are you doing to make sure users don’t have to provide the same information multiple times, and to reuse data from across government?
- How are you planning to make sure that the service doesn't exclude any existing or potential users of your service?
- How are you improving users’ experience across different channels?
- How are you implementing recommendations from the Accessibility audit on your service?
- What is your approach to access needs and assisted digital needs?
- What support will you offer to users who have problems with the digital part of the service?
- Are they using common platforms (like GOV.UK Notify) to deliver AI outputs to users?
- Have they identified any “AI patterns” they’ve developed (like a specific prompt structure for eligibility checks) that could be contributed back to the government community?
- What key performance indicators (KPIs) have you identified so far, beyond the mandatory four?
- What existing data sources did you use to decide on your KPIs?
- Which metrics demonstrate your service is meeting your users’ needs?
- How will you check the service is doing what your department needs it to?
- What data are you collecting and how are you using it to improve your service?
- What management and governance do you have? (for example senior information risk owner (SIRO) approval, not collecting personal data, retained a raw backup data view)
- Which KPIs will you publish on data.gov.uk?
- Which roles are in your team?
- How are you engaging outside of your team? For example, inside or outside your department, or with the ops and policy professions?
- Has the team disagreed on anything and if so how was that resolved?
- How are you using agile methods?
- How does governance work? Are there any issues you’ve had to escalate?
- Did you discover anything that will make it difficult to design the service your users need? If so, what are your plans to address the issues you found?
- Is the team sustainable moving into public beta and beyond to live?
- What are you doing to make sure other teams know what you’re working on?
- What technical choices have you made and why - language, framework, deployment, integration, third parties?
- What other distinct options have you tried/tested?
- How have you resolved conflicts with standards or guidance in your tech choices (for example JavaScript dependence)?
- How are you making the service open - open source, open standards, open data, common platforms, ownership of intellectual property?
- Have you maximised reuse of existing code, data and platforms? Are you engaging with services like GOV.UK Notify, GOV.UK Account, GOV.UK Pay, GOV.UK PaaS?
- How will you ensure the service is safe for users - data privacy, security threats, fraud vectors?
- What is the minimum amount of personal information the service needs to collect?
- Does your service need an authentication solution?
- What plans do you have if your service is unavailable for any length of time?
- Have you spoken to your department’s data protection officer about the decisions you’ve made?
- How do they determine if an AI deployment has succeeded, and what is the "roll-back" plan if the model starts producing unexpected results?
- How is the team managing "meaningful human oversight"? At what stage does a human "double-check" the AI's decision before it impacts a user?
- Have they ed specifically targeting the AI's vulnerabilities, such as adversarial attacks?
- If the AI provider or model becomes unavailable, what is the fallback experience for the user?
- What data exists in the pre-production environments for training/testing, and is it appropriately anonymised?
Live assessment questions
49 question prompts across 5 role groupings. AI-specific questions are highlighted in purple.
- Who are your users?
- What are your users’ lives like?
- Why do they need the thing you’re building or buying?
- Why has your department asked you to solve the problem you’re solving?
- What do you know about the offline parts of your users’ journey?
- How do you know your service joins up with those offline elements?
- Who are you inviting to your user research sessions?
- What are your continued plans for user research post-live? Is the budget sufficient?
- What/how have the constraints you’re working within changed? What are the barriers to removing or reducing the impact of those constraints?
- How will you continue to ensure the service is scoped in a way that makes sense to users?
- What are you doing to ensure users don’t have to provide the same information multiple times?
- How do you make sure that the service doesn't exclude any existing or potential users of your service?
- What have you learnt about your approach to access needs and assisted digital needs?
- What support do you offer to users who have problems with the digital part of the service?
- How are they handling updates to their AI components? If an open standard or framework (such as PyTorch) used for the specific AI models evolves, can their service adapt without a total rebuild?
- How are the key performance indicators (KPIs) you have identified improving the service?
- What existing data sources did you use to decide on your KPIs?
- How do you know your service is meeting your users’ needs?
- How do you check the service is doing what your department needs it to?
- What data are you collecting and how are you using it to improve your service?
- What management and governance do you have? (for example senior information risk owner (SIRO) approval, not collecting personal data, retained a raw backup data view)
- Which KPIs are being published on data.gov.uk? Has there been any addition or change since launch?
- Once live, how will you measure change over time?
- How do you know you’ve got appropriate metrics in place to measure the success of the service, based on what you’ve learned during public beta?
- How will you continue to use data to improve your service post-live?
- How frequently are they retraining or updating the AI model, and how do they ensure these updates improve the user experience?
- What is their process for keeping the AI's "knowledge" or training data up to date to avoid "hallucinations" or outdated advice?
- What metrics are they using to measure the AI's accuracy and fairness in a live environment?
- How do they measure the “accuracy” of AI-assisted research over time? Do they have a metric for “researcher-to-AI” agreement?
- What, if anything, is different in your Live team, compared to the Beta team?
- How are you engaging outside of your team? For example, inside or outside your department, or with the ops and policy professions?
- Has the team disagreed on anything and if so how was that resolved?
- How are you using agile methods and how will you continue post-live?
- How does governance work? Are there any issues you’ve had to escalate?
- What are you doing to make sure other teams know what you’re working on?
- What technical choices have you made and why - language, framework, deployment, integration, third parties?
- What other distinct options have you tried/tested?
- How are you managing your open source repository? How is it being shared with other teams who may need to use similar solutions (open standards, open data, common platforms, ownership of intellectual property)?
- How do you ensure the service is safe for users - data privacy, security threats, fraud vectors?
- What is the minimum amount of personal information the service needs to collect?
- Does your service need an authentication solution?
- What are the plans (in place) if your service is unavailable for any length of time?
- Have you spoken to your department’s data protection officer about the decisions you’ve made?
- Have your technology plans changed and/or is there significant activity planned for modifying or upgrading the technology used?
- How are they staying aware of new security updates and emerging threats to the specific AI models they are using?
- Do they have an intelligence component or "audit trail" for analysts to review AI patterns and identify potential fraud?
- How are they managing the sustainability of the service? Does the energy consumption of the AI infrastructure align with the department's goals?
- Is AI still the most cost-effective way to meet the user need, or have changing requirements made a simpler solution viable?
- Is their implementation of AI “vendor-neutral”, or are they locked into a proprietary pattern that makes it difficult to switch providers in the future?
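The Live question on a "researcher-to-AI" agreement metric does not name a specific measure. One common option for comparing two sets of categorical labels is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below is an assumption about how a team might compute it, not a method prescribed by the Service Standard; the function name and the theme labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(researcher_labels, ai_labels):
    """Cohen's kappa for two equal-length lists of categorical labels."""
    if not researcher_labels or len(researcher_labels) != len(ai_labels):
        raise ValueError("label lists must be non-empty and the same length")

    n = len(researcher_labels)
    # Share of items where the researcher and the AI chose the same label.
    observed = sum(r == a for r, a in zip(researcher_labels, ai_labels)) / n

    # Agreement expected by chance, from each rater's label frequencies.
    r_counts = Counter(researcher_labels)
    a_counts = Counter(ai_labels)
    expected = sum(r_counts[k] * a_counts[k] for k in r_counts) / (n * n)

    if expected == 1.0:
        # Both raters used one identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical theme codes for six research notes.
researcher = ["cost", "cost", "trust", "access", "trust", "cost"]
ai = ["cost", "trust", "trust", "access", "trust", "cost"]
kappa = cohen_kappa(researcher, ai)  # 17/23, roughly 0.74
```

Tracking a figure like this per research round would give assessors a trend to ask about, rather than a one-off accuracy claim.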
Tech pre-call assessment questions
106 question prompts across 7 topic groupings. AI-specific questions are highlighted in purple.
- Who is the technical architect / technical lead on the team?
- Who makes technical decisions?
- Does the team possess the specific data science or AI engineering skills needed to implement and manage the chosen model?
- Who is responsible for the ongoing monitoring and ethical oversight of the AI’s ?
- What external integrations do you have?
- Have you tested and de-risked them?
- Are you in touch with the relevant departments or external providers with these, and have they been included in your performance tests?
- Do you have a delivery pipeline?
- Are you doing continuous integration or continuous deployment? If not, why not?
- What tools and testing techniques are you using to ensure quality?
- Is the system available 24/7? Does it need to be?
- Is there user impacting downtime required for deployments?
- How many releases per day are deployed into your production environment?
- How do you monitor that a deployment improves user experience (performance testing etc)?
- How do you determine if a deployment has succeeded, and what do you do if it fails / goes wrong?
- If they’re using third-party APIs (e.g. Azure, OpenAI, Anthropic, Google), how are they managing the risk of model versioning or service changes?
- How do they monitor for ‘model drift’ or changes in the quality of AI outputs over time?
- Describe the languages, frameworks, and other technical choices you've made
- Are they using AI in a Category A (productivity aid to BUILD) or a Category B (AI integrated into the Service OUTPUT) manner?
- Why was this specific AI model or approach chosen over a standard, non-algorithmic solution?
- Have they ruled out the current stack and are going to investigate?
- Are they following a standard tech stack?
- Describe the development tool chain that you have selected
- How will you know if the service is healthy?
- Do they have monitoring, alerts and logging?
- How easy is it to rebuild the service if necessary, i.e. are they using configuration management and tools for provisioning?
- Describe how you are making new source code open and reusable?
- Describe how a team in another department can reuse your code?
- Are they sharing AI-specific assets such as prompts, model configurations, or training datasets with the wider government?
- What code from other teams/services are you using?
- How are you handling updates and bug fixes to the code?
- Describe your use of common government platforms
- Describe the integration mechanisms with any external systems
- What environments do you have?
- Where are you hosted?
- How do you create environments?
- What data exists in your pre-production environments?
- If the AI component fails (e.g. API downtime or model error), what is the fallback mechanism for the user?
- How are you gaining confidence that your service, and connected systems, will perform under expected loads?
- Have they considered the environmental and energy consumption of their AI infrastructure?
- What sorts of loads do they expect on a regular basis?
- How did they come up with those numbers?
- Have they considered peak events as well as normal loads?
- How confident are they about their impact on existing connected systems, either legacy systems or SaaS/PaaS that they are utilising?
- How will supplier failures affect you?
- Have you considered DDoS and protected yourself?
- What sort of runbook exists for issues?
- What is your data recovery strategy and how often are you testing it?
- Describe the perceived threats to your service and how you are designing the prototype to mitigate them
- Does the AI tool make final decisions autonomously,
- Are they aware of the OFFICIAL threat model and the security classification at OFFICIAL?
- If the service disburses money, what criminal elements might be interested?
- Does the service use PII or protected characteristics to train or fine-tune the model?
- If the service holds large amounts of personal data, who might want to steal it?
- What fraud vectors exist and what controls are you prototyping?
- Does the service either give a citizen/business money or an entitlement to money? If so, fraud is a risk
- How are they identifying users?
- Do they have an awareness of common fraud vectors?
- Do they have an idea for an intelligence component for analysts to look at common patterns?
- Describe your team’s approach to security and risk management
- Is the whole team aware of and owning security?
- Who owns the risks in the service? Does the team respect the risk appetite?
- How frequently are they penetration testing?
- Describe the threats to your service
- Are there threats that are specific to them rather than just OFFICIAL threats?
- How are they dealing with the threats?
- How widely is the team aware of the threat landscape? Are decisions made by fully informed people?
- Can the service be used as part of a jigsaw attack?
- What fraud vectors exist and what controls are you putting in place?
- What reports are they generating for the business?
- Are lines of business joined up? (for example, claiming petrol tax but not vehicle tax)
- [Can we ask how much fraud is acceptable?]
- Describe your interactions with the business and information risk teams, for example senior information risk owner (SIRO), IAB, Data Guardians
- Are they meeting on a regular basis?
- Are the stakeholders highly engaged or just reading a report?
- Can the stakeholders actually input to the service design, and change anything?
- Describe any outstanding legal concerns for example data protection or data sharing
- Do they share or hold data that they shouldn’t?
- Do they make clear what data they process and why?
- Describe your cookie and privacy policy and how you arrived at it?
- Do they have a cookie policy?
- Do they have a privacy policy?
- Is it copied from GOV.UK?
- Does it match the cookies actually set by the service, for example Google Analytics?
- Describe your team’s approach to security and risk management
- Is the whole team aware of and owning security?
- Who owns the risks in the service? Does the team respect the risk appetite?
- How frequently are they penetration testing?
- Describe your ongoing interactions with the business and information risk teams, for example senior information risk owner (SIRO), IAB, Data Guardians
- Are they meeting on a regular basis?
- Are the stakeholders highly engaged or just reading a report?
- Can the stakeholders actually input to the service design, and change anything?
- Describe any outstanding legal concerns for example data protection or data sharing
- Do they share or hold data that they shouldn’t?
- Do they make clear what data they process and why?
- How are you keeping your understanding of the threats to your service up to date? Have the threats changed since the beta?
- How are you keeping your cookie and privacy policy up to date?
- How are you staying aware of security updates to your systems and how quickly can you respond?
- How do they apply patches for security updates?
- What does a security release entail?
- What technical choices have you made and why - language, framework, deployment, integration, third parties?
- How do you plan to make the service open - open source, open standards, open data, common platforms?
- How will you ensure the service is safe for users - data privacy, security threats, fraud vectors?
- What other distinct options have you tried / tested?
- [Alpha] What are your technology plans for private beta?
- What plans do you have if your service is unavailable for any length of time?
Coverage Matrix — Role × Phase
Number of question prompts at each intersection of assessor role and assessment phase. Darker cells indicate higher concentration.
| Role / Topic | Alpha | Beta | Live | Tech pre-call | Total |
|---|---|---|---|---|---|
| Design | 11 | 13 | 7 | | 31 |
| Lead Assessor | 9 | 8 | 6 | | 23 |
| Performance Analytics | 8 | 7 | 14 | | 29 |
| Tech pre-call | | | | 106 | 106 |
| Technology | 16 | 15 | 14 | | 45 |
| User Research | 20 | 10 | 8 | | 38 |
| Total | 64 | 53 | 49 | 106 | 272 |
The Technology role and Tech pre-call topics dominate the bank, reflecting the depth of AI-specific technical probing introduced in March 2026. User research and design get focused but proportionate attention across the Service Standard lifecycle.
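For readers who want to check the matrix, the cross-tab can be reproduced with a simple tally of (role, phase) pairs. This is a minimal sketch: the cell counts are copied from the matrix above, and the `crosstab` helper and the expanded `records` list are hypothetical stand-ins for the real question data.

```python
from collections import Counter

# Cell counts as read from the matrix above (role -> phase -> prompts).
COUNTS = {
    "Design": {"Alpha": 11, "Beta": 13, "Live": 7},
    "Lead Assessor": {"Alpha": 9, "Beta": 8, "Live": 6},
    "Performance Analytics": {"Alpha": 8, "Beta": 7, "Live": 14},
    "Tech pre-call": {"Tech pre-call": 106},
    "Technology": {"Alpha": 16, "Beta": 15, "Live": 14},
    "User Research": {"Alpha": 20, "Beta": 10, "Live": 8},
}


def crosstab(records):
    """Tally (role, phase) pairs into cell, row and column counts."""
    cells = Counter(records)
    rows = Counter()
    cols = Counter()
    for (role, phase), n in cells.items():
        rows[role] += n
        cols[phase] += n
    return cells, rows, cols


# One (role, phase) record per prompt, expanded from the counts above.
records = [
    (role, phase)
    for role, phases in COUNTS.items()
    for phase, n in phases.items()
    for _ in range(n)
]

cells, rows, cols = crosstab(records)
grand_total = sum(rows.values())
```

The row and column tallies produced this way should match the Total column and Total row, which is a quick consistency check whenever prompts are added to the bank.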