TELUS Digital Research Reveals a Hidden Risk in AI Model Behavior
Study shows the use of persona prompting can cause shifts in LLMs' moral judgments,
leading to unexpected and inconsistent responses
For enterprises, this means careful model selection, rigorous testing and ongoing evaluation are essential to ensure consistent,
reliable AI behavior in production
"When AI models adopt different personas, they don't just change how they speak, they can fundamentally alter their reasoning and decision-making," said
What is persona prompting?
Persona prompting, also known as role prompting, refers to instructing an AI model to respond as if it were a specific type of person or role with specific expertise or knowledge, such as a business leader, teacher, or customer support agent, rather than responding as a neutral system. For example: "You are a certified financial planner, tell me where to invest my retirement savings."
Persona prompting is also commonly used by model builders in system design and production to hardcode personas and assign fixed roles that will define the AI's behavior. For instance, building an AI-powered customer service bot that's configured to act as a helpful support agent with deep knowledge of product features and return policies. In practice, personas make AI outputs feel more consistent, helpful and context-aware without changing the underlying model.
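For illustration, a hardcoded persona of this kind is typically set as a system message. The following is a minimal sketch using the OpenAI Python SDK; the model name, persona text and question are hypothetical and not drawn from the study:

```python
# Hypothetical sketch: assigning a fixed persona via a system message.
# The persona, model and question below are illustrative only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PERSONA = (
    "You are a helpful customer support agent with deep knowledge of "
    "our product features and return policies."
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": PERSONA},  # fixed, hardcoded persona
        {"role": "user", "content": "Can I return an opened item after 30 days?"},
    ],
)
print(response.choices[0].message.content)
```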
How was TELUS Digital's research done?
The study, conducted by researchers working at the TELUS Digital Research Hub at the University of São Paulo, prompted leading open-source and proprietary LLMs to adopt a range of personas and analyzed their responses.
To assess the responses, researchers used the Moral Foundations Questionnaire, a tool used in social psychology to measure how judgments are made across dimensions such as harm, fairness, authority and loyalty. Rather than analyzing individual answers, the researchers examined patterns across tens of thousands of responses to measure how consistently each model reasoned across different personas.
The study identified two properties:
- Moral robustness describes how consistent a model's judgments remain while it stays within a single persona.
- Moral susceptibility captures how much a model's judgments shift when it moves from one persona to another.
When evaluated together, moral robustness and moral susceptibility reveal whether an AI model maintains consistent moral reasoning or produces contradictory judgments based on an assigned persona.
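For illustration only, one simple way to operationalize the two properties is to treat robustness as low variance within a persona and susceptibility as variance across personas; the paper's exact metrics may differ, and the personas and scores below are hypothetical:

```python
# Illustrative sketch only: one plausible way to score the two properties
# from repeated Moral Foundations Questionnaire (MFQ)-style answers. The
# study's exact formulas may differ; all values here are made up.
from statistics import mean, pstdev

# persona -> repeated MFQ-style scores (e.g., 0-5 ratings on one foundation)
scores = {
    "compliance officer": [4.1, 4.0, 4.2, 4.1],
    "business leader":    [2.9, 3.4, 2.6, 3.1],
    "teacher":            [3.8, 3.9, 3.7, 3.8],
}

# Moral robustness: stability of judgments *within* each persona
# (here, the average within-persona standard deviation; lower = more robust).
within = mean(pstdev(v) for v in scores.values())

# Moral susceptibility: how much judgments shift *across* personas
# (here, the standard deviation of each persona's mean score).
across = pstdev([mean(v) for v in scores.values()])

print(f"within-persona spread (lower = more robust): {within:.3f}")
print(f"across-persona spread (higher = more susceptible): {across:.3f}")
```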
TELUS Digital's key findings on how personas affect AI model behavior
While it's well understood that LLM outputs can shift when personas are added to prompts, TELUS Digital's study highlights a more specific pattern: moral robustness is driven mainly by model family, while moral susceptibility tends to increase with model size within the same family. The risk is highest when those shifts show up in business decisions where consistency and oversight matter most, such as compliance, finance, healthcare or human resources.
The study identified additional patterns in how AI models respond when prompted to adopt different personas. Researchers described the findings as a "robustness paradox" because the models that were better at staying in character also showed larger shifts in moral judgments when the persona changed.
- Persona-based prompts can systematically influence moral reasoning in AI: Changes in the models' judgments are not random; they shift in predictable ways aligned with the assigned roles.
- Judgment stability is primarily driven by model family: A subset of the findings indicated that:
- Claude demonstrated the highest overall moral robustness
- Gemini and GPT demonstrated moderate moral robustness
- Grok demonstrated comparatively low moral robustness
What are the real world impacts of persona prompting when building AI?
TELUS Digital's research findings highlight the importance of conducting ongoing testing and oversight of AI models as part of a robust governance framework. This is particularly important when AI models are employed in scenarios where decisions affect people's lives, safety or rights, and in regulated environments such as banking and finance, insurance and healthcare. Understanding how different AI models behave under different persona prompts helps model builders and enterprises identify where variability is acceptable and where it can introduce risk.
"Our research findings underscore why enterprise AI deployment requires more than just picking the most advanced or largest model. Organizations must evaluate how individual models respond to variables such as persona prompting and choose options that deliver consistent, reliable outputs without introducing unexpected risk," said
Are you ready to uncover the vulnerabilities in your GenAI applications? Learn more at: https://www.fuelix.ai/products/fuel-fortify
The TELUS Digital Research Hub brings together academic researchers and industry practitioners to study how advanced AI models behave in real-world, human-facing contexts. For more information visit: https://www.telusdigital.com/research-hub
View related press releases:
TELUS Digital Launches Fuel iX™ Fortify for Automated Red-Teaming
TELUS Digital Research
Frequently asked questions:
What is persona prompting in AI models?
Persona prompting, also known as role prompting, refers to instructing an AI model to respond as if it were a specific type of person or role, such as a compliance officer, business leader, teacher, doctor, or customer support agent, rather than responding as a neutral system. This technique is commonly used to make AI outputs feel more relevant, context-aware, and aligned with organizational tone or expectations.
What does TELUS Digital's study say about persona prompting?
TELUS Digital's study, The Robustness Paradox: Why Better Actors Make Riskier Agents, found that when the same AI model is prompted to adopt different personas by users, it can make different judgment calls even when the underlying question does not change. These shifts followed consistent patterns aligned with the persona being used, showing that persona-based prompting can influence how models make decisions, not just their tone and how they communicate.
What is moral robustness in AI models?
Moral robustness describes how stable an AI model's moral judgments remain while it stays in the same persona. For example, if a model is role-prompted to respond as a compliance officer, moral robustness measures whether its judgment stays consistent across many questions while it remains in that compliance role.
What is moral susceptibility in AI models?
Moral susceptibility describes how much an AI model's moral judgments change when the persona changes. For example, a model might respond one way when role-prompted as a compliance reviewer, but shift its judgment when role-prompted as a business leader focused on efficiency, when asked the same question.
Why do TELUS Digital's study findings in the paper, The Robustness Paradox: Why Better Actors Make Riskier Agents, matter for enterprises deploying AI?
TELUS Digital's study reveals a risk for enterprises deploying AI. It found that when leading open-source and proprietary LLMs are asked by users to "role-play" as part of a query or conversation (a technique known as persona prompting), the degree to which their moral judgments shift is driven primarily by model family and model size within a given family. These findings highlight a hidden risk for enterprise AI builders that should be proactively addressed during model selection and design, and through ongoing testing and monitoring once in production.
How does TELUS Digital's study inform AI governance and risk management frameworks?
TELUS Digital emphasizes the importance of ongoing testing and oversight as part of an enterprise AI risk management framework. Understanding how models behave under different persona prompts is a key input to responsible AI governance, helping organizations identify where variability is acceptable and where it introduces risk, particularly in higher-impact or regulated environments.
How does Fuel iX™ Fortify support continuous automated red-teaming and policy-aligned persona testing?
Fuel iX Fortify supports AI testing with automated red-teaming and ongoing monitoring as part of enterprise AI governance. It helps teams evaluate how AI models respond under a wide range of real-world conditions, including adversarial prompts, high-risk scenarios and persona prompting.
About TELUS Digital
TELUS Digital is a wholly-owned subsidiary of TELUS.
Powered by purpose, TELUS Digital leverages technology, human ingenuity and compassion to serve customers and create inclusive, thriving communities in the regions where we operate around the world. Guided by our Humanity-in-the-Loop principles, we take a responsible approach to the transformational technologies we develop and deploy by proactively considering and addressing the broader impacts of our work. Learn more at: telusdigital.com
Contacts:
TELUS Digital Media Relations
Ali Wilson
media.relations@telusdigital.com
TELUS Investor Relations
Olena Lobach
ir@telus.com
View original content to download multimedia: https://www.prnewswire.com/news-releases/telus-digital-research-reveals-a-hidden-risk-in-ai-model-behavior-302696265.html
SOURCE TELUS Digital