What’s New in Psychology
That AI Tool You Use May Be Biased
Jim Windell
The American Psychological Association recently published ethical guidelines for the use of Artificial Intelligence in the professional practice of health psychology.
You may not think this has anything to do with you. After all, you never use AI in offering clinical services to your clients, right?
Although you may think that people only use AI to help them write a paper for a high school or college course, chances are that you are already using AI – perhaps every day. AI is deeply woven into the fabric of daily life for most Americans, and the statistics bear this out.
For instance, consider these statistics:
- 79% of AI experts believe Americans interact with AI almost constantly or several times a day.
- But only 27% of U.S. adults think they interact with AI that frequently.
- In reality, 99% of Americans use at least one AI-powered tool weekly – like weather apps, streaming services, online shopping, social media, virtual assistants, or GPS.
- 52% of U.S. adults have used AI chatbots like ChatGPT, Gemini, Claude, or Copilot.
- 34% of those users interact with large language models (LLMs) at least once a day.
For psychologists and other mental health professionals who may still be skeptical, here’s more to ponder:
AI is already embedded in many aspects of mental health care, and many clinicians don't realize it. For example:
- AI tools like Mentalyc and Limbic help therapists by automatically generating therapy notes, treatment plans, and progress summaries from session data.
- AI is used to automate scheduling and appointment reminders, billing and insurance processing, and routine communication such as FAQs and educational content.
- AI-enhanced platforms can suggest evidence-based interventions, track symptom progression, and even recommend digital therapeutics to complement in-person care.
- Chatbots like Wysa, Youper, and Replika offer CBT-based self-help tools, mood tracking, and emotional support between sessions. Clinicians sometimes recommend these tools to clients without fully understanding the AI mechanisms that personalize the experience.
Given all of this, a new study out of Cedars-Sinai Health Sciences University in Los Angeles offers a cautionary note about using AI tools blindly. The findings were recently published in the peer-reviewed journal npj Digital Medicine.
For the study, the investigators examined four large language models (LLMs), a category of AI algorithms trained on enormous amounts of data, which enables them to understand and generate human language. In medicine, LLMs are drawing interest for their ability to quickly evaluate and recommend diagnoses and treatments for individual patients.
The study found that the LLMs, when presented with hypothetical clinical cases, often proposed different treatments for psychiatric patients when African American identity was stated or simply implied than for patients for whom race was not indicated. Diagnoses, by comparison, were relatively consistent.
“Most of the LLMs exhibited some form of bias when dealing with African American patients, at times making dramatically different recommendations for the same psychiatric illness and otherwise identical patient,” says Elias Aboujaoude, M.D., M.A., director of the Program in Internet, Health and Society in the Department of Biomedical Sciences at Cedars-Sinai and corresponding author of the study. “This bias was most evident in cases of schizophrenia and anxiety.”
Among the disparities the study uncovered were the following:
- Two LLMs omitted medication recommendations for an attention-deficit/hyperactivity disorder case when race was explicitly stated, but suggested them when race was not mentioned in the case.
- Another LLM suggested guardianship for depression cases with explicit racial characteristics.
- One LLM showed increased focus on reducing alcohol use in anxiety cases only for patients explicitly identified as African American or who had a common African American name.
Aboujaoude suggests the LLMs showed racial bias because they reflected bias found in the extensive content used to train them. Future research, he says, should focus on strategies to detect and quantify bias in artificial intelligence platforms and training data, create LLM architecture that resists demographic bias, and establish standardized protocols for clinical bias testing.
“The findings of this important study serve as a call to action for stakeholders across the healthcare ecosystem to ensure that LLM technologies enhance health equity rather than reproduce or worsen existing inequities,” says David Underhill, Ph.D., Chair of the Department of Biomedical Sciences at Cedars-Sinai and the Janis and William Wetsman Family Chair in Inflammatory Bowel Disease. “Until that goal is reached, such systems should be deployed with caution and consideration for how even subtle racial characteristics may affect their judgment.”
What do the new “Ethical Guidelines for AI in the Professional Practice of Health Psychology” say?
To quote just a small portion of the guidelines: “AI systems should be evaluated with a focus on addressing bias and preventing exacerbation of existing health care disparities.”
The original article on which this blog is based can be found at:
Bouguettaya, A., Stuart, E. M., & Aboujaoude, E. (2025). Racial bias in AI-mediated psychiatric diagnosis and treatment: A qualitative comparison of four large language models. npj Digital Medicine, 8, 332. https://doi.org/10.1038/s41746-025-01746-4
The “Ethical Guidelines for AI in the Professional Practice of Health Psychology” can be found at: https://www.apa.org/topics/artificial-intelligence-machine-learning/ethical-guidance-ai-professional-practice