Fight Prostate Cancer


Chatbots Not Always Reliable for Cancer Treatment Advice - MedPage Today, August 24, 2023.

cujoe

Seems AI for cancer info and advice might not yet be ready for prime time. For the immediate future, I'd trust something like Cancer Hacker Lab long before a Chatbot for info and guidance on treatment, esp. if it was other than SOC.

* * *

Chatbots Not Always Reliable for Cancer Treatment Advice - Studies show their potential, but reveal clear issues with treatment information reliability, by Mike Bassett, Staff Writer, MedPage Today, August 24, 2023.

Chatbots had mixed results when it came to providing direct-to-patient cancer-related advice and treatment strategies for a wide variety of cancers, according to two studies in JAMA Oncology.

When Danielle Bitterman, MD, of Mass General Brigham and Harvard Medical School in Boston, and colleagues tested GPT-3.5 (OpenAI) with prompts designed to obtain treatment strategies for different kinds of cancers, they found that while most answers were in accordance with National Comprehensive Cancer Network (NCCN) guidelines, one-third were at least partially nonconcordant, the team reported in a research letter.

They suggested that clinicians "advise patients that LLM [large language model] chatbots are not a reliable source of treatment information."

Findings from the second study -- which tested four AI chatbots including GPT-3.5 on direct-to-patient advice -- were more positive, suggesting that their use "generally" produced accurate information on cancer-related search inquiries, but that these responses were not readily actionable and were written at a college level, according to Abdo Kabarriti, MD, of the State University of New York Downstate Health Sciences University in New York City, and colleagues.

"Findings of this study suggest that AI chatbots are an accurate and reliable supplementary resource for medical information," wrote Kabarriti and colleagues, "but are limited in their readability and should not replace healthcare professionals for individualized healthcare questions."

In an editorial accompanying the two studies, Atul Butte, MD, PhD, of the University of California San Francisco, said that while the results of these studies may suggest "our core belief in GPT technology as a clinical partner has not sufficiently been earned yet," the chatbots used in these studies are off the shelf and likely do not have specific healthcare training.

"Newer LLMs are now being released that have specific healthcare training, such as Google's Med-PaLM 2," he wrote. "Future medical evaluation studies are likely going to need to compare across several LLMs."

Moreover, Butte said the real potential of these tools in cancer care is that they can be trained from the very best centers, and then used "to deliver the right best care through digital tools to all patients, especially to those who do not have the resources or privilege to get that level of care."

Treatment Recommendations

For their study, Bitterman and colleagues developed four prompt templates for treatment recommendations for 26 different kinds of cancers (for a total of 104 prompts), and benchmarked the chatbot's recommendations against 2021 NCCN guidelines. Concordance of the chatbot output with NCCN guidelines was assessed by board-certified oncologists.
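
For readers who want a concrete picture of that setup, here is a minimal sketch assuming the OpenAI Python client. The template wording and cancer list below are placeholders for illustration, not the study's actual materials.

```python
from openai import OpenAI  # pip install openai

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical prompt templates and cancer types; the study's real
# materials are not reproduced in this post.
templates = [
    "What is the treatment for {cancer}?",
    "What is the recommended treatment for {cancer}?",
    "How should {cancer} be treated?",
    "What treatment strategy would you suggest for {cancer}?",
]
cancers = ["prostate cancer", "breast cancer", "lung cancer"]  # the study used 26

# 4 templates x 26 cancers = the study's 104 prompts.
prompts = [t.format(cancer=c) for t in templates for c in cancers]

for prompt in prompts:
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    output = resp.choices[0].message.content
    # Each output would then be graded for NCCN concordance by
    # board-certified oncologists, per the study design above.
    print(prompt, "->", output[:80])
```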

Findings showed that the chatbot provided at least one recommendation for 102 of 104 prompts (98%), and all outputs with a recommendation included at least one NCCN-concordant treatment. However, 35 of 102 (34.3%) of these outputs also recommended one or more nonconcordant treatments, and 13 of 104 responses (12.5%) included "hallucinated" recommendations, meaning treatments that were not part of any recommended treatment.

"The chatbot did not purport to be a medical device, and need not be held to such standards," Bitterman and colleagues wrote. "However, patients will likely use such technologies in their self-education, which may affect shared decision-making and the patient-clinician relationship. Developers should have some responsibility to distribute technologies that do not cause harm, and patients and clinicians need to be aware of these technologies' limitations."

Consumer Health Info

In their study, Kabarriti and colleagues inputted Google Trends' top five search queries related to skin, lung, breast, colorectal, and prostate cancer into four chatbots. Outcomes included the quality of consumer health information, based on the DISCERN instrument (a 1-to-5 scale, with 1 representing low quality), and the understandability and actionability of that information, based on domains of the Patient Education Materials Assessment Tool (PEMAT), scored from 0% to 100%, with higher scores indicating greater understandability and actionability.

They determined that the quality of text responses generated by the four AI chatbots was good (median DISCERN score of 5, with no misinformation identified). Understandability was moderate (median PEMAT Understandability score of 66.7%), but actionability was poor (median PEMAT Actionability score of 20%), with the authors noting that responses were written at the college level. "This finding suggests that AI chatbots use medical terminology that may not be familiar or useful for lay audiences," Kabarriti and colleagues said.

"These limitations suggest that AI chatbots should be used supplementarily and not as a primary source for medical information," they added. "To this end, AI chatbots typically encourage users to seek medical attention relating to cancer symptoms and treatment."

AI and LLMs are not yet perfect and can carry biases, Butte said in his editorial.

"These algorithms will need to be carefully monitored as they are brought into health systems," he continued. "But this does not alter the potential of how they can improve care for both the haves and have-nots of healthcare."

* * *

Here are links to 1. the MedPage article, 2. Dr. Bitterman's JAMA Research Letter, 3. the referenced "second study" JAMA Brief Report (full paper behind paywall), and 4. Dr. Atul Butte's editorial (also behind paywall):

1. medpagetoday.com/hematology...

2. jamanetwork.com/journals/ja...

3. jamanetwork.com/journals/ja...

4. jamanetwork.com/journals/ja...

This excerpt from the JAMA Brief Report summarizes the current state of AI chatbots for cancer information:

* * *

Results The analysis included 100 responses from 4 chatbots about the 5 most common search queries for skin, lung, breast, colorectal, and prostate cancer. The quality of text responses generated by the 4 AI chatbots was good (median [range] DISCERN score, 5 [2-5]) and no misinformation was identified. Understandability was moderate (median [range] PEMAT Understandability score, 66.7% [33.3%-90.1%]), and actionability was poor (median [range] PEMAT Actionability score, 20.0% [0%-40.0%]). The responses were written at the college level based on the Flesch-Kincaid Grade Level score.

Conclusions and Relevance Findings of this cross-sectional study suggest that AI chatbots generally produce accurate information for the top cancer-related search queries, but the responses are not readily actionable and are written at a college reading level. These limitations suggest that AI chatbots should be used supplementarily and not as a primary source for medical information.

* * *

I think I would trust "expert AI" for readings on scans, but I'm not sure I'm ready to turn the other aspects of my care over to it just yet - except, maybe, as a second opinion to challenge the flesh-and-blood intelligence I now rely on for my care.

Say "Hello" to your new Chatbot MO for us - and Stay S&W,

Ciao - K9

Written by cujoe
4 Replies
Cooolone

Yeah, when you see the responses with these "Bots" you know it's off. Anyone who would rely on some automated response over a trained professional oncologist and their 'team' certainly is traveling down a rocky road! Good Luck with that, lol.

Kuanyin

Exactly: using bots to analyze scan results makes the most sense, though running the results by a physician would also be useful. I have learned from this forum that there are published papers the docs don't see. Who would have the time? We, as a community, act like a kind of bot!

MNFarmBoy

I think "Never reliable, although perhaps sometimes correct" would be a better assessment of medical advice from the chatbots that are currently available to the public. Numerous descriptions of how chatbots operate state that what they essentially do is predict what the next word should be, based on the preceding words. Describing the output from chatbots as "Not always reliable" for medical advice (or most anything ) internally contradicts the meaning of "reliable". Stating the reliability as a percent probability of being correct would make a title that doesn't blatantly invite ridicule.

pca2004

There was once a young well-connected doctor who was diagnosed with aggressive PCa. He contacted every doctor he knew, however slightly, asking for advice. They all told him something different, but he was able to figure out a way forward.

Years later, he wrote about this in a NY Times Op Ed piece. The cancer was back. He was reaching out to colleagues again because he had faith in the "wisdom of the crowd".

Try building that into an AI engine!

-Patrick
