
Advanced Prostate Cancer

21,806 members · 27,295 posts

New AI focused on medicine

Maxone73
15 Replies

While I love specialized AI (the kind built for a specific task, like AlphaFold), I consider generative AI (like ChatGPT) a useful but unreliable tool… useful for relatively simple tasks like translating or summarizing a document (still, you'd better check it), unreliable when it comes to jobs that require in-depth comprehension (and the ability to say "I don't know" instead of improvising).

So I welcome an LLM focused on helping doctors:

marktechpost.com/2024/08/13...

Technically, I'd be curious to know whether they fine-tuned Llama 3 or built a RAG, since I'm working on something similar. Either way, it's an important achievement to help doctors keep up with the exponentially growing amount of knowledge out there!
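For anyone curious what the RAG option means in practice, here is a minimal sketch of the idea: retrieve the most relevant passage from a document store and prepend it to the prompt, so the model answers from the retrieved text rather than from whatever it memorized in training. The documents and the bag-of-words scoring are toy examples I made up, not anything from the linked article.

```python
# Toy RAG sketch: retrieve the best-matching document for a query,
# then build a prompt that tells the LLM to answer only from it.
from collections import Counter
import math

DOCS = [
    "Llama 3 is an open-weight large language model released by Meta.",
    "Retrieval-augmented generation (RAG) feeds retrieved documents to an LLM.",
    "Fine-tuning updates a model's weights on a specialized corpus.",
]

def tokenize(text):
    return [w.strip(".,()?").lower() for w in text.split()]

def cosine(a, b):
    """Cosine similarity between two bags of words."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0

def retrieve(query, docs, k=1):
    """Return the k documents most similar to the query."""
    q = tokenize(query)
    ranked = sorted(docs, key=lambda d: cosine(q, tokenize(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, docs):
    context = "\n".join(retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("What is retrieval-augmented generation?", DOCS))
```

A real system would use embedding vectors and a vector database instead of word overlap, but the shape of the pipeline is the same: retrieve, then generate.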

Written by Maxone73
street-air

I am always suspicious of LLM claims on benchmarks, because I don't really see how they can train the models while carefully avoiding all the benchmark questions, answers, and question templates (the same question type with slightly different parameters). That material isn't isolated to the benchmark test content; it's derived from existing medical texts, discussions, and so on, to say nothing of the endless medical exams with Q&A that aren't part of the benchmarks but influenced or overlap them. The benchmark answers are infecting the training data, in other words.

E.g., if one trains an LLM to be knowledgeable about Harry Potter by pouring the internet through it, but excludes the books themselves to avoid accusations that it is just regurgitating, it can still do really well on a Harry Potter exam called "what comes next?", because of course the Harry Potter text is already everywhere on the internet outside the books.
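The standard mitigation for exactly this worry (I'm not claiming the linked paper did it) is an n-gram decontamination pass: before training, flag any document that shares a long-enough exact word sequence with a benchmark item, since verbatim overlap suggests the answer leaked in. A rough sketch, with made-up thresholds and data:

```python
# Toy n-gram decontamination check: a training document is flagged if it
# shares any 8-word sequence with a benchmark question/answer.
def ngrams(text, n=8):
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def is_contaminated(train_doc, benchmark_items, n=8):
    """True if the document shares an n-gram with any benchmark item."""
    doc_grams = ngrams(train_doc, n)
    return any(doc_grams & ngrams(item, n) for item in benchmark_items)
```

The catch, as you say, is that paraphrases and reworded exam questions slip right past an exact-match filter, which is why contamination is so hard to rule out.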

Maxone73 in reply to street-air

That's not how you train an expert system based on neural networks. Normally you start from a reliable source of data. For example, for a system reading X-rays to detect breast cancer, you use a database of various parameters (my system had 64 parameters, or "dimensions" as we used to call them). Think of an Excel file with 64 columns per record. You know the outcome a priori, of course: there must be a field saying whether, at the end of the day, that X-ray led to cancer or not.

Then you calculate the correlation index between each pair of columns; where two columns are strongly correlated, you delete one of them to save time and resources during training. Then you check for dominance, for example a record that is repeated often, and you delete the duplicates. At that point you split the data randomly (multiple times), using one part for training and the rest to test the results on unknown inputs (unknown because we deleted all the duplicates). You repeat the process multiple times to verify convergence and the reliability rate.

And then comes the hard part: you find a panel of experts and have them guess whether a patient had cancer, basing their decisions on their experience and on the same parameters the network has (though normally we humans use fewer). You compare each expert's answers with the ones given by the system, and that's roughly your benchmark.

With LLMs it's similar. When you create a RAG, you tokenize a bunch of very specialized documents and make them searchable by the LLM, which uses that data to compose a coherent answer. In that sense you're not using the LLM's knowledge of the topic but its ability to produce answers that are readable by a human. Plus, you don't have to retrain the whole system when you add new data. In your example, for a RAG I would make it digest the Harry Potter books and then ask questions whose answers can only be inferred, not regurgitated.
But as you said, "old" benchmarks can become spurious if they spread over the internet and the same internet is then used as the information database.

Do I make any sense??? 😂😂😂

Maxone73 in reply to Maxone73

Having said this, I can add… yes, there is a bubble; yes, it will burst soon, and only the good ones will survive! 😜

street-air in reply to Maxone73

Ah, that's an earlier neural-network training approach. Not an LLM, which, while it's a neural network, is a sub-category called a transformer. An LLM absorbs all the text available, gigabytes of it, vast data sets. That's how it ends up "speaking" usually plausible English. In this particular case I bet it was trained to absorb all the medical text available. I'm just skeptical as to how they can avoid absorbing all the answers to known medical benchmarks while they do this, which of course means it then aces said benchmarks.

By the way, my point has been raised many times.

bdtechtalks.com/2023/07/17/...

If someone wants their new LLM to make headlines on benchmarks, the temptation to play fast and loose and let data contamination in is huge, I reckon.

Maxone73 in reply to street-air

Transformers are an architecture 😄 The concept behind them is the same: they are multi-layered networks, very deep. Yes, with language it's different in terms of gigabytes and emergent behaviors, but I still think they are going for RAG rather than retraining a whole network. Having a system that can easily keep you updated on all the latest clinical trials, their level of reliability, and so on is already something that could have great impact!
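That's the practical advantage of the RAG route: keeping up with new trials is an index update, not a retraining run. A toy sketch (the store, entries, and method names are all invented for illustration):

```python
# Toy document store: adding the latest trial summary is a cheap insert,
# with no model retraining step involved.
class TrialStore:
    """A minimal in-memory store for clinical-trial summaries."""

    def __init__(self):
        self.docs = []

    def add(self, year, summary):
        self.docs.append((year, summary))  # just an insert; no retraining

    def latest(self, since):
        """Return summaries from a given year onward."""
        return [s for y, s in self.docs if y >= since]

store = TrialStore()
store.add(2023, "Trial A: drug X improved overall survival.")
store.add(2024, "Trial B: drug Y reduced progression risk.")
print(store.latest(2024))  # only the 2024 entry
```

Fine-tuning, by contrast, would mean re-running (at least part of) training every time new trial results come out.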

street-air in reply to Maxone73

Well, my bet is that all they are doing is taking the current LLM training process and narrowing the input data to everything medical they can scrape, then launching that without the normal guardrails OpenAI builds in to avoid being sued for a wrong diagnosis. Hand-picking the training data to avoid pollution from benchmark-related text would be enormously time-consuming and difficult, and would work against claims of performance anyway, so my other bet is that the training set is likely polluted with the benchmark data they do well on. And my last bet is that the actual training will remain proprietary for "competitive reasons" and to avoid getting sued for the same things OpenAI is being sued for now: use of copyrighted work without permission.

Yes, this is a cynical view, but that's what 40 years in tech does to one. One ends up unpleasantly surprised less frequently that way.

Maxone73 in reply to street-air

Oh another Dilbert reader! Hello brother!!

Maxone73 in reply to street-air

Is it an effect of my ADT, or did you originally not write "Not an LLM which while it's a neural network is a sub-category called a transformer"?? 😂😂😂

Mgtd in reply to Maxone73

Thank you so much. That was a great explanation of a very complex process. When my grandson talks to me about it, I get lost in the weeds. I guess it's just too complex for this old brain to comprehend.

j-o-h-n

Hey Harriet! Yes, Norman? They're posting about AI again. Wanna read about it, or back to watching the pornos? Oh, OK Harriet, back to Señor Dong Does Dallas....

Good Luck, Good Health and Good Humor.

j-o-h-n

Carlosbach in reply to j-o-h-n

Loved this one John. Plus, you chose the title of a great flic.

j-o-h-n in reply to Carlosbach

So you've watched him too........

Good Luck, Good Health and Good Humor.

j-o-h-n

Carlosbach in reply to j-o-h-n

but only for the naked women

j-o-h-n in reply to Carlosbach

We wouldn't have it any other way...

Good Luck, Good Health and Good Humor.

j-o-h-n

muttonman in reply to j-o-h-n

J-E-T-S........Jets Jets Jets
