- AI in Lab Coat
- Posts
- Doctors vs GPT-4: Who Did Better at Reading Medical Notes?
Doctors vs GPT-4: Who Did Better at Reading Medical Notes?
plus: The Growing Chaos of AI in Medicine
Happy Friday! It’s December 27th.
As we close out 2024, I want to take a moment to say thank you. Your support and interest in this space have made this year truly special. It’s been amazing to see AI in healthcare evolve so rapidly, and I’m grateful to have you along for the ride.
In January, we’ll share our usual December funding report, and we’re working on a comprehensive recap of 2024’s AI healthcare startup landscape. Next year, we’ll bring in more expert voices, build new tools to track the AI rollout, and dive even deeper into the opportunities (and challenges) this transformation brings.
Our picks for the last issue of the year:
Featured Research: Doctors vs GPT-4: Who Did Better at Reading Medical Notes?
Perspectives: The Growing Chaos of AI in Medicine
Product Pipeline: FDA Authorizes First Device Proven to Help Spinal Cord Injuries Without Surgery
Policy & Ethics: Congress Unveils Ambitious Plan to Transform Healthcare with AI
FEATURED RESEARCH
Doctors vs GPT-4: Who Did Better at Reading Medical Notes?
Medical notes hold vast amounts of patient information, but their unstructured format makes them tough for computers to process.
A new study evaluated GPT-4's ability to analyze these notes in English, Spanish, and Italian. The findings could reshape how hospitals manage patient records globally.
How it worked: The study included eight hospitals from the USA, Colombia, Singapore, and Italy, each contributing de-identified notes.
These notes ranged from admission to consultation records, focusing on patients with obesity and COVID-19. GPT-4 answered 14 predefined questions about the notes, with two physicians validating its responses.
Key findings: GPT-4 matched physician consensus in 79% of cases, with better performance on Spanish (88%) and Italian (84%) notes compared to English (77%).
It excelled at extracting clear details like patient age or conditions but struggled with implicit reasoning, such as linking symptoms to complications.
Challenges included variability in note styles and GPT-4's occasional hallucinations, in which it fabricated information not found in the text.
What this means: The results highlight GPT-4's potential to streamline medical data analysis across languages. However, it’s not yet ready for nuanced clinical decision-making.
The study suggests focusing on simpler tasks, like patient selection for trials, where GPT-4 performed well. Integrating tools like GPT-4 into workflows could save time but will require fine-tuning to handle complex clinical scenarios.
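The study's headline numbers are simple agreement rates: GPT-4's answer to each predefined question counts as a match only when it agrees with the two validating physicians' consensus, tallied per language. Here's a minimal sketch of that kind of scoring; the records below are made-up examples, not data from the study:

```python
from collections import defaultdict

def agreement_by_language(records):
    """Share of question-level answers where the model matched
    the physicians' consensus, grouped by note language."""
    matches = defaultdict(int)
    totals = defaultdict(int)
    for rec in records:
        totals[rec["language"]] += 1
        if rec["gpt4_answer"] == rec["physician_consensus"]:
            matches[rec["language"]] += 1
    return {lang: matches[lang] / totals[lang] for lang in totals}

# Hypothetical validated answers (one row per note/question pair).
records = [
    {"language": "English", "gpt4_answer": "yes", "physician_consensus": "yes"},
    {"language": "English", "gpt4_answer": "no", "physician_consensus": "yes"},
    {"language": "Spanish", "gpt4_answer": "58", "physician_consensus": "58"},
    {"language": "Italian", "gpt4_answer": "obesity", "physician_consensus": "obesity"},
]

print(agreement_by_language(records))
# {'English': 0.5, 'Spanish': 1.0, 'Italian': 1.0}
```

With the real dataset, this is the computation behind figures like the 79% overall and the per-language 88%/84%/77% splits.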
For more details: Full Article
Help Us Make This Newsletter Better!
What kind of AI healthcare data would you find most useful?
Your feedback will help us make content that’s useful for you!
Opinion and Perspectives
AI REGISTRY
The Growing Chaos of AI in Medicine
Hospitals are scrambling to keep up with the flood of AI tools promising to improve care. At Duke Health, a malfunctioning AI tool went unnoticed until it caused a major patient recall. It wasn’t logged in their system, a costly oversight that highlighted a larger issue: health systems don’t know what AI they’re using.
Building an AI registry: Michael Pencina, Duke’s Chief Data Scientist, is advocating for a federated AI registry.
Think of it as a shared database where hospitals log every AI tool in use. This kind of system would make it easier for health systems to track, evaluate, and share knowledge about the tools they’re deploying.
Why collaboration matters: Health systems evaluating AI in isolation can’t keep up with the pace or expense. At a recent Coalition for Health AI meeting, Pencina noted the strain smaller systems face.
Sharing evaluations and real-world outcomes across institutions can reduce duplication and improve patient safety—especially for underfunded health systems.
The bigger picture: Transparency is the foundation of trust. For clinicians, knowing how AI tools influence decisions is critical.
For patients, understanding the role of these tools in their care is essential. A shared registry isn’t just a practical solution; it’s a necessary step in making AI a trustworthy part of modern medicine.
For more details: Full Article
Product Pipeline
SPINAL CORD INJURY
FDA Authorizes First Device Proven to Help Spinal Cord Injuries Without Surgery
ONWARD Medical has achieved FDA De Novo classification and U.S. market authorization for the ARC-EX System, the first non-invasive spinal cord stimulation device designed to improve hand strength and sensation in individuals with chronic spinal cord injuries (SCI).
The device delivers programmed electrical stimulation through electrodes placed on the neck, providing a non-surgical approach to restoring some function.
ONWARD is also working on other innovative solutions, including the ARC-BCI System, which incorporates a brain-computer interface (BCI) powered by AI to explore additional options for SCI care.
Clinical trials showed that 90% of participants experienced improved strength or function, with some benefits observed even decades after injury.
This approval is a meaningful step forward for individuals living with SCI, with the ARC-EX earning recognition as a 2024 TIME Best Invention.
For more details: Full Article
Policy and Ethics
HEALTHCARE AI POLICY
Congress Unveils Ambitious Plan to Transform Healthcare with AI
Congress has released a bipartisan framework to shape how AI is used in healthcare. The report points to AI’s ability to analyze vast datasets, which could save billions annually by reducing administrative costs and expediting drug development.
For example, AI-driven diagnostics have already shown accuracy rates up to 87%, surpassing many traditional methods. However, risks like data breaches, biases in algorithms, and unclear liability for errors pose significant challenges.
To address these, the plan calls for stronger privacy protections, clear standards for safe AI use, and investments in research, aiming to ensure these technologies benefit patients without compromising safety or equity.
For more details: Full Article
Byte-Sized Break
📢 Three Things AI Did This Week
Former OpenAI engineer Suchir Balaji, known for his contributions to ChatGPT and for raising concerns over AI copyright practices, tragically passed away in November 2024 at age 26. [Link]
Italy’s privacy watchdog fined OpenAI €15 million for processing ChatGPT user data without a legal basis, for a lack of transparency, and for inadequate age verification; OpenAI plans to appeal and to run a public awareness campaign. [Link]
Encyclopaedia Britannica plans to go public at a valuation of over $1 billion, leveraging AI-powered educational products to customize learning experiences; it has doubled its revenue to $100 million over the past two years. [Link]
Happy New Year!
❤️ Help us create something you'll love—tell us what matters! 💬 We read all of your replies, comments, and questions. 👉 See you all next week! - Bauris
Trivia Answer: We’ll be back next year with more AI Trivia!
See you all in the New Year!
How did we do this week?