The Power of Multimodality in Multimodal Large Language Models, Unimodal ChatGPT 5.0, and Human Clinical Experts on a Wound Care Certification Examination: Cross-Sectional Comparative Study
Mete Ucdal et al · JMIR Publications · 2026
Abstract
Background: Multimodal large language models (MLLMs) capable of integrating visual and textual information represent a promising advancement for clinical applications requiring image interpretation. Wound care assessment, which demands simultaneous analysis of wound photographs and clinical data, provides an ideal domain to evaluate multimodal vs unimodal artificial intelligence capabilities against human expertise.
Objective: This study aims to compare the performance of MLLMs, unimodal ChatGPT 5.0, and human clinical experts on a standardized wound care certification examination.
Methods: This cross-sectional comparative study evaluated 3 participant groups on a 25-question wound care certification examination spanning 4 clinical domains (Diagnosis, Treatment, Complication Management, and Wound Subtype Knowledge). Participants included 3 MLLMs (Med-PaLM 2, LLaVA-Med, and BioGPT), 1 unimodal large language model (ChatGPT 5.0), and 4 human clinical experts (general surgeon, wound care nurse, and 2 internal medicine physicians). Statistical analyses included one-way ANOVA with Tukey post hoc tests and domain-specific Kruskal-Wallis comparisons.
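The group structure described in the Methods determines the degrees of freedom of the one-way ANOVA. The following is a minimal sketch, assuming the three groups are the MLLMs (n=3), ChatGPT 5.0 (n=1), and the human experts (n=4), and that each participant contributes a single overall score; the function name is hypothetical and is not taken from the study.

```python
def anova_degrees_of_freedom(group_sizes):
    """Return (between-groups df, within-groups df) for a one-way ANOVA."""
    k = len(group_sizes)          # number of groups
    n_total = sum(group_sizes)    # total number of participants
    if k < 2 or n_total <= k:
        raise ValueError("need at least 2 groups and more participants than groups")
    return k - 1, n_total - k


# Group sizes as listed in the Methods (assumed to be the ANOVA grouping).
group_sizes = {"MLLMs": 3, "ChatGPT 5.0": 1, "Human experts": 4}

df_between, df_within = anova_degrees_of_freedom(list(group_sizes.values()))
print(df_between, df_within)  # 2 5
```

Under these assumptions the result is 2 and 5, which agrees with the subscripts attached to the F statistic in the Results. With only 8 participants in total, the within-groups term rests on very few observations.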
Results: Human experts achieved the highest accuracy (mean 86%, SD 9.1%), followed by MLLMs (mean 78.7%, SD 12.2%), while ChatGPT 5.0 achieved 64% accuracy, failing the 70% certification threshold. Significant overall group differences were observed (F2,5 [statistics truncated in source]).
Conclusions: MLLMs demonstrate significant performance advantages over unimodal artificial intelligence in wound care assessment, particularly for visually dependent clinical tasks. While human experts with specialized wound care experience maintain overall superiority, the point estimate of the top-performing MLLM (Med-PaLM 2, 92%) fell within the observed range of human scores; however, the underpowered comparison (power=0.52) and wide CIs preclude definitive conclusions regarding noninferiority or equivalence to human experts. These findings support the potential role of MLLMs as clinical decision-support tools, warranting further adequately powered validation studies.
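Because the examination has 25 questions, each individual percentage in the abstract corresponds to a whole number of correct answers. The sketch below converts the two individual scores reported (ChatGPT 5.0 at 64% and Med-PaLM 2 at 92%) and works out the smallest passing score, assuming the 70% threshold means "at least 70% correct"; the helper names are hypothetical.

```python
TOTAL_QUESTIONS = 25
PASS_THRESHOLD_PERCENT = 70


def correct_from_percent(percent, total=TOTAL_QUESTIONS):
    """Convert a whole-number accuracy percentage to a count of correct answers."""
    numerator = percent * total
    if numerator % 100 != 0:
        raise ValueError("percentage does not map to a whole number of questions")
    return numerator // 100


def min_correct_to_pass(threshold_percent=PASS_THRESHOLD_PERCENT, total=TOTAL_QUESTIONS):
    """Smallest number of correct answers that reaches the threshold (ceiling division)."""
    return -(-threshold_percent * total // 100)


chatgpt_correct = correct_from_percent(64)    # 16 of 25
med_palm_correct = correct_from_percent(92)   # 23 of 25
needed = min_correct_to_pass()                # 18 of 25, since 70% of 25 is 17.5

print(chatgpt_correct, med_palm_correct, needed)  # 16 23 18
print(chatgpt_correct >= needed)                  # False
```

On this reading, ChatGPT 5.0 fell 2 questions short of the pass mark, and a single question is worth 4 percentage points, which helps explain why the confidence intervals on such a short examination are wide.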
How to cite
APA 7
Ucdal, M., et al. (2026). The Power of Multimodality in Multimodal Large Language Models, Unimodal ChatGPT 5.0, and Human Clinical Experts on a Wound Care Certification Examination: Cross-Sectional Comparative Study. https://doi.org/10.2196/88618
MLA
Ucdal, Mete, et al. "The Power of Multimodality in Multimodal Large Language Models, Unimodal ChatGPT 5.0, and Human Clinical Experts on a Wound Care Certification Examination: Cross-Sectional Comparative Study." 2026. https://doi.org/10.2196/88618.
Chicago
Ucdal, Mete, et al. 2026. "The Power of Multimodality in Multimodal Large Language Models, Unimodal ChatGPT 5.0, and Human Clinical Experts on a Wound Care Certification Examination: Cross-Sectional Comparative Study." https://doi.org/10.2196/88618.
Harvard
Ucdal, M. et al. 2026, The Power of Multimodality in Multimodal Large Language Models, Unimodal ChatGPT 5.0, and Human Clinical Experts on a Wound Care Certification Examination: Cross-Sectional Comparative Study, JMIR Publications, available at: https://doi.org/10.2196/88618 [Accessed 27 Jun. 2026].
Resource details
- Title: The Power of Multimodality in Multimodal Large Language Models, Unimodal ChatGPT 5.0, and Human Clinical Experts on a Wound Care Certification Examination: Cross-Sectional Comparative Study
- Author / contributors: Mete Ucdal et al
- Publisher: JMIR Publications
- Year of publication: 2026
- ISSN: 2561-326X
- Language: English (eng)