Bibliographic record · Consultation and access
Journal article

Benchmarking publicly accessible large language models for high-myopia multiple-choice question generation in digital ophthalmic education and public health training

Ligang Jiang et al. · Frontiers Media S.A. · 2026

Abstract

Background: Digital tools are reshaping public health education and training, yet evidence on whether large language models (LLMs) can generate specialist ophthalmic teaching materials remains limited. High myopia (HM), a vision-threatening condition with long-term management needs and public health relevance, provides a suitable setting for evaluating this capability. This study compared five LLMs in generating HM-related multiple-choice questions (MCQs) for ophthalmic education.

Methods: Five LLMs (ChatGPT-5.4, Gemini 3, DeepSeek, Kimi K2.5, and Doubao) completed 60 predefined HM MCQ generation tasks each, yielding 300 MCQs. A standardized blueprint covered four domains: basic knowledge, clinical cases, diagnosis and treatment decision-making, and screening/follow-up management. Objective evaluation included structural completeness, format compliance, keyed-answer accuracy, output features, and response time. Two ophthalmology experts rated six domains using 5-point Likert scales, and Spearman analyses examined associations among text features, response time, and expert ratings.

Results: All models achieved 100.0% initial structural acceptability, structural completeness, and format compliance. Keyed-answer accuracy was highest for ChatGPT-5.4 and Gemini 3 (both 100.0%), followed by DeepSeek (98.3%) and Kimi K2.5 and Doubao (both 95.0%). Significant between-model differences were observed across all output features and response time (all p < 0.001). ChatGPT-5.4 generated the shortest stems, Gemini 3 the shortest explanations and fastest responses, and Kimi K2.5 and Doubao the longest explanations and total outputs. Inter-rater agreement was good (ICC range, 0.835–0.885). Significant differences were found in clarity, distractor quality, and mean subjective score (all p < 0.001), but not in content rigor, educational usefulness, cognitive-level alignment, or overall usability. DeepSeek achieved the highest median mean score, while direct usability was highest for ChatGPT-5.4 (91.7%) and Gemini 3 (90.0%). Content rigor was strongly associated with overall usability (ρ = 0.85, p < 0.05), whereas distractor quality was negatively associated with explanation length (ρ = −0.43, p < 0.05) and total output length (ρ = −0.37, p < 0.05).

Conclusion: LLMs can reliably generate structurally valid HM-related MCQs under standardized Chinese prompting conditions. Their value may lie in supporting digital ophthalmic education and public health training, although expert oversight remains necessary because meaningful differences persist in factual accuracy, distractor quality, and direct usability.
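
The keyed-answer accuracy percentages in the abstract can be read as item counts. The following is a minimal sketch, not the authors' analysis code. It assumes each percentage is computed over the 60 MCQs generated per model, and the helper name `implied_count` is hypothetical.

```python
# Assumption: every reported percentage uses the 60 MCQs per model as its denominator.
ITEMS_PER_MODEL = 60

# Keyed-answer accuracy values as reported in the abstract.
reported_accuracy = {
    "ChatGPT-5.4": 100.0,
    "Gemini 3": 100.0,
    "DeepSeek": 98.3,
    "Kimi K2.5": 95.0,
    "Doubao": 95.0,
}


def implied_count(percent, total=ITEMS_PER_MODEL):
    """Return the whole-number item count that corresponds to `percent` of `total`."""
    return round(percent / 100 * total)


for model, pct in reported_accuracy.items():
    correct = implied_count(pct)
    miskeyed = ITEMS_PER_MODEL - correct
    print(f"{model}: {correct}/{ITEMS_PER_MODEL} correctly keyed, {miskeyed} miskeyed")
```

Under that assumption, 98.3% corresponds to 59 of 60 items and 95.0% to 57 of 60, so the gap between the best and worst models is three miskeyed items per model.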

How to cite

APA 7

Jiang, L., et al. (2026). Benchmarking publicly accessible large language models for high-myopia multiple-choice question generation in digital ophthalmic education and public health training. https://doi.org/10.3389/fpubh.2026.1843045

MLA

Jiang, Ligang, et al. "Benchmarking publicly accessible large language models for high-myopia multiple-choice question generation in digital ophthalmic education and public health training." 2026. https://doi.org/10.3389/fpubh.2026.1843045.

Chicago

Jiang, Ligang, et al. 2026. "Benchmarking publicly accessible large language models for high-myopia multiple-choice question generation in digital ophthalmic education and public health training." https://doi.org/10.3389/fpubh.2026.1843045.

Harvard

Jiang, L. et al. 2026, Benchmarking publicly accessible large language models for high-myopia multiple-choice question generation in digital ophthalmic education and public health training, Frontiers Media S.A., available at: https://doi.org/10.3389/fpubh.2026.1843045 [Accessed 29 Jun. 2026].

Resource details

Title: Benchmarking publicly accessible large language models for high-myopia multiple-choice question generation in digital ophthalmic education and public health training
Author / contributors: Ligang Jiang et al.
Publisher: Frontiers Media S.A.
Year of publication: 2026
ISSN: 2296-2565
Language: English (eng)
