Musculoskeletal trauma is the leading cause of consultation in pediatric emergency departments. In this context, initial interpretation of radiographs is often entrusted to physicians who are not specialized in radiology, including interns, emergency physicians, and trainee pediatric surgeons, thereby exposing patients to a risk of diagnostic error due to the anatomical particularities of children’s bones. A missed fracture in a child is not trivial: it can lead to lasting functional sequelae, bone growth disturbances, and medicolegal disputes with serious consequences.
Faced with this clinical reality, artificial intelligence (AI) is progressively establishing itself as a decision-support tool in medical imaging. Automated fracture detection algorithms based on convolutional neural networks (deep learning) promise to act as a “second set of eyes” capable of identifying what a fatigued or inexperienced human eye might miss.
But are these systems genuinely effective under real conditions, in a pediatric population whose anatomical particularities are so distinct from those of adults?
Pediatric Radiology: A Particularly Demanding Environment for AI
A child’s bone is not a miniature adult bone. It presents evolving anatomical characteristics that considerably complicate radiographic interpretation, including for AI systems trained primarily on adult cohorts.
Growth plates (physes), secondary ossification centers, apophyses still undergoing maturation, and normal anatomical variants specific to each age group are all potential traps. An apophyseal ossification center can mimic an avulsion fracture, and a physeal line can be mistaken for a fracture line. These subtleties explain why, among CE-certified tools for musculoskeletal radiology in Europe, very few commercial fracture detection products explicitly declare coverage of the pediatric population.
Research in pediatric musculoskeletal AI has long concentrated on the assessment of skeletal maturity (bone age), a field in which commercial tools have existed for more than a decade with well-established performance. By contrast, the detection of appendicular and vertebral fractures in children long remained largely confined to research.
Evaluating the performance of an AI tool cannot be reduced to its metrics on standardized datasets. The real question is how it behaves in a flow of consecutive, unselected patients as they present to the emergency department. This is what researchers call the “real-world cohort.”
In such real-world pediatric cohorts, reported sensitivity exceeds 90%. This is comparable to the performance reported in adults, a remarkable result given the complexity of pediatric anatomy.
However, these performance figures vary significantly depending on the anatomical region. The wrist, forearm, and leg display the best accuracy, while the ankle and shoulder show the weakest results. These disparities partly reflect the density of growth structures in certain articular zones.
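To make the idea of region-dependent performance concrete, the following minimal Python sketch computes sensitivity separately for each anatomical region. The case list, the region labels, and the function name `sensitivity_by_region` are hypothetical and chosen purely for illustration; they are not data from any of the cited studies.

```python
from collections import defaultdict

def sensitivity_by_region(cases):
    """Return {region: sensitivity} from (region, has_fracture, ai_positive) tuples.

    Sensitivity = true positives / (true positives + false negatives),
    so only cases with a confirmed fracture contribute.
    """
    true_pos = defaultdict(int)
    false_neg = defaultdict(int)
    for region, has_fracture, ai_positive in cases:
        if not has_fracture:
            continue  # fracture-free cases do not affect sensitivity
        if ai_positive:
            true_pos[region] += 1
        else:
            false_neg[region] += 1
    regions = set(true_pos) | set(false_neg)
    return {r: true_pos[r] / (true_pos[r] + false_neg[r]) for r in regions}

# Hypothetical toy data: (region, fracture confirmed?, AI flagged a fracture?)
toy_cases = [
    ("wrist", True, True), ("wrist", True, True),
    ("wrist", True, True), ("wrist", True, True),
    ("wrist", False, False),
    ("ankle", True, True), ("ankle", True, False),
    ("ankle", True, True), ("ankle", True, False),
    ("ankle", False, True),
]

per_region = sensitivity_by_region(toy_cases)
for region in sorted(per_region):
    print(f"{region}: sensitivity = {per_region[region]:.2f}")
```

Reporting a single pooled figure for this toy dataset would hide the gap between the two regions, which is why stratified reporting matters when judging a tool.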
In general, the negative predictive value, that is, the probability that a radiograph the AI reads as negative is truly free of fracture, is higher than the positive predictive value. In other words, a negative AI result is more reliable than a positive one. This argues for using AI more as a reassurance tool than as a confirmation tool.
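The relationship between the two predictive values can be sketched with Bayes' rule. In the Python example below, the sensitivity, specificity, and fracture prevalence are illustrative assumptions, not figures taken from the cited studies, and `predictive_values` is a hypothetical helper name.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied to a population.

    All arguments are proportions between 0 and 1.
    """
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)   # P(fracture | AI positive)
    npv = true_neg / (true_neg + false_neg)   # P(no fracture | AI negative)
    return ppv, npv

# Assumed values for illustration only.
ppv, npv = predictive_values(sensitivity=0.92, specificity=0.90, prevalence=0.25)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

Under these assumed values, where fractures are present in a minority of examinations, the negative predictive value comes out well above the positive predictive value. Both depend on prevalence, so the gap would differ in a population with another fracture rate.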
Regarding the impact on non-specialist physicians, available data show a statistically significant but modest improvement: the overall diagnostic accuracy of residents increases by approximately 2 to 3 percentage points with AI assistance. These gains, while limited in absolute value, may represent dozens of additional correctly diagnosed fractures at the scale of an emergency department over the course of a year.
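The order of magnitude behind this statement can be checked with simple arithmetic. In the Python sketch below, the annual number of trauma radiograph examinations read by residents is a hypothetical figure chosen for illustration; only the gain of 2 to 3 percentage points comes from the text above.

```python
def additional_correct_reads(annual_exams, gain_points):
    """Extra correctly read examinations per year for a given accuracy gain.

    gain_points is expressed in percentage points (e.g. 2 means +2 points).
    """
    return annual_exams * gain_points / 100

# Hypothetical volume for one emergency department; not a sourced figure.
ANNUAL_EXAMS = 3000

for gain in (2, 3):
    extra = additional_correct_reads(ANNUAL_EXAMS, gain)
    print(f"+{gain} points over {ANNUAL_EXAMS} exams: about {extra:.0f} additional correct reads")
```

With this assumed volume, the gain corresponds to roughly 60 to 90 additional correct reads per year. The actual number depends on the department's activity and on how many of those examinations involve a fracture.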
One undesirable side effect deserves attention: physicians sometimes changed an initially correct diagnosis after consulting the AI result and adopted its erroneous conclusion instead. This excessive deference to the algorithm, often referred to as automation bias, is a reminder that AI cannot substitute for a physician’s clinical judgment and must remain a decision-support tool.
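One way a department might monitor this effect is to count how often a reader's diagnosis changes after seeing the AI output, and in which direction. The Python sketch below assumes a hypothetical record structure (`Read`) with the reference diagnosis, the reader's initial and final reads, and the AI output; it is an illustration, not the method used in the cited studies.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Read:
    truth: bool    # reference standard: fracture present?
    initial: bool  # reader's diagnosis before seeing the AI output
    ai: bool       # AI output
    final: bool    # reader's diagnosis after seeing the AI output

def switch_counts(reads):
    """Count reads where the reader switched to the AI's answer.

    harmful:    initially correct, finally wrong, final matches the AI
    beneficial: initially wrong, finally correct, final matches the AI
    """
    harmful = sum(
        1 for r in reads
        if r.initial == r.truth and r.final != r.truth and r.final == r.ai
    )
    beneficial = sum(
        1 for r in reads
        if r.initial != r.truth and r.final == r.truth and r.final == r.ai
    )
    return {"harmful": harmful, "beneficial": beneficial}

# Hypothetical toy reads for illustration.
toy_reads = [
    Read(truth=True, initial=True, ai=False, final=False),   # deferred to a wrong AI
    Read(truth=True, initial=False, ai=True, final=True),    # corrected by the AI
    Read(truth=False, initial=False, ai=True, final=False),  # resisted a wrong AI
    Read(truth=True, initial=True, ai=True, final=True),     # agreement throughout
]

counts = switch_counts(toy_reads)
print(counts)
```

Tracking both directions matters: the net benefit of AI assistance is the number of beneficial switches minus the harmful ones.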
When AI Sensitivity Becomes Critical
Not all fractures are equivalent in terms of the consequences of a missed diagnosis. Certain lesions that are subtle on radiography but potentially severe in their sequelae constitute a particular medicolegal concern in pediatrics.
Among these high-vigilance fractures are radial (lateral) condyle fractures of the humerus, medial malleolus fractures, and proximal metaphyseal tibial fractures. All are characterized by minimally displaced fracture lines, subtle morphology, and a high risk of complications if not treated appropriately: growth disturbances, axial deformities, and joint stiffness. In some studies of medicolegal disputes, diagnostic errors involving fractures in children account for more than a quarter of lawsuits against pediatric surgeons, with a significant proportion of permanent injuries linked to an initial misinterpretation of radiographs.
These findings underscore the need for targeted training of algorithms on subpopulations of fractures with high clinical stakes, rather than settling for satisfactory overall performance on heterogeneous cohorts.
The evidence is accumulating: artificial intelligence applied to fracture detection in pediatrics works. It displays solid overall performance under real practice conditions, improves the accuracy of less experienced physicians, and can play a regulatory role in complex situations.
These encouraging results naturally come with areas for improvement that outline the roadmap for the years ahead. Variations in performance according to anatomical region, particularly for certain fractures with significant medicolegal stakes, indicate precisely where to focus training and validation efforts. Automation bias, the tendency to follow the algorithm uncritically, is now well identified and can be addressed through appropriate user training. AI in pediatric musculoskeletal imaging must therefore be conceived as a decision-support tool, not as an autonomous system replacing the radiologist or clinician. Its value lies in complementing human expertise, not in substituting for it.
Toward Responsible Deployment
The responsible integration of AI in pediatric radiology requires addressing several challenges that will structure the agenda for the coming years.
On the industry side, expectations are becoming more precise and more demanding. It is no longer sufficient to validate overall performance; vendors must demonstrate robustness across truly representative pediatric subpopulations and make progress on the fractures whose missed detection carries the heaviest clinical and medicolegal consequences.
For clinical and radiological teams, the challenge is as much cultural as practical. Integrating these tools into daily workflows requires training users not to rely on them passively but to read their output critically: knowing how to recognize the situations where the algorithm is reliable and those where it should be questioned, and how to document its recommendations in the patient record. Defining these protocols rigorously, and evolving them as tools improve, will be one of the key missions of learned societies and radiology departments in the years to come. AI will not replace the pediatric radiologist. But, well used, it can amplify radiological capabilities, secure diagnoses at the most fragile links in the care chain, and contribute to more equitable medicine for the most vulnerable children.
Sources
Offiah AC. Current and emerging artificial intelligence applications for pediatric musculoskeletal radiology. Pediatr Radiol. 2022 Oct;52(11):2149-2158. doi: 10.1007/s00247-021-05130-8. PMID: 34272573. https://pubmed.ncbi.nlm.nih.gov/34272573/
Pauling C, Laidlow-Singh H, Evans E, Garbera D, Williamson R, Fernando R, Thomas K, Martin H, Arthurs OJ, Shelmerdine SC. External validation of an artificial intelligence tool for fracture detection in children with osteogenesis imperfecta: a multireader study. Eur Radiol. 2026 Jan;36(1):515-525. doi: 10.1007/s00330-025-11790-z. PMID: 40624375. https://pubmed.ncbi.nlm.nih.gov/40624375/
Ziegner M, Pape J, Lacher M, Brandau A, Kelety T, Mayer S, Hirsch FW, Rosolowski M, Gräfe D. Real-life benefit of artificial intelligence-based fracture detection in a pediatric emergency department. Eur Radiol. 2025 Oct;35(10):5881-5890. doi: 10.1007/s00330-025-11554-9. PMID: 40192806. https://pubmed.ncbi.nlm.nih.gov/40192806/