Why Orthopaedic AI Needs Surgeon-Engineer Collaboration
Why pure-tech teams fail in medical AI and how surgeon-engineer collaboration from day one produces better orthopaedic AI models and outcomes.
The Pattern That Fails
A recurring failure in medical AI follows a predictable arc. An engineering team trains a deep learning model on a public dataset, achieves impressive accuracy metrics on a held-out test set, publishes a paper, and builds a product around the model. The product launches. Surgeons try it. Adoption stalls.
The model works on the benchmark but not in the clinic. The preprocessing pipeline does not handle the variety of radiographic techniques used across different hospitals. The output format does not match the clinical workflow. The measurement precision is insufficient for surgical planning. The model performs well on the dataset's distribution but poorly on the images that actually arrive from clinical practice.
This failure mode is not about engineering quality — it is about building in isolation from the clinical context where the product must eventually function.
What Surgeons Know That Engineers Do Not
An orthopaedic surgeon evaluating a knee radiograph does not simply classify the image into a category. They assess the image within a clinical context that includes patient symptoms, physical examination findings, prior imaging, comorbidities, activity level, and treatment goals.
This context shapes what "useful AI output" means. A Kellgren-Lawrence (KL) grade without associated quantitative measurements (joint-space width (JSW), alignment angles) is insufficient for surgical planning. A classification without a confidence interval does not help with borderline cases. A model that works on anteroposterior (AP) views but fails on lateral views misses half the diagnostic information a surgeon uses.
Engineers building without clinical input will optimise for the wrong metric (overall accuracy instead of per-class recall), present results in the wrong format (probability distributions instead of clinical categories with measurements), and miss critical edge cases (post-surgical hardware, bilateral studies, paediatric anatomy).
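The accuracy-versus-recall point is easy to see with numbers. The sketch below uses a hypothetical, deliberately imbalanced KL-grade test set (grade counts are illustrative, not real data): a model that never predicts the rare severe grades still posts 95% overall accuracy, while per-class recall exposes the clinically critical failure.

```python
def overall_accuracy(y_true, y_pred):
    # fraction of all cases classified correctly, regardless of class
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def per_class_recall(y_true, y_pred):
    # for each KL grade: correctly identified cases / true cases of that grade
    recall = {}
    for cls in set(y_true):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        recall[cls] = sum(y_pred[i] == cls for i in idx) / len(idx)
    return recall

# Hypothetical test set: severe OA (grades 3-4) is rare but matters most
y_true = [0] * 50 + [1] * 30 + [2] * 15 + [3] * 4 + [4] * 1
y_pred = [0] * 50 + [1] * 30 + [2] * 15 + [0] * 4 + [0] * 1  # never predicts 3 or 4

print(overall_accuracy(y_true, y_pred))   # 0.95 — looks excellent
print(per_class_recall(y_true, y_pred))   # grades 3 and 4 at 0.0 — clinically useless
```

A surgeon reviewing the per-class numbers would reject this model immediately; the headline accuracy alone would not have revealed why.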
What Engineers Know That Surgeons Do Not
The collaboration must flow in both directions. Surgeons typically underestimate the data requirements for robust model performance and overestimate the current capabilities of AI for complex clinical reasoning.
A surgeon might request "an AI that can plan my osteotomy" — a task that requires 3D reconstruction, anatomical landmark detection, biomechanical simulation, and surgical path planning. This is feasible but represents years of development, not months. An engineer can help decompose this into achievable milestones: first automated alignment measurement, then correction angle calculation, then guide design integration.
Engineers also understand the data pipeline constraints that surgeons may not consider. Training a reliable model requires not just images but consistently annotated images — and inter-observer variability in KL grading means that "ground truth" labels are themselves noisy. Managing this label noise through consensus grading, weighted loss functions, or ordinal regression is an engineering challenge that directly affects clinical utility.
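One way the ordinal-regression and consensus-grading ideas combine is to encode each KL grade as a vector of cumulative binary thresholds, then average the encodings across graders to get a soft target that preserves disagreement. This is a minimal sketch of that encoding, not our production training code; the grader votes are illustrative.

```python
def ordinal_encode(grade, num_grades=5):
    # KL grade k -> num_grades-1 binary thresholds: target[i] = 1 if grade > i.
    # This respects the ordering 0 < 1 < ... < 4, unlike one-hot encoding.
    return [1.0 if grade > i else 0.0 for i in range(num_grades - 1)]

def ordinal_decode(probs, threshold=0.5):
    # predicted grade = number of thresholds the model believes are exceeded
    return sum(p > threshold for p in probs)

# Three graders disagree on a borderline case (KL 2, 2, 3);
# averaging the encodings keeps that uncertainty in the training target
graders = [2, 2, 3]
soft_target = [sum(v) / len(graders)
               for v in zip(*(ordinal_encode(g) for g in graders))]
print(soft_target)                             # [1.0, 1.0, 0.333..., 0.0]
print(ordinal_decode([0.9, 0.8, 0.3, 0.05]))   # decodes to grade 2
```

Training against the soft target with a per-threshold binary loss penalises a grade-0-vs-grade-2 error more than a grade-2-vs-grade-3 error, which matches how graders actually disagree.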
How Collaboration Changed Our Product
At Salnus, every major product decision has been shaped by direct surgeon input. Three examples illustrate the impact:
The DICOM viewer was initially designed with a radiology-centric layout (single large viewport, reading list sidebar). Surgeon feedback redirected the design toward a surgical planning layout (multiplanar 2x2 grid, measurement tools prominent, AI results integrated into the viewing panel rather than a separate report).
The AI model's output format was changed from a single predicted class to a structured clinical summary: KL grade, confidence score, JSW measurement, and GradCAM visualisation — because surgeons told us that a grade without supporting evidence was not clinically actionable.
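A structured summary like this maps naturally onto a typed record rather than a bare class label. The sketch below is a simplified illustration of the idea, with hypothetical field names and values; it is not the actual Salnus data model.

```python
from dataclasses import dataclass

@dataclass
class KneeOAFindings:
    kl_grade: int        # Kellgren-Lawrence grade, 0-4
    confidence: float    # model confidence for the predicted grade, 0-1
    jsw_mm: float        # minimum joint-space width in millimetres
    gradcam_path: str    # saliency overlay supporting the grade

    def summary(self) -> str:
        # one-line clinical summary pairing the grade with its evidence
        return (f"KL grade {self.kl_grade} "
                f"(confidence {self.confidence:.0%}), "
                f"min JSW {self.jsw_mm:.1f} mm")

finding = KneeOAFindings(kl_grade=3, confidence=0.87, jsw_mm=2.1,
                         gradcam_path="overlays/case_0042.png")
print(finding.summary())  # KL grade 3 (confidence 87%), min JSW 2.1 mm
```

The point is that every downstream consumer — viewer panel, PDF report, audit log — reads the same evidence-bearing record, not a lone class label.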
The PDF reporting system was redesigned three times based on surgeon feedback about what information belongs in a clinical report, how findings should be structured, and what language is appropriate for patient communication versus referral documentation.
None of these changes would have emerged from an engineering team working in isolation.
How to Structure the Partnership
For organisations considering surgeon-engineer collaboration for medical AI, several structural elements improve outcomes.
Establish clear intellectual property agreements before development begins. The surgeon contributes clinical expertise, dataset access, and validation capacity. The engineering team contributes software development, model training, and platform infrastructure. Both contributions have value, and the arrangement should reflect that — whether through co-authorship, revenue-sharing, or licensing.
Invest in regular, structured communication. A monthly "show-and-tell" where the engineering team demonstrates current progress to the clinical team, and the clinical team presents real cases that test the current system's limits, is more valuable than sporadic email exchanges.
Plan for clinical validation from the beginning. A model that achieves 85% accuracy in development but has no clinical validation pathway is commercially worthless. The validation study design (patient population, comparison standard, outcome measures, sample size) should be defined before model development begins, not after.
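Sample size is one of the validation-design decisions that can be fixed up front. A common starting point is the normal-approximation formula for a proportion's confidence interval: the sketch below estimates how many positive cases are needed so a 95% CI around an expected sensitivity stays within a chosen margin. The 90%/±5-point figures are illustrative, and a real study would also account for prevalence, specificity, and dropout.

```python
import math

def sample_size_for_proportion(p, margin, z=1.96):
    # n such that a 95% CI around expected proportion p (e.g. sensitivity)
    # has half-width <= margin, via the normal approximation n = z^2 p(1-p)/m^2
    n = (z ** 2) * p * (1 - p) / margin ** 2
    return math.ceil(n)

# Expect ~90% sensitivity and want the CI within +/- 5 percentage points
print(sample_size_for_proportion(0.90, 0.05))  # 139 positive cases
```

Running this conversation with the clinical team before training starts tells you whether the required cohort is even obtainable — which is exactly the kind of constraint that dooms a model built first and validated as an afterthought.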
The Invitation
Salnus is actively seeking orthopaedic surgeon partners for the next phase of our AI development — expanding our OA screening model to hip OA, ACL injury assessment, and fracture detection. If you are a surgeon with access to clinical data and an interest in AI-assisted clinical tools, we would like to discuss collaboration. Visit our Surgeon Portal to see the current platform, or contact us directly.
Salnus Medikal Yazılım ve Cihaz Teknolojileri San. Tic. A.Ş. — Bridging biomedical engineering and orthopaedic surgery through AI.
Reviewed by the Salnus biomedical engineering team.