Browser-Based DICOM Processing for Orthopaedic AI: Why We Made the Call
We chose to run orthopaedic AI directly in the browser, with DICOM never leaving the session boundary. Here is the architecture, the tradeoffs, and the regulatory case for client-first inference in clinical pilots.
When we set out to put orthopaedic AI into clinical use, we ran into the same architectural fork that most medical imaging startups eventually face. The default path is the one almost everybody takes: upload DICOM studies to a cloud backend, run inference on a GPU server, return results. It works, it scales, and the tooling is well understood. We chose the other path. DICOM stays in the browser, inference runs in the surgeon's session, and the server never sees the pixel data.
This post is the technical and regulatory case for that decision. It is also, honestly, an attempt to write down the reasoning before the pilot starts generating feedback that will inevitably force us to revise parts of it.
The Server-Side Default
The standard architecture for clinical imaging AI follows a familiar pattern. A DICOM study is uploaded to an object store, indexed in a database, queued for inference, processed on a GPU, and the resulting segmentation or measurement is written back to the database. The frontend reads the result and displays it. This is how most PACS-integrated AI products work, and it is how most cloud-based orthopaedic planning tools operate today.
The advantages are real. Server-side inference centralises model versioning, makes A/B testing tractable, and removes the need to worry about client hardware variability. If a surgeon opens the platform on a low-power laptop, the cloud GPU still runs the model in the same amount of time. Audit logging is straightforward because every inference event passes through a controlled backend. From an MLOps perspective, the server-side default is the path of least resistance.
The disadvantages compound when the data being processed is medical imaging in a regulated jurisdiction. DICOM studies contain pixel data plus patient metadata, and the moment they cross the server boundary, the system inherits responsibility for that data under KVKK in Turkey and GDPR in the EU. This means an audit trail of who accessed what, when, and why. It means a documented data processing agreement with every institution. It means a clear answer to where the data lives, how long it is retained, who has access, and what happens when a patient exercises their right to erasure. None of these are insurmountable, but each adds regulatory surface area that has to be defended throughout the product's lifecycle.
For a clinical pilot, particularly one where the institutions involved have not yet signed long-form data sharing agreements, the server-side default creates an early-stage friction that we wanted to avoid.
The Browser-First Option
The alternative is to do the inference on the client. The DICOM study is loaded into the browser, the model runs in the browser via ONNX Runtime Web (WebGPU-accelerated where the browser supports it), the results are rendered locally, and the server receives only the derived artifacts the user explicitly chooses to upload, typically anonymised mesh files or measurement reports. The pixel data never leaves the device.
This is not a new idea in medical imaging. Cornerstone3D, OHIF, and several other open source projects have been making browser-based DICOM viewing production-grade for years. What is newer is that the model runtime in the browser has caught up to the point where clinically relevant inference is feasible without a server round-trip. ONNX Runtime Web with WebGPU acceleration can run modest segmentation networks at interactive speeds on a recent MacBook or Windows laptop with integrated graphics. For the orthopaedic CT and knee X-ray models that matter to us, this is enough.
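As a rough illustration of the runtime choice, here is a minimal sketch of picking a backend in preference order. It assumes ONNX Runtime Web's execution provider names `webgpu` and `wasm`; the `ClientCapabilities` shape and the `pickExecutionProviders` helper are hypothetical names invented for this example, not our production code.

```typescript
// Hypothetical capability summary. In a browser, hasWebGPU would
// typically come from a feature check such as `"gpu" in navigator`.
interface ClientCapabilities {
  hasWebGPU: boolean;
}

// Returns execution providers in preference order. "webgpu" and "wasm"
// are execution provider names used by ONNX Runtime Web; the fallback
// policy itself is an assumption made for this sketch.
function pickExecutionProviders(caps: ClientCapabilities): string[] {
  return caps.hasWebGPU ? ["webgpu", "wasm"] : ["wasm"];
}

// The resulting list would be passed as the `executionProviders`
// session option when the inference session is created.
const providers = pickExecutionProviders({ hasWebGPU: false });
console.log(providers); // WebAssembly-only fallback path
```

Keeping the decision in a pure function makes it easy to test the fallback without a GPU, and keeps the feature detection in one place.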
The browser-first architecture rearranges the privacy story in a structural way. If the pixel data never reaches our server, our server cannot leak it, cannot be subpoenaed for it, and cannot lose it in a breach. The data processing question becomes much simpler because we are not processing patient data in the regulatory sense. The surgeon is processing it, on their own device, in their own session. We are the tooling, not the processor.
Session Boundary: The KVKK and GDPR Mechanics
The legal framing matters because it shapes which compliance regime applies. Under both KVKK and GDPR, a controller is the entity that determines the purposes and means of processing personal data. A processor is the entity that processes data on behalf of a controller. The controller carries most of the regulatory burden, including breach notification, data subject rights, and impact assessments.
When a clinical AI vendor ingests DICOM data on a server, they typically take on the role of processor under a data processing agreement with the hospital. This is the standard model. The hospital is the controller, the vendor is the processor, and the DPA defines the boundary.
When the inference runs in the browser and the server never receives pixel data, the vendor is closer to a software supplier than a processor. The hospital remains the controller. The browser session is an extension of the hospital's own processing environment. The vendor provides the tool, and the tool runs entirely inside the controller's perimeter.
This is the architectural translation of "DICOM never leaves the session boundary." The session is the surgeon's browser tab. Within that boundary, the controller's existing data handling policies apply. Outside that boundary, we as the vendor see only what the surgeon has explicitly chosen to share, which in our case is anonymised geometry, not pixel data.
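One way to make that boundary checkable in code is a guard on everything that is about to be uploaded. The sketch below is an assumption about how such a guard could look, not a description of our implementation: the `OutboundArtifact` type, the artifact kind names, and `assertMayLeaveSession` are all hypothetical. The only external fact it relies on is the DICOM Part 10 file layout, which puts the ASCII prefix "DICM" after a 128-byte preamble.

```typescript
// Kinds of derived artifacts this sketch allows across the boundary.
// The names are hypothetical stand-ins for anonymised geometry and
// measurement reports.
type ArtifactKind = "mesh" | "measurement-report";

interface OutboundArtifact {
  kind: ArtifactKind;
  bytes: Uint8Array;
}

// A DICOM Part 10 file has a 128-byte preamble followed by the
// four ASCII bytes "DICM".
function looksLikeDicomFile(bytes: Uint8Array): boolean {
  if (bytes.length < 132) return false;
  return (
    bytes[128] === 0x44 && // D
    bytes[129] === 0x49 && // I
    bytes[130] === 0x43 && // C
    bytes[131] === 0x4d // M
  );
}

const ALLOWED_KINDS: ReadonlySet<string> = new Set([
  "mesh",
  "measurement-report",
]);

// Throws instead of uploading when the payload is not an allowed
// derived artifact, or when it carries a DICOM file signature.
function assertMayLeaveSession(artifact: OutboundArtifact): void {
  if (!ALLOWED_KINDS.has(artifact.kind)) {
    throw new Error(`Artifact kind not allowed: ${artifact.kind}`);
  }
  if (looksLikeDicomFile(artifact.bytes)) {
    throw new Error("Refusing to upload: payload looks like a DICOM file");
  }
}
```

A signature check like this only catches whole DICOM files. It would not detect raw pixel buffers or metadata copied into another format, so it complements an allowlist of artifact types and does not replace one.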
We are not naïve about this. Browser-based does not mean the regulatory questions disappear. It does mean they take a different shape, one that is materially easier to defend for a product that is running a clinical pilot before it has been CE-marked under the MDR.
The Five Tradeoffs We Accepted
The browser-first architecture is not free. There are real costs, and they shaped which clinical use cases we prioritised first.
1. Client hardware floor. Inference performance varies meaningfully with the surgeon's device. On a MacBook Pro M-series or a Windows laptop with a discrete GPU, our segmentation models run at acceptable speed. On a budget Chromebook, they will not. We set a minimum hardware spec and accept that the addressable market within our pilot is constrained by it.
2. Model size budget. A 200 MB model is a non-starter for browser delivery. We work within a roughly 30-60 MB total weight budget per inference task, which forces architectural choices, model distillation, and quantisation. This is a real constraint that shapes which models we can ship and which we cannot.
3. First-load latency. Models have to be fetched, cached, and warmed up. The first inference is always slower than subsequent ones. We mitigate with aggressive caching and lazy loading, but the cold-start cost is structural.
4. Inference observability is harder. When inference runs on a server, we can log every input, every output, every model version, every latency measurement. When it runs in the browser, we get only what the client chooses to send back, and we have to be careful about what we ask for so we do not accidentally reintroduce the privacy questions we designed around. We accept reduced observability and rely on the surgeon to report anomalies.
5. Browser API surface stability. WebGPU is still maturing. ONNX Runtime Web is updated frequently. We have to track upstream changes and validate that our inference pipeline keeps producing identical outputs across browser versions. This is real ongoing work.
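The validation work in item 5 can be pictured as a regression check that compares a fresh model output against a stored reference. The following is a minimal sketch under our own assumptions: `compareOutputs` and `DriftReport` are hypothetical names, and the outputs are assumed to be flat `Float32Array` buffers. A tolerance of 0 demands numerically equal values; whether a small non-zero tolerance is acceptable for a given model is a separate clinical and engineering judgment that this sketch does not make.

```typescript
interface DriftReport {
  maxAbsDiff: number;
  withinTolerance: boolean;
}

// Compares a fresh model output against a stored reference output,
// element by element, and reports the largest absolute difference.
function compareOutputs(
  reference: Float32Array,
  candidate: Float32Array,
  tolerance: number
): DriftReport {
  if (reference.length !== candidate.length) {
    throw new Error("Output lengths differ");
  }
  let maxAbsDiff = 0;
  for (let i = 0; i < reference.length; i++) {
    const diff = Math.abs(reference[i] - candidate[i]);
    // NaN never compares greater than anything, so fail explicitly.
    if (Number.isNaN(diff)) {
      return { maxAbsDiff: NaN, withinTolerance: false };
    }
    if (diff > maxAbsDiff) maxAbsDiff = diff;
  }
  return { maxAbsDiff, withinTolerance: maxAbsDiff <= tolerance };
}
```

Running a check like this against a fixed set of reference inputs on each new browser or runtime release turns "keeps producing the same outputs" into something that can fail loudly.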
These costs are real, but they are predictable, and we judged them as worth paying for the structural privacy advantages.
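The arithmetic behind the model size budget in item 2 is worth spelling out. The sketch below estimates weight size from parameter count and precision. The 50-million-parameter network is a hypothetical example chosen because it lands on the 200 MB figure at 32-bit precision; it is not one of our models, and the estimate ignores graph metadata and quantisation overhead.

```typescript
const BYTES_PER_MB = 1e6; // decimal megabytes, an assumption of this sketch

// Bytes per parameter for common weight precisions.
const BYTES_PER_PARAM = { fp32: 4, fp16: 2, int8: 1 } as const;
type Precision = keyof typeof BYTES_PER_PARAM;

// Approximate weight size, ignoring graph metadata and the scale and
// zero-point values that quantised models also carry.
function estimateWeightMB(paramCount: number, precision: Precision): number {
  return (paramCount * BYTES_PER_PARAM[precision]) / BYTES_PER_MB;
}

// Upper end of the 30-60 MB range quoted in the list above.
const BUDGET_MB = 60;

function fitsBudget(paramCount: number, precision: Precision): boolean {
  return estimateWeightMB(paramCount, precision) <= BUDGET_MB;
}

// Hypothetical 50-million-parameter network.
const paramCount = 50e6;
console.log(estimateWeightMB(paramCount, "fp32")); // 200
console.log(estimateWeightMB(paramCount, "int8")); // 50
```

The same network that is a non-starter at 32-bit precision fits the budget at 8 bits, which is why quantisation and distillation are design inputs and not afterthoughts.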
Where We Landed: Hybrid C-Lite
In practice, no clinical AI system is purely client-side. We rely on the server for authentication, account management, model delivery, mesh storage for STL artifacts the surgeon explicitly chooses to save, and audit logging of non-pixel events. What never crosses the server boundary is the DICOM pixel data itself.
We call this internally the Hybrid C-Lite architecture. The C is for client, and the Lite indicates that the server retains a minimal role focused on identity, model artifacts, and the derived geometry the surgeon explicitly persists. The pixel data stays in the session. Everything else has its own data handling story, documented separately, with per-user encrypted storage for the geometry layer.
The architecture decision was finalised in late May 2026 after we surfaced an isolation issue in an earlier hybrid pattern that briefly leaked data between user contexts in IndexedDB. The lesson from that incident shaped the current design. Each user's session has a derived encryption key, every persisted artifact is encrypted with that key, and there is no shared storage namespace across users on the client side.
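The namespacing half of that design can be sketched in a few lines. This is an illustration under stated assumptions, not our storage layer: `UserScopedStore` is a hypothetical class, an in-memory `Map` stands in for IndexedDB, and encryption with the per-user derived key is deliberately left out so the isolation property is easy to see.

```typescript
// In-memory stand-in for a client-side store such as IndexedDB.
// Encryption is omitted; this sketch shows only the namespacing.
class UserScopedStore {
  private readonly namespaces = new Map<string, Map<string, Uint8Array>>();

  // Each user gets a separate inner map, so there is no key space
  // shared between users.
  private namespaceFor(userId: string): Map<string, Uint8Array> {
    let ns = this.namespaces.get(userId);
    if (!ns) {
      ns = new Map<string, Uint8Array>();
      this.namespaces.set(userId, ns);
    }
    return ns;
  }

  put(userId: string, artifactId: string, data: Uint8Array): void {
    this.namespaceFor(userId).set(artifactId, data);
  }

  // A lookup under a different user id never reaches another
  // user's map, even when the artifact id is the same.
  get(userId: string, artifactId: string): Uint8Array | undefined {
    return this.namespaces.get(userId)?.get(artifactId);
  }

  // Dropping the whole namespace removes everything stored for
  // that user in one step.
  clearUser(userId: string): void {
    this.namespaces.delete(userId);
  }
}
```

In a browser, the equivalent choice would be something like one database or object store per user instead of a shared store with a user id column, so that a bug in a query cannot return another user's records.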
What the Pilot Will Validate
A clinical pilot is the only way to find out whether the architectural choices we have made survive contact with how surgeons actually use the tool. The pilot will tell us whether the browser-based inference path is acceptable in operative-day workflows, whether the cold-start latency is tolerable when a surgeon is making preoperative decisions under time pressure, and whether the absence of server-side logs for individual inference events is acceptable to the institutions that need to satisfy their own internal compliance teams.
We are opening the clinical pilot waitlist now. The first wave is invite-only and we are looking for orthopaedic surgeons, radiologists, and institutions willing to run the platform in real preoperative planning, give us frank feedback about where the architecture serves them and where it breaks, and help us shape the second wave.
If you are reading this and any of it resonates with how you think about clinical AI deployment, the waitlist is open at salnus.com/waitlist.
Closing Note
We made the browser-first call because we wanted the privacy story to be a structural property of the system rather than a layer of policy on top of a standard cloud architecture. We accepted the costs that come with that. The clinical pilot is where we find out whether the surgeons we built this for agree that the tradeoffs are the right ones. The architecture is documented. The compliance posture is defensible. The pilot is the next step.
Reviewed by the Salnus biomedical engineering team.