
GradCAM Explained: AI Visualisation in Medical Imaging

How GradCAM works, why it matters for clinical trust in medical AI, and what the heatmaps reveal when overlaid on a knee radiograph.

Salnus Orthopedic Solutions
GradCAM · Explainable AI · XAI · Deep Learning · Medical Imaging · Trust

The Black Box Problem

A deep learning model examines a knee radiograph and outputs "KL Grade 2, 78% confidence." The surgeon's immediate question is not about the probability — it is about the reasoning. What features in this image led the model to that conclusion? Is it looking at the joint space, the osteophytes, or an artifact in the corner of the film?

Without an answer to this question, clinical trust is impossible. A model that produces the right answer for the wrong reason is more dangerous than a model that produces no answer at all — because it will fail unpredictably when the wrong reason is no longer correlated with the right answer.

Gradient-weighted Class Activation Mapping (GradCAM), introduced by Selvaraju et al. in 2017, is the most widely used technique for answering this question.

How GradCAM Works

The intuition behind GradCAM is straightforward. A convolutional neural network processes an image through successive layers, each producing feature maps — internal representations that capture progressively more abstract visual patterns. Early layers detect edges and textures; deeper layers detect complex structures like joint margins, bone contours, and osteophyte shapes.

GradCAM asks: which regions of the final convolutional feature maps were most important for the model's classification decision? It answers this by computing the gradient of the predicted class score with respect to each feature map. Features that had a strong positive influence on the prediction receive high gradient values; features that were irrelevant receive low values.

These gradients are globally averaged over each feature map to produce a single importance weight per map. The feature maps are then combined as a weighted sum, and a ReLU is applied so that only regions with a positive influence on the predicted class survive. The result is a coarse heatmap at the resolution of the final convolutional layer, which is then upsampled to the input image dimensions and overlaid as a colour map.

Red (hot) regions indicate areas that strongly influenced the classification. Blue (cold) regions had minimal influence. The heatmap does not show what the model "sees" in a literal sense — it shows where the model focused its computational attention when making the classification decision.
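The weighting step described above can be sketched in a few lines of NumPy. This is a minimal illustration, not our production implementation: it assumes the final-layer feature maps and their gradients have already been extracted (in practice a framework hook, such as a PyTorch backward hook, supplies them), and the function name `gradcam_heatmap` is hypothetical.

```python
import numpy as np

def gradcam_heatmap(feature_maps, gradients):
    """Combine final-layer feature maps into a coarse GradCAM heatmap.

    feature_maps: (C, H, W) activations from the last convolutional layer.
    gradients:    (C, H, W) gradients of the class score w.r.t. those maps.
    Returns an (H, W) heatmap normalised to [0, 1].
    """
    # Global-average each gradient map: one importance weight per feature map.
    weights = gradients.mean(axis=(1, 2))              # shape (C,)
    # Weighted sum of the feature maps across the channel axis.
    cam = np.tensordot(weights, feature_maps, axes=1)  # shape (H, W)
    # ReLU: keep only regions with a positive influence on the class.
    cam = np.maximum(cam, 0)
    # Normalise for display as a colour overlay.
    if cam.max() > 0:
        cam /= cam.max()
    return cam
```

For a DenseNet-style backbone, `feature_maps` would typically be a stack of 7×7 maps, so the returned heatmap is equally coarse until it is upsampled to the radiograph's dimensions.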

What GradCAM Reveals in Knee OA Assessment

In our OA screening model, GradCAM heatmaps consistently highlight specific anatomical regions that align with known OA pathology:

For KL Grade 0 (Normal) images, the heatmap is typically diffuse with no strong focal activation — the model finds no specific region indicative of disease.

For KL Grade 2 (Mild OA), activation concentrates on the medial joint space and the margins of the medial femoral condyle and tibial plateau — precisely where early osteophyte formation and initial joint space narrowing occur.

For KL Grade 3–4 (Moderate to Severe), the heatmap strongly highlights the medial compartment joint space (where narrowing is most pronounced), the osteophyte margins, and subchondral sclerotic regions. In severe cases, the activation pattern often extends to the lateral compartment as well, reflecting the global joint involvement characteristic of advanced disease.

This anatomical alignment between GradCAM activation and known OA features provides confidence that the model is learning clinically meaningful patterns rather than exploiting incidental correlations (image borders, equipment markers, text overlays).

When GradCAM Raises Red Flags

GradCAM is equally valuable when it reveals problems. If the heatmap highlights a region outside the joint — the image border, a radiographic marker, or text annotation — it indicates the model may be relying on a spurious feature rather than genuine anatomy. This is a strong signal that the model's output should not be trusted for that particular image.

During model development, we use GradCAM systematically to audit model behaviour. Every misclassified image in our validation set is examined with GradCAM to determine whether the error reflects a genuine diagnostic difficulty (borderline cases where expert radiologists also disagree) or a model failure (the model focused on the wrong region entirely).

Clinical Value: Trust Through Transparency

In medical AI, accuracy alone is insufficient for adoption. A 2023 survey of orthopaedic surgeons found that the most frequently cited barrier to AI adoption was not accuracy concerns but lack of interpretability — surgeons wanted to understand why the model reached its conclusion before considering the output in their clinical workflow.

GradCAM directly addresses this barrier. By overlaying the attention heatmap on the original radiograph within the DICOM viewer, the surgeon can evaluate the model's output in the same visual context used for manual assessment. The AI result becomes a second opinion with visible reasoning, not an opaque numerical output.

In the Salnus Surgeon Portal, the GradCAM heatmap is generated alongside every OA screening result and displayed as a toggleable overlay on the radiograph. The surgeon can switch the overlay on and off, adjust its opacity, and correlate the AI's attention with their own clinical impression.
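The overlay itself is simple image arithmetic. The sketch below, with the illustrative function name `overlay_heatmap`, shows one way to upsample a coarse heatmap and alpha-blend it onto a grayscale radiograph; it is not the Portal's actual rendering code, and real viewers typically use a perceptual colour map rather than the toy red-blue ramp used here.

```python
import numpy as np

def overlay_heatmap(image, cam, alpha=0.4):
    """Blend a coarse CAM onto a grayscale radiograph for display.

    image: (H, W) grayscale pixels in [0, 1].
    cam:   (h, w) normalised heatmap in [0, 1]; h and w must divide H and W.
    alpha: overlay opacity (the surgeon-adjustable setting).
    """
    # Nearest-neighbour upsample of the coarse heatmap to image size.
    scale_y = image.shape[0] // cam.shape[0]
    scale_x = image.shape[1] // cam.shape[1]
    cam_up = np.kron(cam, np.ones((scale_y, scale_x)))
    # Toy colour ramp: high activation -> red, low activation -> blue.
    rgb = np.stack([cam_up, np.zeros_like(cam_up), 1.0 - cam_up], axis=-1)
    # Alpha-blend so the underlying anatomy stays visible.
    base = np.repeat(image[..., None], 3, axis=-1)
    return (1 - alpha) * base + alpha * rgb
```

Setting `alpha=0` recovers the plain radiograph, which is effectively what toggling the overlay off does.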

Technical Limitations

GradCAM has known limitations that clinicians should understand. The heatmap resolution is limited by the final convolutional layer's spatial dimensions — for DenseNet-121 with 224×224 inputs, this is 7×7, which is upsampled to the input image size. Fine-grained localisation (e.g., "this specific osteophyte") is beyond GradCAM's resolution.

GradCAM is class-specific — it shows regions important for the predicted class, not regions important for discrimination between classes. The regions highlighted for KL-2 may overlap with those for KL-3, making it difficult to determine what distinguishes the two grades from the heatmap alone.

The heatmap is generated post-hoc and describes the model's behaviour, not its internal reasoning process. Two models with identical GradCAM outputs may be using different internal features to arrive at the same spatial attention pattern. GradCAM should be interpreted as "what" the model attends to, not "why" in a mechanistic sense.

The Path Forward

Explainable AI in medical imaging is evolving rapidly. Current research directions include concept-based explanations (the model explains its decision in terms of clinical concepts like "joint space narrowing" and "osteophyte size" rather than pixel heatmaps), counterfactual explanations (showing how the image would need to change for the model to give a different classification), and uncertainty quantification (not just what the model predicts, but how confident it is and where that confidence breaks down).

At Salnus, we view GradCAM as the minimum viable explanation — necessary for clinical trust but not the end goal. Our development roadmap includes integrating quantitative measurements (joint space width, alignment angles) alongside the classification output, providing the surgeon with both the what (KL grade) and the why (specific measurements that support the grade).


Disclaimer: This article is for educational purposes. AI-generated explanations should be evaluated by qualified healthcare professionals in clinical context.

References:

  • Selvaraju RR, et al. Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization. ICCV 2017.
  • Reyes M, et al. On the Interpretability of Artificial Intelligence in Radiology. Radiology: AI. 2020;2(3):e190043.

Reviewed by the Salnus biomedical engineering team.
