Computer Vision · April 12, 2026 · 6 min read

How AI Can Read a Chest X-Ray and Detect Pneumonia

Doctors in many parts of the world do not have enough radiologists to read every X-ray. This model reads one in under 200ms with 86% test accuracy. Here is how transfer learning makes that possible.

Pneumonia kills more than 700,000 children under five every year, making it the leading infectious cause of death in young children. The scary part? Most of those deaths are preventable if the infection is caught early. The problem in many developing regions is simple: not enough radiologists to read the X-rays fast enough. A patient might wait days for a diagnosis that could be given in seconds.


What is Pneumonia on an X-Ray?

A healthy lung looks dark and clear on an X-ray, because air absorbs almost none of the radiation. Pneumonia fills parts of the lung with fluid, which shows up as bright white patches called 'consolidation'. A trained eye spots it instantly. The challenge: training an AI to spot it just as reliably.

How a Computer Sees an X-Ray

A chest X-ray is just a grid of pixels: numbers from 0 (black) to 255 (white). A 256×256 image is 65,536 numbers. The AI's job is to look at all those numbers and decide: are the patterns here consistent with pneumonia, or not? Raw numbers are hard to reason about directly, so Convolutional Neural Networks (CNNs) learn to detect progressively more complex patterns.
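To make the pixel-grid idea concrete, here is a minimal sketch using a synthetic array as a stand-in for a real scan:

```python
import numpy as np

# A chest X-ray is just a 2-D grid of pixel intensities:
# 0 = black (air), 255 = white (dense tissue or fluid).
# A random 256x256 array stands in for a real scan here.
xray = np.random.default_rng(0).integers(0, 256, size=(256, 256), dtype=np.uint8)

print(xray.size)  # 65536 numbers for the model to reason about
```

Everything the model ever "sees" is this array of 65,536 numbers; the layers below exist to turn them into something meaningful.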

What the CNN Learns at Each Layer

Input X-Ray (256 x 256 pixels)
        |
        v
Layer 1 - Detects EDGES
         (where light meets dark, outlines of ribs)
        |
        v
Layer 2 - Detects SHAPES
         (curves, circles, the shape of lungs)
        |
        v
Layer 3 - Detects PATTERNS
         (the white haze of fluid, rib texture)
        |
        v
Final Layer - Classifies
         PNEUMONIA (94%) or NORMAL (6%)
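The hierarchy above can be sketched as a toy PyTorch model. This is a simplified stand-in to show the structure, not the actual network used in the project:

```python
import torch
import torch.nn as nn

# Each conv block sees a wider region of the X-ray than the last,
# so learned features grow from edges to shapes to larger patterns.
model = nn.Sequential(
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),    # edges
    nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # shapes
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # patterns
    nn.Flatten(),
    nn.Linear(32 * 32 * 32, 2),  # classify: PNEUMONIA vs NORMAL
)

x = torch.randn(1, 1, 256, 256)        # one grayscale 256x256 "X-ray"
probs = model(x).softmax(dim=1)        # two probabilities summing to 1
print(probs.shape)  # torch.Size([1, 2])
```

The final softmax is where numbers like "PNEUMONIA (94%) or NORMAL (6%)" come from.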

The Secret Weapon: Transfer Learning

Training a CNN from scratch would need millions of X-ray images. We only had 5,216. Transfer learning solves this beautifully. Imagine teaching a kid to recognise cats: once they know what a cat looks like, teaching them to recognise a leopard is much faster, because they already understand 'fur', 'eyes', and 'four legs'. We use a CNN called ResNet18 that was already trained on 1.2 million everyday photos. It already knows edges, textures, and shapes. We just teach its final layers the new task: healthy lung vs pneumonia.


Start with a Pre-Trained ResNet18

ResNet18 was pre-trained on ImageNet, a dataset of roughly 1.2 million labelled photos of dogs, cars, plants, and more. It already understands visual patterns at a deep level. We download these pre-trained weights and keep them mostly frozen.


Replace the Final Classification Layer

The original model classifies 1,000 different objects. We swap the last layer for a new one with just 2 outputs: PNEUMONIA and NORMAL. This is the only layer that needs heavy training.

๐Ÿ‹๏ธ

Fine-Tune on 5,216 X-Ray Images

We train for 10 epochs on the chest X-ray dataset, using data augmentation (flips, rotations, brightness shifts) to artificially multiply our training examples and prevent overfitting.


Deploy as a Flask API on Render

The trained model is wrapped in a Flask web server. Upload any chest X-ray image and it returns a prediction with a confidence score in under 200ms. Deployed live and publicly accessible.
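A minimal sketch of such a Flask endpoint. The `/predict` route name and response fields are illustrative, and the placeholder below stands in for running the fine-tuned ResNet18:

```python
import io

from flask import Flask, jsonify, request
from PIL import Image

app = Flask(__name__)

@app.route("/predict", methods=["POST"])  # route name is illustrative
def predict():
    # Decode the uploaded image; the real service would preprocess it
    # and pass it through the fine-tuned ResNet18 here.
    Image.open(io.BytesIO(request.files["file"].read())).convert("L")
    label, confidence = "PNEUMONIA", 0.94  # placeholder for model output
    return jsonify({"label": label, "confidence": confidence})
```

Keeping the model in memory between requests (loaded once at startup) is what makes sub-200ms responses feasible.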

86.22% test accuracy · +48pp above a baseline CNN (37.5%) · 200ms inference time · 4 days from build to deployment

This project is a proof of concept, not a clinical diagnostic tool โ€” but it demonstrates exactly how AI can assist (not replace) doctors in resource-constrained environments. The same transfer learning technique applies to any image classification problem: product defect detection, satellite image analysis, or document classification.

#Computer Vision · #PyTorch · #Transfer Learning · #Medical AI · #ResNet


Kumar Katariya

AI/ML Engineer · Top Rated Plus on Upwork · Kaggle Expert