When Radiologists Stopped Worrying About What They Might Have Missed
A diagnostic network was processing 8,000 scans a day with a radiologist workforce stretched too thin. We built an AI platform that reads every scan in under 5 seconds and triages urgency—giving radiologists a reliable second pair of eyes so they always know where to look first.
Business Context & Telemetry
Our client operated 14 diagnostic imaging centres processing 8,000 daily studies across X-ray, CT, and MRI. Post-pandemic volume had surged, but their radiologist headcount of 38 had not. The team was working under sustained pressure, dealing with heavy backlogs, and facing the acute professional risk that comes with reading complex scans in an exhausted state.
Multi-site diagnostic imaging network
38 radiologists supported by 120 radiography and administrative staff
Four-city network across two Indian states with high-throughput volumes
Radiologist Workstation Integration, Clinical Dashboard, PACS Integration, Referring Physician Portal
2009
“I read 120 studies yesterday. Ninety were routine chest X-rays that didn't need my level of attention. But I couldn't know which 90 until I'd looked at all 120. That's the problem. Not the volume. The inability to know where the volume actually matters.”
Lead Radiologist & Clinical Director
Eight thousand scans a day. Thirty-eight radiologists. And no way to know which studies needed urgent attention.
Radiology is a field where the consequences of a missed finding are often irreversible. The network's radiologists weren't underperforming; they were operating in conditions that made optimal performance structurally difficult. Volume was the mechanism. Time pressure was the accelerant.
A blind, time-ordered worklist
Radiologists began each shift with a queue ordered by arrival time. A routine annual X-ray sat next to a CT chest for suspected haemoptysis. Urgent scans only received attention when the radiologist finally reached them in sequence.
Subtle findings missed under fatigue
Reader fatigue is a well-documented predictor of diagnostic error. A radiologist reading their 80th study of the day, rushing to clear an evening backlog, is operating at a meaningfully lower detection accuracy than on their 20th.
Manual critical communications
When a severe finding like a pulmonary embolism was caught, radiologists had to step away to manually call the referring clinician. In a high-volume setting, this took 15 minutes per urgent case, compounding worklist delays.
Fragmented prior-study comparisons
Comparing a current scan to a historical one required manually hunting across four different legacy PACS systems. It took minutes per case, leading to comparisons being done from memory when time pressure was highest.
Physicians waiting hours in the dark
Routine studies cleared quickly in the morning, but afternoon backlogs meant urgent evening scans could take 6–8 hours to report. Referring physicians frequently interrupted the reading room with phone calls just to check status.
They had tried outsourcing overflow to a teleradiology service, which helped with volume but fractured quality consistency. They also trialled a commercial AI screening tool for X-rays. Its output was bluntly binary ('normal/abnormal'), it produced too many false positives, and it lacked the nuance required for clinical trust. It was abandoned after four months.
"The Clinical Director had watched colleagues burn out. He wasn't looking for AI to replace his radiologists. He wanted a tool that would let them do what they trained for—thinking carefully about complex cases—rather than spending their cognitive capacity processing normal studies."
We started by reading scans alongside the radiologists, because the AI had to fit how they actually worked.
Before discussing model architectures, we embedded with the radiology team. We mapped out shift handovers, clinical decision workflows, and the informal practices they used to survive the volume.
Discovery & Methods
We spent 12 days sitting at reading workstations across four centres, interviewing all 38 radiologists. We asked each to describe their last difficult shift and the last time they wished they'd had more context. We also analysed 6 months of audit data. The conclusion was unanimous: the problem wasn't radiologist skill. It was the absence of a triage layer before they opened the scan.
The radiologist's problem wasn't reading scans. It was knowing which scans needed careful reading.
Radiologists could easily read 120 normal X-rays a shift. The burden was having to read them all at maximum attention just to find the one abnormality. The AI's primary job was triage: giving a reliable signal so the clinician knew exactly how to pace their attention.
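To make that concrete, here is a minimal sketch of what the triage layer changes: the worklist is ordered by AI-assigned severity tier first and arrival order second, instead of arrival order alone. The field names and example studies are hypothetical, not the production data model.

```python
# Minimal sketch: order the worklist by AI severity tier, then arrival order.
# Field names and example entries are hypothetical.
SEVERITY_RANK = {"critical": 0, "urgent": 1, "routine": 2, "unreviewed": 3}

def order_worklist(studies: list[dict]) -> list[dict]:
    return sorted(
        studies,
        key=lambda s: (SEVERITY_RANK[s["ai_severity"]], s["arrival_index"]),
    )

worklist = order_worklist([
    {"study": "routine annual chest X-ray", "ai_severity": "routine", "arrival_index": 1},
    {"study": "CT chest, suspected haemoptysis", "ai_severity": "critical", "arrival_index": 2},
])
# The CT chest now sits at the top of the queue despite arriving later.
```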
Design Philosophy
The AI assists; it does not replace. A computer vision model cannot replicate clinical context or interpretive expertise. We designed for augmentation, ensuring the AI stayed in its lane so radiologists could confidently stay in theirs.
Constraints Respected
- No rip-and-replace: the AI had to act as a DICOM node integrating silently with 4 legacy PACS systems (a minimal listener sketch follows this list).
- Zero app-switching: Findings had to surface directly inside the existing reading workstation.
- Regulatory compliance: Full adherence to Indian medical device regulations and IEC 62304 standards.
- Local validation: Models had to be validated against the network's specific patient demographics and equipment, not just published benchmark datasets.
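As a rough illustration of the first constraint, the sketch below shows how a receive-only DICOM node might be stood up with the open-source pynetdicom library, so studies pushed by the existing PACS flow into the inference queue without any change on the PACS side. The handler, save path, and downstream queue hook are assumptions for illustration, not the platform's actual code.

```python
# Minimal sketch of a DICOM Storage SCP (receive node), assuming pynetdicom.
# The save location and the downstream inference hook are illustrative.
from pynetdicom import AE, evt, AllStoragePresentationContexts

def handle_store(event):
    """Accept a study pushed by a legacy PACS and stage it for AI inference."""
    ds = event.dataset
    ds.file_meta = event.file_meta              # re-attach file meta so the file can be written
    ds.save_as(f"{ds.SOPInstanceUID}.dcm", write_like_original=False)
    # enqueue_for_inference(ds.SOPInstanceUID)  # hypothetical downstream hook
    return 0x0000                               # DICOM "success" status

ae = AE(ae_title="AI_TRIAGE_NODE")
ae.supported_contexts = AllStoragePresentationContexts
handlers = [(evt.EVT_C_STORE, handle_store)]

# Listen on the conventional DICOM port; the PACS is configured to auto-forward studies here.
ae.start_server(("0.0.0.0", 11112), block=True, evt_handlers=handlers)
```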
An AI that reads every scan the moment it arrives — so radiologists know exactly where to start.
Five interconnected capabilities built straight into the existing PACS environment, designed to triage urgency without disrupting the workflow radiologists already trusted.
Multi-Modality Detection Engine
Analyses every X-ray, CT, and MRI within 5 seconds of arrival. It identifies 28 finding categories, assigning each a confidence score and severity classification, displayed right on the worklist thumbnail.
Radiologists get a preliminary map of areas warranting attention before they even open the study. For normal studies, it speeds reporting. For abnormal ones, they arrive pre-alerted to the complexity.
Ensemble of CNNs trained on 100K+ local studies. Inference runs on co-located GPU infrastructure to hit the 5-second latency requirement. Outputs in standard DICOM SR format. (A rough sketch of the triage scoring follows the stack list below.)
Core deep learning framework and medical imaging tools
Co-located inference infrastructure for sub-5-second latency
Native protocol compatibility with all legacy PACS systems
AI inference API and clinical alert orchestration layer
Standardized clinical data exchange for EMR integration
Audit logs, triage state management, and real-time queuing
Scalable compute and HIPAA-compliant DICOM archive storage
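For orientation, here is a minimal sketch of the reduction step described above: per-model probabilities for the 28 finding categories are averaged, and each retained finding receives a confidence score and a severity tier. The finding labels, thresholds, and tier mapping are placeholders, not the clinical configuration.

```python
# Minimal sketch of ensemble reduction to per-finding confidence and severity.
# Labels, thresholds, and tier membership are illustrative placeholders.
from dataclasses import dataclass
import numpy as np

FINDING_LABELS = [f"finding_{i:02d}" for i in range(28)]   # 28 finding categories
CRITICAL_FINDINGS = {"finding_03"}                          # e.g. suspected pulmonary embolism
URGENT_FINDINGS = {"finding_07"}                            # e.g. large pleural effusion

@dataclass
class TriageFinding:
    label: str
    confidence: float
    severity: str        # "critical" | "urgent" | "routine"

def triage(member_probs: np.ndarray, threshold: float = 0.5) -> list[TriageFinding]:
    """member_probs: array of shape (n_ensemble_models, 28) for a single study."""
    mean_probs = member_probs.mean(axis=0)                  # simple ensemble average
    findings = []
    for label, p in zip(FINDING_LABELS, mean_probs):
        if p < threshold:
            continue
        if label in CRITICAL_FINDINGS:
            severity = "critical"
        elif label in URGENT_FINDINGS:
            severity = "urgent"
        else:
            severity = "routine"
        findings.append(TriageFinding(label, float(p), severity))
    return findings
```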
AI findings are an overlay, not a separate application.
“If radiologists have to click away to a new window, they won't do it under pressure. We injected severity indicators directly onto the existing worklist thumbnails.”
Visible, honest confidence scores.
“Radiologists who blindly trust AI are dangerous. When the model's confidence was low, the UI explicitly greyed out the indicator and prompted a full independent review. This honesty is what ultimately built trust in the high-confidence reads.”
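A minimal sketch of that display rule, with hypothetical thresholds: high-confidence findings are highlighted on the worklist, low-confidence ones are greyed out with an explicit prompt for a full independent review, and anything weaker is not surfaced at all.

```python
# Minimal sketch of the confidence-honest display rule; thresholds are assumed.
def worklist_indicator(confidence: float, high: float = 0.85, low: float = 0.60) -> dict:
    if confidence >= high:
        return {"state": "highlighted", "prompt": None}
    if confidence >= low:
        return {"state": "greyed_out",
                "prompt": "Low model confidence: perform a full independent review"}
    return {"state": "hidden", "prompt": None}   # too uncertain to surface at all
```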
Twenty-two weeks to launch. Validated by radiologists, not just benchmark datasets.
Medical AI requires a higher standard of proof than commercial software. We structured our validation framework so that the clinical team—not just our data scientists—were the ultimate arbiters of accuracy.
Delivery Timeline
Operational Log
Clinical Discovery & Dataset Prep
Weeks 1–4: Workflow observation and technical audits. 100,000+ local studies were retrospectively labelled in collaboration with 8 senior radiologists acting as ground-truth annotators.
Model Training & Internal Validation
Weeks 5–10: Iterative model training per modality. Models were strictly withheld from the pilot unless internal validation sensitivity exceeded 94% (see the release-gate sketch after this timeline).
PACS Integration & UI Build
Weeks 11–14: DICOM node integration across all 4 legacy PACS. Triage interface and critical alert pathways built and tested using de-identified data.
Clinical Pilot
Weeks 15–19: Prospective pilot with six volunteer radiologists across three centres. Daily case reviews with the Clinical Director to recalibrate thresholds based on live feedback.
Network Rollout & Regulatory Clearances
Weeks 20–22: Phased launch across all 14 centres. Medical device registration documentation completed alongside the launch of the referring physician portal.
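As a minimal illustration of the release gate mentioned in weeks 5–10, the sketch below checks whether a model clears the 94% internal-validation sensitivity bar before it is allowed into the pilot. The example counts are invented for illustration.

```python
# Minimal sketch of the 94% sensitivity release gate; example counts are invented.
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Share of genuinely positive studies the model actually flagged."""
    return true_positives / (true_positives + false_negatives)

def cleared_for_pilot(tp: int, fn: int, gate: float = 0.94) -> bool:
    return sensitivity(tp, fn) > gate

assert cleared_for_pilot(tp=960, fn=40)        # 96.0% sensitivity: released to pilot
assert not cleared_for_pilot(tp=930, fn=70)    # 93.0% sensitivity: withheld for retraining
```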
Team Topology
Deployed Roster
Collaboration
Working Rhythm
The eight senior annotating radiologists weren't just labellers; they were design partners. Weekly review sessions over conflicting diagnoses revealed exactly where the model needed targeted training data. We didn't move forward until the Clinical Director approved the outputs.
Course Corrections
Diagnostic Log
Equipment bias. The network used different generations of CT and MRI scanners. Models trained on one manufacturer systematically underperformed on another.
We implemented scanner-stratified training, running a 3-week targeted labelling sprint on the underrepresented machines. We also baked the scanner type into the confidence scoring model, ensuring the AI flagged its own uncertainty on older machines.
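A minimal sketch of the second part of that fix, folding scanner identity into the confidence score: the scanner names and discount factors below are illustrative assumptions, not calibrated values.

```python
# Minimal sketch: discount confidence on scanner models where stratified
# validation showed weaker performance. Names and factors are illustrative.
SCANNER_CONFIDENCE_FACTOR = {
    "CT_VENDOR_A_GEN3": 1.00,    # well represented in training data
    "CT_VENDOR_B_GEN1": 0.85,    # older, under-represented machine
}

def adjusted_confidence(raw_confidence: float, scanner_model: str) -> float:
    factor = SCANNER_CONFIDENCE_FACTOR.get(scanner_model, 0.80)   # unknown scanner: be conservative
    return raw_confidence * factor
```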
Alert fatigue. During week one of the pilot, the system generated 9 false positive 'critical' alerts per day. Clinicians were annoyed.
We held an intensive review and realised the threshold for 'large pleural effusion' was too sensitive. We recalibrated those specific findings to an 'urgent' rather than 'critical' tier. Daily alert volume dropped to a clinically workable baseline without sacrificing detection accuracy.
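A minimal sketch of that recalibration: the alert tier is looked up per finding, and 'large pleural effusion' now routes to the urgent queue rather than triggering a critical page. The finding names and confidence floor are illustrative.

```python
# Minimal sketch of per-finding alert tiers after the week-one recalibration.
# Finding names and the confidence floor are illustrative, not clinical config.
ALERT_TIER = {
    "pulmonary_embolism": "critical",
    "large_pleural_effusion": "urgent",    # downgraded from 'critical' after review
}

def alert_tier(finding: str, confidence: float, min_confidence: float = 0.90) -> str:
    if confidence < min_confidence:
        return "routine"                    # never page a clinician on a weak read
    return ALERT_TIER.get(finding, "routine")
```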
Automation bias. Three pilot radiologists began heavily relying on the AI overlay before forming their own independent impressions.
We immediately changed the UX. The AI overlay was moved behind an opt-in toggle, requiring a deliberate click to display. This simple friction restored the radiologist's independent interpretation as the primary act, while keeping the AI as a powerful secondary check.
Twelve months later: higher accuracy, zero backlogs, and a team finally finishing shifts on time.
The metrics were outstanding, but the real victory was cultural. The platform fundamentally changed the relationship between the radiologists and their overwhelming volume, massively reducing burnout.
95%
AI detection sensitivity
consistently validated across 100K+ post-deployment clinical studies
70%
for routine screening studies
3×
reduction in time from scan completion to radiologist notification
Qualitative Objectives Reached
- Radiologist retention hit 100% in the year following deployment, halting a prior trend of resignations directly tied to workload burnout.
- Referring physician satisfaction with turnaround times jumped from 41% to 82%, largely driven by the transparency of the new portal.
- During the pilot, the AI flagged a subtle 4mm pulmonary nodule that was missed on the independent read. A follow-up confirmed stage IA lung cancer. This case became the network's internal proof that the system saved lives.
"I've been a radiologist for 19 years. This is the first technology change I've experienced that made my job better at the things that matter to me — catching things I might have missed, spending my attention where it's actually needed. The AI does things I can't do at volume and at speed. And it leaves the nuance and clinical synthesis to me. It took 19 years to see it, but that's the exact right division of labour."
Lead Radiologist & Clinical Director
Diagnostic Imaging Network Client
Insights Gained
Valuable lessons and strategic insights uncovered through this project that inform our future work and architectural decisions.
In medical AI, the radiologist's trust is the actual product.
A highly accurate model is useless if clinicians won't look at it. Trust is earned through transparency—visible confidence scores, acknowledged limitations, and interfaces that respect their independent judgment.
Automation bias is a design problem.
When doctors blindly trust AI, it's often because the UI made the AI the most salient thing on the screen. Changing an overlay to an 'opt-in toggle' proved that clinical safety is fundamentally tied to UX design.
Benchmark datasets mean nothing in the real world.
Published literature doesn't account for a clinic's specific mix of aging equipment and unique demographics. Real-world, local validation on the client's actual data is the only metric that dictates clinical success.
Capabilities & Archive
Running a diagnostic service where scan volume is outpacing your clinical capacity — and you're aware that something is eventually going to be missed? That's the exact problem this platform was built to solve.
Services Leveraged
Every scan read under immense time pressure is a scan lacking full clinical attention.
We build diagnostic AI for imaging networks that have tried generic tools and been disappointed. The difference is in our clinical design, localized validation, and deep workflow integration. Tell us about your radiology volume, and we'll give you a straight read on what AI can actually do for your team.
"No benchmark accuracy claims. A real conversation about your radiology workflow."
