Introduction:
Delay in diagnosis and treatment is associated with poorer surgical and functional outcomes in patients with symptomatic metastatic epidural spinal cord compression (MESCC). Staging CT scans are performed routinely in cancer patients, and high grade MESCC is often underdiagnosed on these scans. We previously developed and validated a deep learning model (DLM) to automate the detection of high grade (Bilsky 2/Bilsky 3) MESCC. In this study, we aim to assess the utility of the DLM in detecting high grade MESCC and its potential to reduce diagnostic delays.
Methods:
This is a retrospective review of 140 patients who underwent surgical decompression and stabilization for MESCC between January 2015 and January 2022. All patients had high grade MESCC (Bilsky 2-3) between C7 and L2. Staging CT scans of the thorax, abdomen and pelvis performed up to 4 months before the diagnostic MRI were reviewed by a consultant musculoskeletal radiologist (JH) and a consultant spinal surgeon (JT) and classified into cases with and without high grade MESCC. A previously validated deep learning model (DLM) was then used to classify these scans. The classifications by JH, JT and the DLM were then compared with the original radiologist (OR) reports. Inter-rater agreement was assessed using kappa statistics. Potential decrease in diagnostic delay was calculated in days from the staging CT to the first MRI scan diagnosing high grade MESCC.
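The two calculations described above can be illustrated with a minimal sketch. It assumes that pairwise agreement was measured with Cohen's kappa (the abstract reports only "kappa") and that each reader's result is a binary label per scan. The function names, toy labels and dates are hypothetical and are not study data.

```python
from collections import Counter
from datetime import date


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two readers labelling the same set of scans."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of scans given the same label by both readers.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of each reader's marginal label frequencies.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both readers used one identical label throughout
    return (observed - expected) / (1 - expected)


def delay_days(ct_date, mri_date):
    """Days between the staging CT and the first diagnostic MRI."""
    return (mri_date - ct_date).days


# Hypothetical labels (1 = high grade MESCC present, 0 = absent).
reader_1 = [1, 1, 1, 1, 0, 0, 1, 1, 0, 1]
reader_2 = [1, 1, 1, 1, 0, 0, 1, 1, 1, 1]
kappa = cohen_kappa(reader_1, reader_2)

# Hypothetical dates for a single patient.
delay = delay_days(date(2020, 1, 1), date(2020, 1, 20))
```

The mean potential reduction in delay would then be the average of delay_days over all patients whose prior CT showed high grade MESCC. Confidence intervals and p-values for kappa are not computed in this sketch.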
Results:
Pre-operative CT scans were available for 95/140 (67.9%) patients. High grade MESCC was identified in 84/95 (88.4%) of the pre-operative CT scans by both JH and JT. High grade MESCC was reported by the OR in only 32/95 (33.7%) of pre-operative scans. There was almost perfect agreement between JH and JT (kappa=0.947, CI 0.893-1.000, p<0.001), JH and the DLM (kappa=0.891, CI 0.816-0.967, p<0.001) and JT and the DLM (kappa=0.891, CI 0.816-0.967, p<0.001). There was poor interobserver agreement between the OR and all other readers (kappa 0.021 to 0.125). The mean potential reduction in diagnostic delay was 19 days.
Discussion:
There was a high incidence of undiagnosed high grade MESCC in the OR reports. The DLM showed almost perfect agreement with both reviewers, and this is the first clinical study to demonstrate its potential for reducing diagnostic delays. Further prospective studies are needed to characterize its role in the early diagnosis and treatment of MESCC.