A Study of the Feasibility of using slabbing to reduce Tomosynthesis
Review Time
Magnus Dustler1*, Martin Andersson1, Daniel Förnvik1, Pontus Timberg2, Anders Tingberg1
1Medical Radiation Physics, Department of Clinical Sciences Malmö, Faculty of Medicine, Lund
University, Skåne University Hospital, SE-205 02, Sweden
2 Diagnostic Radiology, Department of Clinical Sciences Malmö, Faculty of Medicine, Lund
University, Skåne University Hospital Malmö, SE-205 02, Sweden
ABSTRACT
This study aimed to investigate whether decreasing the number of slices in breast tomosynthesis (BT) image volumes
reduces reading time. BT slices were combined into so-called slabs by reconstructing thin slices and merging them into
thicker slabs. Sets of slabs were created from 35 clinical BT volumes with malignant or benign findings and from 50
BT volumes drawn from screening sets (without any prior review). The image sets were reviewed in two separate
sessions while the review time was recorded. A total of five experienced radiologists were employed for the image
review.
Additionally, a visual grading analysis (VGA) study was performed to compare slabbed images with the originals in order
to ensure that the image quality was not significantly degraded. One set of 27 pathological cases (13 masses and 14
microcalcification clusters) and one of 22 subtle lesions that had been missed on digital mammography but detected on BT
were presented to an experienced radiologist and two medical physicists, who rated the quality of the slabbed versions
relative to the originals. The study could find no significant degradation in image quality when using 2 mm slabs instead
of 1 mm slices. There was no significant decrease in reading time on clinical cases (P = .133), but on screening images
there was a significant decrease of 7.7 ± 9.6 s from an average level of 32.2 ± 14.5 s (P < .0001). This suggests that
increasing slab thickness can reduce the time radiologists spend studying normal images by 20%.
Keywords: Mammography, breast tomosynthesis, slab, thick slice, reading time
1. INTRODUCTION
Breast tomosynthesis (BT) is on the rise as a potentially viable alternative to mammography for breast cancer screening.
This study aims to reduce one of the major potential obstacles: the increased review time per case compared with
mammography [1-3]. Reducing the number of slices in the BT image volume by merging them into thicker slabs might be
effective in this regard, since the ability to review a case promptly is of critical importance, especially in a
screening modality.
This study is based in part on data collected during the spring of 2011 with the intention of possibly expanding it at a
later date. The data were later reanalyzed and added to during 2012, when more advanced methods of producing image
slabs became available.
2. METHOD(S)
2.1 Slice thickness
Increasing the thickness of BT slices can be done in two ways [4]. The most straightforward of these is to vary the
thickness during reconstruction, as the filtered back projection (FBP) algorithm used can handle arbitrary reconstruction
thicknesses, merely affecting the size of the reconstructed image voxels along the z-direction. To improve the quality of
the reconstructed volume and add to the 3D appearance by reducing the influence of out-of-plane artifacts, a slice
thickness filter tailored to a specific z-direction sampling interval (or slice thickness) is used. The application of the
slice thickness filter is intended to keep slice thickness constant, but it also reduces the sharpness of high-frequency
image components. As the Siemens MAMMOMAT Inspiration Tomo (Siemens AG, Erlangen, Germany) and its slice thickness
filter are optimized for a sampling interval of 1 mm in the z-direction, reconstructing thicker slices directly would
introduce an additional factor of uncertainty [5].
* Corresponding author: magnus.dustler@med.lu.se
+46 40 33 86 56
Medical Imaging 2013: Image Perception, Observer Performance, and Technology Assessment,
edited by Craig K. Abbey, Claudia R. Mello-Thoms, Proc. of SPIE Vol. 8673, 86731L
© 2013 SPIE · CCC code: 1605-7422/13/$18 · doi: 10.1117/12.2006987
2.2 Slabbing
The alternative to generating thicker slices is to combine several reconstructed slices into new, thicker ones, which are
called slabs (Figure 1). There are a number of possible approaches, most prominently setting each pixel in the slab to
either the average or the maximum value of the corresponding pixels in the constituent slices, known as AIP
(average-intensity projection) and MIP (maximum-intensity projection) slabs, respectively. These approaches do not
differ in the number or thickness of slabs they produce, but they affect the image information contained in the slices
in different ways. Averaging suppresses image noise, but also reduces the contrast of small features such as
microcalcifications, especially if such structures are included in only one slice. Selecting the MIP instead increases
the visibility of microcalcifications while also increasing the noise level. For this study, it was decided that the MIP
and AIP approaches would both be appropriate, with slabs created either by importing and slabbing BT volumes using Matlab
(Mathworks, Natick, MA) or by employing a Siemens prototype system with a built-in option of reconstruction based on
statistical artifact reduction and superresolution, which dispenses with the slice thickness filter and instead
reconstructs slices with a thickness of 1/6 mm and combines them into slabs of the desired thickness [6]. This method was
not available during the initial phase of the study.
Figure 1: This figure illustrates the slabbing procedure. Each pixel of a slab consists of the maximum values of corresponding pixels in
a number of thinner slices.
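The two slabbing rules described above can be illustrated with a minimal sketch. It is not the Matlab code or the Siemens prototype used in the study; the function name make_slabs, the (z, y, x) array layout and the choice of non-overlapping slabs are assumptions made purely for illustration.

```python
import numpy as np

def make_slabs(volume, slices_per_slab, mode="mip"):
    """Merge consecutive slices of a (z, y, x) volume into thicker slabs.

    mode="aip" averages the constituent slices, mode="mip" keeps their maximum.
    Slabs are non-overlapping, and trailing slices that do not fill a whole
    slab are dropped (both are assumptions of this sketch).
    """
    n_slabs = volume.shape[0] // slices_per_slab
    trimmed = volume[: n_slabs * slices_per_slab]
    # Group the slices as (n_slabs, slices_per_slab, y, x).
    grouped = trimmed.reshape(n_slabs, slices_per_slab, *volume.shape[1:])
    if mode == "aip":
        return grouped.mean(axis=1)
    if mode == "mip":
        return grouped.max(axis=1)
    raise ValueError(f"unknown mode: {mode!r}")

# Toy volume: four 1 mm slices of 2 x 2 pixels, with one bright pixel
# (standing in for a microcalcification) present in a single slice.
volume = np.zeros((4, 2, 2))
volume[1, 0, 0] = 100.0

aip = make_slabs(volume, 2, mode="aip")  # two 2 mm AIP slabs
mip = make_slabs(volume, 2, mode="mip")  # two 2 mm MIP slabs
print(aip[0, 0, 0])  # 50.0: averaging halves the contrast of the bright pixel
print(mip[0, 0, 0])  # 100.0: the maximum preserves it
```

With slices_per_slab=12 the same function would merge twelve 1/6 mm slices into one 2 mm slab, which matches the slice count described for the prototype reconstruction.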
In order not to complicate the study, it was decided to investigate only the reading time reduction obtained by halving
the number of slices in the image volumes. Halving the number of slices meant comparing the 1 mm slices with 2 mm slabs.
Two different methods were used to create the slabs. Firstly, the straightforward approach was used of building each
2 mm AIP slab from two consecutive 1 mm slices reconstructed with standard settings, including the slice thickness
filter. Secondly, the previously mentioned built-in option of reconstruction based on statistical artifact reduction was
used when it became available during 2012, combining 12 slices (of 1/6 mm each) into 2 mm slabs. These approaches are
broadly similar in that the AIP slabs they produce are somewhat noisier than 1 mm reconstructed slices but preserve (or
improve) the visibility of high-frequency components (such as microcalcifications).
2.3 Image Quality
To ensure that this procedure would not dramatically affect lesion detection, slabs of a set of 27 pathological
structures from clinical images (13 masses and 14 microcalcification clusters) were created. These were presented
side-by-side with the originals in a relative Visual Grading Analysis (VGA) [7,8]. One experienced radiologist and two
medical physicists were asked to rate the quality of the slabbed image relative to the original, using ImageJ software [9].
Later, 22 sets of images with subtle findings (defined as lesions that were detected on tomosynthesis but not on
mammography) were used to compare slabs of 2, 3 and 5 mm thickness with the 1 mm originals. These slabs were produced
with the now-available thin-slice superresolution reconstruction with statistical artifact reduction, which had the added
benefit that the images could easily be imported to standard mammography workstations, allowing their use for review. The
same radiologist and medical physicists ranked these images from best to worst. The criteria that reviewers were told to
use in their rating and ranking of images were (if applicable): mass visibility, microcalcification conspicuity, and
microcalcification continuity. Continuity was defined in this context as the degree to which structures could be followed
from one slice to the next.
2.4 Reading time
The reading time study was split into two parts. The first part, performed during 2011 using standard 2 mm slabs,
involved two experienced radiologists reviewing 35 clinical images (30 with minor proven benign findings and 5 with
proven malignant findings) in a simulated screening setting using ViewDEX [10,11], where the radiologists were told to
mark findings as normal while being timed with a chronometer. The enriched material was intended to investigate the
effect of slabbing on tomosynthesis cases with suspicious findings. The radiologists were presented with the cases in two
sessions separated by three weeks. Nineteen image stacks were randomly selected to be slabbed during the first session,
while the remaining 16 stacks were slabbed during the second session.
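The split of the 35 stacks between the two sessions can be sketched as below. This is only an illustration of a random 19/16 assignment; the function name assign_sessions, the case numbering and the reading that each stack was shown in its original form in the other session are assumptions, not details given in the study.

```python
import random

def assign_sessions(case_ids, n_first, seed=None):
    """Randomly pick n_first cases to be slabbed in session 1.

    Every other case is shown as original 1 mm slices in session 1 and
    slabbed in session 2 (assumed crossover reading of the design).
    """
    rng = random.Random(seed)
    slabbed_first = set(rng.sample(case_ids, n_first))
    session1 = {c: "slab" if c in slabbed_first else "original" for c in case_ids}
    session2 = {c: "original" if c in slabbed_first else "slab" for c in case_ids}
    return session1, session2

cases = list(range(1, 36))  # hypothetical identifiers for the 35 clinical stacks
session1, session2 = assign_sessions(cases, 19, seed=0)
print(sum(v == "slab" for v in session1.values()))  # 19
print(sum(v == "slab" for v in session2.values()))  # 16
```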
The second part of the study involved slabbing 50 sets of image stacks taken directly from a BT screening study. These
image stacks were purposely selected at random without any prior review, in order to represent the type of cases
commonly encountered in a screening setting, i.e. likely normal cases with no pathology. Four experienced radiologists
(one of whom had also taken part in the first, clinical part) reviewed these cases both with the standard 1 mm slices and
with 2 mm superresolution MIP slabs under normal screening conditions using a standard mammography workstation. The
reviewers reviewed each case until they felt confident in assigning it a BI-RADS score [12]. This was done in between two
and four sessions separated by between two days and three weeks, with the same image reviewed no more than once
during any session. The reading time of each individual case was recorded using a chronometer, with the radiologist
allowed to use all of the workstation's functions and told to review the mammograms as would normally be done for a
new screening patient with no prior history or images available.
3. RESULTS
3.1 Image quality
The results of both the relative VGA study and the image ranking study were investigated using a non-parametric
Friedman test for multiple readers. The VGA study showed no significant difference in image quality for the original
images and the slabbed images (Figure 2), while the ranking study (on the statistical artifact reduction AIP slabs) did not
show a significant difference between the original images and 2 mm images (Figure 3), but did show significant
differences for 3 and 5 mm slabs.
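As a rough illustration of the kind of test involved, the textbook Friedman statistic can be computed from untied rankings as sketched below. The rankings are hypothetical and are not the study data, and treating each reader-case ranking as one block is an assumption; how the multiple readers were actually combined is not described here.

```python
def friedman_statistic(rankings):
    """Friedman chi-square statistic for untied rankings.

    rankings: list of blocks, each a list of ranks 1..k for the k conditions.
    Returns the statistic and the mean rank of each condition.
    """
    n = len(rankings)
    k = len(rankings[0])
    rank_sums = [sum(block[j] for block in rankings) for j in range(k)]
    q = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    mean_ranks = [r / n for r in rank_sums]
    return q, mean_ranks

# Hypothetical rankings (1 = best) of [1 mm, 2 mm, 3 mm, 5 mm] for six blocks.
rankings = [
    [1, 2, 3, 4],
    [2, 1, 3, 4],
    [1, 2, 4, 3],
    [2, 1, 3, 4],
    [1, 2, 3, 4],
    [1, 2, 3, 4],
]

q, mean_ranks = friedman_statistic(rankings)
CHI2_CRIT_DF3 = 7.815  # chi-square critical value for 3 degrees of freedom, alpha = 0.05
print(q)                  # about 15.4 for this toy data
print(q > CHI2_CRIT_DF3)  # True: the conditions differ in this toy example
print(mean_ranks)
```

In this toy data the mean ranks of the 1 mm and 2 mm conditions are close while those of the 3 mm and 5 mm conditions are clearly worse, which mirrors the pattern reported for the ranking study. Identifying which conditions differ requires a post-hoc comparison of the mean ranks, and with so few blocks the chi-square approximation is only rough.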
3.2 Reading time
The first part of the reading time study, on clinical images, was inconclusive: one radiologist showed a significant
decrease in reading time from a mean of 69.6 s to 55.4 s (P = .039), while the other showed a very minor, statistically
non-significant decrease from 41.4 s to 40.4 s (P = .906). An ANOVA for multiple readers showed no significant difference
in review time (P = .133). The second part indicated a decrease in reading time on the unsorted screening material of
7.7 ± 9.6 s from an average level of 32.2 ± 14.5 s. A two-way ANOVA performed on the whole dataset showed that this
decrease was significant (P < .0001) and indicated that there were significant differences in reading time between the
four reviewing radiologists. Furthermore, all four reviewers individually showed statistically significant decreases in
reading time (P < .0001) using the paired Student's t-test. Table 1 summarizes the data for each reviewer.
Figure 2: Results of the relative VGA, using the Friedman test for multiple readers. There is no significant difference between group
1 (original images) and group 2 (2 mm MIP slabs), as the confidence intervals overlap.
              Reading time / s
              1 mm           2 mm           Difference
Reviewer 1    23.4 ± 6.6     14.1 ± 3.7     -9.4 ± 8.3
Reviewer 2    52.4 ± 12.6    41.2 ± 10.8    -11.2 ± 13.7
Reviewer 3    27.7 ± 8.1     22.4 ± 5.6     -5.2 ± 8.2
Reviewer 4    25.0 ± 4.7     20.2 ± 4.2     -4.8 ± 4.2
Total         32.2 ± 14.5    24.5 ± 12.1    -7.7 ± 9.6
Table 1: Summary of statistics from the reading time experiment on unsorted screening images.
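The per-reviewer differences in Table 1 can be related to the paired t-test with a short worked sketch. It assumes that the ± values in the Difference column are standard deviations of the paired differences and that each reviewer read all 50 screening cases in both conditions; neither assumption is stated explicitly in the table, so the resulting values are indicative only.

```python
import math

def paired_t_from_summary(mean_diff, sd_diff, n):
    """Paired t statistic from the mean and SD of n paired differences."""
    return mean_diff / (sd_diff / math.sqrt(n))

# (mean difference, SD of difference) in seconds, copied from Table 1.
table1_differences = {
    "Reviewer 1": (-9.4, 8.3),
    "Reviewer 2": (-11.2, 13.7),
    "Reviewer 3": (-5.2, 8.2),
    "Reviewer 4": (-4.8, 4.2),
}
N_CASES = 50  # assumed number of paired readings per reviewer

t_values = {
    name: paired_t_from_summary(mean, sd, N_CASES)
    for name, (mean, sd) in table1_differences.items()
}
for name, t in t_values.items():
    print(f"{name}: t = {t:.2f} (df = {N_CASES - 1})")
```

Under these assumptions every reviewer's t statistic has a magnitude above 4 with 49 degrees of freedom, which is in line with the small P-values reported for the individual reviewers.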
Figure 3: Comparison of the ranking of original (1 mm) slices and 2, 3 and 5 mm slabs using the Friedman test for multiple readers.
There is a significant difference between the 3 and 5 mm slabs and the 1 mm slices as there is no overlap between their confidence
intervals.
4. DISCUSSION
The VGA and ranking studies did not indicate a significant decrease in image quality for 2 mm slabs, which opens the
possibility of using thick slices to shorten reading time. Though the study would have to be considerably larger to show
conclusively that there is no difference, one can assume that such a difference would be minor and could likely be
compensated for by optimizing reconstruction parameters for the new slice thickness. Still, there is of course the
possibility that slabbing does impair image quality to an unacceptable degree. This could be construed as a limitation of
the study, and compromised image quality would be a serious problem. However, the purpose of this study is to quantify
the effect of slabbing on reading time. Although a larger image quality investigation is required before thicker slabs
can be implemented in clinical practice, such an investigation is not warranted if the gain in reading time is too small
to be relevant. At what level a decrease in workload becomes relevant is an issue for health economists to consider.
For reasons similar to those mentioned above, no direct comparison between the quality of the two methods of
slabbing was performed.
Regarding the clinical images, there was a large variation between the two radiologists both in absolute reading time and
in the change in reading time, with one showing a significant reduction in reading time of 20% and the other showing no
appreciable difference. By comparison, the reading time difference for the unsorted images was very consistent,
with all reviewers showing a significant and substantial decrease. The average difference was 7.7 s, or nearly 24%,
though it should be noted that there are substantial differences in reading time between radiologists.
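The percentages quoted here follow directly from the mean reading times reported in the Results, as the short calculation below shows. The helper name relative_reduction is introduced only for this illustration.

```python
def relative_reduction(before, after):
    """Percentage decrease from `before` to `after`."""
    return 100.0 * (before - after) / before

screening_total = relative_reduction(32.2, 24.5)  # all screening cases, Table 1
clinical_first = relative_reduction(69.6, 55.4)   # clinical cases, first radiologist
clinical_second = relative_reduction(41.4, 40.4)  # clinical cases, second radiologist

print(f"{screening_total:.1f} %")  # 23.9 %
print(f"{clinical_first:.1f} %")   # 20.4 %
print(f"{clinical_second:.1f} %")  # 2.4 %
```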
To put this in perspective, review of digital mammography is roughly 25-50% faster than that of BT [1-3]. This implies
that there is potential to speed up reading times by a meaningful amount. As noted before, there are individual variations
in reading time between readers, and we theorize that such variations become more pronounced in a "busy" breast with
suspicious-looking regions. Thus, even though the results for the clinical cases with suspicious findings were
inconclusive, the fact that the vast majority of images do not have any suspicious findings and that there was, critically,
a reduction in reading time on the unsorted screening cases supports the idea that it could be beneficial to implement
slabbing in a screening program, as it would speed up review of the majority of cases: those that show no suspicious
structures.
5. CONCLUSIONS
We conclude that it is feasible to use slabbing to reduce the review time of BT image sets to a level closer to that of
standard digital mammography. We cannot conclude whether slabbed images provide equivalent image quality (with
regard to lesion detection), but if not, the difference is likely minor. Thus, we consider slabbing to be a viable method
to reduce the workload of radiologists, though it requires further investigation before it can be used as a
standard.
ACKNOWLEDGEMENTS
We would like to acknowledge Siemens AG Healthcare for the opportunity to use the statistical artifact reduction and
superresolution reconstructions and for financial support. We would also like to acknowledge all participating
radiologists at Unilabs AB and SUS Malmö.
REFERENCES
[1] Zuley ML, Bandos AI, Abrams GS, Cohen C, Hakim CM, Sumkin JH, Drescher J, Rockette HE, Gur D,
“Time to diagnosis and performance levels during repeat interpretations of digital breast tomosynthesis:
Preliminary observations,” Acad Radiol 17, 450-455 (2010)
[2] Good WF, Abrams GS, Catullo VJ, Chough DM, Ganott MA, Hakim CM, Gur D, “Digital breast
tomosynthesis: A pilot observer study,” Am J Roentgenol 190, 865-869 (2008)
[3] Gur D, Abrams GS, Chough DM, Ganott MA, Hakim CM, Perrin RL, Rathfon GY, Sumkin JH, Zuley
ML, Bandos AI, “Digital breast tomosynthesis: Observer performance study,” Am J Roentgenol 193, 586-591
(2009)
[4] Diekmann F, et al., “Thick Slices from Tomosynthesis Data Sets: Phantom Study for the Evaluation of
Different Algorithms,” Journal of Digital Imaging 22(5), 519-526 (2009)
[5] Mertelmeier T, et al., “Optimizing filtered backprojection reconstruction for a breast tomosynthesis
prototype device,” Proc. SPIE 6142, 131-142 (2006)
[6] Abdurahman S, Jerebko A, Mertelmeier T, Lasser T, Navab N, “Out-of-plane artifact reduction in
tomosynthesis based on regression modeling and outlier detection,” IWDM’12 Proc. of the 11th international
conference on Breast Imaging, 729-736 (2012)
[7] Båth M and Månsson L.G, “Visual grading characteristics (VGC) analysis: a non-parametric rank-
invariant statistical method for image quality evaluation,” Br J Radiol 80(951), 169-76 (2007)
[8] Månsson L.G, “Methods for the evaluation of image quality: A review,” Radiat Prot Dosimetry 90, 89-99
(2000)
[9] Rasband W.S, ImageJ, U. S. National Institutes of Health, Bethesda, Maryland, USA,
http://imagej.nih.gov/ij/, 1997-2012
[10] Håkansson M, Svensson S, Båth M, Månsson L.G, “ViewDEX – A Java-based software for presentation
and evaluation of medical images in observer performance studies,” Proc. SPIE 6509, (2007)
[11] Börjesson S, Håkansson M, Båth M, Kheddache S, Svensson S, Tingberg A, Grahn A, Ruschin M,
Hemdal B, Mattsson S, Månsson L.G, “A software tool for increased efficiency in observer performance
studies in radiology,” Radiat. Prot. Dosimetry 114, 45-52 (2005)
[12] D’Orsi CJ, Mendelson EB, Ikeda DM, et al. [Breast Imaging Reporting and Data System: ACR BI-
RADS – Breast Imaging Atlas], American College of Radiology, Reston, (2003)
Proc. of SPIE Vol. 8673 86731L-6
Downloaded From: http://spiedigitallibrary.org/ on 09/27/2013 Terms of Use: http://spiedl.org/terms
... A possible way to decrease the amount of data and shorten the reading time in BT is to reduce the number of slice images in the reconstructed image volume [7] . This can be done by combining adjacent image planes into thicker slice images, so called slabbing [7,8] . ...
... A possible way to decrease the amount of data and shorten the reading time in BT is to reduce the number of slice images in the reconstructed image volume [7] . This can be done by combining adjacent image planes into thicker slice images, so called slabbing [7,8] . Since thicker slabbing decreases the depth resolution, and thereby increases tissue overlap, it could affect the detection of lesions [9,10] . ...
... That depends on if radiologists spend an equal amount of time on all slice images, or routinely go through the outer parts of the image volume faster. However, earlier studies suggest that increased slab thickness reduces the reading time [7] . Furthermore, during the image evaluation in this work, there was a wait for the standard image volumes (2 mm slab thickness) to upload to the workstation, while the smaller image volumes (6 mm and 10 mm slab thickness) was uploaded almost instantaneous. ...
Conference Paper
The large image volumes in breast tomosynthesis (BT) have led to large amounts of data and a heavy workload for breast radiologists. The number of slice images can be decreased by combining adjacent image planes (slabbing) but the decrease in depth resolution can considerably affect the detection of lesions. The aim of this work was to assess if thicker slabbing of the outer slice images (where lesions seldom are present) could be a viable alternative in order to reduce the number of slice images in BT image volumes. The suggested slabbing (an image volume with thick outer slabs and thin slices between) were evaluated in two steps. Firstly, a survey of the depth of 65 cancer lesions within the breast was performed to estimate how many lesions would be affected by outer slabs of different thicknesses. Secondly, a selection of 24 lesions was reconstructed with 2, 6 and 10 mm slab thickness to evaluate how the appearance of lesions located in the thicker slabs would be affected. The results show that few malignant breast lesions are located at a depth less than 10 mm from the surface (especially for breast thicknesses of 50 mm and above). Reconstruction of BT volumes with 6 mm slab thickness yields an image quality that is sufficient for lesion detection for a majority of the investigated cases. Together, this indicates that thicker slabbing of the outer slice images is a promising option in order to reduce the number of slice images in BT image volumes.
... Other approaches to improving the reading time of DBT include slabbing to reduce the number of planes to review by combining adjacent planes to create thicker planes [24] and reviewing single-view DBT planes (MLO views) without synthetic or FFDM images [25]. Slabbing has been suggested to reduce reading time by 20% without any significant loss in image quality [24]. ...
... Other approaches to improving the reading time of DBT include slabbing to reduce the number of planes to review by combining adjacent planes to create thicker planes [24] and reviewing single-view DBT planes (MLO views) without synthetic or FFDM images [25]. Slabbing has been suggested to reduce reading time by 20% without any significant loss in image quality [24]. An explorative analysis of single-view DBT exams demonstrated improved cancer detection rate with only a small increase in recall rate and no change in positive predictive value compared to two-view FFDM alone [25]. ...
Article
Purpose: Evaluate concurrent Computer-Aided Detection (CAD) with Digital Breast Tomosynthesis (DBT) to determine impact on radiologist performance and reading time. Materials and methods: The CAD system detects and extracts suspicious masses, architectural distortions and asymmetries from DBT planes that are blended into corresponding synthetic images to form CAD-enhanced synthetic images. Review of CAD-enhanced images and navigation to corresponding planes to confirm or dismiss potential lesions allows radiologists to more quickly review DBT planes. A retrospective, crossover study with and without CAD was conducted with six radiologists who read an enriched sample of 80 DBT cases including 23 malignant lesions in 21 women. Area Under the Receiver Operating Characteristic (ROC) Curve (AUC) compared the readings with and without CAD to determine the effect of CAD on overall interpretation performance. Sensitivity, specificity, recall rate and reading time were also assessed. Multi-reader, multi-case (MRMC) methods accounting for correlation and requiring correct lesion localization were used to analyze all endpoints. AUCs were based on a 0-100% probability of malignancy (POM) score. Sensitivity and specificity were based on BI-RADS scores, where 3 or higher was positive. Results: Average AUC across readers without CAD was 0.854 (range: 0.785-0.891, 95% confidence interval (CI): 0.769,0.939) and 0.850 (range: 0.746-0.905, 95% CI: 0.751,0.949) with CAD (95% CI for difference: -0.046,0.039), demonstrating non-inferiority of AUC. Average reduction in reading time with CAD was 23.5% (95% CI: 7.0-37.0% improvement), from an average 48.2 (95% CI: 39.1,59.6) seconds without CAD to 39.1 (95% CI: 26.2,54.5) seconds with CAD. Per-patient sensitivity was the same with and without CAD (0.865; 95% CI for difference: -0.070,0.070), and there was a small 0.022 improvement (95% CI for difference: -0.046,0.089) in per-lesion sensitivity from 0.790 without CAD to 0.812 with CAD. 
A slight reduction in specificity with a -0.014 difference (95% CI for difference: -0.079,0.050) and a small 0.025 increase (95% CI for difference: -0.036,0.087) in recall rate in non-cancer cases were observed with CAD. Conclusions: Concurrent CAD resulted in faster reading time with non-inferiority of radiologist interpretation performance. Radiologist sensitivity, specificity and recall rate were similar with and without CAD.
... However, two studies reported reading times of one sided one-view wide-angle DBT. 8,9 These times were similar to the DBT reading times in some of the studies comparing one-sided two-view DBT with DM, 6,7 but shorter than some of the studies. 4,5 Thus, the reading time of one-view wide-angle DBT might be slightly shorter than two-view DBT, but still longer than two-view DM; however, comparison between studies might be complicated due to different study designs. ...
Article
Purpose: Breast cancer screening is predominantly performed using digital mammography (DM), but digital breast tomosynthesis (DBT) has higher sensitivity. DBT demands more resources than DM, and it might be more feasible to reserve DBT for women with a clear benefit from the technique. We explore if artificial intelligence (AI) can select women who would benefit from DBT imaging. Approach: We used data from Malmö Breast Tomosynthesis Screening Trial, where all women prospectively were examined with separately double read DM and DBT. We retrospectively analyzed DM examinations (n=14768) with a breast cancer detection system and used the provided risk score (1 to 10) for risk stratification. We tested how different score thresholds for adding DBT to an initial DM affects the number of detected cancers, additional DBT examinations needed, detection rate, and false positives. Results: If using a threshold of 9.0, 25 (26%) more cancers would be detected compared to using DM alone. Of the 41 cancers only detected on DBT, 61% would be detected, with only 1797 (12%) of the women examined with both DM and DBT. The detection rate for the added DBT would be 14/1000 women, whereas the false-positive recalls would be increased with 58 (21%). Conclusion: Using DBT only for selected high gain cases could be an alternative to complete DBT screening. AI can analyze initial DM images to identify high gain cases where DBT can be added during the same visit. There might be logistical challenges, and further studies in a prospective setting are necessary.
... Is it preferable for an entire cluster to appear in one thick slice or to separate the calcifications/parts of the clusters among several slices? SRSAR allows for slices to be merged to form thick slabs and in a study by Dustler et al. 2 mm and 1 mm slices were compared with regards to image quality and time (22). Again, VGA was used to compare image quality but the limited number of cases did not support evidence to prove any statistical differences. ...
Article
Full-text available
Purpose: To investigate detection performance for calcification clusters in reconstructed digital breast tomosynthesis (DBT) slices at different dose levels using a Super Resolution and Statistical Artifact Reduction (SRSAR) reconstruction method. Method: Simulated calcifications with irregular profile (0.2 mm diameter) where combined to form clusters that were added to projection images (1-3 per abnormal image) acquired on a DBT system (Mammomat Inspiration, Siemens). The projection images were dose reduced by software to form 35 abnormal cases and 25 normal cases as if acquired at 100%, 75% and 50% dose level (AGD of approximately 1.6 mGy for a 53 mm standard breast, measured according to EUREF v0.15). A standard FBP and a SRSAR reconstruction method (utilizing IRIS (iterative reconstruction filters), and outlier detection using Maximum-Intensity Projections and Average-Intensity Projections) were used to reconstruct single central slices to be used in a Free-response task (60 images per observer and dose level). Six observers participated and their task was to detect the clusters and assign confidence rating in randomly presented images from the whole image set (balanced by dose level). Each trial was separated by one weeks to reduce possible memory bias. The outcome was analyzed for statistical differences using Jackknifed Alternative Free-response Receiver Operating Characteristics. Results: The results indicate that it is possible reduce the dose by 50% with SRSAR without jeopardizing cluster detection. Conclusions: The detection performance for clusters can be maintained at a lower dose level by using SRSAR reconstruction.
... We did not register the reading time in our study, since the reading and scoring procedures were specific to the trial and would not reflect the true time consumption in a normal screening workflow. In a previous study from our group, we found that the reading time for one-view DBT (MLO) was roughly 30 s (in an enriched population of clinical and screening cases) [27]. This is about one third of the reported reading time for the so-called combination modetwo-view DBT in combination with two-view DMused in the Oslo Tomosynthesis Screening Trial [12]. ...
Article
Full-text available
To assess the performance of one-view digital breast tomosynthesis (DBT) in breast cancer screening. The Malmö Breast Tomosynthesis Screening Trial is a prospective population-based one-arm study with a planned inclusion of 15000 participants; a random sample of women aged 40-74 years eligible for the screening programme. This is an explorative analysis of the first half of the study population (n = 7500). Participants underwent one-view DBT and two-view digital mammography (DM), with independent double reading and scoring. Primary outcome measures were detection rate, recall rate and positive predictive value (PPV). McNemar's test with 95 % confidence intervals was used. Breast cancer was found in sixty-eight women. Of these, 46 cases were detected by both modalities, 21 by DBT alone and one by DM alone. The detection rate for one-view DBT was 8.9/1000 screens (95 % CI 6.9 to 11.3) and 6.3/1000 screens (4.6 to 8.3) for two-view DM (p < 0.0001). The recall rate after arbitration was 3.8 % (3.3 to 4.2) for DBT and 2.6 % (2.3 to 3.0) for DM (p < 0.0001). The PPV was 24 % for both DBT and DM. Our results suggest that one-view DBT might be feasible as a stand-alone screening modality. • One-view DBT as a stand-alone breast cancer screening modality has not been investigated. • One-view DBT increased the cancer detection rate significantly. • The recall rate increased significantly but was still low. • Breast cancer screening with one-view DBT as a stand-alone modality seems feasible.
Article
The ultimate goals of the application of artificial intelligence (AI) to digital breast tomosynthesis (DBT) are the reduction of reading times, the increase of diagnostic performance and the reduction of interval cancer rates. In this review, after outlining the journey from computer-aided detection/diagnosis systems to AI applied to digital mammography (DM), we summarize the results of studies where AI was applied to DBT, noting that long-term advantages of DBT screening and its crucial ability to decrease the interval cancer rate are still under scrutiny. AI has shown the capability to overcome some shortcomings of DBT in the screening setting by improving diagnostic performance and by reducing recall rates (from −2% to −27%) and reading times (up to −53%, with an average 20% reduction), but the ability of AI to reduce interval cancer rates has not yet been clearly investigated. Prospective validation is needed to assess the cost-effectiveness and real-world impact of AI models assisting DBT interpretation, especially in large-scale studies with low breast cancer prevalence. Finally, we focus on the incoming era of personalized and risk-stratified screening that will first see the application of contrast-enhanced breast imaging to screen women with extremely dense breasts. As the diagnostic advantage of DBT over DM was concentrated in this category, we try to understand if the application of AI to DM in the remaining cohorts of women with heterogeneously dense or non-dense breast could close the gap in diagnostic performance between DM and DBT, thus neutralizing the usefulness of AI application to DBT.
Chapter
The introduction of mammography as a radiographic imaging modality optimized for breast imaging revolutionized breast cancer care. Throughout the decades, conventional, screen-film-based mammography has given way to digital mammography, resulting in many benefits, including a streamlined workflow and improved performance in certain subgroups of patients. More importantly, the introduction of digital technology in mammographic imaging resulted in the development of even more advanced technologies, such as digital breast tomosynthesis. Tomosynthesis, with its ability to result in pseudo-tomographic imaging of the breast with a system that has the same footprint and workflow as mammography, has had an important impact in the breast imaging clinic. In this chapter, the basic concepts of X-ray-based breast imaging, common for both mammography and tomosynthesis, are reviewed. The major components of these imaging systems are described, and the resulting and potential clinical and screening performance of these modalities is discussed. Finally, considering their widespread use in asymptomatic women during screening, the dosimetry aspects of X-ray-based breast imaging are explained.
Article
Objectives Tomosynthesis (DBT) has proven to be more sensitive than digital mammography, but it requires longer reading time. We retrospectively compared accuracy and reading times of a simplified protocol with 1-cm-thick slabs versus a standard protocol of slabs + 1-mm-spaced planes, both integrated with synthetic 2D. Methods We randomly selected 894 DBTs (including 12 cancers) from the experimental arm of the RETomo trial. DBTs were read by two radiologists to estimate specificity. A second set of 24 cancers (8 also present in the first set) mixed within 276 negative DBTs was read by two radiologists. In total, 28 cancers with 64 readings were used to estimate sensitivity. Radiologists read with both protocols separated by a 3-month washout. Only women that were positive at the screening reading were assessed. Variance was estimated taking into account repeated measures. Results Sensitivity was 82.8% (53/64, 95% confidence interval (95% CI) 67.2–92.2) and 90.6% (95% CI 80.2–95.8) with simplified and standard protocols, respectively. In the random screening setting, specificity was 97.9% (1727/1764, 95% CI 97.1–98.5) and 96.3% (95% CI 95.3–97.1), respectively. Inter-reader agreement was 0.68 and 0.54 with simplified and standard protocols, respectively. Median reading times with simplified protocol were 20% to 30% shorter than with standard protocol. Conclusions A simplified protocol reduced reading time and false positives but may have a negative impact on sensitivity. Key Points • The adoption of digital breast tomosynthesis (DBT) in screening, more sensitive than mammography, could be limited by its potential effect on the radiologists’ workload, i.e., increased reading time and fatigue. • A DBT simplified protocol with slab only, compared to a standard protocol (slab plus planes) both integrated with synthetic 2D, reduced time and false positives but had a negative impact on sensitivity.
Conference Paper
Digital Breast Tomosynthesis (DBT) has the potential to replace or supplement Digital Mammography (DM). Studies have shown that it takes radiologists more time to read DBT examinations compared with DM. The slice separation of image volumes has been set to 1 mm on most systems. By using thicker slices, review time could be reduced. This paper investigates the possibility of using 2 mm Average Intensity Pixel (AIP) slabs for image review. The thicker slabs were created using a method based on statistical artifact reduction and super-resolution. Six radiologists were presented with 20 sets of images containing 16 tumor masses and 8 micro-calcification clusters. They ranked 2 mm slabbed sets relative to standard 1 mm slices. Visibility of micro-calcifications improved (P = .0044) and there was no significant effect on mass visibility (P = .46). The results indicate that it is possible to review DBT volumes with 2 mm slabs without compromising image quality.
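To make the idea of slabbing concrete, the sketch below merges consecutive 1 mm slices into 2 mm slabs by plain pixel-wise averaging. This is a deliberate simplification: the abstract states that the slabs were created with a method based on statistical artifact reduction and super-resolution, which is not reproduced here. The function name, the nested-list volume layout and the toy data are assumptions made for illustration.

```python
def average_slabs(volume, slices_per_slab=2):
    """Merge consecutive slices into slabs by pixel-wise averaging.

    volume: list of slices, each slice a list of rows of pixel values.
    A trailing group with fewer slices is averaged over what is there.
    """
    slabs = []
    for start in range(0, len(volume), slices_per_slab):
        group = volume[start:start + slices_per_slab]
        rows, cols = len(group[0]), len(group[0][0])
        slab = [
            [sum(s[r][c] for s in group) / len(group) for c in range(cols)]
            for r in range(rows)
        ]
        slabs.append(slab)
    return slabs


# Toy volume: five 1 mm slices of 2 x 2 pixels, slice k filled with value k.
volume = [[[float(k)] * 2 for _ in range(2)] for k in range(5)]
slabs = average_slabs(volume)

print(len(volume), "slices ->", len(slabs), "slabs")
```

Halving the number of images to scroll through is the mechanism by which thicker slabs could shorten review time.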
Article
In medical imaging, information about the patient and possible abnormalities is transferred to the radiologist in two major steps: (i) data acquisition and image formation, and (ii) processing and display. Step one is mainly dependent on technical and physical characteristics of the equipment. Step two includes the vital importance of the performance of the radiologist; i.e. how he or she detects and interprets the structures in the image. The quality of a radiographical procedure must therefore be described with regard to both these steps. The spectrum of possible evaluation methods of importance will be described. The principles, benefits and drawbacks of some of these methods will be given together with examples of their use.
Article
Observer performance studies are time-consuming tasks, both for the participating observers and for the scientists collecting and analysing the data. A possible way to optimise such studies is to perform them in a completely digital environment. A software tool-ViewDEX (Viewer for Digital Evaluation of X-ray images)-has been developed in Java, enabling it to function on almost any computer. ViewDEX is designed to handle several types of studies, such as visual grading analysis (VGA), image criteria scoring (ICS) and receiver operating characteristics (ROC). The results from each observer are saved in a log file, which can be exported for further analysis in, for example, a special software for analysing ROC results. By using ViewDEX for an ROC experiment, an evaluation rate of approximately 200 images per hour can be achieved, compared to approximately 25 images per hour using hard copy evaluation. The results are obtained within minutes of completion of the viewing. The risk of human errors in the process of data collection and analysis is also minimised. The viewer has been used in a major trial containing approximately 2700 images.
Article
Visual grading of the reproduction of important anatomical structures is often used to determine clinical image quality in radiography. However, many visual grading methods incorrectly use statistical methods that require data belonging to an interval scale. The rating data from the observers in a visual grading study with multiple ratings is ordinal, meaning that non-parametric rank-invariant statistical methods are required. This paper describes such a method for determining the difference in image quality between two modalities called visual grading characteristics (VGC) analysis. In a VGC study, the task of the observer is to rate his confidence about the fulfilment of image quality criteria. The rating data for the two modalities are then analysed in a manner similar to that used in receiver operating characteristics (ROC) analysis. The resulting measure of image quality is the VGC curve, which, for all possible thresholds of the observer for a fulfilled criterion, describes the relationship between the proportions of fulfilled image criteria for the two compared modalities. The area under the VGC curve is proposed as a single measure of the difference in image quality between two compared modalities. It is also described how VGC analysis can be applied to data from an absolute visual grading analysis study.
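One way to make the area under the VGC curve concrete: when the curve is built from ordinal ratings and integrated with the trapezoidal rule, the area equals the proportion of rating pairs in which the test condition is rated higher than the reference condition, with ties counted as one half. This is the same identity that links the empirical ROC area to the Mann-Whitney statistic. The sketch below is a minimal illustration under that assumption; the ratings are invented, the function name is hypothetical, and no uncertainty estimate is attempted.

```python
def vgc_area(ratings_reference, ratings_test):
    """Empirical (trapezoidal) area under the VGC curve.

    Computed as the fraction of (test, reference) rating pairs in which the
    test rating is higher, counting tied pairs as one half. A value of 0.5
    means no difference between the two conditions.
    """
    wins = 0
    ties = 0
    for t in ratings_test:
        for r in ratings_reference:
            if t > r:
                wins += 1
            elif t == r:
                ties += 1
    return (wins + 0.5 * ties) / (len(ratings_test) * len(ratings_reference))


# Invented ordinal ratings on a 1-5 scale, for illustration only.
reference = [3, 3, 4, 2, 3]
test = [3, 4, 4, 3, 5]

area = vgc_area(reference, test)
print(f"Area under VGC curve: {area:.2f}")
```

Because only the ordering of ratings is used, the measure is rank-invariant, which is the property the abstract requires for ordinal data.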
Article
PURPOSE Tomosynthesis is a 3-dimensional mammography technique that generates thin slices separated one to the other by typically 1 mm from source data sets. The relatively high image noise in these thin slices raises the value of 1-cm thick slices computed from the set of reconstructed slices for image interpretation. In an initial evaluation, we investigated the potential of different algorithms for generating thick slices from tomosynthesis source data (maximum intensity projection—MIP; average algorithm—AV, and image generation by means of a new algorithm, so-called softMip). The three postprocessing techniques were evaluated using a homogeneous phantom with one textured slab with a total thickness of about 5 cm in which two 0.5-cm-thick slabs contained objects to simulate microcalcifications, spiculated masses, and round masses. The phantom was examined by tomosynthesis (GE Healthcare). Microcalcifications were simulated by inclusion of calcium particles of four different sizes. The slabs containing the inclusions were examined in two different configurations: adjacent to each other and close to the detector and with the two slabs separated by two 1-cm thick breast equivalent material slabs. The reconstructed tomosynthesis slices were postprocessed using MIP, AV, and softMip to generate 1-cm thick slices with a lower noise level. The three postprocessing algorithms were assessed by calculating the resulting contrast versus background for the simulated microcalcifications and contrast-to-noise ratios (CNR) for the other objects. The CNRs of the simulated round and spiculated masses were most favorable for the thick slices generated with the average algorithm, followed by softMip and MIP. Contrast of the simulated microcalcifications was best for MIP, followed by softMip and average projections. Our results suggest that the additional generation of thick slices may improve the visualization of objects in tomosynthesis. 
This improvement differs between the algorithms for microcalcifications, spiculated objects, and round masses. SoftMip is a new approach combining features of MIP and average, showing image properties in between MIP and AV.
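The contrast ranking reported for microcalcifications can be illustrated with a toy calculation along a single column of voxels through a 1 cm slab (ten 1 mm slices). The sketch below compares maximum intensity projection with averaging for one bright voxel on a uniform background. The intensity values are invented, softMip is not sketched because the abstract does not define it, and noise is not modelled, so the noise reduction that favours averaging for masses is not shown.

```python
def project(column, mode):
    """Collapse a column of voxel values through the slab into one pixel."""
    if mode == "mip":
        return max(column)
    if mode == "average":
        return sum(column) / len(column)
    raise ValueError(f"unknown mode: {mode}")


# Ten 1 mm slices: uniform background, and the same column with one bright
# voxel standing in for a microcalcification (values are invented).
background = [100.0] * 10
with_calc = [100.0] * 10
with_calc[4] = 400.0

contrast_mip = project(with_calc, "mip") - project(background, "mip")
contrast_avg = project(with_calc, "average") - project(background, "average")

print(f"MIP contrast: {contrast_mip:.0f}, average contrast: {contrast_avg:.0f}")
```

In this toy case, averaging dilutes the signal of an object confined to one slice by the number of slices in the slab, while MIP keeps it in full.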
Conference Paper
We propose a method for out-of-plane artifact reduction in digital breast tomosynthesis reconstruction. Because of the limited angular range acquisition in DBT, the reconstructed slices have reduced resolution in z-direction and are affected by artifacts. The out-of-plane blur caused by dense tissue and large masses complicates reconstruction of thick slices volumes. The streak-like out-of-plane artifacts caused by calcifications and metal clips distort the shape of calcifications which is regarded by many radiologists as an important malignancy predictor. Small clinical features such as micro-calcifications could be obscured by bright artifacts. The proposed technique involves reconstructing a set of super-resolution slices and predicting the artifact-free voxel intensity based on the corresponding set of projection pixels using a statistical model learned from a set of training data. Our experiments show that the resulting reconstructed images are de-blurred and streak-like artifacts are reduced, visibility of clinical features, contrast and sharpness are improved and thick-slice reconstruction is possible without the loss of contrast and sharpness.
Article
Digital breast tomosynthesis is a new technique intended to overcome the limitations of conventional projection mammography by reconstructing slices through the breast from projection views acquired from different angles with respect to the breast. We formulate a general theory of filtered backprojection reconstruction for linear tomosynthesis. The filtering step consists of an MTF inversion filter, a spectral filter, and a slice thickness filter. In this paper the method is applied first to simulated data to understand the basic effects of the various filtering steps. We then demonstrate the impact of the filter functions with simulated projections and with clinical data acquired with a research breast tomosynthesis system. With this reconstruction method the image quality can be controlled regarding noise and spatial resolution. In a wide range of spatial frequencies the slice thickness can be kept constant and artifacts caused by the incompleteness of the data can be suppressed.
Article
Observer performance studies are time-consuming tasks, both for the participating observers and for the scientists collecting and analyzing the data. A possible way to optimize such studies is to perform the study in a completely digital environment. A software tool - ViewDEX (Viewer for Digital Evaluation of X-ray images) - has been developed in Java, enabling it to function on almost any computer. ViewDEX is a DICOM-compatible software tool that can be used to display medical images with simultaneous registration of the observer's response. ViewDEX is designed so that the user in a simple way can alter the types of questions and images presented to the observers, enabling ROC, MAFC and visual grading studies to be conducted in a fast and efficient way. The software can also be used for bench marking and for educational purposes. The results from each observer are saved in a log file, which can be exported for further analysis. The software is freely available for non-commercial purposes.
Article
To compare time to interpretation and diagnostic performance levels during repeat readings of full-field digital mammography (FFDM) and digital breast tomosynthesis (DBT) in a retrospective study. Three experienced radiologists twice interpreted 125 selected examinations, 35 with verified cancers and 90 negative for cancer, during a period of 22 months using FFDM alone followed by a combined FFDM + DBT mode. Changes in time to "review and rate" these examinations as well as in diagnostic performance levels were assessed. A fixed-effect analysis accounting for cross-correlation due to the review of the same examinations by the same readers was performed. The total (combined) time to review and rate an examination increased on average by 33% between the first and second readings of the same examinations (P < .001). Radiologists reduced their time to review FFDM before making the DBT available for viewing. However, they spent more time reviewing the combined FFDM + DBT mode. The recall rates for examinations depicting cancer remained largely unchanged. Among the groups of examinations with concordant and discordant recall recommendations during the two readings, only the group of examinations that were "newly recalled" during repeat reading took significantly longer (P < .01). DBT-based breast imaging may ultimately result in a substantial increase in performance; however, without efficiency improvements DBT may take longer to interpret. Addition of "false-positive recalls" was most strongly associated with increase in interpretation time while elimination of "false-positive recalls" did not require longer interpretation time.
Article
The purpose of this study was to compare in a retrospective observer study the diagnostic performance of full-field digital mammography (FFDM) with that of digital breast tomosynthesis. Eight experienced radiologists interpreted images from 125 selected examinations, 35 with verified findings of cancer and 90 with no finding of cancer. The four display conditions included FFDM alone, 11 low-dose projections, reconstructed digital breast tomosynthesis images, and a combined display mode of FFDM and digital breast tomosynthesis images. Observers rated examinations using the screening BI-RADS rating scale and the free-response receiver operating characteristic paradigm. Observer performance levels were measured as the proportion of examinations prompting recall of patients for further diagnostic evaluation. The results were presented in terms of true-positive fraction and false-positive fraction. Performance levels were compared among the acquisitions and reading modes. Time to view and interpret an examination also was evaluated. Use of the combination of digital breast tomosynthesis and FFDM was associated with 30% reduction in recall rate for cancer-free examinations that would have led to recall if FFDM had been used alone (p < 0.0001 for the participating radiologists, p = 0.047 in the context of a generalized population of radiologists). Use of digital breast tomosynthesis alone also tended to reduce recall rates, an average of 10%, although the observed decrease was not statistically significant (p = 0.09 for the participating radiologists). There was no convincing evidence that use of digital breast tomosynthesis alone or in combination with FFDM results in a substantial improvement in sensitivity. Use of digital breast tomosynthesis for breast imaging may result in a substantial decrease in recall rate.
Article
The objective of our study was to assess ergonomic and diagnostic performance-related issues associated with the interpretation of digital breast tomosynthesis-generated examinations. Thirty selected cases were read under three different display conditions by nine experienced radiologists in a fully crossed, mode-balanced observer performance study. The reading modes included full-field digital mammography (FFDM) alone, the 11 low-dose projections acquired for the reconstruction of tomosynthesis images, and the reconstructed digital breast tomosynthesis examination. Observers rated cases under the free-response receiver operating characteristic, as well as a screening paradigm, and provided subjective assessments of the relative diagnostic value of the two digital breast tomosynthesis-based image sets as compared with FFDM. The time to review and diagnose each case was also evaluated. Observer performance measures were not statistically significant (p > 0.05) primarily because of the small sample size in this pilot study, suggesting that showing significant improvements in diagnosis, if any, will require a larger study. Several radiologists did perceive the digital breast tomosynthesis image set and the projection series to be better than FFDM (p < 0.05) for diagnosing this specific case set. The time to review, interpret, and rate the examinations was significantly different for the techniques in question (p < 0.05). Tomosynthesis-based breast imaging may have great potential, but much work is needed before its optimal role in the clinical environment is known.