A Study of the Feasibility of using slabbing to reduce Tomosynthesis Review Time

March 2013
Proceedings of SPIE - The International Society for Optical Engineering 8673:86731L

DOI:10.1117/12.2006987

Conference: SPIE Medical Imaging

Authors:

Magnus Dustler

Lund University

Martin Andersson

Lund University

Daniel Förnvik

Lund University

A Study of the Feasibility of using slabbing to reduce Tomosynthesis Review Time

Magnus Dustler1*, Martin Andersson1, Daniel Förnvik1, Pontus Timberg2, Anders Tingberg1

1 Medical Radiation Physics, Department of Clinical Sciences Malmö, Faculty of Medicine, Lund University, Skåne University Hospital, SE-205 02, Sweden

2 Diagnostic Radiology, Department of Clinical Sciences Malmö, Faculty of Medicine, Lund University, Skåne University Hospital Malmö, SE-205 02, Sweden

ABSTRACT

This study aimed to investigate whether decreasing the number of slices in breast tomosynthesis (BT) image volumes reduces reading time. BT slices were combined into so-called slabs by reconstructing thin slices and merging them into thicker slabs. Sets of slabs were created from 35 clinical BT volumes with malignant or benign findings and from 50 BT volumes drawn from screening sets (without any prior review). The image sets were reviewed in two separate sessions while the review time was recorded. A total of five experienced radiologists took part in the image review.

Additionally, a visual grading analysis (VGA) study was performed to compare slabbed images with the originals in order to ensure that the image quality was not significantly degraded. One set of 27 pathological cases (13 masses and 14 microcalcification clusters) and one of 22 subtle lesions that had been missed on digital mammography but detected on BT were presented to an experienced radiologist and two medical physicists, who rated the quality of the slabbed versions relative to the originals. The study found no significant degradation in image quality when using 2 mm slabs instead of 1 mm slices. There was no significant decrease in reading time on clinical cases (P = .133), but on screening images there was a significant decrease of 7.7 ± 9.6 s from an average level of 32.2 ± 14.5 s (P < .0001). This suggests that increasing slab thickness can reduce the time radiologists spend studying normal images by more than 20%.

Keywords: Mammography, breast tomosynthesis, slab, thick slice, reading time

1. INTRODUCTION

Breast tomosynthesis (BT) is emerging as a potentially viable alternative to mammography for breast cancer screening. This study addresses one of the major potential obstacles: the increased review time per case when compared to mammography [1-3]. Reducing the number of slices in the BT image volume by merging them into thicker slabs might be effective in this regard, as the ability to review a case promptly is of critical importance, especially in a screening modality.

This study is based in part on data collected during the spring of 2011 with the intention of possibly expanding them at a later date. The data were reanalyzed and extended during 2012, when more advanced methods of producing image slabs became available.

2. METHOD(S)

2.1 Slice thickness

Increasing the thickness of BT slices can be done in two ways [4]. The most straightforward of these is to vary the thickness during reconstruction, as the filtered back projection (FBP) algorithm used can handle arbitrary reconstruction thicknesses, merely affecting the size of the reconstructed image voxels along the z-direction. To improve the quality of the reconstructed volume and add to the 3D appearance by reducing the influence of out-of-plane artifacts, a slice thickness filter tailored to a specific z-direction sampling interval (or slice thickness) is used. The application of the slice thickness filter is intended to keep slice thickness constant, but it also reduces the sharpness of high-frequency image components. As the Siemens MAMMOMAT Inspiration Tomo (Siemens AG, Erlangen, Germany) and its slice thickness filter are optimized for a sampling interval of 1 mm in the z-direction, reconstructing thicker slices in this way would introduce an additional factor of uncertainty [5].

* Corresponding author: magnus.dustler@med.lu.se, +46 40 33 86 56

Medical Imaging 2013: Image Perception, Observer Performance, and Technology Assessment, edited by Craig K. Abbey, Claudia R. Mello-Thoms, Proc. of SPIE Vol. 8673, 86731L

2.2 Slabbing

The alternative to generating thicker slices is to combine several reconstructed slices into new thicker ones, which are called slabs (Figure 1). There are a number of possible approaches to do this, most prominently by setting each pixel in the slab to either the average or the maximum value of the corresponding pixels in the constituent slices (known as average-intensity projection (AIP) and maximum-intensity projection (MIP) slabs, respectively). These approaches do not differ in the number or thickness of slabs they produce, but they will affect the image information contained in the slices in different ways. Averaging suppresses image noise, but will also reduce the contrast of small features such as microcalcifications, especially if such structures are included in only one slice. Selecting the MIP will instead increase the visibility of microcalcifications while also increasing the noise level. For this study, it was decided that the MIP and AIP approaches would both be appropriate. Slabs were created either by importing and slabbing BT volumes using Matlab (Mathworks, Natick, MA) or by employing a Siemens prototype system with a built-in option of reconstruction based on statistical artifact reduction and superresolution, which dispenses with the slice thickness filter and instead reconstructs slices with a thickness of 1/6 mm and combines them into slabs of the desired thickness [6]. This method was not available during the initial phase of the study.

Figure 1: The slabbing procedure. Each pixel of a slab takes the maximum value of the corresponding pixels in a number of thinner slices.
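The two slabbing rules described above can be sketched in a few lines. This is a minimal illustration, not the Matlab or Siemens prototype code used in the study: the function name `make_slabs`, the (z, y, x) array layout, and the choice to drop trailing slices that do not fill a whole slab are all assumptions made for the example.

```python
import numpy as np


def make_slabs(volume: np.ndarray, slices_per_slab: int, mode: str = "mip") -> np.ndarray:
    """Merge consecutive slices of a (z, y, x) volume into thicker slabs.

    mode="mip" takes the per-pixel maximum, mode="aip" the per-pixel mean.
    Trailing slices that do not fill a whole slab are dropped (an assumption).
    """
    if slices_per_slab < 1:
        raise ValueError("slices_per_slab must be at least 1")
    n_slabs = volume.shape[0] // slices_per_slab
    if n_slabs == 0:
        raise ValueError("volume has fewer slices than one slab")
    # Group the slices as (n_slabs, slices_per_slab, y, x).
    usable = volume[: n_slabs * slices_per_slab]
    grouped = usable.reshape(n_slabs, slices_per_slab, *volume.shape[1:])
    if mode == "mip":
        return grouped.max(axis=1)
    if mode == "aip":
        return grouped.mean(axis=1)
    raise ValueError(f"unknown mode: {mode!r}")


# Toy volume: 4 slices of 2x2 pixels, with one bright "microcalcification"
# that is present in a single slice only.
volume = np.zeros((4, 2, 2))
volume[1, 0, 0] = 100.0

mip = make_slabs(volume, 2, "mip")
aip = make_slabs(volume, 2, "aip")
print(mip.shape)     # (2, 2, 2)
print(mip[0, 0, 0])  # 100.0, the bright pixel keeps its full value
print(aip[0, 0, 0])  # 50.0, averaging halves its contrast
```

The toy example mirrors the trade-off in the text: a feature confined to one slice keeps its full value in the MIP slab but loses half its contrast in a two-slice AIP slab.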

In order not to complicate the study, it was decided to investigate only the reading time reduction obtained by halving the number of slices in the image volumes. This meant comparing the 1 mm slices with 2 mm slabs. Two different methods were used to create the slabs. Firstly, the straightforward approach of building each 2 mm AIP slab from two consecutive 1 mm slices, reconstructed with standard settings using the slice thickness filter, was used. Secondly, the previously mentioned built-in option of reconstruction based on statistical artifact reduction was used when it became available during 2012, combining 12 slices into 2 mm slabs. These approaches are broadly similar in that the AIP slabs they produce are somewhat noisier than 1 mm reconstructed slices but preserve (or improve) the visibility of high-frequency components (such as microcalcifications).

2.3 Image Quality

To ensure that this procedure would not dramatically affect lesion detection, slabs of a set of 27 pathological structures from clinical images (13 masses and 14 microcalcification clusters) were created. These were presented side-by-side with the originals in a relative Visual Grading Analysis (VGA) [7-8]. One experienced radiologist and two medical physicists were asked to rate the quality of the slabbed image relative to the original, using ImageJ software [9].

Later, 22 sets of images with subtle findings (defined as lesions that were detected at tomosynthesis but not on mammography) were used to compare slabs of 2, 3 and 5 mm thickness with 1 mm originals. These slabs were produced with the now available thin-slice superresolution reconstruction with statistical artifact reduction, which had the added benefit that the images could easily be imported to standard mammography workstations, allowing their use for review. The same radiologist and medical physicists ranked these images from best to worst. The criteria which reviewers were told to use in their rating and ranking of images were (if applicable): mass visibility, microcalcification conspicuity, mass visibility and microcalcification continuity. Continuity was defined in this context as the degree to which structures could be followed from one slice to the next.

2.4 Reading time

The reading time study was split into two parts. The first part, performed during 2011 using standard 2 mm slabs, involved two experienced radiologists reviewing 35 clinical images (30 with minor proven benign findings and 5 with proven malignant findings) in a simulated screening setting using ViewDEX [10-11]. The radiologists were told to mark findings as normal while being timed with a chronometer. The enriched material was intended to investigate the effect of slabbing on tomosynthesis cases with suspicious findings. The radiologists were presented with the cases in two sessions separated by three weeks. Nineteen image stacks were randomly selected to be slabbed during the first session, while the remaining 16 stacks were slabbed during the second session.

The second part of the study involved slabbing 50 sets of image stacks taken directly from a BT screening study. These image stacks were deliberately selected at random without any prior review, in order to represent the type of cases commonly encountered in a screening setting, i.e. likely normal cases with no pathology. Four experienced radiologists (one of whom also took part in the first, clinical part) reviewed these cases both with the standard 1 mm slices and with 2 mm superresolution MIP slabs under normal screening conditions using a standard mammography workstation. The reviewers reviewed each case until they felt confident in assigning it a BI-RADS score [12]. The reviews were spread over two to four sessions separated by between two days and three weeks, with the same image reviewed no more than once during any session. The reading time of each individual case was recorded using a chronometer. The radiologists were allowed to use all of the workstation's functions and were told to review the cases as would normally be done for a new screening patient with no prior history or images available.

3. RESULTS

3.1 Image quality

The results of both the relative VGA study and the image ranking study were analyzed using a non-parametric Friedman test for multiple readers. The VGA study showed no significant difference in image quality between the original images and the slabbed images (Figure 2). The ranking study (on the statistical artifact reduction AIP slabs) did not show a significant difference between the original images and the 2 mm slabs, but did show significant differences for the 3 and 5 mm slabs (Figure 3).

3.2 Reading time

The first part of the reading time study, on clinical images, was inconclusive. One radiologist showed a significant decrease in reading time from a mean of 69.6 s to 55.4 s (P = .039), while the other showed a very minor, statistically non-significant decrease from 41.4 s to 40.4 s (P = .906). An ANOVA for multiple readers showed no significant difference in review time (P = .133). The second part indicated a decrease in reading time on the unsorted screening material of 7.7 ± 9.6 s from an average level of 32.2 ± 14.5 s. A 2-way ANOVA performed on the whole dataset showed that this decrease was significant (P < .0001), and indicated that there were significant differences in reading time between the four reviewing radiologists. Furthermore, all four reviewers individually showed statistically significant decreases in reading time (P < .0001) using the paired Student's t-test. Table 1 summarizes the data for each reviewer.

Figure 2: Results of the relative VGA, using the Friedman test for multiple readers. There is no significant difference between group 1 (original images) and group 2 (2 mm MIP slabs), as the confidence intervals overlap.

Reading time (s)

              1 mm           2 mm           Difference
Reviewer 1    23.4 ± 6.6     14.1 ± 3.7     -9.4 ± 8.3
Reviewer 2    52.4 ± 12.6    41.2 ± 10.8    -11.2 ± 13.7
Reviewer 3    27.7 ± 8.1     22.4 ± 5.6     -5.2 ± 8.2
Reviewer 4    25.0 ± 4.7     20.2 ± 4.2     -4.8 ± 4.2
Total         32.2 ± 14.5    24.5 ± 12.1    -7.7 ± 9.6

Table 1: Summary of statistics from the reading time experiment on unsorted screening images.
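The per-reviewer paired t-tests can be roughly cross-checked from the summary values in Table 1. The sketch below is an illustration under stated assumptions, not the authors' analysis: it assumes that the ± values are standard deviations and that each reviewer read all 50 screening cases in both formats (n = 50). The helper name `paired_t` is hypothetical.

```python
import math

# (mean difference, SD of the difference) in seconds, copied from Table 1.
# Assumption: the ± values in the table are standard deviations.
differences = {
    "Reviewer 1": (-9.4, 8.3),
    "Reviewer 2": (-11.2, 13.7),
    "Reviewer 3": (-5.2, 8.2),
    "Reviewer 4": (-4.8, 4.2),
}

# Assumption: every reviewer read all 50 screening cases in both formats.
N_CASES = 50


def paired_t(mean_diff: float, sd_diff: float, n: int) -> float:
    """Paired t statistic: mean difference divided by its standard error."""
    return mean_diff / (sd_diff / math.sqrt(n))


t_values = {
    name: paired_t(mean_diff, sd_diff, N_CASES)
    for name, (mean_diff, sd_diff) in differences.items()
}

for name, t in t_values.items():
    print(f"{name}: t = {t:.2f} (df = {N_CASES - 1})")
```

Under these assumptions every reviewer's t statistic is below -4 with 49 degrees of freedom, which points in the same direction as the reported significant decreases.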

Figure 3: Comparison of the ranking of original (1 mm) slices and 2, 3 and 5 mm slabs using the Friedman test for multiple readers. There is a significant difference between the 3 and 5 mm slabs and the 1 mm slices, as there is no overlap between their confidence intervals.

4. DISCUSSION

The VGA and ranking studies did not indicate a significant decrease in image quality for 2 mm slabs, which opens the possibility of using thick slices to shorten reading time. Though the study would have to be considerably larger to conclusively show that there is no difference, one can assume that such a difference would be minor and could likely be compensated for by optimizing reconstruction parameters for the new slice thickness. Still, the possibility remains that slabbing impairs image quality to an unacceptable degree. This could be construed as a limitation of the study, and compromised image quality would be a serious problem. However, the scope of this study is to quantify the effect of slabbing on reading time. A larger image quality investigation is required before thicker slabs can be implemented in clinical practice, but such a study is not worthwhile if the gain in reading time is too small to be relevant. At what level a decrease in workload becomes relevant is an issue that has to be considered by health economists.

For similar reasons, no direct comparison between the quality of the two methods of slabbing was performed.

Regarding the clinical images, there was a large variation between the two radiologists both in absolute reading time and in the change in reading time, with one showing a significant reduction in reading time of 20% and the other showing no appreciable difference. By comparison, the reading time differential for the unsorted images was very consistent, with all reviewers showing a significant and substantial decrease. The average difference was 7.7 s, or nearly 24%, though it should be noted that there are substantial differences in reading time between different radiologists.
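The percentages quoted here follow directly from the reported mean reading times. As a quick arithmetic check (the function name `relative_reduction` is only illustrative):

```python
def relative_reduction(baseline_s: float, decrease_s: float) -> float:
    """Decrease in reading time as a percentage of the baseline time."""
    return 100.0 * decrease_s / baseline_s


# Screening cases: mean decrease of 7.7 s from a baseline of 32.2 s.
screening = relative_reduction(32.2, 7.7)

# Clinical cases, first radiologist: mean reading time fell from 69.6 s to 55.4 s.
clinical = relative_reduction(69.6, 69.6 - 55.4)

print(f"screening: {screening:.1f}%")  # screening: 23.9%
print(f"clinical:  {clinical:.1f}%")   # clinical:  20.4%
```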

To put this in perspective, review of digital mammography is roughly 25-50% faster than that of BT [1-3]. This implies that there is a potential to speed up reading times by a meaningful amount. As noted before, there are individual variations in reading time between readers, and we theorize that such variations become more pronounced in a "busy" breast with suspicious-looking regions. Thus, even though the results for the clinical cases with suspicious findings were inconclusive, the vast majority of screening images do not have any suspicious findings, and there was, critically, a reduction in reading time on the unsorted screening cases. This supports the idea that it could be beneficial to implement slabbing in a screening program, as it would speed up review of the majority of cases: those that show no suspicious structures.

5. CONCLUSIONS

We conclude that it is feasible to use slabbing to reduce the review time of BT image sets to a level closer to that of standard digital mammography. We cannot conclude whether slabbed images provide equivalent image quality (with regard to lesion detection), but if not, the difference is likely minor. Thus, we consider slabbing to be a viable method to reduce the workload of radiologists, though it requires further investigation before it can be used as a standard.

ACKNOWLEDGEMENTS

We would like to acknowledge Siemens AG Healthcare for the opportunity to use the statistical artifact reduction and superresolution reconstructions and for financial support. We would also like to acknowledge all participating radiologists at Unilabs AB and SUS Malmö.

REFERENCES

[1] Zuley ML, Bandos AI, Abrams GS, Cohen C, Hakim CM, Sumkin JH, Drescher J, Rockette HE, Gur D, “Time to diagnosis and performance levels during repeat interpretations of digital breast tomosynthesis: Preliminary observations,” Acad Radiol 17, 450-455 (2010)

[2] Good WF, Abrams GS, Catullo VJ, Chough DM, Ganott MA, Hakim CM, Gur D, “Digital breast tomosynthesis: A pilot observer study,” Am J Roentgenol 190, 865-869 (2008)

[3] Gur D, Abrams GS, Chough DM, Ganott MA, Hakim CM, Perrin RL, Rathfon GY, Sumkin JH, Zuley ML, Bandos AI, “Digital breast tomosynthesis: Observer performance study,” Am J Roentgenol 193, 586-591 (2009)

[4] Diekmann F, et al., “Thick Slices from Tomosynthesis Data Sets: Phantom Study for the Evaluation of Different Algorithms,” Journal of Digital Imaging 22(5), 519-526 (2009)

[5] Mertelmeier T, et al., “Optimizing filtered backprojection reconstruction for a breast tomosynthesis prototype device,” Proc. SPIE 6142, 131-142 (2006)

[6] Abdurahman S, Jerebko A, Mertelmeier T, Lasser T, Navab N, “Out-of-plane artifact reduction in tomosynthesis based on regression modeling and outlier detection,” IWDM’12 Proc. of the 11th international conference on Breast Imaging, 729-736 (2012)

[7] Båth M and Månsson L.G, “Visual grading characteristics (VGC) analysis: a non-parametric rank-invariant statistical method for image quality evaluation,” Br J Radiol 80(951), 169-76 (2007)

[8] Månsson L.G, “Methods for the evaluation of image quality: A review,” Radiat Prot Dosimetry 90, 89-99 (2000)

[9] Rasband W.S, ImageJ, U. S. National Institutes of Health, Bethesda, Maryland, USA, http://imagej.nih.gov/ij/, 1997-2012

[10] Håkansson M, Svensson S, Båth M, Månsson L.G, “ViewDEX – A Java-based software for presentation and evaluation of medical images in observer performance studies,” Proc. SPIE 6509, (2007)

[11] Börjesson S, Håkansson M, Båth M, Kheddache S, Svensson S, Tingberg A, Grahn A, Ruschin M, Hemdal B, Mattsson S, Månsson L.G, “A software tool for increased efficiency in observer performance studies in radiology,” Radiat. Prot. Dosimetry 114, 45-52 (2005)

[12] D’Orsi CJ, Mendelson EB, Ikeda DM, et al. [Breast Imaging Reporting and Data System: ACR BI-RADS – Breast Imaging Atlas], American College of Radiology, Reston, (2003)

Proc. of SPIE Vol. 8673 86731L-6

Downloaded From: http://spiedigitallibrary.org/ on 09/27/2013 Terms of Use: http://spiedl.org/terms

Evaluation of the possibility to use thick slabs of reconstructed outer breast tomosynthesis slice images

Conference Paper

Mar 2016

The large image volumes in breast tomosynthesis (BT) have led to large amounts of data and a heavy workload for breast radiologists. The number of slice images can be decreased by combining adjacent image planes (slabbing) but the decrease in depth resolution can considerably affect the detection of lesions. The aim of this work was to assess if thicker slabbing of the outer slice images (where lesions seldom are present) could be a viable alternative in order to reduce the number of slice images in BT image volumes. The suggested slabbing (an image volume with thick outer slabs and thin slices between) were evaluated in two steps. Firstly, a survey of the depth of 65 cancer lesions within the breast was performed to estimate how many lesions would be affected by outer slabs of different thicknesses. Secondly, a selection of 24 lesions was reconstructed with 2, 6 and 10 mm slab thickness to evaluate how the appearance of lesions located in the thicker slabs would be affected. The results show that few malignant breast lesions are located at a depth less than 10 mm from the surface (especially for breast thicknesses of 50 mm and above). Reconstruction of BT volumes with 6 mm slab thickness yields an image quality that is sufficient for lesion detection for a majority of the investigated cases. Together, this indicates that thicker slabbing of the outer slice images is a promising option in order to reduce the number of slice images in BT image volumes.

Improving digital breast tomosynthesis reading time: A pilot multi-reader, multi-case study using concurrent Computer-Aided Detection (CAD)

Article

Oct 2017
EUR J RADIOL

Purpose: Evaluate concurrent Computer-Aided Detection (CAD) with Digital Breast Tomosynthesis (DBT) to determine impact on radiologist performance and reading time. Materials and methods: The CAD system detects and extracts suspicious masses, architectural distortions and asymmetries from DBT planes that are blended into corresponding synthetic images to form CAD-enhanced synthetic images. Review of CAD-enhanced images and navigation to corresponding planes to confirm or dismiss potential lesions allows radiologists to more quickly review DBT planes. A retrospective, crossover study with and without CAD was conducted with six radiologists who read an enriched sample of 80 DBT cases including 23 malignant lesions in 21 women. Area Under the Receiver Operating Characteristic (ROC) Curve (AUC) compared the readings with and without CAD to determine the effect of CAD on overall interpretation performance. Sensitivity, specificity, recall rate and reading time were also assessed. Multi-reader, multi-case (MRMC) methods accounting for correlation and requiring correct lesion localization were used to analyze all endpoints. AUCs were based on a 0-100% probability of malignancy (POM) score. Sensitivity and specificity were based on BI-RADS scores, where 3 or higher was positive. Results: Average AUC across readers without CAD was 0.854 (range: 0.785-0.891, 95% confidence interval (CI): 0.769,0.939) and 0.850 (range: 0.746-0.905, 95% CI: 0.751,0.949) with CAD (95% CI for difference: -0.046,0.039), demonstrating non-inferiority of AUC. Average reduction in reading time with CAD was 23.5% (95% CI: 7.0-37.0% improvement), from an average 48.2 (95% CI: 39.1,59.6) seconds without CAD to 39.1 (95% CI: 26.2,54.5) seconds with CAD. Per-patient sensitivity was the same with and without CAD (0.865; 95% CI for difference: -0.070,0.070), and there was a small 0.022 improvement (95% CI for difference: -0.046,0.089) in per-lesion sensitivity from 0.790 without CAD to 0.812 with CAD. 
A slight reduction in specificity with a -0.014 difference (95% CI for difference: -0.079,0.050) and a small 0.025 increase (95% CI for difference: -0.036,0.087) in recall rate in non-cancer cases were observed with CAD. Conclusions: Concurrent CAD resulted in faster reading time with non-inferiority of radiologist interpretation performance. Radiologist sensitivity, specificity and recall rate were similar with and without CAD.

Personalized breast cancer screening with selective addition of digital breast tomosynthesis through artificial intelligence

Article

Jun 2023

Purpose: Breast cancer screening is predominantly performed using digital mammography (DM), but digital breast tomosynthesis (DBT) has higher sensitivity. DBT demands more resources than DM, and it might be more feasible to reserve DBT for women with a clear benefit from the technique. We explore if artificial intelligence (AI) can select women who would benefit from DBT imaging. Approach: We used data from Malmö Breast Tomosynthesis Screening Trial, where all women prospectively were examined with separately double read DM and DBT. We retrospectively analyzed DM examinations (n=14768) with a breast cancer detection system and used the provided risk score (1 to 10) for risk stratification. We tested how different score thresholds for adding DBT to an initial DM affects the number of detected cancers, additional DBT examinations needed, detection rate, and false positives. Results: If using a threshold of 9.0, 25 (26%) more cancers would be detected compared to using DM alone. Of the 41 cancers only detected on DBT, 61% would be detected, with only 1797 (12%) of the women examined with both DM and DBT. The detection rate for the added DBT would be 14/1000 women, whereas the false-positive recalls would be increased with 58 (21%). Conclusion: Using DBT only for selected high gain cases could be an alternative to complete DBT screening. AI can analyze initial DM images to identify high gain cases where DBT can be added during the same visit. There might be logistical challenges, and further studies in a prospective setting are necessary.

Detection of calcification clusters in digital breast tomosynthesis slices at different dose levels utilizing a SRSAR reconstruction and JAFROC

Article

Full-text available

Mar 2015

Purpose: To investigate detection performance for calcification clusters in reconstructed digital breast tomosynthesis (DBT) slices at different dose levels using a Super Resolution and Statistical Artifact Reduction (SRSAR) reconstruction method. Method: Simulated calcifications with irregular profile (0.2 mm diameter) where combined to form clusters that were added to projection images (1-3 per abnormal image) acquired on a DBT system (Mammomat Inspiration, Siemens). The projection images were dose reduced by software to form 35 abnormal cases and 25 normal cases as if acquired at 100%, 75% and 50% dose level (AGD of approximately 1.6 mGy for a 53 mm standard breast, measured according to EUREF v0.15). A standard FBP and a SRSAR reconstruction method (utilizing IRIS (iterative reconstruction filters), and outlier detection using Maximum-Intensity Projections and Average-Intensity Projections) were used to reconstruct single central slices to be used in a Free-response task (60 images per observer and dose level). Six observers participated and their task was to detect the clusters and assign confidence rating in randomly presented images from the whole image set (balanced by dose level). Each trial was separated by one weeks to reduce possible memory bias. The outcome was analyzed for statistical differences using Jackknifed Alternative Free-response Receiver Operating Characteristics. Results: The results indicate that it is possible reduce the dose by 50% with SRSAR without jeopardizing cluster detection. Conclusions: The detection performance for clusters can be maintained at a lower dose level by using SRSAR reconstruction.

Performance of one-view breast tomosynthesis as a stand-alone breast cancer screening modality: results from the Malmö Breast Tomosynthesis Screening Trial, a population-based study

Article

Full-text available

May 2015
EUR RADIOL

To assess the performance of one-view digital breast tomosynthesis (DBT) in breast cancer screening. The Malmö Breast Tomosynthesis Screening Trial is a prospective population-based one-arm study with a planned inclusion of 15000 participants; a random sample of women aged 40-74 years eligible for the screening programme. This is an explorative analysis of the first half of the study population (n = 7500). Participants underwent one-view DBT and two-view digital mammography (DM), with independent double reading and scoring. Primary outcome measures were detection rate, recall rate and positive predictive value (PPV). McNemar's test with 95 % confidence intervals was used. Breast cancer was found in sixty-eight women. Of these, 46 cases were detected by both modalities, 21 by DBT alone and one by DM alone. The detection rate for one-view DBT was 8.9/1000 screens (95 % CI 6.9 to 11.3) and 6.3/1000 screens (4.6 to 8.3) for two-view DM (p < 0.0001). The recall rate after arbitration was 3.8 % (3.3 to 4.2) for DBT and 2.6 % (2.3 to 3.0) for DM (p < 0.0001). The PPV was 24 % for both DBT and DM. Our results suggest that one-view DBT might be feasible as a stand-alone screening modality. • One-view DBT as a stand-alone breast cancer screening modality has not been investigated. • One-view DBT increased the cancer detection rate significantly. • The recall rate increased significantly but was still low. • Breast cancer screening with one-view DBT as a stand-alone modality seems feasible.

Artificial intelligence for digital breast tomosynthesis: Impact on diagnostic performance, reading times, and workload in the era of personalized screening

Article

Dec 2022
EUR J RADIOL

The ultimate goals of the application of artificial intelligence (AI) to digital breast tomosynthesis (DBT) are the reduction of reading times, the increase of diagnostic performance and the reduction of interval cancer rates. In this review, after outlining the journey from computer-aided detection/diagnosis systems to AI applied to digital mammography (DM), we summarize the results of studies where AI was applied to DBT, noting that long-term advantages of DBT screening and its crucial ability to decrease the interval cancer rate are still under scrutiny. AI has shown the capability to overcome some shortcomings of DBT in the screening setting by improving diagnostic performance and by reducing recall rates (from −2% to −27%) and reading times (up to −53%, with an average 20% reduction), but the ability of AI to reduce interval cancer rates has not yet been clearly investigated. Prospective validation is needed to assess the cost-effectiveness and real-world impact of AI models assisting DBT interpretation, especially in large-scale studies with low breast cancer prevalence. Finally, we focus on the incoming era of personalized and risk-stratified screening that will first see the application of contrast-enhanced breast imaging to screen women with extremely dense breasts. As the diagnostic advantage of DBT over DM was concentrated in this category, we try to understand if the application of AI to DM in the remaining cohorts of women with heterogeneously dense or non-dense breast could close the gap in diagnostic performance between DM and DBT, thus neutralizing the usefulness of AI application to DBT.

Mammography and Digital Breast Tomosynthesis: Technique

Chapter

Nov 2022

Ioannis Sechopoulos

The introduction of mammography as a radiographic imaging modality optimized for breast imaging revolutionized breast cancer care. Throughout the decades, conventional, screen-film-based mammography has given way to digital mammography, resulting in many benefits, including a streamlined workflow and improved performance in certain subgroups of patients. More importantly, the introduction of digital technology in mammographic imaging resulted in the development of even more advanced technologies, such as digital breast tomosynthesis. Tomosynthesis, with its ability to result in pseudo-tomographic imaging of the breast with a system that has the same footprint and workflow as mammography, has had an important impact in the breast imaging clinic. In this chapter, the basic concepts of X-ray-based breast imaging, common for both mammography and tomosynthesis, are reviewed. The major components of these imaging systems are described, and the resulting and potential clinical and screening performance of these modalities is discussed. Finally, considering their widespread use in asymptomatic women during screening, the dosimetry aspects of X-ray-based breast imaging are explained.

Comparing two visualization protocols for tomosynthesis in screening: specificity and sensitivity of slabs versus planes plus slabs

Article

Feb 2019

Objectives: Tomosynthesis (DBT) has proven to be more sensitive than digital mammography, but it requires longer reading time. We retrospectively compared accuracy and reading times of a simplified protocol with 1-cm-thick slabs versus a standard protocol of slabs + 1-mm-spaced planes, both integrated with synthetic 2D. Methods: We randomly selected 894 DBTs (including 12 cancers) from the experimental arm of the RETomo trial. DBTs were read by two radiologists to estimate specificity. A second set of 24 cancers (8 also present in the first set) mixed within 276 negative DBTs was read by two radiologists. In total, 28 cancers with 64 readings were used to estimate sensitivity. Radiologists read with both protocols separated by a 3-month washout. Only women that were positive at the screening reading were assessed. Variance was estimated taking into account repeated measures. Results: Sensitivity was 82.8% (53/64, 95% confidence interval (95% CI) 67.2–92.2) and 90.6% (95% CI 80.2–95.8) with simplified and standard protocols, respectively. In the random screening setting, specificity was 97.9% (1727/1764, 95% CI 97.1–98.5) and 96.3% (95% CI 95.3–97.1), respectively. Inter-reader agreement was 0.68 and 0.54 with simplified and standard protocols, respectively. Median reading times with simplified protocol were 20% to 30% shorter than with standard protocol. Conclusions: A simplified protocol reduced reading time and false positives but may have a negative impact on sensitivity. Key Points: • The adoption of digital breast tomosynthesis (DBT) in screening, more sensitive than mammography, could be limited by its potential effect on the radiologists’ workload, i.e., increased reading time and fatigue. • A DBT simplified protocol with slab only, compared to a standard protocol (slab plus planes) both integrated with synthetic 2D, reduced time and false positives but had a negative impact on sensitivity.
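
The point estimates for the simplified protocol can be checked from the fractions given in the abstract. The sketch below also computes a plain Wilson score interval, which is an assumption made for illustration: it treats readings as independent, whereas the study estimated variance with repeated measures taken into account, so its intervals are not expected to match the published ones.

```python
from math import sqrt


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (assumes independent readings)."""
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


# Fractions reported in the abstract for the simplified (slab-only) protocol.
sensitivity = 53 / 64       # positive readings among readings of cancers
specificity = 1727 / 1764   # negative readings among readings of non-cancers

sens_low, sens_high = wilson_interval(53, 64)

print(f"sensitivity = {100 * sensitivity:.1f}%")
print(f"specificity = {100 * specificity:.1f}%")
print(f"naive Wilson interval for sensitivity: {100 * sens_low:.1f}% to {100 * sens_high:.1f}%")
```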

Image Quality of Thick Average Intensity Pixel Slabs Using Statistical Artifact Reduction in Breast Tomosynthesis

Conference Paper

Jun 2014

Digital Breast Tomosynthesis (DBT) has the potential to replace or supplement Digital Mammography (DM). Studies have shown that it takes radiologists more time to read DBT examinations compared with DM. The slice separation of image volumes has been set to 1 mm on most systems. By using thicker slices, review time could be reduced. This paper investigates the possibility of using 2 mm Average Intensity Pixel (AIP) slabs for image review. The thicker slabs were created using a method based on statistical artifact reduction and super-resolution. Six radiologists were presented with 20 sets of images containing 16 tumor masses and 8 micro-calcification clusters. They ranked the 2 mm slabbed sets relative to the standard 1 mm sets. Visibility of micro-calcifications improved (P = .0044) and there was no significant effect on mass visibility (P = .46). The results indicate that it is possible to review DBT volumes with 2 mm slabs without compromising image quality.

Methods for the Evaluation of Image Quality: A Review

Article

Aug 2000

Lars Gunnar Månsson

In medical imaging, information about the patient and possible abnormalities is transferred to the radiologist in two major steps: (i) data acquisition and image formation, and (ii) processing and display. Step one is mainly dependent on technical and physical characteristics of the equipment. Step two includes the vital importance of the performance of the radiologist; i.e. how he or she detects and interprets the structures in the image. The quality of a radiographical procedure must therefore be described with regard to both these steps. The spectrum of possible evaluation methods of importance will be described. The principles, benefits and drawbacks of some of these methods will be given together with examples of their use.

A software tool for increased efficiency in observer performance studies in radiology

Article

Feb 2005

Observer performance studies are time-consuming tasks, both for the participating observers and for the scientists collecting and analysing the data. A possible way to optimise such studies is to perform them in a completely digital environment. A software tool, ViewDEX (Viewer for Digital Evaluation of X-ray images), has been developed in Java, enabling it to function on almost any computer. ViewDEX is designed to handle several types of studies, such as visual grading analysis (VGA), image criteria scoring (ICS) and receiver operating characteristics (ROC). The results from each observer are saved in a log file, which can be exported for further analysis in, for example, dedicated software for analysing ROC results. By using ViewDEX for an ROC experiment, an evaluation rate of approximately 200 images per hour can be achieved, compared to approximately 25 images per hour using hard copy evaluation. The results are obtained within minutes of completion of the viewing. The risk of human errors in the process of data collection and analysis is also minimised. The viewer has been used in a major trial containing approximately 2700 images.

Visual grading characteristics (VGC) analysis: A non-parametric rank-invariant statistical method for image quality evaluation

Article

Apr 2007
BRIT J RADIOL

Visual grading of the reproduction of important anatomical structures is often used to determine clinical image quality in radiography. However, many visual grading methods incorrectly use statistical methods that require data belonging to an interval scale. The rating data from the observers in a visual grading study with multiple ratings is ordinal, meaning that non-parametric rank-invariant statistical methods are required. This paper describes such a method for determining the difference in image quality between two modalities, called visual grading characteristics (VGC) analysis. In a VGC study, the task of the observer is to rate his confidence about the fulfilment of image quality criteria. The rating data for the two modalities are then analysed in a manner similar to that used in receiver operating characteristics (ROC) analysis. The resulting measure of image quality is the VGC curve, which, for all possible thresholds of the observer for a fulfilled criterion, describes the relationship between the proportions of fulfilled image criteria for the two compared modalities. The area under the VGC curve is proposed as a single measure of the difference in image quality between two compared modalities. It is also described how VGC analysis can be applied to data from an absolute visual grading analysis study.
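
The construction described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the published method's implementation: the curve is taken as piecewise linear between empirical operating points, the area is computed with the trapezoidal rule, and the ratings, rating scale and function names are made up for the example.

```python
def vgc_points(ratings_ref, ratings_test, levels):
    """Operating points of the VGC curve, from the strictest threshold to the most lenient.

    Each point is (proportion fulfilled for the reference modality,
    proportion fulfilled for the test modality) at one threshold.
    """
    points = [(0.0, 0.0)]
    for threshold in sorted(levels, reverse=True):
        x = sum(r >= threshold for r in ratings_ref) / len(ratings_ref)
        y = sum(r >= threshold for r in ratings_test) / len(ratings_test)
        points.append((x, y))
    return points  # ends at (1.0, 1.0) because the lowest level is included


def area_under_curve(points):
    """Trapezoidal area under a piecewise-linear curve."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


levels = [1, 2, 3, 4, 5]                 # hypothetical 5-step ordinal scale
reference = [3, 3, 4, 2, 5, 3, 4, 3]     # made-up ratings, reference modality
test_mod = [4, 3, 5, 3, 5, 4, 4, 3]      # made-up ratings, test modality

auc = area_under_curve(vgc_points(reference, test_mod, levels))
print(f"area under VGC curve = {auc:.3f}")
```

An area of 0.5 corresponds to no difference between the modalities; values above 0.5 indicate that the test modality tends to receive higher ratings. Because only the ordering of the rating levels is used, the result does not depend on treating the scale as interval data.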

Thick Slices from Tomosynthesis Data Sets: Phantom Study for the Evaluation of Different Algorithms

Article

Nov 2007
J DIGIT IMAGING

Purpose: Tomosynthesis is a 3-dimensional mammography technique that generates thin slices, typically separated from one another by 1 mm, from source data sets. The relatively high image noise in these thin slices raises the value of 1-cm thick slices computed from the set of reconstructed slices for image interpretation. In an initial evaluation, we investigated the potential of different algorithms for generating thick slices from tomosynthesis source data: maximum intensity projection (MIP), an average algorithm (AV), and image generation by means of a new algorithm, so-called softMip. The three postprocessing techniques were evaluated using a homogeneous phantom with one textured slab with a total thickness of about 5 cm in which two 0.5-cm-thick slabs contained objects to simulate microcalcifications, spiculated masses, and round masses. The phantom was examined by tomosynthesis (GE Healthcare). Microcalcifications were simulated by inclusion of calcium particles of four different sizes. The slabs containing the inclusions were examined in two different configurations: adjacent to each other and close to the detector, and with the two slabs separated by two 1-cm thick breast equivalent material slabs. The reconstructed tomosynthesis slices were postprocessed using MIP, AV, and softMip to generate 1-cm thick slices with a lower noise level. The three postprocessing algorithms were assessed by calculating the resulting contrast versus background for the simulated microcalcifications and contrast-to-noise ratios (CNR) for the other objects. The CNRs of the simulated round and spiculated masses were most favorable for the thick slices generated with the average algorithm, followed by softMip and MIP. Contrast of the simulated microcalcifications was best for MIP, followed by softMip and average projections. Our results suggest that the additional generation of thick slices may improve the visualization of objects in tomosynthesis. This improvement differs among the algorithms for microcalcifications, spiculated objects, and round masses. SoftMip is a new approach that combines features of MIP and average, showing image properties in between those of MIP and AV.
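
The difference between the MIP and average approaches to thick-slice generation can be illustrated on a toy volume. This is a hedged sketch only: it assumes non-overlapping groups of consecutive slices, uses made-up pixel values and hypothetical function names, and leaves out softMip because the abstract does not define it.

```python
def make_slabs(volume, slices_per_slab, combine):
    """Merge consecutive slices of a volume (a list of 2D lists) into thicker slabs.

    `combine` maps the list of co-located pixel values to a single slab pixel.
    Leftover slices at the end of the volume form a final, thinner slab.
    """
    slabs = []
    for start in range(0, len(volume), slices_per_slab):
        group = volume[start:start + slices_per_slab]
        rows, cols = len(group[0]), len(group[0][0])
        slab = [[combine([s[r][c] for s in group]) for c in range(cols)]
                for r in range(rows)]
        slabs.append(slab)
    return slabs


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Toy volume: four slices, each a 1 x 3 pixel image (values are made up).
volume = [
    [[0, 10, 2]],
    [[4, 0, 2]],
    [[1, 1, 8]],
    [[3, 5, 0]],
]

mip_slabs = make_slabs(volume, 2, max)   # maximum intensity projection
av_slabs = make_slabs(volume, 2, mean)   # average intensity

print("MIP:", mip_slabs)
print("AV: ", av_slabs)
```

In this toy case the single bright pixel (value 10) keeps its full value in the MIP slab but is halved in the averaged slab, which is in line with the abstract's finding that MIP gave the best contrast for small bright objects while averaging gave the most favorable CNR for masses.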

Lecture Notes in Computer Science

Conference Paper

Jul 2012

We propose a method for out-of-plane artifact reduction in digital breast tomosynthesis reconstruction. Because of the limited angular range acquisition in DBT, the reconstructed slices have reduced resolution in z-direction and are affected by artifacts. The out-of-plane blur caused by dense tissue and large masses complicates reconstruction of thick-slice volumes. The streak-like out-of-plane artifacts caused by calcifications and metal clips distort the shape of calcifications, which many radiologists regard as an important malignancy predictor. Small clinical features such as micro-calcifications could be obscured by bright artifacts. The proposed technique involves reconstructing a set of super-resolution slices and predicting the artifact-free voxel intensity based on the corresponding set of projection pixels using a statistical model learned from a set of training data. Our experiments show that the resulting reconstructed images are de-blurred, streak-like artifacts are reduced, and the visibility of clinical features, contrast, and sharpness are improved; thick-slice reconstruction is possible without loss of contrast and sharpness.

Optimizing filtered backprojection reconstruction for a breast tomosynthesis prototype device - art. no. 61420F

Article

Mar 2006
Proceedings of SPIE

Digital breast tomosynthesis is a new technique intended to overcome the limitations of conventional projection mammography by reconstructing slices through the breast from projection views acquired from different angles with respect to the breast. We formulate a general theory of filtered backprojection reconstruction for linear tomosynthesis. The filtering step consists of an MTF inversion filter, a spectral filter, and a slice thickness filter. In this paper the method is applied first to simulated data to understand the basic effects of the various filtering steps. We then demonstrate the impact of the filter functions with simulated projections and with clinical data acquired with a research breast tomosynthesis system. With this reconstruction method the image quality can be controlled regarding noise and spatial resolution. In a wide range of spatial frequencies the slice thickness can be kept constant and artifacts caused by the incompleteness of the data can be suppressed.

ViewDEX - A Java-based software for presentation and evaluation of medical images in observer performance studies. - art. no. 65091R

Article

Mar 2007
Proceedings of SPIE

Observer performance studies are time-consuming tasks, both for the participating observers and for the scientists collecting and analyzing the data. A possible way to optimize such studies is to perform the study in a completely digital environment. A software tool - ViewDEX (Viewer for Digital Evaluation of X-ray images) - has been developed in Java, enabling it to function on almost any computer. ViewDEX is a DICOM-compatible software tool that can be used to display medical images with simultaneous registration of the observer's response. ViewDEX is designed so that the user in a simple way can alter the types of questions and images presented to the observers, enabling ROC, MAFC and visual grading studies to be conducted in a fast and efficient way. The software can also be used for bench marking and for educational purposes. The results from each observer are saved in a log file, which can be exported for further analysis. The software is freely available for non-commercial purposes.

Time to Diagnosis and Performance Levels during Repeat Interpretations of Digital Breast Tomosynthesis. Preliminary Observations

Article

Apr 2010

To compare time to interpretation and diagnostic performance levels during repeat readings of full-field digital mammography (FFDM) and digital breast tomosynthesis (DBT) in a retrospective study. Three experienced radiologists twice interpreted 125 selected examinations, 35 with verified cancers and 90 negative for cancer, during a period of 22 months using FFDM alone followed by a combined FFDM + DBT mode. Changes in time to "review and rate" these examinations as well as in diagnostic performance levels were assessed. A fixed-effect analysis accounting for cross-correlation due to the review of the same examinations by the same readers was performed. The total (combined) time to review and rate an examination increased on average by 33% between the first and second readings of the same examinations (P < .001). Radiologists reduced their time to review FFDM before making the DBT available for viewing. However, they spent more time reviewing the combined FFDM + DBT mode. The recall rates for examinations depicting cancer remained largely unchanged. Among the groups of examinations with concordant and discordant recall recommendations during the two readings, only the group of examinations that were "newly recalled" during repeat reading took significantly longer (P < .01). DBT-based breast imaging may ultimately result in a substantial increase in performance; however, without efficiency improvements DBT may take longer to interpret. Addition of "false-positive recalls" was most strongly associated with increase in interpretation time while elimination of "false-positive recalls" did not require longer interpretation time.

Digital Breast Tomosynthesis: Observer Performance Study

Article

Sep 2009
AM J ROENTGENOL

The purpose of this study was to compare in a retrospective observer study the diagnostic performance of full-field digital mammography (FFDM) with that of digital breast tomosynthesis. Eight experienced radiologists interpreted images from 125 selected examinations, 35 with verified findings of cancer and 90 with no finding of cancer. The four display conditions included FFDM alone, 11 low-dose projections, reconstructed digital breast tomosynthesis images, and a combined display mode of FFDM and digital breast tomosynthesis images. Observers rated examinations using the screening BI-RADS rating scale and the free-response receiver operating characteristic paradigm. Observer performance levels were measured as the proportion of examinations prompting recall of patients for further diagnostic evaluation. The results were presented in terms of true-positive fraction and false-positive fraction. Performance levels were compared among the acquisitions and reading modes. Time to view and interpret an examination also was evaluated. Use of the combination of digital breast tomosynthesis and FFDM was associated with 30% reduction in recall rate for cancer-free examinations that would have led to recall if FFDM had been used alone (p < 0.0001 for the participating radiologists, p = 0.047 in the context of a generalized population of radiologists). Use of digital breast tomosynthesis alone also tended to reduce recall rates, an average of 10%, although the observed decrease was not statistically significant (p = 0.09 for the participating radiologists). There was no convincing evidence that use of digital breast tomosynthesis alone or in combination with FFDM results in a substantial improvement in sensitivity. Use of digital breast tomosynthesis for breast imaging may result in a substantial decrease in recall rate.

Digital Breast Tomosynthesis: A Pilot Observer Study

Article

Apr 2008
AM J ROENTGENOL

The objective of our study was to assess ergonomic and diagnostic performance-related issues associated with the interpretation of digital breast tomosynthesis-generated examinations. Thirty selected cases were read under three different display conditions by nine experienced radiologists in a fully crossed, mode-balanced observer performance study. The reading modes included full-field digital mammography (FFDM) alone, the 11 low-dose projections acquired for the reconstruction of tomosynthesis images, and the reconstructed digital breast tomosynthesis examination. Observers rated cases under the free-response receiver operating characteristic, as well as a screening paradigm, and provided subjective assessments of the relative diagnostic value of the two digital breast tomosynthesis-based image sets as compared with FFDM. The time to review and diagnose each case was also evaluated. Observer performance measures were not statistically significant (p > 0.05) primarily because of the small sample size in this pilot study, suggesting that showing significant improvements in diagnosis, if any, will require a larger study. Several radiologists did perceive the digital breast tomosynthesis image set and the projection series to be better than FFDM (p < 0.05) for diagnosing this specific case set. The time to review, interpret, and rate the examinations was significantly different for the techniques in question (p < 0.05). Tomosynthesis-based breast imaging may have great potential, but much work is needed before its optimal role in the clinical environment is known.
