Flowchart of process for identification of artificial intelligence abstracts.

Source publication
Article
Full-text available
Artificial intelligence (AI) research is transforming the range of tools and technologies available to pathologists, leading to potentially faster, personalized and more accurate diagnoses for patients. However, to see these tools used for patient benefit, and to achieve this safely, the implementation of any algorithm must be underpinned by high quality...

Context in source publication

Context 1
... available from Modern Pathology via https://www.nature.com/modpathol/articles?type=abstracts-collection&year=2019. 27 Abstracts for the 31st European Congress of Pathology were available through Virchows Archiv via https://link.springer.com/article/10.1007/s00428-019-02631-8. 28 One reviewer (CM) identified abstracts using the process shown in Fig. 1. The documents were available as PDF files and were searched for key terms using the electronic search function. Additionally, manuscript titles were screened manually for potentially relevant, missed ...

Citations

... In digital pathology, high-resolution whole slide images (WSIs) of the entire tissue section are generated using specialized scanning devices, usually at 0.25 or 0.50 microns per pixel (MPP) corresponding to apparent magnification of 40x or 20x, respectively [10]. Digital pathology opens opportunities to employ image analysis algorithms and artificial intelligence techniques to aid pathologists in diagnosing and predicting outcomes in liver histopathology [11]. ...
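The resolution figures quoted in this excerpt follow a simple reciprocal relationship between scanning resolution and apparent magnification. A minimal sketch, assuming the common convention that 0.25 MPP corresponds to 40x (the function name is illustrative only, not part of any cited tool):

```python
def apparent_magnification(mpp: float) -> float:
    """Approximate apparent magnification from scanning resolution in
    microns per pixel (MPP), assuming 0.25 MPP corresponds to 40x."""
    return 10.0 / mpp

if __name__ == "__main__":
    for mpp in (0.25, 0.50, 1.0):
        print(f"{mpp:.2f} MPP ~ {apparent_magnification(mpp):.0f}x")
```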
... Machine learning-based and image analysis tools can improve quantitative tissue characterisation beyond human pathological evaluation by increasing reproducibility, identifying features associated with clinical outcomes, and providing a platform for rigorous and consistent assessment of disease regression following treatment [11]. In this study, we developed an open-source tool to quantitate two pathological features of interest in MASLD. ...
Preprint
Introduction: The histological assessment of liver biopsies by pathologists serves as the gold standard for diagnosing metabolic dysfunction-associated steatotic liver disease (MASLD) and staging disease progression. Various machine learning and image analysis tools have been reported to automate the quantification of fatty liver and enhance patient risk stratification. However, current software is either not open-source or not directly applicable to whole slide images (WSIs).

Methods: In this paper, we introduce "Liver-Quant," an open-source Python package designed for quantifying fat and fibrosis in liver WSIs. Employing colour and morphological features, Liver-Quant measures the Steatosis Proportionate Area (SPA) and Collagen Proportionate Area (CPA). The method's accuracy and robustness were evaluated using an internal dataset of 424 WSIs from adult patients, collected retrospectively from the archives at Leeds Teaching Hospitals NHS Trust between 2016 and 2022, and an external public dataset of 109 WSIs. For each slide, semi-quantitative scores were automatically extracted from free-text pathology reports. Furthermore, we investigated the impact of three different staining dyes, including Van Gieson (VG), Picro Sirius Red (PSR), and Masson's Trichrome (MTC), on fibrosis quantification.

Results: The Spearman rank coefficient (ρ) was calculated to assess the correlation between the computed SPA/CPA values and the semi-quantitative pathologist scores. For steatosis quantification, we observed a substantial correlation (ρ=0.92), while fibrosis quantification exhibited a moderate correlation with human scores (ρ=0.51). To assess the effect of stain variation on CPA measurement, we collected N=18 cases and applied the three stains. With stain normalisation, excellent agreement was observed in CPA measurements among the three stains using Bland-Altman plots. However, without stain normalisation, PSR emerged as the most effective dye due to its enhanced contrast in the Hue channel, displaying a strong correlation with human scores (ρ=0.9), followed by VG (ρ=0.8) and MTC (ρ=0.59). Additionally, we explored the impact of apparent magnification on SPA and CPA. High-resolution images collected at 0.25 microns per pixel (MPP) [apparent magnification = 40x] or 0.50 MPP [apparent magnification = 20x] were found to be essential for accurate SPA measurement, whereas for CPA measurement, low-resolution images collected at 10 MPP [apparent magnification = 1x] were sufficient.

Conclusion: The Liver-Quant package offers an open-source solution for rapid and precise liver quantification in WSIs applicable to multiple histological stains.
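The two outputs described above, SPA and CPA, are proportionate areas, i.e. the fraction of tissue occupied by fat or collagen, which the authors compare against pathologist grades using Spearman's rank correlation. A minimal sketch of both calculations, assuming pre-computed binary masks; the function name and all numbers are illustrative, not the package's actual API or results:

```python
import numpy as np
from scipy.stats import spearmanr

def proportionate_area(feature_mask: np.ndarray, tissue_mask: np.ndarray) -> float:
    """Percentage of tissue pixels occupied by a feature of interest,
    e.g. fat vacuoles (SPA) or collagen (CPA); masks are boolean arrays."""
    tissue_pixels = tissue_mask.sum()
    if tissue_pixels == 0:
        return 0.0
    return 100.0 * (feature_mask & tissue_mask).sum() / tissue_pixels

# Toy comparison of computed values against semi-quantitative pathologist grades.
spa_values = [2.1, 8.4, 15.0, 33.2, 41.7]      # computed SPA per slide (%), invented
pathologist_scores = [0, 1, 1, 2, 3]           # steatosis grades, invented
rho, p_value = spearmanr(spa_values, pathologist_scores)
print(f"Spearman rho = {rho:.2f}")
```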
... Instead, it highlighted that the literature consists of a vast number of small studies evaluating a single AI tool performing a narrowly defined task. In such studies, the technical accuracy of tools is well documented, using a wide range of robust performance metrics including sensitivity and specificity, positive and negative predictive values and area under the receiver operating characteristic curve, or area under the precision-recall curve 20. Calculating the area under each of these curves gives a single powerful metric that measures an ML tool's ability to differentiate between two binary groups 21. ...
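The metrics named in this excerpt are standard binary-classification measures. The sketch below shows how each can be computed with scikit-learn on toy data; the labels, scores and the 0.5 threshold are invented for illustration and are not taken from any of the studies discussed:

```python
# Hedged sketch: standard binary-classification metrics on toy labels and scores.
from sklearn.metrics import average_precision_score, confusion_matrix, roc_auc_score

y_true = [0, 0, 1, 1, 0, 1, 0, 1]                      # ground-truth labels
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.7]    # model probabilities
y_pred = [int(s >= 0.5) for s in y_score]              # hard calls at a 0.5 threshold

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)                  # true positive rate
specificity = tn / (tn + fp)                  # true negative rate
ppv = tp / (tp + fp)                          # positive predictive value
npv = tn / (tn + fn)                          # negative predictive value
auroc = roc_auc_score(y_true, y_score)        # area under the ROC curve
auprc = average_precision_score(y_true, y_score)  # area under the precision-recall curve

print(f"Sens {sensitivity:.2f}, Spec {specificity:.2f}, PPV {ppv:.2f}, NPV {npv:.2f}")
print(f"AUROC {auroc:.2f}, AUPRC {auprc:.2f}")
```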
Article
Full-text available
An increasing number of artificial intelligence (AI) tools are moving towards the clinical realm in histopathology and across medicine. The introduction of such tools will bring several benefits to diagnostic specialities, namely increased diagnostic accuracy and efficiency; however, as no AI tool is infallible, their use will inevitably introduce novel errors. These errors made by AI tools are, most fundamentally, misclassifications made by a computational algorithm. Understanding of how these translate into clinical impact on patients is often lacking, meaning true reporting of AI tool safety is incomplete. In this Perspective we consider AI diagnostic tools in histopathology, which are predominantly assessed in terms of technical performance metrics such as sensitivity, specificity and area under the receiver operating characteristic curve. Although these metrics are essential and allow tool comparison, they alone give an incomplete picture of how an AI tool's errors could impact a patient's diagnosis, management and prognosis. We instead suggest assessing and reporting AI tool errors from a pathological and clinical stance, demonstrating how this is done in studies on human pathologist errors, and giving examples where available from pathology and radiology. Although this seems a significant task, we discuss ways to move towards this approach in terms of study design, guidelines and regulation. This Perspective seeks to initiate broader consideration of the assessment of AI tool errors in histopathology and across diagnostic specialities, in an attempt to keep patient safety at the forefront of AI tool development and facilitate safe clinical deployment.
... Laboratory safety and process improvements are among the most complicated aspects a medical director needs to manage. Numerous initiatives tackle diagnostic quality at the international, national, regional, local, and individual test levels [39,51,105-107]. The terms safety culture and system-level thinking are clearly aimed at ensuring appropriate function for patients and clinicians; however, developing and sustaining the systems for reliable diagnostics require time, effort, and dedicated resources [27,39,52,57,80,92,95,97,108,109]. ...
... Second, the framework cannot replace qualified personnel or competent decision making in an ongoing operation. Third, numerous professional organizations have created tools, guidelines, and checklists [26,29,31,57,64,83,106,114]. For example, the Association for Molecular Pathology has introduced the concept of a laboratory-developed procedure (as opposed to a test) [57]. ...
Article
Full-text available
Background: Laboratory medicine has reached the era where promises of artificial intelligence and machine learning (AI/ML) seem palpable. Currently, the primary responsibility for risk-benefit assessment in clinical practice resides with the medical director. Unfortunately, there is no tool or concept that enables diagnostic quality assessment for the various potential AI/ML applications. Specifically, we noted that an operational definition of laboratory diagnostic quality - for the specific purpose of assessing AI/ML improvements - is currently missing.

Methods: A session at the 3rd Strategic Conference of the European Federation of Laboratory Medicine in 2022 on "AI in the Laboratory of the Future" prompted an expert roundtable discussion. Here we present a conceptual diagnostic quality framework for the specific purpose of assessing AI/ML implementations.

Results: The presented framework is termed diagnostic quality model (DQM) and distinguishes AI/ML improvements at the test, procedure, laboratory, or healthcare ecosystem level. The operational definition illustrates the nested relationship among these levels. The model can help to define relevant objectives for implementation and how levels come together to form coherent diagnostics. The affected levels are referred to as scope and we provide a rubric to quantify AI/ML improvements while complying with existing, mandated regulatory standards. We present 4 relevant clinical scenarios including multi-modal diagnostics and compare the model to existing quality management systems.

Conclusions: A diagnostic quality model is essential to navigate the complexities of clinical AI/ML implementations. The presented diagnostic quality framework can help to specify and communicate the key implications of AI/ML solutions in laboratory diagnostics.
... In previous work, our group published a study examining reporting of AI diagnostic accuracy studies in pathology conference abstracts which demonstrated that reporting was suboptimal, reporting guidance was not used or endorsed and that work was needed to address areas of potential bias in AI studies. 40 A study by Hogan et al. in 2020 examined reporting of diagnostic accuracy studies more generally in pathology journals and showed that better enforcement of reporting guideline use was needed, as incomplete reporting was prevalent. 41 Radiology shares some similarities with digital pathology, in terms of providing diagnoses and using medical imaging systems. ...
... To address these concerns, and given the issues described earlier with incomplete reporting, summarising and categorising the range of resources available may be helpful to those conducting and reporting studies within this field. 40,41 It must also be stated that whilst it will usually be very clear which is the most appropriate guideline to follow, reviewing the options and selecting the most appropriate guidance to fit the context may be required by the researcher. A selection of key recommended guidance for each research stage described is outlined in Table 8. ...
Preprint
Full-text available
The application of new artificial intelligence (AI) discoveries is transforming healthcare research. However, the standards of reporting are variable in this still evolving field, leading to potential research waste. The aim of this work is to highlight resources and reporting guidelines available to researchers working in computational pathology. The EQUATOR Network library of reporting guidelines and extensions was systematically searched up to August 2022 to identify applicable resources. Inclusion and exclusion criteria were used and guidance was screened for utility at different stages of research and for a range of study types. Items were compiled to create a summary for easy identification of useful resources and guidance. Over 70 published resources applicable to pathology AI research were identified. Guidelines were divided into key categories, reflecting current study types and target areas for AI research: Literature & Research Priorities, Discovery, Clinical Trial, Implementation and Post-Implementation & Guidelines. Guidelines useful at multiple stages of research and those currently in development were also highlighted. Summary tables with links to guidelines for these groups were developed, to assist those working in cancer AI research with complete reporting of research. Issues with replication and research waste are recognised problems in AI research. Reporting guidelines can be used as templates to ensure the essential information needed to replicate research is included within journal articles and abstracts. Reporting guidelines are available and useful for many study types, but greater awareness is needed to encourage researchers to utilise them and for journals to adopt them. This review and summary of resources highlights guidance to researchers, aiming to improve completeness of reporting.
... However, as the AI-driven diagnoses are not evidence-based, the pathologists are not necessarily provided with the reasoning for AI-generated diagnoses. 15 Moreover, the clinicians cannot interact with machines and their logic and therefore few opportunities exist for mutual learning. Such approaches could reflect a pervasive implicit assumption among some AI developers, who treat domain experts as "non-essential" and conceive their expertise as needed only in service of building and optimizing algorithms in a utilitarian manner. ...
Article
Full-text available
Pathology is a fundamental element of modern medicine that determines the final diagnosis of medical conditions, leads medical decisions, and portrays the prognosis. Due to continuous improvements in AI capabilities (e.g., object recognition and image processing), intelligent systems are bound to play a key role in augmenting pathology research and clinical practices. Despite the pervasive deployment of computational approaches in similar fields such as radiology, there has been less success in integrating AI in clinical practices and histopathological diagnosis. This is partly due to the opacity of end-to-end AI systems, which raises issues of interoperability and accountability of medical practices. In this article, we draw on interactive machine learning to take advantage of AI in digital pathology to open the black box of AI and generate a more effective partnership between pathologists and AI systems based on the metaphors of parameterization and implicitization.
Article
Background: Diagnostic neuroimaging plays an essential role in guiding clinical decision-making in the management of patients with cerebral aneurysms. Imaging technologies for investigating cerebral aneurysms constantly evolve, and clinicians rely on the published literature to remain up to date. Reporting guidelines have been developed to standardise and strengthen the reporting of clinical evidence. Therefore, it is essential that radiological diagnostic accuracy studies adhere to such guidelines to ensure completeness of reporting. Incomplete reporting hampers the reader's ability to detect bias, determine generalisability of study results or replicate investigation parameters, detracting from the credibility and reliability of studies.

Objective: The purpose of this systematic review was to evaluate adherence to the Standards for Reporting of Diagnostic Accuracy Studies (STARD) 2015 reporting guideline amongst imaging diagnostic accuracy studies for cerebral aneurysms.

Methods: A systematic search for cerebral aneurysm imaging diagnostic accuracy studies was conducted. Journals were cross-examined against the STARD 2015 checklist and their compliance with item numbers was recorded.

Results: The search yielded 66 articles. The mean number of STARD items reported was 24.2 ± 2.7 (71.2% ± 7.9%), with a range of 19 to 30 out of a maximum number of 34 items.

Conclusion: Taken together, these results indicate that adherence to the STARD 2015 guideline in cerebral aneurysm imaging diagnostic accuracy studies was moderate. Measures to improve compliance include mandating STARD 2015 adherence in instructions to authors issued by journals.
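The adherence figure reported above (24.2 of 34 items, i.e. 71.2%) is simply the mean of per-article checklist counts expressed as a percentage of the full STARD 2015 checklist. A toy illustration with invented per-article counts, shown only to make the arithmetic explicit:

```python
# Toy illustration of checklist-adherence arithmetic; counts are invented.
import statistics

STARD_ITEMS = 34
items_reported = [19, 22, 24, 25, 26, 30]   # hypothetical items reported per article

mean_items = statistics.mean(items_reported)
sd_items = statistics.stdev(items_reported)
print(f"Mean items reported: {mean_items:.1f} ± {sd_items:.1f} "
      f"({100 * mean_items / STARD_ITEMS:.1f}% of {STARD_ITEMS} items)")
```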
Article
Objective: This systematic review assesses the reporting quality and risk of bias in studies evaluating the diagnostic test accuracy (DTA) of clinical decision support systems (CDSS).

Study Design and Setting: The Cochrane Library, PubMed/MEDLINE, Scopus and Web of Science were searched for studies, published between January 1, 2016 and May 31, 2021, evaluating the DTA of CDSS for human patients. Articles using a patient's self-diagnosis, assessing disease severity, focusing on treatment/follow-up, or comparing pre-post CDSS implementation periods were excluded. All eligible studies were assessed for reporting quality using STARD 2015 and for risk of bias using QUADAS-2. Item ratings were presented using heat maps. This study was reported according to PRISMA-DTA.

Results: In total, 158 of 2,820 screened articles were included in the analysis. The studies were heterogeneous in terms of study characteristics, reporting quality, risk of biases, and applicability concerns with few highly rated studies. Mostly the overall quality was deficient for items addressing the domains 'methodology', 'results', and 'other information'.

Conclusion: Our analysis revealed shortcomings in critical domains of reporting quality and risk of bias, indicating the need for additional guidance and training in an interdisciplinary scientific field with mixed biostatistical expertise.