Confusion matrix of the best result for our three-class ViT with SPT model. Diagnostics 2023, 13.

Source publication
Article
One of the most common types of cancer among women is cervical cancer. Incidence and fatality rates are steadily rising, particularly in developing nations, due to a lack of screening facilities, experienced specialists, and public awareness. Visual inspection is used to screen for cervical cancer after the application of acetic acid (VIA), hist...

Citations

... More than 75% of images are classified correctly as the corresponding cervix type (54.0% Type 1 accuracy, 80.6% Type 2 accuracy, and 76.9% Type 3 accuracy), which is much better than the other classifiers on the same dataset [84] ...
Article
Cervical cancer is a major health concern worldwide, highlighting the urgent need for better early detection methods to improve outcomes for patients. In this study, we present a novel digital pathology classification approach that combines Low-Rank Adaptation (LoRA) with the Vision Transformer (ViT) model. This method is aimed at making cervix type classification more efficient through a deep learning classifier that does not require as much data. The key innovation is the use of LoRA, which allows for the effective training of the model with smaller datasets, making the most of the ability of ViT to represent visual information. This approach performs better than traditional Convolutional Neural Network (CNN) models, including Residual Networks (ResNets), especially when it comes to performance and the ability to generalize in situations where data are limited. Through thorough experiments and analysis on various dataset sizes, we found that our more streamlined classifier is highly accurate in spotting various cervical anomalies across several cases. This work advances the development of sophisticated computer-aided diagnostic systems, facilitating more rapid and accurate detection of cervical cancer, thereby significantly enhancing patient care outcomes.
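To make the low-rank idea in the abstract above concrete, here is a minimal sketch of a LoRA-style linear layer in plain Python. It is not the authors' implementation: the dimensions, the rank `r`, the scaling factor `alpha`, and the helper names `matvec`, `lora_forward`, and `trainable_params` are all assumptions chosen for illustration. The frozen weight `W` stays fixed, and only the two small matrices `A` and `B` would be trained.

```python
import random

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def lora_forward(W, A, B, x, alpha):
    """Compute W x + (alpha / r) * B (A x), where r is the rank (rows of A).

    W is the frozen pre-trained weight (d_out x d_in).
    A (r x d_in) and B (d_out x r) are the small trainable factors.
    """
    r = len(A)
    base = matvec(W, x)                # frozen path
    update = matvec(B, matvec(A, x))   # low-rank path
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]

def trainable_params(d_out, d_in, r):
    """Number of trainable values in A and B combined."""
    return r * (d_in + d_out)

# Hypothetical toy sizes, not taken from the paper.
random.seed(0)
d_in, d_out, r, alpha = 8, 6, 2, 4.0
W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]
B = [[0.0] * r for _ in range(d_out)]  # zero init: the layer starts unchanged
x = [random.gauss(0, 1) for _ in range(d_in)]

y0 = lora_forward(W, A, B, x, alpha)
```

Because `B` starts at zero, `y0` equals the output of the frozen layer alone, so fine-tuning begins from the pre-trained behaviour. For a hypothetical 768 x 768 projection with `r = 8`, `trainable_params(768, 768, 8)` gives 12,288 trainable values against 589,824 in the full matrix, which illustrates why such an adapter can be fitted with less data.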
... In this work, we developed a Transformer-based framework inspired by VATT [15] for phase recognition in laparoscopic surgery videos. This framework incorporates state-of-the-art methods such as Vision Transformer (ViT) [16] and Bidirectional Encoder Representations from Transformers (BERT) [17] to extract image and text embeddings from surgical videos and their text descriptions, respectively. The core of this framework is its use of two Transformer encoders, trained separately on different input modalities, to extract modality-specific representations (feature vectors) for each input modality. ...
... The Vision Transformer [16] is based on the Transformer architecture [21], which is a type of neural network that is particularly well-suited for processing sequential data. A pre-trained version of the BERT model [17] was used directly to extract the text embeddings. On the other hand, for image embedding extraction, a pre-trained version of ViT was fine-tuned. ...
Article
The potential role and advantages of artificial intelligence-based models in surgery remain uncertain. This research is an initial step towards a multimodal model, inspired by the Video-Audio-Text Transformer, that aims to reduce adverse events and enhance patient safety. The model employs state-of-the-art image and text embedding models (ViT and BERT) to assess their efficacy in extracting hidden and distinct features from surgery video frames. These features are then used as inputs for convolution-free Transformer architectures to extract comprehensive multidimensional representations. A joint space is then used to combine the text and image features extracted by the two Transformer encoders. This joint space ensures that the relationships between the different modalities are preserved during the combination process. The entire model was trained and tested on laparoscopic cholecystectomy (LC) videos of varying complexity. Experimentally, the model reached a mean accuracy of 91.0%, a precision of 81%, and a recall of 83% when tested on 30 of the 80 videos in the Cholec80 dataset.
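As a reminder of how accuracy, precision, and recall such as those quoted above relate to a confusion matrix, the following sketch computes them for a small multi-class example. The counts are hypothetical and are not results from the paper, and macro-averaging over classes is an assumption, since the abstract does not state which averaging scheme was used. The helper name `metrics_from_confusion` is also illustrative.

```python
def metrics_from_confusion(cm):
    """Return (accuracy, macro precision, macro recall).

    cm[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))
    precisions, recalls = [], []
    for k in range(n):
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum
        precisions.append(cm[k][k] / predicted_k if predicted_k else 0.0)
        recalls.append(cm[k][k] / actual_k if actual_k else 0.0)
    return correct / total, sum(precisions) / n, sum(recalls) / n

# Hypothetical three-class counts, for illustration only.
cm = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 2, 8],
]
accuracy, macro_precision, macro_recall = metrics_from_confusion(cm)
```

Rows are read as true classes and columns as predicted classes; swapping that convention swaps precision and recall, so it is worth checking which orientation a given figure uses.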
Article
Background: Cervical cancer is one of the most common malignant tumors in the world, and it is the fourth leading cause of cancer in women. Morbidity and mortality from cervical cancer are distinctly higher in developing countries than in developed countries. Computer-assisted diagnosis is key for scaling up cervical cancer screening, but current algorithms perform poorly on whole slide image analysis and generalization. Aim: This study aims to describe computer-assisted diagnosis in diagnostic cervical cancer imaging. Methods: This study followed the standards set by the Preferred Reporting Items for Systematic Review and Meta-Analysis (PRISMA) 2020 and met all of their requirements, which helped ensure that the study was as up to date as possible. The search covered publications that came out between 2014 and 2024 and used several online reference sources, including PubMed and SagePub. Review articles, previously published works, and incomplete works were excluded. Results: Our search returned 153 articles in PubMed and 180 articles in SagePub. The search conducted for the last year of 2014 yielded a total of 53 articles for PubMed and 72 articles for SagePub. Title screening left a total of 18 articles for PubMed and 27 articles for SagePub. In the end, we compiled a total of 10 papers. We included five studies that met the criteria. Conclusion: Computer-assisted medical diagnosis can successfully complete a variety of medical tasks by efficiently exploring the essence of a large amount of clinical data. Colposcopy-guided cervical biopsy is essential for detecting CIN in cervical cancer screening, but increasing its sensitivity globally remains difficult.