Article

ChatGPT and conversational artificial intelligence: Friend, foe, or future of research?


Abstract

Artificial intelligence (AI) and machine learning are increasingly utilized across healthcare. More recently, there has been a rise in the use of AI within research, particularly through novel conversational AI platforms, such as ChatGPT. In this Controversies paper, we discuss the advantages, limitations, and future directions for ChatGPT and other forms of conversational AI in research and scholarly dissemination.


... AI-driven mechanisms can offer flexibility and possibilities that help teachers reduce their cognitive load and stress by saving time and energy and offering suggestions tailored to teachers' needs. Further, there have been discussions about the stress-reducing and well-being-enhancing functions of AI-based protocols, along with debates concerning the negative aspects of overreliance on AIs (Gottlieb et al., 2023). AI-based protocols also have potential drawbacks, such as creating more problems for teachers who overuse them or lack digital literacy, and confusing learners in their language learning process. ...
... ChatGPT can comprehend natural, slang, and idiomatic language and provide coherent and contextually appropriate responses to its users. The user-friendliness, flexibility, versatility, and language-processing capabilities of ChatGPT were among the reasons why the platform attracted over 1 million users within 5 days of its launch in November 2022 (Gottlieb et al., 2023; Ray, 2023). Introducing itself, ChatGPT indicates that its functionality ranges from answering questions, offering suggestions, and assisting people with their language-related needs to translation, writing assistance, language learning support, and idea generation. ...
... Early investigations suggest that ChatGPT and similar AIs can play four roles in educational contexts: material suppliers, interlocutors, assessors, and assistants (Jeon & Lee, 2023). Creating written content, translating, generating study documents, and disseminating knowledge are possible benefits of AI usage (Gottlieb et al., 2023). Contextualizing ChatGPT in healthcare education, Sallam (2023) noted that AI boosts critical thinking and personalized learning while saving time and money. ...
Article
Language teaching is a highly emotional profession that can affect teachers' well-being and learners' achievement. However, studies have yet to explore the potential of positive psychology interventions and artificial intelligence (AI) tools to promote the psycho-emotional aspects of second language (L2) teachers and learners, and research on the effectiveness of AI in promoting learners' language skills remains limited. Responding to these gaps, the researchers chose ChatGPT, an AI-powered chatbot capable of generating natural and coherent texts, as a potential tool to foster positive emotions and interactions between Iranian English language teachers (n = 12) and learners (n = 48) in the L2 writing context. We operationalized ChatGPT in a three-phased writing instruction protocol (CGWIP): (1) a planning phase, where teachers used ChatGPT to brainstorm ideas and generate outlines for each session; (2) an instruction phase, where teachers used ChatGPT to engage the learners in the writing process and to analyse and reflect on their drafts; and (3) an assessment phase, where teachers used ChatGPT to simulate the IELTS writing exam and provide detailed and constructive feedback to the learners. We further tested the effectiveness of CGWIP on teachers' self-efficacy and learners' writing skills before and after a 10-week instruction program. The Independent Samples t-test results showed that CGWIP significantly enhanced teachers' self-efficacy compared to the control group. Also, the results of a One-Way ANCOVA revealed that CGWIP significantly improved learners' writing skills and that these effects persisted over time. The study implies that the protocol can nurture teachers' efficacy by helping them in various aspects of L2 writing instruction, including brainstorming, revising, providing feedback, and assessment, which, in turn, improves learners' writing skills.
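The between-group comparison reported above (an independent-samples t-test on teachers' self-efficacy) can be illustrated with a minimal sketch. The scores below are invented for illustration only and are not the study's data; the study's actual scale and sample sizes differ.

```python
from statistics import mean, stdev

def welch_t(sample_a, sample_b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    va, vb = stdev(sample_a) ** 2, stdev(sample_b) ** 2
    na, nb = len(sample_a), len(sample_b)
    return (mean(sample_a) - mean(sample_b)) / (va / na + vb / nb) ** 0.5

# Hypothetical post-test self-efficacy scores (0-100); NOT the study's data.
cgwip_group = [78, 82, 75, 88, 80, 85]
control_group = [70, 68, 74, 66, 72, 71]

t_stat = welch_t(cgwip_group, control_group)
```

A positive t statistic favors the CGWIP group; in practice the statistic would then be compared against a t distribution with the appropriate degrees of freedom to obtain the significance level the abstract reports.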
... 53 records were included in the dataset, encompassing 23 original articles (theoretical or empirical work), 11 letters,60-70 six editorials,71-76 four reviews,8,77-79 three comments,22,80,81 one report82 and five unspecified articles.83-87 Most works focus on applications utilizing ChatGPT across various healthcare fields, as indicated in Table II. During analysis, four general themes emerged in our dataset, which we use to structure reporting. ...
... Gottlieb et al. 85 Using Conversational AI to create study documents by translating complex concepts into simpler ones or designing informed consent documents for patients. ...
... 8,83 To certain authors, this could involve condensing crucial aspects of their work, like crafting digestible research documents for ethics reviews or consent forms. 85 However, LLMs' capacities are also critically examined, with Tang et al. emphasizing ChatGPT's tendency to produce attribution and misinterpretation errors, potentially distorting original source information, echoing concerns over interpretability, reproducibility, uncertainty handling, and transparency. ...
Preprint
Full-text available
Background: With the introduction of ChatGPT, Large Language Models (LLMs) have received enormous attention in healthcare. Despite their potential benefits, researchers have underscored various ethical implications. While individual instances have drawn much attention, the debate lacks a systematic and comprehensive overview of practical applications currently researched and ethical issues connected to them. Against this background, this work aims to map the ethical landscape surrounding the current stage of deployment of LLMs in medicine and healthcare. Methods: Electronic databases and commonly used preprint servers were queried using a comprehensive search strategy which generated 796 records. Studies were screened and extracted following a modified rapid review approach. Methodological quality was assessed using a hybrid approach. For 53 records, a meta-aggregative synthesis was performed. Results: Four general fields of applications emerged and testify to a vivid phase of exploration. Advantages of using LLMs are attributed to their capacity in data analysis, personalized information provisioning, and support in decision-making, as well as to mitigating information loss and enhancing medical information accessibility. However, our study also identifies recurrent ethical concerns connected to fairness, bias, non-maleficence, transparency, and privacy. A distinctive concern is the tendency to produce harmful misinformation or convincing but inaccurate content. A recurrent plea for ethical guidance and human oversight is evident. Discussion: Given the variety of use cases, it is suggested that the ethical guidance debate be reframed to focus on defining what constitutes acceptable human oversight across the spectrum of applications. This involves considering the diversity of settings, varying potentials for harm, and different acceptable thresholds for performance and certainty in diverse healthcare settings.
In addition, a critical inquiry is necessary to determine the extent to which the current experimental use of LLMs is both necessary and justified.
... 6,7 Inevitably, it has sparked numerous discussions on the objectives and ethics of employing LLMs for research purposes. 8 Given that this form of computing is here to stay and will likely only become more advanced and ubiquitous, it is imperative to analyze the potentials, obstacles, and future directions of GPT-4 for research, with an emphasis on medical research. ...
... 10 • The neural machine translation algorithm behind GPT-4 can modify the outcomes for researchers who desire to publish in a non-native language to improve communication fluency, vocabulary, and syntax at a minimal cost. 8 • In addition to finding lacunae in the existing literature across multiple databases to guide future investigations, GPT-4 could likewise be employed to detect notions in the literature for theoretical articles and narrative reviews. 11 ...
... GPT-4 can translate full articles into several languages, promptly increasing an article's worldwide accessibility, which may lead to more journal citations. 8 • LLMs might compose the abstract of an academic paper from the finished manuscript. 3 ...
Article
In its broadest sense, Artificial Intelligence (AI) describes any machine or computer capable of carrying out operations that normally demand human intellect, such as comprehension, perception, problem-solving, and judgement. In the emerging discipline of generative AI, several large language models (LLMs) have evolved as promising tools in the recent decade. ChatGPT is an advanced development in Large Language Model (LLM) technology that employs Deep Learning (DL) techniques to generate human-like responses to natural language inputs. ChatGPT is among the most extensive language models made accessible to the public and belongs to the family of OpenAI's Generative Pre-trained Transformer (GPT) models. With the help of an extensive text database, ChatGPT can produce reasonable and contextually relevant responses to a wide range of inquiries by comprehending the intricacies and complexity of human language and holding interactive discussions. In March 2023, OpenAI introduced GPT-4 as the latest version of the fine-tuned ChatGPT, which can execute an array of real-world tasks significantly faster than an individual by replicating distinct human cognitive abilities, including rapid computation, scientific reasoning, visuospatial ability, memory, image analysis, and comprehension proficiencies.
... It could generate false or fabricated information, and ChatGPT's training data extend only to 2021 (Gottlieb, 2023). ...
... The first central theme is that ChatGPT needs to abide by codes of ethics and guidelines. Specifically, ChatGPT was listed in some sources as a tool that carries risks of plagiarism, loss of integrity, and over-reliance (Lund et al., 2023; Sallam, 2023; Hosseini et al., 2023; Gottlieb, 2023; Rahimi et al., 2023; Mijwil et al., 2023; Sok, 2023). The cited references recognize that ChatGPT presents noteworthy prospects for educational institutions to furnish researchers with a diverse range of research avenues (Hong, 2023). ...
Article
Full-text available
This paper reports the findings of a systematic review of research management practices in Higher Learning Institutions in Tanzania through the use of Artificial Intelligence (AI) technologies such as ChatGPT. AI technologies have gained significant popularity in recent times. However, their integration into academic settings raises concerns, especially in terms of potential ethical considerations. The systematic review used the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines to retrieve English records in Google Scholar under the phrase "ChatGPT in research". Eligibility criteria included published research papers on ChatGPT and research practices. A total of 28 documents were retrieved, of which only 20 met the inclusion criteria after full screening. The findings indicate that setting a code of ethics for using AI is paramount, and that further research is needed to gain detailed insights into this new technology. It was concluded that ChatGPT use in research has to be validated with other methods.
... [Figure 2] illustrates the pros and cons of AI in research and analysis, leaving open to debate whether open AI is a boon or a bane to the scientific community. [6,42-44] As a useful guide to comprehending the features which can limit confidence in the contemporary literature, Moore et al., alongside narrating the multitude of relevant factors, present a context-appropriate classification of these into the "3Fs", culminating in research that is either (F)lawed, (F)utile, or (F)abricated [Table 2]. [45] FUTURE DIRECTIONS: The quintessential altercation regarding the peak of the evidence pyramid continues to intensify, with some arguing in favor of well-conducted RCTs mimicking real-world situations better than any other study design, thus minimizing the likelihood of confounding. ...
... [46] We believe the "3Es" would be instrumental in achieving this pinnacle, i.e., care adequately backed by robust (E)vidence, physician (E)xperience, and meeting the patient (E)xpectations. [Table 2 excerpt: Flawed research is prone to imperfections/errors, owing to bias in the study, a low standard of study design and execution, and the risk of carrying flaws forward into SRs and MAs; Futile research is unnecessary, irrelevant, and adds no value.] ...
Article
Full-text available
Evidence-based medicine (EBM) undeniably classifies as a pre-eminent advance in the clinical approach to decision-making. Although EBM as a topic has been discussed at length, it is the process of integrating EBM into practice where the actual debate becomes even more interesting, with unique roadblocks cropping up at the very end of the translational highway. While the core concept of EBM has stood firm over decades, the research landscape and the corresponding intricacies continue to evolve at a rather rampant pace. Evidence-based practice is thus best elaborated in close conjunction with the recent advent of precision medicine, the impact of the coronavirus disease 2019 pandemic, and the ever-compounding present-age research concerns. In this reference, randomized controlled trials and now meta-analyses (second-order analyses of analyses) are also being increasingly scrutinized for their contextual veracities and for how the quality of the former can be rendered more robust to strengthen our epic pyramid of EBM. Notwithstanding, the index narrative article is a modern-day take on EBM keeping abreast of the evolving opportunities and challenges, with the noble objective of deliberating a standpoint that aims to potentially bridge some of the existing gaps in the translation of research to patient care and outcome improvement at large. Keywords: Coronavirus disease 2019, Evidence-based medicine, Meta-analysis, Precision medicine, Randomized controlled trials, Systematic reviews, Research
... malignant disorders (OPMD) or oral squamous cell carcinoma (OSCC) [7]. Cancerophobic individuals, or patients diagnosed with or treated for OSCC who are psychologically distressed, may also gain from ChatGPT interactions, though it could also be counterproductive in aggravating their psychological issues [8]. ...
... For some scholars, ChatGPT and similar AIs mark the start of an era where technology precedes pedagogy (Al-Kadi, 2018). In other words, the ever-updating nature of ChatGPT and AI-wired mechanisms challenges the existing educational discourse by posing the idea that AI could replace teachers (Ausat, Massang, Efendi, Nofirman, & Riady, 2023); however, some believe that the use of ChatGPT should be controlled and structured (Gottlieb, Kline, Schneider, & Coates, 2023; Perry, 2021). ...
Article
Language education as a dynamic field of study requires constant innovation to meet L2 needs in the classroom milieu. Additionally, the surge of technological advancement signals the significance of studying the possible beneficial roles of artificial intelligence in language teaching and learning, an area in which the field is still in its infancy. In this vein, the present study examined the effectiveness of a four-staged ChatGPT-based rapport-building protocol (CGRBP) on teacher-student rapport and L2 grit, not only to provide non-correlational evidence for L2 emotion studies, but also to link the realm of artificial intelligence with positive psychology in order to find practical ways of cultivating an emotionally supportive learning context. To do so, 30 intermediate-level Iranian EFL learners participated in experimental (n = 15) and control (n = 15) groups in a 16-week instruction program. Data gathered from a pre-test post-test experimental design were analyzed by One-Way ANCOVA, and the analyses showed that students who were taught English through CGRBP outperformed the students in the control group on L2 grit. The results verified the mediating role of CGRBP in the L2 context by suggesting that the application of a well-structured and staged ChatGPT-based instruction would possibly lead to enhanced L2 grit. Since grit is an integral part of one's positive psycho-emotional network, several theoretical and pedagogical implications are discussed and directions for future exploration suggested.
... ChatGPT is an advanced large language model that employs machine learning algorithms to generate text of near-human quality on a multitude of topics. 5 Recent iterations have excelled in knowledge benchmarks such as the Uniform Bar Examination and even produced academic writing virtually indistinguishable from human-authored work. 6,7 It is yet to be determined if ChatGPT could aid in crafting LORs, particularly in high-stakes contexts like faculty promotion. ...
Article
Full-text available
Objectives Letters of recommendation (LORs) are essential within academic medicine, affecting a number of important decisions regarding advancement, yet these letters take significant amounts of time and labor to prepare. The use of generative artificial intelligence (AI) tools, such as ChatGPT, is gaining popularity for a variety of academic writing tasks and offers an innovative solution to relieve the burden of letter writing. It is yet to be determined if ChatGPT could aid in crafting LORs, particularly in high-stakes contexts like faculty promotion. To determine the feasibility of this process and whether there is a significant difference between AI- and human-authored letters, we conducted a study aimed at determining whether academic physicians can distinguish between the two. Methods A quasi-experimental study was conducted using a single-blind design. Academic physicians with experience in reviewing LORs were presented with LORs for promotion to associate professor, written by either humans or AI. Participants reviewed LORs and identified the authorship. Statistical analysis was performed to determine accuracy in distinguishing between human- and AI-authored LORs. Additionally, the perceived quality and persuasiveness of the LORs were compared based on suspected and actual authorship. Results A total of 32 participants completed letter review. The mean accuracy of distinguishing between human- versus AI-authored LORs was 59.4%. The reviewer's certainty and time spent deliberating did not significantly impact accuracy. LORs suspected to be human-authored were rated more favorably in terms of quality and persuasiveness. A difference in gender-biased language was observed in our letters: human-authored letters contained significantly more female-associated words, while the majority of AI-authored letters tended to use more male-associated words. Conclusions Participants were unable to reliably differentiate between human- and AI-authored LORs for promotion.
AI may be able to generate LORs and relieve the burden of letter writing for academicians. New strategies, policies, and guidelines are needed to balance the benefits of AI while preserving integrity and fairness in academic promotion decisions.
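Whether a 59.4% mean accuracy is meaningfully different from the 50% expected by pure guessing can be illustrated with an exact binomial tail probability. This is a deliberate simplification of the paper's analysis: it treats the figure as roughly 19 correct calls out of 32 independent judgments, which is our assumption for illustration, not the study's design.

```python
from math import comb

def binomial_tail(successes, n, p=0.5):
    """P(X >= successes) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(successes, n + 1))

# 59.4% of 32 judgments is about 19 correct; chance level is p = 0.5.
p_value = binomial_tail(19, 32)
```

The resulting one-sided tail probability is well above 0.05, which is consistent with the paper's conclusion that reviewers could not reliably tell the letters apart.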
... The language model ChatGPT has taken the world by storm with its impressive conversational abilities and has rapidly become a widely used tool [5]. ChatGPT can engage in human-like interactions, providing an opportunity for personalized and discreet advice [6]. ...
... In recent years, the influence of AI has expanded across various aspects of human life, with ChatGPT emerging as a widely used and powerful tool. While ChatGPT has been helpful in healthcare education and research [11,32,33], concerns about the reliability and accuracy of data, particularly in nephrology, have arisen [34,35]. This study aims to evaluate the effectiveness of ChatGPT in identifying authentic references for literature reviews in the various fields of nephrology and to determine the accuracy of each component in nephrology and specific nephrology areas provided by ChatGPT. ...
Article
Full-text available
Literature reviews are valuable for summarizing and evaluating the available evidence in various medical fields, including nephrology. However, identifying and exploring the potential sources requires focus and time devoted to literature searching for clinicians and researchers. ChatGPT is a novel artificial intelligence (AI) large language model (LLM) renowned for its exceptional ability to generate human-like responses across various tasks. However, whether ChatGPT can effectively assist medical professionals in identifying relevant literature is unclear. Therefore, this study aimed to assess the effectiveness of ChatGPT in identifying references for literature reviews in nephrology. We keyed the prompt “Please provide the references in Vancouver style and their links in recent literature on… name of the topic” into ChatGPT-3.5 (03/23 Version). We selected all the results provided by ChatGPT and assessed them for existence, relevance, and author/link correctness. We recorded each resource’s citations, authors, title, journal name, publication year, digital object identifier (DOI), and link. The relevance and correctness of each resource were verified by searching on Google Scholar. Of the total 610 references in the nephrology literature, only 378 (62%) provided by ChatGPT existed, while 31% were fabricated and 7% were incomplete. Notably, only 122 (20%) of references were authentic. Additionally, 256 (68%) of the links in the references were found to be incorrect, and the DOI was inaccurate in 206 (54%) of the references. Moreover, among those with a link provided, the link was correct in only 20% of cases, and 3% of the references were irrelevant. Notably, an analysis of specific topics in electrolytes, hemodialysis, and kidney stones found that >60% of the references were inaccurate or misleading, with less reliable authorship and links provided by ChatGPT.
Based on our findings, the use of ChatGPT as a sole resource for identifying references to literature reviews in nephrology is not recommended. Future studies could explore ways to improve AI language models’ performance in identifying relevant nephrology literature.
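The percentages above mix two denominators, which a quick arithmetic check makes explicit. The counts are those reported in the abstract; the inference that the link and DOI percentages are taken over the existing references rather than all 610 is ours.

```python
def pct(part, whole):
    """Percentage rounded to the nearest integer."""
    return round(100 * part / whole)

# Counts reported in the abstract.
total_refs = 610   # references produced across nephrology topics
existing = 378     # references that actually exist
authentic = 122    # references judged authentic
wrong_link = 256   # references with an incorrect link
wrong_doi = 206    # references with an inaccurate DOI

share_existing = pct(existing, total_refs)    # 62% of all references
share_authentic = pct(authentic, total_refs)  # 20% of all references
# The link and DOI figures match the abstract only when computed over
# the 378 existing references, not all 610:
share_wrong_link = pct(wrong_link, existing)  # 68%
share_wrong_doi = pct(wrong_doi, existing)    # 54%
```

Keeping track of the shifting denominator matters when comparing such error rates across studies: 68% of existing references is a much smaller absolute count than 68% of all references would be.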
... As the capabilities of these technologies grow exponentially, their potential applications in diverse fields warrant in-depth exploration. The academic review process within public administration research is among such fields ripe for re-imagination and innovation (Gottlieb et al., 2023). ...
Preprint
Full-text available
In the ever-evolving landscape of academia, artificial intelligence (AI) presents promising opportunities for enhancing the academic review process. In this study, we evaluated the proficiency of Bard and GPT-4, two of the most advanced AI models, in conducting academic reviews. Bard and GPT-4 were compared to human reviewers, highlighting their capabilities and potential areas for improvement. Through a mixed-methods approach of quantitative scoring and qualitative thematic analysis, we observed a consistent performance of the AI models surpassing human reviewers in comprehensibility, clarity of review, the relevance of feedback, and accuracy of technical assessments. Qualitative analysis revealed nuanced proficiency in evaluating structure, readability, argumentation, narrative coherence, attention to detail, data analysis, and implications assessment. While Bard exhibited exemplary performance in basic comprehension and feedback relevance, GPT-4 stood out in detailed analysis, showcasing impressive attention to minor discrepancies and meticulous scrutiny. The results underscore the potential of AI as an invaluable tool in the academic review process, capable of complementing human reviewers to improve the quality, efficiency, and effectiveness of reviews. However, we also identified areas where human reviewers excel, particularly in understanding complex academic language and intricate logical progressions, offering crucial insights for future AI model training and development.
Article
Classroom teachers are usually responsible for creating materials to meet students' needs and course requirements. The arrival of generative AI, such as ChatGPT, offers EFL teachers an opportunity to regularly collaborate with AI chatbots to create new source materials. The writing experiments in this study considered how to effectively and ethically work with generative AI to produce culturally appropriate EFL teaching materials. The experiments involved co-producing moral dilemmatic stories with ChatGPT to support a new Chinese EFL curriculum unit on “Morals and Virtues”. We draw on Lo's (2023a, 2023b) CLEAR framework for prompt engineering to generate appropriate stimulus materials. We refer to Durkheim's (1961) sociological theory on morality and Bernstein's (1971, 1981, 2003) concept of framing to unpack how moral decisions might be framed as dilemmas in particular sociocultural contexts. We then mobilize Martin and White's (2005) appraisal framework to uncover the patterns of cultural biases embedded in the AI-generated text. We propose a two-step “Navigation and Generation” method for effective prompt engineering with generative AI: first navigating the AI chatbot to a clear and consistent positioning around a concept, and then requesting the AI chatbot to generate text based on the clarifications generated in the first step. Our appraisal analysis indicates that WEIRD (western, educated, industrial, rich, and democratic) cultural values are embedded in moral dilemmatic stories generated by ChatGPT. EFL Teachers need to be aware of how these values are presented to encourage critical cultural awareness in their students. [Full text (open access) available at: https://doi.org/10.1016/j.caeai.2024.100223]
Article
Background Artificial intelligence (AI), more specifically large language models (LLMs), holds significant potential in revolutionizing emergency care delivery by optimizing clinical workflows and enhancing the quality of decision-making. Although enthusiasm for integrating LLMs into emergency medicine (EM) is growing, the existing literature is characterized by a disparate collection of individual studies, conceptual analyses, and preliminary implementations. Given these complexities and gaps in understanding, a cohesive framework is needed to comprehend the existing body of knowledge on the application of LLMs in EM. Objective Given the absence of a comprehensive framework for exploring the roles of LLMs in EM, this scoping review aims to systematically map the existing literature on LLMs’ potential applications within EM and identify directions for future research. Addressing this gap will allow for informed advancements in the field. Methods Using PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) criteria, we searched Ovid MEDLINE, Embase, Web of Science, and Google Scholar for papers published between January 2018 and August 2023 that discussed LLMs’ use in EM. We excluded other forms of AI. A total of 1994 unique titles and abstracts were screened, and each full-text paper was independently reviewed by 2 authors. Data were abstracted independently, and 5 authors performed a collaborative quantitative and qualitative synthesis of the data. Results A total of 43 papers were included. Studies were predominantly from 2022 to 2023 and conducted in the United States and China. 
We uncovered four major themes: (1) clinical decision-making and support was highlighted as a pivotal area, with LLMs playing a substantial role in enhancing patient care, notably through their application in real-time triage, allowing early recognition of patient urgency; (2) efficiency, workflow, and information management demonstrated the capacity of LLMs to significantly boost operational efficiency, particularly through the automation of patient record synthesis, which could reduce administrative burden and enhance patient-centric care; (3) risks, ethics, and transparency were identified as areas of concern, especially regarding the reliability of LLMs’ outputs, and specific studies highlighted the challenges of ensuring unbiased decision-making amidst potentially flawed training data sets, stressing the importance of thorough validation and ethical oversight; and (4) education and communication possibilities included LLMs’ capacity to enrich medical training, such as through using simulated patient interactions that enhance communication skills. Conclusions LLMs have the potential to fundamentally transform EM, enhancing clinical decision-making, optimizing workflows, and improving patient outcomes. This review sets the stage for future advancements by identifying key research areas: prospective validation of LLM applications, establishing standards for responsible use, understanding provider and patient perceptions, and improving physicians’ AI literacy. Effective integration of LLMs into EM will require collaborative efforts and thorough evaluation to ensure these technologies can be safely and effectively applied.
Chapter
Full-text available
How do artificial neural networks and other forms of artificial intelligence interfere with methods and practices in the sciences? Which interdisciplinary epistemological challenges arise when we think about the use of AI beyond its dependency on big data? Not only the natural sciences, but also the social sciences and the humanities seem to be increasingly affected by current approaches of subsymbolic AI, which master problems of quality (fuzziness, uncertainty) in a hitherto unknown way. But what are the conditions, implications, and effects of these (potential) epistemic transformations and how must research on AI be configured to address them adequately?
Article
Full-text available
Background: ChatGPT is an artificial intelligence-based tool developed by OpenAI (California, USA). This systematic review examines the potential of ChatGPT in patient care and its role in medical research. Methods: The systematic review was done according to the PRISMA guidelines. Embase, Scopus, PubMed, and Google Scholar databases were searched. We also searched preprint databases. Our search aimed to identify all kinds of publications, without any restrictions, on ChatGPT and its application in medical research, medical publishing, and patient care. We used the search term "ChatGPT". We reviewed all kinds of publications, including original articles, reviews, editorials/commentaries, and even letters to the editor. Each selected record was analysed using ChatGPT, and the responses generated were compiled into a table. The Word table was converted into a PDF and further analysed using ChatPDF. Results: We reviewed the full texts of 118 articles. ChatGPT can assist with patient enquiries, note writing, decision-making, trial enrolment, data management, decision support, research support, and patient education. But the solutions it offers are usually insufficient and contradictory, raising questions about their originality, privacy, correctness, bias, and legality. Due to its lack of human-like qualities, ChatGPT's legitimacy as an author is questioned when it is used for academic writing. ChatGPT-generated content raises concerns about bias and possible plagiarism. Conclusion: Although it can help with patient treatment and research, there are issues with accuracy, authorship, and bias. ChatGPT can serve as a "clinical assistant" and be a help in research and scholarly writing.
Article
Full-text available
We reflect on our experiences of using ChatGPT, a Generative Pre-trained Transformer-based chatbot launched by OpenAI in November 2022, to draft a research article. We aim to demonstrate how ChatGPT could help researchers accelerate drafting their papers. We created a simulated data set of 100 000 health care workers with varying ages, Body Mass Index (BMI), and risk profiles. Simulated data allow analysts to test statistical analysis techniques, such as machine-learning based approaches, without compromising patient privacy. Infections were simulated with a randomized probability of hospitalisation. A subset of these fictitious people was vaccinated with a fictional vaccine that reduced the probability of hospitalisation after infection. We then used ChatGPT to help us decide how to handle the simulated data in order to determine vaccine effectiveness and to draft a related research paper. AI-based language models in data analysis and scientific writing are an area of growing interest, and this exemplar analysis aims to contribute to the understanding of how ChatGPT can be used to facilitate these tasks.
Article
Full-text available
To the Editor: Recently, there has been significant interest in ChatGPT, an artificial intelligence (AI) program from OpenAI that is making its way into various fields, including medicine.1, 2 The program has been used to write a scientific article3 and to pass a Master of Business Administration course exam,4 the law-school entrance exam,5 and the medical licensing exam,6 showcasing its remarkable capabilities and usability compared with previous AI models. Large language models, such as ChatGPT, are capable of learning and analyzing vast amounts of language data from various sources and of generating outputs in a human-like manner. Unlike traditional AI that simply analyzes objects and identifies patterns, ChatGPT can create new and unique objects and effects, making it a powerful generative AI. ChatGPT is equipped with a language model called Generative Pre-trained Transformer-3.5, which can output common knowledge quite accurately and can provide tailored responses to detailed questions related to resuscitation medicine (Table 1). These responses draw on online articles, books, and other sources that present cardiopulmonary resuscitation (CPR) guidelines. CPR guidelines should be easily accessible and include recommendations and content related to cardiopulmonary resuscitation for healthcare professionals and the general public. However, the guidelines can be difficult to understand for people who have not received basic life support education and lack specialized knowledge. While it is possible to search for resuscitation methods via web surfing, incorrect knowledge may be acquired from unreliable information with unclear sources. Compared with web surfing, ChatGPT allows users to quickly receive well-tailored answers to their questions, along with AI-based medical decision support grounded in the latest research and guidelines. Even without reviewing the guidelines and papers one by one, users can easily check the information summarized and extracted by the AI.
Of course, per "garbage in, garbage out," an algorithm built on inaccurate information can output inaccurate information, and that output is solely the result of a sophisticated algorithm without human oversight, which can be dangerous in medical situations. Despite this, its utilization value can be very high owing to its personalized interaction and quick response times. Information about CPR is gaining increasing interest among the general public. Previously, information about CPR was provided in the form of guidelines and articles, which were difficult for the general public to access and could only be reached through professional education. However, the development of chat-style programs that can provide the latest information about CPR in a way that is easily accessible and understandable to the general public now appears possible. Therefore, it is suggested that various options for quickly adopting and utilizing this technology in providing information on resuscitation and in CPR education be explored.
Preprint
Full-text available
Background Large language models such as ChatGPT can produce increasingly realistic text, with unknown information on the accuracy and integrity of using these models in scientific writing. Methods We gathered ten research abstracts from each of five high impact factor medical journals (n=50) and asked ChatGPT to generate research abstracts based on their titles and journals. We evaluated the abstracts using an artificial intelligence (AI) output detector and a plagiarism detector, and had blinded human reviewers try to distinguish whether abstracts were original or generated. Results All ChatGPT-generated abstracts were written clearly, but only 8% correctly followed the specific journal's formatting requirements. Most generated abstracts were detected using the AI output detector, with median [interquartile range] scores (higher meaning more likely to be generated) of 99.98% [12.73, 99.98], compared with a very low probability of AI-generated output in the original abstracts of 0.02% [0.02, 0.09]. The AUROC of the AI output detector was 0.94. Generated abstracts scored very high on originality using the plagiarism detector (100% [100, 100] originality). Generated abstracts had a similar patient cohort size as original abstracts, though the exact numbers were fabricated. When given a mixture of original and generated abstracts, blinded human reviewers correctly identified 68% of generated abstracts as being generated by ChatGPT, but incorrectly identified 14% of original abstracts as being generated. Reviewers indicated that it was surprisingly difficult to differentiate between the two, but that the generated abstracts were vaguer and had a formulaic feel to the writing. Conclusion ChatGPT writes believable scientific abstracts, though with completely generated data. These are original without any plagiarism detected but are often identifiable using an AI output detector and skeptical human reviewers.
Abstract evaluation for journals and medical conferences must adapt policy and practice to maintain rigorous scientific standards; we suggest inclusion of AI output detectors in the editorial process and clear disclosure if these technologies are used. The boundaries of ethical and acceptable use of large language models to help scientific writing remain to be determined.
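The detector evaluation described in this abstract reduces to a standard binary-classification metric: the AUROC is the probability that a randomly chosen generated abstract receives a higher detector score than a randomly chosen original one (ties counting as half). A minimal sketch of that calculation, using invented detector scores purely for illustration (none of these numbers are the study's data):

```python
def auroc(positive_scores, negative_scores):
    """Probability that a randomly chosen positive (generated) item
    scores higher than a randomly chosen negative (original) item;
    ties count as half a win."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))

# Invented detector scores (percent "likely AI-generated"), for illustration only.
generated = [99.98, 99.9, 12.73, 99.98, 87.5]   # ChatGPT-written abstracts
original = [0.02, 0.09, 0.02, 50.0, 0.05]       # human-written abstracts

print(f"AUROC = {auroc(generated, original):.2f}")  # prints "AUROC = 0.96"
```

A perfect detector yields 1.0 and an uninformative one 0.5, so the reported 0.94 indicates strong but imperfect separation between generated and original abstracts.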
Article
Full-text available
Large language models utilizing transformer neural networks and other deep learning architectures have demonstrated unprecedented results in many tasks previously accessible only to human intelligence. In this article, we collaborate with ChatGPT, an AI model developed by OpenAI, to speculate on the applications of Rapamycin in the context of Pascal's Wager, a philosophical argument commonly used to justify belief in God. In response to the query "Write an exhaustive research perspective on why taking Rapamycin may be more beneficial than not taking Rapamycin from the perspective of Pascal's wager," ChatGPT provided the pros and cons of Rapamycin use, considering the preclinical evidence of potential life extension in animals. This article demonstrates the potential of ChatGPT to produce complex philosophical arguments; it should not be taken as support for any off-label use of Rapamycin.
Preprint
Full-text available
We evaluated the performance of a large language model called ChatGPT on the United States Medical Licensing Exam (USMLE), which consists of three exams: Step 1, Step 2CK, and Step 3. ChatGPT performed at or near the passing threshold for all three exams without any specialized training or reinforcement. Additionally, ChatGPT demonstrated a high level of concordance and insight in its explanations. These results suggest that large language models may have the potential to assist with medical education, and potentially, even clinical decision-making.
Article
Full-text available
While the opportunities of ML and AI in healthcare are promising, the growth of complex data-driven prediction models requires careful quality and applicability assessment before they are applied and disseminated in daily practice. This scoping review aimed to identify actionable guidance for those closely involved in AI-based prediction model (AIPM) development, evaluation and implementation including software engineers, data scientists, and healthcare professionals and to identify potential gaps in this guidance. We performed a scoping review of the relevant literature providing guidance or quality criteria regarding the development, evaluation, and implementation of AIPMs using a comprehensive multi-stage screening strategy. PubMed, Web of Science, and the ACM Digital Library were searched, and AI experts were consulted. Topics were extracted from the identified literature and summarized across the six phases at the core of this review: (1) data preparation, (2) AIPM development, (3) AIPM validation, (4) software development, (5) AIPM impact assessment, and (6) AIPM implementation into daily healthcare practice. From 2683 unique hits, 72 relevant guidance documents were identified. Substantial guidance was found for data preparation, AIPM development and AIPM validation (phases 1–3), while later phases clearly have received less attention (software development, impact assessment and implementation) in the scientific literature. The six phases of the AIPM development, evaluation and implementation cycle provide a framework for responsible introduction of AI-based prediction models in healthcare. Additional domain and technology specific research may be necessary and more practical experience with implementing AIPMs is needed to support further guidance.
Article
Full-text available
Background The high demand for health care services and the growing capability of artificial intelligence have led to the development of conversational agents designed to support a variety of health-related activities, including behavior change, treatment support, health monitoring, training, triage, and screening support. Automation of these tasks could free clinicians to focus on more complex work and increase the accessibility to health care services for the public. An overarching assessment of the acceptability, usability, and effectiveness of these agents in health care is needed to collate the evidence so that future development can target areas for improvement and potential for sustainable adoption. Objective This systematic review aims to assess the effectiveness and usability of conversational agents in health care and identify the elements that users like and dislike to inform future research and development of these agents. Methods PubMed, Medline (Ovid), EMBASE (Excerpta Medica dataBASE), CINAHL (Cumulative Index to Nursing and Allied Health Literature), Web of Science, and the Association for Computing Machinery Digital Library were systematically searched for articles published since 2008 that evaluated unconstrained natural language processing conversational agents used in health care. EndNote (version X9, Clarivate Analytics) reference management software was used for initial screening, and full-text screening was conducted by 1 reviewer. Data were extracted, and the risk of bias was assessed by one reviewer and validated by another. Results A total of 31 studies were selected and included a variety of conversational agents, including 14 chatbots (2 of which were voice chatbots), 6 embodied conversational agents (3 of which were interactive voice response calls, virtual patients, and speech recognition screening systems), 1 contextual question-answering agent, and 1 voice recognition triage system. 
Overall, the evidence reported was mostly positive or mixed. Usability and satisfaction were rated well (in 27/30 and 26/31 studies, respectively), and positive or mixed effectiveness was found in three-quarters of the studies (23/30). However, specific qualitative feedback highlighted several limitations of the agents. Conclusions The studies generally reported positive or mixed evidence for the effectiveness, usability, and satisfaction of the conversational agents investigated, but qualitative user perceptions were more mixed. The quality of many of the studies was limited, and improved study design and reporting are necessary to more accurately evaluate the usefulness of the agents in health care and to identify key areas for improvement. Further research should also analyze the cost-effectiveness, privacy, and security of the agents. International Registered Report Identifier (IRRID): RR2-10.2196/16934
Article
This letter to the editor suggests adding a technical point to the new editorial policy expounded by Hosseini et al. on the mandatory disclosure of any use of natural language processing (NLP) systems, or generative AI, in writing scholarly publications. Such AI systems should naturally also be forbidden from being named as authors, because they would not have fulfilled prevailing authorship guidelines (such as the widely adopted ICMJE authorship criteria).
Article
Conversational AI is a game-changer for science. Here’s how to respond.
Article
At least four articles credit the AI tool as a co-author, as publishers scramble to regulate its use.
Article
Infographics are a valuable tool for increasing knowledge translation and dissemination. They can be used to simplify complex topics and supplement the written text of a study. This Educator's Blueprint paper will provide 10 strategies for creating high‐quality infographics. These strategies include selecting appropriate content, defining the target audience, considering the format, selecting the software, using consistent font and color schemes, increasing image utilization, ensuring a consistent flow of ideas, avoiding copyright and HIPAA violations, getting feedback from others, and utilizing effective dissemination strategies. These strategies will help guide educators to increase their ability to create more effective infographics.
Article
Artificial intelligence and machine learning systems are increasingly replacing human decision makers in commercial, healthcare, educational, and government contexts. But rather than eliminating human errors and biases, these algorithms have in some cases been found to reproduce or amplify them. We argue that to better understand how and why these biases develop, and when they can be prevented, machine learning researchers should look to the decades-long literature on biases in human learning and decision-making. We examine three broad causes of bias: small and incomplete datasets, learning from the results of your decisions, and biased inference and evaluation processes. For each, findings from the psychology literature are introduced along with connections to the machine learning literature. We argue that rather than viewing machine systems as universal improvements over human decision makers, policymakers and the public should acknowledge that these systems share many of the same limitations that frequently inhibit human judgement, for many of the same reasons. Artificial intelligence and machine learning systems may reproduce or amplify biases. The authors discuss the literature on biases in human learning and decision-making, and propose that researchers, policymakers, and the public should be aware of such biases when evaluating the output and decisions made by machines.
Guidelines and quality criteria for artificial intelligence-based prediction models in healthcare: a scoping review
  • A. A. H. de Hond
  • A. M. Leeuwenberg
  • L. Hooft
  • I. M. J. Kant
  • S. W. J. Nijman
  • H. J. A. van Os
de Hond AAH, Leeuwenberg AM, Hooft L, Kant IMJ, Nijman SWJ, van Os HJA, et al. Guidelines and quality criteria for artificial intelligence-based prediction models in healthcare: a scoping review. NPJ Digit Med. 2022 Jan 10;5(1):2.
Remarks by the President in State of the Union Address
  • The White House
The White House. Remarks by the President in State of the Union Address. Available at: https://obamawhitehouse.archives.gov/the-press-office/2015/01/20/remarks-president-state-union-address-january-20-2015; January 20, 2015. Last accessed 4/27/2023.
Instructions for Authors
JAMA. Instructions for Authors. Updated January 30, 2023. Available at: https://jamanetwork.com/journals/jama/pages/instructions-for-authors. Last accessed 4/27/2023.
Comparing scientific abstracts generated by ChatGPT to original abstracts using an artificial intelligence output detector, plagiarism detector, and blinded human reviewers
  • Gao