Fig 3. Virtual Pharmacy Patient system

Source publication
Article
This paper presents the development of a virtual patient simulation using a 3D talking head and its use by pharmacy students as a training aid for patient consultation. The paper concentrates on the virtual patient modeling, its synthesis with a speech engine and facial expression interaction. The virtual patient model is developed in three stages: bu...

Context in source publication

Context 1
... new heads.

iii. Mouth shape generation
Many mouth shapes for letters of the alphabet are quite similar to each other, and all of them can be simulated by combining basic mouth shapes. In our model, 10 basic mouth shapes are adopted, such as the basic shapes formed by ai, cdg, e, fv, l, mbp, o, u and wq.

iv. Synthesizing facial expressions
As mentioned above, various facial expressions can be synthesized by mapping part of the original texture to specific polygons that are defined by control vertices. Putting the texture-mapped mesh model and the background together, the resulting image scene looks just like the original face with some specific facial expressions. The first step in animating the facial expressions is to define the key frames that make up the major facial expression feature changes. The neutral face, without any facial expression, can be thought of as a key frame containing a neutral facial expression; this is the base image that is varied to produce specific facial expressions.

v. Generic teeth model
The VPP model has one teeth model (Fig 1(f)) over the lips. The one inner mouth model has both the upper and lower teeth appearing as flat rows behind the lips. The generic inner mouth model can be resized according to the dimensions of the mouth in the neutral face image. This would be more realistic if separate inner mouth models were employed so that they could be moved independently – for instance, the upper teeth size and shape could change to be more of a curve when smiling.

III. SPEECH DRIVEN FACE SYNTHESIS
After the 3D face mesh is adjusted, it can be used to animate facial expressions driven by speech. To synthesize animations of facial expressions synchronized with speech data, we must know which phonemes appear in the input data. In addition, the start and stop times of each phoneme must be obtained to synchronize the mouth shapes with the speech wave data. For example, assume the system is required to speak the sentence "How are you?". The system invokes a speech engine and finds that from StartTime to TimeA is silence; TimeA to TimeB is the interval taken to speak "How"; TimeB to TimeC is the interval taken to speak "are"; and TimeC to EndTime is the interval taken to speak "you". The system then translates these results into neutral (from StartTime to TimeA), How (from TimeA to TimeB), are (from TimeB to TimeC) and you (from TimeC to EndTime), and appropriate key frames are fetched from the expression pool to represent these lip movements. Speech is usually treated differently from the animation of facial expressions because simple keyframe-based approaches to the animation typically provide a poor approximation to real speech dynamics. Text-to-Speech (TTS) functionality allows the models to speak any text dynamically, with lip-synching, in real time. Fig 2 shows the flow diagram of our synchronization of facial expressions with speech. First, the text data that is input to the TTS engine is generated by the reasoning and assessment modules as the VPP system's response to a pharmacy student's question. The engine compares the input text data with phonemes in a database and then sends the phonemes of the synthesized speech, and interval information for each phoneme, to a face expression controller. The expression controller translates the phonemes and timing information into face expression parameters (mouth shapes). Thus, we can get basic facial expressions according to the input text data.
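To make the expression-controller step concrete, the following is a minimal, hedged sketch (not the authors' code) of how timed phonemes reported by a speech engine could be mapped to the basic mouth-shape key frames described above. The class and method names (TimedPhoneme, VisemeKeyFrame, toKeyFrames) and the millisecond timings are illustrative assumptions, and the phoneme grouping only roughly follows the shape groups listed in the text.

```java
import java.util.*;

// Hedged sketch of a face-expression controller: it maps timed phonemes
// reported by a TTS engine to basic mouth-shape key frames ("visemes").
// All names here are illustrative assumptions, not the VPP's actual API.
public class VisemeController {

    // A phoneme with its start/stop time in milliseconds, as a TTS engine
    // might report it (e.g. "How" -> /h/ + /au/ with their intervals).
    public record TimedPhoneme(String phoneme, int startMs, int endMs) {}

    // A mouth-shape key frame scheduled against the speech wave data.
    public record VisemeKeyFrame(String mouthShape, int startMs, int endMs) {}

    // Phoneme -> basic mouth-shape group, loosely following the groups in
    // the paper (ai, cdg, e, fv, l, mbp, o, u, wq); "sil" is a neutral face.
    private static final Map<String, String> PHONEME_TO_SHAPE = Map.ofEntries(
            Map.entry("a", "ai"),  Map.entry("i", "ai"),
            Map.entry("c", "cdg"), Map.entry("d", "cdg"), Map.entry("g", "cdg"),
            Map.entry("e", "e"),
            Map.entry("f", "fv"),  Map.entry("v", "fv"),
            Map.entry("l", "l"),
            Map.entry("m", "mbp"), Map.entry("b", "mbp"), Map.entry("p", "mbp"),
            Map.entry("o", "o"),   Map.entry("u", "u"),
            Map.entry("w", "wq"),  Map.entry("q", "wq"),
            Map.entry("sil", "neutral"));

    // Translate the phoneme/timing stream into mouth-shape key frames that
    // an animation layer can morph between while the audio plays.
    public List<VisemeKeyFrame> toKeyFrames(List<TimedPhoneme> phonemes) {
        List<VisemeKeyFrame> frames = new ArrayList<>();
        for (TimedPhoneme p : phonemes) {
            String shape = PHONEME_TO_SHAPE.getOrDefault(
                    p.phoneme().toLowerCase(), "neutral");
            frames.add(new VisemeKeyFrame(shape, p.startMs(), p.endMs()));
        }
        return frames;
    }

    public static void main(String[] args) {
        // "How" -> silence, then /h/ /a/ /u/; the interval values are made up.
        List<TimedPhoneme> input = List.of(
                new TimedPhoneme("sil", 0, 200),
                new TimedPhoneme("h", 200, 320),
                new TimedPhoneme("a", 320, 450),
                new TimedPhoneme("u", 450, 600));
        new VisemeController().toKeyFrames(input).forEach(System.out::println);
    }
}
```

The resulting key-frame list is what a renderer would interpolate between, one mouth shape per phoneme interval, while the corresponding audio segment is played.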
With this information, facial animations are synchronized with the input text data. For example, the word "how", pronounced /hau/, is decomposed into /h/ + /au/, and the corresponding mouth shape starts at "h" and is gradually morphed to "au". Text-to-speech capability refers to the ability to play back text as a spoken voice. The VPP used two kinds of TTS engine because the evaluation was carried out on different operating systems and platforms at the three universities. The first TTS engine is a generic one based on Microsoft SAPI 5.1 (the Microsoft Speech Application Programming Interface); the other is the Java Speech API (JSAPI). JSAPI was used for the male virtual patient model's voice, and SAPI was used for the female patient's voice. JSAPI allows Java applications to incorporate speech technology into their user interface; it defines a cross-platform API to support command-and-control recognizers, dictation systems and speech synthesizers. JSAPI can be used with FreeTTS, an open-source speech synthesizer written entirely in the Java programming language. SAPI offers a general framework for building speech synthesis systems and contains many interfaces and classes for managing speech; for TTS, the base class is SpVoice. A C++ Win32 console application produced a single Windows Dynamic Link Library (DLL) file, which allows a Java program to access the TTS functionality provided by SAPI.

IV. RESULTS
For the pilot evaluation study (Fig 3), students from three universities (The University of Newcastle, Monash University and Charles Sturt University) were assessed in a randomized controlled trial over three days [13]. They were assessed on skills such as the coverage, convergence and style of the investigative questions that they used to diagnose three clinical scenarios: a cough, gastro-oesophageal reflux disease (GORD), and constipation. Each condition was presented at three levels of severity: mild, moderate and severe. At each assessment session the VPP system presented a student with all three conditions at randomly selected severities before re-assessing the same conditions at different levels of severity in an assessment session held on another day. The students who consented to be in the pilot study and who used the VPP system in the trial comprised: The University of Newcastle, 15 of 83 eligible students; Monash University, 15 of 220 eligible students; and Charles Sturt University, 3 of 110 eligible students. Students evaluated their experience with the VPP on several different levels: the software, the appearance, the learning outcomes, and so on. Twenty-two students who used the VPP system answered questions relating to their interaction with the virtual patient in the final survey. The students felt the VPP system helped them to identify areas of their communication that they could work on (100% vs 56% agreeing/strongly agreeing), and that using the virtual patient would improve their confidence with real patients (90% vs 56% agreeing/strongly agreeing). With regard to the appearance of the virtual patient, respondents were generally negative, in particular about the voice. Some of the results for the physical visual appearance and the voice of the VPP are reproduced in Table 1. The responses for domestic versus international students (see Table 2) indicated that domestic students were overall more positive about the virtual patient than the international students.
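Returning to the Java speech path described in Section III: the paper gives no code for it, but as a rough, hedged illustration, FreeTTS (the open-source Java synthesizer mentioned there) can be driven along the following lines. The voice name "kevin16" and the voices system property are stock FreeTTS example values, not necessarily the VPP's configuration, and this sketch uses FreeTTS's native API rather than the JSAPI wrapper the VPP describes.

```java
import com.sun.speech.freetts.Voice;
import com.sun.speech.freetts.VoiceManager;

// Hedged sketch of speaking a VPP text response with FreeTTS, the
// open-source Java synthesizer mentioned in Section III. Illustrative only.
public class SpeakResponse {
    public static void main(String[] args) {
        // Register the bundled US-English "kevin16" voice directory
        // (standard FreeTTS example setting; assumed here).
        System.setProperty("freetts.voices",
                "com.sun.speech.freetts.en.us.cmu_us_kal.KevinVoiceDirectory");

        Voice voice = VoiceManager.getInstance().getVoice("kevin16");
        if (voice == null) {
            System.err.println("Voice not found; check freetts.jar is on the classpath.");
            return;
        }
        voice.allocate();             // load synthesizer resources
        voice.speak("How are you?");  // play the text response aloud
        voice.deallocate();           // release resources when done
    }
}
```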
Respondents indicated that the limited number of facial expressions and visual cues from the VPP system was a drawback to realistic interaction. As indicated in Newby et al [13], this made it difficult for students to judge the effect of their questioning, that is, whether their questions were appropriate or not.

V. DISCUSSION
This paper presented the visual and vocal aspects of an automated system that allowed pharmacy students to interact with a VPP. The VPP allows students to explore the full patient consultation and to practice interpersonal skills before working with actual patients. The open-source code of the VPP system can be downloaded from resweb.newcastle.edu.au/VirtualPatient/private/uploads with the authors' permission. To cater for many students interacting with the patient at the same time and in the same room, students talked with the virtual patient by typing questions into an interface and, to simulate listening to a real patient conversation, the virtual patient responded verbally to students via earphones. Typing questions rather than speaking directly to the VPP also reduced the ambiguities that might have been introduced by voice recognition systems. User responses (Section 4) indicated that the appearance and the voice of the virtual patient need to be improved. As indicated in Section 2.2.4, the freeware software (Blender) used for the facial modelling required a lot of effort, and the number of facial expressions was limited due to the time needed to construct them. In the pilot only three expressions were used: neutral, smile and laugh. There is another problem to be solved: the facial expressions are changed by morphing between expressions, but the VPP system does not facilitate a change of expression while the VPP is speaking; expressions change only at the start and the end of speaking. Both of these factors contributed to the unnatural appearance of the facial responses of the VPP system, particularly when it was responding to student questions. In an updated version of the 3D face model, generated by the commercial software FaceGen, much quicker development is possible by utilizing default expressions and visually adjusting parameters in real time to generate multiple facial expressions. These can then be exported to the virtual patient software so that the software can morph between exported expressions to create expression transitions, such as neutral to smile and smile to angry. The inner mouth of the updated model is separated into three parts: the upper teeth, the lower teeth and the tongue. The upper teeth model is moved according to the control vertex at the philtrum, and the lower one according to the control vertex at the chin. This inner mouth model solves an unnatural appearance problem when the lips move to talk or to smile. It also allows the VPP speech output to be broken up into smaller segments so that the morphing of facial expressions can be linked to the duration of the spoken output, ...
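As a final illustration of the expression transitions discussed above (for example, neutral to smile spread over a spoken segment), the following hedged sketch shows one common way such morphing can be done: linearly interpolating the control-vertex positions of two exported key expressions by a progress value derived from the segment's duration. The data layout and names are assumptions for illustration, not the VPP implementation.

```java
// Hedged sketch: morph between two facial-expression key frames by linearly
// interpolating control-vertex positions, so a transition such as neutral ->
// smile can be spread over the duration of a spoken segment. Illustrative only.
public class ExpressionMorph {

    /**
     * Blend two expressions given as flat arrays of control-vertex
     * coordinates (x0, y0, z0, x1, y1, z1, ...).
     *
     * @param from vertex positions of the starting expression (e.g. neutral)
     * @param to   vertex positions of the target expression (e.g. smile)
     * @param t    progress through the spoken segment, 0.0 .. 1.0
     */
    public static float[] blend(float[] from, float[] to, float t) {
        if (from.length != to.length) {
            throw new IllegalArgumentException("Expression meshes must match");
        }
        float[] out = new float[from.length];
        for (int i = 0; i < from.length; i++) {
            out[i] = from[i] + t * (to[i] - from[i]);  // linear interpolation
        }
        return out;
    }

    public static void main(String[] args) {
        float[] neutral = {0.0f, 0.0f, 0.0f, 1.0f, 1.0f, 0.0f};
        float[] smile   = {0.2f, 0.1f, 0.0f, 1.2f, 1.1f, 0.0f};
        // Half-way through the spoken segment: half-way between the expressions.
        float[] mid = blend(neutral, smile, 0.5f);
        System.out.println(java.util.Arrays.toString(mid));
    }
}
```

With per-phoneme timing available from the TTS engine, t can simply be the elapsed fraction of the segment, which is what ties the morph duration to the spoken output.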

Citations

... This gives the virtual patient system the potential to be used in most health disciplines where structured questioning is important. In addition, the architecture provides for detailed individual and aggregated assessment and feedback for both the learner and the assessor, as well as providing an assessment of the appropriate sequencing and style of student questions regarding the patient's condition (Summons et al., 2009; Summons et al., 2011; Park & Summons, 2013). ...
Article
Purpose
Investigate the capability of a knowledge-based framework and architecture, used in a specific health domain problem that can utilise transfer learning, to speed virtual patient development for problem-based training and assessment in other health domains.
Methods
Analysis of a case study, based on a virtual patient used in the training of pharmacy students, to discover the viability of using generic, ontological knowledge capable of transfer to virtual patients in other health domains.
Results
Areas of the virtual pharmacy patient knowledge-base were identified, along with corresponding expected student questions, that are generic to other health domains. Using the framework from the case study to develop a new virtual patient for problem-based learning and assessment in a new health domain, these generic target questions could be utilised to speed up the development of other learning stimuli in future projects involving different health domains, such as nurse training in pain management.
Conclusions
With some modification, the framework of the case-study virtual patient was found to be capable of supporting generic expected student questions capable of re-use in virtual patients with new clinical conditions.