Fig 3 - uploaded by Kilian Kappert
An illustration of a participant's head and all landmarks and tongue maneuvers. The tongue maneuvers are visualized as black arrows. Markers were placed on the head, chin and nose of the participant in order to track them. The caruncles of the eyes do not require external markers as they are a distinguishable facial feature. (Illustration designed by Vectorpouch / Freepik). https://doi.org/10.1371/journal.pone.0221593.g003

Source publication
Article
Purpose: Tongue mobility has been shown to be a clinically interesting parameter on functional results after tongue cancer treatment, which can be objectified by measuring the Range Of Motion (ROM). Reliable measurements of ROM would enable us to quantify the severity of functional impairments and use these for shared decision making in treatment choice...

Contexts in source publication

Context 1
... marker-design features a 3D paper cube to ensure that the marker is visible from every angle. Additional markers were placed on the glabella, apex of the nose and mental region to enable tracking of the head and jaw (Fig 3). The caruncles of the eyes did not require external markers as they are distinguishable landmarks. ...
Context 2
... participants were instructed to always protrude the tongue as far as possible in all directions (Fig 3). ...
Context 3
... user interface was developed to extract and process the ROM trajectory from the three video cameras. First, the locations of six landmarks were selected: the caruncles of the eyes, glabella, nose point, tongue point and mental region (Fig 3). Using the Lucas-Kanade tracking algorithm implemented in MATLAB (MathWorks, 2018b), the six points were tracked until the end of the video. ...
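As a rough illustration of the tracking step described in Context 3, the sketch below implements a single Lucas-Kanade update in plain Python on a synthetic pair of frames. It is not the MATLAB implementation used in the study: the function names `make_blob` and `lucas_kanade_step`, the Gaussian test image and the window size are all assumptions made for this example. Practical trackers usually add image pyramids and iterate the update until it converges.

```python
import math


def make_blob(size, cx, cy, sigma=4.0):
    """Synthetic grayscale frame (hypothetical test data): one Gaussian blob at (cx, cy)."""
    return [
        [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
         for x in range(size)]
        for y in range(size)
    ]


def lucas_kanade_step(prev, curr, px, py, half=6):
    """Estimate the displacement (u, v) of the patch centred on pixel (px, py).

    Solves the 2x2 least-squares system built from the spatial gradients of
    `prev` and the temporal difference `curr - prev` inside a square window.
    The window plus a one-pixel border must lie inside the image.
    """
    sxx = sxy = syy = sxt = syt = 0.0
    for y in range(py - half, py + half + 1):
        for x in range(px - half, px + half + 1):
            ix = (prev[y][x + 1] - prev[y][x - 1]) / 2.0  # central difference in x
            iy = (prev[y + 1][x] - prev[y - 1][x]) / 2.0  # central difference in y
            it = curr[y][x] - prev[y][x]                  # temporal difference
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("window has too little texture to track")
    # Invert [[sxx, sxy], [sxy, syy]] and apply it to (-sxt, -syt).
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v


# Example: the blob moves by (0.5, 0.3) pixels between two frames.
frame_a = make_blob(21, 10.0, 10.0)
frame_b = make_blob(21, 10.5, 10.3)
u, v = lucas_kanade_step(frame_a, frame_b, 10, 10)
```

With a sub-pixel shift and a smooth target, the single-step estimate lands close to the true displacement. Larger motions would need the pyramidal, iterative variant.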

Citations

... The range of motion (ROM) of the tongue could be a promising measure to predict speech outcomes in ITOC, since a more mobile tongue (i.e., larger ROM values) usually leads to better and more intelligible speech posttreatment (Bressmann et al., 2004; Chepeha et al., 2016; Lam & Samman, 2013; Pauloski et al., 1998; van Dijk et al., 2016). However, studies that have assessed the ROM of the tongue following OSCC treatment have done so with nonspeech tasks, such as maximal tongue protrusion or lateral movement, quantified by Likert scales, ruler-based measurements, or three-dimensional camera tracking (Chepeha et al., 2016; Kappert et al., 2019; Lazarus et al., 2014; Speksnijder et al., 2011). Methods exploring maximum movement portray the maximum efforts in terms of anatomical capability, which may require different motor demands compared to speech tasks (Bressmann et al., 2004; Schuster & Stelzle, 2012). ...
... Granting that this eliminates the need for physical contact, specificity may be lost if only three categories are used and no measurements on a continuous scale are made. In response, a recently developed method used three-dimensional camera tracking in order to quantify the ROM of the tongue noninvasively on a continuous scale by attaching a tongue marker, which is subsequently tracked by three cameras (Kappert et al., 2019; van Dijk et al., 2016). One remaining disadvantage is that this setup can only capture tongue movements when the mouth is open as the cameras cannot track the marker if the tongue is not visible. ...
... The first aim of our study was to quantify the degree to which the AWS and the ROM in the anteroposterior and superior-inferior dimensions were reduced in ITOC compared to controls. We hypothesized that the AWS of ITOC would be smaller compared to controls based on previous work measuring nonspeech ROM (Bressmann et al., 2004; Chepeha et al., 2016; de Groot et al., 2020; Kappert et al., 2019; Speksnijder et al., 2011). Our second hypothesis was that ITOC would show a reduced AP-ROM as acoustic studies found reduced F2 for /i/ and center of gravity for /s/ (i.e., acoustic correlates of reduced anteroposterior tongue movement; Acher et al., 2014; de Bruijn et al., 2009; Laaksonen et al., 2011; Takatsu et al., 2017; Tienkamp et al., 2023; van Dijk et al., 2016; Zhou et al., 2013). ...
Article
Purpose: The purpose of this study was to quantify sentence-level articulatory kinematics in individuals treated for oral squamous cell carcinoma (ITOC) compared to control speakers while also assessing the effect of treatment site (jaw vs. tongue). Furthermore, this study aimed to assess the relation between articulatory–kinematic measures and self-reported speech problems. Method: Articulatory–kinematic data from the tongue tip, tongue back, and jaw were collected using electromagnetic articulography in nine Dutch ITOC and eight control speakers. To quantify articulatory kinematics, the two-dimensional articulatory working space (AWS; in mm²), one-dimensional anteroposterior range of motion (AP-ROM; in mm), and superior–inferior range of motion (SI-ROM; in mm) were calculated and examined. Self-reported speech problems were assessed with the Speech Handicap Index (SHI). Results: Compared to a sex-matched control group, ITOC showed significantly smaller AWS, AP-ROM, and SI-ROM for both the tongue tip and tongue back sensor, but no significant differences were observed for the jaw sensor. This pattern was found for both individuals treated for tongue and jaw tumors. Moderate nonsignificant correlations were found between the SHI and the AWS of the tongue back and jaw sensors. Conclusions: Despite large individual variation, ITOC showed reduced one- and two-dimensional tongue, but not jaw, movements compared to control speakers and treatment for tongue and jaw tumors resulted in smaller tongue movements. A larger sample size is needed to establish a more generalizable connection between the AWS and the SHI. Further research should explore how these kinematic changes in ITOC are related to acoustic and perceptual measures of speech.
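The three measures named in this abstract can be illustrated with a short sketch. The abstract does not say how the AWS was computed. The code below assumes it is the convex-hull area of the sensor positions in a two-dimensional plane, and takes each ROM as the maximum-minus-minimum extent along one axis. The function names and the sample coordinates are hypothetical.

```python
def range_of_motion(values):
    """One-dimensional ROM in mm: extent between the extreme positions."""
    return max(values) - min(values)


def _cross(o, a, b):
    """Z-component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and _cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and _cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def working_space_area(points):
    """AWS in mm^2, ASSUMED here to be the convex-hull area (shoelace formula)."""
    hull = convex_hull(points)
    twice_area = 0.0
    for i in range(len(hull)):
        x1, y1 = hull[i]
        x2, y2 = hull[(i + 1) % len(hull)]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Hypothetical sensor trace: (anteroposterior, superior-inferior) positions in mm.
trace = [(0.0, 0.0), (10.0, 0.0), (10.0, 4.0), (0.0, 4.0), (5.0, 2.0)]
ap_rom = range_of_motion([x for x, _ in trace])
si_rom = range_of_motion([y for _, y in trace])
aws = working_space_area(trace)
```

For this rectangular toy trace the AP-ROM is 10 mm, the SI-ROM is 4 mm and the hull area is 40 mm².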
... Methods of automatically determining the RF have been developed for brain CSD [35,36]. These methods are designed to circumvent the fact that in the brain few voxels contain a single fiber direction, while these voxels are more prevalent in muscle diffusion imaging. Nonetheless, we recommend adapting such an automatic method to muscle CSD, because of the reduced user input required and the possibly increased reproducibility of RF estimation due to variation between tongue muscles. ...
... The ROM of the tongue was obtained by optical tracking of a marker on the tip of the tongue using a 3D camera. The volunteers were asked to perform four different tongue movements: left, right, down, and protrusion as described in the paper by Kappert et al. [36]. The up-movement was left out since it was proven to be unreliable. Written informed consent was obtained from all volunteers before inclusion. ...
... Previously, the precision of the ROM measurements, quantified by the standard deviation, was determined to range from 2.3 mm to 3.2 mm [36]. We, therefore, assumed a precision of 3 mm (3.2 mm rounded off) for all ROM measurements. If a predicted ROM fell within the 95% confidence interval (CI), i.e., within two times the standard deviation, we judged the measurement to be correct. ...
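The acceptance rule quoted above amounts to a one-line comparison. The sketch below assumes the interval is centred on the measured ROM and uses the 3 mm precision from the excerpt. The function name and the sample values are hypothetical.

```python
def within_two_sd(predicted_mm, measured_mm, sd_mm=3.0):
    """True if the prediction lies within measured +/- 2 * SD (approx. 95% CI).

    sd_mm defaults to the 3 mm precision assumed in the excerpt.
    """
    return abs(predicted_mm - measured_mm) <= 2.0 * sd_mm


# Hypothetical values: a measured protrusion of 50 mm and two model predictions.
accepted = within_two_sd(45.0, 50.0)   # 5 mm off, inside the 6 mm band
rejected = within_two_sd(43.0, 50.0)   # 7 mm off, outside the 6 mm band
```

Under this rule any prediction more than 6 mm from the measurement is counted as a miss.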
... This potential prediction model can be used in combination with or instead of more advanced techniques such as biomechanical modeling or optical tracking. These options might take longer to implement and validate in a clinical setting [29,30]. Also, when the genetics and physiology behind the ability to perform different features can be clarified, it might increase insight into the prediction of postoperative reduced tongue mobility or may be helpful in oral rehabilitation. ...
Article
The importance of tongue mobility for speech, oral food transport, and swallowing is well recognized. However, whether individual tongue mobility influences postoperative function in oral cancer treatment remains to be elucidated. This study assesses the ability to perform five tongue movements, namely rolling, twisting (two sides), folding, and the 'cloverleaf', in a healthy population. Because a tumor in oral cancer patients often restricts the mobility of the tongue, it might be helpful to know if it is possible to recall any of those movements without demonstrating it. Two observers asked 387 healthy Dutch adults if they could perform each of the five specific tongue movements; the participants were subsequently asked to demonstrate the five movements. The distribution in the Dutch population is: rolling: 83.7%, cloverleaf: 14.7%, folding: 27.5%, twisting left: 36.1% and twisting right: 35.6%. The percentage of people that can fold their tongue is almost ten times higher (27.5% versus 3%) than in previous research, and it was found that the ability to roll the tongue is not a prerequisite for folding of the tongue. A relationship between gender or right-handedness and the ability to perform certain tongue movements could not be found. Of the participants, 9.9% and 13.1% incorrectly assumed that they could demonstrate tongue rolling and cloverleaf, respectively. Tongue folding and twisting (left or right) were incorrectly assumed in 36.9%, 24.1%, and 25.4% of the cases, respectively. Rolling and cloverleaf are preferred for future prediction models because these movements are easy to recall without demonstrating.
... By mapping the tongue of an individual to the atlas tongue, we hypothesized that the segmented fiber tracks of the atlas could be morphed back to an individual's space and that, subsequently, from these segmented fiber tracks a personalized biomechanical model could be created. The effect of this personalization step was evaluated by comparing models with both personalized and generic muscle bundles to the predicted range of motion (ROM) of the tongue measured in vivo using 3D optical tracking (Kappert et al. 2019a). ...
... The ROM of the tongue was obtained by optical tracking of a marker on the tip of the tongue using a 3D camera. The volunteers were asked to perform four different tongue movements: left, right, down, and protrusion as described in the paper by Kappert et al. (2019a). The up-movement was left out since it was proven to be unreliable. ...
... Only when the personalized models perform better than the atlas can we conclude that personalization improves the ROM prediction. Previously, the precision of the ROM measurements, quantified by the standard deviation, was determined to range from 2.3 to 3.2 mm (Kappert et al. 2019a). We, therefore, assumed a precision of 3 mm (3.2 mm rounded off) for all ROM measurements. ...
Article
For advanced tongue cancer, the choice between surgery and organ-sparing treatment is often dependent on the expected loss of tongue functionality after treatment. Biomechanical models might assist in this choice by simulating the post-treatment function loss. However, this function loss varies between patients and should, therefore, be predicted for each patient individually. In the present study, the goal was to better predict the postoperative range of motion (ROM) of the tongue by personalizing biomechanical models using diffusion-weighted MRI and constrained spherical deconvolution reconstructions of tongue muscle architecture. Diffusion-weighted MRI scans of ten healthy volunteers were obtained to reconstruct their tongue musculature, which was subsequently registered to a previously described population average or atlas. Using the displacement fields obtained from the registration, the segmented muscle fiber tracks from the atlas were morphed back to create personalized muscle fiber tracks. Finite element models were created from the fiber tracks of the atlas and those of the individual tongues. Via inverse simulation of a protruding, downward, left and right movement, the ROM of the tongue was predicted. This prediction was compared to the ROM measured with a 3D camera. It was demonstrated that biomechanical models with personalized muscle bundles approach the measured ROM better than a generic model. However, to achieve this result a correction factor was needed to compensate for the small magnitude of motion of the model. Future versions of these models may have the potential to improve the estimation of function loss after treatment for advanced tongue cancer.
... Apart from improving survival rates (mortality), research attention has shifted to improving the quality of life after surgery [2]. Oral cancer survivors can suffer from several problems affecting their quality of life: difficulty swallowing [3,4], decreased tongue mobility [5] and impaired speech intelligibility [3]. The latter is the focus of our paper. ...
Preprint
Oral cancer is a disease which impacts more than half a million people worldwide every year. Analysis of oral cancer speech has so far focused on read speech. In this paper, we 1) present and 2) analyse a three-hour long spontaneous oral cancer speech dataset collected from YouTube. 3) We set baselines for an oral cancer speech detection task on this dataset. The analysis of these explainable machine learning baselines shows that sibilants and stop consonants are the most important indicators for spontaneous oral cancer speech detection.
Article
In this paper, we introduce a new corpus of oral cancer speech and present our study on the automatic recognition and analysis of oral cancer speech. A two-hour English oral cancer speech dataset is collected from YouTube. Formulating this as a low-resource oral cancer ASR task, we investigate three acoustic modelling approaches that have previously worked well in low-resource scenarios, using two different architectures, a hybrid architecture and a transformer-based end-to-end (E2E) model: (1) a retraining approach; (2) a speaker adaptation approach; and (3) a disentangled representation learning approach (only using the hybrid architecture). The approaches achieve a (1) 4.7% (hybrid) and 7.5% (E2E); (2) 7.7%; and (3) 2.0% absolute word error rate reduction, respectively, compared to a baseline system which is not trained on oral cancer speech. A detailed analysis of the speech recognition results shows that (1) plosives and certain vowels are the most difficult sounds to recognise in oral cancer speech, a problem that is successfully alleviated by our proposed approaches; (2) however, these sounds are also relatively poorly recognised in the case of healthy speech, with the exception of /p/; (3) recognition performance of certain phonemes is strongly data-dependent; (4) in terms of the manner of articulation, E2E performs better with the exception of vowels, although vowels make a large contribution to overall performance. As for the place of articulation, vowels, labiodentals, dentals and glottals are better captured by hybrid models, whereas E2E is better on bilabial, alveolar, postalveolar, palatal and velar information. (5) Finally, our analysis provides some guidelines for selecting words that can be used as voice commands for ASR systems for oral cancer speakers.
Article
Tongue cancer treatment often results in impaired speech, swallowing, or mastication. Simulating the effect of treatments can help the patient and the treating physician to understand the effects and impact of the intervention. To simulate deformations of the tongue, identifying accurate mechanical properties of tissue is essential. However, few have succeeded in characterizing in-vivo tongue stiffness. Those who did measured the tongue At Rest (AR), in which muscle tone persists even if muscles are not willingly activated. We expected to find an absolute rest state in participants ‘under General Anesthesia’ (GA). We elaborated on previous work by measuring the mechanical behavior of the in-vivo tongue under aspiration using an improved volume-based method. Using this technique, 5 to 7 measurements were performed on 10 participants both AR and under GA. The obtained Pressure-Shape curves were first analyzed using the initial slope and its variations. Thereafter, an inverse Finite Element Analysis (FEA) was applied to identify the mechanical parameters using the Yeoh, Gent, and Ogden hyperelastic models. The measurements AR provided a mean Young’s Modulus of 1638 Pa (min 1035 – max 2019) using the Yeoh constitutive model, which is in line with previous ex-vivo measurements. However, although we hoped to find a rest state under GA, the tongue unexpectedly appeared to be approximately 2 to 2.5 times stiffer under GA than AR. Explanations for this were sought by examining drugs administered during GA, blood flow, perfusion, and upper airway reflexes, but none of these explanations could be confirmed.