Architecture diagram of our unified CLUE model.

Source publication
Preprint
Predicting contextualised engagement in videos is a long-standing problem that has commonly been approached by exploiting the number of views or the associated likes using different computational methods. The past decade has seen a boom in online learning resources, and during the pandemic, there has been an exponential rise of online teaching vi...

Contexts in source publication

Context 1
... our novel modelling architecture depicted in Figure 1, from the text transcript, we predict X_1 with a random forest and X_2 with a BERT model, where P_1, P_2, P_3, P_4, P_5 are the outputs of text-based emotion. Using the audio features, we predict X_3 based on the probabilities Q_1, Q_2, ..., Q_8, which represent speech-based emotion. ...
Context 2
... fine-tuning, we freeze some of the layers and fine-tune only the specific layers needed for our task. For instance, in the pre-trained text language model, we fine-tune only the contextual layers, mainly layer 12. Our framework is depicted in Figure 1: we extract audio from the video, and the speech is transcribed to text using the IBM Watson Speech to Text platform. After speech-to-text conversion, we extract 13 features chosen for their continued use in prior studies [3,9,15,27,37]. ...
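The layer-freezing step described in this excerpt can be illustrated with a small sketch. It is not the authors' code: it only shows how "layer 12" (1-indexed in the text) could be mapped to parameter names, assuming the zero-indexed `encoder.layer.<i>.` naming used by Hugging Face BERT checkpoints. The function name `trainable_parameter_names` and the example name list are hypothetical.

```python
import re


def trainable_parameter_names(names, layer_number=12):
    """Return the parameter names that belong to one encoder layer.

    `layer_number` is 1-indexed, as in the excerpt ("layer 12").
    Assumption: names follow the zero-indexed `encoder.layer.<i>.` style,
    so layer 12 corresponds to index 11.
    """
    index = layer_number - 1
    pattern = re.compile(rf"(^|\.)encoder\.layer\.{index}\.")
    return [name for name in names if pattern.search(name)]


# Hypothetical parameter names, written in the assumed naming style.
example_names = [
    "embeddings.word_embeddings.weight",
    "encoder.layer.0.attention.self.query.weight",
    "encoder.layer.1.output.dense.weight",
    "encoder.layer.10.output.dense.weight",
    "encoder.layer.11.attention.self.query.weight",
    "encoder.layer.11.output.dense.bias",
    "pooler.dense.weight",
]

# Only these names would stay trainable; everything else would be frozen.
selected = trainable_parameter_names(example_names)
```

In a PyTorch model, one would typically loop over `model.named_parameters()` and set `param.requires_grad` to `True` only for the selected names. Whether the task-specific head is also trained is not stated in the excerpt.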
Context 3
... engagement score shows the impact of each individual model on the final output. Figure 10 shows the variation of speech-based emotion over the video length, where "Happy", "Surprised", "Neutral", and "Fear" are dominant. To generate this figure, we extracted 10-second segments of speech using a moving window of 10 seconds with a hop of 10 seconds. ...
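The windowing described in this excerpt (10-second window, 10-second hop) amounts to cutting the speech into consecutive, non-overlapping segments. A minimal sketch of that segmentation follows. The function name `window_bounds` is hypothetical, and dropping a trailing segment shorter than the window is an assumption, since the excerpt does not say how the end of the audio is handled.

```python
def window_bounds(total_seconds, window_seconds=10.0, hop_seconds=10.0):
    """Return (start, end) times in seconds for each full analysis window.

    With window == hop (as in the excerpt), the windows tile the audio
    without overlap. Assumption: a trailing partial window is discarded.
    """
    if window_seconds <= 0 or hop_seconds <= 0:
        raise ValueError("window_seconds and hop_seconds must be positive")
    bounds = []
    start = 0.0
    while start + window_seconds <= total_seconds:
        bounds.append((start, start + window_seconds))
        start += hop_seconds
    return bounds


# A 35-second clip yields three full 10-second windows; the last 5 seconds
# are dropped under the assumption stated above.
segments = window_bounds(35)
```

Each segment would then be passed to the speech emotion model, and the per-emotion probabilities for each segment plotted against the segment's start time.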
Context 4
... the model prediction probability for every emotion was used for plotting. Similarly, Figure 11 shows the variation of speech-based emotion over the other video, where "Anger", "Sad", "Disgust", and "Neutral" are dominant. ...
Context 5
... is also observed that variation of positive emotion increases engagement compared to negative emotion. Figure 10 shows the variation in emotion of speech over time, where "Happy", "Surprised", "Neutral", and "Fear" are dominant, and Figure 11 shows the variation in emotion of speech over time, where "Anger", "Sad", "Disgust", and "Neutral" are dominant. The engagement score of the video in Figure 10 was significantly better than that of the video in Figure 11. ...
Context 6
... X_3, which is based on emotion decoding over speech, reduced the predicted engagement score significantly. It is also observed that variation of positive emotion increases engagement compared to negative emotion. Figure 10 shows the variation in emotion of speech over time, where "Happy", "Surprised", "Neutral", and "Fear" are dominant, and Figure 11 shows the variation in emotion of speech over time, where "Anger", "Sad", "Disgust", and "Neutral" are dominant. The engagement score of the video in Figure 10 was significantly better than that of the video in Figure 11. ...
Context 7
... is also observed that variation of positive emotion increases engagement compared to negative emotion. Figure 10 shows the variation in emotion of speech over time, where "Happy", "Surprised", "Neutral", and "Fear" are dominant, and Figure 11 shows the variation in emotion of speech over time, where "Anger", "Sad", "Disgust", and "Neutral" are dominant. The engagement score of the video in Figure 10 was significantly better than that of the video in Figure 11. Variation in the emotion of speech over time helps increase the engagement score more than keeping a single emotional tone for a longer period. ...