Fig 4 - uploaded by Tim Brookes

Contexts in source publication

Context 1
... the present research, the system was controlled using MATLAB, through a custom-developed library that enabled the precise rotation of the device. Rotation was achieved through simple commands, e.g., turntable.rotate(5) %rotates 5 degrees clockwise, turntable.rotate(-5) %rotates 5 degrees counter-clockwise. Fig. 4 presents a flowchart of the communication between MATLAB and the microphone rotation ...

Citations

... The first contains recordings of 66 vintage microphone impulse responses [8]. The second dataset is generated from 25 professional microphones recorded at different angles and distances [9], amounting to a total of 8138 DIRs. ...
... For DIR augmentation, we use two dataset sources [8,9]. We resample both to a 32 kHz sampling rate. ...
... We resample both to a 32 kHz sampling rate. Further, we window the Multi DIRs dataset [9] to 1024 samples with a Kaiser window (β = 2). ...
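
The windowing and augmentation steps mentioned in the excerpts above could look roughly like the following. This is a minimal sketch, not the citing authors' code. The function names `window_dir` and `augment` are hypothetical, and the DIR is assumed to be already resampled to 32 kHz. The excerpt does not say how the Kaiser window is aligned with the impulse response, so the sketch assumes the DIR is truncated (or zero-padded) to 1024 samples and tapered with the falling half of a 2048-point Kaiser window, which leaves the onset nearly untouched.

```python
import numpy as np


def window_dir(dir_ir, length=1024, beta=2.0):
    """Truncate or zero-pad a device impulse response (DIR) to `length`
    samples and taper it with a Kaiser window.

    Assumption: only the falling half of a 2*length Kaiser window is
    applied, so the start of the response is preserved. The source only
    states "1024 samples" and "beta = 2".
    """
    dir_ir = np.asarray(dir_ir, dtype=float)
    out = np.zeros(length)
    n = min(length, dir_ir.shape[0])
    out[:n] = dir_ir[:n]
    # Second half of a symmetric window: decays from about 1 toward the edge.
    taper = np.kaiser(2 * length, beta)[length:]
    return out * taper


def augment(audio, dir_ir):
    """Convolve a mono signal with a DIR and keep the original length."""
    audio = np.asarray(audio, dtype=float)
    return np.convolve(audio, dir_ir)[: audio.shape[0]]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clip = rng.standard_normal(32000)       # one second at 32 kHz
    raw_dir = rng.standard_normal(4000) * np.exp(-np.arange(4000) / 200.0)
    short_dir = window_dir(raw_dir)
    augmented = augment(clip, short_dir)
    print(short_dir.shape, augmented.shape)
```

Keeping the output the same length as the input clip is a design choice made here so that augmented clips remain one second long; the convolution tail is simply discarded.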
Conference Paper
Acoustic Scene Classification poses a significant challenge in the DCASE Task 1 TAU22 dataset, where each sample is only a single second long. The best-performing model in the 2023 challenge achieves an accuracy of 62.7%, with a gap to unseen devices of approximately 10%. In this study, we propose a novel approach using Inverse Contrastive Loss to ensure a device-class-invariant latent representation and better generalization to unseen devices. We evaluate the interaction of this contrastive learning approach with impulse response augmentation and show its effectiveness for suppressing device-related information in the encoder structure. Results indicate that both contrastive learning and impulse response augmentation improve generalization to unseen devices. Further, the impulse response dataset should have a balanced frequency response to work effectively. Combining contrastive learning and impulse response augmentation yields embeddings with the least device-related information, but does not improve scene classification accuracy compared to augmentation alone.