A single frame of the video generated from Chopin Nocturne Op. 9 No. 1. The video shows the estimated line of music and uses a red cursor to indicate the predicted location.

Source publication
Article
This article studies the problem of generating a piano score following video from an audio recording in a fully automated manner. This problem contains two components: identifying the piece and aligning the audio with raw sheet music images. Unlike previous work, we focus primarily on working with raw, unprocessed sheet music from IMSLP, which may...

Context in source publication

Context 1
... we can create the score following video by (i) playing the audio in the background, (ii) showing the estimated line of sheet music at each time instant, and (iii) optionally overlaying a cursor on the sheet music line to show the exact predicted position in the score. Figure 5 shows an example frame from a video. ...
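
As a rough illustration of steps (ii) and (iii), the sketch below looks up which line of sheet music to show and where to place the cursor for each video frame. It is not the authors' implementation: the keypoint format `(time_sec, line_index, x_frac)`, the function names `cursor_at` and `frame_positions`, and the use of linear interpolation between keypoints are all assumptions made for this example.

```python
from bisect import bisect_right


def cursor_at(t, keypoints):
    """Return (line_index, x_frac) for playback time t in seconds.

    keypoints is a hypothetical alignment output: a time-sorted list of
    (time_sec, line_index, x_frac), where x_frac in [0, 1] is the horizontal
    position within the line of sheet music.
    """
    times = [k[0] for k in keypoints]
    i = bisect_right(times, t) - 1  # last keypoint at or before t
    if i < 0:  # before the first keypoint
        return keypoints[0][1], keypoints[0][2]
    if i >= len(keypoints) - 1:  # at or after the last keypoint
        return keypoints[-1][1], keypoints[-1][2]
    t0, line0, x0 = keypoints[i]
    t1, line1, x1 = keypoints[i + 1]
    if line1 != line0:
        # Next keypoint is on another line: hold the cursor until that line starts.
        return line0, x0
    frac = (t - t0) / (t1 - t0)  # t0 <= t < t1, so the denominator is positive
    return line0, x0 + frac * (x1 - x0)


def frame_positions(keypoints, duration, fps=30):
    """Compute (line_index, x_frac) for every frame of a video of given duration."""
    n_frames = int(duration * fps)
    return [cursor_at(f / fps, keypoints) for f in range(n_frames)]
```

A renderer would then crop the line image selected by `line_index` and draw the cursor at `x_frac` times the line width, with the audio played as the soundtrack.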

Citations

... At the same time, the composing process is also complex; therefore, music lovers cannot obtain a high number of network-required composition resources that makes composing more difficult. For this reason, this paper uses deep learning algorithm in the context of piano score recognition to solve this problem [3]. ...
Article
The piano is a stringed keyboard instrument in which the strings are struck by wooden hammers with a soft coating. The score providing music for the piano, often a compressed transcription of orchestral music, is referred to as a piano score. Presently, the Internet is overflowing with music score resources. With so many resources available, professional learners and amateur music lovers struggle to identify and obtain music score information that matches their needs, and they waste valuable time. Owing to the rapid development of deep learning algorithms, some researchers use these algorithms to detect piano scores and construct composition systems, reducing the reliance of traditional machine learning algorithms on manual design and music knowledge guidelines. This paper uses a deep learning algorithm to construct a piano score recognition framework based on the K-Nearest Neighbor (KNN) algorithm and formulates the recognition system as multinote, which significantly improves the recognition rate of the system. A self-attention mechanism is then introduced to build a composition system based on a deep learning algorithm, and the composition training and processes are described. Finally, a comparative experiment is conducted to evaluate the recognition accuracy of the KNN-based piano score recognition system. The results show that the highest recognition accuracy of this system is 67.5%. The effect of the composition system is evaluated based on the prediction accuracy of notes. Three experiments are conducted to train the composition notes. The prediction accuracy of experiments 1, 2, and 3 is 89.2%, 91.8%, and 92.7%, respectively, indicating that the system has a high prediction accuracy and a perfect composition effect.
... Clearly, this representation discards a lot of information contained in the sheet music, including non-filled noteheads (e.g., half notes or whole notes), note duration, accidentals, key signature, time signature, clef changes, and octave markings. Nonetheless, it has been used effectively for a wide range of tasks such as sheet-MIDI retrieval [25][26][27], sheet music identification [28,33], and audio-sheet synchronization [39,40]. Overview of constructing generalized n-gram fingerprints from sheet music. ...
Article
This paper studies the problem of identifying piano music in various modalities using a single, unified approach called marketplace fingerprinting. The key defining characteristic of marketplace fingerprinting is choice: we consider a broad range of fingerprint designs based on a generalization of standard n-grams, and then select the fingerprint designs at runtime that are best for a specific query. We show that the large-scale retrieval problem can be framed as an economics problem in which a consumer and a store interact. In our analogy, the runtime search is like a consumer shopping in the store, the items for sale correspond to fingerprints, and purchasing an item corresponds to doing a fingerprint lookup in the database. Using basic principles of economics, we design an efficient marketplace in which the consumer has many options and adopts a rational buying strategy that explicitly considers the cost and expected utility of each item. We evaluate our marketplace fingerprinting approach on four different sheet music retrieval tasks involving sheet music images, MIDI files, and audio recordings. Using a database containing approximately 375,000 pages of sheet music, our method is able to achieve 0.91 mean reciprocal rank with sub-second average runtime on cell phone image queries. On all four retrieval tasks, the marketplace method substantially outperforms previous methods while simultaneously reducing average runtime. We present comprehensive experimental results, as well as detailed analyses to provide deeper intuition into system behavior.
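
For readers unfamiliar with the metric quoted above, mean reciprocal rank (MRR) averages, over all queries, the reciprocal of the rank at which the correct item first appears in the returned list. The sketch below is a generic illustration of that definition, not code from the paper; the function name, the piece identifiers, and the convention of scoring a miss as 0 are assumptions.

```python
def mean_reciprocal_rank(ranked_results, ground_truth):
    """Average of 1/rank of the correct item across queries.

    ranked_results: one ranked list of candidate IDs per query (best first).
    ground_truth:   the correct ID for each query, in the same order.
    A query whose correct ID is missing from its list contributes 0 (assumed).
    """
    if not ground_truth:
        return 0.0
    total = 0.0
    for results, truth in zip(ranked_results, ground_truth):
        for rank, item in enumerate(results, start=1):
            if item == truth:
                total += 1.0 / rank
                break
    return total / len(ground_truth)


# Hypothetical piece IDs: correct answer at rank 1, at rank 3, and missing.
example = mean_reciprocal_rank(
    [["a", "b"], ["x", "y", "z"], ["p"]],
    ["a", "z", "q"],
)
```

Under this definition, an MRR of 0.91 means the correct item typically appears at or very near the top of the ranked list.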