Title |
Study of Automatic Piano Transcription Algorithms based on the Polyphonic Properties of Piano Audio |
DOI |
https://doi.org/10.5573/IEIESPC.2023.12.5.412 |
Keywords |
Automatic transcription; Convolutional neural network; Piano audio; Polyphonic characteristics |
Abstract |
The polyphonic characteristics of piano audio make automatic transcription particularly challenging. This study briefly analyzes the polyphonic characteristics of piano audio and introduces three piano audio features: the short-time Fourier transform (STFT), the constant-Q transform (CQT), and the variable-Q transform (VQT). An algorithm combining a convolutional neural network (CNN) with a bidirectional gated recurrent unit (BiGRU) was developed and tested on the MAPS dataset to detect note start and end points and the fundamental tones of polyphonic audio. The results showed that the combined algorithm performed better with VQT as input than with STFT or CQT. In fundamental tone detection, CNN-BiGRU outperformed CNN and CNN-GRU, achieving a precision (P) of 97.16%, a recall (R) of 97.34%, and an F1-measure of 97.25%. The experimental results confirm that the designed automatic piano transcription algorithm is reliable and can be applied in practical music settings. |
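The three fundamental tone detection scores in the abstract are related by the standard F1 definition, the harmonic mean of precision and recall. The minimal sketch below checks that relationship for the reported figures. It assumes that "P" and "R" denote precision and recall; the function name `f1_measure` is hypothetical and is not taken from the paper.

```python
def f1_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall.

    Both arguments must use the same scale (both fractions or both percent).
    Assumption: P and R in the abstract mean precision and recall.
    """
    if precision + recall == 0:
        return 0.0  # convention when there are no true positives
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for CNN-BiGRU fundamental tone detection.
p, r = 97.16, 97.34
f1 = f1_measure(p, r)
print(f"F1 = {f1:.2f}%")  # prints "F1 = 97.25%"
```

The computed value agrees with the reported F1-measure of 97.25% to two decimal places.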