Adapting Pitch-Based Self Supervised Learning Models for Tempo Estimation

Antonin Gagneré; Slim Essid; Geoffroy Peeters

doi:10.1109/ICASSP48485.2024.10447129

Conference Papers
Year : 2024


Antonin Gagneré

Function : Author

Signal, Statistique et Apprentissage

Département Images, Données, Signal

Slim Essid

Function : Author
PersonId : 181234
IdHAL : slimessid
ORCID : 0000-0002-0028-327X
IdRef : 11025130X

Signal, Statistique et Apprentissage

Département Images, Données, Signal

Geoffroy Peeters

Function : Author
PersonId : 6738
IdHAL : geoffroy-peeters
ORCID : 0000-0001-5255-3019
IdRef : 187470472

Signal, Statistique et Apprentissage

Département Images, Données, Signal

Abstract

Tempo estimation is the task of estimating the periodicity of the dominant rhythmic pulse of a music audio signal. It is therefore closely related to dominant pitch estimation. Recently, both tasks have been addressed with self-supervised learning (SSL) so as to leverage unlabelled data for training. In this work, we study the applicability of two successful pitch-based SSL models, SPICE and PESTO, to tempo estimation. Both exploit Siamese networks in which the views fed to the two branches are generated by pitch shifting. To apply these models to tempo estimation, we represent the audio signal by the constant-Q transform (CQT) of its onset-strength function and adapt their view generation to use time stretching instead of pitch shifting, which is efficiently implemented by shifting the CQT. In a large experiment, we show that simply adapting PESTO in this way yields better results than the previous SSL approach to tempo estimation on most datasets of the reference benchmark. Further, since PESTO is lightweight and requires only a small amount of training data, we study a new learning scheme in which the downstream datasets are processed directly in an SSL fashion (without access to labels), and show that this is an interesting alternative that further improves performance on some datasets.
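The equivalence between time stretching and a CQT shift mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 30 BPM lower bound, and the 24 bins per octave are all assumptions chosen for illustration, and a one-hot vector stands in for a real CQT frame of an onset-strength function. The only fact it relies on is that, on a logarithmically spaced axis, multiplying every periodicity by a rate r moves the content by bins_per_octave * log2(r) bins.

```python
import numpy as np


def log_tempo_axis(t_min=30.0, n_bins=96, bins_per_octave=24):
    """Hypothetical log-spaced tempo axis (in BPM), as a CQT would provide."""
    return t_min * 2.0 ** (np.arange(n_bins) / bins_per_octave)


def stretch_to_bin_shift(rate, bins_per_octave=24):
    """Number of bins by which a time stretch of factor `rate` moves the content.

    Speeding the signal up by `rate` multiplies every tempo by `rate`,
    i.e. adds log2(rate) octaves on a logarithmic axis.
    """
    return int(round(bins_per_octave * np.log2(rate)))


def shift_bins(spectrum, k):
    """Shift a 1-D spectrum by k bins, zero-padding the vacated bins."""
    out = np.zeros_like(spectrum)
    if k > 0:
        out[k:] = spectrum[:-k]
    elif k < 0:
        out[:k] = spectrum[-k:]
    else:
        out[:] = spectrum
    return out


# Toy "view": a single peak at 120 BPM instead of a real CQT frame.
tempi = log_tempo_axis()
view = np.zeros_like(tempi)
view[np.argmin(np.abs(tempi - 120.0))] = 1.0

# Simulate a 2x speed-up by shifting rather than by resampling the audio.
k = stretch_to_bin_shift(2.0)
shifted = shift_bins(view, k)

peak_before = tempi[np.argmax(view)]
peak_after = tempi[np.argmax(shifted)]
print(f"shift of {k} bins: {peak_before:.1f} BPM -> {peak_after:.1f} BPM")
```

In a Siamese setup of the kind the abstract describes, the original and shifted frames would play the role of the two views, with the known shift k serving as the training signal in place of a tempo label.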

Keywords

Tempo estimation; Self-supervised learning; Training; Adaptation models; Zero-shot learning; Estimation; Training data; Transforms

Domains

Artificial Intelligence [cs.AI]

Main file

icassp__USING_PITCH_BASED_SUPERVISED_LEARNING_MODEL_FOR_TEMPO_ESTIMATION.pdf (396.71 KB)

icassp__USING_PITCH_BASED_SUPERVISED_LEARNING_MODEL_FOR_TEMPO_ESTIMATION (1).pdf (396.71 KB)

yfhkcyrykftdmzrkxgjfgbsgpvntbzcs.zip (1.78 MB)

Origin : Files produced by the author(s)

Contributor : Antonin Gagnere

https://hal.science/hal-04544157

Submitted on : Friday, April 26, 2024, 3:55:00 AM

Last modification on : Tuesday, April 30, 2024, 11:54:00 AM

Dates and versions

hal-04544157 , version 1 (26-04-2024)

Identifiers

HAL Id : hal-04544157 , version 1
DOI : 10.1109/ICASSP48485.2024.10447129

Cite

Antonin Gagneré, Slim Essid, Geoffroy Peeters. Adapting Pitch-Based Self Supervised Learning Models for Tempo Estimation. ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr 2024, Seoul, South Korea. pp.956-960, ⟨10.1109/ICASSP48485.2024.10447129⟩. ⟨hal-04544157⟩


Collections

INSTITUT-TELECOM GENCI LTCI IDS S2A IP_PARIS
