Domain Adaptation of End-to-end Speech Recognition in Low-resource Settings

October 5, 2022

End-to-end automatic speech recognition (ASR) has simplified the traditional ASR system-building pipeline by eliminating the need for multiple components and for the expert linguistic knowledge required to create pronunciation dictionaries. End-to-end ASR is therefore well suited to building systems for new domains. However, one major drawback is that it requires a larger amount of labeled speech than traditional methods. In this paper, we therefore explore domain adaptation approaches for end-to-end ASR in low-resource settings. We show that three methods improve performance on a low-resource target domain: joint domain identification and speech recognition, achieved by inserting a domain symbol at the beginning of the label sequence; factorized hidden layer adaptation; and a domain-specific gating mechanism. Furthermore, we show that the proposed adaptation methods are robust to an unseen domain when only 3 hours of untranscribed data are available, with relative improvements of up to 8.7%.
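The idea of inserting a domain symbol at the beginning of the label sequence can be illustrated with a minimal sketch. The token names, the `DOMAIN_TOKENS` table, and both helper functions below are hypothetical and chosen only for illustration; the abstract does not specify the symbol inventory or how the paper implements this step.

```python
from typing import List, Optional, Tuple

# Hypothetical domain symbols; the real inventory is not given in the abstract.
DOMAIN_TOKENS = {"news": "<news>", "call_center": "<call_center>"}


def add_domain_token(labels: List[str], domain: str) -> List[str]:
    """Return a new training target with the domain symbol placed first."""
    return [DOMAIN_TOKENS[domain]] + list(labels)


def split_domain_token(hypothesis: List[str]) -> Tuple[Optional[str], List[str]]:
    """Split a decoded sequence into (domain symbol or None, transcript)."""
    if hypothesis and hypothesis[0] in DOMAIN_TOKENS.values():
        return hypothesis[0], hypothesis[1:]
    return None, list(hypothesis)


# The model is trained to emit the domain symbol before the transcript,
# so domain identification and recognition share one output sequence.
target = add_domain_token(["h", "e", "l", "l", "o"], "news")
domain, transcript = split_domain_token(target)
```

Under this assumption, the first output symbol serves as the domain prediction and the remaining symbols form the transcript, so no separate domain classifier is needed.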

Dr. Albert Lam

Chief Scientist & CTO

B.Eng. (2005), Ph.D. (2010), HKU. Senior Member of IEEE. Croucher research fellow. Adjunct Assistant Professor in EEE, HKU. Post-doc, UC Berkeley. Research Assistant Professor, HKBU and HKU.
