The MLLP was at the session “Iberspeech-RTVE-Challenge 2018: desarrollo y análisis de resultados” in Madrid

The MLLP’s Alfons Juan attended the session “Iberspeech-RTVE-Challenge 2018: desarrollo y análisis de resultados” (“development and analysis of results”), held on 14 November at the Real Academia de Ingeniería (Royal Academy of Engineering of Spain) in Madrid and organized with Radiotelevisión Española (RTVE) and the Universidad de Zaragoza.

Having seen the full list of participants in the #IberspeechRTVEChallenge2018, we can report that it was a success, with strong Spanish and international teams taking part from both industry and academia.

The MLLP research group of the Universitat Politècnica de València participated in the #IberspeechRTVEChallenge2018 together with RWTH Aachen University’s i6 Human Language Technology and Pattern Recognition Group.

Next week the MLLP will be at IberSPEECH 2018 in Barcelona (November 21–23) to present the accepted paper describing this joint effort: “MLLP-UPV and RWTH Aachen Spanish ASR Systems for the IberSpeech-RTVE 2018 Speech-to-Text Transcription Challenge”, by Javier Jorge, Adrià Martínez-Villaronga, Pavel Golik, Adrià Giménez, Joan Albert Silvestre-Cerdà, Patrick Doetsch, Vicent Andreu Císcar, Hermann Ney, Alfons Juan and Albert Sanchis. The paper’s abstract reads:

This paper describes the Automatic Speech Recognition systems built by the MLLP research group of Universitat Politècnica de València and the HLTPR research group of RWTH Aachen for the IberSpeech-RTVE 2018 Speech-to-Text Transcription Challenge. We participated in both the closed and the open training conditions. The best system built for the closed conditions was a hybrid BLSTM-HMM ASR system using one-pass decoding with a combination of an RNN LM and show-adapted n-gram LMs. It was trained on a set of reliable speech data extracted from the train and dev1 sets using the MLLP’s transLectures-UPV toolkit (TLK) and TensorFlow. This system achieved 20.0% WER on the dev2 set. For the open conditions, we used approx. 3800 hours of out-of-domain training data from multiple sources and trained a one-pass hybrid BLSTM-HMM ASR system using the open-source tools RASR and RETURNN developed at RWTH Aachen. This system scored 15.6% WER on the dev2 set. The highlights of these systems include robust speech data filtering for acoustic model training and show-specific language modelling.
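For readers less familiar with the WER (word error rate) figures quoted in the abstract, here is a minimal sketch of how the metric is usually computed: the word-level edit distance (substitutions, deletions and insertions) between the reference transcript and the system output, divided by the number of reference words. The function name `word_error_rate` and the toy inputs are our own illustrative choices, not part of TLK, RASR or RETURNN, and the official challenge scoring may apply text normalization that this sketch omits.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration only; real evaluations usually
    normalize case, punctuation and numbers before scoring.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Toy example: one substitution (b -> x) and one deletion (d)
    # against a five-word reference.
    print(f"{word_error_rate('a b c d e', 'a x c e'):.1%}")
```

Under this definition, a 20.0% WER means roughly one word-level error for every five reference words; because insertions count as errors, WER can exceed 100% on very poor output.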

We look forward to learning the final results of the challenge next week at IberSPEECH 2018! See you there!