Europarl-ST is a multilingual speech translation corpus containing paired audio-text samples, built from the debates held in the European Parliament between 2008 and 2012. https://mllp.upv.es/europarl-st/
Europarl-ST is a multilingual Spoken Language Translation (SLT) corpus containing paired audio-text samples for translation from and into 9 European languages, for a total of 72 translation directions. The corpus was compiled from the debates held in the European Parliament between 2008 and 2012.
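The figure of 72 directions follows from counting ordered (source, target) pairs of distinct languages: 9 × 8 = 72. A minimal sketch of that count is shown here; the labels `lang1` to `lang9` are placeholders chosen for illustration, not the corpus's actual language codes, which are not listed in this README.

```python
from itertools import permutations

# Placeholder labels (hypothetical): this README does not list the language codes.
languages = [f"lang{i}" for i in range(1, 10)]

# A translation direction is an ordered (source, target) pair of distinct languages.
directions = list(permutations(languages, 2))

# Each of the 9 source languages pairs with the 8 remaining targets.
print(len(directions))
```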
Citation:
@inproceedings{jairsan2020a,
author={J. {Iranzo-Sánchez} and J. A. {Silvestre-Cerdà} and J. {Jorge} and N. {Roselló} and A. {Giménez} and A. {Sanchis} and J. {Civera} and A. {Juan}},
title={Europarl-ST: A Multilingual Corpus for Speech Translation of Parliamentary Debates},
booktitle={Proc. of 2020 IEEE Intl. Conf. on Acoustics, Speech, and Signal Processing (ICASSP 2020)},
year={2020},
pages={8229-8233}
}
You can read more and download the corpus at: https://mllp.upv.es/europarl-st/