Recurrent networks for speech recognition
NCN Sonata 8 project 2014/15/D/ST6/04402, "Applications of recurrent and deep neural networks for the acoustic modelling of speech", carried out in 2015-2019.
The project aims to develop new speech recognition algorithms that directly transcribe an audio utterance into a character sequence, without relying on components of the classical speech recognition pipeline such as Hidden Markov Models.
Main project publications:
Jan K Chorowski, Dzmitry Bahdanau, Dmitriy Serdyuk, Kyunghyun Cho, and Yoshua Bengio. Attention-Based Models for Speech Recognition. In C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett, editors, Advances in Neural Information Processing Systems 28, pages 577–585. Curran Associates, Inc., 2015.
Dzmitry Bahdanau, Jan Chorowski, Dmitriy Serdyuk, Philemon Brakel, and Yoshua Bengio. End-to-End Attention-Based Large Vocabulary Speech Recognition. In 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 4945–4949, March 2016.
Michał Zapotoczny, Paweł Rychlikowski, and Jan Chorowski. On Multilingual Training of Neural Dependency Parsers. In Kamil Ekštein and Václav Matoušek, editors, Text, Speech, and Dialogue, Lecture Notes in Computer Science, pages 326–334. Springer International Publishing, 2017.
Jan Chorowski, Adrian Łańcucki, Bartosz Kostka, and Michał Zapotoczny. Towards Using Context-Dependent Symbols in CTC Without State-Tying Decision Trees. In Interspeech 2019, pages 4385–4389. ISCA, September 2019.
Michał Zapotoczny, Piotr Pietrzak, Adrian Łańcucki, and Jan Chorowski. Lattice Generation in Attention-Based Speech Recognition Models. In Interspeech 2019, pages 2225–2229. ISCA, September 2019.