PROJECT TITLE
NORDTRANS – Technology for automatic speech transcription in selected Nordic languages
PROJECT CODE
TO01000027
GRANT
Norway Grants and TA CR
Funding: 31 710 241 CZK
About the project:
The goal of this project is to create and improve automatic speech recognition (also known as speech-to-text) technology for the Norwegian and Swedish languages. The newly developed technology should operate with high accuracy in a range of applications, including online broadcast monitoring (TV, radio, internet podcasts, etc.), transcription of speeches in parliaments and similar public institutions, and mining of spoken archives.
2021
In the first year, we created a speech-to-text system for Norwegian. The main challenge was the two co-existing written standards of Norwegian: Bokmål and Nynorsk. We also improved the robustness of our system to environmental noise and cross-talk.
2022
In the second year, we added Swedish to our existing Norwegian speech recognition. Recent advances in machine learning allowed us to apply so-called end-to-end models to both languages. Although these models require large amounts of training data, they significantly outperform the older models.
2023
In the third year, we made significant progress in improving Swedish speech recognition, reaching a level that competes favorably on a global scale. Swedish and Norwegian speech-to-text even surpass current benchmarks of other commercially available systems. Current efforts are focused on Danish speech recognition, with encouraging initial results. These results lead us to believe that our speech recognition system will be competitive in all three Nordic languages and will establish the consortium as a leader in this market.
2024
In the last year of the project, we completed all project deliverables. The transcription technology in all three Nordic languages is now at a globally competitive level, in some cases even surpassing the current benchmarks of other commercially available systems (Microsoft, Google, Speechmatics or Whisper). The technology is available to the general public on the Beey platform (beey.io).
Beneficiary and project partners:
NEWTON Technologies, a.s.
Norges teknisk-naturvitenskapelige universitet (NTNU)
Technická univerzita v Liberci
The NORDTRANS project benefits from a €1.2 million grant from Norway Grants and the Technology Agency of the Czech Republic. The project is carried out under the KAPPA funding programme for applied research, experimental development and innovation, managed by the Technology Agency of the Czech Republic.