NORDTRANS - Technologická agentura ČR

PROJECT TITLE

NORDTRANS – Technology for automatic speech transcription in selected Nordic languages

PROJECT CODE

TO01000027

GRANT

Norway Grants and TA CR
Funding: 31 710 241 CZK

About the project:

The goal of this project is to create and improve automatic speech recognition (also known as speech-to-text) technology for the Norwegian and Swedish languages. The newly developed technology should operate with high accuracy in a range of applications, including online broadcast monitoring (TV, radio, internet podcasts, etc.), transcription of speeches in parliaments and similar public institutions, and spoken archive mining.

2021

In the first year, we created a speech-to-text system for Norwegian. The main challenge was the two co-existing written standards of Norwegian: Bokmål and Nynorsk. We also improved the robustness of our system to environmental noise and cross-talk.

2022

In the second year, we added Swedish to our existing Norwegian speech recognition. Recent advances in machine learning allowed us to apply so-called end-to-end models to both languages. Although these models require large amounts of training data, they significantly outperform the older models.

2023

In the third year, we made significant progress in improving Swedish speech recognition, which now competes favorably on a global scale. Swedish and Norwegian speech-to-text even surpass the current benchmarks of other commercially available systems. Current efforts focus on Danish speech recognition, and encouraging initial results lead us to believe that our speech recognition system will be competitive in all three Nordic languages and will establish the consortium as a leader in this market.

2024

In the last year of the project, we completed all project deliverables. The transcription technology in all three Nordic languages is now globally competitive, in some cases even surpassing the current benchmarks of other commercially available systems (Microsoft, Google, Speechmatics, or Whisper). The technology is available to the general public on the Beey platform (beey.io).

Beneficiary and project partners:

NEWTON Technologies, a.s.
Norges teknisk-naturvitenskapelige universitet (NTNU)
Technická univerzita v Liberci

The NORDTRANS project benefits from a €1.2 million grant from Norway Grants and the Technology Agency of the Czech Republic. The project is carried out under the KAPPA funding programme for applied research, experimental development and innovation, managed by the Technology Agency of the Czech Republic.

STARFOS