06 August 2025

Does AI understand Polish?

Automatic transcription of therapy sessions and clinical interviews in the mental health industry – quality in non-English languages

 

Automatic speech recognition (ASR) opens up new possibilities for documenting therapy sessions and clinical interviews in mental health care. For languages other than English, such as Polish, the quality of speech recognition directly affects the safety, efficiency, and working standards of specialists.

 

Advances in speech recognition quality in languages other than English

 

Analyses and studies based on authentic Polish sources show that ASR technology is achieving increasingly better quality in recognizing spontaneous and specialized speech, which is crucial in therapy and diagnostics (Pawlik, 2022; AMU Repository, 2023). In particular:

  • The best ASR systems for Polish achieve a word error rate (WER) of 8–12% for read speech and 20–25% for spontaneous and conversational speech, which is similar to the level of commercial solutions for English (Pawlik, 2022).
  • The diversity of dialects, the inflectional nature of the Polish language, and the specificity of spontaneous speech (pauses, fillers) continue to pose challenges, but continuous refinement of models and their adaptation to local corpora allows for significant improvements in effectiveness (Juszczyk, 2024).
     

"According to research, for the three main Speech-to-Text platforms (Microsoft, Google, IBM) converting speech to text for English, the average word error rate (WER) was as high as 10.98%. Nevertheless, preliminary experiments conducted by the author of this work showed that for Polish, both for MST and GST, the average WER exceeded 16%" (Pawlik, 2022, p. 13).
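To make the WER figures above concrete, here is a minimal sketch of how word error rate is typically computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the ASR output, divided by the number of reference words. The function name and the sample sentences are illustrative assumptions, not taken from the cited studies, and real evaluations usually apply more careful text normalization (punctuation, numerals) than the simple lowercasing used here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one wrong inflection and one dropped word.
reference = "pacjent zgłasza trudności z zasypianiem od dwóch tygodni"
hypothesis = "pacjent zgłasza trudność z zasypianiem od tygodni"
print(word_error_rate(reference, hypothesis))  # 2 errors / 8 words = 0.25
```

The example also hints at why inflection matters for Polish: a single wrong word ending ("trudność" instead of "trudności") counts as a full word error, even though the meaning is largely preserved.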

 

The importance of local corpora and benchmarks

 

Public benchmarks such as BIGOS and the Polish ASR Leaderboard enable transparent and systematic comparisons of the quality of various speech recognition systems in Polish across many types of recordings, from studio recordings to multi-person conversations in natural conditions (AMU, 2023; Hugging Face, 2024).

  • Thanks to them, it is possible to quickly identify the models best suited to difficult clinical and therapeutic conditions.
  • Such tools also make it possible to monitor progress and the potential need for further personalization of models for specific applications.
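As a rough illustration of how benchmark results can guide model selection, the sketch below picks the system with the lowest WER for a given recording type. The system names, recording categories, and all numbers are placeholders invented for this example; they are not results from BIGOS or the Polish ASR Leaderboard, and the data layout is an assumption rather than the format those benchmarks use.

```python
from statistics import mean

# Placeholder WER values (as fractions). These are NOT real benchmark results.
results = {
    "system_a": {"read": 0.09, "conversational": 0.24},
    "system_b": {"read": 0.12, "conversational": 0.21},
}


def best_system(results: dict, condition: str) -> str:
    """Return the name of the system with the lowest WER for one recording type."""
    return min(results, key=lambda name: results[name][condition])


def mean_wer(results: dict) -> dict:
    """Average WER per system across all recording types."""
    return {name: mean(scores.values()) for name, scores in results.items()}


# For therapy sessions, the conversational score matters more than read speech.
print(best_system(results, "conversational"))
print(mean_wer(results))
```

Selecting by recording type rather than by overall average reflects the point made above: a system that leads on read speech is not necessarily the best choice for spontaneous clinical conversations.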

 

Challenges and benefits in psychotherapy and mental health

 

In therapeutic applications, automatic transcription has unique requirements:

  • Spontaneous language, often with emotional expression and interrupted speech, requires models capable of handling irregularities and medical/psychological terminology.
  • High-quality transcription can reduce the therapist's documentation burden by as much as 50–70%, allowing more time for patient contact.

 

Commercial solutions offer tools for automatic diarization (speaker differentiation), correct interpretation of technical vocabulary, and integration with electronic medical record systems (Pawlik, 2022).

 

Recommendations and further research directions

 

ASR systems used in the context of languages other than English, such as Polish, should be:

  • regularly adapted to the specifics of a given industry and local speech,
  • supported by multimodal, multi-topic conversation corpora (Juszczyk, 2024),
  • monitored using the available BIGOS and Polish ASR Leaderboard benchmarks to ensure high standards are maintained.

 

This enables the effective use of transcription in the field of mental health, which contributes to the optimization of therapeutic processes and improvement of service quality.

 

Summary

 

Commercial ASR models that support languages other than English, when optimized and tested on Polish corpora, can deliver high-quality transcription of therapy sessions and clinical interviews. Thanks to the development of benchmark infrastructure and solid research, the quality and availability of such solutions are growing rapidly, raising the standard of documentation and analysis in mental health care.

 

Bibliography

 

 
