Corpora for Learning Serbian as a Foreign Language in the Era of Artificial Intelligence (Korpusi za učenje srpskog jezika kao stranog u eri veštačke inteligencije)

Olja Perisic
2025-01-01

Abstract

This study explores the challenges of corpus-based teaching of Serbian as a Foreign Language in the context of artificial intelligence technologies. While the future of Data-Driven Learning (DDL) in the AI era is being discussed globally, the Serbian context shows a disconnect between technological advancements, particularly those developed by IT experts from JeRTeh, and their application in foreign language teaching. The research focuses on specific linguistic areas that have been examined in previous corpus studies: polysemy (kuća–dom), lexical gaps (babine) and positive or negative word connotations (žena). The findings show that, in tasks such as distinguishing between closely related synonyms, corpus searches offer more precise and reliable results than AI because they provide frequency-based data. Generative models are useful for lexical gaps, particularly for beginner learners, though corpus-based methods yield strong results when sufficient technical knowledge is available. When analysing word connotations, however, generative models show significant limitations: they frequently avoid negative content because of embedded safety filters, which makes them unsuitable for discourse analysis. The main issue is that corpora are not sufficiently integrated into foreign language teaching in Serbia, which limits the ability to make comprehensive comparisons with AI tools. The study recommends training more teachers in corpus-based methods and promoting integrative approaches that combine corpora and generative models. It also advocates cultivating critical thinking, which is central to DDL and involves learner engagement with search engines and AI tools, as well as corpora. Ultimately, this research offers a model for the thoughtful incorporation of AI into language learning, particularly for less commonly taught languages such as Serbian, while preserving the strengths of corpus-based methodology.
2025
South Slavic Languages in the Digital Environment JuDig
Belgrade
21-23 November 2024
Proceedings of the International Conference South Slavic Languages in the Digital Environment JuDig
University of Belgrade – Faculty of Philology, Studentski trg 3, Belgrade, Serbia
pp. 279-293
ISBN 978-86-6153-791-2
https://doi.fil.bg.ac.rs/volume.php?pt=eb_ser&issue=judig-2025-1&i=16
corpora, Data Driven Learning, Serbian as a foreign language, JeRTeh, artificial intelligence, generative models
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/2108637