The competitiveness of Slovene, but at whose expense?
With the development of GaMS, a new large language model for Slovene that will work on the same principle as ChatGPT, a more heated debate has flared up about the ethics of data mining, which appropriates authors' texts without their consent and without compensation.
Barbara Pogačnik, a representative of the Slovenian Writers' Association at the European Writers' Council, emphasizes that « the ethical standard here is a matter of legislation. The Slovenian 'original sin' occurred during the debate on and implementation of the 2019 European Directive on the Digital Single Market. Even then, representative authors' societies were not invited to the public debate because they did not represent economic entities. »
The Centre for Language Resources and Technologies of the University of Ljubljana (CJVT UL) is currently also developing a large language model for Slovene, which, if « NUK and other stakeholders agree to allow us access to such works, » may also be trained on the copyrighted texts held by NUK. As the head of CJVT UL, Simon Krek, points out, it is a collective decision of the speakers of Slovene « to what extent we want artificial intelligence to know Slovene or to know the Slovenian environment and culture. » The main purpose of the project is to « ensure that the Slovene language will have adequate support in the digital environment », but under current European legislation, language models are being introduced into the digital environment at the expense of the authors whose works are kept.