Natural language models are a technology associated with Artificial Intelligence (AI) that is increasingly being used within society to perform various tasks. Traditional tasks include spelling auto-correct, speech recognition (audio-to-text conversion) and machine translation, but the models are becoming increasingly powerful and their sphere of operation ever wider. These models are able to identify patterns and hidden insights in data sets too large for humans to manage. While they can be put to good use, such as extracting insights from health data, they can also pose a danger to society when they fall into the wrong hands or are used for unintended purposes.

Natural language models belong to the field of Applied Language Technology, which concerns how computers and other digital devices analyse, produce, modify and respond to human texts and speech. At the heart of these models lie advanced algorithms that, having learned the regularities of a specific natural language, are then able to apply them not only to predict text but also to produce new text.

A language model that is gaining attention is called ‘GPT-3’ (Generative Pre-trained Transformer 3). Developed by a private company, this language model uses 175 billion machine learning parameters in its operations. The unique aspect of GPT-3, besides extracting knowledge from texts, is its ability to produce texts of such high quality that it can be very difficult to tell whether they were written by a human or a machine. This can pose multiple challenges in many sectors. In the university context, for example, how do we know that the texts students produce have not been written by a natural language model? Would plagiarism detection systems be able to detect this? To what extent could language models trick the AI analytical tools being used in higher education to gauge the performance of students?

Helping us to answer these questions and many more are our distinguished speakers, Jussi Karlgren and Magnus Sahlgren. They will help us to understand what natural language models are, how they work, what advantages they offer and what potential risks their increased use within society brings.


Jussi Karlgren researches language use and stylistic variation, and how they can be represented and used to help people find what they want to read or listen to. He is an associate professor of language technology at the University of Helsinki and a principal research scientist at Spotify.
Magnus Sahlgren is a computational linguist whose research centres on questions about what it means to understand language, and how we can build machines with that capacity. Sahlgren has worked on computational models of meaning for the last 20 years, and he currently leads the research on natural language understanding and language models at RISE and at AI Sweden.


Please note that this seminar will now take place in a hybrid format. Besides the opportunity to attend via Zoom, there will be a limited number of places for physical attendance. Should you wish to attend physically, please register with Stanley Greenstein by 12:00 on Wednesday, October 27. We will be at the Department of Computer and Systems Sciences (DSV); the address is Borgarfjordsgatan 12, Kista (The Nod building). Please take elevator E to the 3rd floor and wait in the DSV waiting room to be let in.

This seminar is arranged by DHV-hub, a meeting place for those interested in the Digital Human Sciences at Stockholm University. The DHV seminars are interdisciplinary in nature and are open to all scholars interested in digital artefacts and environments and their significance for society and humanity. Please feel free to share this invitation with anyone who might be interested.