New personalities in digital languages

© Matheus Bertelli


People tend to interact with digital entities as if they were at a teller window, which is why communication with them is expected to become more personal and human-like. Fabio Crestani, a professor at the Faculty of Informatics, discussed this idea in an article written in collaboration with laRegione.

For a long time, humans understood the technology they relied on in daily life. They may not have grasped every detail, but they had a general idea of how it worked. With the arrival of the modern age, this understanding did not vanish; rather, it grew increasingly complex and became divided into specialised fields. Today, our lives are filled with tools and technologies that we frequently use without fully understanding how they work, or sometimes without even realising they exist. One notable example is Large Language Models (LLMs), a form of programming that predates artificial intelligence (AI) and is now used, often without our knowledge, in sectors ranging from word processing, media, and entertainment to healthcare and finance, serving a wide range of purposes. The developments in this field, as discussed in this interview with Fabio Crestani, a professor at the USI Faculty of Informatics, are both surprising and imaginative. Professor Crestani's main research areas include information retrieval, text mining, and digital libraries.

Let us start with the basics: what is a Large Language Model?

To simplify, a Large Language Model (LLM) is designed to generate sentences. When I start a sentence with two or three words, the model can predict what I want to say and uses mathematical structures to guess the next word. The longer the sequence of words, the more effective the model becomes. This is because one word can be paired with almost anything, but the options narrow significantly when two words are used, and even more so with three. The process is called 'autoregressive', meaning that each generated word becomes part of the context for predicting the next one. To create an LLM, extensive calculations are performed based on intensive training using a vast amount of written and spoken texts.
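The narrowing effect Professor Crestani describes can be sketched with a toy n-gram counter. This is a deliberately simplified stand-in for a real LLM (which uses neural networks, not raw counts), and the tiny corpus is invented for illustration: with one word of context several continuations are possible, but two words of context leave far fewer.

```python
from collections import Counter, defaultdict

# A toy corpus standing in for the "vast amount of written text"
# a real LLM is trained on.
corpus = "the train to milan the bus to rome i want to sleep".split()

def build_model(tokens, n):
    """Count which word follows each context of n-1 words."""
    model = defaultdict(Counter)
    for i in range(len(tokens) - n + 1):
        context = tuple(tokens[i:i + n - 1])
        model[context][tokens[i + n - 1]] += 1
    return model

def candidates(model, context):
    """Possible next words for a context, most frequent first."""
    return [w for w, _ in model[tuple(context)].most_common()]

bigram = build_model(corpus, 2)   # context = 1 word
trigram = build_model(corpus, 3)  # context = 2 words

# One word of context leaves three options...
print(candidates(bigram, ["to"]))
# ...while two words of context leave only one.
print(candidates(trigram, ["train", "to"]))
```

Each predicted word would then be appended to the context before predicting the next one, which is exactly what 'autoregressive' means.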

Training similar to that of Artificial Intelligence?

Yes, the Large Language Model (LLM) is a component of artificial intelligence, although LLMs were developed before the concept of AI became widespread. Initially, they were simply mathematical models used for speech recognition. They are still employed for this purpose; for example, if my system struggles to understand a word due to background noise, the LLM can suggest what the word might be, significantly narrowing down the options. In recent years, these models have evolved and can now perform a wide variety of tasks.
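The speech-recognition use he mentions can be illustrated with a small sketch. The scores and words below are hypothetical: the acoustic side cannot decide between similar-sounding candidates, and a language model breaks the tie by scoring which word fits the context.

```python
# Hypothetical acoustic scores for a word obscured by background noise:
# the recogniser hears three similar-sounding candidates almost equally.
acoustic = {"wreck": 0.34, "recognise": 0.33, "wrecks": 0.33}

# Hypothetical language-model probabilities for the word that follows
# a context like "hard to ...".
lm = {"recognise": 0.6, "wreck": 0.05, "wrecks": 0.05}

def rescore(acoustic, lm):
    """Pick the candidate whose combined score is highest.

    Real systems multiply probabilities (or add log-probabilities),
    usually with a weighting factor between the two models.
    """
    return max(acoustic, key=lambda w: acoustic[w] * lm.get(w, 1e-6))

print(rescore(acoustic, lm))  # → 'recognise'
```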

So when we use Siri on our mobile phone, does it use a large language model to understand what we say?

Today, complex models are increasingly used, all of which are based on LLMs. These models not only excel in speech recognition and writing but are also applied in many other areas, since nearly everything can be translated into a language format, essentially into inputs and outputs. One of my areas of research is information retrieval, i.e., the retrieval of information via search engines such as Google, Yahoo, and Bing. In the past few years, it has become clear that people enjoy using conversational systems. As a result, search engines have begun to adopt Large Language Models, because users want searches to feel more like a conversation with another person. For example, when I request information, the system not only provides results but can also engage with me by asking for more details about what I am looking for. Imagine I type "train to Milan." The system may not know my intended travel date, so it could ask me for that information. Additionally, it might suggest the cheapest train options, creating a dialogue similar to what I would experience at a ticket counter. While this example is straightforward, these interactions could become more complex in many ways.
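The "train to Milan" exchange is an example of what dialogue systems call slot filling. The sketch below is a minimal, hypothetical version of that loop (the function name and slot labels are invented): the system extracts what it can from the query, asks for whatever is still missing, and only then answers.

```python
def handle_query(query, known_slots=None):
    """Hypothetical sketch of the 'train to Milan' dialogue:
    ask for missing details instead of just listing result links."""
    slots = dict(known_slots or {})
    words = query.split()
    if "to" in words:
        # Naive extraction: take what follows "to" as the destination.
        slots["destination"] = words[words.index("to") + 1]
    if "date" not in slots:
        # A ticket clerk would ask the same follow-up question.
        return slots, "When would you like to travel?"
    return slots, (f"Here are the cheapest trains to "
                   f"{slots['destination']} on {slots['date']}.")

slots, reply = handle_query("train to Milan")
print(reply)                     # the system asks for the travel date
slots["date"] = "Friday"
_, reply = handle_query("train to Milan", slots)
print(reply)                     # now it can answer the full request
```

A real conversational search system would use an LLM both to extract the slots and to phrase the follow-up question, rather than hand-written rules like these.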

And does it train like AI?

To create an LLM, we need complex mathematical models and powerful computing systems that can learn word associations from extensive volumes of data, regardless of the language. Currently, a project is being discussed to develop an LLM based on the national languages of Switzerland. To achieve this, we require Graphics Processing Units (GPUs), which are specialised processors designed for performing simple calculations at an astonishing scale, executing billions of them rapidly. Unfortunately, GPUs are expensive and generate significant heat, necessitating cooling components to manage their temperature. Fortunately, we have access to a high-performance system at the Swiss National Supercomputing Centre (CSCS) in Lugano, which is among the best in the world.

Does this mean that the algorithm that dominates our digital lives today will disappear?

Every search engine consists of two main components: the part that retrieves information and the part that displays the results. The presentation of results had not seen significant development until recently. Companies have recognised that users are frustrated with sifting through numerous pages of results, especially since many now use mobile phones. Viewing multiple links on mobile devices can be cumbersome due to their small size, and users can only see a few at once. Additionally, many applications are now used with voice commands, often while driving, eliminating the need for a screen altogether. This creates a demand for a 'conversational' interface, for which large language models are ideally suited.

Aren’t they a bit impersonal?

Yes, a little bit. However, at USI, we aim to personalise interactions so that when such a system converses with me, it will be different from how it converses with you. This personalisation will be based on what it learns about my interests, tastes, communication style, and preferences. The search engine won't disappear; it will always work in the background to find the most relevant information. The Large Language Model will then interpret that information and present it to me in a way that is tailored to my needs, creating a more human-like conversation with the search engine.
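The division of labour he describes, retrieval unchanged in the background, only the presentation personalised, can be sketched as follows. The function, profile keys, and sample result are all hypothetical placeholders; a real system would learn the user's preferred style rather than read it from a flag.

```python
def personalised_answer(results, profile):
    """Hypothetical sketch: the retrieval step is fixed; only how the
    top result is presented depends on what we know about the user."""
    top = results[0]
    if profile.get("style") == "brief":
        # A user who prefers terse answers gets just the headline.
        return top["title"]
    # Others get a more conversational phrasing of the same result.
    return f"I found this for you: {top['title']} - {top['summary']}"

# The same retrieved result, presented two different ways.
results = [{"title": "Trains to Milan", "summary": "Timetable and fares."}]
print(personalised_answer(results, {"style": "brief"}))
print(personalised_answer(results, {"style": "chatty"}))
```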

A personal assistant or a good salesman?

I would say it leans more towards the former. The system will be able to learn about me, and the more information it gathers, the more accurate its responses will be for my specific needs. We are talking about information retrieval, which encompasses techniques used to manage the representation, storage, organisation, and access to information. This field combines various disciplines, including psychology, philosophy, linguistics, and semiotics. At USI, we have a project that aims to go even further: our goal is to create not only highly personalised conversations with users but also to enable the system to monitor my emotional state. This will help it better understand the nature of my needs.

We are talking about a machine, therefore theoretically devoid of emotions, while a human being is anything but impersonal. How does one bridge the gap?

Thanks precisely to LLMs. For years, we have been able to guess a person's psychological state from how they speak, whether they are showing signs of depression, anxiety, or happiness; this is what a psychologist does today. The LLM learns to recognise sequences of words and associate them with an emotion, a feeling, or a psychological state. Starting from a general pattern, the model can learn how a specific person communicates by receiving targeted training that adjusts those patterns. By observing the user's speech patterns, the system can recognise their state of mind and may even be able to predict whether they are entering a state of depression, anxiety, or stress.
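At its crudest, associating word sequences with a psychological state looks like the sketch below. The lexicon is an invented toy; an LLM learns these associations from large amounts of labelled text instead of a hand-written word list, and can weigh context rather than single words.

```python
from collections import Counter

# Hypothetical cue words; a real model learns such associations
# from data rather than from a hand-written list.
LEXICON = {
    "tired": "depression", "hopeless": "depression",
    "worried": "anxiety", "nervous": "anxiety",
    "great": "happiness", "wonderful": "happiness",
}

def detect_state(utterances):
    """Tally emotional cues across what the user has said recently."""
    counts = Counter()
    for text in utterances:
        for word in text.lower().split():
            if word in LEXICON:
                counts[LEXICON[word]] += 1
    return counts.most_common(1)[0][0] if counts else "neutral"

print(detect_state(["I feel tired and hopeless lately", "so tired"]))
# → 'depression'
```

Tracking how these signals shift over time is what would let a system flag that a user may be sliding towards depression, anxiety, or stress.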

Next developments?

I am discussing with a peer from the philosophy department at the University of Lausanne to see if we can do something very special. As mentioned, we can customise Large Language Models; consequently, we came up with the idea of taking all the writings of a great author from the past (Pascoli, Leopardi, Kafka...), not only his published work but also his personal writings, and building an LLM that writes and speaks like him, or at least contains a reflection of him. It is not so much about hearing him speak, since large language models do not replicate a voice, but rather about how they express themselves. The real interest lies in observing the personality of these models, which can provide insight into the character of the author in question.

So in a few years, we will open our search engine, and there will be a voice, shop-assistant style, saying: "Good morning, sir; how can I help you?" Will we ever get to a computer or language system with real interaction with people?

In my opinion, we are quite close to a significant technological advancement. Over the past few decades, we have evolved from computers to personal computers and now to smartphones, which accompany us everywhere and are aware of nearly everything we do. This level of personalisation in our interactions suggests that, at some point, people may not want to replace their mobile phones. The reason is that the experiences we have through our devices will be closely tied to our personalities, making it difficult to adapt to a different phone and its unique interactions.

They won’t be people yet, but they will have their own personalities.

Yes, you could almost say they are developing it.

Written and published in cooperation with laRegione.