
This is a post by Patrick Connolly (University of Barcelona).
Tech innovation generally arrives on a predictable wave of hype and hyperbole; we’re used to this by now. It serves the economic interests of the companies proclaiming their latest developments to spread the idea that what they have created is a revolutionary breakthrough that will lead us towards utopia.
Indeed, the overriding economic model of tech is based upon generating publicity in order to encourage investment and the creation of multi-billion-dollar stock market valuations for companies that lose billions of dollars every year.
Sometimes amongst the hype, though, genuine revolutionary developments emerge. And large language models (LLMs), developed using parasitical trawls of billions of items of internet data, appear to be capable of some impressive feats. Not least in how they can sustain a conversation with a human.
Philosophers have long been interested in questions relating to
so-called ‘artificial intelligence’, and projects in the philosophies of mind, language and science, and in epistemology and ethics, have emerged to examine them. Here I want to touch upon one such issue in what we might call the philosophy of conversation. The basic problem arises from considering LLM chatbots in two different ways. First, when we consider their design and function as disembodied data models, they don’t seem to be the type of thing that has what we might want to call intentions or beliefs. However, when we consider them as conversational participants, and in particular how they perform in conversation, they do seem to be capable of holding and sustaining interactions that look very much like conversations.

The problem arises because most of our theories and frameworks of conversation, and our explanations of how conversation works, are rooted in the expectation that its participants have capacities for intention, belief or knowledge. This problem has a number of extensions. There’s a navel-gazing import for those of us interested in theories of conversation, in that it challenges some of our underlying assumptions. But it also has wider implications at the societal level, because how we answer it will have a bearing on the social status of LLMs, and on how we should regard chatbots in our wider social lives.
To flesh this out a little, let’s first consider the first part of the problem – that the design and architecture of LLMs seem to preclude them from possessing capacities for intention, belief or knowledge. The LLM chatbots most of us are familiar with right now, such as GPT, Copilot (a version of GPT-4 with access to the internet), Claude and Gemini, are complex statistical machines which use neural networks and transformer architectures to process and generate human-like responses to prompts provided by a user. For its part, the LLM component predicts the most likely next word or sequence of words in a string based on statistical patterns in its training data, while the broader chatbot system combines the LLM’s outputs with user inputs to create conversation-like interactions. These models are mathematically explainable, and so at the level of their underlying design it seems at least plausible to say that the chatbots they are part of are not the type of thing we would ordinarily expect to possess capacities like having intentions, beliefs, or what we’d consider to be knowledge. Now, whether LLMs do have such capacities is by no means a settled question. It could be, perhaps, a lack of imagination about how such capacities could emerge from LLMs that results in such a conclusion. However, as things stand with what we know right now about their basic operations, remaining sceptical seems at least a fair position.
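To make that division of labour a little more concrete, here is a minimal, purely illustrative sketch in Python. It is not how a transformer-based LLM actually works: in place of a neural network it uses a toy bigram model that picks the statistically most likely next word from a tiny invented corpus, with a thin ‘chatbot’ layer that combines the user’s input with the model’s continuation. All of the names and data are invented for the example; the point is only to separate the two roles: next-word prediction on one side, conversational packaging on the other.

```python
# Toy illustration only: a bigram "language model" plus a thin chat wrapper.
# Real LLMs use neural networks and transformer architectures; the corpus,
# names and prompt here are invented purely for the example.
from collections import Counter, defaultdict

# A tiny "training set" of text.
corpus = "the cat sat on the mat and the cat slept on the mat".split()

# "Training": count how often each word follows each other word.
next_word_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    next_word_counts[prev][nxt] += 1


def predict_next(word):
    """The 'LLM' role: return the statistically most likely next word."""
    if word not in next_word_counts:
        return None
    return next_word_counts[word].most_common(1)[0][0]


def generate(prompt, max_words=6):
    """Extend the prompt one predicted word at a time."""
    words = prompt.split()
    for _ in range(max_words):
        nxt = predict_next(words[-1])
        if nxt is None:
            break
        words.append(nxt)
    return " ".join(words)


# The 'chatbot' role: combine the user's input with the model's continuation
# to produce something that looks like a conversational turn.
if __name__ == "__main__":
    user_input = "the cat"
    print("User:", user_input)
    print("Bot: ", generate(user_input))
```

Real systems replace the bigram counts with a neural network trained on vast datasets, and replace the simple loop with far more elaborate prompting and fine-tuning, but the structural point is the same: prediction at the bottom, conversation-shaped packaging on top.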
On the other hand, on benchmarking tests LLM chatbots perform many discrete linguistic and reasoning tasks to a high level, with results comparable to human performance, and with a general trend of continuous improvement. We have good reasons to be sceptical about such tests taken individually and to be cautious about what they tell us. Developers of LLMs are aware of such tests, and fine-tuning to meet specific benchmarks in newer iterations of models is therefore to be expected. That said, improvements in general performance across benchmark tests also seem to correlate with the phenomena we experience when interacting with chatbots. When interacting with state-of-the-art models it certainly seems (to some extent, at least) that chatbots are capable conversationalists. They can engage in coherent, turn-taking interactions, they can ask for clarification, initiate dialogue repair, and discuss a range of topics. If this is the case, we might therefore expect that our theories of conversation should be able to accommodate the language use of machine conversational participants. The problem arises when considering theories of conversation that rely on language users having certain cognitive capacities, like intentions and beliefs. To take just one example, Paul Grice’s theory of conversational implicature requires that speakers and audiences have beliefs about each other’s mental states and intentions.
And so, if chatbots lack these capacities, it’s unclear how they could generate or comprehend implicatures within such a theory; yet when we interact with them, they often seem to do just that.
If this is the case, it leaves us with three potential directions to go in. First, we might want to reject the conjecture that chatbots lack intentions, beliefs, and knowledge. We might argue that such capacities emerge from the complexity of LLMs, or that, despite their underlying architecture, chatbots exhibit functional equivalents of these capacities that allow them to participate in meaningful conversation. Failing that, we might want to reject the conjecture that chatbots are capable conversationalists.
It could be argued that any appearance of conversation is an illusion and that chatbot language use is ultimately meaningless. A final option might be to adjust our conception of the requirements for conversation. Rather than requiring full-blown human-like mental capacities, we could look for alternative accounts of how machines can participate in linguistic exchanges despite not possessing them. I offer no answers here; however, I do think that even just considering the problem can be insightful. It gives us a new way to reflect on one of the most fundamental activities of human sociability – conversation.
But it also offers us an opportunity to consider the changing role of technology in our lives. The family of technologies from which LLMs have developed has, to date, had its most notable impact on how we access and structure information. It has also posed new challenges for how we conceive of person-to-person communication. This new challenge, though, asks us to consider the role of a new type of actor in our conversational landscape.
To read more about this, check out Patrick’s chapter ‘Conversations with Chatbots’ in the edited volume Conversations Online: Explorations in Philosophy of Language which is forthcoming with Oxford University Press. A late version of the chapter can be found here.
Patrick Connolly is a Juan de la Cierva Postdoctoral Fellow working at the University of Barcelona and is a member of BIAP and LOGOS. He works mostly on issues in the philosophy of language and social philosophy as applied to online conversation and digital communication.