COMPLEXITY

Nature of Intelligence, Ep. 5: How do we assess intelligence?

Episode Summary

When it comes to assessing intelligence, people have all kinds of tests — the SAT, IQ tests, and so on. There’s controversy over how fairly these tests really measure human intelligence, but at the very least, we know that they correlate with some general reasoning skills when people take them. That assumption breaks down when we try to assess intelligence in non-humans. What does it mean when a large language model passes an intelligence test meant for humans? Does it actually have the same reasoning skills that a human does, or is it doing something else? In today’s episode, with guests Erica Cartmill and Ellie Pavlick, we investigate the best ways to assess intelligence in non-humans, whether animals or machines.

Episode Notes

Guests: Erica Cartmill & Ellie Pavlick

Hosts: Abha Eli Phoboo & Melanie Mitchell

Producer: Katherine Moncure

Podcast theme music by: Mitch Mignano

Follow us on:
Twitter • YouTube • Facebook • Instagram • LinkedIn • Bluesky


Episode Transcription

Complexity Season 2 Episode 5

Title: How do we assess intelligence?

Abha Eli Phoboo: The voices you’ll hear were recorded remotely across different countries, cities, and work spaces.

Erica Cartmill: I often think that humans are very egotistical as a species, right? So we're very good at particular things and we tend to place more value on the things that we're good at.

[THEME MUSIC]

Abha: From the Santa Fe Institute, this is Complexity. 

Melanie Mitchell: I’m Melanie Mitchell.

Abha: And I’m Abha Eli Phoboo.

[THEME MUSIC FADES OUT]

Melanie: As we enter our fifth episode of this season on intelligence, we’ve explored quite a few complicated and controversial ideas. But one thing has become really clear: intelligence is a murky concept. And that’s the point of this series — it’s something that we think we know when we see it, but when we break it down, it’s difficult to define rigorously. 

Abha: Today’s episode is about how we assess intelligence. When it comes to testing humans, we have all kinds of standardized measures: IQ tests, the SAT and so on. But these tests are far from perfect, and they’ve even been criticized as limited and discriminatory.

Melanie: To understand where our desire to test intelligence comes from — and also the way we talk about it as an inherent personality trait — it’s useful to look at the history of intelligence in Western society. In ancient Greece, the concept was described as “reason” or “rationality,” which then evolved into “intelligence” more broadly when the discipline of psychology arose. Philosophers like Socrates, Plato, and Aristotle highly valued one’s ability to think. And at first glance, that seems like a noble perspective. 

Abha: But Aristotle took this a step further. He used the quote unquote “rational element” as justification for a social hierarchy. He placed European, educated men at the top, and women, other races, and animals below them.

Melanie: Other Western philosophers like Descartes and Kant embraced this hierarchy too, and they even placed a moral value on intelligence. By claiming that a person or an animal wasn’t intelligent, it became morally acceptable to subjugate them. And we know how the rest of that European expansion story goes.

Abha: So today’s notions about intelligence can be traced in part to the ways men distinguished themselves from… non-men.

Melanie: Or, to give the philosophers a more generous interpretation, the history of thought around intelligence centers on the idea that it is fundamentally a human quality. 

Abha: So if intelligence, in theory, stems from humanity, how do we decide the degree to which other entities, like animals and large language models, are intelligent? Can we rely on observations of their behavior? Or do we need to understand what’s going on under the hood — inside their brains or software circuits? 

Melanie: One scientist trying to tackle such questions is Erica Cartmill.

Erica: So my name is Erica Cartmill. I'm a professor of cognitive science, animal behavior, anthropology, and psychology at Indiana University. You know, I really study cognition, particularly social cognition, and the kinds of cognition that allow communication to happen across a wide range of species.

Abha: Erica has extensive experience observing intelligent behavior in beings that are very different from humans. 

Erica: So I got the animal bug when I was a kid. And we had a whole range of different kinds of animals. It's sort of a menagerie. We had horses, we had dogs, we had a turtle, we had a parrot. And I was always sort of out you know, watching, like watching lizards and butterflies and birds, mice in our barn. And sometimes it's like I would, you know, catch a lizard, put it in a terrarium for two days, observe it, let it go again. And that kind of wanting to like observe the natural world and then have an opportunity to more closely observe it, under you might say controlled circumstances, even as a child, and then release it back into its natural environment is really something that I've continued to do as an adult in my scientific career. And that's what I do mostly with my lab now, kind of split between studying great apes and human children. But I've done work on a range of other species as well, Darwin's finches in the Galapagos. I'm doing a project now that also includes dolphins and dogs and kea, which is a New Zealand parrot. And I'm starting a dog lab at IU. So I'm excited about some of those other species, but I would say the core of my work really focuses on comparing the cognitive and communicative abilities of great apes and humans.

Melanie: Much of Erica’s research has been on the evolution of language and communication. As we’ve said before, complex language is unique to our species. But other animals communicate in many ways, so researchers have been trying to narrow down what exactly makes our language so distinct. 

Erica: So I think humans have always been really focused on this question of what separates us from other species. And for a long time, answers to that question centered around language as the defining boundary. And a lot of those arguments about language really focused on the structural features of language. And, you know, if you look at sort of the history of these arguments, you would see that every time a linguist proposed a feature of language that say, you know, human language is different because X then people would go out and study animals and they would say, “Well, you know, starlings have that particular feature” or, “A particular species of monkey has that feature.” And then linguists would sort of regroup and say, “Okay, well, actually this other feature is the real dividing line.” And, you know, I think probably the boring answer or interesting answer, depending on how you look at it, is that there probably isn't one feature. It's the unique constellation of features combined with a constellation of cognitive abilities that make language different and make it so powerful. But I will say in recent years, the focus of these arguments about “language is unique because” has shifted from language is unique because of some particular structural feature to language is unique because it is built on a very rich social understanding of other minds. It's built on inferences about others' goals, about what others know and don't know. It's built on what we call pragmatics and linguistics. So actually it's very unlike a structured program that you can sort of apply and run anywhere. It's actually something that relies on rich inferences about others' intentions.

Melanie: When we humans communicate, we’re often trying to convey our own internal thoughts and feelings, or we’re making inferences about someone else’s internal state. We naturally connect external behavior with internal processes. But when it comes to other beings, our ability to make judgments about intelligence isn’t as straightforward. 

Abha: So today we’re going to first look at what we can learn from external behavior and from applying human notions of intelligence to animals and machines. And in Part 2, we’ll focus on what researchers are doing to look under the hood in large language models, which can pass tests at levels that are deceptively similar to humans.

Abha: Part 1: Assessing Intelligence in Humans, Animals, and Machines

Abha: If you have a pet at home, you’ve probably had moments when you’ve wanted to know what it’s trying to say when it barks, meows, or squawks. We anthropomorphize pets all the time, and one of the ways we do that is by envisioning them saying things like, “I’m hungry!” or “I want to go outside!” Or we might wonder what they say to each other. 

Melanie: Animals most definitely communicate with one another. But there’s been a lot of debate about how sophisticated their communications are. Does a chimp’s hoot or a bird’s squawk always mean the same thing? Or are these signals flexible, like human words, communicating different meanings depending on context, including the animal’s understanding of the state of its listeners’ minds? In her work, Erica has critiqued the assumptions people often make in experiments testing animal communication. She’s noted that the methods used won’t necessarily reveal the possible meaning of both vocal and other kinds of signals, especially if those meanings depend on particular contexts.    

Erica: Authors recently, you know, ranging from cognitive scientists to philosophers to linguists have argued that human communication is unique because it relies on these very rich psychological properties that underlie it. But this in turn has now led to new arguments about the dividing line between humans and other animals. Which is that animals use communication that is very code-like, that you know, one animal will produce a signal and another animal will hear that signal or see that signal and decode its meaning. And that it doesn't rely on inferences about another's intentions or goals, that the signals kind of can be read into and out of the system. If you record, say, an auditory signal, like a bird call, and then you hide a speaker in a tree, and you play that call back, and you see how other birds respond. Right so this is called the playback method, unsurprisingly. And that's been one of the strongest things in the toolkit that animal communication researchers have to demonstrate that those calls in fact have particular meanings. That they're not just, I'm singing because it's beautiful, but that this call means go away and this other call means come and mate with me and this other call means there's food around, et cetera, et cetera. And so decontextualizing those signals and then presenting them back to members of the species to see how they respond is the dominant method by which scientists demonstrate that a call has a particular meaning. That's been incredibly important in you know, arguing that animals really are communicating things. But that method and the underlying model that is used to design experiments to ask questions about animal communication is also very limiting.

Abha: An auditory signal taken out of context, whether a word or an animal call, is a very narrow slice of all the different ways animals — and humans — communicate with each other.

Erica: So it's very good at demonstrating one thing, but it also closes off doors about the kinds of inferences that animals might be making. If Larry makes this call and I'm friends with Larry versus Bob makes that call and I'm enemies with Bob, how do I respond? Does Bob know that I'm there? Can he see me? Is he making that call because I am there and he sees me and he's directing that call to me? Versus, is he making that call to someone else and I'm eavesdropping on it. Those are kinds of inferences that animals can make. You know, I'm not saying all animals in all cases, but the ways that we ask questions about animal communication afford certain kinds of answers. And we need, I think, to be more, I don't know, humble is the right word? But we need to recognize the ways in which they limit the conclusions that we can draw, because this is very different from the way that we ask questions about human language. And so when we draw conclusions about the difference between human language and animal communication based on the results of studies that are set up to ask fundamentally different questions, I think that leaves a lot to be desired.   

Abha: And focusing on abilities that are relevant to humans’ intelligence might mislead us in how we think about animal intelligence.

Erica: I often think that humans are very egotistical as a species, right? So we are good at, we're very good at particular things and we tend to place more value on the things that we're good at. And I think that in many cases, that's fine. But it also really limits, often limits the way that we ask questions and attribute kinds of intelligence to other species. So it can be quite difficult, I think, for humans to think outside of the things that we're good at or indeed outside of our own senses. I mean, sort of five senses, biological senses. So elephants… we've known for a long time that elephants are able to converge at a particular location, like show up, you know, far away at this tree on this day at this time from different starting points. And people really didn't know how they were doing it. They were starting too far apart to be able to hear one another. People were like, are they planning? Do they have the sense of two Tuesdays from now we're going to meet at the watering hole? And it wasn't until people said maybe they're using senses that fall outside of our own perceptual abilities. In particular, they measured very, very low frequencies and basically asked, okay, maybe they're vocalizing in a way that we can't perceive, right? And so once they did that and greatly lowered the frequency of their recording equipment, they found that elephants were in fact vocalizing at very, very long distances, but they were doing it through this rumble vocalization that actually propagates through the ground rather than through the air. And so they produce these, I can’t imitate it because you wouldn’t hear it even if I could, but they produce these very low rumbles that other elephants, you know, kilometers away, perceive not through their ears but perceive actually through specialized cells in the pads of their feet. And so I think this is a nice example of the way that, you know, we have to, in effect, not even necessarily think like an elephant, but imagine hearing like an elephant, having a body like an elephant, thinking, I like to call it thinking outside the human. Humans are good at particular things, we have particular kinds of bodies, we perceive things on particular time scales, we perceive things at particular light wavelengths and auditory frequencies. Let's set those aside for a second and think about, okay, what did that species evolve to do? What do its perceptual systems allow it to perceive? And try to ask questions that are better tailored to the species that we're looking at.

Melanie: There's been a lot of work throughout the many decades on trying to teach human language to other species like chimps or bonobos or African gray parrots. And there's been so much controversy over what they have learned. What's the current thinking on the language abilities of these other species and those experiments in general?

Erica: It's almost hard to answer the question, what's the current thinking, because there's very little current research. A lot of that research was done, you know, 20 or even 40 years ago. Compared to the work that was being done 30 years ago, there's very little current work with apes and parrots and dolphins, all of which 30 years ago, everyone was trying to teach animals human language. And I think it was a really interesting area of inquiry. I would say people differ a little bit, but I think that probably the sort of most dominant opinion or maybe the discussion is best characterized by saying that people today, I think, believe that those animals, largely believe that those animals were able to understand, to learn, understand and productively use words, but that they were limited in the scope of the words they could learn and that they weren't combining them into productive sentences. And this was part of the argument that syntax, the combining of words according to particular rules, was something that human language did that was very different from what animals could produce. And so I think with the animal language studies that were showing largely that animals could learn words, they could sometimes produce words together, but they weren't doing it in reliable sentence-like structures.

Melanie: But do you think that the fact that we were trying to teach them human language in order to assess their cognitive abilities was a good approach to understanding animal cognition, or should we do more of what you said before, sort of take their point of view, try to understand what it's like to be them rather than train them to be more like us?

Erica: I think that's a great question. My answer probably hinges around the limitations of human imagination. Where I think that teaching animals to communicate on our terms allows us to ask better questions and better interpret their answers than us trying to fully understand their communication systems. People certainly are using things like machine learning to try to quote unquote “decode” whale song or bird song. I think that those approaches, which is more sort of on the animals’ terms or using their natural communication. And I think that those are very interesting approaches. I think they'll be good at finding patterns in what animals are producing. The question I think still remains whether animals themselves are perceiving those patterns and are using them in ways that have meaning to them. 

Abha: And the way we’ve tried to assess intelligence in today’s AI systems also hinges around the limitations of human imagination, perhaps even more so than with animals, given that by default, LLMs speak our language. We’re still figuring out how to evaluate them.

Ellie Pavlick: Yeah, I mean, I would say they're evaluated very… You know, I would say badly. 

Abha: This is Ellie Pavlick. Ellie’s an assistant professor of computer science and linguistics at Brown University. Ellie has done a lot of work on trying to understand the capabilities of large language models. 

Ellie: They're evaluated right now using the things that we can conveniently evaluate, right? It is very much a, what can we measure? And that's what we will measure. There's a lot of repurposing of existing kind of evaluations that we use for humans. So things like the SAT or the MCAT or something like that. And so it's not that those are like completely uncorrelated with the things we care about, but they're not very deep or thoughtful diagnostics. Things like an IQ test or an SAT have long histories of problems for evaluating intelligence in humans. But they also just weren't designed with models of this type being the subjects. I think what it means when a person passes the MCAT or scores well on the SAT is not the same thing as what it might mean when a neural network does that. We don't really know what it means when a neural network does it, and that's part of the problem.

Melanie: So why do you think it's not the same thing? I mean, what's the difference between humans passing a bar exam and a large language model?

Ellie: Yeah, I mean, that's a pretty deep question, right? So like I think, like I said, I tend to actually, I would say compared to a lot of my peers, be like, not as quick to say the language models are obviously not doing what humans do, right? Like I tend to reserve some space for the fact that they might actually be more human-like than we want to admit. You know, a lot of times the processes that people might be using to pass these exams might not be as deep as we like to think. So when a person, say, scores well on the SAT, we might like to think that there's like some more general mathematical reasoning abilities and some general verbal reasoning abilities. And then that's going to be predictive of their ability to do well in other types of tasks. That's why it's useful for college admission. But we know in practice that humans often are just learning how to take an SAT, right? And I think we very much would think that these large language models are mostly learning how to take an SAT.

Melanie: So just to clarify, when you say, I mean, I know what it means when a human is learning how to pass a test, but how does a language model learn how to pass a test?

Ellie: Yeah, so we can imagine like this simple setting, I think people are better at thinking about, which is like, let's pretend we just trained the language model on lots of examples of SATs. They're going to learn certain types of associations that are not perfect, but very reliable. And I used to always have this joke with my husband when we were in college about like, how you could pass a multiple choice test without having ever taken the subject. And we would occasionally try to, like I would try to pass his qualifying exams in med school, I think he took an econ exam with me. So there's certain things like, whenever there's something like “all of the above” or “none of the above,” that's more likely to be the right answer than not, because it's not always there. So it's only there when that's the right thing. Or it's a good way for the professor to test that you know all three of these things efficiently. Similarly, when you see answers like “always” or “never” in them, those are almost always wrong because they're trying to test whether you know some nuanced thing. Then there's some like, and none of these is perfect, but you can get increasingly sophisticated kind of heuristics and things like, based on the words, like this, you know, this one seems like more or less related, this seems kind of topically off base, whatever. So you can imagine there's kind of like patterns that you can pick up on. And if you stitch many, many of them together, you can pretty quickly get to, possibly perfect performance, with enough of them. So I think that's a kind of common feeling about how language models could get away with looking like they know a lot more than they do by kind of stitching together a very large number of these kinds of heuristics.
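
A rough way to picture the heuristic "stitching" Ellie describes is the toy sketch below. The cues, weights, and example question are invented for illustration only; no real model is built this way, but it shows how shallow patterns alone can often select a plausible answer.

```python
# Toy illustration (not an actual model): answering a multiple-choice question
# by stitching together shallow heuristics, with no understanding of the subject.

def heuristic_score(question: str, choice: str) -> float:
    score = 0.0
    text = choice.lower()
    # "All/none of the above" options tend to be correct more often than chance.
    if "all of the above" in text or "none of the above" in text:
        score += 2.0
    # Absolute words like "always" or "never" are usually wrong on nuanced tests.
    if "always" in text or "never" in text:
        score -= 2.0
    # Crude topical cue: count words the choice shares with the question.
    overlap = len(set(question.lower().split()) & set(text.split()))
    score += 0.5 * overlap
    return score

def answer(question: str, choices: list[str]) -> str:
    # Pick the choice with the highest stitched-together heuristic score.
    return max(choices, key=lambda c: heuristic_score(question, c))

if __name__ == "__main__":
    q = "Which of the following factors can influence enzyme activity?"
    opts = ["Temperature", "pH", "Substrate concentration", "All of the above"]
    print(answer(q, opts))  # prints "All of the above"
```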

Abha: Would it help if we knew what was going on under the hood, you know, with LLMs? We don't really actually know a whole lot about our brains either, and we don't know anything about LLMs, but would it help in any way if we sort of could look under the hood?

Ellie: I mean, that's where I'm like placing my bets. Yeah.  

Melanie: In Part 2, we’ll look at how researchers are actually looking under the hood. And many of them are trying to understand LLMs in a way that’s analogous to how neuroscientists understand the brain.

Melanie: Part 2: Going Under the Hood

Abha: Okay, so wait a minute. If we’re talking about mechanistic understanding in animals or humans — that is, understanding the brain circuits that give rise to behavior — it makes sense that it’s something we need to discover. It’s not obvious to us, in the same way that it’s not obvious how a car works if you just look at the outside of it. But we do know how cars work under the hood because they’re human inventions. And we’ve spent a lot of this season talking about how to learn more about artificial intelligence systems and understand what they’re doing. It’s a given that they’re so-called “black boxes.” But… we made AI. Human programmers created large language models. Why don’t we have a mechanistic understanding? Why is it a mystery? We asked Ellie what she thought.

Ellie: The program that people wrote was programmed to train the model, not the model itself, right? So the model itself is this series of linear algebraic equations. Nobody sat down and wrote like, “Okay, in the 118th cell of the 5,000th matrix, there'll be a point zero two,” right? Instead there's a lot of mathematical theory that says, why is this the right function to optimize? And how do we write the code? And how do we parallelize it across machines? There's a ton of technical and mathematical knowledge that goes into this. There's all of these other variables that kind of factor in, they’re very much part of this process, but we don't know how they map out in this particular thing. It's like you kind of set up some rules and constraints to guide a system. But the system itself is kind of on its own. So like if you, you're routing like a crowd through like a city or something for a parade, right? And now you come afterward and you're trying to figure out like why there's a particular cup on the ground in a particular orientation or something. It's like, but you set up, like you knew where the people were going to go. And it's like, yeah, kind of, but there's like all of this other stuff that like, it's constrained by what you set up, but that's not all that there is. There's many different ways to meet those constraints. And some of them will have some behavioral effects and others will have others, right? There's a world where everyone followed your rules and there wasn't a cup there. You know, there's a world where, like, you know, those cars crashed or didn't crash, like, and all of those other things are like, subject to other processes. So it's kind of an underspecified problem, right, that was written down. And there are many ways to fill in the details. And we don't know why we got this one that we got.
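
A minimal sketch of the distinction Ellie is drawing, using a deliberately tiny example: the code below is the kind of thing a person writes (a toy model, a loss, an update rule), while the final parameter values are whatever the optimization happens to settle on. The data and model are invented for illustration.

```python
# The human writes the *training procedure*; nobody writes the final weights.
import random

random.seed(0)
# Toy data generated from y = 3x + 1 plus noise (purely illustrative).
data = [(x, 3 * x + 1 + random.gauss(0, 0.1)) for x in [i / 10 for i in range(20)]]

w, b = 0.0, 0.0        # nobody hand-picks the values these end up with
lr = 0.1               # the human chooses things like the learning rate...
for _ in range(2000):  # ...the loss, and the update rule
    for x, y in data:
        err = (w * x + b) - y   # squared-error gradient terms
        w -= lr * 2 * err * x
        b -= lr * 2 * err

print(w, b)  # close to 3 and 1, but the exact numbers were never "written down"
```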

Melanie: So when we’re assessing LLMs, it’s not quite the same as humans because we don’t know what happens between the constraints we set up and, for example, ChatGPT’s SAT score at the end. And we don’t always know how individual people are passing the SAT either — how much someone’s score reflects their underlying reasoning abilities versus how much it reflects their ability to sort of “game” the test. But at the very least, when we see an SAT score on a college application, we do know that behind that SAT score, there’s a human being.

Ellie: We can take for granted that we all have a human brain. It's true. We have no idea how it works, but it is kind of a known entity because we've evolved dealing with humans. You live a whole life dealing with humans. So when you pick somebody to come to your university, you hire someone for a job, it's like, it's not just a thing that passes SAT, it's a human that passes SAT, right? Like that is one relevant feature. Presumably the more relevant feature is that it's a human. And so with that comes a lot of inferences you can make about what humans who pass the SAT or score a certain score probably also have the ability to do, right? It's a completely different ball game when you're talking about somebody who's not a human, because that's just not what we're used to working with. And so it's true, we don't know how the brain works, but now that you're in the reality of having another thing that's scoring well, and you have no idea how it works. To me, the only way to start to chip away at that is we need to ask if they're similar at a mechanistic level. Like asking whether like a score on the SAT means the same thing when an LLM achieves it as a human, it is like 100% dependent on how it got there.

Abha: Now, when it comes to assessing artificial intelligence, there’s another question here: How much do we need to understand how it works, or how intelligent it is, before we use it? As we’ve established, we don’t fully understand human intelligence or animal intelligence — people debate on how effective the SAT is for us — but we still use it all the time, and the students who take it go on to attend universities and have careers. 

Ellie: We use medicines all the time that we don't understand the mechanisms that they work on. And that's true. And it's like, I don't think it's like we cannot deploy LLMs until we understand how they work under the hood. But if we're interested in these questions of, “Is it intelligent?” Like just the fact that we care about that question. Like answering that question probably isn't relevant for whether or not you can deploy it in some particular use case. Like if you have a startup for LLMs to handle customer service complaints, it's not really important whether the LLM is intelligent. You just care whether it can do this thing, right? But if you want to ask that question, we're opening up this very big can of worms and I don't think you can, you can't ask the big questions and then not be willing to do the big work, right.

Melanie: And answering the question of mechanistic understanding is really big work. As in other areas of science, you have to decide what level of understanding you’re actually aiming for. 

Ellie: Right, I mean, this kind of idea of levels of description has existed in cognitive science. I think cognitive scientists talk about it a lot, which is like, kind of what is the right language for describing a phenomenon? And sometimes you can have simultaneous consistent accounts, and they really should be consistent with one another, but it doesn't make sense to answer certain types of questions at certain levels. And so I think a favorite example in cognitive sciences, I talk about like the quantum physics versus the classical mechanics, right? Like it would be really cumbersome and bizarre and highly unintuitive and we can't really do it to say like, if I roll this billiards ball into this billiards ball and try to describe it at the level of quantum mechanics, it would be an absurd thing to do and you would be missing a really important part of how physics works. And there's a lot of debate about whether you could explain the kind of billiards ball in the quantum mechanics. But the point is like there's laws at the lower level that tell you that the ball will exist. And now once you know that the ball is there, it makes sense to explain things in terms of the ball because the ball has the causal force in this thing, not the individual things that make up the ball. But you would want to have the rules that combine the small things together in order to get you to the ball. And then when you know that the ball is there, then you can just talk in terms of the ball and you don't have to appeal to the lower level things. And sometimes it just makes more sense to talk about the ball and not talk about the lower level things. And I think the feeling is we're looking for those balls within the LLM so that you can say, the reason the language model answered this way on this prompt, but when you change the period to have a space before it, it suddenly got the answer wrong. That's because it's thinking in terms of these balls, right? And if we're trying to understand it at the level of these low level things, it just seems random. If you're missing the key causal thing, it just seems random. It could be that there is no key causal thing, right? That's kind of part of the problem. I'm like thinking there is, and if we find it, this will be so cool. And the common, a legitimate point of skepticism is there might just not be one, right?

Abha: So we’re trying to find the shape and size of these “billiard balls” in LLMs. But as Ellie said, whether or not the billiard balls even exist is not certain. We’re assuming and hoping that they’re there and then going in and looking for them.

Melanie: And if we were to think about how these levels apply to humans, one way we try to gain mechanistic understanding of human intelligence is by looking inside our brains. If you think back to Ev Fedorenko’s work from our episode about language, Ev’s use of fMRI brain scanning is exactly this — she’s looked at the pathways in the brain that light up when we use language. But imagine if we were to try to go even further and describe human language in terms of the protons, electrons, and neutrons within our brain cells. If you go down to that level of detail, you lose the order that you can see in the larger brain structures. It’s not coherent.

Abha: LLMs work by performing vast numbers of matrix multiplications — at the granular, detailed level, it’s all math. And we could look at those matrix operations, in the same way we can observe the quantum mechanics of billiard balls. And they’ll probably show us that something’s happening, but not necessarily what we’re looking for.

Ellie: And maybe part of when we're very frustrated with large language models and we're like, they seem like quote “black boxes” is because that's kind of what we're trying to do, right? Like we're trying to describe these higher level behaviors in terms of the matrix multiplications that implement them, which obviously they are implemented by matrix multiplications, but like, it doesn't correspond to anything that looks like anything that we can grab onto. So I think there's like this kind of higher level description that we all want. It's useful kind of for understanding the model for its own sake. It's also really useful for these questions about like similarity to humans, right? Because the humans aren't gonna have those exact same matrix multiplications. And so it's kind of like, what are the higher level abstractions that are being represented? How are they being operated on? And that's where the similarity is likely to exist. It's like we kind of need to invent fMRIs and EEGs and we got to figure out how to do that. And I think there's, there are some things that exist. They're good enough to start chipping away and we're starting to get some interesting converging results, but they're definitely not the last word on it. So I would say one of the most popular tools that we use a lot that I think was really invented maybe back around 2019, 2020 or something is called path patching, but that paper I think called it causal mediation analysis. I think there are a lot of papers that kind of have simultaneously introduced and perfected this technique. But it basically is saying like, try to find which components in the model are like, maximally contributing to the choice of predicting A over B. So that's been a really popular technique. There have been a lot of papers that have used it and it has produced very reproducible types of results. And what you basically get is some kind of like, you can think of it kind of like an fMRI. It kind of lights up parts of the network, saying these ones are highly active in this decision. These ones are less active.
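
For readers who want a rough picture of the kind of technique Ellie mentions, here is a heavily simplified sketch of activation patching, a close relative of the path patching and causal mediation analysis she describes: copy one hidden activation from a "clean" run into a "corrupted" run and see how much of the original output it restores. The toy two-layer network below is made up; real studies do this on the components of a transformer.

```python
# Simplified sketch of activation patching on a made-up toy network.
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))   # toy first-layer weights
w2 = rng.normal(size=3)        # toy output weights

def forward(x, patch=None):
    h = np.tanh(x @ W1)        # hidden activations, shape (3,)
    if patch is not None:
        unit, value = patch
        h = h.copy()
        h[unit] = value        # overwrite one unit with the "clean" activation
    return float(w2 @ h)       # scalar output

x_clean = np.array([1.0, 0.0, 1.0, 0.0])
x_corrupt = np.array([0.0, 1.0, 0.0, 1.0])

h_clean = np.tanh(x_clean @ W1)
y_clean, y_corrupt = forward(x_clean), forward(x_corrupt)

# Patch each hidden unit in turn; units that restore more of the clean output
# "light up" as more causally important for this particular decision.
for unit in range(3):
    y_patched = forward(x_corrupt, patch=(unit, h_clean[unit]))
    restored = (y_patched - y_corrupt) / (y_clean - y_corrupt)
    print(f"hidden unit {unit}: restores {restored:.2f} of the clean-vs-corrupt gap")
```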

Abha: So then, how do we get from path patching — this fMRI for large language models — to higher-level concepts like understanding, intentions, and intelligence? We often wonder if LLMs “understand,” but what it means to “understand” something can depend on how you define it.

Melanie: Let me jump up from the matrix multiplication discussion to the highest philosophical level. So there was a paper in 2022 that was a survey of the natural language processing community. And it asked people to agree or disagree with the following statement: “Some generative models trained only on text, given enough data and computational resources, could understand natural language in some non-trivial sense.” So this is sort of like in principle, trained only on language. So would you agree or disagree with that?

Ellie: I would say maybe I would agree. To me, it feels almost trivial because I think what's nice about this question is it doesn't treat understanding like a binary. And I think that's the first place where I usually start when people ask this question. To me, a lot of the debate we're having right now is not about large language models, it's about distributional semantics, and it's whether we thought distributional semantics could go this far.

Melanie: Can you explain what distributional semantics is?

Ellie: Yeah. You know, natural language processing has just been using text. And so using this idea that like the words that occur before and after a word are a really good signal of its meaning. And so if you look at, if you get a lot of text and you cluster things based on the words they co-occur with, then yeah, cat and dog and, or like maybe dog and puppy and Dalmatian will all occur together. Cat and dog and bird and other pets will co-occur together. Zebra and elephant, those will co-occur together. And as you get bigger models and more text, the structure becomes more sophisticated. So you can cut similarity along lots of different dimensions. It's not just on one dimension, I've differentiated, you know, pets from zoo animals, but in this other dimension, I've just differentiated like carnivores from herbivores, right? So it’s obviously missing some stuff. You know, it might know a lot about cat as it relates to other words, but it doesn't know what a cat actually is, right? Like it wouldn't be able to point out a cat. It can't see. So it doesn't know what cats look like and doesn't know what they feel like.
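
The co-occurrence idea Ellie describes can be sketched in a few lines. The toy corpus and window size below are invented, and real distributional models are vastly larger, but the principle is the same: words that appear in similar contexts end up with similar vectors.

```python
# Minimal sketch of distributional semantics: represent each word by the counts
# of words that appear near it, then compare those count vectors.
from collections import Counter, defaultdict
import math

corpus = [
    "the cat chased the mouse",
    "the dog chased the cat",
    "the puppy played with the dog",
    "the zebra grazed near the elephant",
    "the elephant walked past the zebra",
]

window = 2
vectors = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for i, w in enumerate(words):
        for j in range(max(0, i - window), min(len(words), i + window + 1)):
            if i != j:
                vectors[w][words[j]] += 1   # count nearby words

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in set(a) & set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Words used in similar contexts end up with similar vectors:
print(cosine(vectors["cat"], vectors["dog"]))    # relatively high
print(cosine(vectors["cat"], vectors["zebra"]))  # lower
```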

Melanie: So I think, you know, the results of that survey were interesting. That was in 2022. So it might be different now, but half the people agreed and half the people disagreed. And so the disagreement, I think, you know, the question was, could something trained only on language in principle understand language in a non-trivial sense? And I guess it's just a kind of a difference between how people interpret the word understand. And the people who disagreed, I would say that, like what you said, these systems know how to use the word cat, but they don't know what a cat is. Some people would say that's not understanding.

Ellie: Right, I think this gets down to like, yeah, people's definition of understand and people's definition of trivial. And I think this is where I feel like it's an interesting discussion to have like over drinks or something like that, but is it a scientific discussion right now? And I often find it's not a scientific discussion. Like some people just feel like this is not understanding and other people feel like sure it is. And there's no moving their opinions because like, I don't know how you speak to that. So the way you have to speak to it is try to figure out what's really going on in humans. Assuming we all agree that humans really understand and that's the only example we all agree on. We need to figure out whether it is. And then we have to figure out what's different in the LLMs and then we have to figure out whether those differences are important or not. And I think like, I don't know, that's just a really long game. So as much as I like kind of love this question, I've increasingly gotten annoyed at having to answer it. 'Cause I'm like, I just don't feel like it's a scientific question, but it could be like, it's not like asking about the afterlife or something. It's not like outside of the realm of answerable questions.

Abha: In our previous episodes, we’ve talked about how one of the big questions around artificial intelligence is whether or not large language models have theory of mind, which researchers first started assessing with human psychology tests like the Sally-Anne scenario. 

And a second question arose out of that process: if LLMs can pass our human theory of mind tests — if they pass Sally-Anne when the details and the names are changed — are they actually doing complicated reasoning, or are they just getting more sophisticated at matching patterns in their training data? As Ellie said, she cares that we’re intentional and scientific when we say things like, an LLM “understands” or “doesn’t understand.” And yet —

Ellie: They're learning much more interesting structure than I would have guessed. So I would say my general coming into this work, I would have called myself a neural network skeptic and I still kind of view myself as that, right? Like I'm very often like get annoyed when I hear people say stuff like they understand or they think. And yet I actually feel like I spend more of my time writing papers saying like, there is interesting structure here. They do have some notion of compositionality. Or they… And I actually do use those words a lot. I really try not to in papers, but when I'm talking, I'm like, I just don't have another word for it. And it is so inefficient for me to come up with some new jargon, so I anthropomorphize like crazy in my talks and it's terrible. And I apologize, blanket at the beginning, and I keep doing it. But I would say one big takeaway is like, I'm not willing to say that they think or they understand or any of these other words, but I definitely have stopped making claims about what they obviously can't do or even obviously aren't doing, right? Because I had to eat my words a couple of times and I think it's just we understand so little that we should all just stop trying to call it and just like take a little bit of time to study it. I think that's like, okay, like we don't need an answer right now on whether they're intelligent or not. What is the point of that? You know, it's just guaranteed to be wrong. And so like, let's just take some time and figure out what we're trying to even do by asking that question and, you know, do it right. I think right now seeing LLMs on the scene, it's like too similar to humans in all the wrong kind of ways to make intelligence the right way to be thinking about this. And so I would be happy if we just could abandon the word. The problem, like I said, is then you get bogged down in a ton of jargon and like, I think we should all just be in agreement that we are in the process, and it might take a while, of redefining that word. I hope it'll get fractured up into many different words, and that a decade from now, you just won't even see that in the papers anywhere, but you will see other types of terms where people are talking about other kind of much more specific abilities.

Melanie: Well also just sort of willing to put up with uncertainty, which very few people in this field seem to be able to do.

Ellie: It would be nice if we could all just be like, let's all just wait a decade. Like I get the world wouldn't allow that, but like I wish we could just do that, right?

Abha: And Erica agrees. Her work with animals has made her pause before making assumptions about what other entities can and can’t do. 

Erica: I keep going to new talks and I sort of have an opinion and I get a new talk and then I go, well, that's really interesting. And I have to kind of revise my opinion. And I push back a lot on human scientists moving the bar on, what makes humans unique? What makes human language unique? And then I sort of find myself doing that a little bit with LLMs. And so I need to have kind of a little bit of humility in that. So I don't think they have a theory of mind, but I think demonstrating one, that they don't and two, why they don't are not simple tasks. And it's important to me that I don't just sort of dogmatically say, “Well, I believe that they don't,” right? Because I think people believe a lot of stuff about animals and then go into it saying, “Well, I believe animals don't have concepts.” And then you say, “Well, why not?” “Well, because they don't have language.” And it's like, okay. So I think that LLMs are fundamentally doing next token prediction. And I know you can build them within systems that do more sophisticated things, but they're fundamentally, to the extent that my layperson understanding, I mean, I do not build these systems, and you know much more about this than I do. But I think that they're very good at predicting the ways that humans would answer those questions based on corpora of how humans answer either exactly those questions or questions that are similar in form, that are sort of analogous, structurally and logically similar. And I mean, I've been spending quite a bit of time trying to argue that chimpanzees have a theory of mind and people are, you know, historically, I mean, now I think they're becoming a little more open to it, but like historically have been quite opposed to that idea. But we'll very readily attribute those ideas to an LLM simply because they can answer verbal questions about it. 

Abha: We’ll readily attribute human characteristics to LLMs because, unlike the chimpanzees Erica studies, they speak like us. They’re built on our language. And that makes them both more familiar to us on a surface level, and more alien when we try to figure out how they’re actually doing things. 

Melanie: Earlier, Erica described a tradeoff in studying intelligence in animals: how much do we gain by using the metrics we’re familiar with, like human language, versus trying to understand animals on their own terms, like elephants that rumble through the ground to communicate?

Abha: And we asked Ellie how this applies to large language models. Does that tradeoff exist with them too?

Ellie: Yeah, totally. To me, from the point of view of LLMs, I actually think within our lab, we do a little bit of both of these. I often talk more about trying to understand LLMs in human terms. Definitely much more so than with animals. Like, LLMs were invented to communicate with us and do things for us. So it is not unreasonable or it's not unnatural to try to force that analogy, right? Unlike elephants, which existed long before us and are doing their own thing, and they could care less and would probably prefer that we weren't there at all, right?

Melanie: On the other hand, Erica finds them more difficult to interpret, because even though they can perform on our terms, the underlying “stuff” that they’re made of is less intuitive for her than animals. 

Erica: You know, again, I'm not sure because an LLM is not fundamentally a single agent, right? It's a collective. It’s reflecting collective knowledge, collective information. I feel like I know much more how to interpret, you know, a single parrot or a single dolphin or a single, you know, orangutan, you know, performing on a task, right? How do they, sort of how do they interpret it? How do they respond? To me, that, that question is very intuitive. I know that mind might be very different from my own, but there is a mind there. There is a self. And whether that self is conscious, whether that self is aware of itself, those I think are big questions, but there is a self. There is something that was born into the world that has narrative continuity and one day will die, like we all will, right? LLMs don't have that. They aren't born into the world. They don't have narrative continuity and they don't die in the same way that we do. And so I think it's a collective of a kind that humans have never interacted with before. And I don't think that our thinking has caught up with the technology. So I just don't think that we're asking the right questions about them because I don't, these are, entities or collectives or, you know, programs unlike anything else that we have ever experienced in human history.

Abha: So Melanie, let's recap what we've done in this episode. We've looked at the notion of assessing intelligence in humans, non-human animals, and machines. The history of thought concerning intelligence is very much human-centered. And our ideas about how to assess intelligence have always valued the things that are most human-like.

Melanie: Yeah, I really resonated with Erica's comment about our lack of imagination doing research on animals. And she showed us how a human-centered view has really dominated research in animal cognition and that it might be blinding us to important aspects of how animals think, not giving them enough credit. 

Abha: But sometimes we give animals too much credit by anthropomorphizing them. Like when you make assumptions about what your dog or cat is quote unquote thinking or feeling, we project our emotions and our notions of the world onto them, right? 

Melanie: Yeah, our human-centered assumptions can definitely lead us astray in many ways. But Ellie pointed out similar issues for assessing LLMs. We give them tests that are designed for humans, like the SAT or the bar exam, and then if they pass the test, we make the mistake of assuming the same things that we would for humans passing that test. But it seems that they can pass these tests without actually having the general underlying skills that these tests were meant to assess. 

Abha: But Ellie also points out that humans often game these tests. Maybe it's not the tests themselves that are the problem. Maybe it's the humans or the animals or the machines that take them. 

Melanie: Sure, our methods of assessing human intelligence have always been a bit problematic. But on the other hand, there's been decades of work on humans trying to understand what general abilities correlate with these test scores while we're just beginning to figure out how to assess AI systems like LLMs. Ellie's own work in trying to understand what's going on under the hood in AI systems, as we described before, is called mechanistic understanding or mechanistic interpretability. 

Abha: The way I understood this is that, you know, she's looking at ways to understand LLMs at a higher level than just weights and activations in a neural network. It's analogous to what neuroscientists are after, right? Understanding the brain without having to look at the activation of every neuron or the strength of every synapse.

Melanie: Yeah, as Ellie said, we need something like fMRIs for LLMs. Or maybe we actually need something entirely different, since as Erica pointed out, an LLM might be better thought of as a collective kind of intelligence rather than an individual. But in any case, this work is really at its inception. 

Abha: Yeah, and also as both Ellie and Erica pointed out, we need to understand better what we mean by words like intelligence and understanding, which are not yet rigorously defined, right? 

Melanie: Absolutely not. And maybe instead of making grand proclamations like, LLMs understand the world or LLMs can't understand anything, we should do what Ellie urges us to do. That is be willing to put up with uncertainty. 

Abha: In our final episode of the season, I’ll ask Melanie more about what she thinks about all these topics. You’ll hear about her background in the field of intelligence, her views on AGI and if we can achieve it, how sustainable the industry is, and if she’s worried about AI in the future. That’s next time, on Complexity. Complexity is the official podcast of the Santa Fe Institute. This episode was produced by Katherine Moncure. Our theme song is by Mitch Mignano, and additional music from Blue Dot Sessions.

I’m Abha, thanks for listening.