COMPLEXITY

David Wolpert on The No Free Lunch Theorems and Why They Undermine The Scientific Method

Episode Notes

On the one hand, we have math: a world of forms and patterns, a priori logic, timeless and consistent. On the other, we have physics: messy and embodied interactions, context-dependent and contingent on a changing world. And yet, many people get the two confused, including physicists and mathematicians. Where the two meet, and the nature of the boundary between them, is a matter of debate, one of the greatest puzzles known to science and philosophy, but some things can be said for sure about what can and cannot be accomplished in the search for ever-better models of our world. One is that every model must contain assumptions, and that there’s no way to prove a given strategy will outperform all others in all possible scenarios. This insight, captured in the legendary No Free Lunch theorems by SFI’s David Wolpert and William Macready, has enormous implications for the way we think about intelligence, computers, and the living world. In the twenty-plus years since its publication, No Free Lunch has sparked intense debate about the kinds of claims we are, and are not, justified in making…

Welcome to COMPLEXITY, the official podcast of the Santa Fe Institute. I’m your host, Michael Garfield, and every other week we’ll bring you with us for far-ranging conversations with our worldwide network of rigorous researchers developing new frameworks to explain the deepest mysteries of the universe. This week we speak with SFI Professor David Wolpert about the No Free Lunch Theorems and what they mean for life, the universe, and everything…

Dive into David Wolpert’s website:

https://davidwolpert.weebly.com/

and Google Scholar page:

https://scholar.google.com/citations?user=PRjgI8kAAAAJ&hl=en

If you value our research and communication efforts, please consider making a donation at santafe.edu/give — and/or rating and reviewing us at Apple Podcasts. You can find numerous other ways to engage with us at santafe.edu/engage. Thank you for listening!

Join our Facebook discussion group to meet like minds and talk about each episode.

Podcast theme music by Mitch Mignano.

Episode Transcription

This is a machine-generated transcript provided by podscribe.ai with editorial help from Asha Singh. If you would like to volunteer to help edit this or future transcripts, please email michaelgarfield[at]santafe[dot]edu. Thanks and enjoy!

David Wolpert (0s):

Well, everybody in high school has had this little exercise of curve fitting. You put down some graph paper, you draw some dots. You say, fit the curve. And like many other people, you might react with annoyance, saying, “How are you going to prove to me that one fit is right, and the other one's wrong?” And the No Free Lunch Theorem says, “You know, you annoying little brat kid who said that to the teacher, you were right!” In point of fact, if little Jack and little Jill are both doing curve fitting exercises, and you find that Jack does much, much better in all the tests of curve fitting than Jill does, that doesn't mean that when they go out into the outside world, Jack’s fits to curves are actually going to be any better than Jill’s fits, even though he won all the tests and exams inside the school. That's why you can't prove the scientific method works.

Michael Garfield (1m 12s):

On the one hand, we have math - a world of forms and patterns, a priori logic, timeless, and consistent. On the other, we have physics - messy and embodied interactions, context-dependent and contingent on a changing world. And yet many people will get the two confused, including physicists and mathematicians. Where the two meet and the nature of the boundary between them is a matter of debate, one of the greatest puzzles known to science and philosophy, but some things can be said for sure about what can and cannot be accomplished in the search for ever-better models of our world. One is that every model must contain assumptions and that there's no way to prove a given strategy will outperform all others in all possible scenarios. This insight, captured in the legendary No Free Lunch Theorems by SFI’s David Wolpert and William Macready, has enormous implications for the way we think about intelligence, computers and the living world. In the 20-plus years since its publication, No Free Lunch has sparked intense debate about the kinds of claims we are, and are not, justified in making.

Michael Garfield (2m 37s):

Welcome to Complexity, the official podcast of the Santa Fe Institute. I'm your host, Michael Garfield, and every other week, we'll bring you with us for far-ranging conversations with our worldwide network of rigorous researchers developing new frameworks to explain the deepest mysteries of the universe. This week, we speak with SFI professor David Wolpert about the No Free Lunch Theorems and what they mean for life, the universe and everything. For show notes, research links, transcripts, and more, visit complexity.simplecast.com. If you value our research and communication efforts, please consider making a donation at santafe.edu/give and/or rating and reviewing us at Apple Podcasts. You can find numerous other ways to engage with SFI at santafe.edu/engage. Thank you for listening.

Michael Garfield (3m 33s):

David Wolpert, it is a pleasure to have you on Complexity Podcast. I am truly sitting at the feet of a master here for this episode. So I'm going to enter the dojo with beginner's mind and encourage all of the listeners to join me in shedding their priors, their assumptions coming into this conversation. And we'll just see if we can do something we don't usually do on this show, which is to look at an idea that you have developed over decades in your career and build it up from, sort of, the prima materia and then explore its implications.

David Wolpert (4m 20s):

Sounds cool. All kinds of metaphors that come to mind by the way, as somebody who's got true beginner's mind, true Buddha nature that you would like in the jujitsu match, I would be the one flipped out of the ring if you actually can achieve that state. So…

Michael Garfield (4m 35s):

Well, probably not. I'd like to start this conversation as we always do, with just a little bit of personal background on how you became a scientist, because a little bird told me that there was a while there that you were actually considering pursuing a degree in creative writing.

David Wolpert (4m 56s):

Oh, geez. Yeah.

Michael Garfield (4m 58s):

I'm, I'm curious to know what got you onto the path that we're on today.

David Wolpert (5m 04s):

Okay, it's kind of funny. One of the places I've got to in my journey, so to speak, is that anything about me is irrelevant. It's all about what I can actually do for things outside. But for what it's worth, yeah, when I was an undergrad, I was thinking about math, physics and creative writing. That's kind of what I was thinking, coming in. Creative writing - I always gravitated, still gravitate, to the very short form. And I realized right off the bat that this was not a way I was going to make any kind of money. So scotch that. It was interesting, physics versus math, this is something I'm still not sure I understand, in that I've always looked up to people who are mathematicians. From my perspective, at the feet of those deities, they're able to do things that I could just dream of doing. But what I found to be strange when I was an undergrad was that the people who were really, really good at math, who could kick my petunias out of the room in the math courses, they tripped and fell in the physics courses. And I never figured it out, but one way or another, after that kind of shaking and stirring, I ended up focusing mostly on physics.

Michael Garfield (6m 18s):

Maybe they just forgot there was a room.

David Wolpert (6m 21s):

It was very, very strange. I mean, it's something about the cliche about different minds, but, you know, in math itself, there's geometric versus algebraic thinking. But I also found that people who, from the outside world, might be viewed as almost synonymous, like Ed Witten, who is both a stunning mathematician and a stunning theoretical physicist; when you get down, like, many, many levels below Ed, there's actually a different kind of a mind that excels at math versus one that excels at physics. It's kind of peculiar. I'm not sure really what there is to that. But anyway.

Michael Garfield (6m 58s):

Maybe that's a David Kinney question. It kind of reminds me of, you know, where they say first, we must assume a perfectly spherical cow. And maybe it's a philosophical question. And to that I think, I hope that we get some time today to talk about implications for the philosophy of science, from your work, because that's key here, but what I want to talk to you about today with the assumption that we're going to get more of these, because the list of papers that you sent me by email that you have published over the years is just so vast in scope, that one episode is not going to do you justice. So I feel like today, what I want to do is I want to chunk out the No Free Lunch Theorem and walk people through what drew you in to this particular line of inquiry. And then we can, we can take it from there.

David Wolpert (7m 51s):

Okay, cool. So, I guess let’s start with the personal aspect of it, rather than it itself. One of the things I'm always concerned about in life is to make sure that you are very aware of your own defects and your own strengths so that you don't drink your own Kool-aid. One of my major character flaws, I have found over the years, trying to figure those things out about myself is that I am extremely sceptical. That's good for being a scientist, but it's not necessarily a good way to run through life. So whenever somebody would say A, I would always want to find a flaw in it, which is true for many cocky physicists and so on.

David Wolpert (8m 32s):

So anyway, I was working on what I'm going to be calling machine learning, back in the early days before even that phrase was substantiated. And there were a bunch of people going around saying that we have first-principle proofs with no assumptions, that our particular algorithm is going to do better than other algorithms. So this is without any assumption about the world, saying that yes, you should buy my thing because I can guarantee to you mathematically that my thing is going to outperform Mike Garfield's thing. I've got proof of it. It doesn't matter what the world is, this is mathematics. You were talking about David Kinney, and philosophy of science has all kinds of things to say about analytic versus synthetic truths, analytic truths being those that do not depend upon the actual state of the real physical world, versus synthetic ones. And people were claiming that you could actually do what's called inductive inference, making predictions, doing machine learning, purely analytically, without worrying about the state of the real world.

David Wolpert (9m 38s):

So my character flaw got really very stimulated by this and said, bull feces, in no way, shape or form can that possibly be true. It's obviously not true. I can have as much data as I want that makes me think that tomorrow the sun's gonna come up, but it might not come up, and you're not going to be able to give me a mathematical proof to the contrary.

David Wolpert (10m 03s):

It doesn't matter how many times I've been through a sunrise. And we can't say that evolution managed to produce a version of me that is really able to do these predictions really well, because the same argument holds at a higher level. Evolution in the past couple of billion years has been producing new organisms, new predicting machines, that have been based upon conditions that for all we know might stop. It's like the warning message at the bottom of prospectuses for mutual funds: past performance is no indicator of future performance. So anyway, frankly, I tried to shut up these other people who were making claims, both in the context of optimization theory and in the context of machine learning, that they could prove things from first principles. I wrote what I thought were kind of trivial theorems proving no, that's not true.

David Wolpert (11m 04s):

In a formal sense, well, it's kind of funny how people have misinterpreted the No Free Lunch Theorem. I use as a trick, a mathematical trick, saying: let's assume that basically all worlds are equally probable a priori, that there is no basis for one or the other. Given that assumption, you're able to prove - this is me and Bill Macready and some other people I was collaborating with; he was also at the Santa Fe Institute at the time - using that trick, you're able to prove that, in point of fact, Mike's algorithm is going to completely fail relative to David's just as often as things go the other way around, that everything is completely even, that there are as many worlds in which Mike kicks David as in which David kicks Mike. And people misinterpreted it as though we were assuming that all worlds were equally likely, but it was actually just a mathematical tool to prove that there are in fact as many worlds in which the sun will not rise tomorrow as there are in which it will rise.
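The counting argument described here can be sketched on a toy problem. The setup below is an illustrative assumption, not taken from the papers: five inputs with binary labels, three of them shown to the learner and two held out, and two made-up learners (`majority_learner` and `contrarian_learner`). Enumerating every possible "world" (every labelling of the five inputs) suggests that the two learners have the same average accuracy on the held-out inputs, and that each beats the other in the same number of worlds.

```python
from itertools import product

TRAIN_X = [0, 1, 2]  # inputs whose labels the learner is shown (assumed split)
TEST_X = [3, 4]      # held-out, off-training-set inputs used for scoring


def majority_learner(train_labels):
    """Predict the most common training label for every unseen input."""
    guess = 1 if 2 * sum(train_labels) > len(train_labels) else 0
    return lambda x: guess


def contrarian_learner(train_labels):
    """Predict the opposite of whatever the majority learner predicts."""
    guess = 1 - majority_learner(train_labels)(None)
    return lambda x: guess


def off_training_set_accuracy(learner, world):
    """Fraction of held-out inputs the learner labels correctly in one world."""
    predict = learner([world[x] for x in TRAIN_X])
    hits = sum(predict(x) == world[x] for x in TEST_X)
    return hits / len(TEST_X)


def compare(learner_a, learner_b):
    """Score both learners in every possible world (every labelling)."""
    worlds = list(product([0, 1], repeat=len(TRAIN_X) + len(TEST_X)))
    acc_a = [off_training_set_accuracy(learner_a, w) for w in worlds]
    acc_b = [off_training_set_accuracy(learner_b, w) for w in worlds]
    return {
        "mean_a": sum(acc_a) / len(worlds),
        "mean_b": sum(acc_b) / len(worlds),
        "a_wins": sum(a > b for a, b in zip(acc_a, acc_b)),
        "b_wins": sum(b > a for a, b in zip(acc_a, acc_b)),
    }


if __name__ == "__main__":
    print(compare(majority_learner, contrarian_learner))
```

Treating every world as equally weighted is used here only as a counting device, in the spirit of the "mathematical trick" described above; it is not a claim that real worlds are uniformly distributed.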

David Wolpert (12m 08s):

And you're not going to be able to use math to do that, it's a contingent fact based upon the nature of the real world, so it's kind of obvious. Bill and I, we published a paper, too, and we assumed that people would come back and say, yeah, yeah, yeah, that's what I meant all along, and we would just move on. But that's not what happened, there were major wars. We were just little, you know, little kids. We were clever, we were postdocs just like so many others at SFI are, but in other words, we were not established with anything like tenure behind us and so on. And there were things like keynote speeches in conferences where I was called - keynote speech very first sentence - David Wolpert is a neanderthal on the postdoc. And there was a lot of this kind of stuff going on. So, actually I left academia for a while because of all of this. And I never really came back full hog to machine learning because it was so vicious. It was impressive.

Michael Garfield (13m 16s):

So the No Free Lunch piece of this, just for folks that don't get the reference: my understanding is that this is about bars in the 19th century trying to lure in clientele with the offer of free food when you purchase a drink.

David Wolpert (13m 33s):

Oh, you mean the phrase No Free Lunch.

Michael Garfield (13m 36s):

Yes, to, to provide like a, a street level example of what we're talking about here, we're talking about like, you, it would be impossible to find a restaurant in which you can acquire

David Wolpert (13m 52s):

Where you can have your cake and eat it too, without paying for it being the crucial thing. So actually that wasn't a phrase due to me or Bill, it was suggested by somebody else. Yes, and the general reference is that if you are going to be getting anything, you got to pay for it, you don't get something for free, you don't get a lunch for free. You might think that you do, but at the end of the day, the restaurateur or whoever is going to come to you and say, “Glad you enjoyed the meal. Now we have an hour of mop work back in the kitchen that we need you to do to pay for it.” And you always have that hour of mop work.

David Wolpert (14m 28s):

And so, in this context, the idea would be similarly, something along the lines of, yeah, you can get your algorithm to perform well, that's the lunch, but no you're going to have to pay for it, and that you are making assumptions. And to scientists working on machine learning, making assumptions about the real world is something that is a cost, you don't want to do that. You want to be able to, I can sell you a whole, much, many more autonomous vehicles if I tell you that I got mathematical proofs, that their AI algorithms are navigating without any assumptions based on the real world. I can prove it to you, just like I can prove the quadratic equation or anything else, that my autonomous vehicle is going to be doing better than somebody else's. That's the way to make a lot of money. People don't like it when you actually say no, that's not the case. But so that would be the, the free lunch, being able to have that kind of autonomous vehicle where it's, you don't have to pay with any kind of assumption about the real world that you can prove from first principles that this autonomous vehicle does better than some other one.

Michael Garfield (15m 32s):

So, the idea of an organism as an inference, you know, a hypothesis about its environment. This is kind of akin to, as I understand it, the statement out of evolutionary biology that there is no evolutionary up, that there's contextual fitness. But like in Alien, when Ash says that the Xenomorph, the alien monster, is the perfect organism: there is no such thing. It’s counterfactual because, you know, evolution could not produce something that would not be outperformed in some unanticipated…

David Wolpert (16m 07s):

That's the, that's the nub of it. And because the actual precise formulation is not that there could not be a perfect thing, but that you cannot prove it the way that you can, like, I don't know what the precise, best way to pitch this is, but a lot of people have encountered the Pythagorean theorem. If I give you a triangle where it's perfectly 90 degrees and so on and so forth, we know that the length of the hypotenuse, its square, is equal to the sum of the squares of the other two. And this gets into all kinds of extraordinarily deep math from that theorem and so on and so forth. But when you were in high school, most people have actually seen a proof of that, from first principles. The laws of physics can change. The state of civilization can change. The biosphere we live in can change. It won't matter; the Pythagorean theorem will still be true. You can prove it without having to be contingent on anything about the outside world. If there were an organism going through evolution that, for some reason or other, was going around saying, “The Pythagorean theorem! The Pythagorean theorem!”, it would always be correct. And it doesn't matter what ecosystem it finds itself in.

David Wolpert (17m 23s):

This is the basis of a lot of the stuff in SETI, the search for extra-terrestrial intelligence. People like to say that it doesn't matter how alien aliens are, they will still, for lack of a better word, and it's a very lowly one, they'll still believe math, they will still know math. They will still be able to grok, to use another very dated phrase, the Pythagorean theorem. That'll still be something that's in their little kit of truths. That is very, very different from contingent facts. Things like the laws of physics, things like the laws of biology. I can very easily imagine a biosphere in which it's not true that you have genotypes that code for phenotypes by going through messenger RNA and transfer RNA, that living systems work entirely differently.

David Wolpert (18m 12s):

In fact, that's what the whole field of artificial life is about, is trying to cook up such things and see where they take us. But you're not going to be able to cook up any such world in which the Pythagorean theorem fails. So, what the hell has that got to do with anything with the No Free Lunch? It's a no-go theorem. It's an impossibility theorem. It says that if somebody comes to you and says, “I've got an autonomous vehicle that, independent of the state of the world, just like I can prove to you the Pythagorean theorem, I can prove to you that this autonomous vehicle is safe”, don't believe them. If somebody comes up to you and says that “I can run an AI algorithm, it's going to predict the next election. And I can prove mathematically that it’s correct”, just like the Pythagorean theorem, don't believe them. Those things are always going to be contingent. And the No Free Lunch Theorem says that as soon as you get something else, it basically is a way of formalizing the dividing line between contingent truths, like who's gonna win. And here's where my prayer hands come out and so on, versus ones like the Pythagorean theorem. And dear God, I wish I could use the Pythagorean theorem to actually right now answer that first question. But you can't, and that's what the No Free Lunch Theorem tells you. Nope, we can't do that now.

Michael Garfield (19m 25s):

There's a paper that you wrote with Bill Macready on co-evolutionary free lunches. Again, you're talking about that dividing line between the sort of Empyrean perfection of the abstract mathematical, like Sean Carroll multiverse, in which we're having this conversation, and then the reality of the actual universe that we have, you know, the N equals one, this is our biosphere. This is what we have. So I'd like to talk about where there are free lunches.

David Wolpert (19m 53s):

There's a bunch of them. Yeah. And so this is what I think, I mean, to be honest, I was really kind of disappointed in machine learning because to me, these kinds of issues are the skeleton upon which we could then figure out all kinds of very innocuous assumptions that everybody can agree on, that we could then use to clothe the skeleton a little bit and have those go into our autonomous vehicles. But I don't know, it threatened too many people's livelihoods or something. But there are free lunches. Like, way back in the day, I had a pair of back-to-back papers, one called “The Lack of A Priori Distinctions Between Learning Algorithms” and one called “The Existence of A Priori Distinctions Between Learning Algorithms.” They were back-to-back.

Michael Garfield (20m 40s):

Is that a reproducibility crisis?

David Wolpert (20m 43s):

Yeah, the big four. So the No Free Lunch Theorems say that you cannot have a mathematical proof of a certain sort. In a certain sense you can view them as like Gödel's Incompleteness Theorem. Gödel's Incompleteness Theorem is also putting limitations on what you could ever mathematically prove, no matter how long you want to work at it. No Free Lunch Theorems are the same kind of thing. There are limitations. For example, way back when, people had all kinds of arguments about whether you should use genetic algorithms or simulated annealing or hill search or things like this in doing optimization. And one of the very first papers that Bill and I did said, No Free Lunch: you cannot actually prove from first principles that any one of those search algorithms is any better than any other without making assumptions. The same kind of thing in evolution, you can't do it. So that's what the No Free Lunch Theorems were, but they were very, very finely tuned. Like the Coevolutionary Free Lunch Theorems, which have to do with natural selection, I mean, that's what coevolution is all about. They were saying, expand your sandbox a little bit. Rather than just thinking about things like David and Michael going head-to-head and making predictions about the precise swing of the popular vote a month from now, let's say, that's what we're dueling about.

David Wolpert (22m 10s):

And I've got my algorithm that makes a prediction and you've got yours. And that's a very, very precise thing. And the No Free Lunch Theorem states, you cannot mathematically prove that Michael's is better than David's. Maybe it is, maybe it isn’t, it depends on the nature of the physical world. It turns out that there are other things where you can prove that, so to speak, David and Michael operating together as a team can beat certain other teams in a co-evolutionary context.
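The same kind of statement for search algorithms can be checked by brute force on a tiny example. The sketch below is only an illustration under assumptions chosen for convenience: a landscape with four points and three possible fitness values, and two hypothetical deterministic searchers that never revisit a point (`scan` and `greedy_neighbour`). For every evaluation budget, it tallies the best value each searcher has found across all 81 possible landscapes; the two tallies come out identical, which is the flavour of the No Free Lunch result for optimization.

```python
from collections import Counter
from itertools import product

POINTS = range(4)    # candidate solutions (assumed toy search space)
VALUES = (0, 1, 2)   # possible fitness values at each point


def scan(history):
    """Visit points left to right, ignoring the values seen so far."""
    visited = {x for x, _ in history}
    return min(x for x in POINTS if x not in visited)


def greedy_neighbour(history):
    """Start at the last point, then move to the unvisited point nearest the best one seen."""
    if not history:
        return max(POINTS)
    visited = {x for x, _ in history}
    best_x = max(history, key=lambda xy: xy[1])[0]
    unvisited = (x for x in POINTS if x not in visited)
    return min(unvisited, key=lambda x: (abs(x - best_x), x))


def best_after(search, landscape, budget):
    """Run a searcher for `budget` evaluations and return the best value found."""
    history = []
    for _ in range(budget):
        x = search(history)
        history.append((x, landscape[x]))
    return max(y for _, y in history)


def histogram(search, budget):
    """Tally best-found values over every possible landscape."""
    landscapes = product(VALUES, repeat=len(POINTS))
    return Counter(best_after(search, f, budget) for f in landscapes)


if __name__ == "__main__":
    for budget in range(1, len(POINTS) + 1):
        print(budget, histogram(scan, budget) == histogram(greedy_neighbour, budget))
```

On any single landscape the two searchers can differ a great deal; the equality only appears once every landscape is counted, which is why choosing between them requires some assumption about which landscapes actually occur.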

David Wolpert (22m 38s):

So for the listeners out there in podcast land, the difference between evolution and co-evolution - the notion of evolution, fitness landscapes, this goes back to someone called Ronald Fisher, who by the way, turned out to be a racist dude. And that's causing all kinds of problems because there are buildings that have his name on them, but he was also one of the most brilliant statisticians and theoretical biologists ever. There's something called the Fisher Information, for example, which is all about him, Fisher's Fundamental Theorem of biology.

David Wolpert (23m 11s):

So anyway, there was Fisher and Sewall Wright and so on. They had this notion of a fitness landscape, which is the idea that if David and Michael are both undergoing evolution, we have our offspring who had their own offspring and so on and so forth, that we're both trying to do a search over a fitness landscape. It's the same landscape. We both want to get up to the high peaks of that fitness landscape, and that's how we do well. That's sort of the early, very early 20th century, simple-minded, first-pass version of natural selection. That's called evolution. Co-evolution says, “Well, no, no, no, no, no!” What actually determines David's fitness landscape partially is whatever the hell Michael's doing, and what determines Michael's landscape, partly it's whatever the hell David's doing. So it's not like we're both independent agents trying to find the best we can do over this landscape. It's a co-evolutionary process. We are making one another's fitness functions simultaneously.

David Wolpert (24m 11s):

There are similar things in economic systems. It's not that different companies are all independently trying to find the best widget that they can then sell to the populace. Instead, what the populace is actually going to want to buy from me is going to depend a bit on what the other companies are doing, because they're going to be affecting the preferences of the populace, and what kinds of things might actually be able to partner with my widgets and vice versa. It’s all one big hairball - that's a polite way of phrasing it for a podcast! There are other phrases with biological connotations but it’s one big hairball. Everybody's affecting everybody else simultaneously, we’re not all independent. That's the essence of co-evolution versus evolution.

David Wolpert (24m 54s):

One of the huge understandings and appreciations of localized game theory, all kinds of stuff, all kinds of edifying stuff, it's all about co-evolution. So anyway, we can now ask the question: if it's David versus Michael in a standard evolutionary situation, we're asking who's going to do better at getting to high fitness values where we're both evolving on this fixed, independent fitness landscape, just like in Fisher and Sewall Wright. That's where No Free Lunch comes in and says that, without knowing the shape of that landscape, you cannot say who's going to do better - David or Michael. They’re two search algorithms, trying to get higher on that landscape.

David Wolpert (25m 33s):

But now let's say you've got a more complicated situation where David's landscape has been determined by Michael and vice versa. Can it be the case that they can both behave in such a way so that both of them benefit and get to higher fitness values? Can you prove that from first principles? And there, the answer turns out to be yes. So another way to phrase it is that, from first principles, you can prove that certain structures of human societies where they're interacting with one another, yes, you can prove that, just as the Pythagorean theorem holds, some of those structures will actually benefit the aggregate without any dependence on contingent facts of the world. Whereas with everybody in isolation, now that is going to depend upon contingent facts of the world. So I don't know, health draws a metaphor too many.

Michael Garfield (26m 23s):

Yeah, no, that was good. What got me interested in the work that SFI was doing was I found Martin Nowak's papers with Natalia Komarova on the evolution of syntax and language in 2005. And didn't realize David Krakauer had been working with Nowak on this stuff in like earlier papers. So it was a weird random walk directly to your faded address coming to work here. Yeah, that work seemed immediately generalizable to stuff like the advent of multicellularity and organisms or the emergence of social primates. You know, there's a sense that there are, and I know that there's far more forms of co-evolutionary gaming than just endosymbiotic inclusion.

Michael Garfield (27m 17s):

I think in this paper that you wrote with Macready, it's, it's more about choosing a champion and like players training someone that they then sort of delegate, whereas like there's this other piece of it, which is, and I'm wondering if I'm getting this, that it seems as though there is a sort of, I guess the formal term would be like a macro-evolutionary ratchet into these more collective organisms.

David Wolpert (27m 46s):

So a couple of things going on there. One of them, a kind of meta statement about science, is that we had this general intuition that co-evolutionary kinds of things were not going to be subject to No Free Lunch's extreme form. So what we then did was try to find the simplest kind of a scenario that we could analyse that would illustrate that. So that was largely what was going on. Another one, that whole notion of choosing your champion, that actually is another aspect of kind of the sociology of science. In all of these modern reinforcement-learning based game players, AlphaGo, all of these other ones, so not, you know, Deep Blue, where you just do it all by hand, but the ones that are based upon learning, what's going on in all of them, what's ubiquitous, is that you have the algorithm that's learning how to play, and then you've got another antagonist that's learning how to try to screw the odds with him. And that's what's called adversarial networks, it's one of the deep tricks in all machine learning these days. Now, if you trace back its evolutionary process, that idea, it traces back to some work primarily by a guy, David Fogel, way back when, when he actually designed checker players using what he called co-evolution.

David Wolpert (29m 16s):

So the idea was that on the one hand, you've got the learning algorithm of your checker player, but you've also got its opponent that you're also training independently, trying to have it find situations to screw the checker player. You're trying to come up with game situations that will screw your checker player and so on and so forth. And that was actually the major insight. Everything subsequent to then, as is the case in so much of machine learning, has really been window-dressing based upon elaborating and embellishing those deep insights. So anyway, part of the reason that we chose that particular champion model that you were talking about for the paper was we wanted something that was simple to analyse, but also at the time, this was one of the new ideas that was circulating, co-evolution as it was being called, in the sense of training checker-playing algorithms. And so that's why we were thinking in terms of training champions.
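A toy version of the champion setting hints at how a free lunch can appear. Everything in this sketch is an assumption made for illustration and is not the construction from the paper: two candidate champions, two possible opponent replies, win/lose payoffs, a budget of two payoff evaluations, and a champion scored by its worst-case reply. The three strategies (`probe_one_champion`, `probe_one_reply`, `ignore_data`) are hypothetical. Averaged over all 16 possible payoff tables, they no longer tie.

```python
from fractions import Fraction
from itertools import product

MOVES = (0, 1)  # two candidate champions and two opponent replies (assumed)


def worst_case(payoff, champion):
    """Score a champion by its payoff against the most damaging reply."""
    return min(payoff[champion][reply] for reply in MOVES)


def probe_one_champion(payoff):
    """Test champion 0 against both replies; keep it only if it never lost."""
    seen = [payoff[0][0], payoff[0][1]]  # the only two entries this strategy reads
    return 0 if min(seen) == 1 else 1


def probe_one_reply(payoff):
    """Test each champion against reply 0 only; keep the first one that won."""
    if payoff[0][0] == 1:
        return 0
    return 1 if payoff[1][0] == 1 else 0


def ignore_data(payoff):
    """Pick champion 0 without looking at any payoffs."""
    return 0


def average_worst_case(strategy):
    """Average the chosen champion's worst-case score over all payoff tables."""
    tables = [((a, b), (c, d)) for a, b, c, d in product((0, 1), repeat=4)]
    total = sum(worst_case(table, strategy(table)) for table in tables)
    return Fraction(total, len(tables))


if __name__ == "__main__":
    for strategy in (probe_one_champion, probe_one_reply, ignore_data):
        print(strategy.__name__, average_worst_case(strategy))
```

The gap seems to come from the worst-case scoring: knowing everything about one champion is worth more than knowing a little about each, so how the evaluation budget is spent matters even before anything is assumed about which payoff tables are likely.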

Michael Garfield (30m 15s):

Right. I'm thinking back to the conversation David Krakauer and I had about your piece for the SFI Transmission series back in the spring, where you're talking about the Landauer Bound and how the virus sort of outsources its computational duties to the host organism. And the conversation I had with Geoffrey West on the podcast back in, like, episodes 35 and 36, when we were talking about his findings that the human being is way, way off of the line when you look at where we should be for our body mass and our metabolism, and like the amount of energy used by a modern human being includes, you know, all of these remote server farms and all of this other stuff that we don't conventionally think of as bound within that particular individual.

Michael Garfield (31m 06s):

I guess when I hear this stuff about adversarial learning, it sounds like this is interrogating where we decide to draw the boundaries between individuals, and maybe we're skirting over into some of the other work you've done on, like, information theory. But it suggests that even in antagonistic relations, there is a sort of higher, like a coarser frame, where both parties are the components of a larger system that's outperforming because it has folded up opposition into itself.

David Wolpert (31m 42s):

Lennon and McCartney, Jagger and Richards, Picasso and Matisse – the whole notion that the best I can hope for, if I'm really good, if I want to be really good, I've got to have somebody that I can measure myself against, and who can spur me to do better things and I've got to get p****d off that they can outcompete me. And yeah, that's a dynamic that happens a lot, and that is very much part and parcel of the adversarial formulation. But they are finding the chinks in my armor and therefore, by fixing that armor, I'm actually making myself, you know, be far stronger and far better than if I don't have them to find those chinks in my armor, then I'm screwed. I need them to do that for me. And I then do it for them.

Michael Garfield (32m 31s):

So I guess really what I'm getting at is, is it distinct from a kind of a champion type scenario because both of those scenarios are team play, even if the agents in those different scenarios may not regard themselves as on the same team?

David Wolpert (32m 47s):

Yeah, so I think what you're getting at is where do you draw the dividing line? So if I want to look at the Red Sox and Yankees, if I take them as a unit, then No Free Lunch applies. If I take the outside landscape as a fixed thing that they are both trying to be the best joint baseball team, whatever that would mean, then you have no free lunch. But if they are instead spurring one another on, so in fact there are no other teams – so I mean I'm trying to sort of run with this contorted analogy I'm making with baseball - but how many teams are there even?

Michael Garfield (33m 25s):

They’re like monopolies, like corporate duopolies.

David Wolpert (33m 29s):

Sort of. There's all other kinds of things that go on there, in terms of profit taking and so on; they don't really spur one another. In real economics, what duopolies do, all duopolies in general, is that they carve out niches and then they will not compete. But here's kind of what it would not do. How many baseball teams are there now, even? I don't know, make up a number…

Michael Garfield (33m 48s):

Wrong guy, dozens, 36.

David Wolpert (33m 50s):

So dozens, whatever, let's say there's three dozen, that's 36. And here’s your Googlian cheater, talking about outsourcing your complex computation and so on, and where do you end? And what's the end of the individual and the extended phenotype and all that stuff. So the Red Sox and Yankees, if they were the only two teams in baseball, and they were spurring one another on. And what they wanted to do was - I'm not sure this metaphor is going to really work - but it's got to do with, as you were saying, where you draw the line between individual versus pairs. If the Red Sox and Yankees are all that there are, so they're setting one another's fitness function, then they can spur one another to do better and better in some kind of objective absolute sense.

David Wolpert (34m 35s):

Whereas, if they are instead embedded in a bigger league, then the unit of Yankees and Red Sox can't actually do any better, due to No Free Lunch, than would any other arbitrary unit, if that makes any kind of sense. I mean, some of these things, I mean to riff on it, here's another example that's actually in many, many ways more striking, and it'll take a little bit of work to get this to a level where we can convey it in a podcast, but there's this thing called the scientific method. And we all like the scientific method. It does great things. It gives you refrigerators like that one behind your head. And it helps us spread coronavirus across the world, much better than we would be able to if we hadn't had the scientific method giving us airplanes. All kinds of benefits that come with the scientific method.

David Wolpert (35m 26s):

Okay, cool. But let's be a little bit more, sort of head in the clouds, philosophical about it, we’re pointy head, geeky academics. Let's try to think - why might the scientific method work and how might I actually exploit the scientific method, if for example, I was doing machine learning? Well, that's something that might be cool if you want to make a ton of money in machine learning, the scientific method, that works well. I'm doing similar kinds of things - I need to make stuff that will be able to predict and do well, just like mathematical experience that can predict and do well. So can I exploit it? And so it turns out that you can, and the way that you can is with a technique called cross-validation.

David Wolpert (36m 05s):

So here's the basic idea behind cross-validation. Michael and David are two training algorithms that we're going to be training to do something like, say, image recognition. We want to train David and Michael, both of them. They're going to be trained on a dataset of images that are either dogs or cats, and they've gotta then be able to predict well, for some completely new images they've not seen, whether it's a dog or a cat.

David Wolpert (36m 30s):

Now along comes David Krakauer, head of the SFI. And he says, “Well, I want to be able to choose whether I should now hire Michael or hire David as my machine that's going to go out and distinguish cat images from dog images, but I've only got this fixed set of images. How am I going to choose which one I want to hire? I better choose either David or Michael, but how am I going to choose?” Well, there's the obvious thing. Take my fixed set of images. Let David and Michael both learn as much as they can from a subset of those images. And now simply see which of them, David or Michael, does better on the remaining images that they haven't seen. And use that to decide who I want to train on all of my images, and so who I'm going to hire and send out in the world. What’s really important to me is to be able to distinguish images of dogs versus cats.

David Wolpert (37m 25s):

That is, in a certain sense, the scientific method. You're choosing between David and Michael - who's a better theorist - by saying, well, who, when they make theories, comes up with one that does better on experiments they've not yet seen? It's also, it turns out, a technique called cross-validation, which is embedded in every single machine learning algorithm out there. It is how you choose among parameters in algorithms, or choose among algorithms, to see which one does better. And it seems almost incontrovertible. That's got to do well; the scientific method has to do well. It's very, very natural. How could it not work? Well, it turns out that this is a domain in which No Free Lunch actually does work.
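
As a rough illustration of the hiring procedure described above, here is a minimal Python sketch of hold-out selection, the simplest form of cross-validation. Everything in it is an assumption made for illustration and does not come from the conversation: the two toy learners (`fit_constant`, `fit_threshold`), the helper `error_rate`, and the scrambled integer dataset that stands in for the dog and cat images.

```python
def fit_constant(train):
    # Hypothetical learner 1: always predict the more common training label.
    ones = sum(label for _, label in train)
    guess = int(2 * ones > len(train))
    return lambda x: guess

def fit_threshold(train):
    # Hypothetical learner 2: predict 1 for inputs above the mean training input.
    cut = sum(x for x, _ in train) / len(train)
    return lambda x: int(x > cut)

def error_rate(model, data):
    # Fraction of examples the model labels incorrectly.
    return sum(model(x) != label for x, label in data) / len(data)

# Toy data (an assumption): the integers 0..99 in scrambled order,
# labelled 1 when the value is above 50.
data = [(x, int(x > 50)) for x in ((i * 37) % 100 for i in range(100))]
seen, held_out = data[:70], data[70:]

# Train each candidate on the seen part, score it on the held-out part.
candidates = {"constant": fit_constant, "threshold": fit_threshold}
scores = {name: error_rate(fit(seen), held_out) for name, fit in candidates.items()}

# "Hire" the candidate with the lower held-out error, then retrain it on everything.
winner = min(scores, key=scores.get)
hired = candidates[winner](data)
print(scores, winner)
```

On this toy data the threshold learner makes no mistakes on the held-out points while the constant learner gets half of them wrong, so the threshold learner is the one that gets hired and retrained on all the data.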

David Wolpert (38m 10s):

And moreover, what it lets us do is the following. Let's say that David Krakauer comes along and chooses whether to hire Michael or David, based upon who does better at making the classification of dogs versus cats on images that they haven't seen yet - a very reasonable way for David to come along and make his choice. Then Geoff West comes along – Gandalf - a very curmudgeonly kind of a fellow, who says, “No, David, I don't believe any of that. What I'm going to do is I'm going to also choose between David Wolpert and Michael Garfield and hire one of them to tell me whether a new image is a dog or a cat. But the way that I'm going to make my hiring decision is not to see whether David or Michael is more accurate on a given test set that I'm giving them, like you're doing, David Krakauer, silly man. Instead, I'm going to choose between them based on who does worse.”

David Wolpert (39m 09s):

So David and Michael have both come up with their theory, their little code. They've written their little algorithm to classify an image on whether it is a dog or cat. And I now have a test set on which we're taking these algorithms, these codes, and seeing which one does better. And Michael's code does better on this little test set that neither David nor Michael has seen before; it does better at classifying dogs versus cats than David's code. So therefore David Krakauer hires Mike Garfield. Geoff West in the same situation, being curmudgeonly and a contrarian, he chooses David, who on all the experiments so far has f****d up, he has gone down in flames. Michael did much better. Based upon that, Geoff West, the contrarian, chooses David.

David Wolpert (39m 59s):

Guess what the No Free Lunch Theorem says? It says that there are as many universes in which Geoff West just made the better choice as there are in which David Krakauer did. And it is extremely weird that that should be the case. No Free Lunch Theorems tell us that's true. It contradicts common sense, and in terms of the mathematical theory, it's actually very, very difficult to make test domains like the fictional dogs versus cats that actually demonstrate this.

David Wolpert (40m 32s):

They have to be as common as test domains in which the scientific method works. The anti-scientific method has to actually beat the scientific method just as often as the other way around. But it's very, very hard to construct little experimental test beds in which that's true, but the No Free Lunch Theorems say it is true. And this is something very deep about our world, that the scientific method actually works. Nobody actually, rather than the ad hoc technique of cross-validation, nobody's got a mathematical, Pythagorean-theorem-style proof that cross-validation works, because you can't have one, by No Free Lunch, but it does work. It works damn well, better than anything else. And nobody has a f*****g clue as to why science works, in this kind of a formal, proven, like-you-prove-the-Pythagorean-theorem sense.
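
The claim that choosing the loser does as well as choosing the winner can be checked by brute force on a world small enough to enumerate. The sketch below is only an illustration under stated assumptions: a six-point input space, binary labels, a uniform average over all 64 possible labelings (the setting in which the theorem is stated), and two hypothetical learners, `majority` and `minority`, invented for this example. Performance is measured only on points outside the training data.

```python
from itertools import product

X = range(6)                               # tiny input space (an assumption)
FIT, VAL, TEST = (0, 1), (2, 3), (4, 5)    # disjoint index sets

def majority(labels):
    # Hypothetical learner: predict the most common training label (ties -> 0).
    guess = 1 if 2 * sum(labels) > len(labels) else 0
    return lambda x: guess

def minority(labels):
    # Hypothetical learner: predict the opposite of what `majority` would.
    guess = 1 - majority(labels)(None)
    return lambda x: guess

def error(model, f, points):
    # Fraction of the given points on which the model disagrees with target f.
    return sum(model(x) != f[x] for x in points) / len(points)

def off_training_error(choose_worse):
    # Average test error, over every possible target, of the learner picked
    # by cross-validation (choose_worse=False) or by its contrarian twin.
    targets = list(product((0, 1), repeat=len(X)))
    total = 0.0
    for f in targets:
        scores = {}
        for learner in (majority, minority):
            model = learner([f[x] for x in FIT])
            scores[learner] = error(model, f, VAL)
        pick = max if choose_worse else min
        chosen = pick((majority, minority), key=scores.get)
        final = chosen([f[x] for x in FIT + VAL])   # retrain on all seen data
        total += error(final, f, TEST)
    return total / len(targets)

cv_error = off_training_error(choose_worse=False)
anti_cv_error = off_training_error(choose_worse=True)
print(cv_error, anti_cv_error)   # 0.5 0.5
```

Both averages come out to exactly 0.5. Whatever the chooser does with the labels it has seen, the labels of the unseen points are, in this uniform average, independent of them, so neither hiring rule can gain an edge. Any real advantage of cross-validation has to come from the world not being a uniform draw over all possible labelings.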

Michael Garfield (41m 25s):

So this seems tied to the question of why does math work at all?

David Wolpert (41m 31s):

They’re related, yup.

Michael Garfield (41m 32s):

I'm glad that we're here now, because you know, in the philosophy of science question of, you know, like Simon DeDeo gave that talk to the Applied Complexity Network last week about conspiracy theories, and how you can't really fault someone for being a conspiracy theorist because we're all applying an aesthetic preference, if you will, towards, you know, we're, we're seeking simple, consilient, unifying explanations. And so we look for the lowest Kolmogorov complexity or whatever.

David Wolpert (42m 10s):

Yeah, that's an easy one. No Free Lunch is smack up in the middle. It says going for … razor, lowest common Kolmogorov complexity, that's an easy one. It's easy to find situations where that dies. Cross-validation is the harder one. Yeah, but you're right, it's the same adjacent thing that the conspiracy theorists, it's a contingent fact of the world. Is QAnon full of it or are they actually, everything that they're saying is true? Well, if you're just a little bit science-fictiony, you can very easily imagine, or you were talking about multiverses before. There are many, many ways that you can imagine there might be a reality, which agrees with everything that we've seen in our lives, you and me and everybody else, which says no, that's a crock, but it turns out to be true. You can't use the Pythagorean theorem to distinguish between those two.

David Wolpert (42m 59s):

There's this, you said it's just like Wigner’s comment about the unreasonable effectiveness of math in the physical world. It's actually, in a certain sense, putting a wedge between math and the physical world. And it's saying, here's a way of phrasing it, the mystery of why the scientific method works. Maybe I actually was too simple-minded in my thinking about what your statement was. One way to put the No Free Lunch Theorem is that it drives a wedge between math and physics. Over on that side are things you can prove with math. And over on this side are things that are going to be contingent on the state of the world, with all the sciences, biology, chemistry, physics, economics, sociology, whatever.

David Wolpert (43m 39s):

And there are some things that don't fit there. I'm just saying that you can't actually take all that power over on the left-hand side of the fence and use it to resolve things on the right-hand side of the fence. It depends on what reality happens to be. But people like Wigner have found that that might be true. He didn't know No Free Lunch Theorems. I actually knew him by the way, very, very briefly, nice guy. He didn't know No Free Lunch Theorems, but he did understand that there was a mystery here about why the mathematical power should actually be so applicable. Why the stuff from the left-hand side of the fence should have so much power on the right-hand side of the fence. And the scientific method, which we can actually formalize. You can write it down. What does it mean? It means cross-validation. We know what it means. So there's, in a certain sense, an even deeper mystery about why the scientific method works, because whereas why math works or not is a hard question to even formalize, we can formalize the question of whether the scientific method works or not, and we can prove that it shouldn't. But it does.

Michael Garfield (44m 43s):

Yeah, exactly. You're right to note, there are two theorems here. One focusing on inference and one on search, right?

David Wolpert (44m 54s):

Yeah. Optimization in general.

Michael Garfield (44m 56s):

Yeah, I'd like to hear you differentiate those a little bit and cause, cause you know, this is related to a lot of the other work that's going on at SFI and just sort of understanding a general theory of intelligence and like what intelligence and adaptation even are, when we're defining complex systems by calling them adaptive matter. What do we mean? So I mean, aside from the fact that this is arguably, reality is where formalisms fail, right? That this is a, you know, we're, we're just, we're trying to climb up a vertical slope here. The theories we construct are just like whatever sand pile we happen to be standing on.

David Wolpert (45m 39s):

Yup.

Michael Garfield (45m 40s):

I'm curious to hear you talk about differentiating those two theorems and what that implies about intelligence more broadly.

David Wolpert (45m 50s):

Okay. So those two theorems actually in a certain sense, aren't differentiated, we've found after the ones for machine learning were written, those are the first ones, proving that, I mean, the intuition is actually another way I've seen. It's very, very simple. Everybody in high school has had this little exercise of curve fitting. You draw down some graph paper, you draw some dots and you say fit the curve. And if you're like many other people, you might react with annoyance, saying, “How you gonna prove to me that one fit is right and another one's wrong?” And the No Free Lunch Theorem says, “You know, you annoying little brat kid who said that to the teacher. You were right.” And it's similar if you ever see these things, like at the back of newspapers, or I guess online versions now, where they are intelligence tests: complete the following sequence. And if you're like me, you would look at that and get really annoyed and say, “This is stupid! There's no right or wrong answer. I can put any number I want down there, any letter. And you're not going to be able to prove to me I'm right or I'm wrong. This is a stupid intelligence test!” And that bratty kid, that bratty person reading these intelligence tests, the blank papers, they're right. That's what the No Free Lunch Theorem says. That's the core idea.
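
The bratty kid's objection is easy to make concrete. In the minimal Python sketch below, the three dots and the two candidate curves (`line` and `cubic`) are invented for illustration. Both curves pass through every dot exactly, yet they disagree about what comes next, so the dots alone cannot say which continuation of the sequence 0, 1, 2 is "right".

```python
# Three dots on the graph paper (an illustrative assumption).
points = [(0, 0), (1, 1), (2, 2)]

def line(x):
    # The "obvious" fit: y = x.
    return x

def cubic(x):
    # A rival fit: the added term vanishes at x = 0, 1, 2, so the curve
    # agrees with `line` on every dot but departs from it elsewhere.
    return x + x * (x - 1) * (x - 2)

# Both curves fit the data perfectly ...
fits_line = all(line(x) == y for x, y in points)
fits_cubic = all(cubic(x) == y for x, y in points)

# ... and give different answers at the first unseen point.
print(fits_line, fits_cubic, line(3), cubic(3))   # True True 3 9
```

Preferring 3 over 9 is a choice about which curves are plausible in advance, which is exactly the kind of assumption the theorem says cannot be avoided.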

David Wolpert (47m 03s):

What it was first applied to was what's called machine learning, because curve fitting is all that machine learning is. Curve fitting, you've got that little graph paper. Rather than dots that you're drawing a line through, on your x-axis are instead images. And then on your y-axis, it's basically labels: are you a cat or are you a dog? And all any image recognition machine is doing is curve-fitting, in that rather more exotic kind of a space. That's it.

David Wolpert (47m 32s):

And No Free Lunch says, just like you were doing in school, when you're fitting those dots, there's no right or wrong answers. You can't prove that there's one that's right or wrong. And then this whole business about cross-validation says that, in point of fact, if little Jack and little Jill are both in curve-fitting exercises and you find that Jack does much, much better in all the tests of curve-fitting than Jill does, that doesn't mean that when they go out into the outside world, that Jack's fits to curves are actually going to be any better than Jill's fits, even though he won all the tests and exams inside the school room. That's why you can't prove the scientific method works. The scientific methods all work in the school room where Jack, who happens to be the person who came up with quantum mechanics, the people who came up with quantum mechanics, they did a much, much better curve-fitting. But then when we go outside of that, then it might turn out that Jill's actually doing much, much better than that in point of fact, ….. is correct or something like that. And so to answer your question in this roundabout way, we realized that that same basic idea could apply if you are trying to do a search algorithm when you don't know what the underlying fitness function is, or if you're trying to draw a curve between some dots. It's the same basic principle that the bratty kid enunciated so well back in second grade.
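
The search version of the result can be sketched the same way. The example below is an illustration under stated assumptions, not the construction from the paper: four candidate points, fitness values drawn from {0, 1, 2}, a uniform sum over all 81 possible fitness functions, and two hypothetical strategies, `sweep` (left to right) and `adaptive` (which uses what it has seen to decide where to look next). Neither strategy revisits a point.

```python
from itertools import product

VALUES = (0, 1, 2)   # possible fitness values (an assumption)
N_POINTS = 4         # size of the search space (an assumption)

def sweep(history):
    # Visit points 0, 1, 2, ... in order, ignoring what was observed.
    return len(history)

def adaptive(history):
    # Start at point 3; after seeing a top score go to its neighbour 2,
    # otherwise jump across to point 0.
    if not history:
        return 3
    return 2 if history[-1][1] == 2 else 0

def best_after(strategy, f, steps=2):
    # Run the strategy for `steps` evaluations and report the best value seen.
    history = []
    for _ in range(steps):
        x = strategy(history)
        history.append((x, f[x]))
    return max(value for _, value in history)

# Every possible fitness function on the four points.
landscapes = list(product(VALUES, repeat=N_POINTS))
totals = {s.__name__: sum(best_after(s, f) for f in landscapes)
          for s in (sweep, adaptive)}
print(totals)   # {'sweep': 117, 'adaptive': 117}
```

Summed over every possible landscape, the clever-looking strategy and the blind sweep find equally good points. The adaptive rule only pays off if high values really do cluster together in the landscapes one actually meets, which is a fact about the world and not something the search algorithm can prove.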

Michael Garfield (48m 55s):

Yeah. For instance, Melanie Mitchell and Jessica Flack’s recent piece in Aeon, talking about how you create key performance indicators in the workplace, and then people attune to that, and then there's a drift and you're not actually measuring what you thought you were measuring. Everyone knows good students don't, I mean, um, I was a high-testing kid.

David Wolpert (49m 16s):

Laughter.

Michael Garfield (49m 16s):

That was a rude awakening. I know you got a hard stop in about a minute, but I just have to ask, you know, given the vertigo of this revelation, how you even continued doing science for another 20 plus years?

David Wolpert (49m 36s):

Yeah, one minute it's turtles all the way down. It is exactly as you said, what sandbox you choose to be in. There has never been an argument that anybody has offered that refutes the hypothesis that you’re a brain in a vat. It gets better, in terms of your memories, which you think are real. From a physics point of view, memories are just retrodiction. They are predictions of the past, rather than the future. There is actually no legitimate reason for believing your memories, any more than you would believe your predictions. It's just, we'd like to think that this fore-accuracy is better but we can't be sure. And No Free Lunch, in some of my notes to myself, I refer to that as THE PROBLEM, all in caps, because it's saying a similar kind of a thing with the entire scientific method. All of these things are different aspects of the fact that it's turtles all the way down, that doesn't bottom out.

David Wolpert (50m 31s):

It’s when you appreciate that, that you go back to the beginner's mind like you were talking about today; this whole podcast ties itself up. That's when you get to Zen. That's when you achieve satori, you realize that no, there ain’t nothing down there. All that we can be doing is trying to figure out some of the structure up at this level that we happen to be. You know, this turtle is standing on that turtle, standing on this turtle. So, but let's take that bottom turtle as a given. I can say something about this local structure of turtles, but there's nothing underneath it all. Ultimately, any turtle I might want to stand on might just burp and Yertle the Turtle, and we all come falling down.

Michael Garfield (51m 09s):

Well, that feels like a timely sentiment. That's an important thing to bear in mind, amidst the turbulence and uncertainty of our time. David, thank you so much for taking the time.

David Wolpert (51m 27s):

All right, I hope people get some value out of it.

Michael Garfield (51m 32s):

Yeah. And I look forward to having you on again, to lead us by the hand into, you know, some other ineffable zone.

David Wolpert (51m 40s):

Yeah. Philosophy hits the road.

Michael Garfield (51m 44s):

Take care.

David Wolpert (51m 45s):

Take it easy.

Michael Garfield (51m 48s):

Thank you for listening. Complexity is produced by the Santa Fe Institute, a non-profit hub for complex systems science located in the high desert of New Mexico. For more information, including transcripts, research links, and educational resources, or to support our science and community efforts, visit santafe.edu/podcast.