COMPLEXITY: Physics of Life

Sidney Redner on Statistics and Everyday Life

Episode Notes

Complexity is all around us: in the paths we walk through pathless woods, the strategies we use to park our cars, the dynamics of an elevator as it cycles up and down a building. Zoom out far enough and the phenomena of everyday existence start revealing hidden links, suggesting underlying universal patterns. At great theoretic heights, it all yields to statistical analysis: winning streaks and traffic jams, card games and elevators. Boiling down complicated real-world situations into elegant toy models, physicists derive mathematical descriptions that transcend mundane particulars — helping us see daily life with fresh new eyes.

Welcome to COMPLEXITY, the official podcast of the Santa Fe Institute. I’m your host, Michael Garfield, and every other week we’ll bring you with us for far-ranging conversations with our worldwide network of rigorous researchers developing new frameworks to explain the deepest mysteries of the universe.

In this episode, we speak to SFI Resident Professor Sidney Redner, author of A Guide to First-Passage Processes, about how he finds inspiration for his complex systems research in the everyday — and how he uses math and physics to explore hot hands, heat waves, parking lots, and more…

If you value our research and communication efforts, please rate and review us at Apple Podcasts, and/or consider making a donation at santafe.edu/podcastgive. You can find numerous other ways to engage with us at santafe.edu/engage. Thank you for listening!

Join our Facebook discussion group to meet like minds and talk about each episode.

Podcast theme music by Mitch Mignano.

Follow us on social media:
Twitter • YouTube • Facebook • Instagram • LinkedIn

Key Links:

Sidney Redner’s SFI Webpage
Redner’s textbook, A Guide to First-Passage Processes

Papers Discussed:

Kinetics of clustering in traffic flows
Winning quick and dirty: the greedy random walk
When will an elevator arrive?
Role of global warming on the statistics of record-breaking temperatures
Understanding baseball team standings and streaks
Random Walk Picture of Basketball Scoring
Safe leads and lead changes in competitive team sports
Simple parking strategies
A Fresh Look at the “Hot Hand” Paradox
Citation Statistics from 110 Years of Physical Review

Explainer Animations:

Simple Parking Strategies: A Primer
"Sleeping Beauties" of Science: Unseen ≠ Unimportant
When Will An Elevator Arrive?


 

Episode Transcription

Transcript generated by machine at podscribe.ai and edited by Aaron Leventman.

 

Sidney Redner (0s): There's a very famous saying, which has been attributed to Einstein, but apparently it was not his saying, which is that a model should be as simple as possible, but no simpler. I try and live in that space of the model should be as simple as possible, but no simpler, so that we capture some of the collective behavior that one might see in a complex system. Like where should you park your car? Or what is the length of winning streaks of a baseball team?  But all these other things, maybe some people like to park in the shade. Maybe some people want to get a bit more exercise. These are like complications that in principle, I know the tools of how you might deal with them, but it then loses the simplicity and the beauty of like a clean solution. 

 

Engineers have to worry about real-world problems. They have a real parking lot with a real traffic pattern in it. And then you have to worry about all of these microscopic details. But I think that with those microscopic details, you then lose the forest for the trees. And so I'm trying to look at the forest, the big picture, rather than each individual tree. 


 

Complexity is all around us: in the paths we walk through pathless woods, the strategies we use to park our cars, the dynamics of an elevator as it cycles up and down a building. Zoom out far enough and the phenomena of everyday existence start revealing hidden links, suggesting underlying universal patterns. At great theoretic heights, it all yields to statistical analysis: winning streaks and traffic jams, card games and elevators. Boiling down complicated real-world situations into elegant toy models, physicists derive mathematical descriptions that transcend mundane particulars, helping us see daily life with fresh new eyes. 

Welcome to Complexity, the official podcast of the Santa Fe Institute. I'm your host, Michael Garfield, and every other week we'll bring you with us for far-ranging conversations with our worldwide network of rigorous researchers developing new frameworks to explain the deepest mysteries of the universe. In this episode, we speak to SFI Resident Professor Sidney Redner, author of A Guide to First-Passage Processes, about how he finds inspiration for his complex systems research in the everyday, and how he uses math and physics to explore hot hands, heat waves, parking lots and more. 

If you value our research and communication efforts, please rate and review us at Apple Podcasts, and/or consider making a donation at santafe.edu/podcastgive. You can find numerous other ways to engage with us at santafe.edu/engage. Thank you for listening. All right.  

Michael Garfield (3m 1s): Sidney Redner, it's a pleasure to have you on Complexity Podcast. 

Sidney Redner (3m 4s): I'm glad to be here. 

 

Michael Garfield (3m 6s): As we typically do on this show, I'd like to start by inviting you to talk a little bit about your early years and what got you into science in the first place and how you were animated to become a physicist. What burning questions drove you into it and who helped you along the way?

Sidney Redner (3m 30s): That's an interesting question. First of all, ever since I was a little kid, I knew I wanted to be a physicist. I mean, my mother's a Holocaust survivor. My dad is a Holocaust escapee. They came to the United States and then to Canada with no education. They didn't understand anything about science or anything like that. But somehow I remember when I was about six years old, they bought me a book. It was called the Golden Book of Mathematics and I was hooked. I was just simply hooked. And so I knew right from the very get-go that I wanted to be a scientist. 

Michael Garfield (3m 60s): So how did you decide to pursue the trajectory that you have pursued in your career, as far as research topics? Because it seems like a lot of your work, as we'll discuss in this episode, focuses on a very particular approach to thinking about stochastic processes. And obviously this has very wide applications, but it's a very narrow focus, and maybe it's worth defining some of these core concepts here too. 

Sidney Redner (4m 28s): First of all, I followed a trajectory, which I think is very common among physics graduate students, which is that when I was first learning about physics at a dilettante level, I learned about relativity and high-energy physics. And I thought, well, that's the most pure, the most fundamental, that's what I'm going to do. And that fantasy lasted until about my first year of graduate school. And I realized it's not really all it's cracked up to be and there's a whole other world out there in other fields of physics. When you're a high-energy physicist, you think everything else is kind of like squalid-state physics, according to, I guess, Murray Gell-Mann, but there's just so many beautiful problems out there. 

And I actually sort of fell into statistical physics, which is formally the field of physics I work in. I kind of fell into it almost by accident, but once I fell into it, there's a certain intrinsic beauty, because statistical physics is both nowhere and everywhere at the same time, in the sense that there's no division of statistical physics in the American Physical Society, but statistical physics underlies the reasoning that people use to understand all kinds of systems, because as soon as you have many particles or many degrees of freedom, you can never hope to understand the microscopic motion of every single particle. 

For example, like the weather. You can't follow the motion of every single molecule in the atmosphere and figure out the weather that's just hopeless. And so one has to develop statistical methods to describe these sorts of systems in a statistical way, rather than a deterministic way. And that kind of reasoning just resonates with me all the time. 

Michael Garfield (5m 58s): It strikes me that your own career is actually a good example of your domain of research and that you kind of applied a random walk through physics until you like landed somewhere. 

Sidney Redner (6m 10s): I guess my choice of research topics has been more eclectic than the average person's. I generally have not worked with real experimental data. There are some theorists who say, “let me look at the data of this scattering experiment and try and understand the interactions and try and make some predictions about some material.” I do work with data at certain times, but I'm just sort of looking out my window right now, looking at the leaves shaking and thinking about birds landing on the tree and how far apart they're going to be spaced and all that. I see the world around me as just full of beautiful examples of statistical phenomena that I like to try and model and formalize and try and solve. 


Michael Garfield (6m 51s): So while you were a professor at Boston University, you wrote a textbook called A Guide to First-Passage Processes. And because this is a concept that has bearing on so much of what we're going to talk about today, I'd like for you to just introduce this concept and explain why it is so fundamental and so illuminating to so many different areas of inquiry. 

Sidney Redner (7m 14s): So first-passage processes are kind of a subset of the whole universe of random walks. So the first thing is that there are many processes in nature that are described by a random walk. Maybe you've heard about a random walk down Wall Street. Stock prices move in a stochastic manner, and one might try to model that by a random walk. And then there's something called stock options, where if a stock hits a certain price by a certain date, something happens. So oftentimes you're interested in when a random walk first reaches a threshold level. 

So the stock option is a typical example of that. Another example is integrate-and-fire neurons in the brain. You have neurons that are firing in the brain all the time. There's this threshold voltage. And every time a neuron fires, it sends currents to neighboring neurons. And so the voltage level on a given neuron is fluctuating randomly with time. But when it reaches a certain threshold level, it fires and sends signals somewhere else in the brain. And that's basically how the brain works. It's just this cacophony of firing of neurons all the time, but they're all driven by this underlying first-passage process where a voltage has to reach a certain threshold level before it fires. 

And the voltage level is gradually rising, but rising in a fluctuating way. 
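The threshold-crossing picture Redner describes can be sketched in a few lines of Python. This is a generic toy, not a model from his book: a "voltage" drifts slowly upward with Gaussian fluctuations, and we record the first step at which it crosses a threshold.

```python
import random

def first_passage_time(threshold=30.0, drift=0.1, noise=1.0, rng=None):
    """Steps until a drifting, fluctuating level (a toy integrate-and-fire
    voltage) first reaches the threshold."""
    rng = rng or random.Random()
    v, steps = 0.0, 0
    while v < threshold:
        v += drift + rng.gauss(0.0, noise)  # slow rise plus random fluctuation
        steps += 1
    return steps

rng = random.Random(0)
times = [first_passage_time(rng=rng) for _ in range(2000)]
print(sum(times) / len(times))
```

For a drift of 0.1 per step and a threshold of 30, the mean first-passage time comes out near threshold/drift, though individual runs fluctuate widely around that mean.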

Michael Garfield (8m 28s): This would also apply then to diffusion based processes like circulating oxygen through a body, specifically like the transition between passive oxygen diffusion to an active pump circulation. We're talking about this with Geoffrey West, talking about the emergence of the circulatory system or talking with Brian Arthur about the emergence of active initiatives to redistribute wealth in society as society scales. That all depends on first passage resetting and how long it actually is going to take a given oxygen molecule to make it from outside to the inside of an organism. 

Am I getting that right? 

Sidney Redner (9m 6s): Well, you've got pieces of that certainly correct. The place where I think that first-passage processes are still yet to be fully developed is understanding physiological processes. And so the question that you were asking: if you take a random oxygen molecule, how long does it take before it binds to a cell and gets transported inside the body? How long does it take? I mean, that is an example of a first-passage process, but it's not just pure diffusion, because, you know, you're breathing in. And so there's convection, there's turbulent flow. There's all kinds of other things going on. But nevertheless, one is interested in how long it takes for oxygen to get into your cells. 

Michael Garfield (9m 41s): Let's make it a bit simpler, as you do very well in some of your papers, creating these toy models of traffic and parking and so on. And let's start with this paper that you co-authored with Ben-Naim and Krapivsky on traffic, on kinetics of clustering in traffic flows. So could you walk us through how you built this model? 

Sidney Redner (10m 5s): First of all, it's something that we've all seen on a two-lane highway when you're trying to pass and you can't pass 'cause there's traffic coming the other way. You pile up behind a slower car. And in fact, this actually has an amusing backstory, because when my kids were about eight and five, I thought we should go to Cape Cod for vacation. So we're driving down the Southeast Expressway, then there's Highway 3 to get to Cape Cod. And at some point you go across the bridge and you're on Cape Cod, and it's a four-lane highway that suddenly changes to two lanes. 

And if you know anything about Boston, you think the infrastructure is always messed up. “Oh, this must be a mistake. They're under construction.” No, it's not a mistake. There's a 13-mile stretch called suicide alley where the four-lane road turns into a two-lane road, and passing is definitely not allowed. There are little bollards in the road. And so I remember thinking, “Damn it, it's going to take us forever to get to Cape Cod.” But what I notice as we're entering suicide alley is there's cars coming in the opposite direction, like every second. And then after a while there's a huge gap. 

And for two minutes, no car is passing me, and then another huge cluster, every car passing once a cycle. And finally, when we get to the far end of suicide alley, I realize there's no more large clusters. There's just cars randomly entering the road. And so it hit me like a bomb: of course, all that's happening is that cars that are faster than the average are getting stuck behind a cluster. And we can understand that mathematically, and I have the tools for how to treat it mathematically. And so, bingo, that was the start of a paper on traffic clustering in one-dimensional flows. 

Michael Garfield (11m 41s): Yeah. I think it's key in this piece, as in the piece on parking, that for the sake of simplicity you're reducing it to that one-dimensional traffic flow. 

Sidney Redner (11m 50s): The point is that nature in that particular case was exactly one-dimensional flow. 

Michael Garfield (11m 54s): So this is actually a rather detailed model. You're assuming a power law behavior? 

Sidney Redner (12m 0s): It doesn't really assume very much, in the sense that to make the basic theory, all you need is that there's a distribution of intrinsic speeds for each car. Like a little old lady might drive a little slower than the average and an aggressive young man might drive faster than the average. All you need is a distribution of velocities. And so you start with cars flowing on a one-dimensional line, each with their own intrinsic velocity, and then one just follows what happens. And so if a faster car catches up to a slower car, now you have a cluster of size two. Maybe this cluster of size two catches up to an even slower car. 

And you got a cluster of size three and then this clustering just continues ad infinitum. And so the actual basic phenomenology doesn't depend on the distribution of intrinsic speeds. It turns out that if you choose a particular form for that intrinsic distribution, you can solve it. It makes pretty formulas. But that's almost irrelevant. The fact that there's just a distribution of speeds is the main point. 
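The clustering mechanism can be illustrated with a crude simulation. This is an editorial sketch, not the kinetic theory of the paper: cars get random positions and random intrinsic speeds on a one-lane road, each drives ballistically for a time t, and a car that would overtake the car ahead is simply capped at that car's position, joining its cluster (the cluster's motion after merging is ignored).

```python
import random

def cluster_count(n_cars=1000, road_length=10000.0, t=0.0, seed=1):
    """Count clusters after time t: sweep from the lead car backward,
    capping each car's ballistic position at the position of the car
    (or cluster) ahead of it."""
    rng = random.Random(seed)
    cars = sorted((rng.uniform(0, road_length), rng.uniform(0.5, 1.5))
                  for _ in range(n_cars))          # (position, intrinsic speed)
    clusters = 0
    pos_ahead = float("inf")                       # front of the cluster ahead
    for x, v in reversed(cars):                    # lead car first
        x_new = x + v * t
        if x_new < pos_ahead:
            clusters += 1                          # this car leads a new cluster
            pos_ahead = x_new
        # else: it would overtake, so it's blocked and joins the cluster ahead
    return clusters

print(cluster_count(t=0.0), cluster_count(t=500.0))
```

At t = 0 every car is its own cluster; as t grows, faster cars pile up behind slower ones and the cluster count drops sharply, with no assumption about the speed distribution beyond its existence.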

Michael Garfield (12m 53s): And so it seems like you can draw from this an illuminating analogy to some other one-dimensional flows. Like where else do you notice, where do you observe this kind of behavior outside of traffic? And then maybe, like, a B part to that question would be: did you find, in doing this work, that you gained any insight into what could be done to kind of counter this? Is there a way to unstick or prevent traffic jams by regulating on-ramps, like you see in Los Angeles, like some effort in this direction, but that doesn't actually account for the tiny variations once you get on the highway? 

So that's an A and a B. 

Sidney Redner (13m 31s): Let me answer the B first. I think I have a reasonable answer for that, which is that if you have a one-dimensional road with only a single lane of traffic, of course it's going to strongly cluster. And you know, there are places where maybe there's a broken line in the center and you can pass, but it's still pretty restricted. But a thing that you can do, which a lot of places do in rural areas, is that you just make a third passing lane every once in a while. And so an interesting question then is: if you want to minimize the cost of building more third passing lanes, but you want to maximize traffic flow, what should you do? 

That sounds like actually an interesting optimization problem that I've not really thought about. 

Michael Garfield (14m 7s): That seems related to this other issue that comes up a lot in complex systems research, which is the trade-offs between the efficiencies of an economy of scale and the resilience or adaptability of so-called inefficient systems. Every one of those little turnoffs is an additional investment in the infrastructure that would seem, at a first layer of analysis, to be a waste of money. But here you're saying, actually, if you embed it in a larger consideration, then it may actually be saving us time and money. 

Sidney Redner (14m 42s): You invest money to build the passing lane. But on the other hand, people save a certain amount of time, and time is money, so there is some optimization calculation one could formulate. 

Michael Garfield (14m 53s): So maybe this isn't the right piece to tackle next. We're not taking this in chronological order, but I did just read the paper that you did with Sarah Feng on elevators. And this seems, again, related to a very tiny piece of a very in-depth paper that the two of you wrote on how long you're going to sit there in the lobby, waiting for an elevator to arrive. And the question of how many elevators a given building should have in order to minimize waiting time. Again, I think the real-world motivation for this piece should be fairly self-evident, but I'd love to hear a bit of the backstory on this and then how the two of you broke down this problem and actually attacked it. 

Sidney Redner (15m 31s): I guess the other thing is that we're living in an increasingly crowded world. Not so much in Santa Fe, but when I was living in Boston, you always had to worry about where you could park your car. You had to avoid driving here at this time of day because you wouldn't find a place to park. And if you went to the movie at 9:30, it was overcrowded and you wouldn't get in. So, I mean, there's always lots of phenomenology involving crowding. And so I've always been interested in examples where you have crowding. You know, it happens when you try and board or get off an airplane. It happens when you go into the lobby of a building. It's everywhere. 

So, you know, Sarah came as a Research Experience for Undergraduates student. We were just talking, like, what could we work on that one could finish in a summer, which we never did finish in the summer. I don't know how it came up. I'm an impatient person. I might've said, “Damn it, I waited like 10 minutes for something or other.” And so somehow we hit upon thinking about waiting for an elevator, because everybody does it and everybody probably feels a little bit frustrated about it. I thought, well, we could perhaps mathematize it. 

And you know, the way we mathematize it is the way any self-respecting physicist does, which is you first start with a spherical cow approximation. And so the spherical cow approximation for an elevator would be one elevator that has infinite capacity, because then when the elevator comes, nobody's left behind. So it simplifies the mathematics, and that turned out to be easy to solve. And then it sort of led us to think about a finite-capacity elevator, and then multiple elevators is the natural progression: starting with something so idealized that it's not relevant to the real world, but it provides you an insight into how to then deal with the more realistic cases of a finite-capacity elevator and multiple elevators. 

Michael Garfield (17m 18s): One of the things that I know appeals to many, many people in the complex-systems enthusiast world is this question of synchronization. I just had a conversation with Orit Peleg about firefly synchronization. And I think this is a really nice place to expand on the traffic jam and crowding element from the one-lane road example, to talk about how, even if you only have one elevator per track, you still end up with synchronization. 

Could you talk a little bit about that because it took me a little time to wrap my head around how you would still end up with these situations where you get traffic jams, even with a bunch of independent elevator cars. 

Sidney Redner (18m 2s): The point is that even though their trajectories are independent, the loading process makes them highly correlated. It's easier to understand from the framework of thinking about a bus-route model, because in fact, part of the inspiration for this, there was a very beautiful paper, maybe about 30 years ago, I forget now, about the one-dimensional bus-route model. And in that model, you imagine you have a bus route which is a circular loop, and there's a number of buses going around the loop, and there's equally spaced stops, and passengers arrive at random at the various bus stops. A random number of passengers get off at each stop. 

And the point is that in this one-dimensional world, if by some bad luck there's more passengers waiting at a stop than average, the bus at that stop takes longer to pick up the passengers, and the bus behind will catch up. And when it gets to the stop where the previous bus took longer than average, there's typically fewer than average people waiting, because the next bus has started to catch up. So there's a smaller time gap. And so that instability just feeds on itself, because the second bus has picked up fewer passengers, so it takes less time at that one bad stop. 

This instability just builds and builds. And so the same kind of thing happens in the elevator situation. It's just that instead there's stops along the floors, but then there's a long stop at the lobby. And it's the long stop at the lobby where, if there's more passengers than average, that one elevator spends a long time waiting. And so there's more time for the next elevator to come to the lobby and catch up to the elevator that's loading up to capacity. And by the time it leaves, maybe there's fewer passengers waiting, and the next elevator can pick them up quickly. 

And if it's behind, then it can catch up to the elevator that's ahead of it. 
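The bunching instability is easy to reproduce in a toy bus-route simulation. This is a sketch under simple assumptions, not the model from the elevator paper or the original bus-route paper: dwell time at a stop is proportional to the time since the previous bus left it, so a late bus picks up more passengers and falls further behind.

```python
import random
import statistics

def bunching_demo(n_buses=2, n_stops=12, pickup_rate=0.05, laps=12, seed=3):
    """Buses circle a loop of stops with unit travel time per stop; dwell
    time at a stop is pickup_rate times the time since the previous bus
    left it (accumulated passengers).  Returns the spread of inter-bus
    gaps at stop 0 early vs. late in the run."""
    rng = random.Random(seed)
    h = n_stops / (n_buses - n_stops * pickup_rate)   # rough equal spacing
    t = [b * h + rng.uniform(-0.1, 0.1) for b in range(n_buses)]  # jittered
    last_left = [t[0] - h] * n_stops   # pretend a bus left one headway ago
    arrivals = []                      # arrival times recorded at stop 0
    for _ in range(laps):
        for stop in range(n_stops):
            for b in range(n_buses):   # buses stay in this fixed order
                arr = max(t[b] + 1.0, last_left[stop])  # no passing allowed
                if stop == 0:
                    arrivals.append(arr)
                wait = pickup_rate * (arr - last_left[stop])  # riders to load
                t[b] = arr + wait
                last_left[stop] = t[b]
    gaps = [b - a for a, b in zip(arrivals, arrivals[1:])]
    return statistics.pstdev(gaps[:6]), statistics.pstdev(gaps[-6:])

early, late = bunching_demo()
print(round(early, 2), round(late, 2))
```

Starting from nearly even spacing, the spread of headways grows until the buses travel glued together, which is the same feedback Redner describes for elevators meeting at the lobby.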

Michael Garfield (19m 45s): It feels to me like there might be some insight from this that you could apply to, like, HR practices or the org chart, where especially with smaller organizations, you end up with all of these bottlenecks where one particular duty falls on one person. But then even as the organization scales and you move from, like, a one-elevator model to multiple people that could be answering the same inbox, you can still end up in these jams, where you think you're creating a bit more resilience against this kind of thing by hiring additional people into a particular function, but it doesn't actually solve the problem. 

Sidney Redner (20m 25s): I haven't thought along those lines, as I'm more interested in concrete models, but that's an interesting thought. 

Michael Garfield (20m 33s): Well, moving on in a kind of random walk through your papers, you have one here that again, I think, has a real relatable origin story. This piece you did with Ben-Naim on winning quick and dirty, the greedy random walk. I'd love to hear you elaborate on this one next, if you could. 

Sidney Redner (20m 53s): This is based on the children's game of War. For those who have never played War, it's about as simple as simple can be. You just take a deck of cards, it's divided in two, you each turn up one card at a time face up, and whoever has the higher-value card takes both cards. And then there's one other twist to the model, which is that if you both turn up a card with the same value, then you're supposed to put down three more cards and turn up another card. And it's called a war because it's more costly for the person who loses. And so when the next card is turned up, whoever has the highest card wins. 

And so it's kind of like a random-walk process, because if you look at the number of cards that anybody has, it goes up by one or down by one, or maybe it goes up or down by four after you play a war. And so if you just think about a random walk with 52 cards, and when somebody has all 52 cards the game is over, you can calculate how long this game is going to last on average. It takes a long time. I mean, especially if you're playing with a four-year-old son who really likes to play War, and it's sort of past bedtime and you're tired and you've got things you've got to do as well. 

And so I thought to myself at some point, I have a great idea: “Gabe, let's make it every time a war, let's make it even bigger.” And so of course, being a four-year-old kid, he was really enthused about that. Let's have Mega Wars. And so like the first war, you put three additional cards down and turn up the fourth card. The next one, you put four cards down and turn up the next card. So that was the game of war that I came up with, and it had the incredible benefit that it's over with quickly. And so the paper that we wrote was just the actual study of how much quicker that game is compared to normal War. 

And I was happy when I thought about it because Gabe was happy. The main thing is you want your kid to be happy. So he was happy to play these Mega War games, but I was happy because the game took less time. So win-win situation. 
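The speed-up from escalating wars can be checked with a toy random-walk version of the game. This is an editorial abstraction, not the full analysis of the paper: one player's card count takes a fair ±1 step per ordinary battle, ties occur with probability roughly 1/13, and each war transfers a larger stake, growing by one card per successive war in the "Mega War" variant.

```python
import random

def war_length(escalating, deck=52, p_war=1/13, trials=2000, seed=4):
    """Mean number of turns until one player holds all the cards, in a
    random-walk caricature of War (standard or escalating wars)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        a, turns, wars = deck // 2, 0, 0
        while 0 < a < deck:
            turns += 1
            if rng.random() < p_war:            # matched cards: a war
                stake = 4 + (wars if escalating else 0)
                wars += 1
            else:                               # ordinary battle
                stake = 1
            stake = min(stake, a, deck - a)     # can't stake cards you lack
            a += stake if rng.random() < 0.5 else -stake
        total += turns
    return total / trials

print(war_length(escalating=False), war_length(escalating=True))
```

The larger the stake per war, the larger the step variance of the walk, and the sooner the walk hits an absorbing boundary at 0 or 52 cards, so the escalating variant ends games substantially sooner.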

Michael Garfield (22m 44s): Well, the fact that you use the card game War as an example here, I think, begs, you know, a follow-up question about real-world arms races and James P. Carse's book, Finite and Infinite Games. You're trying to win over a given limited resource, say; that's the finite game. That's different from a situation in which you're trying to keep the game going for as long as possible. Maybe another link to this would be when Geoff West talks about the finite-time singularity, about how a city gets itself stuck in this ratchet where it's constantly bringing crises upon itself faster and faster, and has to innovate faster and faster, has to make these innovative leaps at an ever-shrinking timescale. 

So what do you think that this model suggests about preventing a kind of a super war situation in which the goal is not to end the game as quickly as possible, but to keep it going as long as possible. Is there an intervention that this math suggests? 

Sidney Redner (23m 49s): The simplest way to make the game last a long time is that if somebody has an advantage, you somehow provide a handicap, so the probability of the advantaged person winning the next round is lessened. The one thing which I haven't thought about that seriously, but which is also interesting, is that when you start the game of War, the value of the cards of each player is roughly the same on average. As the game progresses, the one who's got more cards typically has more high-value cards, which is how the game is actually played. So the way to make the game last longer is you handicap based on the higher-value cards. Roughly speaking, what it means is that when you have a random walk, you have a bias that's driving it back towards the origin. 

Every time you like make an excursion to the right, you have a bias pushing it to the left and vice versa. 

Michael Garfield (24m 37s): This seems related to the Schelling segregation model and this notion that you actually have to actively compensate for people's desire to affiliate with people like themselves in order to maintain some heterogeneity in society. It seems like kind of the same thing cut from a 90-degree angle. 

Sidney Redner (24m 57s): At some level there is some commonality. I mean, the Schelling model is extremely simple and extremely beautiful, and it shows that if you just prefer people of your own type, but don't discriminate against the opposite type, it still leads naturally to segregation. So there is this organization that's happening from a more or less random process. 

Michael Garfield (25m 15s): Maybe the analogy would be something like this: the benefits of associating with your own kind of person are kind of equivalent to the benefits of constantly upping the ante in a super war. You sort of race to the finish line with that approach. Again, to take a complete hairpin turn here, you've got a piece that you wrote with Mark Peterson on the role of global warming on the statistics of record-breaking temperatures. And this is a completely loose analogy, but in keeping with the race to the finish line here, I would love to hear you introduce this paper. 

Sidney Redner (25m 55s): Well, actually this paper has a very amusing backstory, because, you know, in the seventies I was really into running and running marathons, and I really wanted to run the Boston Marathon. And in 1976, I was so ready. I was raring to go. I was going to run a 2:50 marathon, and the day of the marathon, it was like 95 degrees. I couldn't believe it, just all that training going for nothing. 

Michael Garfield (26m 19s): And then about nine years later.

Sidney Redner (26m 21s): I was giving a talk at Temple University, and it was a colloquium in early 

Michael Garfield (26m 26s): April. 

Sidney Redner (26m 28s): And because I was invited by someone who was a former graduate student at Boston University, I stayed in his apartment. And that night it was 95 degrees, April 4th, I think 1985. And this is back in the days when you had transparencies. And so I was giving my colloquium, no air conditioning, because no one turns it on in early April, and I'm sweating into my transparencies and they're melting away before my eyes. That bugged me the whole time. Why is it that the first heat wave is always sort of coincident with the Boston Marathon? That stayed with me for many years. 

I gotta be able to make a model of this. I gotta be able to mathematize it. And I tried very hard to get data. And I guess it was in 2004, I was at Los Alamos as a Ulam Scholar that year. And I had a running partner named Mark Peterson. And I remember telling him, I just got data for 150 years of Philadelphia high and low temperatures for every day of the year. I'm going to analyze it and see what I can learn about it. And somehow in the course of that run, we sorta realized that if it's just what's called IID data, independent identically distributed random variables... 

So if I focus on one day of the year, and the temperature on that day is an independent random variable for that particular day, then we know something about how many record temperatures you should expect on that day. And for IID variables, it turns out the number of temperature records grows logarithmically in the amount of time that you've been keeping records. And so that was the thing: could we then take this data and analyze it, look at the number of temperature records for each day of the year, and see if there's some hint of global warming? 
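The logarithmic law for IID records is simple to verify numerically: for IID variables, year k sets a record with probability 1/k, so the expected number of records after n years is the harmonic number H_n, which is approximately ln n + 0.577. A quick sketch (with illustrative Gaussian "temperatures," not the Philadelphia data):

```python
import math
import random

def mean_records(n_years=10000, trials=300, seed=5):
    """Average number of record highs among n_years of IID yearly
    temperatures for one fixed calendar day."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        best = float("-inf")
        for _ in range(n_years):
            x = rng.gauss(0.0, 1.0)
            if x > best:          # a new record high
                best = x
                total += 1
    return total / trials

print(mean_records(), math.log(10000) + 0.5772)
```

The simulated average lands close to ln(10000) + 0.577 ≈ 9.8, and the 1/k record probability holds for any continuous IID distribution, which is why the baseline is distribution-free.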

And so it turns out that for Philadelphia the data was suggestive, though not decisive; that is, we had more record temperatures than we would expect from just random IID variables, which suggested that global warming is actually affecting the number of record temperatures that you see. And subsequent to that, a flood of data became available. Before, it was always behind paywalls and it was always hidden. And then I found data for hundreds of cities around the world, and yes, you can see very clearly the effect of global warming in record-temperature statistics. And, you know, for the average layperson, when there's a record high temperature, people are always talking about whether it's global warming or it's not global warming.

And it's fun to mathematize these basic questions and try to understand what really is the effect of global warming on record-temperature statistics.

Michael Garfield (28m 51s): You checked these predictions not only against temperature data, but also against Monte Carlo simulations. And just as a way of giving listeners little nibbles, little insights, windows into the way a physicist actually approaches these things, tests themselves, and does everything they can not to fall prey to their own biases, I'd love to hear you talk a little bit about the simulation component of this.

Sidney Redner (29m 19s): So the other thing is that at the time we did the study, we only had 130 years of data, which is not a lot of data by, like, any astronomical standards. So certainly the other thing that was worth doing, and we did do, is that we simulated a toy model: we took the temperature statistics of Philadelphia, so we know the average temperature for each day of the year and the spread for each day of the year. And from that we can build a model where we can simulate not hundreds of years of data, but tens of thousands or hundreds of thousands of years of data, and look at the record-temperature statistics for this synthetic data and compare it to what we really see. And so it provides a kind of clear window on the role of global warming in record-temperature statistics, because when you have tens of thousands of years of data, you see much clearer signals than you would with just the 130 or 140 years of data we actually had.
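The synthetic-data idea can be caricatured in a few lines. The toy below is an editorial illustration with made-up parameters, not the actual Philadelphia model: it draws one calendar day's temperature from a Gaussian, optionally adds a warming trend to the mean, and counts record highs.

```python
import random

def records_in_series(years, sigma, trend, rng):
    """Record highs on one calendar day over `years` of synthetic data.
    The daily temperature is Gaussian; `trend` shifts the mean up per year."""
    records, best = 0, float("-inf")
    for y in range(years):
        t = rng.gauss(trend * y, sigma)
        if t > best:
            records, best = records + 1, t
    return records

rng = random.Random(1)
trials, years = 5_000, 130
stationary = sum(records_in_series(years, 5.0, 0.00, rng)
                 for _ in range(trials)) / trials
warming = sum(records_in_series(years, 5.0, 0.05, rng)
              for _ in range(trials)) / trials
```

With any positive trend, the record count sits above the stationary baseline of roughly H_130, about 5.4 records, which is exactly the kind of excess the analysis was hunting for.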

Michael Garfield (30m 11s): Unless I'm misunderstanding this, you say here that your primary result is that we cannot yet distinguish between the effects of random fluctuations and long-term systematic trends on the frequency of record-breaking temperatures.

Sidney Redner (30m 23s): But that's only for the one case of Philadelphia, and subsequent to our paper I did find much more data. And then for longer time series, you really see clearly the effect of global warming in record-temperature statistics. And other people have since done much more with much more comprehensive datasets. And then the other thing is that focusing on record temperatures is like throwing away most of the data, and so there are actually more incisive ways of finding the canary in the coal mine. But I think that we started a little bit of a cottage industry on this. And yes, we do see it clearly, because of the case that I found about six months later.

I think I had 236 years of data, and there it's just crystal clear. You can see the canary in the coal mine just from the record-temperature statistics.

Michael Garfield (31m 8s): In the 15 years since this paper, how has the signal grown, you know, the statistical signal of increasing record-temperature events? This is sort of a casual question: how serious does it seem to you at this point?

Sidney Redner (31m 27s): I'm not a climatologist, and I wouldn't want to stick my neck out on this one. But certainly in the United States it's really clear that records are happening much more frequently now than they did 50 years ago or a hundred years ago. Again, it's just one of many possible signals of global warming, and certainly what people pay much more attention to is not record statistics but average statistics over long periods of time, where you see a very slow but systematic increase in average temperatures.

Michael Garfield (31m 56s): So kind of adjacent to this, in terms of taking enormous datasets and identifying statistical streaks in them, you've got the work that you did with Sire on understanding baseball team standings and streaks. And so it seems like you're applying a similar approach to determining whether this common apprehension, that teams get on streaks and that a streak has something to do with the team's ability, is just a statistical artifact, or whether we're actually seeing something in the system itself that's torquing the results.

Sidney Redner (32m 36s): For that example of, like, a baseball team winning: I used to know the data really well, I've now forgotten it, and I forget which team has the longest winning streak. But the question was, is this something which is really fluky, or is this something that one can understand in a clean and simple way? And the answer, at least I believe, is that you can understand it in a very clean and simple way. One might say, well, if you're on a streak, you feel hot, you're really macho, you intimidate the other team, the other team is a little bit intimidated, and so it feeds on itself.

And in fact, the data at first glance is suggestive, because if teams were all equally strong and there were no systematic effects, no intimidation or anything like that, then the probability that you have a winning streak of n games should be equal to two to the minus n, because half the time you win and half the time you lose. So the probability that you have a two-game winning streak is one-half squared, a three-game winning streak is one-half cubed. And that's what's called exponential dependence on the length of the streak.

And if you look at the data: normally, when we have data that decays exponentially, you plot it on a logarithmic scale and it looks like a straight line. But this data curved upwards. So it looked like something really unusual was going on, because it wasn't an exponential. And it turns out that the apparent curvature is, for one thing, actually a finite-n effect. And the other mechanism is the fact that the teams are not all equally strong. So if you include the fact that the teams have different strengths and work through all the mathematics, you can actually predict the length of the winning streaks as a function of the heterogeneity in the team strengths.

And so we came up with a very simple statistical description of team strength as a function of the number of wins at the end of a season, from which we can infer the relative strengths of teams, from which we can then compute the length of winning streaks for a set of heterogeneous opponents. And we find a very good match between the theory and the data. So there was no intimidation effect, there was no feeding on itself. It was just simple, random statistics.
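The two-to-the-minus-n argument for equal-strength teams is simple to verify by simulation. The sketch below is an editorial illustration (the paper's team-heterogeneity calculation is more involved): it histograms win-streak lengths from a long sequence of fair coin flips and checks that each extra game roughly halves a streak's frequency.

```python
import random

def run_length_histogram(outcomes):
    """Histogram of lengths of consecutive-win runs in a win/loss sequence."""
    counts, run = {}, 0
    for won in outcomes + [False]:        # sentinel flushes the final run
        if won:
            run += 1
        elif run:
            counts[run] = counts.get(run, 0) + 1
            run = 0
    return counts

rng = random.Random(42)
games = [rng.random() < 0.5 for _ in range(200_000)]
counts = run_length_histogram(games)
# For equal-strength teams, counts[n + 1] / counts[n] hovers near 1/2:
# a straight line on a log plot. Heterogeneous strengths fatten this tail.
```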

Michael Garfield (34m 47s): However, you took this dataset and you cut it in half, so you've got 1901 to 1960 and then 1961 to 2005, and things look different after 1961. So what's going on there?

Sidney Redner (35m 0s): This also connects to the question about basketball as well, but the point here is that we're living in a competitive society. People want to get ahead, and, you know, baseball is kind of a magnification of the competitive society, because the whole point of baseball is winning games. And so if anybody has a competitive advantage, the other teams will try to figure out what it is and arbitrage it away. And in the early days of baseball, when baseball players didn't earn a lot of money and had to have a second job, and people didn't have a lot of time to watch baseball games, the intensity of the competition was not at the same level as it is now, where a team will have a masseuse, a psychological coach, a nutrition coach.

They're trying to optimize the performance of every single player as best as possible. And one sees that in the fact that in the period we call the classic period of baseball, 1911 to 1960, when they all played 154-game seasons, the difference between the best team and the worst team was that the best team would typically win two-thirds of all games and the worst team would win one-third of all games. After 1961, the best team typically wins, and I forget the numbers now because I've forgotten the details, but the spread between the best and the worst teams has gotten systematically smaller.

And so that effect is just that the very worst players on the worst teams now are much better than the worst players on the worst teams a hundred years ago, because they're all very well trained and they have good nutrition. It just reduces the disparity between the best and the worst.

Michael Garfield (36m 32s): This seems related to questions about the scale of competition in other areas. For example, the superstar effect with, like, YouTube, and how competing on a global stage has changed things. Say I look at avant-garde acoustic guitarists online: you see that over the last 15 years this global arena for competition has yanked everyone up to a certain level of performance, but also diminished the variety, the heterogeneity of players.

In some respects, they've sort of focused on trying to optimize for a particular audience. Doug Guilbeault, the 2019 CSSS alum who just gave a seminar on the convergence of novel categories: he was doing, like, a Rorschach thing, having people try to agree on what shape a particular image is. And as the society gets larger, people are more and more likely to converge on categories.

And he was comparing it to the way genetic drift can push a mutation that has no benefit into saturating an island population, but then it just gets lost in the noise on the mainland. There are all these ways in which it seems like your research is also pointing to how scale-induced competition drives convergence, and how, even if we're on average getting more and more competitive within a given domain, we may also be losing variety.

It makes me wonder whether, if there were more than just that one degree of optimization for winning major league baseball games, this evolutionary arms race would eventually undermine itself.

Sidney Redner (38m 24s): That's certainly an interesting point. And if I think about professional sports, let me compare the Canadian Football League, because I'm Canadian, with the NFL. Canadian football has a much lower level of competition, but it also has a much wider spread. The very best players in the CFL are good enough to be in the NFL; the worst players in the CFL have almost no business being on the field, and the games are wackier there. They're just wackier. It's the same with college football: the first week of the season you'll have, like, Texas playing some Podunk university, and the game is 72 to 3 or something like that.

So when there is this disparity in the competition, you have a wider spread of outcomes, and it's also perhaps more wacky and more entertaining. But as you get to the very top level of competition, everybody looks the same. They're all ferociously built, they're all monster athletes, and they all do the same thing. At some level it seems a little bit more dull.

Michael Garfield (39m 20s): So this is related to another piece. When you're talking about winning streaks and losing streaks, but outside the context of competitive sports per se, you have a piece, A Fresh Look at the Hot Hand Paradox, that looks at this purely within a series of coin flips, and you get some results that on a first pass would almost seem to contradict your findings in baseball.

Sidney Redner (39m 47s): That's probably a more idiosyncratic piece of work, because it's very specialized, and it has this weird feature called the hot hand paradox. If you flip a fair coin, getting either heads or tails, the question is, how long do you have to wait before you see a particular sequence, like three heads in a row versus head-tail-head? If the coin flipping is entirely fair and uncorrelated, then to get three heads in a row, well, each outcome has probability one-half.

And so you should say, well, it's one-half cubed, that's one-eighth, so in something like eight flips you should see three heads in a row. But it turns out that the sequences heads-heads-heads and heads-tails-heads have different waiting times before you actually see them. And it's kind of a peculiar phenomenon. It doesn't really refer to real sports or anything like that. It's just a quirk of the coin-flipping situation, and I'm not really prepared, extemporaneously, to give a really good intuitive answer for why that is. I mean, if you gave me 10 minutes, I'd work it out again, because it's one of those things that is so non-intuitive it seems unbelievable.
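For the curious, the waiting-time asymmetry Redner mentions is easy to exhibit by brute force. This editorial sketch (not code from the episode) estimates the mean number of fair-coin flips until a given pattern first appears; the exact answers, 14 flips for HHH versus 10 for HTH, follow from the patterns' different self-overlap structure.

```python
import random

def mean_wait(pattern, trials, rng):
    """Average number of fair-coin flips until `pattern` first appears."""
    total = 0
    for _ in range(trials):
        recent, flips = "", 0
        while recent != pattern:
            flip = "H" if rng.random() < 0.5 else "T"
            recent = (recent + flip)[-len(pattern):]  # sliding window
            flips += 1
        total += flips
    return total / trials

rng = random.Random(7)
hhh = mean_wait("HHH", 20_000, rng)   # exact mean is 14
hth = mean_wait("HTH", 20_000, rng)   # exact mean is 10
```

Both patterns still appear with probability one-eighth in any fixed three-flip window; it is only the first-passage times that differ.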

But then I work it out, and here it is, and here's why it is. But because I don't work in that area, it's not something I'm on top of right this second.

Michael Garfield (41m 3s): So you've brought up the hot hand elsewhere; you brought it up in your work on basketball scoring. You've got a piece on this that you wrote with Alan Gabel. Is there evidence of the hot hand as a valid phenomenon when you're actually looking at this sort of high-dimensional data?

Sidney Redner (41m 22s): Well, let me talk a little bit about basketball, and by the way, there's a second paper with Aaron Clauset and Marina Kogan. But the point here is that, just as we were saying before, as the level of competition increases, you lose any systematic advantage. So if you think about the NBA, every single player in the NBA is an amazing athlete; they do things that I can only dream about. And so the worst player on the worst team in the NBA is still a fantastic athlete, and the worst team can beat the best team on any given day of the year.

And so the argument that we came up with is that any systematic advantage has been arbitraged away, because everybody has the same training methods, they're all very well coached, whatever. And so scoring is really the outcome of a random walk. We wanted to test whether one could describe basketball scoring, event by event, as just repeated flippings of a coin. And the answer was yes: with a zero-parameter fit, we can model really beautifully and precisely the scoring data of NBA basketball games as just a simple one-dimensional random walk, repeated coin flips.
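A bare-bones caricature of that picture (an editorial sketch, not the calibrated model in the papers) treats each scoring event as a fair coin flip for one team or the other. The typical final margin then grows as the square root of the number of events, as for any unbiased random walk.

```python
import math
import random

def final_margin(events, rng):
    """Score difference after `events` scoring events, each a fair coin flip."""
    return sum(1 if rng.random() < 0.5 else -1 for _ in range(events))

rng = random.Random(11)
events, games = 100, 20_000
mean_abs = sum(abs(final_margin(events, rng)) for _ in range(games)) / games
expected = math.sqrt(2 * events / math.pi)   # random-walk prediction, about 8
```

The hypothetical figure of 100 scoring events per game is chosen only for illustration.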

Michael Garfield (42m 29s): Again, we're just taking hairpin turns through your work here, but I wanted to hairpin next into citation statistics from 110 years of Physical Review. To the extent that we can think about the intense competition in sports or in the global entertainment industry, there's also really intense competition, and situations of preferential attachment, in science and in the citation of scientific papers, in how a paper gets noticed. And you wrote this piece looking at the statistics of the history of physics publications and finding that some of the most important papers just lay dormant for years before they were recognized. So I'd love to know how you approached this particular study, and how you see it linking into some of the other topics and insights we've discussed in this episode so far.

Sidney Redner (43m 32s): It comes down to looking at the world through statistical eyes. So somewhere in the nineties, and before that, how did scientists get tenure, for example? You would have to go to the library, to this book called the Science Citation Index, and look up how many times your papers were cited. Then you would put it in your CV and say, give me tenure, because I've been cited, like, 1,000 times or something like that. And then somewhere in the 1990s the data started becoming available electronically, and that opened it up: anyone could play with the data.

And so somehow there was this list circulating around of well-known physicists and how many times they'd been cited, and immediately I thought, well, I should look at the distribution of the number of citations. And so I saw that it was heavy-tailed. There was another person working on the same dataset, and we had competing papers appearing at the same time. And then it turns out there was also data about the internet itself, essentially the spatial structure of the internet: what is the number of links between nodes on the internet?

And that also seemed to have a heavy tail. And so that's where Barabási got the idea of this preferential-attachment notion, and he is justifiably very famous for coming up with it. It turns out that that model was actually first solved by a guy named Herbert Simon in '55. But leaving that aside, the point is that scientists care about how often they're cited, and then it's very natural to try to systematize those studies and understand: what is the probability that a paper of mine is going to get cited? How many times am I cited?

Where do I fit in the grand scheme of all physicists, and why are some physicists well cited and others poorly cited? Lots of people have been looking at the systematics of this kind of data. But one thing that we did do, which I think was really a lot of fun and taught us a lot, was using the Google PageRank algorithm itself to classify papers. So, you know, everybody uses Google all the time, and the basic Google PageRank algorithm is as simple as can be: it's basically a random-walk algorithm. And the idea is that you have the worldwide web, a whole bunch of websites.

And a lot of the traffic on websites is robots, just robots going around, mapping out where they are. So in the basic Google PageRank algorithm, you send a whole bunch of robots out on the worldwide web, and they just go from one link to another. And every once in a while a robot says, I'm going to go randomly anywhere; that's called teleportation. And then you look at the distribution of where the robots are as a function of time. It turns out that popular websites, or highly cited papers, will have a lot of robots on them, and you just order the sites by the number of robots on them. That's basically what Google PageRank did in its early days.

So we applied that same algorithm to the citation graph. What we imagine now is that you have scientific papers, they cite other papers, and the links between papers are the citations. And we just put random walkers on this citation graph and let them run around to nearest neighbors. But every once in a while, we just say, teleport anywhere else on the graph. And the reason you do teleportation is that you could have some sinks in the graph where you get stuck forever because there's no way out: you have only in-links but no out-links. So teleportation is actually a necessary feature of making a sensible measurement.
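The robot picture translates directly into a few lines of power iteration. The sketch below is an editorial illustration on a made-up five-paper citation graph (the damping value 0.85 is the conventional choice, not necessarily what the study used): paper 3 is cited only once, but by the most important paper, so it outranks the more often cited paper 4.

```python
def pagerank(links, n, damping=0.85, iters=50):
    """Power-iteration PageRank on a directed graph.
    links: dict mapping node -> list of nodes it points to (cites).
    Teleportation (weight 1 - damping) keeps walkers out of sinks."""
    rank = [1.0 / n] * n
    for _ in range(iters):
        new = [(1.0 - damping) / n] * n
        for src in range(n):
            cited = links.get(src, [])
            if cited:
                share = damping * rank[src] / len(cited)
                for dst in cited:
                    new[dst] += share
            else:
                # dangling node (no out-links): spread its rank uniformly
                for dst in range(n):
                    new[dst] += damping * rank[src] / n
        rank = new
    return rank

# Hypothetical citation graph: everyone cites paper 0, and paper 0's
# lone citation goes to paper 3.
links = {0: [3], 1: [0, 4], 2: [0, 4], 4: [0]}
ranks = pagerank(links, 5)
```

This is the "important citer" effect Redner describes: a single citation from a heavily cited paper outweighs several citations from obscure ones.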

And so it turns out that by using this Google PageRank algorithm on the citation network, we could identify papers that had high PageRank but were not that highly cited. And the thing about PageRank is that, roughly speaking, if you're important and you cite me, that really helps me a lot, whereas if you're unimportant and you cite me, it doesn't mean anything, because you're just unimportant. So what PageRank does is look at the importance of who's citing you. If someone important cites me, that boosts my PageRank.

And so it turns out that we found papers that were not very highly cited, but the people who cited them were really famous. So obviously those were important papers, and we could identify really important papers by this kind of mechanism.

Michael Garfield (47m 30s): It seems to me, and maybe this is just because my head is broken in this way, that everything reminds me of Jennifer Dunne's work on food webs and feeding networks. And I wonder if this disparity between the citations and the PageRank is useful for predicting the way a given organism is going to move in a trophic network as that network is disrupted. For instance, if you look at, say, 120 million years ago, there's no evidence that the animal that eventually became Tyrannosaurus rex was ever going to take over that niche as a keystone predator.

And yet it did. And maybe the analogy would be something like: it was feeding on the right animals. You could be a generalist, say, and still have everything you're eating get wiped out in some extinction, but if you're eating the right thing, then you're left over afterwards. I mean, is this making any sense?

Sidney Redner (48m 30s): There's still a lot that can be gleaned, I think, from citation statistics, because there are fields that become hot, fields that become fallow, and fields that stay fallow for a long period of time and then become hot again. So there's a lot of this sort of burstiness in citation statistics that I think might have some analogy with evolution itself; you know, we think about punctuated evolution or something like that. So I think that's a really interesting thought that I don't think has been that well developed. People have been thinking along those lines about how different fields become hot, how different fields get used up in some sense, or how some fields fall out of favor and then some experimental discovery or some technology comes along that allows the field to rise again after a long period of time.

You mentioned the notion of sleeping beauties; that seems to underlie a lot of the sleeping-beauty phenomena that we see in scientific citations.

Michael Garfield (49m 22s): It seems like this kind of analysis would be useful in helping to understand when a particular market bubble isn't really just a bubble. There's something about the interoperability, the kind of thing you'd think about in terms of a citation network, that would help us differentiate between when something is just seasonally hot, in fashion, versus when it has real traction.

Sidney Redner (49m 45s): Well, that's a really interesting thought, and I don't know how to differentiate between fashionable bubbles and something that's really got traction. Maybe retrospectively you could see it; I don't know if you could do anything predictive.

Michael Garfield (49m 59s): So the last place I'd like to park this conversation, as it were, is a paper that you wrote with Paul Krapivsky on simple parking strategies. This was a lot of fun for me; I got to do an animation of this paper for the SFI YouTube channel. And to loop this back around to where you really shine as a researcher, in creating these concrete, real-world, relatable scenarios where you can apply the math and give people some insight into how to navigate their lives: take us out on parking.

Sidney Redner (50m 33s): If you asked me where this project got born, I honestly don't know. Paul Krapivsky is my closest collaborator; we've written 85 papers together, maybe 86, I forget. And we talk about anything and everything. Our conversations will go from science to politics to religion and back to science again. And I've always been thinking about crowding phenomena. And in fact, we wrote a graduate textbook on statistical physics where we talk about adsorption-desorption problems.

And in there is something that we can think of as a model for parking, where cars come and take up spots and then a car might leave. And the point is that if there are no well-defined parking spots, it could be that you have a space that could fit three cars, but two schmucks parked in a bad way so that you can only fit two cars in there. But when those two cars leave, three cars can go into that space. And so there's an irreversible increase in the density of parked cars. But that model is very complicated.

And so somehow we were just talking about the simplest thing we could possibly do. And that led us to this parking model of a one-dimensional parking lot, where you're trying to get to a store and you want to park close to the store, but if you go too close, then maybe there's no parking spot, and then you've got to loop all the way around to the back of the parking lot. And so that's how that model got itself born. And in fact, the model that in the end turned out to be the most interesting was something you did not mention, which is our second paper on the subject: what is the optimal parking strategy?

So imagine that you have a one-dimensional lot. Again, we physicists try to make the spherical-cow model of everything; we want to simplify things to the point that we can mathematize them. So we imagine there's a one-dimensional lot with a store at the end of it, and cars arrive at a certain rate. They leave at a much slower rate, because the point is that you might spend, say, an hour in a shopping mall, and that sort of defines the relative rate of arriving to leaving. And so then the question is, what should you do?

In a one-dimensional lot, you can see an open spot, but you can't see beyond parked cars to tell whether there are other open spots closer to the store. So what is the optimal thing to do? And it turns out that for this very simple toy model there's an optimal rule. But first you have to define what you mean by optimal: we defined it as parking in the very best available spot. And the way you maximize your chance of doing that is this: you enter the parking lot, which has a certain length L; you drive to L over 2, ignoring all open spots at distance L over 2 or farther from the store.

And from L over 2 onward, you start looking for spots and take the first one available. That turns out to maximize the probability that you actually park in the best available spot, the best one that's there at the time you're in the lot.
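A static caricature of the threshold idea can be simulated in a few lines. Note the caveats: this editorial sketch fills spots independently at random rather than simulating arriving and departing cars, so its best threshold depends on the vacancy probability and is not the dynamical model's universal L/2 rule. It does, however, show why an intermediate stopping point beats both extremes.

```python
import random

def success_prob(threshold, length, q, trials, rng):
    """Chance the threshold rule lands the best available spot.
    Spots 1..length; spot 1 is nearest the store; each is vacant with
    probability q. The driver scans from the entrance (spot `length`)
    toward the store and takes the first vacancy at position <= threshold."""
    wins = 0
    for _ in range(trials):
        vacant = [i for i in range(1, length + 1) if rng.random() < q]
        if not vacant:
            continue                      # lot full: counts as a loss
        best = min(vacant)                # vacancy closest to the store
        eligible = [i for i in vacant if i <= threshold]
        if eligible and max(eligible) == best:
            wins += 1                     # first eligible spot seen == best
    return wins / trials

rng = random.Random(3)
p_greedy = success_prob(1, 100, 0.1, 10_000, rng)    # hold out to the end
p_mid = success_prob(10, 100, 0.1, 10_000, rng)      # intermediate threshold
p_picky = success_prob(60, 100, 0.1, 10_000, rng)    # grab a spot early
```

Holding out too long usually finds nothing; grabbing too early usually overshoots the best spot; the intermediate threshold wins most often.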

Michael Garfield (53m 20s): Just to open this back up from toy-model spherical-cow-ness to real-world complexity: I think you and I both know that we got a lot of kickback when we shared this. Some people want to park their car in the shade. Some people actually want to optimize for walking time because they feel like they need more exercise. And I just thought it was so funny that people felt the need to tilt at this, given that I watched you ride your bike up that insane hill to work every day before the pandemic. It's like, you don't know who you're dealing with here.

I think it's worth mentioning, as a tie-in to the conversation I had with Cris Moore, how careful we have to be in the design of our algorithms to think about what real-world complexity we're actually leaving out. And maybe just inviting you to talk about that a little bit would be a great way to wrap this up.

Sidney Redner (54m 9s): There's a very famous saying, attributed to Einstein, though apparently it was not his, that a model should be as simple as possible, but no simpler. And I think the point is that I try to live in that space, where the model is as simple as possible but no simpler, so that we capture some of the collective behavior that one might see in a complex system. Where should you park your car? What is the length of winning streaks of a baseball team? And all these other things. Well, maybe some people like to park in the shade. Maybe some people want to get a bit more exercise.

These are complications that, in principle, I know the tools to deal with, but then you lose the simplicity and the beauty of a clean solution. Engineers have to worry about real-world problems: they have a real parking lot with a real traffic pattern in it, and then you have to worry about all of these microscopic details. But I think that with those microscopic details, you lose the forest for the trees. And so I'm trying to look at the forest, the big picture, rather than each individual tree.

Michael Garfield (55m 10s): So, looking at the forest, what's next for you? What have you not yet explored in your work that's maybe a burning question for you now, or an area that's calling to you in the time to come?

Sidney Redner (55m 24s): I am working on a project which is in its very formative stages right now. And it might be crap, it might lead to nothing, but I'm very enthused about it, because it seems to bring up a new element of first-passage processes. It's an attempt to understand the role of income redistribution of any kind on the overall wealth of a society. And so again, in the spirit of the spherical cow, we're just thinking about the following thing: you have two starving graduate students, and their wealth evolves by a random walk. Sometimes they gain a dollar.

Sometimes they lose a dollar. But when someone goes broke, there are two outcomes you could imagine. One is that one grad student goes broke and then the other graduate student goes broke; that might be the egoistic society. And then you can ask, how long does it take before both people are broke? What is the average wealth of this two-person population as a function of time? But now suppose that when one person goes broke, the other is an altruist. The person who's broke says, please lend me some money, and the other says, here, you can have half of my money, because, you know, I'm an altruist.

And then you continue until somebody goes broke again, and then you just share the money equally. Is altruism good, or is altruism bad? And so that's the basic question that we're trying to investigate right now. It turns out to be very simple to formulate, I've just formulated the model, and incredibly challenging to actually solve analytically. We have some partial answers, and maybe in six months we'll have more answers, but it seems like a really fun problem. And it raises some very fundamental issues in first-passage processes, fundamental technical issues and fundamental conceptual issues, that we're grappling with right now.
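The model as described can be set down in a few lines. This editorial sketch uses arbitrary parameters and deliberately reports only the simulated lifetimes without settling the altruism question, since that open analytic problem is exactly what Redner describes.

```python
import random

def steps_until_both_broke(w0, altruist, rng, cap=100_000):
    """Two gamblers start with w0 dollars each; each solvent gambler's
    wealth takes a +/-1 step per round. In the altruistic society, a
    broke gambler receives half the other's money. Returns the round at
    which both are broke, or `cap` if that never happens in time."""
    a = b = w0
    for t in range(1, cap + 1):
        if a > 0:
            a += rng.choice((-1, 1))
        if b > 0:
            b += rng.choice((-1, 1))
        if altruist:
            if a == 0 and b > 1:
                a, b = b // 2, b - b // 2
            elif b == 0 and a > 1:
                b, a = a // 2, a - a // 2
        if a == 0 and b == 0:
            return t
    return cap

rng = random.Random(5)
ego = [steps_until_both_broke(3, False, rng) for _ in range(200)]
alt = [steps_until_both_broke(3, True, rng) for _ in range(200)]
# Comparing the two lifetime distributions is the "is altruism good
# or bad?" question in miniature.
```

The step cap is needed because first-passage times of an unbiased random walk are heavy-tailed; occasional runs would otherwise last essentially forever.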

Michael Garfield: Well, I await your results with bated breath, or a stimulus check. Thank you so much, Sid, for taking the time to talk and for sharing your mind and your work with everybody today.

Sidney Redner: I'm glad to do it. Thank you.

Thank you for listening. COMPLEXITY is produced by the Santa Fe Institute, a nonprofit hub for complex-systems science located in the high desert of New Mexico. For more information, including transcripts, research links, and educational resources, or to support our science and communication efforts, visit santafe.edu/podcasts.