COMPLEXITY

Caleb Scharf on The Ascent of Information: Life in The Human Dataome

Episode Notes

Chances are you’re listening to this on an advanced computer that fits in your pocket, but is really just one tentacle tip of a giant, planet-spanning architecture for the gathering and processing of data. A common sentiment among the smartphone-enabled human population is that we not only don’t own our data, but our data owns us — or, at least, the pressure of responsibility to keep providing data to the Internet and its devices (and the wider project of human knowledge construction) implicates us in the evolution of a vast, mysterious, largely ineffable self-organizing system that has grabbed the reins of our economies and cultures. This is, in some sense, hardly new: since humankind first started writing down our memories to pass them down through time, we have participated in the “dataome” — a structure and a process that transcends, and transforms, our individuality. Fast-forward to the modern era, when the rapidly-evolving aggregation of all human knowledge tips the scales in favor of the dataome’s emergent agency and its demands on us…

Welcome to COMPLEXITY, the official podcast of the Santa Fe Institute. I’m your host, Michael Garfield, and every other week we’ll bring you with us for far-ranging conversations with our worldwide network of rigorous researchers developing new frameworks to explain the deepest mysteries of the universe.

This week on Complexity, we talk to Caleb Scharf, Director of Astrobiology at Columbia University, about his book, The Ascent of Information: Books, Bits, Genes, and Life’s Unending Algorithm. In this episode, we talk about the interplay of information, energy, and matter; the nature of the dataome and its relationship to humans and our artifacts; the past and future evolution of the biosphere and technosphere; the role of lies in the emergent informational metabolisms of the Internet; and what this psychoactive frame suggests about the search for hypothetical intelligences we may yet find in outer space.

Be sure to check out our extensive show notes with links to all our references at complexity.simplecast.com. Note that applications are now open for our Complexity Postdoctoral Fellowships! Tell a friend. And if you value our research and communication efforts, please subscribe, rate and review us at Apple Podcasts or Spotify, and consider making a donation — or finding other ways to engage with us — at santafe.edu/engage.

Thank you for listening!

Join our Facebook discussion group to meet like minds and talk about each episode.

Podcast theme music by Mitch Mignano.

Mentioned and related resources:

Caleb’s Personal Website, Research Publications, and Popular Writings

Caleb’s Twitter

We Are The Aliens
by Caleb Scharf at Scientific American

We Are Our Data, Our Data Are Us
by Caleb Scharf at The Los Angeles Times

Is Physical Law an Alien Intelligence?
by Caleb Scharf at Nautilus

Where Do Minds Belong?
by Caleb Scharf at Aeon

Autopoiesis (Wikipedia)

The physical limits of communication
by Michael Lachmann, M. E. J. Newman, Cristopher Moore

The Extended Phenotype
by Richard Dawkins

“Time Binding” (c/o Alfred Korzybski’s General Semantics) (Wikipedia)

The Singularity in Our Past Light-Cone
by Cosma Shalizi

Argument-making in the wild
SFI Seminar by Simon DeDeo

Coarse-graining as a downward causation mechanism
by Jessica Flack

If Modern Humans Are So Smart, Why Are Our Brains Shrinking?
by Kathleen McAuliffe at Discover Magazine

When and Why Did Human Brains Decrease in Size? A New Change-Point Analysis and Insights From Brain Evolution in Ants
by Jeremy DeSilva, James Traniello, Alexander Claxton, & Luke Fannin

Complexity 35 - Scaling Laws & Social Networks in The Time of COVID-19 with Geoffrey West (Part 1)

The Collapse of Networks
SFI Symposium Presentation by Raissa D'Souza

Jevons Paradox (Wikipedia)

What Technology Wants
by Kevin Kelly

The Glass Cage
by Nicholas Carr

The evolution of language
by Martin Nowak and David Krakauer

Complexity 70 - Lauren F. Klein on Data Feminism (Part 1)

Complexity 87 - Sara Walker on The Physics of Life and Planet-Scale Intelligence

Simulation hypothesis (Wikipedia)

Complexity 88 - Aviv Bergman on The Evolution of Robustness and Integrating The Disciplines

Building a dinosaur from a chicken
by Jack Horner at TED

Complexity 80 - Mingzhen Lu on The Evolution of Root Systems & Biogeochemical Cycling

Why Animals Lie: How Dishonesty and Belief Can Coexist in a Signaling System
by Jonathan T. Rowell, Stephen P. Ellner, & H. Kern Reeve

The evolution of lying in well-mixed populations
by Valerio Capraro, Matjaž Perc & Daniele Vilone

Complexity 42 - Carl Bergstrom & Jevin West on Calling Bullshit: The Art of Skepticism in a Data-Driven World

Episode Transcription

Caleb Scharf (0s): It's everything, right? It's this conversation being recorded in digital bits. It's the information that went to and from your phone when you picked it up in the morning. It's the video you made. It's all the financial transactions, it's all the scientific computation. And that, of course, all takes energy. It takes the construction of the technology in the first instance. Making silicon chips is an extraordinarily energy-intensive thing because you're making these exquisitely ordered structures out of a very disordered material. And so there too, we go back to thermodynamics, and you're fighting, in a sense, against entropy in a local fashion.

We're having to generate electricity to power our current informational world, that piece of the dataome. And the rather sobering thing is that already the amount of energy and resources that we're putting into this is about the same as the total metabolic utilization of around 700 million humans. And if you look at the trends in energy requirements for computation, for data storage and data transmission, the trends are all upwards. It's an exponential curve.

And they suggest that perhaps, even if we have some improvements in efficiency, unless those improvements are extraordinary, then in a few decades' time we may be at a point where the amount of energy, just electrical energy, required to run our digital dataome is roughly the same as the total energy use of our global civilization at this time.
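Editor's note: to give a feel for the scale of the "700 million humans" comparison, here is a rough back-of-envelope sketch. It is not a calculation from the book or the conversation. The 2,000 kcal per day figure is an assumed typical dietary intake, and the variable names are illustrative only.

```python
# Back-of-envelope conversion of "metabolic utilization of ~700 million humans"
# into electrical-style units. All inputs except PEOPLE are assumptions.

KCAL_PER_PERSON_PER_DAY = 2000   # assumed typical dietary energy intake
JOULES_PER_KCAL = 4184           # definition of the (thermochemical) kilocalorie
SECONDS_PER_DAY = 24 * 60 * 60
HOURS_PER_YEAR = 8760
PEOPLE = 700_000_000             # figure quoted in the conversation

# Average metabolic power of one person, in watts (joules per second).
watts_per_person = KCAL_PER_PERSON_PER_DAY * JOULES_PER_KCAL / SECONDS_PER_DAY

# Combined continuous power, in gigawatts.
total_gigawatts = watts_per_person * PEOPLE / 1e9

# The same quantity expressed as energy per year, in terawatt-hours.
terawatt_hours_per_year = total_gigawatts * HOURS_PER_YEAR / 1000

print(f"{watts_per_person:.1f} W per person")
print(f"{total_gigawatts:.1f} GW in total")
print(f"{terawatt_hours_per_year:.0f} TWh per year")
```

Under these assumptions each person runs at a little under 100 watts, so the comparison corresponds to a continuous draw of some tens of gigawatts.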

Michael Garfield (1m 54s): Chances are you're listening to this on an advanced computer that fits in your pocket, but is really just one tentacle tip of a giant, planet-spanning architecture for the gathering and processing of data. A common sentiment among the smartphone-enabled human population is that we not only don't own our data, but our data owns us, or at least the pressure of responsibility to keep providing data to the internet and its devices and the general project of human knowledge construction implicates us in the evolution of a vast, mysterious, largely ineffable self-organizing system that has grabbed the reins of our economies and cultures.

This is, in some sense, hardly new: since humankind first started writing down our memories to pass them down through time, we have participated in the dataome, a structure and a process that transcends and transforms our individuality. Fast-forward to the modern era, when the rapidly evolving aggregation of all human knowledge tips the scales in favor of the dataome's emergent agency and its demands on us.

Welcome to Complexity, the official podcast of the Santa Fe Institute. I'm your host, Michael Garfield, and every other week we'll bring you with us for far-ranging conversations with our worldwide network of rigorous researchers developing new frameworks to explain the deepest mysteries of the universe. This week on Complexity, we talk to Caleb Scharf, Director of Astrobiology at Columbia University, about his book, The Ascent of Information: Books, Bits, Genes, and Life's Unending Algorithm.

In this episode, we talk about the interplay between information, energy, and matter; the nature of the dataome and its relationship to humans and our artifacts; the past and future evolution of the biosphere and technosphere; the role of lies in the emergent informational metabolisms of the internet; and what this psychoactive frame suggests about the search for hypothetical intelligences we may yet find in outer space. Be sure to check out our extensive show notes with links to all our references at complexity.simplecast.com, and note that applications are now open for the Santa Fe Institute Complexity Postdoctoral Fellowships. Tell a friend, and if you value our research and communication efforts, please subscribe, rate, and review us at Apple Podcasts or Spotify, and consider making a donation or finding other ways to engage with us at santafe.edu/engage. Thank you for listening.

Caleb Scharf, it is a pleasure to have you on Complexity Podcast.

Caleb Scharf (4m 48s): Well, thank you so much for having me.

Michael Garfield (4m 51s): Before we dive into your amazing book here, I would love for you to ground our conversation in a little bit of autobiographical background, tell people the story of your passion of the mind and how you became a scientist and what led you into the kind of questions you're pursuing in your career.

Caleb Scharf (5m 9s): Wow, how long have you got?

Michael Garfield (5m 12s): All day.

Caleb Scharf (5m 13s): So, you know, if I look back, I had a somewhat unusual upbringing inasmuch as both my parents were academics, which is not unusual, but they are academics very much in the humanities. Both my parents were art historians of some repute, and I grew up in a little village in England. And so to come out of that as a scientist, I think, left me with a slightly different mindset than many other people that I know who've become scientists. And the way I look at the world, I see a lot of color in the world, and that sort of informed my scientific career.

So I trained originally in physics and astrophysics. I have a PhD in cosmology, and over time got more and more interested in more complicated things. Cosmology may sound impressive, but it's actually not so very complex. The concepts of the way the universe works, its overall evolution and history, are actually kind of simple in some ways. And so over time, I've got more and more interested in all the messy stuff that happens inside, all the messy stuff that's in front of us. So I evolved my research, my scientific research, towards astrobiology, just the study of the nature of life in the universe, both the quest to find other examples of life, but also to understand the nature of life, and over time that's evolved.

And I think the book, The Ascent of Information, is one example of my personal evolution in thinking about the nature of the world, the complex world, where I've begun to look beyond just parochial biology, to see that some of the same patterns emerge out of the interaction of molecular structures on Earth that gives rise to life. I feel that we see those patterns elsewhere. That's not a particularly original thing to say, I guess. For me, though, it's been a bit of a revelation to dig into all of that.

So that's kind of where I come from. I have a slightly colorful intellectual background as a child, and then I've sort of worked my way through some quite conventional fields of research, but I found myself drawn to questions of the more complicated and colorful things that exist in the world.

Michael Garfield (7m 22s): So this episode's a little different for us in that, rather than focusing on your original theoretical research, what we're looking at here is a rather extensive review of research and thinking and synthesis on this idea of information, and particularly the idea of the dataome. I love this term you've coined here.

Caleb Scharf (7m 47s): I'm glad you like the word. A lot of people don't seem to like the word.

Michael Garfield (7m 51s): Well, they're not on this call with us. So if you're listening, give it another chance. Before we really get into this, I would love for you to just outline, cuz you spend a considerable amount of the first act of this book getting into what you really mean by information and its relationship to matter and energy as is understood in complex systems science today, and then what you mean by the dataome, which is this sort of, I almost think of it as like a kind of Lovecraftian or eldritch thing that we're all embedded in, that's acting on us, that's governing our behavior.

And that's where I would like to start this. And I'll give you the opportunity, I think as you do in this book, cuz you start the book with the lovely anecdote about visiting Stratford-upon-Avon and the legacy of William Shakespeare and its impact on human civilization history. And so yeah, why don't we lay out some fundamentals before we dive in?

Caleb Scharf (8m 50s): Yeah, well, so perhaps before we get to the dataome and that anecdotal introduction that I give in the book: information as a thing is a comparatively new concept. At least it's a concept that really came into focus in the 20th century, that we can talk about information almost as if it were a substance in the world. I think beforehand, before the 20th century, we had the word "information" in English and we would apply it to all sorts of things. But the notion that you could begin to quantify information, the idea that certain passages of text might contain more information than others, whatever that really means, is kind of a startling thing.

And I think that, you know, that startling nature didn't really emerge until the 20th century. And it came from, I feel, a combination of very much technological advancements: the interest in communicating using telegraphs and radio and all of these things, and the beginning emergence of digital computation. And so the idea that we had to begin to quantify: I'm gonna send you something, well, how long is it gonna take, and what kind of system do I need to send that?

How do you begin to evaluate that? That sort of was part of the seed of this idea that information can be quantified in quite austere fashion. This came to a peak really, or was really sort of capitalized on the most, by the work of Claude Shannon, whom many people will have heard of, who in the late forties put together his thoughts on this in some quite famous publications that talked about the quantification of information. Now, that was tightly related to the idea of representation of information.

So you want to quantify information, you have to represent it somehow. I think we're all so familiar with language and written symbols on pieces of paper, no matter what your background is, your spoken language or written language, that we sort of forget that that is a system. It has its own peculiarities, but it has an underlying logical basis. And so Shannon kind of dove into this and talked about the symbolic representation of information, data that contains information, and talked about the most fundamental form of that, or the most simple, basic form of that, which not unsurprisingly is to do with ones and zeros, you know, digits. The idea that you can represent stuff in ones and zeros is relevant to communication, it's relevant to what were then emerging ideas in computation and so on, and mathematically it offers sort of the neatest, simplest entry point to all of this.

You turn everything into ones and zeros, but also turn things into probabilities. So this is the other piece of it. This is where it sort of connects to even deeper ideas in physics: that, you know, representations of information in ones and zeros, well, ones and zeros are built out of something in the world. And they also resemble things that we talk about in the abstract in physics to do with states of matter. Things can be in certain states. So an atom can be in a certain energy state: maybe it's in its so-called ground state, where it's sort of in the lowest energy state possible, or it's in an excited state.

These are very specific things that can be differentiated from each other, just like a one and zero can be differentiated from one another. And so I think Shannon began to see there's a deep connection between ways that we describe the physical world in terms of how matter and radiation occupy certain states. And this is something that comes from so-called statistical mechanics, which is all about thermodynamics as well. It's a sort of more modern version of thermodynamics, which helps us describe the behavior of stuff in the world: how it responds to energy, what it does in response to energy, how it moves, how it expands if it's a gas, how it contracts. And statistical mechanics takes that a bit further by explicitly including the fact that matter and electromagnetic radiation can be divided up into little pieces, its atoms and molecules and photons. And it's the statistics of the behavior of those little pieces, each of which can occupy certain states, whatever those are. It might be a state of energy, it might be a position, it might be a velocity in space. Each of those can be talked about as a state. Statistical mechanics, the statistics of those states, actually allows you to build thermodynamics, the rules of matter on larger scales, sort of macroscopic rules of matter.

So this is a very long-winded way of saying that part of what Shannon was doing, and it became very apparent, was in a sense digging into that same toolkit for describing the world, but recognizing that symbols could be an atom in a number of different states. You can have a one and a zero as an atom in two different energy states, and so on and so on. So it's all part of the same thing, which is really a kind of shocking realization. And I dunno for sure if this is how Shannon really saw it. He was quite fixated on solving problems in communication, of data exchange, of data quantification, but it's there.

And, you know, there's all sorts of interesting anecdotes about some of the naming conventions that he invented or sort of assigned to certain things that he derived mathematically. There's a thing called Shannon entropy, which is really a measure, in a sort of probabilistic sense, of how much information there may be in something. It's a very austere measurement. It's something that in effect might tell you how compressible a dataset is. So your JPEG image on your camera has been compressed using a variety of different algorithms, and things like Shannon entropy give an evaluation of how far you can go in that compression before you start to lose information.
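Editor's note: the link between Shannon entropy and compressibility described above can be sketched in a few lines. This is an illustrative example, not something from the book. It estimates entropy from single-byte frequencies only (so it ignores any structure in the ordering of symbols), uses the lossless `zlib` compressor from the Python standard library rather than JPEG, and the function and variable names are hypothetical.

```python
import math
import random
import zlib
from collections import Counter


def shannon_entropy_bits_per_byte(data: bytes) -> float:
    """Empirical Shannon entropy H = -sum(p * log2(p)) over byte frequencies."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Highly ordered data: only two symbols, each used half the time.
repetitive = b"ab" * 512

# Disordered data: 1024 pseudo-random bytes from a fixed seed (reproducible).
rng = random.Random(0)
noisy = bytes(rng.getrandbits(8) for _ in range(1024))

h_repetitive = shannon_entropy_bits_per_byte(repetitive)  # exactly 1.0 bit per byte
h_noisy = shannon_entropy_bits_per_byte(noisy)            # close to the 8-bit maximum

print(f"repetitive: {h_repetitive:.2f} bits/byte, "
      f"compresses {len(repetitive)} -> {len(zlib.compress(repetitive))} bytes")
print(f"noisy:      {h_noisy:.2f} bits/byte, "
      f"compresses {len(noisy)} -> {len(zlib.compress(noisy))} bytes")
```

The low-entropy input shrinks to a small fraction of its size, while the high-entropy input barely compresses at all, which is the sense in which entropy bounds how far lossless compression can go.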

So it's not telling us about the meaning of information in the way that we parochially use that term. But the fact that that term, entropy, was used: the anecdote is that this was suggested to him because nobody really understood what entropy was or is, and so he'd always win the argument when describing what Shannon entropy was. The truth is there is a deep connection between Shannon entropy and the sort of entropy we talk about all the time in physics. In many respects, it's one and the same thing.

Entropy in physics is a quantification and measurement of the distribution of things: the number of possible configurations of things, the number of states that stuff can fall into. And that is very much akin to this informational measurement of just how much information there is in a certain symbolic representation. So that's a very long-winded introduction to this before we get to talk about the dataome.
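Editor's note: the "one and the same thing" remark can be made concrete with the standard textbook definitions. The notation below is not from the conversation: p_i is the probability of state (or symbol) i, k_B is Boltzmann's constant, and W is the number of equally likely states.

```latex
\begin{align}
H &= -\sum_i p_i \log_2 p_i && \text{(Shannon entropy, in bits)} \\
S &= -k_B \sum_i p_i \ln p_i && \text{(Gibbs entropy, in J/K)} \\
S &= (k_B \ln 2)\, H && \text{(same distribution } p_i \text{ in both)} \\
S &= k_B \ln W && \text{(special case } p_i = 1/W \text{)}
\end{align}
```

The two measures differ only by a constant factor and a choice of logarithm base, and in the equal-probability case the physical entropy reduces to counting the number of configurations, which matches the description of entropy as "the number of states that stuff can fall into."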

Michael Garfield (15m 44s): Okay. From there, I do wanna make sure that we return to the Shakespeare piece here, but before we do that, you know, there's a theme in this piece that feels worth stressing. You say at different points in here that in a way decisions create phenomena and then elsewhere, that information turns out to be an epistemic property of data. You know, I'm thinking about, you know, that in the deep history of complexity science, you have this idea of the cybernetic helmsman, you know, of someone that is making a choice in the navigation of information.

And then you've got people like Francisco Varela and Humberto Maturana developing their notion of autopoiesis in the way that there's, if you will, like a third-person perspective in science on the first-person experience of things as simple as an individual cell. So a lot of what I see in the literature on this stuff has to do with the way that data is represented informationally, and to whom. And it's about this idea of pattern recognition, that information is not purely or simply an objective property of matter and energy in its configurations. I'm thinking about, and this comes up on the show a lot, the piece that Michael Lachmann and Cris Moore and Mark Newman wrote in 1999 about the physical limits of communication, and how they demonstrated mathematically that optimally encoded electromagnetic communication between alien civilizations might be indistinguishable from blackbody radiation.

And so we could be, for all we know, swimming in a sea of encrypted alien communications. So I'd love to hear you expound on that a little bit, cause you brought up meaning, and how there's some nuance about what we're talking about when we actually mean meaning here, and what that has to do with compressibility. Because, you know, the point I wanna drive this into is when we do get back to Shakespeare and these bigger questions that you ask in the book about the thermodynamic and material burdens of our information processing and the limits of the efficiency that we can find in that processing. You know, that's the juggernaut I want you to chew on for us.

Caleb Scharf (18m 5s): Yeah. Gosh, where to begin. I mean, I think one thing that occurred to me as you were talking, and it's something I mention briefly in the book, and people may have heard this: the physicist John Wheeler, at some point in his life, began to talk about this idea of "it from bit," that if you dig enough into physical reality, in the end perhaps it comes down to really the influence of information on the world.

So bit produces it. And I think what he was getting at is the idea that, suppose the universe was just filled with individual objects that were not having anything to do with each other, just sitting there scattered through space and time. Obviously that's not a very interesting universe, but it's also a universe where it's gonna be quite hard to even say what the physical rules of that universe are, because nothing is happening. So what do we mean when we say something is happening? Well, what we really mean is there are interactions. An atom bumps into an atom.

A photon gets captured by an atom, or there's a spontaneous production of a... things go on, interactions happen. And that's an interesting point, because if that's where physical laws come from, it's from interaction, and interaction is a form of information exchange, or information sort of manifesting itself in some way. I would argue, how do you know that that bit is a one or a zero?

Well, you have to interrogate it. And you have to interrogate it with something else. And in making that interrogation that will change the configuration of the interacting things. And I think that's part of what Wheeler was trying to get at. And I guess I bring that up because it is part of this question of meaning versus not meaning. Actually in my book, I completely avoid using the term semantic information, which is what everyone else uses. And yeah, that was to some extent, a conscious decision just to not introduce another piece of jargon in a book that's full of all sorts of complicated concepts.

But I also think that by naming something semantic information, suddenly you've created this distinction between, oh, there's this type of information, this type of information, and so on. And I actually personally struggle with seeing how to make that distinction. I think in everyday life, it's not so difficult to say what is semantic information: "the bus is coming at five two, you need to be there to catch it," as opposed to, here are some random numbers. One you might say is semantic in nature and the other less so.

The question of meaning in information, therefore, I think is very much context dependent. And that also relates to what you were talking about in terms of the subjectivity of information. And this is something one could explore for a long time. Part of what I try to get to, or have tried to get to, in my thinking about it is: what's the most interesting piece of information that you can have as a living thing? And that may correlate with the most meaningful information.

And again, many other people have been there before and have thought more extensively about it. But part of that seems to be the notion of information that enables a living system to enhance its chances of existing in the future. So it could be information that is directly to do with survival, it's information that tells you where the next food is going to be, or whether you should turn left or right at the end of the street or now is time to modify your chemistry because winter is coming or something like that.

And I think that's an interesting and sort of decluttering way of thinking about meaning in information. It's a very practical approach, if you will, sort of dodging some of the more interesting philosophical aspects of the meaning of information. Let's just cut through all of that. Let's cut the crap and talk about the information that is most relevant to living systems. And that doesn't mean it's always easy to identify exactly what that is, but I think that's part of it. And that actually relates to where I think we're heading with some of this, which is the notion of the expenditure of energy towards information.

If I may circle round to Shakespeare at this point: I start the book, as you mentioned, using this, and it's based on real experience. It actually happened. It wasn't until later on, in retrospect, that I realized the relevance of this tale, but it was a day that I took my family. We were in England, we went to Stratford-upon-Avon, and we poked around for the day in Stratford-upon-Avon seeing all of the Shakespearean stuff that is there.

And it's a bit of a zoo. It's like a Shakespeare amusement park. The town is lovely, but it has fully given itself over to the commercialization, without exception, of Shakespeare. And by the end of the day, we end up actually visiting the church where Shakespeare was entombed. And there's this famous epitaph, which I can't remember well enough to repeat, but many people will know it. It's about poking your nose at people, you know, not to disturb these bones or else bad things will happen.

Then it's lovely, and it's witty, and it's clever. But after this very exhausting day immersed in everything Shakespeare, I take a step back and realize, well, this is really strange, because Shakespeare didn't do this to me. He didn't sort of emerge as a ghost and force me to do all of this, to drag my family around, to go and see all of these places, to spend money on knick-knacks and so on. It's an echo of the information that he implanted in the world.

It's an echo of his plays, his sonnets, his writings, his ideas, that has propagated through history, and has not merely propagated through history: it's grown. It's like a snowball, it's gathered more and more around it. And this is really the beginnings of my thoughts on this thing that I've ended up calling the dataome. Shakespeare is just a really nice example of one little facet of it. And so the dataome is the word I assign to all of our externalized information that is instantiated in the world.

It's in ink on pages, it's in digital bits, it's in the structures that we build to support all of that. Arguably it's in everything around you. It's in the clothes that you wear, which were designed at some point, which took information to create: in the loom that made the fabric, in the patterning on your clothing, in the transportation of the materials, in the planning of the cotton crops that had to be grown to make it, and so on and so on. So the dataome is kind of a catch-all for the totality of the information that we carry through time with us, for the most part, but that is not encoded in our DNA, at least not very explicitly, if at all. Yet here it is, going through time with us, and Shakespeare's works are a great example of that.

And you look at this, and the first thing that came to my mind, perhaps I'm overly cynical, was, you know, why the hell are we doing this? What is going on? Because if you go to Stratford-upon-Avon, it's very clear that an enormous number of resources are being given over to this, by the town itself, by everyone who visits, the millions of people who trek through that place. I mean, everything they're doing, they're expending energy. They're spending time. They're expending their own electrochemical energy in their brains to sort of interact with these informational remains.

It's a little strange when you see the world that way, but it raises all sorts of questions about why do we do this? And part of the answer relates to things of survival. The flip side to this is it's very easy to see how having that informational support system, that dataome travel through time with us as a species has enabled us to do all sorts of things we couldn't have otherwise. And arguably played a role in our, you know, these days it's hard to call it a success, but certainly in the sort of austere evolutionary terms, you know, right now we have been a success.

We have propagated ourselves very, very well across this planet and we've modified it to our needs and so on. And clearly the dataome helps with that, in all sorts of ways that we can get into if you want, but it's not without cost. And that's kind of the next piece of this, which is, is it entirely within our control? And I would actually argue that maybe it isn't, that this has in a very real sense a life of its own. We're an integral part of it, an essential part of it. But the dataome, if you look at it through the right lens, seems to have a life of its own.

And it's a very expensive life. It uses a lot of resources, a lot of energy, a lot of human attention and so on.

Michael Garfield (27m 17s): So yeah, there's a lot there and I'm trying to make sure that we structure this in a way that, yeah, I'm

Caleb Scharf (27m 21s): Sorry. I can talk in shorter bursts.

Michael Garfield (27m 23s): No, no, no, no, no. I think people who listen to this show actually appreciate the deep forest ramblings that we get on here. If you don't, I'm sorry. That's how it is, but there's one point that I wanna make sure that we touch on here because this connects to so many other things about the kind of thinking at play and complex systems research. And one is that what you've just described here can be thought of and I think you do very deliberately make this point in the book in terms of what Richard Dawkins called the extended phenotype, that this is something, and this is again a very common sentiment these days, but that this is something like the way that a beaver dam or spider web operates as, you know, a part of the organism that's not necessarily visible when you try to extract the organism from the processes, by which it is relating to the world around it, the way that it constructs its niche, the way that it uses ostensibly non-living matter to form cognitive prosthetics and so on. And that this is all going on. As you describe, you know, even before the Shakespeare anecdote in this book, as part of this larger effort to bind and encode stable features of the environment and adaptive process with intergenerational cross checking and error correction, the same way that you see things like RNA acting on and repairing broken parts of the DNA sequence.

And so this idea of time binding is really important in the evolutionary origin and justification for the dataome why it exists in the first place. But it's also part of, as you just alluded to a moment ago about the tension between the agency and sort of metabolism and demands of the dataome and the agency metabolism and demands and personal interests of the beings that constitute it.

So that's something I'd love to hear you explore. And also, again, just to like double down on this point, to make the case in this book, as you do, that this is not something that's uniquely human, that this is something that you talk about, the dataome of Neanderthals and of other organisms, and how this way of outboarding these processes is something that we see, much like the eye or multicellularity, emerging again and again in the history of the biosphere. There's a strongly convergent set of pressures that is creating dataomes, plural.

Well, let's just start there and then I've got a place to follow through from there.

Caleb Scharf (30m 5s): Yeah. I mean, it's very interesting. You bring up the idea of other ultimate dataome or dataomes that sort of belong to other species. And I think part of what I've come to feel is that that is true, but in a more limited sets there, and again, this is part of the thesis in the book, and it's always a little bit, it's funny, you know, this is where my cosmological training kicks in, the Copernican principle. We're always so low to say there's anything special about ourselves.

But I think in this case, there is plenty of evidence that there are certainly extended phenotypes as Richard Dawkins originally proposed and codified. And we have those as well. I think the dataome, I would argue and do argue, is more than just an extended phenotype because of its dynamism. But yeah, so we see things with other species that absolutely parallel this. I would argue that those are all much more limited though, really, and truly, and that there is something special about the human dataome.

And, you know, in fact, I know these days we will get, well, maybe not so much at the moment, but for a while, it was trendy to talk about human transcendence, singularity. You know, we're all about to either be overwhelmed by artificial intelligences or we're going to merge with our technology in some new, extraordinary way. I think we've sort of got the wrong end of the stick there. That transcendence for us happened already a few hundred thousand years ago when our particular branch of the hominin tree kind of locked on to the dataome and the dataome locked on to us as this innovation beyond what had happened before.

And part of that is clearly related to our mental capacity for abstraction, for language, creating symbolic representations. Now that also probably co-evolved with the proto-dataome that we had, or paleo-dataome, but something took place that perhaps had not taken place for any other species or groups of species on the planet before. And so I sort of think human transcendence has already happened. You know, it's old news, right? We're already something quite distinct from what was here before.

And that is sort of extended phenotype on steroids to the power of 10, which is the dataome.

Michael Garfield (32m 38s): So just as a point for people to link out to additional resources that we'll share in the show notes, there's a blog entry actually from SFI researcher, Cosma Shalizi about this back from 2010 where he makes the case, he doesn't put it quite as far back as you do, but he does say the singularity has already happened. And it was over by the close of 1918. It was, you know, the industrial revolution that we look at, things like corporations, and we see how these things function in what Simon DeDeo in the seminar he gave it SFI last week would call borrowing from Western hermetic traditions, the egregore, which are these bodies that we participate in the same way that Lynn Margulis argued bacteria came together, endosymbiotically to form complex cells.

And then you've got, you know, just to give people another point of association for this, you've got Jessica Flack’s work in particular, you know, her paper on coarse-graining as a downward causation mechanism, arguing that even in less sophisticated, if you will, organisms like macaques, that their efforts to model and understand one another in society end up leading to these collective computations that then shape behavior. So again, back to this kind of Marshall McLuhan thing about how, you know, we shape our tools and thereafter our tools shape us.

And this is where I'd like to dig in a little bit more on what you've said about the burden of these ideas and about the tension between our own intelligence and, you know, the ability to actually track and participate in the ratcheting complexity of the dataome and the way that it leads to you can fact check me on this, but I've read that the brain case of human beings 50,000 years ago was greater than the brain case of humans now that we've actually lost brain volume in the same way that our jaw started to shrink after we started using forks. And so that's something I'd love to hear you refine.

Caleb Scharf (34m 46s): I didn't know about the brain case observation, which is very interesting. I mean, you know, brain size is a peculiar measurement of things. I mean, for a long time, people assumed brain size correlated with how smart you could be or how sophisticated you could be, but it's not so clear that it's that simple. And so even something like a larger brain for our ancestors 50,000 years ago, well, you know, why did they need a larger brain volume?

It could have been a physiological thing in response to climate conditions. It could have been something to do with how they had to operate to get food. They may have been much more physical than anyone on the planet today. I dunno, I'm just speculating, you know, it's interesting. And you look at brains of elephants, proportioned regions of their brains. And some of that is undoubtedly because they have a large body and they need neurons to deal with that.

And the act of movement has to engage perhaps a lot more computation than the act of movement even for something like us, although we're pretty complex. So yeah, so it's very, very interesting. And I think, yeah, it does connect through to, as you put it so nicely, this tension between sort of our success in the world as a species, or just the probability of us continuing to propagate, both as individuals and our particular gene lineages and our species gene lineage and so on, and everything around us, and how the dataome helps with that, or seems to help with that in so many ways, yet does present this extraordinary burden.

And I think that burden has become much more evident, but it's always been there to some extent. I mean, you mentioned people referring to how singularity happened. You know, singularity happened in the early 1900s, which I think is lovely. And I think that may well be a better sort of point of reference, but, you know, there are other interesting things that were going on in the early 1900s to do with information, as we think about it today, almost digital information. So punched cards, something I talk about in the book, and punch cards were for many decades the primary way of storing information for industry, for finance, all those things. We got punch card machines, punch card readers, the first digital computers, the first sort of commercial digital computers utilized punch cards for programming and for data output and storage and so on.

And what's so interesting is those were tangible, physical things. They weren't invisible pieces of doped silicon that none of us ever get to look at, unless you scrape away at your chip. They were very tangible in the world and they represented a very significant burden on our resources. And I think people have forgotten that. But if you dig into the history of this, it's really fascinating. You know, just the production of punch cards at the peak, just in the U.S. in, I think, the mid 1960s, there were at least something like 200 billion punch cards being manufactured every year and, you know, each a sizeable piece of card or paper.

And then you have the physical act of punching them, which takes energy. You have to cut those things around. I dunno what tonnage that amounted to, but I'm sure it was significant. And people were just producing more and more of these things. And what's so interesting about punch cards is they make it very easy to see the burden on humans. So there was a burden of making all that paper, producing all these things, printing them, punching them and so on, but then humans had to carry the things around. If you were a scientist and you wanted to run a piece of code on a computer back in the 50s, 60s, even into the 70s, very often you would have to put your program on to punch cards and then carry it physically and stand there and feed it into the machine and then retrieve it and carry it physically and put it somewhere safe in your filing cabinet and so on.

You were expending your energy, you know, the hamburger you had eaten was fueling your act of information processing later on. And of course punch cards went onto the side ditches of the roads of technology because they weren't terribly flexible and they weren't as efficient as purely electrical, digital information storage and retrieval and utilization. But today we have this ridiculous growth in the amount of data that we produce.

It's something like 2.5 quintillion bits of new data are generated by our species every single day, every 24 hours. And that's something like a trillion times all of Shakespeare’s products, every 24 hours. And most of that, or a lot of it, is finding itself somewhat permanently stored, and it's everything. It's this conversation being recorded. It's the information that went to and from your phone when you picked it up in the morning, when you really shouldn't have done that and thought about other things. It's the video you made, it's the photo you took on your phone, it's all the financial transactions, it's all the scientific computation, it's everything in supporting the internet and so on.

And that of course all takes energy. It takes the construction of the technology in the first instance, which is very energy intensive. Making silicon chips is an extraordinarily energy intensive thing because you're making these exquisitely ordered structures out of a very disordered material. And so there too, we go back to thermodynamics, and you're fighting in a sense against entropy in a local fashion. That takes a lot of energy. We're having to generate electricity to power our current digital informational world, that piece of the dataome.

And the rather sobering thing is that already the amount of energy and resources that we're putting into this is about the same as the total metabolic output or utilization of around 700 million humans. And if you look at the trends in energy requirements for computation, for data storage and data transmission, the trends are all upwards, it's an exponential curve.

And they suggest that perhaps, even if we have some improvements in efficiency, unless those improvements are extraordinary, then in a few decades' time we may be at a point where the amount of energy, just electrical energy, required to run our digital dataome is roughly the same as the total amount of electrical energy we utilize as a global civilization at this time. That's for everything: putting on your lights, running the pumps in your water plants, charging your electric vehicle these days. That will be matched by just our informational world. So you look at that and you think this might be a problem.

And that's another example of this tension. And I think when I started to look at this, I looked at these numbers and it feels rather disturbing, especially because we're so aware, acutely aware of the changes taking place to the planet's climate, the ways in which we generate energy or use energy. This is not necessarily good for us, but at the same time, I think we've all had that feeling, at just a very personal level, of being almost governed by electronic devices, our smartphones.

And of course, it's the algorithms that the companies produce in the first place to encourage us to engage with content, to click on things, to scroll through things, to perhaps look at an advertisement and so on, but you know, those algorithms are there because other people are trying to make a living. They're trying to make money, trying to survive. So it's this sort of closed loop, but it all results in just an ever increasing burden, physical burden on our species. And that is part of what got me into thinking about whether the dataome is much more than just an extended phenotype.

I mean, yes, there's a burden to extended phenotypes. Termites build a nest and so on and so on and so on, but it doesn't seem to have this same sort of exponential growth that we see for the human dataome. And that kinda leads one down all sorts of different paths. One path is to ask whether actually we're no longer really in control of this, and the more realistic way to examine something like the dataome, and this is kinda a crazy sounding statement, is that it is already an alternate or type of alternate living system here on the planet that coexists with us. We're in a symbiotic, perhaps even endosymbiotic relationship with it.

And that is why we're seeing our needs somewhat pulled on by the needs of this other thing. So in symbiotic relationships, there's this interesting game that's going on all the time between the needs of one versus the needs of the other and the benefit of nonetheless working together, whether that outweighs disadvantages to one or disadvantages to the other. And I think one can argue that we're seeing that same dynamic between us and our dataome in many respects, which is quite disturbing, but it's also intellectually very interesting because it ties back around to all of our ideas about the nature of information, nature of biological information.

You know, Richard Dawkins again has called sort of the emergence of information carrying structures in genes and DNA and RNA an information bomb that exploded here on earth a few billion years ago, and has been continuing that explosion ever since. There's something that feels almost unstoppable about the propagation of information, and the dataome as it is today feels like it may be a continuing part of that information bomb, that sort of rolling explosion, but a particularly energetic part. It's been catalyzed by a set of circumstances where this species emerged that began to generate more and more of its externalized information.

And it benefited from that. And so you have this rolling snowball effect.

Michael Garfield (44m 59s): So something that just to deepen this a little bit, before we move on to another topic here, something that comes up a lot in the show, we talked about it with Geoffrey West back in episodes, 35 and 36, was this idea of biophysical scaling and how as an organism gets larger and larger, more and more of its energetic demands must be allocated to maintenance and repair and West and Chris Kempes. And other people here have looked at this in terms of how scaling laws determine the frequency of cancer and multicellular organisms.

A few years ago at our science board symposium Raissa D'Souza, another of our external professors talked about how there's a kind of principle in networks in which as they grow and the costs of bureaucracy in those networks grows that it leads to or provides the mechanism for endogenous collapse. And this is the thing that comes up again and again on the show right now, because as you said, I, you know, this is something that's so palpable and so imminent and urgent for so many people now.

And so there's this kind of a paradox, or if we wanna stay Shakespearean on it, there's a tragedy kind of built into all of this, which is, you know, one way to articulate it is the Jevons paradox, which is this principle that efficiency gains, they don't get used except to continue to feed into this ratcheting and ongoing thing. And so you get these things where even though, you know, we innovated fossil fuels and photovoltaic energy and so on, now as a species we're actually burning more wood and coal than we were in the past.

And so when you're talking about, as you cite early in this book, you cite a study by Williams, Ayres, and Heller that the actual footprint of a microchip, if you wanna think about it in these terms, is like 1.7 kilograms, that this tiny little thing in your phone, contrary to, again, to talk about like pattern recognition and, you know, semantic information, the way that this appears, there's been so much rhetoric out of Silicon Valley that technology is becoming more and more ephemeral, but you know, your book rhymes rather strongly with a long-term fan of SFI, technologist Kevin Kelly, who makes the point that actually what we're just seeing is the tip of the iceberg.

And that the phone is just this tiny little terminal unit that's, you know, the tip of the tentacle of this, again, this rather massive thing that we're all participating in. So to make it a bit more explicit, I wanted to talk with you about the way that we reconcile all of this, the way that we adapt to it in terms of the evolutionary value of forgetting, and to your point about the transition from, you know, punch cards to cloud server data and this kind of thing, that this is tied to a key idea in complex systems thinking, which is the idea of evolvability and there being a kind of a phase transition in which, much like the way that multicellular life regulates the lifespan of individual cells through programmed cell death and apoptosis, and much like it seems brains deal with the demands of adapting to complex environments through forgetting, that this is going on at the scale of civilization itself.

And so there's this weird thing that I'd like to hear you explore a bit, which is that even as we extend our collective memory further and further into the past, and even as we extend our predictive horizons further and further into the future, that in another way, in ways that people like Nicholas Carr have explored in his book on automation called The Glass Cage, that we're becoming more and more forgetful at the same time. And so this has really profound implications for our ability to actually navigate, search, and employ the dataome as it employs us.

And so, yeah, please go into that a little bit.

Caleb Scharf (49m 6s): Yeah. Well, terrific stuff. Yeah. Very, very interesting. Yeah. So as you were saying that last little piece that notion that we, yeah, in some respects, certainly those of us privileged enough to live with the full benefits, if you will, of the dataome in our technology and our comfort that we're not having to spend a lot of time worrying about where our food is coming from. And remember a substantial fraction of the human species is still functionally illiterate.

So some of what we're talking about, there's a differentiation between how some humans engage with the dataome and how others engage with it. But we all feel the impacts of it. I think that is unavoidable, but yeah, this idea that, you know, in some ways we're becoming stupid or less capable. I've certainly had that experience, you know, I need to get somewhere. So, you know, in the past, maybe I'd look at a map or maybe I would sit there and try to remember really hard. Okay. So is it down past that street?

Is it on that avenue? Yeah, I think so. There's that landmark. And so now I just get my phone and I just put in the address. So I don't even do that. I kind of wiggle the pointer around. And the same is true with trivia and facts and you know, anything like that. I do not make the effort in the same way that I once did to remember things, and that's me and I'm getting old. So for younger people, I'm sure it's even more acute, if you've not ever had to resort to your own neurons for certain things. Now, as you were saying that, it made me think of something that, again, I do a little bit in the book, which is let's suppose that this doesn't end in disaster, and there's no guarantee that it doesn't, right.

I mean, evolution doesn't give a damn, right. It's just doing what it does. It doesn't see the future particularly. It's just what works, works. And if it suddenly doesn't work, oh well, right. The universe is not keeping score, I think, but let's suppose, you know, things keep going for us with our dataome, somehow these burdens of energy and so on, we manage to deal with that well enough that we don't completely ruin the basis for biological life on the planet. You know, what do we do in the future? What do we become?

And this gets to the idea, and it's, again, a bit, well, probably a very outrageous idea, that suppose we are in a symbiotic relationship. And in fact, suppose we are in an endosymbiotic relationship. Now, when we talk about endosymbiosis, the classic example everyone brings up is mitochondria, these little bundles of DNA, little pods that live in all complex cells and are presumed to have gotten there through a process of symbiosis, engulfment, whatever, that ended up with them becoming endosymbionts.

So they are supported by the cells. They've stripped away all their superfluous DNA or the genes they don't need anymore because they don't need to be very complicated themselves. They're little chemical processing plants that help energize the larger cells that they're in, but they also genetically don't evolve particularly quickly generation to generation. In fact, they're very slow in terms of mutational point changes in their DNA because if they don't work, the host organism, the larger organism doesn't work at all.

And so they do not exist in the future. So it's really important that they're pretty stable. They're pretty well preserved through time. Preservation is perhaps a good way to think about it. You want to preserve that essential piece of your cellular toolkit. Well, what if humans are actually the mitochondria for the dataome, or one of the various cellular structures or the equivalent of one of the various cellular structures that we know benefit the overall entity, but are themselves sort of simplified and pared down.

So in that sense, and this is, you know, it's a little bit science-fictiony, but you know, the erosion that we're seeing of some of our cognitive skills and our capacities may be the beginnings of an erosion of the things that are not that necessary for us. You know, it takes energy to navigate your way through the world without your phone. It takes electrochemical energy in your brain. You can kind of get rid of that need by offloading it somewhere else, but you know, what do we really give the dataome in the end? Well, perhaps what we give the dataome is something that is still very, very difficult to accomplish with machines, with algorithms as we have them at the moment, that is the spark of new ideas, of novelty, that we continue to inject novelty.

Even the dumbest, most uncreative of us is a novelty machine, right? We can do it. You know, we get outta bed and we trip over our shoelaces. That is novelty, right? That's something that if you were a robot, you perhaps would never do, because you're always gonna have everything in precisely the right place. So one pretty outrageous idea, endpoint to all of this, assuming that the energy needs of the dataome don't wreck the planet, is that what we're seeing is the beginning of our being subsumed into a greater entity as a type of mitochondria.

And what we bring is novelty, but that doesn't require us to do things like navigate the world or remember too many facts. In fact, that may get in the way, cause, oh, we remember that, so we don't invent something new. Necessity is the mother of invention. There's something to that, I think. So again, this is pretty crazy stuff. I'm extrapolating pretty hard here, but it doesn't seem totally unreasonable that that could be one endpoint for us in the dataome, but it might work really well.

And it might provide that essence that thus far, and I know there's a lot of controversy and debate in the world of machine learning at the moment about whether some of these algorithms are actually being original at all. My suspicion personally is that they're not. They're very good at assembling complicated correlations in things like language and visual media, but that's kind of it. There is something else that goes on in biological brains that jumps beyond that. So maybe that's our end point, is we're going to be the mitochondria.

Now the one good thing about that is we do get preserved in some way, and conceivably the thing that we have preserved the most is our capacity for originality, our capacity for novelty, our capacity to... that starts to sound a little poetic. And so I just sound like Carl Sagan, which isn't a bad thing, I think,

Michael Garfield (55m 51s): No, not at all. You know, just to expand and riff on that with another bouquet of SFI references. I think the work that got me into these deep questions in the first place was a paper I read with Martin Nowak and David Krakauer on the evolution of language back in my senior animal communication seminar at the University of Kansas. And in this, they talk about the way that we had, you know, again, to provide some continuity into deep time here that the burdens to our individual memory and information processing in this evermore complex environment that we create through our interactions.

This is not a new thing in that we've in a lot of ways, what we take to be distinctly human emerged out of a response to similar crises in the past and the evolution of syntax as a way of having to remember fewer words and fewer rules for the combinations of those words, as a way of circumventing the error threshold created by having to remember so many things and to handle so many things in the processes of social cohesion.

And so when we're talking about thresholds and turnover rates, you've got this beautiful analogy that I love in this book about the sea squirt, which is this, you know, or tunicates, which are these kind of sister clade to chordates, you know, like everything with a spinal column, and how these things start out in life in larval form with a head, but then eventually they anchor themselves down and they become filter feeders and they lose all of that neural tissue, which is so costly to support.

And even though you sound a little self-conscious about the way that this line of thinking verges into science fiction, actually something that I feel like I talk about with David every time I see him in a meeting is how much he and I love the show Westworld, and how Westworld is this show that's about recreating the human, and the show is obsessed with fidelity and also obsessed with this idea of virtualizing the environments in which these mechanically reproduced people find themselves. So there's a deep connection here between all of this and this piece that was explored at great depth in our 2018 seminar on developmental biases and evolution, and specifically this idea of neoteny, or paedomorphosis, and its relationship to domestication and how, you know, there's this trend towards simplicity at one level, even as there's a trend toward complexity at the other level. But at risk of just like spending our entire conversation beating this point to death, I want to talk a little bit about, yeah, all of this in light of something, you have an interesting theme here in the book where you talk about how you can start to see this. And I brought this up with our Miller Scholar, Andrea Wulf, in her biography of Alexander von Humboldt. And she talks about how you can see him kind of reaching the error threshold in his own scientific understanding of the world and his own limits as an individual researcher in the way he starts to increasingly rely on a kind of syntactic structure of additional interdisciplinary scholars in this global network of research.

And you talk about how Darwin was one of these people who was constantly kind of performing an act of personal archeology and digging back into his stuff. So again, on, you know, to just as one last flogging of this particular point, I love the term that you use in this book, data resuscitation and the way that, you know, one of the more interesting things that I think about the way that science is practiced at SFI is that it's practiced with the understanding that we do not actually know which insights will ultimately prove valuable.

And that, like you said earlier, you know, it's actually really hard to define semantic information because in one sense, much of the value that's produced by the economy at any given time, we talked about this with Lauren Klein and our episodes on data feminism and her use of the digital humanities to reconstruct the actual like process of cultural innovation and historical records. It's that so much of the value that's being produced in the dataome is not apparent to any of us at any given time. And so what gets forgotten or abandoned often ends up turning out to have more significant value.

You know, the archeologists that work with us like Stephanie Crabtree and Amy Bogaard, you think about the way that actually some of the more important finds from antiquity are the trash middens, not the things that ancient societies cared to record, but the things that they didn't consider relevant, the things that they were throwing away. And so, yeah, I'd love to hear you talk about that particular thing and the way that there are kind of like not only cycles of forgetting, but cycles of remembering in the dataome and of reestablishing context and value and meaning and significance.

Caleb Scharf (1h 0m 47s): Yeah. Actually, I, you know, there's a whole other hour conversation here from this. Yeah. I mean, two things that come to mind as you were talking, I mean, yeah. So one is absolutely this fact that there can be stuff in the dataome, information, you know, clay tablets. I talk about in the book, among other things, you know, some merchant's notes on who he was selling what to, and how many bits of corn he needed and how many chili peppers and so on, and how could that possibly be of interest to the future?

It's like saying that aliens would be interested in what I have for dinner tonight, but of course, thousands of years later, we've unearthed these clay tablets from this merchant's meticulous note taking. And they're a fascinating piece of information actually, incredibly valuable because they tell us about what was being traded. They tell us about the social connections. They tell us about the trade routes, some of which are not evident in the world anymore, but can be archeologically reconstructed on the basis of what is written in those tablets of these ancient trade practices.

So yeah, so it's very evident dataome resuscitation is a thing and it's extraordinarily beneficial. It seems to us to have this persistent history tracking us along. And in fact, the more of that there is, it seems like the better things will be in the future. And one way I look at that is it's experimenting that's going on in biology and data and information and every movement of every molecule, it's endless experimentation. And the lovely thing is the earth itself is, in some respect, a storage vessel for that history, but it often overwrites itself.

And the human dataome is interesting in that we go to some length to prevent that overwriting. And so that's part of the reason the dataome is growing all the time, right? We wanna keep that picture that we took of our kid when they were, you know, two weeks old, right. Even when we're, we wanna be able to look at that and we all do that all the time, the billions of us, and it clearly, it unlocks something really interesting, which is this resuscitation, this ability to reach back. But as you were talking, it also reminded me that there are potentially equivalents in the biological world to that, in sort of apparently neutral mutations or changes in DNA, for example.

So you can change a base pair and it doesn't change the amino acid that's produced at the end and then locked into a protein structure. And so it doesn't matter. Or it can be more dramatic mutational changes that nonetheless seem kind of neutral at the time, right at that moment. When that happens in that organism or that species, it doesn't force a new phenotype, or at least not an apparent phenotype. It doesn't make any big change, but somewhere in the future, because that change occurred, a new change can take place that has a dramatic effect on the success or failure of that organism.

And I know this is something that's being talked about a lot in evolutionary biology at the moment, the role of, sort of, are they really neutral changes? But it feels a bit like the same thing, that in a sense that invention, data resuscitation, it was already there. It was already in DNA. It was already in the way information has been propagating in molecular structures for three to four billion years on the planet. It's just been outta sight until you start to decode what's taking place. So in that sense, it feels very natural that this could play a really important role in the utility of the dataome and how it plays into the success or failure of our species and all the other species that are tagging along with us.

There is something remarkable about it as well. And I think as our technology continues to improve, what we're storing today, the fidelity of what we're storing today is so much greater in many ways than it ever was in the past. And so, you know, in a hundred years, our descendants, they're going to be building virtual realities based on the world as it is today. And in a sense, there'll be a capacity to time travel. The fidelity of the information that we store about the world today, as a sort of gift to the future, is potentially going to be far greater than it ever could have been in the past because of improved technology.

So, you know, we fantasize about time travel and physicists will always tell you, you cannot travel to the past. You can travel into the future. Just go to the speed of light or go sit around a black hole, whatever, and you can get to the future without aging so much yourself, but the past is inaccessible. Well, here is a way to the past, and what we're laying the groundwork for right now, unconsciously, perhaps inadvertently, is potentially a future where the past is as much part of everyday experience as that everyday experience is itself.

And I dunno what that will do to us. It'll be an amplified version of what we have right now.

Michael Garfield (1h 5m 53s): Actually you just linked us directly into the conversation that we just had with your fellow astrobiologist Sara Walker, where she made the exact same point that there will be a kind of a way of simulating this. And of course in this book, you get into Nick Bostrom and the simulation hypothesis, which, you know, to stress the Copernican piece of this, I know that you are very careful not to just buy into all of that whole cloth, but the idea that if in fact, we do prove capable of these kinds of things, then in all likelihood the vast majority of phenomenal world spaces are themselves sort of simulated environments.

But I wanna get a little bit less out there because another thing that you just brought in was something that we addressed in the conversation that we just had on the show with Aviv Bergman, where he was talking about the evolution of robustness and how it is that something, you know, an anatomical feature like a hand with five fingers can remain stable in the face of an enormous amount of disruption at the genetic level. And so what ends up happening is that these robust phenotypes end up harboring a ton of cryptic variation.

And so again, just to sort of bolster the point that you're making with work that's being done in a much more mundane sort of biological research that, you know, when people like Jack Horner proposed that you could actually de-extinct dinosaurs by switching back on genes in a chicken that control for scales and teeth and so on. That's very much in keeping with everything that you've just discussed.

Caleb Scharf (1h 7m 32s): Yeah. Yeah, no. And I think maybe part of what is going on here is lots of people are starting to recognize that in some ways, nothing new is going on. These same principles that are at play in biology have been at play in biological evolution for billions of years. We're now sort of recognizing those principles and realizing we might be able to apply them to a whole variety of different things. And that itself is part of evolution, maybe.

You get species emerging that become capable of sort of bootstrapping off of what got them there in the first place and expanding that and so on. I dunno if I can offer anything terribly intelligent on top of that, except that there's a sort of unity to all of this and all of this has happened before, all of this will happen again, just to use a quote, both from pop culture and before that. The interesting thing is that we're perhaps better equipped than ever to begin to see these things emerging out of great complexity, and can see these principles that are not principles in the way that 19th century physics principles were.

They are principles of very complex systems, principles of emergence, and so on. We're beginning to see that. What would be so interesting, if I could live for a thousand years, would be to see what that feedback is on the progression of things, on the changes, that we're consciously actually inserting ourselves, creating feedback that didn't exist before. Maybe, you know, maybe the dataome is an integral part of that.

Michael Garfield (1h 9m 13s): So there's one more topic I wanna explore with you before you go. And that's just to do with the way that not all of the signaling going on in all of this is what you would call true, and that when you apply kind of game theoretical thinking to communication in populations of organisms or in between different kinds of organisms that false signaling or lying is itself an evolutionarily sustainable strategy because false signaling is constantly mutating to adapt.

You know, that like free riders and cheaters are something that persists in systems where it's easier to tell a lie, but then of course you get found out, but then the lie, you know, mutates. And so you talk about how the dataome as it exists now, and again, this is a difference more in kind of quantity than in kind with some of the earlier ways in which this plays out in biology and ecology, is inundated with fakery and that what we're in the midst of right now is a quote machine-driven arms race unquote, in which things like deep fakes, which carry with them an enormous material and energetic cost, not only in terms of their production, but in terms of the processes of counterfeit detection that we have to bring to bear on all of this, that this is a non-trivial part of what you were describing earlier as the exponential scaling of the burden of the data.

And so when we're talking about what this all means in terms of finding new ways to, like in episode 80, when I was talking with Mingzhen Lu here about the evolution of the modern mycorrhizal affiliations between plants and fungi and how they emerged in part as a way for fungi to metabolize all of this available waste, lignin from fallen tree stumps and so on, that there's a way to tell the story of the biosphere as a way of turning pollution into a resource.

And again, if we're talking about like humans as the mitochondria within Google's intestinal tract or whatever, this has a concrete precursor in the evolution of the mitochondrial function, and, you know, the oxygen-based metabolism as a response to the overproduction of oxygen in photosynthesis in cyanobacteria 2 billion years ago. So, you know, I'd love to hear you just expand on that a little bit and talk about the way that, you know, 'cause we had Carl Bergstrom and Jevin West on the show, and we were talking about bullshit detection and, you know, bringing critical thinking to bear on misinformation and disinformation.

I mean, all of that is extremely important, but it neglects this weird sort of insight that misinformation and forgery and all of these other things may actually be serving a greater function within the dataome and that we're going to learn, just as, you know, computing itself has learned to harvest and exploit noise in the processing of signals, that we may find that in fact, the lies serve some greater purpose.

And yeah, that just seems like an interesting, weird place to wrap this.

Caleb Scharf (1h 12m 39s): I think it's a great place. Yeah. I mean, I think there's a couple of things there that you started to put your finger on. I mean, the, you know, the interesting thing about deep fakes and so on is that the underlying machine learning approaches, these generative adversarial network systems, are innately competitive, right? So to make a really good deep fake, you have to have one algorithm, one machine learning system, competing with another one that is desperately trying to identify the fakes, right? And they're in this loop of I'll fake you.

Nope, I've guessed it correctly. So I'll go back, I'll modify my fakery and see if I can do better and better. And they can do this millions and millions of times, which is why deep fakes are now really extraordinarily good, at least for human minds. So there's this interesting thing there that to make one really exceptional algorithm that, let's say, can correctly identify, you know, certain humans or recognize certain kinds of music, or whatnot, you actually have to use fakery.

You have to challenge it with lies. You have to challenge it with fake data. Now I'm sure that has relevance to biology, right? So a creature learns that it can fake something to get past you. So over time, perhaps, this is obviously a very simplistic way of thinking about it, you develop better senses, you develop better visual discrimination powers to overcome that because you don't want them sneaking past you to get your cookies or whatever. So I think for the dataome, the interesting thing about that kind of deliberate fakery is that it does drive innovation, right?

Drives the generation of other algorithms, other machine learning systems, and then just, you know, fakery at all levels does have that benefit of forcing machine systems and humans to be that much more aware of what's going on. So that's one piece of it, that perhaps you can't get to some of these really interesting places in terms of machine learning, in terms of utility of the dataome, without having to deal with fakery, without having something that's producing that fakery, whether it's driven by a human or an algorithm. But then even slightly stepping back from that to a slightly higher level, you could look at this and say, well, at some level it is the dataome figuring out another way to expand itself, right.

And to use ever more energy, and it's successful. It's just like we expand as a species, we use more energy, we consume more food, and so on. So we're doing better. We're doing better. We're doing better. This aspect of the dataome is churning through more energy right now, maybe to our detriment, but for the dataome that may be exactly what is right in terms of ensuring its existence in the future. So I think there's, yeah, there's a couple of things that go on here that are very, very interesting, that yes, the truth in information is very important in many, many places, but without bad information, whether it's deliberate fakery or whether it's just corrupted or it's inadequate, or it has low fidelity, it has, you know, awful signal to noise.

There's a role for that in encouraging the growth of other things and improving the sophistication and efficacy of other elements of the dataome for sure.

Michael Garfield (1h 15m 58s): So just in closing, because I feel like in a lot of ways we've maybe orbited around, or rather strayed rather far from, the actual work that you do as an astrobiologist, because you do touch on this in the last act of your work here. I'd love to hear you go into a bit of detail into how all of these ideas actually ground and substantiate in the work that you're doing to think about the properties of and discovery of life and mind on other worlds. Where the boots hit the ground for you as a researcher is what all of this thinking means in terms of the big questions about looking for the N equals two, the second sample of life out there in the world.

And so how does that show up in your research?

Caleb Scharf (1h 16m 49s): Yeah, absolutely. I mean, I think it's, you know, it's funny, the book was an opportunity to explore these ideas and now those are sort of percolating into my more ordinary research. So one of the things I'm interested in that I work on from a variety of directions is the notion of what life does to a planet. I mean, obviously it's a co-evolution of life and a planet, but you could also say, what does life do to a planet? And the difficulty with asking that question is what do we mean by life?

Do we just mean microbial organisms as they have existed on the earth? Or do we mean more than that? And there's this interesting bias. And so a lot in astrobiology, I personally feel, is about really trying harder than we have before (we haven't done a particularly good job at it yet) to push past, well past, the innate biases and assumptions that just flood everything we do in astrobiology. And there's good reason for that because, you know, if we don't know what we're looking for, it's gonna be that much more difficult, right?

So we tend to go to the template that we understand, which is the modern earth and so on. But so for example, we look at our existence today and we look back and we say, yeah, look, you know, most of Earth's history has not included anything like us. The life on earth has been dominated by microbes. There have been these bursts of complex-cell life, plant life on land and so on, that happened at some point, but even so the larger portion of it involves different things, microbes modifying the nature of the planet.

So that's what we should look for. If I could be here in a few billion years' time, if I could be here, I mean, the evolution of our sun may not allow this, but suppose I could be on earth in, let's say, 5 billion years' time, looking back at the history of the planet, I might say, yeah, there was that short period of maybe two and a half billion years at the beginning, where it was all microbes, but then it was all this other stuff, including sophisticated bio machine, whatever weird substrate life that has existed really well for the last 5 billion years.

And that's what we should look for in the universe. Now we can't see that future from where we are now, but it's possible that is the future. And so one of the things I'm interested in is taking that perspective today. Perhaps it is more likely in the end that what's out there in the universe is not what has existed on the earth for the first two and a half billion years, 3 billion years, it's what's coming in the future. And I think stuff like the dataome, or the idea of the dataome, maybe gives us a little window into part of that, how we extrapolate that to the future.

I dunno for sure. But I think it opens up a number of ideas. And I know people like Sara Walker have thought a lot about this, the meaning of life, you know, what is life as a phenomenon? Without an underlying principle, it's gonna be very hard to push further, but if we begin to develop underlying principles or laws of life, we need to test them on something. And it may be that something like the emergence of a dataome is a wonderful test case for these ideas about deeper principles behind the phenomenon that we call life, deeper principles, akin to things like quantum mechanics and so on.

So that's kind of a longwinded way of saying that in my own research, I have become increasingly interested in trying to explore novel, well, novel to us, novel ways in which life might change planets, but not just planets, entire systems, because if exponential growth in something like the dataome spills beyond the confines of a single planet, then it's just going to keep going, right? There need not be any stoppage. And what might that look like? That may mean thinking about so-called technosignatures, but even that may be a rather restrictive and parochial way of thinking about this.

I actually don't think of the dataome as a technological thing anymore. I think it's something more than that. And maybe that's the way we need to think in order to spot things out there in the universe. Otherwise we just wouldn't recognize.

Michael Garfield (1h 20m 58s): Well, that's a fascinating and bizarre and kind of numinous place to land this, Caleb. Any final thoughts before we close here? I mean, we'll send people to your work. I mean, is there any other question you wanna leave people with or any guidance you have for people that are now totally turned on by all of this and wanna pursue it further?

Caleb Scharf (1h 21m 19s): A lot of my thinking on this I feel is still very simplistic and there are many people, many of whose names you've mentioned in the course of this conversation, who've done far more sophisticated work on this. Many of them are at SFI or are associated with SFI and I would suggest that people follow those threads. I give many of those references in the book. What I've tried to do is skim the surface, partly for my own interest and to learn something, but also to try to simplify and, you know, present an overview of many of these ideas. So yeah, I do encourage people to do that. SFI is a remarkable incubator for these ideas, for the implications and then the deeper, robust mathematical concepts that lie beneath them.

Michael Garfield (1h 22m 2s): Wonderful. Well, thanks again so much for being on the show. This was a total delight.

Caleb Scharf (1h 22m 7s): Well, thank you so much. Yeah.

Michael Garfield (1h 22m 10s): Thank you for listening. Complexity is produced by the Santa Fe Institute, a non-profit hub for complex systems science located in the high desert of New Mexico. For more information, including transcripts, research links, and educational resources, or to support our science and communication efforts, visit Santafe.edu/podcast.