This week my guest is professor of neuroscience at University College London, Karl Friston. Viewed by many as one of the world’s most influential neuroscientists, Friston rose to prominence when he pioneered one of the key techniques that allows neuroscientists to analyze brain activity. And as if that wasn’t enough, he has since developed the Free Energy Principle, which some see as being as monumental to the field as Darwin’s theory of evolution was for biology and genetics.
It’s this work on the Free Energy principle that will be the bulk of our conversation in this episode, and I warn you that this is probably one of the most intellectually challenging conversations we’ve had on the show. To help you navigate this, I want to offer just a quick overview that may aid in understanding the ideas. In essence, Friston’s work roughly states that entities that exist must track information from the world around them, create an internal model of that information, and then use that model to navigate the world in a way that reduces the difference (the error) between what was actually experienced and what one’s model predicted.
While this concept may seem simple on the surface, the actual science behind it is detailed and complex, and it holds immense implications for how we develop artificial intelligence.
Learn more about Friston and his work at fil.ion.ucl.ac.uk/~karl/
Learn more about Singularity: su.org
Host: Steven Parton - LinkedIn / Twitter
Music by: Amine el Filali
Karl Friston [00:00:01] If you look at machine learning and you just look at the trajectory, it's all a trajectory measured in terms of big data or that number: how many billions of parameters can your large language model handle? And that's exactly the wrong direction from the point of view of the physicist. You should be looking for the least data, the smart data, the sparse data that you need in order to resolve your uncertainty.
Steven Parton [00:00:43] Hello, everyone. My name is Steven Parton and you are listening to The Feedback Loop by Singularity. This week my guest is Professor of Neuroscience at University College London, Karl Friston. Viewed by many as one of the world's most influential neuroscientists, Friston rose to prominence when he pioneered one of the key techniques that allows neuroscientists to analyze brain activity. And if that wasn't enough, he has since developed the free energy principle, which some see as being as monumental to the field as Darwin's theory of evolution was for biology and genetics. It's this work on the free energy principle that will be the bulk of our conversation in this episode. And I want to warn you now that this is probably one of the most intellectually challenging conversations that we've ever had on this show. Therefore, in order to help you navigate this conversation, I just want to offer you a quick framing that may aid you in understanding the ideas that Professor Friston will be putting forth. In essence, Friston's work roughly states that entities that exist must track the information in the world around them, create an internal model of that information, and then use that model to navigate the world in a way that reduces the difference, or the error, between what they actually experience and what the model predicted. This concept may seem simple on the surface, but the actual science behind it is very detailed, very complex, and it actually holds immense implications for how we develop artificial intelligence and machine learning. So if you're ready for this intellectual challenge, then take a moment to prepare yourself, and then when you're ready, come back, press play, and welcome to The Feedback Loop. Karl Friston. Well, before we get into anything too technological and technical, I think a good foundation for us to start on is to understand the free energy principle.
Now, I know this is an often complex or complicated idea that you've talked about elsewhere and that has a lot of details worth explaining. But I'm wondering if you could start us off by just giving us a layman's overview, some very basic, key points about the idea of the free energy principle.
Karl Friston [00:03:15] My usual response to this question is that I think there are two roads to a quick understanding. The low road would be taken by a neuroscientist or a psychologist, and the story starts probably with Plato and runs through Helmholtz and through some psychology in the 20th century, ideas which were then taken up in machine learning. And then we see them in artificial intelligence research. The story is basically that we are prediction machines, that our brains are basically in the game of trying to make sense of all the sensory data that our senses gather in terms of what could have caused it. To put it as simply as possible, the idea is that we don't extract information from sensory data or sensory input. What we do is a much more constructive act of perception. It's a sort of inside-out generation of predictions. So we're trying to predict: if the world was like this on the outside of my sensory veil, my sensory blanket, what would I expect to see? And then I take what I'm actually seeing and subtract it from what I predicted I would see to form a prediction error, which measures the surprise that I have about my hypothesis, that is to say, about what's going on out there beyond the sensorium. And we use that surprise or prediction error to adjust hypotheses or models or fantasies or internal representations in order to resolve that prediction error. Another way of looking at this is that you're trying to find those explanations for your sensory data that are least surprising. So you try to minimize your surprise, prediction error being a measure of the surprise. So that's the low road. If you want the high road, even more briefly, the high road would be basically: how can you describe self-organization through the lens of physics? That basically entails describing the principles, usually principles of least action of the kind you learned at school, that are apt to describe self-organization to non-equilibrium steady states.
And you may be asking, what is free energy? Well, this is a particular kind of free energy used by Richard Feynman. He was trying to solve a very similar problem of evaluating the evidence for, or the probability of, certain trajectories, in his case the paths of little particles traveling through the ether. For us, it's basically about the paths we take, the narratives we pursue in our life and in our active engagement with the world. But the maths is exactly the same. So this free energy is just a bound on, or a proxy for, either surprise or negative surprise, which is log evidence, and which in machine learning would be known as an evidence lower bound, or ELBO. So we either maximize the ELBO or minimize the free energy. And then we are basically describing exactly the same process of making sense of the world by maximizing the evidence for our models of the world or minimizing our prediction errors. That's basically it.
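[Editor's sketch] The relationship Friston describes between free energy, surprise, and the ELBO can be written out in standard variational notation. The symbols are assumptions introduced here, not taken from the conversation: o stands for sensory observations, s for hidden states of the world, p for the generative model, and q for the approximate posterior belief.

```latex
\begin{aligned}
F[q] &= \mathbb{E}_{q(s)}\big[\ln q(s) - \ln p(o, s)\big] \\
     &= D_{\mathrm{KL}}\big[q(s)\,\|\,p(s \mid o)\big] - \ln p(o)
      \;\ge\; -\ln p(o) \quad \text{(surprise)} \\
     &= \underbrace{D_{\mathrm{KL}}\big[q(s)\,\|\,p(s)\big]}_{\text{complexity}}
      \;-\; \underbrace{\mathbb{E}_{q(s)}\big[\ln p(o \mid s)\big]}_{\text{accuracy}} \\
\mathrm{ELBO} &= -F[q] \;\le\; \ln p(o) \quad \text{(log evidence)}
\end{aligned}
```

Because the KL divergence is never negative, F is an upper bound on surprise, so minimizing F and maximizing the ELBO are the same optimization. The complexity and accuracy terms in the third line are the two quantities Friston weighs against each other later in the conversation.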
Steven Parton [00:06:55] So you can tell me if I'm wrong with this gross oversimplification, but would it be fair to say that we have basically a thermodynamic version and, what I would say is, maybe an information-theoretic version of the free energy principle, where both basically say that you have a world of energy or information, and a system or entity that exists must find a way to model the information of the world around it and continually update that model in a way that will reduce the likelihood that its model is incorrect, that it will be surprised by something that it didn't model? Is that, in a very simplified manner, fair?
Karl Friston [00:07:43] Yeah, absolutely. You've clearly read the papers. That's absolutely true. And you make an interesting point there. A thermodynamic reading of free energy is distinct from the kind of free energy that we were just talking about, which is much more information-theoretic. It's all about beliefs about states of affairs in the world, cast as probability distributions. And of course, the same kind of mechanics you get in physical mechanics applies to these probability distributions that are encoded in the way that we represent and think about the world. But probably even more interestingly, there is no real distinction at this level of mathematical description between energy and information. It's all the same thing, which is important, again, just thinking about the issues of sustainability and the like. What that means is that a free energy minimizing system, artifact or person will not only make the most sense of the data at hand, from the point of view of an information processing system or a statistician, but it will also do it in the most energy-efficient way. So it'll also minimize its thermodynamic free energy, which is, I think, an important way of reading the foundations of this kind of self-evidencing and sense-making. It's finding an account of the world, formally, from the point of view of a machine learner or a statistician, finding the right generative model that can generate predictions that are as accurate as possible, but also minimally complex, with minimum degrees of freedom. I'm emphasizing that because it connects directly to what we were talking about before in terms of the thermodynamic cost. So for any inference or any belief update or any data assimilation, associated with providing an accurate account in order to minimize my surprise and my prediction error, there will be a complexity cost that attends that belief update,
the information processing. And that complexity is, if you like, via things like Landauer's principle, the thermodynamic cost. It costs energy, joules, to change your mind, literally. And complexity is exactly this: the measure of how much you've changed your mind, which I think has enormous implications for the direction of travel of, say, machine learning or artificial intelligence. One could argue that if you have to solve a problem by just injecting more data, more joules, more money and more energy into a solution, that's probably the wrong way to go.
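[Editor's sketch] To make "it costs joules to change your mind" concrete, the short Python example below measures a belief update as the KL divergence from prior to posterior, in bits, and multiplies it by the Landauer limit of k_B T ln 2 joules per bit. Treating each bit of belief change as one bit at the Landauer limit is an illustrative assumption, as are the function names, the two-outcome beliefs, and the 310 K body temperature. None of these come from the conversation.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def kl_divergence_bits(posterior, prior):
    """KL[posterior || prior] in bits for two discrete distributions.

    Assumes the prior is non-zero wherever the posterior is non-zero.
    """
    return sum(q * math.log2(q / p) for q, p in zip(posterior, prior) if q > 0)


def landauer_bound_joules(bits, temperature_kelvin):
    """Landauer limit: minimum energy to erase `bits` of information."""
    return bits * K_B * temperature_kelvin * math.log(2)


# Hypothetical belief update over two hypotheses about the world.
prior = [0.5, 0.5]
posterior = [0.9, 0.1]

# "Complexity": how far the belief moved away from the prior.
complexity_bits = kl_divergence_bits(posterior, prior)

# Illustrative lower bound on the energy cost at roughly body temperature.
energy_joules = landauer_bound_joules(complexity_bits, 310.0)

print(f"complexity: {complexity_bits:.3f} bits")
print(f"Landauer lower bound: {energy_joules:.3e} J")
```

The numbers are tiny for a single update, and real hardware and brains operate far above this bound. The sketch only shows the direction of the argument: a larger change of mind means a larger complexity term and a larger minimum energy cost.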
Steven Parton [00:10:57] Well, given all of that, I mean, you alluded a lot to machine learning there and to artificial intelligence. What are some of the implications that this principle holds for the development of AI? What are some of the things you're finding that are really pertinent? Are we just throwing money and computation and power and energy at it, like you said, and forgoing the parsimonious proclivity that the universe seems to be compelling us towards? Or are we finding ways to make more efficient systems that are able to navigate this balance between accuracy and flexibility in an efficient way?
Karl Friston [00:11:38] Your phrasing is very nice. I have to remember that. Yeah, absolutely. I mean, this is an issue which is exercising a lot of colleagues, not just in academia, but colleagues in industry and people who aspire to provide the right kind of technology, information services, the right kind of tools that will enable a naturalization of belief sharing, of shared intelligence, as we move from the age of information into the age of intelligence. They are thinking very deeply about this now, I suspect very much along the lines that you are thinking about, and probably most of your viewers. And so the implications of this way of thinking about message passing and belief updating and distributed intelligence, I think, are quite profound for the way that we would deploy artificial intelligence and machine learning. So, I mean, you picked up on some of the key things there. If you just look at this through the lens of, say, a theoretical biologist, and just think about what a good ecosystem is, or if you look at this through the lens of a neuroscientist or neurobiologist and think, what are good brains, then you come to exactly the same conclusion: that the goodness is in the sustainability. That can be read in a number of different ways, and, you know, I don't want to be adversarial, but certainly from the point of view of the physics, it speaks to a direction of travel which is exactly orthogonal to where machine learning is going. If you look at machine learning and you just look at the trajectory, it's all a trajectory measured in terms of big data or that number: how many billions of parameters can your large language model handle? And that's exactly the wrong direction from the point of view of the physicist. You should be looking for the least data, the smart data, the sparse data that you need in order to resolve your uncertainty.
And furthermore, you should be doing it cheaply. You should be doing it on the edge, and certainly thermodynamically as efficiently as possible. So that would be one important implication of committing to this, if you like, physics of intelligence and self-organization. And the other big difference that hits you in the face is the objective function. I don't have your nuanced understanding of the singularity, but I certainly understand the paperclip paradox, for example, and that is actually disallowed under the free energy principle, because the whole point of the free energy principle is sustainability. And sustainability is defined operationally by the constraints that define that attracting set of states, the characteristic states that characterize me or my family or my ecosystem or a worldwide web. So what that means is that a good system, a well-engineered, functioning system that is going to survive, any ecosystem that's going to be here in one year, ten years, or a century, must be, if you like, suitably constrained in this kind of way. Does that make sense?
Steven Parton [00:15:33] It does. But there's a little bit I want to actually push back on, if I can. And specifically it's because I'm thinking, as you're talking about the constraints, about the paperclip model being kind of forbidden in this instance. And I'm thinking of Richard Dawkins and his idea of the selfish gene, but where he also introduces memetics. And that makes me think of language and culture and the agency that humans bring to the table and the fact that we're crafting these ideas. And I guess the question that is coming up for me when you're saying this is: does the introduction of language, this step away from genetics and these more purely physical systems into a realm of thought, allow us to start creating maladaptive systems that break those constraints? Because we're crafting the models for the AI, because we're reshaping our environment as humans, it feels like we are able to actually build a world that doesn't match the model that we built over millennia, right? The human animal has built a model for maybe 600 million years total with the nervous system. And now we're building a completely different environment that is kind of maladaptive for that model. And I guess that's my long-winded way of saying, are we breaking those constraints? Are we breaking that chain by having some agency?
Karl Friston [00:17:08] So many really interesting points that the question raises. I think that there is a danger. In my head I'm thinking, well, the free energy principle describes a utopia, if you like, the natural way of any given distributed system or ecosystem developing. But there's also a dystopia, and I think you're speaking to that. And I think that that's perfectly right. The second thing that comes to mind is that you talked about evolution. So evolution, from my perspective, is just a free energy minimizing process, very, very simply. Natural selection just is our way of reading nature doing Bayesian model selection, the model, of course, being the phenotype. I am a hypothesis that evolution is using to test whether I'm a good fit to my environment, and I will be selected if I have a high marginal likelihood of existing in this environment. This is Bayesian model selection, again practically done by choosing those that optimize the free energy. So in evolution and co-evolution and also cultural evolution, I'm taking a sort of evo-devo perspective. It's not that the selfish gene is not there; it's that there is no single unit of selection. There are multiple timescales. There are also multiple scales of phenotypes, from my genes to my body, but also through to language. And of course, language is just a facet of this kind of free energy minimizing process. It is a natural consequence and, interestingly, I think a reflection of something you introduced right at the beginning, which is this tendency to become more sophisticated and more complicated. Paradoxically, as we minimize our free energy, we become simpler, more efficient. I think that one has to ask why.
Why on earth do we get more and more sophisticated, yet at the same time, almost by definition, if you subscribe to the free energy principle, we have to be finding simpler and simpler ways of modeling our world? And I think it's because we live with other things like ourselves that are very complex. Just you being there and us having this conversation means I have to spend more and more complexity cost in order to keep up, to have an accurate model of you and all my conspecifics and all my colleagues and friends and family. So to come to the dystopian thing, how could you break that? I think you can break that. And the way that you can break that is to think about how you can break evolution. When you think about the ways that you can break evolution, you think immediately about selection for selectability and experiments where you render the environment so quickly changing. I mean, you referred to niche construction, the designed environment. We now create more of our environment than pure nature does. And so we're in the game now of thinking about the selection for selectability, or evolvability, being driven by environmental change. So we're now thinking about pressures on, or the potential disruption of, this evolutionary steady state or non-equilibrium steady state that are due to things changing too quickly. And I think that is a real concern. We are still evolving. Natural selection still works. If we build artificial intelligences that make too many paperclips, to our detriment, we're not going to buy those things. But I did wonder earlier on, thinking about smart cities, that somebody at some point was going to build an office that will recognize that there's too much sunlight coming in, or that there's not enough carbon dioxide inside or outside, and will lock all the doors, trapping everybody inside.
Now, of course, this would be a dystopian example of empowering artificial intelligence in a way that ostensibly is for our own good but, of course, treads on the other constraints that really define goodness. But I think the point is no one's going to buy that if it does that sort of thing to you. So that just goes to show that the market economy is part of this natural selection. So I think the route to that dystopia is certainly imaginable, but I can't really see it actually happening in any system, with one exception. The exception is that if you just think about the sustainable individuation of multiple systems, artifacts, agents, people, communities, then you are compelled to think about what defines that individual, any member of that ensemble. That individuation rests upon a sparsity of coupling. Technically, we usually talk about this in terms of something called a Markov blanket, a statistical boundary, and that is actually at the heart of the free energy principle. That's what separates the inside of me from the outside. But the point here, thinking about routes to dystopia, is that if you degrade that Markov blanket, that would be disastrous. From the point of view of a biologist, you're talking about something that decays or dissipates and dies, simply because you've lost your boundary, you've lost the thing that distinguishes yourself from the stochastic chaos out there. That boundary rests upon sparse coupling. And that's what I think is very fragile at the moment: this ability to actually separate, to induce conditional independences, to not be overwhelmed, over-connected. So I think in the current age we are in danger of destroying that delicate structure. I mean, one example I often use is to emphasize the importance of sparsity.
You know, the beauty of a sculpture inheres in what stone has been removed. It's what's not there that's important. In the same way, any ecosystem, any distributed system, any graph that works, with functions on that graph, has to be sparse. It has to have a sparse connectivity. And if we break that by over-equipping people with the ability to communicate, I think we are in danger. So, you know, I'm not politically very fluent, but I'm assuming that many aspects of globalization are in danger of destroying the sparsity that really underwrites existence, in the sense that it allows us to be individuals and to relate in a sparse and smart way to other individuals. So I think that is a danger. So there is going to have to be some agile movement, which hasn't been so much in evidence in the past few decades, but certainly in the next few decades, until we can get on top of what I've heard some people describe as a sort of not infant stage, but certainly adolescent stage, of getting used to the ability to communicate with anybody. And so I think that is a real challenge.
Steven Parton [00:25:49] You have an amazing history with brain imaging. And as we enter into this future study of the brain, it'd be really lovely to get a little bit of awareness from you in terms of where you feel the field is going, neuroscience specifically, because of technology. Are there new imaging techniques that are on the horizon, or things like brain-computer interfaces, that you're particularly excited about with this relationship between technology and neuroscience?
Karl Friston [00:26:27] There are certainly developments every year, but certainly, you know, usually every 5 to 10 years. So in my somewhat restricted world of noninvasive brain imaging, the big thing recently has been optically pumped magnetometers, which enable you to literally look at the fluctuations in passively measured electrochemical signals in the head during active engagement with the world. So it's like a wearable magnetoencephalography. That's quite exciting. And that's becoming not mainstream, but certainly standard, with at least one of these centers per nation at the moment. So that's giving us a very fine-grained understanding, certainly in terms of time, of the message passing in, you know, the World Wide Web inside your head. And interestingly, it has all those structural attributes that we just talked about in terms of sparse connections and hierarchies. Just think about deep learning. Why do people call it deep? Well, it's hierarchical depth. So that hierarchical structure, where does that come from? Well, it's the missing connections, again. It's that sparsity: you don't get connections that transcend multiple levels. And that is, if you like, one of the cardinal things of interest for me, trying to understand the computational, the functional anatomy of my own brain or the brains of our experimental subjects.
Steven Parton [00:28:07] Is this new technology, sorry to interrupt, but is this new technology more portable, or does it work well with movement? Because one of the big issues, obviously, with imaging is that you have to lie in an MRI machine and you can't move at all. And it really limits our ability to study interactive, real-life examples. Is this solving that in principle?
Karl Friston [00:28:30] Absolutely. It's still very early. There's still a lot of development work and validation work. But that's exactly right. So the dream is that you could deploy this kind of technology where it matters, in terms of movement disorders or children with epilepsy or in ambulatory studies. You could use it in sports medicine. So, just practically speaking, it is like a wearable MEG, magnetoencephalography. And there is a big group, my colleagues at the Centre for Human Neuroimaging, Eleanor Maguire and her colleagues, who have built this wonderful world inside this enormous magnetically shielded room with sort of virtual realities where you could walk around the hotels, this was called Philbert, and have conversations. People sort of move around. So the potential is great. And as you rightly point out, ultimately the idea would be to use this in a translational setting and a clinical setting and look at things like Parkinson's disease and epilepsies and neurodegenerative disorders that entail some kind of abnormal behavioral or motor function, so that you can do it almost in the field. It's not quite going to be bedside technology, but you could certainly bring a bed into these rooms and use this technology. That said, you mentioned brain-computer interfacing and the like, and I'm less enthusiastic about that. But certainly at a more conceptual level, the increasing dialog between people in artificial intelligence research and machine learning and neuroscience, I think, is exactly the right direction of travel. So I think that dialog is, for me personally, very exciting, but also I think it can be more center stage in discussing these kinds of societally important issues.
Steven Parton [00:30:42] Are you less enthusiastic about brain-computer interfaces because it's just not something of interest to you, or because it's something that you don't have as much information on? Or do you feel that it is potentially overhyped? Does it have as much potential as people like Elon Musk and others are making it out to have? Do you think we'll see real brain-computer interfaces that go beyond limited medical use in the near term? Is it really feasible?
Karl Friston [00:31:08] My attitude to this is based upon watching the development of this field over several decades. Lots of my brilliant colleagues have made great advances, but we're still talking about sort of bits per minute. So I'm perhaps a bit skeptical. It matters in terms of enabling people to walk again after a spinal cord injury, or being able to assess locked-in syndrome or minimally conscious patients and the like. But I don't think it's going to go beyond sensory substitution. Having a direct sort of wiring into the brain, I just don't think that's ever going to happen, for quite principled reasons, but they're not, if you like, pessimistic reasons. I think we've actually all really enjoyed brain-computer interfacing in a really profound way for several years. It just hasn't been recognized as such. I mean, you could argue, and people certainly have argued, that the iPhone is a wonderful piece of BCI. Literally, I'm sure that my children's memory circuits and memories are just not the same kind as my memory circuits, because they don't need them: all their memories are on their iPhone. And I think that speaks to the kind of issues that you have to deal with. There's been this sort of downloading, this extended cognition available to us, that has been facilitated by a certain kind of brain-computer interface, but it's just not one where you plug wires into the cortex, which is a fun thing to think about, but I'm not sure it's terribly necessary. There are other ways of using computers of a mechanical or silicon sort to augment our lived world and to share in the shared intelligence that we all aspire to in a safe and sustainable way.
Steven Parton [00:33:36] Without getting lost and losing yourself. As we wrap up here, I just want to get a big-picture takeaway from you, maybe some closing thoughts as you look forward. Is there anything that you would love to see people talking more about, something maybe that you're concerned about, a direction we're going, or maybe something that you hope more people would pay attention to, something that's going under the radar that deserves more attention?
Karl Friston [00:34:07] Um, yeah. Well, yes. I was going to say selfish, and it is selfish at one level, from an academic point of view. But I think it's pertinent in relation to the things we've been talking about in terms of direction of travel, in the way that we use our resources and energy, and making the world a more sustainable place and one that is more apt for us to be integrated within our self-made ecosystems. And that's just, again, committing to this biomimetic, naturalizing way of thinking about self-organizing systems and what that brings to the table. And we've talked about that. So I'll just try to summarize that with one cardinal feature of the kind of behavior that we should be aspiring to facilitate or indeed engineer in our intelligent artifacts and agents, which is exactly the exploratory thing that we were talking about before: curiosity. So what we have to do, I think, is take much more seriously curiosity as a quintessential feature of just being part of an ecosystem.